Mike Fährmann
16e276fca4
[nijie] fix image URLs for single image posts (#5842)
fixes regression introduced in 2e11b6e7
2 months ago
Mike Fährmann
2e11b6e756
[nijie] support downloading videos (#5707, #5617)
3 months ago
Mike Fährmann
fe7e2281ac
[nijie] increase default delay between requests (#5221)
1-2s is not enough
7 months ago
Mike Fährmann
2191e29e14
[nijie] fix image URL for single image posts (#5049)
8 months ago
Mike Fährmann
b6903a4c90
[nijie] add 'count' metadata field
https://github.com/mikf/gallery-dl/issues/146#issuecomment-1812849102
9 months ago
Mike Fährmann
a30a3e44d5
[nijie] move 'username required' out of _login_impl
9 months ago
Mike Fährmann
57fc6fcf83
replace '24*3600' with '86400'
and generalize cache maxage values
9 months ago
Mike Fährmann
4eb3590103
[nijie] fix image URLs of multi-image posts (#4876)
10 months ago
Mike Fährmann
3984a49abf
[nijie] set 1-2s delay between requests to avoid 429 errors
11 months ago
Mike Fährmann
3ecb512722
send Referer headers by default
1 year ago
Mike Fährmann
a453335a9f
remove test results in extractor modules
and add generic example URLs
1 year ago
Mike Fährmann
a383eca7f6
decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This allows, for example, adjusting config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
1 year ago
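The split between construction and initialization described in this commit can be pictured with a minimal sketch. The class name, attributes, and the 'delay' config key below are hypothetical and are not gallery-dl's actual code; the sketch only assumes that the constructor keeps cheap state while 'initialize()' performs the real setup.

```python
class SketchExtractor:
    """Hypothetical extractor illustrating two-phase setup (not gallery-dl code)."""

    def __init__(self, url):
        # Cheap work only: remember the URL, do not read any config yet.
        self.url = url
        self.session = None
        self.cookies = None
        self.options = None
        self.initialized = False

    def initialize(self, config):
        # The actual init (session, cookies, config options) happens here,
        # so a caller can still adjust 'config' after construction.
        self.options = dict(config)
        self.session = {"headers": {"Referer": self.url}}
        self.cookies = {}
        self.initialized = True


extractor = SketchExtractor("https://example.org/gallery/1")
config = {"delay": 1.0}
config["delay"] = 2.0  # adjusted after construction, before initialization
extractor.initialize(config)
```

Because nothing reads the config inside the constructor, the adjusted value is the one the extractor ends up with.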
Mike Fährmann
d97b8c2fba
consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
1 year ago
Mike Fährmann
b0cb4a1b9c
replace 'text.extract()' with 'text.extr()' where possible
2 years ago
Mike Fährmann
3b369ce3d1
[nijie] add 'followed' extractor (#3048)
2 years ago
Mike Fährmann
c4a62a48ae
[nijie] add 'feed' extractor (#3048)
2 years ago
Mike Fährmann
636d03df95
[nijie] reduce cache maxage to 90 days
2 years ago
Mike Fährmann
241e82e18d
[horne] add support for horne.red (#2700)
2 years ago
Mike Fährmann
d11e2191ae
[nijie] support /history_nuita.php listings (closes #2541)
2 years ago
Mike Fährmann
1f9a0e2fd8
update extractor test results
2 years ago
Mike Fährmann
bd08ee2859
remove most 'yield Message.Version' statements
only leave them in oauth.py as noop results
3 years ago
Mike Fährmann
b58e605dc7
raise error when required username or password are missing
do not try to log in as 'None' (#1192)
4 years ago
Mike Fährmann
6514312126
[nijie] add 'include' option (closes #1018)
4 years ago
Mike Fährmann
e62c209ca0
[nijie] fix 'date' parsing
5 years ago
Mike Fährmann
94dbdbf506
[nijie] change default filename format
… to be consistent with Pixiv filenames
5 years ago
Mike Fährmann
1faec285d1
[nijie] further improvements (closes #423)
- provide a 'user_name' metadata field
- usually the same as 'artist_id', except for favorite downloads
- extract the whole description text and properly escape HTML entities
- fixed an issue with titles or tags containing double quotes
5 years ago
Mike Fährmann
20eb6c401f
[nijie] improvements and fixes (#423)
- ignore unavailable image pages
- more metadata fields: artist_name, date, tags
- rename 'index' to 'num'
- improved code structure
5 years ago
Mike Fährmann
12da6bd0c9
[simplyhentai] fix/improve extraction
5 years ago
Mike Fährmann
fdec59f8e2
replace extractor.request() 'expect' argument
with
- 'fatal': allow 4xx status codes
- 'notfound': raise NotFoundError on 404
5 years ago
Mike Fährmann
b89f0d8d3c
update extractor result tests
5 years ago
Mike Fährmann
a2af2d2965
adjust cache maxage values
6 years ago
Mike Fährmann
5530871b5a
change results of text.nameext_from_url()
Instead of getting a complete 'filename' from a URL and splitting that
into 'name' and 'extension', the new approach gets rid of the complete
version and renames 'name' to 'filename'. (Using anything other than
{extension} for a filename extension doesn't really work anyway)
Example: "https://example.org/path/filename.ext"
before:
- filename : filename.ext
- name : filename
- extension: ext
now:
- filename : filename
- extension: ext
6 years ago
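The before/after field layout from this commit can be reproduced with a short sketch. The helper name 'split_name_ext' is hypothetical and this is not the actual 'text.nameext_from_url()' implementation; it only shows the "now" behavior, where 'filename' holds the name without its extension.

```python
from urllib.parse import unquote, urlsplit


def split_name_ext(url):
    """Hypothetical helper returning the 'filename' and 'extension' fields."""
    # Take the last path segment and decode percent-escapes.
    basename = unquote(urlsplit(url).path.rpartition("/")[2])
    name, dot, ext = basename.rpartition(".")
    if dot and name:
        return {"filename": name, "extension": ext}
    # No usable extension (no dot, or a leading-dot name): keep everything.
    return {"filename": basename, "extension": ""}


fields = split_name_ext("https://example.org/path/filename.ext")
```

With the example URL from the commit message, 'fields' contains 'filename' and 'extension' only; there is no separate 'name' key anymore.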
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor
6 years ago
Mike Fährmann
6284731107
simplify extractor constants
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
6 years ago
Mike Fährmann
00dc37ccbf
replace AsynchronousMixin Extractor with a Mixin
6 years ago
Mike Fährmann
dd358b4564
improve cookie handling during logins
6 years ago
Mike Fährmann
173add6935
[nijie] fix artist_id extraction
view_popup.php pages for older images or dojins either have the
artist_id value at a different place or not at all.
6 years ago
Mike Fährmann
017188d268
improve extractor.request()
Replace the 'fatal' parameter with 'expect', which is a list/range
of HTTP status codes >= 400 that should also be accepted.
6 years ago
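The 'expect' idea from this commit, a list or range of status codes >= 400 that are accepted anyway, can be sketched as a small predicate. The function name 'is_acceptable' is hypothetical; the real check lives inside 'extractor.request()' and may differ in detail.

```python
def is_acceptable(status_code, expect=()):
    """Hypothetical check: accept non-error codes plus any listed in 'expect'."""
    # Codes below 400 are always fine; codes >= 400 only if explicitly expected.
    return status_code < 400 or status_code in expect


accept_404 = is_acceptable(404, expect=(404,))
accept_4xx = is_acceptable(451, expect=range(400, 500))
reject_500 = is_acceptable(500, expect=range(400, 500))
```

Passing a 'range' works because membership tests on ranges are supported directly, which matches the "list/range" wording of the commit message.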
Mike Fährmann
2d17a9e07f
improve extractor.request()
- better retry behavior
- exponential back-off
- removed 'allow_empty' argument
7 years ago
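A retry loop with exponential back-off, as named in this commit, can be sketched as follows. The function name, the default values, and the choice of 'OSError' as the retryable error are assumptions made for illustration; the actual retry logic in 'extractor.request()' is not shown in this history.

```python
import time


def retry_with_backoff(operation, retries=4, base_delay=1.0, sleep=time.sleep):
    """Hypothetical helper: call 'operation' and retry on OSError."""
    tries = 0
    while True:
        try:
            return operation()
        except OSError:
            tries += 1
            if tries > retries:
                raise
            # The delay doubles with every failed attempt: 1, 2, 4, ...
            sleep(base_delay * 2 ** (tries - 1))


# Demonstration with a fake operation that fails twice, then succeeds.
outcomes = iter([OSError("timeout"), OSError("timeout"), "ok"])
delays = []


def flaky():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


result = retry_with_backoff(flaky, sleep=delays.append)
```

Injecting 'sleep' keeps the demonstration instant: the delays are recorded in a list instead of actually being waited for.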
Mike Fährmann
cc36f88586
rename safe_int to parse_int; move parse_* to text module
7 years ago
Mike Fährmann
7b562907c3
[nijie] add favorites extractor
adds support for 'https://nijie.info/user_like_illust_view.php?id=...'
7 years ago
Mike Fährmann
445db75955
[nijie] improve extraction and metadata
- add 'title' and 'description'
- split 'artist_id' into 'user_id' and 'artist_id'
  - 'user_id' is the ID of the user from whom the image entry
    originates
  - 'artist_id' is the ID of the actual image artist
- improve pagination and URL patterns
7 years ago
Mike Fährmann
a112e3f2a0
[nijie] add doujin extractor
adds support for "https://nijie.info/members_dojin.php?id=<artist_id>"
7 years ago
Mike Fährmann
3cec533c28
Merge branch 'archive'
7 years ago
Mike Fährmann
f5f2d29f56
[nijie] fix dojin extraction
- correctly extract artist_id
- set extension to "jpg" if it was empty and let filetype checks do
the rest
7 years ago
Mike Fährmann
34873dbd90
set 'archive_fmt' values
These are going to be used to create a unique ID for each image.
7 years ago
Mike Fährmann
9c138dfc1f
[common] detect empty HTTP response bodies
7 years ago
Mike Fährmann
6f30cf4c64
change keyword names to valid Python identifiers
This commit mostly replaces all minus-signs ('-') in keyword names with
underscores ('_') to allow them to be used in filter-expressions. For
example 'gallery-id' got renamed to 'gallery_id'.
(It is theoretically possible to access any variable, regardless of its
name, with 'locals()["NAME"]', but that seems a bit too convoluted if
just 'NAME' could be enough)
7 years ago
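Why minus signs in keyword names get in the way of filter expressions can be shown with plain 'eval()'. This is a sketch under the assumption that a filter is a Python expression evaluated against the metadata dictionary; the helper name 'rename_keywords' and the sample values are hypothetical.

```python
def rename_keywords(metadata):
    """Hypothetical helper: replace '-' with '_' in every keyword name."""
    return {key.replace("-", "_"): value for key, value in metadata.items()}


old_style = {"gallery-id": 12345}
new_style = rename_keywords(old_style)

# 'gallery_id' is a valid identifier, so the expression can use it directly.
matches = eval("gallery_id > 1000", {"__builtins__": {}}, new_style)

# 'gallery-id' parses as the subtraction 'gallery - id', so the lookup fails.
try:
    eval("gallery-id > 1000", {"__builtins__": {}}, old_style)
    old_name_usable = True
except NameError:
    old_name_usable = False
```

The old name is still reachable through an explicit dictionary lookup, which is the more convoluted route the commit message alludes to.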
Mike Fährmann
915a0137de
improve 'extractor.request'
- add 'fatal' argument
- improve internal logic and flow
- raise known exception on error
- update exception hierarchy
7 years ago
Mike Fährmann
7aa9fa796a
code cleanup and fixes
7 years ago