Mike Fährmann
9a25534490
use Extractor._check_cookies() for all cookie checks
3 years ago
Mike Fährmann
63c6bc26b5
[rule34us] extract tags per category ( #1527 )
...
like for other boorus with 'tags': true
3 years ago
Mike Fährmann
f587458a3c
[twitter] include '4096x4096' as a default image fallback
...
(closes #2107 , closes #1881 )
3 years ago
Mike Fährmann
8ed282f7f2
[kemonoparty] support coomer.party URLs ( #2100 )
3 years ago
Mike Fährmann
87ce3fa669
[furaffinity] warn when no session cookies were found
3 years ago
Mike Fährmann
159631c808
[philomena] use a default 'filter_id' if none is given
3 years ago
Mike Fährmann
ad30653b17
allow running a BaseExtractor for any URL
...
by prefixing it with '<base-category>:'
For example:
shopify:https://partakefoods.com/products/crunchy-cookie-variety-pack
gelbooru_v01:https://5naf.booru.org/index.php?page=post&s=view&id=46963
Available base categories are:
mastodon, shopify, moebooru, gelbooru_v01, gelbooru_v02,
reactor, foolslide, foolfuuka, philomena
3 years ago
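A minimal sketch of how the '<base-category>:' prefix from commit ad30653b17 above could be split off a URL. The function name split_base_category and the set constant are hypothetical and are not taken from gallery-dl's code; only the category names and the example URLs come from the commit message.

```python
# Base categories as listed in the commit message.
BASE_CATEGORIES = {
    "mastodon", "shopify", "moebooru", "gelbooru_v01", "gelbooru_v02",
    "reactor", "foolslide", "foolfuuka", "philomena",
}


def split_base_category(url):
    """Split '<base-category>:<url>' into (category, url).

    Returns (None, url) unchanged when the text before the first ':'
    is not a known base category (for example a plain 'https' scheme).
    """
    prefix, sep, rest = url.partition(":")
    if sep and prefix in BASE_CATEGORIES:
        return prefix, rest
    return None, url
```

With this sketch, 'shopify:https://partakefoods.com/products/crunchy-cookie-variety-pack' yields the category 'shopify' plus the bare URL, while an ordinary 'https://...' URL passes through untouched.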
Mike Fährmann
299bd2f1f5
[rule34us] add 'tag' and 'post' extractors ( #1527 )
3 years ago
Mike Fährmann
3cf1075d86
[inkbunny] add 'search' extractor ( closes #2094 )
3 years ago
Mike Fährmann
c6a23c26d7
[instagram] allow downloading specific stories ( closes #2088 )
...
https://instagram.com/stories/<USER>/<ID> now only downloads the one
story specified by <ID> and not all stories from that user.
3 years ago
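The URL shape described in commit c6a23c26d7 above can be illustrated with a small pattern that captures the user and an optional story ID. This is only a sketch: the regular expression and the name parse_story_url are assumptions made for illustration, not the pattern used by the actual extractor.

```python
import re

# Hypothetical pattern: '/stories/<USER>' with an optional numeric '/<ID>'.
STORY_RE = re.compile(
    r"(?:https?://)?(?:www\.)?instagram\.com/stories/([^/?#]+)(?:/(\d+))?"
)


def parse_story_url(url):
    """Return (user, story_id) for a stories URL, or None if it does not match.

    story_id is None when the URL names only the user, in which case a
    caller would download all of that user's stories.
    """
    match = STORY_RE.match(url)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

A caller could then filter the fetched stories down to the single one whose ID equals story_id whenever it is not None.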
Mike Fährmann
352ffcddb0
[instagram] match post URLs with usernames ( fixes #2085 )
3 years ago
Mike Fährmann
f4e3cee6ac
use yt-dlp by default ( #1850 , #2028 )
3 years ago
Mike Fährmann
f1b142e993
[kemonoparty] change default 'files' order to attachments,file,inline
...
(#1991 )
3 years ago
Mike Fährmann
275543b2d2
update extractor test results
3 years ago
Mike Fährmann
e7ea4f2567
[mangoxo] fix metadata extraction
3 years ago
Mike Fährmann
e298882acc
[kemonoparty] match URLs with www subdomain
3 years ago
Mike Fährmann
addb72e1bb
[reactor] support thatpervert.com ( closes #2029 )
3 years ago
Mike Fährmann
d8d9502e1e
[reactor] inherit from BaseExtractor
3 years ago
Mike Fährmann
f4ea216c95
[shopify] support loungeunderwear.com ( closes #2053 )
3 years ago
Mike Fährmann
93cef78450
[gelbooru] workaround pagination limits
...
Gelbooru only allows retrieving the latest 20k posts for a tag search.
Add 'id:<N' to the search tags to work around that limitation, where N
is the ID of the last retrieved post.
http://gelbooru.me/index.php?page=forum&s=view&id=1467
3 years ago
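The 'id:<N' workaround from commit 93cef78450 above can be sketched as a loop that restarts the search below the last retrieved post ID. The names iter_all_posts and fake_fetch are hypothetical, and the sketch assumes that results arrive newest first (descending ID); fake_fetch is a tiny stand-in for the real search so that the example runs on its own.

```python
def iter_all_posts(fetch, tags):
    """Yield every post for 'tags', even if one search is capped.

    'fetch(query)' is assumed to return a list of post dicts sorted by
    descending 'id', truncated at some server-side limit.
    """
    query = tags
    while True:
        posts = fetch(query)
        if not posts:
            return
        yield from posts
        # Restart the search just below the last post we received.
        query = "{} id:<{}".format(tags, posts[-1]["id"])


# Stand-in backend: ten posts, at most four results per search.
_ALL_POSTS = [{"id": n} for n in range(10, 0, -1)]


def fake_fetch(query, limit=4):
    """Return up to 'limit' posts, honoring an optional 'id:<N' token."""
    max_id = None
    for token in query.split():
        if token.startswith("id:<"):
            max_id = int(token[4:])
    matching = [p for p in _ALL_POSTS if max_id is None or p["id"] < max_id]
    return matching[:limit]
```

Iterating iter_all_posts(fake_fetch, "some_tag") walks through all ten stand-in posts in three capped searches plus one final empty one.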
Mike Fährmann
f2ae179713
[exhentai] fix extraction for disowned galleries ( closes #2055 )
3 years ago
Alice
612850438e
[skeb] add 'thumbnails' option ( #2047 ) ( #2051 )
3 years ago
Mike Fährmann
11a3d96d13
[mangadex] load additional metadata using includes[] directives
...
- always provide 'artist', 'author', and 'group' metadata fields (#2049 )
- remove 'metadata' option
3 years ago
Mike Fährmann
19e00f1322
[dynastyscans] provide 'date' as proper datetime object ( #2050 )
3 years ago
Mike Fährmann
af6424f398
allow testing metadata in list elements
3 years ago
Mike Fährmann
c67756e187
[kemonoparty] add 'dms' option ( #2008 )
3 years ago
Mike Fährmann
3a7a19c7b9
[dynastyscans] add 'manga' extractor ( closes #2035 )
3 years ago
Mike Fährmann
9bc83af3a6
[kemonoparty] 'postfile' -> 'file' ( #1991 )
...
to stay consistent with the existing file types for kemono
3 years ago
Mike Fährmann
522782c09d
[subscribestar] emit metadata for posts without media ( #1569 )
3 years ago
Mike Fährmann
1c8aaf9318
[subscribestar] add 'num' enumeration index ( closes #2040 )
3 years ago
Mike Fährmann
d433735750
[kemonoparty] skip duplicate files ( #2032 , #1991 , #1899 )
...
Extract the SHA-256 file hash from URLs
and skip files with the same hash in the same post.
- provide a 'hash' metadata field (empty string if not available)
- remove 'patreon-skip-file' option
3 years ago
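A rough sketch of the duplicate check described in commit d433735750 above: pull a SHA-256 hash out of each file URL and skip files whose hash was already seen in the same post. It assumes the hash appears as a 64-character lowercase hex path segment; the function names and the regular expression are hypothetical, not gallery-dl's implementation.

```python
import re

# Assumption: the hash is a full path segment of 64 lowercase hex digits,
# followed by an extension, a query string, or the end of the URL.
_HASH_RE = re.compile(r"(?:^|/)([0-9a-f]{64})(?=[./?]|$)")


def extract_hash(url):
    """Return the SHA-256 hex digest found in 'url', or '' if there is none."""
    match = _HASH_RE.search(url)
    return match.group(1) if match else ""


def unique_files(urls):
    """Yield {'url', 'hash'} dicts for one post, skipping repeated hashes.

    Files without a recognizable hash are always kept, with 'hash' set
    to an empty string.
    """
    seen = set()
    for url in urls:
        file_hash = extract_hash(url)
        if file_hash:
            if file_hash in seen:
                continue
            seen.add(file_hash)
        yield {"url": url, "hash": file_hash}
```

The set of seen hashes is created per call, so the check stays scoped to a single post, matching the "in the same post" wording of the commit message.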
Mike Fährmann
d4ec245554
[kemonoparty] implement a 'files' option ( #1991 )
...
similar to 8d676151
3 years ago
Mike Fährmann
ab8eea1a24
[twitter] fix extractor for direct image links ( fixes #2030 )
3 years ago
Mike Fährmann
2076d40681
[ytdl] improve error handling ( #1680 )
3 years ago
Mike Fährmann
2aaac3c997
[instagram] include user metadata for 'tagged' downloads ( #2024 )
...
Adds
- tagged_owner_id
- tagged_full_name
- tagged_username
containing the values for the user profile the URL originated from,
e.g. 'instagram' for https://www.instagram.com/instagram/tagged/.
3 years ago
Mike Fährmann
cfa4876848
[philomena] support furbooru.org ( closes #1995 )
3 years ago
Mike Fährmann
4377f1c284
[twitter] distinguish between fatal & nonfatal errors ( #2020 )
...
only show a warning for nonfatal errors
and do not raise a StopExtraction exception
3 years ago
Kyle Anthony Williams
a14b72be21
[webtoons] Use swebtoon-phinf.pstatic.net instead of webtoon-phinf.pstatic.net ( #2005 )
...
* [webtoons] Use swebtoon-phinf.pstatic.net instead of webtoon-phinf.pstatic.net
This trick to avoid having to set a Referer header comes from
Webtoon's RSS feeds. The two URLs below are equivalent in content:
https://webtoon-phinf.pstatic.net/20210929_153/1632867980912DmcGK_JPEG/16328679808882705182.jpg?type=q90
https://swebtoon-phinf.pstatic.net/20210929_153/1632867980912DmcGK_JPEG/16328679808882705182.jpg?type=q90
The URL with the domain "webtoon-phinf.pstatic.net" needs a Referer
header, and the domain "swebtoon-phinf.pstatic.net" does not. This
is because of the environment "swebtoon" images live in, one without
explicit network control: RSS feeds on sites such as Feedly. This change should
make it easier for gallery-dl developers to embed Webtoon comics without
worrying about headers.
3 years ago
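The host swap described in commit a14b72be21 above amounts to a single string replacement. The helper name use_swebtoon_host is hypothetical; the two host names and the sample URL are the ones given in the commit message.

```python
def use_swebtoon_host(url):
    """Rewrite a webtoon-phinf image URL to its swebtoon-phinf equivalent.

    Matching on '://webtoon-phinf.' keeps URLs that already use the
    'swebtoon-phinf' host unchanged.
    """
    return url.replace("://webtoon-phinf.", "://swebtoon-phinf.", 1)
```

Applied to the first example URL from the commit message, this returns the second one; applying it again changes nothing.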
Mike Fährmann
6e3658ef52
[kemonoparty] provide 'date' metadata for gumroad ( #2007 )
...
Not the 'published' or 'edited' values since they are 'null',
but still better than nothing at all.
3 years ago
Mike Fährmann
37c9dedee1
[seisoparty] remove module
3 years ago
Mike Fährmann
efa178cc91
[ytdl] implement parsing ytdl command-line options ( #1680 )
...
- adds 'config-file' and 'cmdline-args' options
for both ytdl downloader and extractor
- create 'ytdl' helper module, which combines YoutubeDL creation
and option parsing.
- most likely a buggy mess due to incompatibilities between the
original youtube-dl and yt-dlp.
3 years ago
Mike Fährmann
7cb303d745
[redgifs] improve URL extraction
...
Fields inside 'urls' can be None, which would have caused an exception
with the old method.
3 years ago
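To illustrate the problem fixed in commit 7cb303d745 above, here is a sketch of a lookup that tolerates None values inside 'urls'. The format names 'hd', 'sd', and 'gif' and the function name pick_url are assumptions chosen for the example, not a statement about the actual extractor code.

```python
def pick_url(urls, formats=("hd", "sd", "gif")):
    """Return (format, url) for the first format with a usable URL.

    Entries that are missing or None are skipped instead of raising,
    and (None, None) is returned when nothing usable is found.
    """
    for fmt in formats:
        url = urls.get(fmt)
        if url:
            return fmt, url
    return None, None
```

Using dict.get() plus a truthiness check covers both a missing key and a key that is present but set to None.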
Mike Fährmann
2befed1a96
[redgifs] update search URL pattern ( #1984 )
3 years ago
Mike Fährmann
b315a0ecef
[redgifs] update to API v2 ( #1984 )
3 years ago
Mike Fährmann
f0fc3b0ba1
[kemonoparty] add 'comments' option ( #1980 )
3 years ago
Mike Fährmann
1fac74b14d
[reddit] prevent crash for galleries with no 'media_metadata'
...
(fixes #2001 )
3 years ago
Mike Fährmann
211de95dd0
update extractor test results
3 years ago
Mike Fährmann
8bea02c38c
[deviantart] fix 'index' values for stashed deviations
3 years ago
Mike Fährmann
dd88a7d980
[cyberdrop] restore video extraction ( fixes #1993 )
...
fixes a regression introduced in f33c2ef7
3 years ago
Mike Fährmann
fa5646eadc
[mangoxo] fix login and extraction
3 years ago