Marius Kaufmann
0aa8345a13
[mastodon] allow downloading without access token ( #2782 )
...
Most mastodon instances allow accessing /api/v1/accounts/XXXX/statuses and /api/v1/statuses/XXXX without an API access token.
This commit allows users to download at least some links from mastodon instances that do not already have access tokens hard-coded into the extractor.
The user extractor only works on links that include the user ID, such as https://mastodon.tld/@id:12345. Status links work as-is.
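A minimal sketch of how the two token-free endpoints named above could be addressed, assuming the instance root is known. The helper names are hypothetical and are not taken from the extractor itself.

```python
import re


def account_id_from_url(url):
    """Extract the numeric ID from a '/@id:12345' user URL, else None."""
    match = re.search(r"/@id:(\d+)", url)
    return match.group(1) if match else None


def account_statuses_endpoint(root, account_id):
    # Endpoint for all statuses of one account (hypothetical helper)
    return "{}/api/v1/accounts/{}/statuses".format(root, account_id)


def status_endpoint(root, status_id):
    # Endpoint for a single status (hypothetical helper)
    return "{}/api/v1/statuses/{}".format(root, status_id)
```

A user link without an 'id:' part yields None here, which mirrors the restriction described in the commit message.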
2 years ago
thatfuckingbird
ea5ffb19a6
fanbox: download cover images in original size ( #2784 )
2 years ago
Chew Shee Yang
977d53b640
[Instagram] Add support for user's saved collection ( #2769 )
...
* [Instagram] Add support for user's saved collection
* [Instagram] Run formatter
* [Instagram] Simplify collection_id retrieval and add metadata
* [Instagram] Fix bug when params is not passed to _pagination_api
2 years ago
blankie
5b63df46c0
[tumblr] attempt to get higher-quality images ( #2761 )
2 years ago
blankie
59b16b3f70
[artstation] add 'num' and 'count' metadata fields ( #2764 )
2 years ago
Mike Fährmann
eb68d45544
add global 'warnings' option ( #2762 )
2 years ago
Mike Fährmann
f225247670
[gelbooru] add support for `api_key` and `user_id` ( #2767 )
2 years ago
Mike Fährmann
77bdd8fe0f
[twitter] implement constant 'user' for 'from:…' searches
2 years ago
Mike Fährmann
a267a05a3f
[twitter] update 'quote_id' and 'quote_by'
...
- 'quote_id' is now non-null for quoted Tweets and has the ID of the
quoting Tweet, instead of the other way round like before
- 'quote_by' is now the 'screen_name' of the quoting user
(was the same as what the new 'quote_id' is now)
2 years ago
Mike Fährmann
749802c7bd
[twitter] update 'user' and 'author' fields
...
- 'author' is always the user who authored a tweet
- 'user' is always the user specified in the input URL
or equal to 'author' when the former is not given
2 years ago
Mike Fährmann
a566e63cdf
[tumblr] support '/blog/view' URLs ( #2760 )
2 years ago
Mike Fährmann
46f11a3118
[bunkr] fix extraction ( #2732 )
...
move bunkr.is code to its own module
2 years ago
Mike Fährmann
baf3815ebd
[nozomi] small code optimizations
2 years ago
blankie
836402bf58
[twitter] unescape content ( #2756 ) ( #2757 )
...
Fixes #2756
2 years ago
Mike Fährmann
62cc47755b
[nozomi] reduce memory consumption during searches ( #2754 )
...
only load and use the entire 'index.nozomi' database
if there are only negative search terms
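A minimal sketch of the idea, not the extractor's actual code: positive terms are intersected first, and the full index is loaded only when there is no positive term to start from. `load_tag` and `load_index` are hypothetical loaders that return sets of post IDs.

```python
def search(positive, negative, load_tag, load_index):
    """Return post IDs matching all positive and no negative tags."""
    if positive:
        # Start from the first positive tag; the full index is never loaded.
        result = set(load_tag(positive[0]))
        for tag in positive[1:]:
            result &= load_tag(tag)
    else:
        # Only negative terms: the full index is the only possible baseline.
        result = set(load_index())
    for tag in negative:
        result -= load_tag(tag)
    return result
```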
2 years ago
Mike Fährmann
467a2a4d35
[instagram] add 'pinned' metadata field ( #2752 )
...
'pinned' is a list of user IDs for which a post is pinned
and empty if not pinned anywhere.
2 years ago
Mike Fährmann
fe2b3d57d4
[komikcast] update domain
2 years ago
Mike Fährmann
4e11ca737e
[hentaifoundry] fix metadata extraction
2 years ago
Mike Fährmann
f2e59cc906
[slideshare] fix 'description' extraction
2 years ago
Mike Fährmann
31e868fca1
[khinsider] extract 'platform' metadata
2 years ago
Mike Fährmann
c6a9bab019
update extractor test results
2 years ago
Mike Fährmann
539e3bbed9
[weibo] handle invalid/broken status objects
2 years ago
Mike Fährmann
32c75d12e8
[sankaku] rewrite URLs to s.sankakucomplex.com ( #2746 )
2 years ago
Mike Fährmann
d5ded11aa8
[pixiv] fix default filenames for backgrounds
2 years ago
Mike Fährmann
e1f501ed14
[mangakakalot] update domain
2 years ago
Mike Fährmann
2dc57637cf
[foolfuuka] remove archive.wakarimasen.moe
2 years ago
Mike Fährmann
98744977cf
[itaku] fix 'date' parsing
2 years ago
Mike Fährmann
b590774f67
[twitter] add 'count' metadata field ( #2741 )
2 years ago
Mike Fährmann
7c0505868c
[kemonoparty] ensure all files have an 'extension' ( #2740 )
2 years ago
Mike Fährmann
e4f48cc810
make it easier to disable default 'browser' settings
...
Previously it was necessary to set 'browser' to a non-empty, non-string
value to disable any default 'browser' value.
Now '-o browser=' or '-o browser=false' is enough.
2 years ago
Mike Fährmann
92b75bcdce
limit path length for --write-pages output on Windows ( #2733 )
2 years ago
Mike Fährmann
311e9383af
[pinterest] handle section pins with separate extractors ( #2684 )
2 years ago
Mike Fährmann
1d14928bd9
[twitter] ignore previously seen Tweets ( #2712 )
...
occurs primarily for /with_replies results when logged in
2 years ago
Mike Fährmann
4b2a0a0eda
[twitter] implement 'strategy' option ( #2712 )
...
to be able to better control what Tweets get used and returned
for twitter.com/USER URLs.
2 years ago
Mike Fährmann
c794777600
[newgrounds] prevent exception on empty results ( #2727 )
2 years ago
Mike Fährmann
36ead45546
[itaku] fix caching bug ( #1842 )
...
ItakuApi.user() would always return the first user it was called with,
regardless of its 'username' argument.
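The class of bug described here can be illustrated with a small sketch. It uses `functools.lru_cache` as a stand-in and is not gallery-dl's own cache decorator; `fetch_user` is a hypothetical placeholder for the API request.

```python
import functools


def fetch_user(username):
    # Hypothetical stand-in for an API request.
    return {"username": username}


_first = {}


def user_buggy(username):
    # Bug: the cache has a single slot, so the argument is ignored
    # after the first call.
    if "user" not in _first:
        _first["user"] = fetch_user(username)
    return _first["user"]


@functools.lru_cache(maxsize=None)
def user_fixed(username):
    # The cache key includes 'username', so each user gets its own entry.
    return fetch_user(username)
```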
2 years ago
Mike Fährmann
127a190c94
[itaku] categorize sections by group ( #1842 )
2 years ago
Mike Fährmann
de20cadc68
add 'brotli' as optional dependency ( #2716 )
...
only send 'Accept-Encoding: br' if supported
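A rough sketch of such a check, assuming the optional dependency is importable as `brotli`. The function name and the base encodings are illustrative, not the project's actual header list.

```python
def accept_encoding():
    """Build an Accept-Encoding value, adding 'br' only when decodable."""
    encodings = ["gzip", "deflate"]
    try:
        import brotli  # noqa: F401  (optional dependency)
    except ImportError:
        pass
    else:
        encodings.append("br")
    return ", ".join(encodings)
```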
2 years ago
Mike Fährmann
37453a9528
[newgrounds] only login if necessary ( #2715 )
2 years ago
Mike Fährmann
7b073bf9ef
Revert "[twitter] improve strategy for user URLs ( #2665 )"
...
'user_tweets_and_replies' was a mistake
2 years ago
Mike Fährmann
3a5d5c3a91
update default User-Agent header to Firefox 102 ESR
...
and update headers and ciphers for "browser": "firefox"
2 years ago
Mike Fährmann
f8cfc3b08a
[skeb] add 'following' extractor ( #2698 )
2 years ago
Mike Fährmann
367a491128
[vk] get URLs from *_src entries ( #2535 )
...
https://github.com/mikf/gallery-dl/issues/2535#issuecomment-1166566986
2 years ago
Mike Fährmann
241e82e18d
[horne] add support for horne.red ( #2700 )
2 years ago
Mike Fährmann
7af4d2047b
[instagram] improve metadata generated by _parse_post_api()
...
(#2695 )
2 years ago
Mike Fährmann
3f50e2fb5f
[poipiku] add simple password support ( #1602 )
2 years ago
Mike Fährmann
9d8e99af80
[itaku] support videos ( #1842 )
2 years ago
Mike Fährmann
c8ec2c4e85
[itaku] add 'title' to default filenames ( #1842 )
2 years ago
Mike Fährmann
e0c60a1206
[itaku] metadata cleanup ( #1842 )
...
- parse 'date_added' as 'date'
- simplify 'tags', 'categorized_tags', and 'sections'
2 years ago
Mike Fährmann
27e8078fb7
[poipiku] add 'user' and 'post' extractors ( #1602 )
2 years ago
Mike Fährmann
fa902cd54d
[itaku] add 'gallery' and 'image' extractors ( #1842 )
2 years ago
Mike Fährmann
d6c6c8a4a0
[twitter] improve '"replies": "self"' ( #2665 )
...
If a username is given in the input URL,
only download from replies by that user.
2 years ago
Mike Fährmann
9c8d895d19
[twitter] implement 'csrf' option ( #2676 )
2 years ago
Mike Fährmann
08db8435f1
[twitter] fix pagination for conversation tweets
...
a relic from the switch to GraphQL API
2 years ago
Mike Fährmann
78d83345d3
[cyberdrop] add fallback URLs ( #2668 )
2 years ago
Mike Fährmann
834e900037
[unsplash] add collection_title and …_id metadata fields ( #2670 )
2 years ago
Mike Fährmann
6db77d4656
[weibo] support '?tabtype=video' listings ( #2601 )
2 years ago
Mike Fährmann
1da3ccf608
[twitter] implement 'expand' option ( #2665 )
2 years ago
Mike Fährmann
0add1fc090
[twitter] improve strategy for user URLs ( #2665 )
...
- use '/with_replies' when appropriate
- consider 'text-tweets'
- build search query as necessary
2 years ago
Mike Fährmann
45c980daf0
[weibo] fix retweets ( #2601 )
2 years ago
Mike Fährmann
ae1b24aa6a
[instagram] automatically invalidate expired login sessions
2 years ago
Mike Fährmann
47a92c8c7e
[instagram] provide 'date' for 'carousel_media' files ( #2660 )
2 years ago
Mike Fährmann
2064f20e11
[instagram] fix 'tag' extractor ( #2659 )
2 years ago
Mike Fährmann
6c0fa2f258
[readcomiconline] update
2 years ago
Mike Fährmann
61cbf8318c
[weibo] fix URLs generated by 'user' extractor ( #2601 )
2 years ago
Mike Fährmann
4b78bd423f
[paheal] add 'metadata' option ( #2641 )
2 years ago
Mike Fährmann
535cbcb185
cache extracted browser cookies
...
(in memory, for as long as gallery-dl is running)
Extracting encrypted cookies from a chromium-based browser can take a
long time, so repeating this process for each extractor should be
avoided.
Same goes for creating a temporary copy of the entire cookie database.
2 years ago
Mike Fährmann
541a61d344
[subscribestar] fix 'date' metadata ( #2642 )
...
Handle instances where the actual datetime information
is preceded by "Updated on "
2 years ago
Mike Fährmann
46d171c938
[instagram] fix stories ( #2644 )
...
fixing the fix ...
2 years ago
Mike Fährmann
e59bcb8437
[weibo] ensure media URLs use https://
2 years ago
Mike Fährmann
73f673e3ca
[weibo] handle 'gif' pictures
2 years ago
Mike Fährmann
345199a3ec
[pixiv] include '.gif' in background fallback URLs ( #2495 )
2 years ago
Mike Fährmann
57508d3bb7
[weibo] support all different 'tabtype' listings ( #686 , #2601 )
2 years ago
Mike Fährmann
2687ef6bd9
[nozomi] remove slashes from search terms ( fixes #2653 )
2 years ago
Mike Fährmann
ee7cea888e
[instagram]
...
it is now possible to use 'id:…' instead of a user's screen name:
- https://www.instagram.com/instagram/
- https://www.instagram.com/id:25025320/
similar to the same functionality for twitter:
a3b473bd2f
for /tagged/ URLs, using a user ID will only have 'tagged_owner_id'
defined. 'tagged_username' and 'tagged_full_name', which are available
when using a screen name, will not be defined.
2 years ago
Mike Fährmann
d0dc29f312
[instagram] fix stories ( #2644 )
2 years ago
Mike Fährmann
2fb01938f4
[instagram] fix and update extractors ( #2644 )
...
- use different way to fetch user IDs
- use new API endpoints for /tagged/ and single posts
2 years ago
Mike Fährmann
05d4a0215a
[sankaku] extend URL patterns ( fixes #2647 )
...
- support URLs with ISO 639-1 language codes
- support black.… and white.… subdomains
2 years ago
Mike Fährmann
e0ac358aa5
[gofile] fix 401 Unauthorized errors ( #2632 )
2 years ago
Mike Fährmann
8a42d859bf
[bunkr] change domain to 'app.bunkr.is' ( #2634 )
2 years ago
Mike Fährmann
7a9cba9c10
[weibo] add support for usernames in URLs ( #1662 )
2 years ago
Mike Fährmann
4bf5bc2403
[weibo] support 'livephoto' entries ( #2146 )
2 years ago
Mike Fährmann
a0692818af
[weibo] switch to desktop API ( #2601 )
2 years ago
Mike Fährmann
61fa9b535a
[paheal] improve metadata extraction ( #2641 )
...
- unescape 'tags'
- add 'date', 'source', and 'uploader' for single posts
2 years ago
Mike Fährmann
415c208c1f
[gfycat] cleanup
2 years ago
Mike Fährmann
a80ba17ed4
[gfycat] add 'collections' extractor ( #2629 )
2 years ago
Mike Fährmann
ff5e10a86d
[hypnohub] move to gelbooru_v02 instances ( #2631 )
2 years ago
Mike Fährmann
d6e744bf0f
[gfycat] add 'collection' extractor ( #2629 )
2 years ago
Mike Fährmann
4f7fe9b4be
[deviantart] fix folder listings with 'pagination: manual'
...
(#2488 )
2 years ago
Mike Fährmann
310fee99d5
[readcomiconline] remove automatic 'browser' setting ( #2625 )
2 years ago
Mike Fährmann
d4e9d51760
[reddit] add 'home' extractor ( #2614 )
2 years ago
Infinitay
f54525573b
[Instagram] Add tagged_users to keywords for stories ( #2582 ) ( #2584 )
2 years ago
thatfuckingbird
da0696e1f5
recognize vxtwitter URLs ( #2621 )
2 years ago
Mike Fährmann
dcb580240d
[twitter] extract alt texts as 'description' ( closes #2617 )
2 years ago
Mike Fährmann
915dba8345
[twitter] improve results for regular user URLs
...
- continuation of 3346f58a
- use media timeline results (or tweet timeline if retweets are enabled)
plus search results starting from the last tweet id of the first
timeline, similar to how Twitter Media Downloader operates
- the old behavior can be forced by appending '/tweets' to a user URL,
like with '/media' (https://twitter.com/USER/tweets )
although there should be no need to ever do that
2 years ago
Mike Fährmann
9df4e0f65b
[twitter] disable 'cards' by default
2 years ago
Mike Fährmann
79dce8ae68
[weasyl] implement 'metadata' option ( #2610 )
2 years ago
Mike Fährmann
9d5580a091
[khinsider] fix metadata extraction ( closes #2611 )
2 years ago
Mike Fährmann
688d6553b4
replace calls to print() with stdout_write() ( #2529 )
2 years ago
Mike Fährmann
86cbf485ab
[webtoons] extract real episode number ( #2591 )
...
The metadata field holding the number from the 'episode_no' query
parameter got renamed to 'episode_no'.
2 years ago
Mike Fährmann
82c1cc130b
[readcomiconline] update deobfuscation code ( #2481 )
2 years ago
Mike Fährmann
4005171db3
[pixiv] provide more metadata fields when option enabled ( #2594 )
2 years ago
Mike Fährmann
c8abb16c60
[mangahere] send Referer headers ( #2592 )
2 years ago
Mike Fährmann
3fd9249717
[mangafox] send Referer headers ( #2592 )
2 years ago
Mike Fährmann
90d28387ef
[instagram] detect empty story listings faster
2 years ago
Mike Fährmann
bd6ec5c352
[foolfuuka] match 4chan filenames ( #2577 )
...
introduce two new metadata fields:
- filename_media: original filename of file uploaded to 4chan
- timestamp_ms : timestamp with millisecond precision (tim)
2 years ago
Mike Fährmann
feb470d19a
[shopify] natively support a few more sites ( closes #2089 )
...
- chelseacrew.com
- michaels.com.au
- modcloth.com
- pinupgirlclothing.com
- raidlondon.com (loveraid.com)
- unique-vintage.com
2 years ago
Mike Fährmann
60f4d59b1e
[gelbooru_v01] remove 'tlb.booru.org' from supported domains
...
403 Forbidden
nginx
it is also no longer listed on https://booru.org/top
2 years ago
Mike Fährmann
6b6eb0b8f6
[lolisafe] implement 'domain' option ( #2575 )
2 years ago
Mike Fährmann
d26da3b9e5
add pre-generated 'pattern' for supported BaseExtractor sites
2 years ago
Mike Fährmann
6ae3a5cdb0
[pixiv] make retrieving ugoira metadata non-fatal ( #2562 )
2 years ago
Mike Fährmann
6742f3bc1e
implement --cookies-from-browser ( #1606 )
...
most of the code is adapted from yt-dlp's implementation
and *should* work the same.
2 years ago
Mike Fährmann
c4b9f7bab8
update functions working with cookies.txt files
...
- rename
- load_cookiestxt -> cookiestxt_load
- save_cookiestxt -> cookiestxt_store
- in cookiestxt_load, add cookies directly to a cookie jar
instead of storing them in a list first
- other unnoticeable performance increases
2 years ago
Mike Fährmann
f190018e37
[mangasee] use randomly generated PHPSESSID cookie ( #2560 )
2 years ago
Mike Fährmann
4c47dfffdd
[instagram] report redirects to captcha challenges ( #2543 )
2 years ago
Mike Fährmann
4598d32370
[imgur] prevent exception for empty albums ( closes #2557 )
2 years ago
Mike Fährmann
435e9c5d2e
[vk] report errors for private albums ( #2556 )
2 years ago
Mike Fährmann
9adea93aef
[pixiv] updates to avatar/background extractors ( #2495 )
...
- add 'date' metadata to avatar/background files when available
and use that in default filenames / archive ids
- remove deprecation warnings as their option names clash with
subcategory names
2 years ago
Mike Fährmann
3e6aba05ab
[vk] add fallback for user ID extraction ( #2535 )
2 years ago
Mike Fährmann
52b47c3cf9
[gelbooru_v01] add 'favorite' extractor ( #2546 )
2 years ago
Mike Fährmann
5b7423d14c
[vk] fix URLs for older photos ( #2535 )
2 years ago
Mike Fährmann
3346f58a2a
[twitter] use twMediaDownloader strategy for user URLs
...
- use media timeline + search for default user URLs like
https://twitter.com/SCREEN_NAME
- fetches all/most media for the type of twitter URL that most users
use with gallery-dl
- can be disabled by setting 'strategy' to any truthy value,
like "timeline"
2 years ago
Mike Fährmann
84756982e9
[pixiv] implement 'include' option
...
- split 'user' extractor and its 'avatar' and 'background' options into
separate extractors ('artworks', 'avatar', 'background')
- avatars can now be downloaded with
https://www.pixiv.net/en/users/ID/avatar
as URL and will use a proper archive key; similar for backgrounds
- options for the 'user' subcategory must be moved to 'artworks' to have
the same effect as before
2 years ago
Mike Fährmann
d11e2191ae
[nijie] support /history_nuita.php listings ( closes #2541 )
2 years ago
Mike Fährmann
4aca29b7b4
[naverwebtoon] support (best)challenge comics ( closes #2542 )
...
and update URL pattern to match URLs without '.nhn'
2 years ago
Mike Fährmann
3e926bd465
[realbooru] fix extraction ( fixes #2530 )
2 years ago
Mike Fährmann
82eee72b39
[pixiv] update API interface
...
- start all endpoints with '/'
- use extractor.wait() for rate limit
- retry with while loop instead of recursion
- in case of error, write entire response to debug log
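The switch from recursion to a loop can be sketched as follows. This is an illustration under assumptions, not the extractor's code: the response shape (a dict with an "error" key) and the `wait` callback are hypothetical.

```python
import time


def call_with_retry(request, wait=time.sleep, delay=1.0, max_tries=5):
    """Call request() until it succeeds, waiting between rate-limited tries."""
    tries = 0
    while True:
        response = request()
        if response.get("error") != "rate limit":
            return response
        tries += 1
        if tries >= max_tries:
            raise RuntimeError("rate limit not lifted after %d tries" % tries)
        # Wait before the next attempt; no new stack frame is created.
        wait(delay)
```

Unlike a recursive retry, the loop keeps the call stack flat no matter how many attempts are needed.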
2 years ago
Mike Fährmann
1bc77efa02
[artstation] use "browser": "firefox" by default ( #2527 )
2 years ago
Mike Fährmann
a39e7b7366
[vk] handle photos without width/height info ( fixes #2535 )
2 years ago
Federico Ravasio
0381752575
[photovogue] switch to .com, update api endpoint ( #2494 )
2 years ago
Mike Fährmann
3f02e483c6
[e621] fix applying request_interval_min ( #2533 )
...
Setting this property after calling Extractor.__init__() has no effect.
2 years ago
Mike Fährmann
afde76269c
[weibo] fix infinite retries for deleted accounts ( fixes #2521 )
2 years ago
Mike Fährmann
d85e66bcac
[vk] fix extraction ( #2512 )
...
Use a different API endpoint, since thumbnail URLs from the old one
cannot be transformed into URLs for "original" photos anymore.
2 years ago
Mike Fährmann
9e6ff42a9d
[pixiv] implement 'background' option ( #623 , #1124 , #2495 )
2 years ago
Mike Fährmann
4d1896830f
[mangadex] download chapters with 'externalUrl' ( fixes #2503 )
...
if they have pages hosted on mangadex
2 years ago
Mike Fährmann
97e8a15295
[deviantart] implement 'pagination' option ( #2488 )
2 years ago
Mike Fährmann
1f9a0e2fd8
update extractor test results
2 years ago
Mike Fährmann
ad5a4b1756
[twitter] fix various syndication issues
...
- handle retweets
- fix videos without dimensions in URL (3e942a58)
- fix '"retweets": "self"' filter (#2499 )
2 years ago
Mike Fährmann
12bd9ba33a
[readcomiconline] add 'quality' option ( #2467 )
2 years ago
Mike Fährmann
60ad46ddcc
[readcomiconline] unobfuscate image URLs ( #2481 )
2 years ago
Mike Fährmann
a6c4ff58fb
[cyberdrop] match cyberdrop.to URLs ( closes #2496 )
2 years ago
Mike Fährmann
13ed18b9aa
[lolisafe] fix typo
...
LolisafelbumExtractor -> LolisafeAlbumExtractor
2 years ago
Mike Fährmann
3e942a58be
[twitter] improve syndication video selection ( #2354 )
...
- ignore .m3u8 manifests
- always select largest format
2 years ago
Mike Fährmann
0794027100
[issuu] fix extraction ( #2483 )
2 years ago
Mike Fährmann
5d5a08cc69
[sexcom] add fallback for empty files ( #2485 )
2 years ago
thatfuckingbird
4527a35aba
[twitter] accept fxtwitter.com URLs ( #2484 )
2 years ago
Mike Fährmann
c1768972c2
[newgrounds] update and fix pagination ( #2456 )
2 years ago
Mike Fährmann
78e5d0c423
[kissgoddess] extract all images ( closes #2473 )
...
and not only the first two per page
https://github.com/mikf/gallery-dl/issues/1052#issuecomment-1047367383
2 years ago
Mike Fährmann
0b33435da5
[pinterest] support multiple files per pin ( closes #1619 , #2452 )
2 years ago
Mike Fährmann
9c5d2d7af3
[pinterest] add extractor for created pins ( #2452 )
3 years ago
Mike Fährmann
1171911dc3
[twitter] add 'syndication' option ( #2354 )
...
to fetch age-restricted content using Twitter's syndication API
3 years ago
Mike Fährmann
a53cfc845e
[newgrounds] warn about age-restricted posts ( #2456 )
3 years ago
Mike Fährmann
ecee315bbf
[mangasee] unescape manga names ( fixes #2454 )
3 years ago
loragja
7e545a3ae9
[gofile] add gofile.io extractor ( #2364 )
...
* Add gofile extractor
* add gofile extractor to module list
* add support for tiny monitors and ancient python versions
* seriously, f-strings are not *that* new...
* i love flake8 :)
* add 'api-token' and 'recursive' options
* add tests
3 years ago
Layerex
625f4d4cc4
[telegraph] Add telegra.ph extractor ( #2312 )
3 years ago
Mike Fährmann
48cc4853be
[skeb] refactor 'sent-requests' and add tests
3 years ago
Mike Fährmann
37d584a9b2
[hitomi] update metadata extraction ( fixes #2444 )
...
remove 'hitomi.metadata' option, as it is no longer necessary
to make additional HTTP requests to fetch all metadata.
3 years ago
Mike Fährmann
b03ca7f10c
[aryion] provide correct 'date' independent of dst
3 years ago
Mike Fährmann
ba69fb669d
[kemonoparty] add 'duplicates' option ( closes #2440 )
3 years ago
Mike Fährmann
29db716a63
implement 'datetime_to_timestamp()'
...
and rename 'to_timestamp()'
to the more descriptive 'datetime_to_timestamp_string()'
3 years ago
Mike Fährmann
9313d4dc10
[pinterest] do not force 'm3u8_native' for video downloads ( #2436 )
3 years ago
Mike Fährmann
42f2fd2ed7
[twibooru] fix posts without 'name' ( fixes #2434 )
3 years ago
chinggg
6f1d5e8ab9
[unsplash] replace dash with space in search API queries ( #2429 )
3 years ago
Mike Fährmann
f8230dde43
[instagram] add 'previews' option ( #2135 )
3 years ago
Mike Fährmann
500a479026
fix a third(!) bug in _check_cookies() ( #2372 )
...
turns out tests are worthless if you get em wrong ...
3 years ago
Mike Fährmann
c4cc387f7d
[furaffinity] fix search result pagination ( fixes #2402 )
3 years ago
Mike Fährmann
281a5b3b28
[newgrounds] fix video descriptions ( #2328 )
3 years ago
Mike Fährmann
b1b15d6cef
[imagebam] add support for /view/ paths ( closes #2378 )
3 years ago
Mike Fährmann
e64c2b85d0
[fantia] apply patch ( #2381 )
...
from @thatfuckingbird with small adjustments
https://github.com/mikf/gallery-dl/issues/2381#issuecomment-1063208696
3 years ago
Mike Fährmann
f31ab0d2ec
[fanbox] fetch data for each individual post ( fixes #2388 )
...
Posts from 'https://api.fanbox.cc/post.listCreator '
do not contain a 'body' with all images anymore.
https://github.com/mikf/gallery-dl/pull/1459#discussion_r614322881
3 years ago
Mike Fährmann
fc277fa45f
[seiga] require authentication with 'user_session' cookie ( #2372 )
...
Login with username & password would now require entering a 2FA token.
see also 7b009cc893
3 years ago
Mike Fährmann
47cf05c4ab
refactor proxy handling code ( #2357 )
...
- allow gallery-dl proxy settings to overwrite environment proxies
- allow specifying different proxies for data extraction and download
- add 'downloader.proxy' option
- '-o extractor.proxy=PROXY_URL -o downloader.proxy=null'
now has the same effect as youtube-dl's '--geo-verification-proxy'
3 years ago
Mike Fährmann
d50a1ec2cc
[subscribestar] unescape attachment URLs ( fixes #2370 )
3 years ago
Mike Fährmann
3ddc620ef6
[skeb] fix post extractor ( #2330 )
3 years ago
Orkun Koçyiğit
eb2bb7d998
[fantia] add 'num' enumeration index ( #2377 )
...
* Adding numerical ordering to fantia
* Fixed line to fit PEP8 line size limit
3 years ago
Mike Fährmann
fac8047899
[kemonoparty] limit default filename length ( #2373 )
3 years ago
Mike Fährmann
bfa5e61900
[patreon] add explicit 'image_large' file type ( #2257 )
...
to allow more control over when and if to download 'large_url' images
4fee3a0e52 forced them to be downloaded instead of regular images,
even though 'large_url' images are most likely an upscaled version
of the original.
3 years ago
Mike Fährmann
6ea3ff5173
[tumblr] notify users about registering an oauth application
...
if they hit the daily rate limit and are using default API credentials
3 years ago
Mike Fährmann
b5236656d5
[deviantart] notify users about registering an oauth application
...
if they get repeated 429 errors and are using default API credentials
3 years ago
Mike Fährmann
2aa47e8382
[twitter] handle Tweets with "softIntervention" entries
...
or other such things where the actual Tweet data is one level deeper
than usual
3 years ago
Mike Fährmann
64bbc7969d
[twitter] warn about age-restricted Tweets ( #2354 )
3 years ago
Mike Fährmann
e778be52bc
[twitter] update query hashes
3 years ago
Mike Fährmann
bddcec49f1
implement 'text.root_from_url()'
...
use domain from input URL for kemono
3 years ago
Mike Fährmann
92c492dc09
[kemonoparty] match beta.kemono.party URLs ( #2348 )
3 years ago
Mike Fährmann
4ea9157d51
[mangadex] fix chapters without 'translatedLanguage' ( #2352 )
3 years ago
Alice
f1cab23724
[skeb] add 'sent-requests' option ( #2322 ) ( #2330 )
...
* Update skeb.py
* Update configuration.rst
* flake8
3 years ago
dragobit
781fdfa212
[hentaicosplays] add Referer to headers ( #2317 )
3 years ago
Mike Fährmann
4385a34e05
[twitter] fix handling of 429 responses ( fixes #2339 )
...
Twitter doesn't return a valid JSON response for 429 errors anymore.
3 years ago
Mike Fährmann
5a50569360
[toyhouse] support 'art' listings ( #1546 , #2331 )
3 years ago
Mike Fährmann
1c79044433
[imagebam] set 'nsfw_inter' cookie ( fixes #2334 )
3 years ago
Mike Fährmann
d71c173150
[newgrounds] strip incomplete HTML tag from '_comment' ( #2328 )
3 years ago
Mike Fährmann
cf58048bd4
[newgrounds] add 'post_url' metadata field ( #2328 )
3 years ago
Mike Fährmann
7aa2e2cd84
[slideshare] fix extraction
3 years ago
Mike Fährmann
fdfdc1b614
[kissgoddess] add 'gallery' and 'model' extractors
...
(closes #1052 , #2304 )
3 years ago
Mike Fährmann
79a461a2c1
[mememuseum] add 'tag' and 'post' extractors ( closes #2264 )
3 years ago
Mike Fährmann
e5f6af6e32
[oauth:pixiv] add note about 'code' expiring in 30 seconds ( #2306 )
3 years ago
Mike Fährmann
bbc4190017
[bunkr] fix .mp4 downloads ( #2239 )
...
again ...
3 years ago
Mike Fährmann
254a5b26e0
[twibooru] add extractors for searches, galleries, and posts
...
(#2219 )
3 years ago
Mike Fährmann
9ebc20e290
[booru] call nameext_from_url() before update() and _prepare()
...
to be able to overwrite filename and extension in _prepare()
3 years ago
Mike Fährmann
4fee3a0e52
[patreon] download 'large_url' images if available ( #2257 )
3 years ago
Mike Fährmann
f5b2b9333f
fix another bug in _check_cookies() ( #2160 )
...
regression introduced in ed317bfc
Added a couple of tests to hopefully catch such bugs
before they land in a release.
3 years ago
Ailothaen
203a04a4a3
[reddit] Support of standalone submissions on personal pages of users ( #2301 )
...
* [reddit] Support of submissions on personal pages of users
* [reddit] Design improvement for user submissions
* [reddit] Removed functions declared twice
3 years ago
Mike Fährmann
806bc62379
[redgifs] support 'i.redgifs.com' URLs ( closes #2300 )
3 years ago
Mike Fährmann
655b2de5d9
[vk] fix infinite pagination loops ( fixes #2297 )
3 years ago
Mike Fährmann
cc5b1ce91a
[inkbunny] rename search parameters to their API equivalents
...
(fixes #2292 )
3 years ago
Mike Fährmann
ed317bfcf1
warn about cookies expiring in less than 24 hours
...
requires an expiration timestamp,
so this only works with cookies from a cookies.txt file
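A minimal sketch of such a check, assuming each cookie exposes a name and an optional expiration timestamp in seconds. The `Cookie` record is a hypothetical simplification; real cookie objects carry more fields.

```python
import time
from collections import namedtuple

# Hypothetical minimal cookie record used only for this illustration.
Cookie = namedtuple("Cookie", "name expires")


def expiring_soon(cookies, now=None, threshold=24 * 3600):
    """Return names of cookies that expire within 'threshold' seconds."""
    now = time.time() if now is None else now
    names = []
    for cookie in cookies:
        if cookie.expires is None:
            continue  # no expiration timestamp, nothing to check
        if 0 <= cookie.expires - now < threshold:
            names.append(cookie.name)
    return names
```

Cookies without a timestamp are skipped, which is why the warning depends on a source that records expiration times.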
3 years ago
David Hoppenbrouwers
b17e2dcf93
[wallpapercave] add extractor for images ( #2205 )
3 years ago
v-delta
c661737f36
[Imgbox] Fix ImgboxExtractor ( #2281 )
3 years ago
Thomas Jost
a7de819aca
[lightroom] add Lightroom gallery extractor ( #2263 )
3 years ago
Mike Fährmann
563bd0ecf4
[danbooru] inherit from BaseExtractor
...
- merge danbooru and e621 code
- support booru.allthefallen.moe (closes #2283 )
- remove support for old e621 tag search URLs
3 years ago
Mike Fährmann
bc0e853d30
combine KeyError & IndexError to common base class LookupError
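In Python, both KeyError and IndexError derive from LookupError, so one handler covers both. A small sketch with a hypothetical `first_tag` helper:

```python
def first_tag(post):
    """Return the first tag of a post, or None if it cannot be looked up."""
    try:
        return post["tags"][0]
    except LookupError:
        # Covers both a missing "tags" key (KeyError)
        # and an empty tag list (IndexError).
        return None
```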
3 years ago
Mike Fährmann
f1c853c6ef
[furaffinity] add 'layout' option ( #2277 )
...
to be able to force gallery-dl to parse according to a specific layout
in case its auto-detect fails
3 years ago
Mike Fährmann
b4f8e15a1f
allow BaseExtractors to use the domain of the matched URL
3 years ago
Mike Fährmann
a57a44f510
[kemonoparty] handle files without 'name' ( fixes #2276 )
3 years ago
Mike Fährmann
4efe56f419
[furaffinity] improve new/old layout detection ( fixes #2277 )
3 years ago
Mike Fährmann
0f1e7ff319
[twitter] fix extraction ( #2275 )
3 years ago
Mike Fährmann
dee0d22561
update extractor test results
3 years ago
Mike Fährmann
d7b8e04b50
[kemonoparty] use 'Accept-Encoding: identity' for all downloads
...
(#2267 )
fixes issues when data sent with 'Content-Encoding: gzip' or other
encodings is larger than the actual file
3 years ago
enormous-muscles
55326377d8
Add Kohlchan extractor ( #2251 )
3 years ago
Mike Fährmann
cc7dce5755
[sexcom] add 'pins' extractor ( closes #2265 )
3 years ago
Mike Fährmann
02e18f56be
[e621] add 'favorite' extractor ( closes #2250 )
3 years ago
Mike Fährmann
70e6e1549e
[twitter] provide fallback URLs for card images
...
f2e8aedd74 (commitcomment-64057751)
3 years ago
Mike Fährmann
86fa412b47
[hitomi] add 'format' option ( #2260 )
...
default is 'webp' since downloading original files is no longer allowed
3 years ago
Mike Fährmann
492436f936
[twitter] add 'warnings' option ( #2258 )
...
disable reporting any non-fatal errors by default
3 years ago
Mike Fährmann
a5163e4c70
[twitter] restore 'logout' functionality ( #1719 )
3 years ago
Mike Fährmann
f58364f6a8
update Firefox cipher list
3 years ago
Mike Fährmann
7e6981dda6
rename 'disabletls12' to 'tls12'
...
and let config options override any default settings
3 years ago
Mike Fährmann
bb3e182562
overhaul session initialization
...
- share adapter & connection pool across sessions with the same
ssl options, ssl ciphers, and source address
- simplify browser emulation to just a list of headers and ciphers
3 years ago
Mike Fährmann
e670dc518e
[weibo] update pagination code ( fixes #2244 )
...
- send proper headers and query parameters
- use 'since_id' instead of page numbers
- set a 1-2 second delay between requests
3 years ago
Robert Pendell
4c651f6252
[patreon] Disable TLS 1.2 by default ( #2249 )
...
Disables TLS 1.2 on Patreon by default.
3 years ago
Robert Pendell
392cf079f7
Add ability to disable TLS 1.2 ( #2243 )
...
Fix for Patreon Cloudflare issues by having only TLS v1.3 or higher establish HTTPS connections
This now allows you to disable it on a per-host or global basis. Add disabletls12 as a config option either under extractor.(host) or just under extractor. Option is false by default.
Example:
"patreon":
{
    "disabletls12": true,
    "cookies": {
        "session_id": "X"
    }
}
3 years ago
Mike Fährmann
d33227fc38
[twitter] restore errors for protected timelines etc ( fixes #2237 )
3 years ago
Mike Fährmann
ebd3d5c1cc
[bunkr] fix .mp4 downloads ( closes #2239 )
3 years ago
Mike Fährmann
e2be199124
[gelbooru] improve and fix pagination ( #2230 , #2232 )
...
Use 'id:<POSTID' as a tag instead of going through pages with 'pid'.
Something similar was already implemented in 93cef784,
but that got broken again in 3085aac4.
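The ID-based approach can be sketched roughly like this, assuming results arrive newest-first. `fetch` is a hypothetical callable that takes a tag string and a limit and returns a list of post dicts; it is not the extractor's real API method.

```python
def paginate(fetch, tags, per_page=100):
    """Yield posts newest-first by narrowing the search with 'id:<N'."""
    query = tags
    while True:
        posts = fetch(query, per_page)
        yield from posts
        if len(posts) < per_page:
            return
        # Continue below the lowest ID seen so far instead of using 'pid'.
        query = "{} id:<{}".format(tags, posts[-1]["id"])
```

Because each request is bounded by the last ID seen, posts added while the download runs do not shift the remaining results.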
3 years ago
Mike Fährmann
8230f31800
[twitter] update query hashes
3 years ago
Mike Fährmann
c180806cec
[twitter] fix deleted/invalid retweets ( #2225 )
3 years ago
Mike Fährmann
a2eecc6aa8
[kemonoparty] fix DMs extraction ( #2008 )
3 years ago
Mike Fährmann
2bf554a896
[twitter] fix several errors ( #2212 , #2216 , #2225 )
...
- fix Tweets with deleted quotes
- fix suspended Tweets without 'legacy' entry
- fix unified_cards without 'type'
3 years ago
Mike Fährmann
e5242b83bf
[twitter] define directory format for events ( #2109 )
3 years ago
Mike Fährmann
efb3e65a6a
[sexcom] extend URL pattern ( fixes #2220 )
3 years ago
vsyx
3f2b6335d7
[instagram] fix highlights extraction ( #2197 )
...
* [instagram] fix highlights extraction
* [instagram] improve highlights extraction
- 'yield' individual reels instead of collecting them in a list
and returning them all at once
- reduce 'chunk_size' to an even safer value
(instagram.com also uses 5)
3 years ago
Mike Fährmann
5ed26e1773
[twitter] fix pinned tweets ( #2216 )
...
caused by the changes in dffa440ede
3 years ago
Mike Fährmann
a9f78e6527
[twitter] improve error handling
...
- handle accounts without 'rest_id'
- handle timelines with empty 'instructions'
3 years ago
Mike Fährmann
729b07c1f5
[twitter] simplify
...
- use dict with common GraphQL variables
- reduce 'variables' size with custom JSON encoder instance
- centralise TwitterAPI() creation
3 years ago
Mike Fährmann
7cb29224f0
[philomena] fix search parameter escaping ( #2215 )
...
The pluses from search terms in /tags/ URLs need to be
replaced with spaces to get accepted by Philomena.
3 years ago
Mike Fährmann
9ca8bb2dc0
[twitter] improve error handling
3 years ago
Mike Fährmann
9a221494c3
[twitter] add 'event' extractor ( closes #2109 )
3 years ago
Mike Fährmann
14867dad6b
[twitter] fix unified cards from search results
3 years ago
Mike Fährmann
dffa440ede
[twitter] improve handling of deleted tweets ( #2212 )
3 years ago
Mike Fährmann
54ef874ba4
[twitter] fix retweet filter ( #2212 )
3 years ago
Mike Fährmann
cb43f7731b
[twitter] update to GraphQL API ( #2212 )
...
The old REST API endpoints, which Twitter has not used since
summer 2021, are finally going to be phased out, it seems, with
'/2/timeline/profile/USERID.json' being the first one.
Only Twitter's search doesn't have a GraphQL interface yet.
3 years ago
Mike Fährmann
de754590e0
add --source-address command-line option ( closes #2206 )
3 years ago
Mike Fährmann
698f35215e
[blogger] support new image domain ( fixes #2204 )
3 years ago
Mike Fährmann
c587b678d0
[mangadex] re-enable warning for external chapters ( #2193 )
3 years ago
Mike Fährmann
f2e8aedd74
[twitter] changes to 'cards' option
...
- change default value to 'true'
- only invoke youtube-dl for cards unsupported by gallery
when 'cards' is set to "ytdl"
"cards": true --> only download card images
"cards": "ytdl" --> download card images and
use youtube_dl on otherwise unsupported cards
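The two settings above can be read as a small decision table. The sketch below uses a hypothetical helper, `card_actions`, that is not part of the project; it only restates the behavior described in this message.

```python
def card_actions(cards):
    """Return the actions implied by a 'cards' option value."""
    if cards == "ytdl":
        # Download card images and hand unsupported cards to youtube-dl.
        return ["images", "ytdl"]
    if cards:
        # Any other truthy value (the new default 'true'): images only.
        return ["images"]
    # Falsy value: cards are ignored.
    return []
```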
3 years ago
Mike Fährmann
2d34d8ff8b
[reddit] allow downloading from quarantined subreddits ( #2180 )
3 years ago
Mike Fährmann
17c9c47ca0
[hitomi] fix 'tag' extraction ( fixes #2189 )
3 years ago
Mike Fährmann
df2f0c09bb
[twitter] support "image_carousel_website" unified cards
3 years ago
Mike Fährmann
cdc96e1217
[gelbooru] improve video file detection ( fixes #2188 )
...
not all files from 'https://video-cdnN.gelbooru.com' are videos
3 years ago
Mike Fährmann
4acc31bd9f
[newgrounds] set suitabilities filter before starting a search
3 years ago
Mike Fährmann
170711af7e
[mangadex] fix extraction ( closes #2177 )
3 years ago
Mike Fährmann
199e7616a7
[rule34] use https://api.rule34.xxx for API requests
3 years ago
Mike Fährmann
37beb1298e
[newgrounds] add 'search' extractor ( closes #2161 )
3 years ago
Mike Fährmann
8b910dd8ae
[hitomi] fix image URLs
...
again and again ...
3 years ago
Mike Fährmann
3085aac4d8
[gelbooru] handle changed API response format ( #2157 )
3 years ago
Mike Fährmann
38e2af29d6
[hitomi] fix image URLs
...
update '_parse_gg()' yet again
3 years ago
Mike Fährmann
6f2e0c9c3d
fix cookie checks for patreon, fanbox, fantia
...
The changes in 9a255344 caused a warning about missing cookies to be
displayed even if those cookies were present, because _check_cookies()
did not account for an empty cookiedomain.
3 years ago
Mike Fährmann
1e0278702d
[hitomi] update '_parse_gg()'
3 years ago
Mike Fährmann
becc7f85a6
[hitomi] fix image URLs
3 years ago
Mike Fährmann
6af8d71da6
[kemonoparty] use service as subcategory ( closes #2147 )
3 years ago
Vrihub
96fcff182c
generic extractor ( #735 )
...
* Generic extractor, see issue #683
* Fix failed test_names test, no subcategory needed
* Prefix directory_fmt with "generic"
* Relax regex (would break some urls)
* Flake8 compliance
* pattern: don't require a scheme
This fixes a bug when we force the generic extractor on urls without a
scheme (that are allowed by all other extractors).
* Fix using g: and r: on urls without http(s) scheme
Almost all extractors accept urls without an initial http(s) scheme.
Many extractors also allow for generic subdomains in their "pattern"
variable; some of them implement this with the regex character class
"[^.]+" (everything but a dot).
This leads to a problem when the extractor is given a url starting
with g: or r: (to force using the generic or recursive extractor)
and without the http(s) scheme: e.g. with "r:foobar.tumblr.com"
the "r:" is wrongly considered part of the subdomain.
This commit fixes the bug, replacing the too generic "[^.]+" with the
more specific "[\w-]+" (letters, digits and "-", the only characters
allowed in domain names), which is already used by some extractors.
* Relax imageurl_pattern_ext: allow relative urls
* First round of small suggested changes
* Support image urls starting with "//"
* self.baseurl: remove trailing slash
* Relax regexp (didn't catch some image urls)
* Some fixes and cleanup
* Fix domain pattern; option to enable extractor
Fixed the domain section for "pattern", to pass "test_add" and
"test_add_module" tests.
Added the "enabled" configuration option (default False) to enable the
generic extractor. Using "g(eneric):URL" forces using the extractor.
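The subdomain problem described above can be reproduced with two simplified patterns. These are illustrative stand-ins, not the patterns used by any actual extractor; `LOOSE`, `STRICT` and `subdomain` are hypothetical names.

```python
import re

# Simplified stand-ins for an extractor "pattern" with a generic subdomain.
LOOSE = re.compile(r"(?:https?://)?([^.]+)\.tumblr\.com")
STRICT = re.compile(r"(?:https?://)?([\w-]+)\.tumblr\.com")


def subdomain(pattern, url):
    """Return the captured subdomain, or None if the pattern does not match."""
    match = pattern.match(url)
    return match.group(1) if match else None


# "[^.]+" swallows the "r:" prefix as part of the subdomain ...
print(subdomain(LOOSE, "r:foobar.tumblr.com"))   # r:foobar
# ... while "[\w-]+" cannot cross the ":" and so rejects the prefixed URL,
print(subdomain(STRICT, "r:foobar.tumblr.com"))  # None
# but still accepts a scheme-less URL without the prefix.
print(subdomain(STRICT, "foobar.tumblr.com"))    # foobar
```

Note that in Python `\w` also matches the underscore, so the stricter class is slightly wider than "letters, digits and '-'".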
3 years ago
Mike Fährmann
4376b39a2b
[sexcom] fix and improve embed extraction ( fixes #2145 )
3 years ago
Mike Fährmann
6d190834ee
[instagram] fix error when PostPage data is not in GraphQL format
...
(#2037 )
3 years ago
Mike Fährmann
dd67e24aa9
[lolisafe] include file ID in filenames
...
More precisely, it now splits the full 'filename' into 'name' and 'id'
instead of overwriting 'filename'. The format string stays the same as
before. Use '{name}.{extension}' to restore the old behavior.
before:
- filename: foobar
- id : 12345
now:
- filename: foobar-12345
- name : foobar
- id : 12345
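A minimal sketch of such a split, assuming the ID is whatever follows the last '-' in the full filename; `split_filename` is a hypothetical helper, not the extractor's code.

```python
def split_filename(filename):
    """Split 'foobar-12345' into 'name' and 'id' while keeping 'filename'."""
    name, sep, file_id = filename.rpartition("-")
    if not sep:
        # No '-' present: treat the whole string as the name.
        name, file_id = filename, ""
    return {"filename": filename, "name": name, "id": file_id}
```

With these fields, a format string of '{name}.{extension}' would produce the old, ID-less filenames.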
3 years ago
Mike Fährmann
f3d61de18d
[artstation] create directories per asset ( closes #2136 )
3 years ago
Mike Fährmann
49a50fb2eb
[500px] create directories per photo
3 years ago
Mike Fährmann
89bebe1bef
[500px] add 'favorite' extractor ( closes #1927 )
3 years ago
Mike Fährmann
22b0433985
[fanbox] support pixiv redirects ( closes #2122 )
3 years ago
Mike Fährmann
281828b58b
[tumblrgallery] improve search pagination ( fixes #2132 )
3 years ago
Mike Fährmann
4bec34fc94
[pixiv] allow setting a date range for search results ( #2133 )
...
with the 'scd' and 'ecd' query parameters
3 years ago
Mike Fährmann
882c614281
add album extractor for lolisafe/chibisafe instances
...
- support bunkr.is (closes #2038 )
- support zz.ht (closes #2105 )
3 years ago
Mike Fährmann
d441888bfb
[deviantart] adjust API endpoints
...
Start all endpoints with a forward slash '/'
to be consistent with other API interfaces.
3 years ago
Mike Fährmann
8f0cf0bf71
[deviantart] use '/browse/newest' for most-recent searches
...
(#2096 )
3 years ago
Mike Fährmann
0bd7607da5
[tumblrgallery] improve 'id' extraction ( #2115 )
3 years ago
Mike Fährmann
0d02a7861e
[tumblrgallery] fix extraction ( closes #2112 )
3 years ago
Mike Fährmann
62692c6842
[exhentai] add 'source' option
...
setting it to "hitomi" downloads the corresponding gallery from
hitomi.la; might be extended to other sources in the future
3 years ago
Mike Fährmann
099ed72de7
[hitomi] disable extra 'metadata' by default
...
saves one HTTP request that is not needed with default filename settings
3 years ago
Mike Fährmann
9a25534490
use Extractor._check_cookies() for all cookie checks
3 years ago
Mike Fährmann
63c6bc26b5
[rule34us] extract tags per category ( #1527 )
...
like for other boorus with 'tags': true
3 years ago
Mike Fährmann
f587458a3c
[twitter] include '4096x4096' as a default image fallback
...
(closes #2107 , closes #1881 )
3 years ago
Mike Fährmann
8ed282f7f2
[kemonoparty] support coomer.party URLs ( #2100 )
3 years ago
Mike Fährmann
87ce3fa669
[furaffinity] warn when no session cookies were found
3 years ago
Mike Fährmann
159631c808
[philomena] use a default 'filter_id' if none is given
3 years ago
Mike Fährmann
ad30653b17
allow running a BaseExtractor for any URL
...
by prefixing it with '<base-category>:'
For example:
shopify:https://partakefoods.com/products/crunchy-cookie-variety-pack
gelbooru_v01:https://5naf.booru.org/index.php?page=post&s=view&id=46963
Available base categories are:
mastodon, shopify, moebooru, gelbooru_v01, gelbooru_v02,
reactor, foolslide, foolfuuka, philomena
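One way such a prefix could be separated from the URL is sketched below. This is an assumption about the general approach, not the project's implementation; `split_base_category` is a hypothetical name, and the category set is copied from the list above.

```python
BASE_CATEGORIES = {
    "mastodon", "shopify", "moebooru", "gelbooru_v01", "gelbooru_v02",
    "reactor", "foolslide", "foolfuuka", "philomena",
}


def split_base_category(url):
    """Return (base_category, url) if 'url' carries a known prefix."""
    prefix, sep, rest = url.partition(":")
    if sep and prefix in BASE_CATEGORIES:
        return prefix, rest
    # Ordinary URLs ('https:...') fall through unchanged.
    return None, url
```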
3 years ago
Mike Fährmann
299bd2f1f5
[rule34us] add 'tag' and 'post' extractors ( #1527 )
3 years ago
Mike Fährmann
3cf1075d86
[inkbunny] add 'search' extractor ( closes #2094 )
3 years ago
Mike Fährmann
c6a23c26d7
[instagram] allow downloading specific stories ( closes #2088 )
...
https://instagram.com/stories/<USER>/<ID> now only downloads the one
story specified by <ID> and not all stories from that user.
3 years ago
Mike Fährmann
352ffcddb0
[instagram] match post URLs with usernames ( fixes #2085 )
3 years ago
Mike Fährmann
f4e3cee6ac
use yt-dlp by default ( #1850 , #2028 )
3 years ago
Mike Fährmann
f1b142e993
[kemonoparty] change default 'files' order to attachments,file,inline
...
(#1991 )
3 years ago