Mike Fährmann
4883420e67
[generic] revert pattern change
2 years ago
Mike Fährmann
9037128315
[twitter] fix some 'original' retweets not downloading ( #3744 )
2 years ago
Mike Fährmann
ea3d95e7e8
merge #3740 : [deviantart] add support for fxdeviantart.com URLs
2 years ago
Mike Fährmann
9abcb2b6e5
update headers and ciphers for '"browser": "chrome"'
2 years ago
ClosedPort22
c489aecb3e
[deviantart] add support for fxdeviantart.com URLs
...
fxdeviantart.com is a service that fixes embeds on Discord, similar to
fxtwitter.com
2 years ago
ClosedPort22
34a7fab0e2
[generic] add support for IDNs
...
(internationalized domain name)
2 years ago
Mike Fährmann
c9a7345228
[newgrounds] prevent archive ID overlap ( #3681 )
...
add an 'i' and 'a' prefix to image and audio files
(/art/view/, /audio/listen/)
since their numeric ID may conflict with movies and other media
2 years ago
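The prefixing scheme described in the commit above can be sketched as follows. This is a minimal illustration, not the extractor's actual code: the function name `archive_id` and the `kind` labels are hypothetical, and only the 'i' and 'a' prefixes come from the commit message.

```python
# Hypothetical sketch: prefix numeric IDs so that different media
# types never produce the same archive entry.
PREFIXES = {"image": "i", "audio": "a"}  # prefixes taken from the commit message


def archive_id(kind, numeric_id):
    """Return an archive key; movies and other media keep the bare ID."""
    return "{}{}".format(PREFIXES.get(kind, ""), numeric_id)


if __name__ == "__main__":
    # The same numeric ID now maps to three distinct archive keys.
    for kind in ("image", "audio", "movie"):
        print(kind, archive_id(kind, 123456))
```

Keeping the bare ID for movies means archive entries written before such a change would presumably stay valid for that media type.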
Mike Fährmann
da9840a39d
[reddit] update 'videos' option ( #3712 )
...
- add 'dash' to directly extract DASH manifest URLs
(was default behavior since a7c79531)
- change default strategy back to before a7c79531
- disable 'Falling back on generic information extractor' warning
2 years ago
Mike Fährmann
baf41d7437
[misskey] update ( #3717 )
...
- add module docstring
- add options to docs/gallery-dl.conf
2 years ago
Mike Fährmann
6762d99515
merge #3717 : [misskey] add misskey extractors
2 years ago
Mike Fährmann
b8a702929d
[oauth] import extractor modules on demand
2 years ago
Mike Fährmann
dd88740ec7
replace remaining instances of base64 with binascii
2 years ago
enduser420
e1867cf5eb
[misskey] add 'renotes' and 'replies' options
2 years ago
enduser420
a95b5e0d8e
[misskey] add misskey extractors
2 years ago
Mike Fährmann
0d142e403c
[szurubooru] add 'tag' and 'post' extractors ( #3583 , #3713 )
2 years ago
Mike Fährmann
b14f8d5817
[gelbooru] add 'favorite' extractor ( #3704 )
...
requires logged in cookies to work
2 years ago
Mike Fährmann
a70a3e5da6
[mangasee] extract 'author' and 'genre' metadata ( #3703 )
...
Both are lists/arrays. Use {author!S} or {genre:J, } to format them.
2 years ago
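Both format strings mentioned above turn a list into one delimited string. The plain-Python sketch below mimics that effect under the assumption that `!S` joins list elements with ", " and that `:J, ` joins them with the given separator. It does not call gallery-dl's formatter, and the helper `join_list` and the sample metadata are made up for illustration.

```python
# Hypothetical metadata for one chapter; both fields are lists.
metadata = {
    "author": ["Author A", "Author B"],
    "genre": ["Action", "Comedy"],
}


def join_list(values, separator=", "):
    """Join list elements into a single string (assumed effect of !S and :J)."""
    return separator.join(str(value) for value in values)


# Assumed equivalent of {author!S}
author_text = join_list(metadata["author"])
# Assumed equivalent of {genre:J, }
genre_text = join_list(metadata["genre"], ", ")

if __name__ == "__main__":
    print(author_text)
    print(genre_text)
```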
Mike Fährmann
6b03506655
[deviantart] allow searching when not logged in
2 years ago
Mike Fährmann
511a051705
[fanbox] fix crash with missing images ( #3673 )
2 years ago
Mike Fährmann
3fa456d989
[deviantart] remove mature scraps warning ( #3691 )
...
warn about private deviations
when paginating over eclipse results
2 years ago
Mike Fährmann
51301e0c31
replace remaining time.sleep() calls
...
with Extractor.sleep() or request_interval
2 years ago
Mike Fährmann
6ed4309aba
[deviantart] add 'gallery-search' extractor ( #1695 )
2 years ago
Mike Fährmann
3d8777fbc1
move user agent string to util.py
2 years ago
Mike Fährmann
e1df7f73b1
[deviantart] add 'search' extractor
...
(#538 , #1264 , #2954 , #2970 , #3577 )
Requires login to fetch any results, since the API endpoint raises an
error for not logged in requests.
TODO: parse HTML search results
2 years ago
Mike Fährmann
4f029ab38b
[pornpics] support '/pornstar' and '/channels' listings
...
- fix docstring (#3671 )
- simplify code
2 years ago
Mike Fährmann
cbe4769246
[danbooru] use gallery-dl UA ( #3665 )
...
this removes the ability to set a custom UA via 'user-agent' option
for extractor requests
2 years ago
Mike Fährmann
253ac08203
pre-define and use 'gallery-dl/<version>' UA string
2 years ago
Mike Fährmann
b4899c266f
merge #3656 : [deviantart] fix crash when handling deleted deviations in status updates
2 years ago
Mike Fährmann
bb11c2a576
merge #3662 : [redgifs] add 'collection' extractors
2 years ago
Mike Fährmann
884f1848d6
[redgifs] fix syntax for older Python versions
...
and update docs/supportedsites
2 years ago
Mike Fährmann
725baedad3
[deviantart] use '/collections/all' endpoint for favorites
...
(#3666 ,#3668)
2 years ago
Mike Fährmann
2bd8f2f4bd
[pornpics] add 'search' and 'tag' extractors
...
(#263 , #3544 , #3654 )
2 years ago
Mike Fährmann
79bc82884c
[pornpics] add 'gallery' extractor ( #263 , #3544 , #3654 )
2 years ago
Mike Fährmann
7bdc1d6d3d
[manganelo] update and fix metadata extraction
2 years ago
Mike Fährmann
363bb76dff
[manganelo] simplify URL pattern
2 years ago
enduser420
b28bd9789e
[redgifs] add 'collection' extractors
2 years ago
ClosedPort22
f4e211356d
[deviantart] slight refactor
2 years ago
Mike Fährmann
bd5d08abbc
[catbox] add 'file' extractor ( #3570 )
2 years ago
Mike Fährmann
8e1e8a5bea
[soundgasm] rewrite ( #3578 )
...
use a more standard extractor structure to make -A work as expected
2 years ago
Mike Fährmann
0b93420a81
[pinterest] unescape search terms ( #3621 )
2 years ago
Mike Fährmann
ad96e70546
[bunkr] fix extraction ( #3636 , #3655 )
2 years ago
Mike Fährmann
9335d55bbc
[manganelo] support mobile-only chapters
2 years ago
ClosedPort22
a74114ef7a
[deviantart] fix crash when handling deleted deviations
...
in status updates
2 years ago
Mike Fährmann
75570ad3f1
[oauth] remove stray 'exit()' ( #3628 )
...
- bug from 70ce45d9
- broke oauth:tumblr, oauth:flickr, and oauth:smugmug
2 years ago
Mike Fährmann
8fb043e8ff
[tumblr] raise more detailed errors for dashboard-only blogs
...
(#3628 )
2 years ago
Mike Fährmann
ce996dd21b
[poipiku] warn about incorrect passwords ( #3646 )
2 years ago
Mike Fährmann
70ce45d965
[oauth] use default name for browsers without 'name' attribute
...
(#3645 )
Seems to only be an issue for MacOSXOSAScript before Python 3.11.
d12bec6993
2 years ago
Mike Fährmann
2a53e6445c
[bunkr] update domain ( #3636 )
2 years ago
Mike Fährmann
5503ac4d5e
replace json.dumps with direct calls to JSONEncoder.encode
2 years ago
Mike Fährmann
dd884b02ee
replace json.loads with direct calls to JSONDecoder.decode
2 years ago
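The two commits above swap the `json.dumps`/`json.loads` convenience functions for bound methods of pre-built encoder and decoder objects. A minimal sketch of that pattern follows; the module-level names `json_dumps` and `json_loads` are assumptions for illustration, and the encoder here uses default settings only.

```python
import json

# Build the encoder and decoder once and reuse their bound methods,
# instead of going through json.dumps()/json.loads() on every call.
json_dumps = json.JSONEncoder().encode
json_loads = json.JSONDecoder().decode

if __name__ == "__main__":
    data = {"id": 1, "tags": ["a", "b"], "title": None}
    text = json_dumps(data)
    print(text)
    print(json_loads(text) == data)
```

With default arguments the results match `json.dumps` and `json.loads`; non-default settings would have to be passed to the `JSONEncoder` constructor instead.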
Mike Fährmann
8805bd38ab
merge #3622 : [imagetwist] add phun.imagetwist.com and imagehaha.com support
2 years ago
Mike Fährmann
706ec70e89
[imagetwist] simplify pattern and add tests
2 years ago
Mike Fährmann
f2e91732ae
[instagram] add 'user' metadata field ( #3107 )
...
at the moment only for URLs that need to translate user name to ID
2 years ago
Prinz23
29f0830b53
[imagetwist] add phun.imagetwist.com and imagehaha.com alias to imagetwist extractor
2 years ago
Mike Fährmann
bbf0911a46
[e621] implement 'notes' and 'pools' metadata extraction
...
(#3425 )
2 years ago
Mike Fährmann
925b467496
split e621 from danbooru module ( #3425 )
2 years ago
Mike Fährmann
1ae48a54f8
[twitter] add 'transform' option
2 years ago
Mike Fährmann
489c51cecc
[telegraph] fix extraction when images not in <figure> ( #3590 )
2 years ago
Mike Fährmann
0f7e6c422a
merge #3596 : [shopify] support ohpolly.com
2 years ago
enduser420
fcf7030b85
[shopify] support ohpolly.com
2 years ago
Mike Fährmann
a6a631f992
merge #3589 : [redgifs] support v3 URLs
2 years ago
Mike Fährmann
137a395ae0
[imagefap] fix infinite pagination loop ( #3594 )
2 years ago
Mike Fährmann
3c708ade8f
[imagefap] fix metadata extraction
2 years ago
Mike Fährmann
17e24eacf0
[imagefap] update 'gallery' URLs ( #3595 )
2 years ago
Mike Fährmann
c2bc70593e
implement ability to load external extractor classes
...
- -X/--extractors
- extractor.module-sources
2 years ago
enduser420
a18f627bfc
[redgifs] support v3 URLs
2 years ago
Mike Fährmann
13a90969c7
merge #3575 : [nudecollect] add 'image' and 'album' extractors
2 years ago
Mike Fährmann
aacd27e4ef
merge #3581 : [hotleak] fix video URLs
2 years ago
Mike Fährmann
abc3619feb
[lexica] add 'search' extractor ( #3567 )
2 years ago
Mike Fährmann
7c9b1ec830
[hotleak] optimize decoding video URLs
...
- use binascii module
- combine slice and reverse step
2 years ago
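The "combine slice and reverse step" idea can be illustrated in isolation. The sketch below is not the extractor's real decoding routine: the prefix length of 16, the function names, and the `encode_for_demo` helper are all hypothetical. It only shows that dropping a prefix and reversing the rest can be done with one extended slice, and that `binascii.a2b_base64` can decode the result.

```python
import binascii

PREFIX_LENGTH = 16  # hypothetical number of leading characters to discard


def decode_two_steps(value):
    """Drop the prefix, then reverse, then base64-decode."""
    return binascii.a2b_base64(value[PREFIX_LENGTH:][::-1]).decode()


def decode_one_slice(value):
    """Same result with a single extended slice (valid for PREFIX_LENGTH >= 1)."""
    return binascii.a2b_base64(value[:PREFIX_LENGTH - 1:-1]).decode()


def encode_for_demo(url):
    """Build a test input: junk prefix followed by reversed base64 text."""
    payload = binascii.b2a_base64(url.encode(), newline=False).decode()
    return "x" * PREFIX_LENGTH + payload[::-1]


if __name__ == "__main__":
    obfuscated = encode_for_demo("https://example.org/video.m3u8")
    print(decode_two_steps(obfuscated))
    print(decode_one_slice(obfuscated))
```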
nifnat
f14dbfe079
Make decode_video_url static (used in both post and creator extractor).
2 years ago
nifnat
bd23a701f3
Tidy up code.
2 years ago
nifnat
7f34f99a26
Reverse engineered obfuscated JS function and reimplemented in python.
2 years ago
Mike Fährmann
0d818d3540
[fantia] send 'X-CSRF-Token' headers ( #3576 )
2 years ago
enduser420
2a5903dc16
[nudecollect] add 'image' and 'album' extractors
2 years ago
Mike Fährmann
c8fdd5096e
merge #3571 : [bunkr] Fix extracting mkv and ts files
2 years ago
Mike Fährmann
58c008e30a
[hiperdex] update domain ( #3572 )
2 years ago
Luc Ritchie
842064e597
[bunkr] Fix extracting ts files
2 years ago
Luc Ritchie
99ca0437e4
[bunkr] Fix extracting mkv files
2 years ago
Mike Fährmann
76b01b64cf
[kemonoparty] remove MD5 hash extraction ( #3531 )
...
This partially reverts commit 20d6194ffa.
2 years ago
Mike Fährmann
09fb212414
[philomena] match URLs with www subdomain
2 years ago
Mike Fährmann
7e2fd2e573
merge #3560 : [deviantart] add support for /deviation/ and fav.me URLs
2 years ago
Mike Fährmann
caae8fefe1
merge #3541 : [deviantart] add extractor for status updates
2 years ago
ClosedPort22
c90b4ea8d9
[deviantart] add support for fav.me URLs
2 years ago
Mike Fährmann
d63af4f3d3
merge #3555 : [generic] fix regex for non-src image URLs
2 years ago
Mike Fährmann
8993b10751
[mastodon] add 'num' and 'count' metadata fields ( #3517 )
2 years ago
Mike Fährmann
d817d23ccb
[instagram] update csrf token handling
...
- update internal value according to cookie
- do not send a second 'csrftoken' cookie
2 years ago
Mike Fährmann
00b94946b3
[instagram] show -o cursor=… after every error ( #3440 )
2 years ago
ClosedPort22
674c719646
[deviantart] refactor base36 conversion
2 years ago
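For readers unfamiliar with base36, a generic conversion between integers and strings over the alphabet 0-9a-z might look like the sketch below. It is not the extractor's code, and the function names are hypothetical.

```python
import string

ALPHABET = string.digits + string.ascii_lowercase  # "0123456789abc...xyz"


def base36_encode(number):
    """Convert a non-negative integer to its base36 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return "0"
    digits = []
    while number:
        number, remainder = divmod(number, 36)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def base36_decode(text):
    """Convert a base36 string back to an integer."""
    return int(text, 36)


if __name__ == "__main__":
    value = 123456789
    encoded = base36_encode(value)
    print(encoded, base36_decode(encoded) == value)
```

Decoding needs no custom loop because Python's built-in `int()` accepts any base from 2 to 36.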
ClosedPort22
293abb8921
[deviantart] add support for /deviation/ URLs
2 years ago
thatfuckingbird
8cfeed78b1
[generic] fix regex for non-src image URLs
2 years ago
Mike Fährmann
fc6ea8ee5c
[instagram] update API domain and headers
2 years ago
ClosedPort22
597b89245e
[deviantart] misc improvements to status extractor
...
- relax regex pattern
- handle invalid 'items' field
- add a test for shared sta.sh item
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2 years ago
Mike Fährmann
137de090dd
merge #3549 : [twitter] fix search ( #3536 )
2 years ago
Mike Fährmann
02e314c1b6
merge #3537 : [wikifeet/wikifeetx] add 'gallery' extractor
2 years ago
Mike Fährmann
568112dfbb
[oauth] improve output
...
- show which api key / client id gets used (#3518 )
- show which browser authorization URLs get opened in
2 years ago
ClosedPort22
ab58c375b4
[twitter] fix search ( #3536 )
...
- partially revert 18fe4b334d
- properly search for cursor when processing 'replaceEntry'
2 years ago
Mike Fährmann
df91ebb945
[oauth] simplify OAuth 1.0a init
2 years ago
ClosedPort22
013733c9e9
[deviantart] fix index fields for embedded/shared images
2 years ago
ClosedPort22
c4aeca7a5a
[deviantart] improve handling of statuses
...
- recursively yield statuses
- ignore items with missing or unexpected field(s)
2 years ago
ClosedPort22
3b32671fbd
[deviantart] add extractor for status updates
...
extract user status updates using the '/user/statuses/' endpoint
2 years ago
Mike Fährmann
107c60c973
[sankaku] update URL pattern ( #3523 )
...
match tag searches with language codes without a trailing slash
2 years ago
enduser420
5cb263fdd2
[wikifeet/wikifeetx] add 'gallery' extractor
2 years ago
Mike Fährmann
35a30498bc
merge #3531 : [kemonoparty] improve hash extraction
...
- extract md5 hashes if available
- extract discord file hashes
2 years ago
Mike Fährmann
9683d79bb7
[twitter] "fix" search pagination ( #3536 , #3534 )
...
- properly process instructions
- do not expect a predetermined instruction order
2 years ago
Mike Fährmann
4fec848858
[twitter] use "browser": "firefox" by default ( #3522 )
...
and reenable TLS 1.2 ciphers
2 years ago
Mike Fährmann
78937564fd
[twitter] fix login after 32b03433
2 years ago
ClosedPort22
20d6194ffa
[kemonoparty] improve hash extraction
...
- extract MD5 hash from URLs
- extract MD5 and SHA256 hash from Discord URLs (kemono.party only)
- minor optimization (do not call 'hashes.add' when 'duplicates' is
true)
- update tests accordingly
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2 years ago
Mike Fährmann
80a2ff2d38
support setting 'write-pages' to "ALL"
...
to show authentication header, cookies, etc
2 years ago
Mike Fährmann
c881548a27
add 'extractor.retry-codes' option ( #3313 )
...
do not retry 429 and 430 by default
2 years ago
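A retry decision driven by a configurable list of status codes could be sketched as below. This is an illustration under stated assumptions, not gallery-dl's request logic: the function `should_retry` is hypothetical, and treating every 5xx response as retryable is an assumption. Only the option name 'retry-codes' and the fact that 429 and 430 are not retried by default come from the commit message.

```python
def should_retry(status_code, retry_codes=()):
    """Return True if a response with this status should be retried.

    Assumption: server errors (5xx) are always retried; any other
    status is retried only when listed in retry_codes.
    """
    return 500 <= status_code < 600 or status_code in retry_codes


if __name__ == "__main__":
    # With an empty list, 429 and 430 are not retried.
    print(should_retry(429))
    # Opting in via a user-supplied list, as a 'retry-codes' option might allow.
    print(should_retry(429, retry_codes=(429, 430)))
    print(should_retry(503))
```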
Mike Fährmann
e30e8aeef7
[mastodon] rename '_check_move' -> '_check_moved'
2 years ago
Mike Fährmann
32b0343334
[twitter] refresh guest tokens ( #3445 , #3458 )
2 years ago
Mike Fährmann
512abeb4ae
[booru] add 'url' option
2 years ago
Mike Fährmann
c87bd1a752
[danbooru] extend 'metadata' option
...
make it possible to specify a custom list of metadata includes
2 years ago
Mike Fährmann
26c3292538
[twitter] disable TLS 1.2 ciphers by default ( #3522 )
2 years ago
Mike Fährmann
18fe4b334d
[twitter] remove 'tweet_search_mode' from search parameters ( #3522 )
...
and update API root and general query parameters
2 years ago
Mike Fährmann
85bd1cbc89
[kemonoparty] fix regression from 473bd380 ( #3519 )
...
- do not access 'response.content' unless necessary
- only validate responses if filename extensions differ
2 years ago
Mike Fährmann
473bd380c8
[kemonoparty] reject invalid/empty files ( #3510 )
2 years ago
Mike Fährmann
4833ec323e
[imagefap] add 'folder' extractor ( #3504 )
2 years ago
Mike Fährmann
362cd6991b
[pixiv] implement 'metadata-bookmark' option ( #3417 )
2 years ago
Mike Fährmann
2142b9c7ae
merge #3503 : [myhentaigallery] handle whitespace before title tag
2 years ago
Mike Fährmann
3a0450adbf
[behance] use default delay between requests ( #2507 )
2 years ago
Mike Fährmann
2cae4567ba
[telegraph] fix file URLs ( #3506 )
2 years ago
Mike Fährmann
cbaeee9533
[imagefap] warn about redirects to '/human-verification' ( #1140 )
2 years ago
Mike Fährmann
435de1329a
[imagefap] use default delay between requests ( #1140 )
2 years ago
Erik Rimskog
a8a982359e
[myhentaigallery] handle whitespace before the title tag
2 years ago
Mike Fährmann
d1dd52349a
merge #3189 : [tcbscans] add 'chapter' and 'manga' extractors
2 years ago
Mike Fährmann
2f31d21509
merge #3455 : [twitter] apply tweet type checks before uniqueness check
2 years ago
enduser420
e8541a131d
[tcbscans] add 'chapter' and 'manga' extractors
2 years ago
Mike Fährmann
9695c4e88d
emit debug logging message when loading cookies from file
...
attempt nr. 2
no idea how I managed to remove 6514828d in a918ce29
2 years ago
Mike Fährmann
30a31836e7
merge #3449 : [twitter] force HTTPS for TwitPic URLs
2 years ago
Mike Fährmann
e18482e9ae
[twitter] improve 'http' -> 'https' replacement
2 years ago
Mike Fährmann
4fd6da474f
merge #3473 : [twitter] fix crash when using 'expand' and 'syndication'
2 years ago
Mike Fährmann
a918ce29b5
run tests on ubuntu-20.04
...
and remove Python 3.4, since that's no longer available
on this test runner
2 years ago
Mike Fährmann
6514828d4e
emit debug logging message when loading cookies from file
2 years ago
Mike Fährmann
3a238fd490
[poipiku] warn about login requirements
2 years ago
Mike Fährmann
f29ba089ff
merge #3474 : [fanleaks] add 'post' and 'model' extractors
2 years ago
Mike Fährmann
6933727b45
merge #3483 : [twitter] implement 'syndication=extended'
2 years ago
Mike Fährmann
07ed3a1fbf
merge #3460 : [poipiku] fix extraction for a different warning button style
...
(#3493 , #3492 )
2 years ago
Mike Fährmann
9116398c1c
[pinterest] add 'domain' option ( #3484 )
...
use input URL domain by default
2 years ago
blankie
2f985bcddb
[poipiku] fix extraction for a different warning button style
2 years ago
Mike Fährmann
294108c90a
[pinterest] support 'All Pins' boards ( #2855 , #3484 )
2 years ago
Mike Fährmann
77df8d3116
[deviantart] implement username&password login for scraps ( #1029 )
...
re-login when getting prematurely logged out by dA
is missing at the moment
2 years ago
Mike Fährmann
ed2d715019
fix 'keywords' in extractor tests ( #3491 )
2 years ago
ClosedPort22
6853b14be3
[twitter] apply suggestions from code review
...
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2 years ago
Mike Fährmann
4611237f8c
merge #3457 : [danbooru] extract uploader metadata (if option is set)
2 years ago
Mike Fährmann
e7522482bb
merge #3463 : [lynxchan] support 'bbw-chan.nl'
2 years ago
Mike Fährmann
7d6c846176
[fanbox] return 'imageMap' files in order ( #2718 )
2 years ago
Mike Fährmann
dc8e7ff54e
[bunkr] fix URLs returned by API ( #3481 )
2 years ago
enduser420
5fedef3a1a
[fanleaks] update 'model' URL pattern
2 years ago