Mike Fährmann
8e1e8a5bea
[soundgasm] rewrite ( #3578 )
...
use a more standard extractor structure to make -A work as expected
2 years ago
Mike Fährmann
0b93420a81
[pinterest] unescape search terms ( #3621 )
2 years ago
Mike Fährmann
ad96e70546
[bunkr] fix extraction ( #3636 , #3655 )
2 years ago
Mike Fährmann
9335d55bbc
[manganelo] support mobile-only chapters
2 years ago
ClosedPort22
a74114ef7a
[deviantart] fix crash when handling deleted deviations
...
in status updates
2 years ago
Mike Fährmann
75570ad3f1
[oauth] remove stray 'exit()' ( #3628 )
...
- bug from 70ce45d9
- broke oauth:tumblr, oauth:flickr, and oauth:smugmug
2 years ago
Mike Fährmann
d37e7f4898
add 'hooks' option
...
Very much a work in progress.
At the moment, it allows one to
- wait and restart an extractor (#3338 )
- change the exit code (#3630 )
- change the log level of a logging message
based on the contents of a logging message
2 years ago
Mike Fährmann
8fb043e8ff
[tumblr] raise more detailed errors for dashboard-only blogs
...
(#3628 )
2 years ago
Mike Fährmann
d4232f3a8b
implement restarting an extractor ( #3338 )
2 years ago
Mike Fährmann
ce996dd21b
[poipiku] warn about incorrect passwords ( #3646 )
2 years ago
Mike Fährmann
70ce45d965
[oauth] use default name for browsers without 'name' attribute
...
(#3645 )
Seems to only be an issue for MacOSXOSAScript before Python 3.11.
d12bec6993
2 years ago
Mike Fährmann
1aae72773f
put argument init on separate lines
2 years ago
Mike Fährmann
2a53e6445c
[bunkr] update domain ( #3636 )
2 years ago
Mike Fährmann
5503ac4d5e
replace json.dumps with direct calls to JSONEncoder.encode
2 years ago
Mike Fährmann
dd884b02ee
replace json.loads with direct calls to JSONDecoder.decode
2 years ago
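The two commits above (5503ac4d5e and dd884b02ee) swap the module-level `json.dumps` / `json.loads` helpers for methods on encoder and decoder objects. The log does not give the motivation. A minimal sketch of the general pattern, assuming the goal is to build the encoder once and reuse it, might look like this. The names `dumps` and `loads` and the chosen encoder options are illustrative and not taken from the project:

```python
import json

# Build the encoder and decoder once; the options here are
# illustrative assumptions, not the project's actual settings.
_encoder = json.JSONEncoder(
    ensure_ascii=False,
    sort_keys=True,
    separators=(",", ":"),
)
_decoder = json.JSONDecoder()

# Bound methods stand in for json.dumps / json.loads.
dumps = _encoder.encode
loads = _decoder.decode

if __name__ == "__main__":
    text = dumps({"b": 1, "a": "ä"})
    print(text)
    print(loads(text))
```

When `json.dumps` is called with non-default keyword arguments, it constructs a fresh `JSONEncoder` on each call, so holding on to a single configured instance avoids that repeated setup.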
Mike Fährmann
b7337d810e
[postprocessor:metadata] add 'sort' and 'separators' options
2 years ago
Mike Fährmann
8805bd38ab
merge #3622 : [imagetwist] add phun.imagetwist.com and imagehaha.com support
2 years ago
Mike Fährmann
706ec70e89
[imagetwist] simplify pattern and add tests
2 years ago
Mike Fährmann
f2e91732ae
[instagram] add 'user' metadata field ( #3107 )
...
at the moment only for URLs that need to translate user name to ID
2 years ago
Mike Fährmann
3436c6b117
[postprocessor:metadata] speed up JSON encoding
2 years ago
Prinz23
29f0830b53
[imagetwist] add phun.imagetwist.com and imagehaha.com alias to imagetwist extractor
2 years ago
Mike Fährmann
762a68996b
implement 'archive-pragma' option
2 years ago
Mike Fährmann
bbf0911a46
[e621] implement 'notes' and 'pools' metadata extraction
...
(#3425 )
2 years ago
Mike Fährmann
925b467496
split e621 from danbooru module ( #3425 )
2 years ago
Mike Fährmann
1ae48a54f8
[twitter] add 'transform' option
2 years ago
Mike Fährmann
78d3960a31
[postprocessor:exec] implement archive options ( #3584 )
2 years ago
Mike Fährmann
489c51cecc
[telegraph] fix extraction when images not in <figure> ( #3590 )
2 years ago
Mike Fährmann
0f7e6c422a
merge #3596 : [shopify] support ohpolly.com
2 years ago
enduser420
fcf7030b85
[shopify] support ohpolly.com
2 years ago
Mike Fährmann
a6a631f992
merge #3589 : [redgifs] support v3 URLs
2 years ago
Mike Fährmann
137a395ae0
[imagefap] fix infinite pagination loop ( #3594 )
2 years ago
Mike Fährmann
3c708ade8f
[imagefap] fix metadata extraction
2 years ago
Mike Fährmann
17e24eacf0
[imagefap] update 'gallery' URLs ( #3595 )
2 years ago
Mike Fährmann
d16873941c
[downloader:http] use 'time.monotonic()'
2 years ago
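Commit d16873941c above switches the HTTP downloader to `time.monotonic()`. As a rough illustration of why a monotonic clock suits elapsed-time measurement (it is not affected by system clock adjustments), here is a small sketch. The `measure` helper is hypothetical and is not part of the downloader:

```python
import time


def measure(func):
    """Hypothetical helper: run func() and return (result, elapsed seconds)."""
    start = time.monotonic()   # monotonic clock never goes backwards
    result = func()
    elapsed = time.monotonic() - start
    return result, elapsed


if __name__ == "__main__":
    value, seconds = measure(lambda: sum(range(1000)))
    print(value, seconds)
```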
Mike Fährmann
c2bc70593e
implement ability to load external extractor classes
...
- -X/--extractors
- extractor.module-sources
2 years ago
enduser420
a18f627bfc
[redgifs] support v3 URLs
2 years ago
Mike Fährmann
9ec627c760
release version 1.24.5
2 years ago
Mike Fährmann
13a90969c7
merge #3575 : [nudecollect] add 'image' and 'album' extractors
2 years ago
Mike Fährmann
aacd27e4ef
merge #3581 : [hotleak] fix video URLs
2 years ago
Mike Fährmann
abc3619feb
[lexica] add 'search' extractor ( #3567 )
2 years ago
Mike Fährmann
7c9b1ec830
[hotleak] optimize decoding video URLs
...
- use binascii module
- combine slice and reverse step
2 years ago
nifnat
f14dbfe079
Make decode_video_url static (used in both post and creator extractor).
2 years ago
nifnat
bd23a701f3
Tidy up code.
2 years ago
nifnat
7f34f99a26
Reverse engineered obfuscated JS function and reimplemented in python.
2 years ago
Mike Fährmann
0d818d3540
[fantia] send 'X-CSRF-Token' headers ( #3576 )
2 years ago
Mike Fährmann
f58215705a
add '-O/--postprocessor-option' command-line option ( #3565 )
2 years ago
enduser420
2a5903dc16
[nudecollect] add 'image' and 'album' extractors
2 years ago
Mike Fährmann
c8fdd5096e
merge #3571 : [bunkr] Fix extracting mkv and ts files
2 years ago
Mike Fährmann
58c008e30a
[hiperdex] update domain ( #3572 )
2 years ago
Luc Ritchie
842064e597
[bunkr] Fix extracting ts files
2 years ago
Luc Ritchie
99ca0437e4
[bunkr] Fix extracting mkv files
2 years ago
Mike Fährmann
76b01b64cf
[kemonoparty] remove MD5 hash extraction ( #3531 )
...
This partially reverts commit 20d6194ffa.
2 years ago
Mike Fährmann
09fb212414
[philomena] match URLs with www subdomain
2 years ago
Mike Fährmann
7e2fd2e573
merge #3560 : [deviantart] add support for /deviation/ and fav.me URLs
2 years ago
Mike Fährmann
caae8fefe1
merge #3541 : [deviantart] add extractor for status updates
2 years ago
ClosedPort22
c90b4ea8d9
[deviantart] add support for fav.me URLs
2 years ago
Mike Fährmann
d63af4f3d3
merge #3555 : [generic] fix regex for non-src image URLs
2 years ago
Mike Fährmann
8993b10751
[mastodon] add 'num' and 'count' metadata fields ( #3517 )
2 years ago
Mike Fährmann
d817d23ccb
[instagram] update csrf token handling
...
- update internal value according to cookie
- do not send a second 'csrftoken' cookie
2 years ago
Mike Fährmann
00b94946b3
[instagram] show -o cursor=… after every error ( #3440 )
2 years ago
ClosedPort22
674c719646
[deviantart] refactor base36 conversion
2 years ago
ClosedPort22
293abb8921
[deviantart] add support for /deviation/ URLs
2 years ago
thatfuckingbird
8cfeed78b1
[generic] fix regex for non-src image URLs
2 years ago
Mike Fährmann
fc6ea8ee5c
[instagram] update API domain and headers
2 years ago
ClosedPort22
597b89245e
[deviantart] misc improvements to status extractor
...
- relax regex pattern
- handle invalid 'items' field
- add a test for shared sta.sh item
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2 years ago
Mike Fährmann
137de090dd
merge #3549 : [twitter] fix search ( #3536 )
2 years ago
Mike Fährmann
02e314c1b6
merge #3537 : [wikifeet/wikifeetx] add 'gallery' extractor
2 years ago
Mike Fährmann
568112dfbb
[oauth] improve output
...
- show which api key / client id gets used (#3518 )
- show in which browser authorization URLs get opened
2 years ago
ClosedPort22
ab58c375b4
[twitter] fix search ( #3536 )
...
- partially revert 18fe4b334d
- properly search for cursor when processing 'replaceEntry'
2 years ago
Mike Fährmann
df91ebb945
[oauth] simplify OAuth 1.0a init
2 years ago
ClosedPort22
013733c9e9
[deviantart] fix index fields for embedded/shared images
2 years ago
ClosedPort22
c4aeca7a5a
[deviantart] improve handling of statuses
...
- recursively yield statuses
- ignore items with missing or unexpected field(s)
2 years ago
ClosedPort22
3b32671fbd
[deviantart] add extractor for status updates
...
extract user status updates using the '/user/statuses/' endpoint
2 years ago
Mike Fährmann
107c60c973
[sankaku] update URL pattern ( #3523 )
...
match tag searches with language codes without a trailing slash
2 years ago
enduser420
5cb263fdd2
[wikifeet/wikifeetx] add 'gallery' extractor
2 years ago
Mike Fährmann
35a30498bc
merge #3531 : [kemonoparty] improve hash extraction
...
- extract md5 hashes if available
- extract discord file hashes
2 years ago
Mike Fährmann
ec9ff7640d
merge #3535 : [downloader:http] add signature checks for .blend, .obj, and .clip files
2 years ago
Mike Fährmann
9683d79bb7
[twitter] "fix" search pagination ( #3536 , #3534 )
...
- properly process instructions
- do not expect a predetermined instruction order
2 years ago
Mike Fährmann
4fec848858
[twitter] use "browser": "firefox" by default ( #3522 )
...
and reenable TLS 1.2 ciphers
2 years ago
Mike Fährmann
78937564fd
[twitter] fix login after 32b03433
2 years ago
ClosedPort22
b6706b373a
[downloader:http] add signature checks for some formats
...
also add the MIME type for .obj files
2 years ago
ClosedPort22
20d6194ffa
[kemonoparty] improve hash extraction
...
- extract MD5 hash from URLs
- extract MD5 and SHA256 hash from Discord URLs (kemono.party only)
- minor optimization (do not call 'hashes.add' when 'duplicates' is
true)
- update tests accordingly
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2 years ago
Mike Fährmann
80a2ff2d38
support setting 'write-pages' to "ALL"
...
to show authentication header, cookies, etc
2 years ago
Mike Fährmann
d6793b2c7d
include request body in 'write-pages=all' output
2 years ago
Mike Fährmann
c881548a27
add 'extractor.retry-codes' option ( #3313 )
...
do not retry 429 and 430 by default
2 years ago
Mike Fährmann
e30e8aeef7
[mastodon] rename '_check_move' -> '_check_moved'
2 years ago
Mike Fährmann
32b0343334
[twitter] refresh guest tokens ( #3445 , #3458 )
2 years ago
Mike Fährmann
512abeb4ae
[booru] add 'url' option
2 years ago
Mike Fährmann
c87bd1a752
[danbooru] extend 'metadata' option
...
make it possible to specify a custom list of metadata includes
2 years ago
Mike Fährmann
26c3292538
[twitter] disable TLS 1.2 ciphers by default ( #3522 )
2 years ago
Mike Fährmann
18fe4b334d
[twitter] remove 'tweet_search_mode' from search parameters ( #3522 )
...
and update API root and general query parameters
2 years ago
Mike Fährmann
ec04c97075
release version 1.24.4
2 years ago
Mike Fährmann
c0d7d2be35
[downloader:http] add 'validate' option
2 years ago
Mike Fährmann
85bd1cbc89
[kemonoparty] fix regression from 473bd380 ( #3519 )
...
- do not access 'response.content' unless necessary
- only validate responses if filename extensions differ
2 years ago
Mike Fährmann
805a5663ec
release version 1.24.3
2 years ago
Mike Fährmann
473bd380c8
[kemonoparty] reject invalid/empty files ( #3510 )
2 years ago
Mike Fährmann
4833ec323e
[imagefap] add 'folder' extractor ( #3504 )
2 years ago
Mike Fährmann
362cd6991b
[pixiv] implement 'metadata-bookmark' option ( #3417 )
2 years ago
Mike Fährmann
0895e6afee
merge #3462 : [docs] Update links and fix field typo
2 years ago
Mike Fährmann
2142b9c7ae
merge #3503 : [myhentaigallery] handle whitespace before title tag
2 years ago
Mike Fährmann
3a0450adbf
[behance] use default delay between requests ( #2507 )
2 years ago
Mike Fährmann
2cae4567ba
[telegraph] fix file URLs ( #3506 )
2 years ago
Mike Fährmann
cbaeee9533
[imagefap] warn about redirects to '/human-verification' ( #1140 )
2 years ago
Mike Fährmann
435de1329a
[imagefap] use default delay between requests ( #1140 )
2 years ago
Erik Rimskog
a8a982359e
[myhentaigallery] handle whitespace before the title tag
2 years ago
Mike Fährmann
d1dd52349a
merge #3189 : [tcbscans] add 'chapter' and 'manga' extractors
2 years ago
Mike Fährmann
2f31d21509
merge #3455 : [twitter] apply tweet type checks before uniqueness check
2 years ago
enduser420
e8541a131d
[tcbscans] add 'chapter' and 'manga' extractors
2 years ago
Mike Fährmann
9695c4e88d
emit debug logging message when loading cookies from file
...
attempt nr. 2
no idea how I managed to remove 6514828d in a918ce29
2 years ago
Mike Fährmann
30a31836e7
merge #3449 : [twitter] force HTTPS for TwitPic URLs
2 years ago
Mike Fährmann
e18482e9ae
[twitter] improve 'http' -> 'https' replacement
2 years ago
Mike Fährmann
4fd6da474f
merge #3473 : [twitter] fix crash when using 'expand' and 'syndication'
2 years ago
Mike Fährmann
a918ce29b5
run tests on ubuntu-20.04
...
and remove Python 3.4, since that's no longer available
on this test runner
2 years ago
Mike Fährmann
6514828d4e
emit debug logging message when loading cookies from file
2 years ago
Mike Fährmann
3a238fd490
[poipiku] warn about login requirements
2 years ago
Mike Fährmann
fa144f38ed
[ytdl] fix dfe4f00c for legacy yt-dlp
2 years ago
Mike Fährmann
f29ba089ff
merge #3474 : [fanleaks] add 'post' and 'model' extractors
2 years ago
Mike Fährmann
6933727b45
merge #3483 : [twitter] implement 'syndication=extended'
2 years ago
Mike Fährmann
07ed3a1fbf
merge #3460 : [poipiku] fix extraction for a different warning button style
...
(#3493 , #3492 )
2 years ago
Mike Fährmann
9116398c1c
[pinterest] add 'domain' option ( #3484 )
...
use input URL domain by default
2 years ago
Mike Fährmann
6f6af36cad
use double quotes for --help examples
2 years ago
Mike Fährmann
dfe4f00ca2
[ytdl] update for yt-dlp changes
2 years ago
blankie
2f985bcddb
[poipiku] fix extraction for a different warning button style
2 years ago
Mike Fährmann
294108c90a
[pinterest] support 'All Pins' boards ( #2855 , #3484 )
2 years ago
Mike Fährmann
77df8d3116
[deviantart] implement username&password login for scraps ( #1029 )
...
re-login when getting prematurely logged out by dA
is missing at the moment
2 years ago
Mike Fährmann
ed2d715019
fix 'keywords' in extractor tests ( #3491 )
2 years ago
Mike Fährmann
3f29b8fe91
[cookies] convert browser names to lowercase
2 years ago
ClosedPort22
6853b14be3
[twitter] apply suggestions from code review
...
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2 years ago
Mike Fährmann
4611237f8c
merge #3457 : [danbooru] extract uploader metadata (if option is set)
2 years ago
Mike Fährmann
e7522482bb
merge #3463 : [lynxchan] support 'bbw-chan.nl'
2 years ago
Mike Fährmann
7d6c846176
[fanbox] return 'imageMap' files in order ( #2718 )
2 years ago
Mike Fährmann
dc8e7ff54e
[bunkr] fix URLs returned by API ( #3481 )
2 years ago
enduser420
5fedef3a1a
[fanleaks] update 'model' URL pattern
2 years ago
enduser420
5a740ef78b
[fanleaks] add 'post' and 'model' extractors
2 years ago
ClosedPort22
7c8eab8d52
[twitter] implement 'syndication=extended'
...
to be able to fetch extended user metadata
2 years ago
ClosedPort22
be3286206a
[twitter] assume 'conversation_id' when using syndication
...
not possible to expand replies at the moment
2 years ago
ClosedPort22
ce8dbb1ccc
[twitter] fix crash when using 'expand' and 'syndication'
...
caused by KeyError: 'conversation_id_str'
2 years ago
Mike Fährmann
d651d45239
implement specifying ranges in slice notation ( #918 , #2865 )
...
e.g.
- '1:101' or ':101' or ':101:' for files 1 to 100
- '1::2' or '::2' for every second file
- '1:101:5' or ':101:5' for files 1, 6, 11, ..., 91, 96
(the second argument specifies the first index NOT included)
2 years ago
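The slice notation described in commit d651d45239 above maps naturally onto Python's `range`. The sketch below shows one way such a string could be parsed. `parse_slice` is a hypothetical helper written for this illustration, not the project's actual implementation. It assumes 1-based file indices, a default start of 1, and an unbounded stop when the second field is empty:

```python
import sys


def parse_slice(spec):
    """Hypothetical helper: turn 'start:stop[:step]' into a range.

    Indices are 1-based; 'stop' is the first index NOT included.
    """
    parts = spec.split(":")
    if not 2 <= len(parts) <= 3:
        raise ValueError("expected 'start:stop' or 'start:stop:step'")
    parts += [""] * (3 - len(parts))  # pad a missing step field

    start = int(parts[0]) if parts[0] else 1
    stop = int(parts[1]) if parts[1] else sys.maxsize
    step = int(parts[2]) if parts[2] else 1
    return range(start, stop, step)


if __name__ == "__main__":
    for spec in ("1:101", ":101:", "::2", "1:101:5"):
        r = parse_slice(spec)
        print(spec, "->", r.start, r.stop, r.step)
```

With this helper, `'1:101'`, `':101'`, and `':101:'` all yield files 1 to 100, and `'1:101:5'` yields 1, 6, 11, ..., 96, matching the examples in the commit message.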
ClosedPort22
38786a9593
[twitter] refactor extraction of TwitPic URLs
...
flattening
2 years ago
Mike Fährmann
3616adfc75
implement '--range' with Python ranges
2 years ago
enduser420
527bb2c4ab
[lynxchan/bbw-chan] add 'thread' and 'board' extractors
2 years ago
pi_allen
64902f518e
[docs] Update links and fix field typo
2 years ago
blankie
f82ee93676
[danbooru] extract uploader metadata (if metadata is set)
2 years ago
ClosedPort22
250d35107c
[twitter] prioritize tweet type checks ( #3439 )
...
Do not consider a tweet seen before applying 'retweet', 'quote' and
'reply' checks. Otherwise the original tweets will also be skipped if
the "derivative" tweets and the original tweets are from the same user.
2 years ago
Mike Fährmann
1800bd7d14
allow '*-filter' options to be a list of expressions
2 years ago
ClosedPort22
3eb352fcb0
[twitter] force HTTPS for TwitPic URLs
2 years ago
Mike Fährmann
73ab5d84c0
update docs/configuration.rst
2 years ago
Mike Fährmann
2d7d80d302
release version 1.24.2
2 years ago
Mike Fährmann
bee354c264
Merge pull request #3415 from enduser420/extractor/fapello
...
[fapello] add 'post', 'user' and 'path' extractors
2 years ago
Mike Fährmann
8d7585534e
Merge pull request #3367 from the-blank-x/deviantart-view
...
[deviantart] add /view URL support
2 years ago