Mike Fährmann
6933727b45
merge #3483 : [twitter] implement 'syndication=extended'
2 years ago
Mike Fährmann
07ed3a1fbf
merge #3460 : [poipiku] fix extraction for a different warning button style
...
(#3493 , #3492 )
2 years ago
Mike Fährmann
9116398c1c
[pinterest] add 'domain' option ( #3484 )
...
use input URL domain by default
2 years ago
Mike Fährmann
6f6af36cad
use double quotes for --help examples
2 years ago
Mike Fährmann
dfe4f00ca2
[ytdl] update for yt-dlp changes
2 years ago
blankie
2f985bcddb
[poipiku] fix extraction for a different warning button style
2 years ago
Mike Fährmann
294108c90a
[pinterest] support 'All Pins' boards ( #2855 , #3484 )
2 years ago
Mike Fährmann
77df8d3116
[deviantart] implement username&password login for scraps ( #1029 )
...
re-login when getting prematurely logged out by dA
is missing at the moment
2 years ago
Mike Fährmann
ed2d715019
fix 'keywords' in extractor tests ( #3491 )
2 years ago
Mike Fährmann
3f29b8fe91
[cookies] convert browser names to lowercase
2 years ago
ClosedPort22
6853b14be3
[twitter] apply suggestions from code review
...
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2 years ago
Mike Fährmann
4611237f8c
merge #3457 : [danbooru] extract uploader metadata (if option is set)
2 years ago
Mike Fährmann
e7522482bb
merge #3463 : [lynxchan] support 'bbw-chan.nl'
2 years ago
Mike Fährmann
7d6c846176
[fanbox] return 'imageMap' files in order ( #2718 )
2 years ago
Mike Fährmann
dc8e7ff54e
[bunkr] fix URLs returned by API ( #3481 )
2 years ago
enduser420
5fedef3a1a
[fanleaks] update 'model' URL pattern
2 years ago
enduser420
5a740ef78b
[fanleaks] add 'post' and 'model' extractors
2 years ago
ClosedPort22
7c8eab8d52
[twitter] implement 'syndication=extended'
...
to be able to fetch extended user metadata
2 years ago
ClosedPort22
be3286206a
[twitter] assume 'conversation_id' when using syndication
...
not possible to expand replies at the moment
2 years ago
ClosedPort22
ce8dbb1ccc
[twitter] fix crash when using 'expand' and 'syndication'
...
caused by KeyError: 'conversation_id_str'
2 years ago
Mike Fährmann
d651d45239
implement specifying ranges in slice notation ( #918 , #2865 )
...
e.g.
- '1:101' or ':101' or ':101:' for files 1 to 100
- '1::2' or '::2' for every second file
- '1:101:5' or ':101:5' for files 1, 6, 11, ..., 91, 96
(the second argument specifies the first index NOT included)
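
A minimal sketch of how such slice strings could map onto Python ranges, following the examples listed above. The helper name `parse_slice` is hypothetical and not taken from the project's code; defaulting a missing start to 1 and a missing stop to "unbounded" are assumptions inferred from the examples.

```python
import sys


def parse_slice(spec):
    """Hypothetical helper: convert 'start:stop:step' into a range object.

    Assumptions (inferred from the commit message examples):
    - file indices are 1-based, so a missing start defaults to 1
    - a missing stop means "no upper bound"
    - a missing step defaults to 1
    - stop is exclusive, as with Python's built-in range()
    """
    parts = spec.split(":")
    if not 2 <= len(parts) <= 3:
        raise ValueError("expected 'start:stop' or 'start:stop:step'")

    start = int(parts[0]) if parts[0] else 1
    stop = int(parts[1]) if parts[1] else sys.maxsize
    step = int(parts[2]) if len(parts) == 3 and parts[2] else 1
    return range(start, stop, step)


def selected(index, spec):
    """Return True if the 1-based file index falls inside the slice."""
    return index in parse_slice(spec)
```

Membership tests on a `range` are constant-time, so checking each file index against an unbounded slice such as '::2' stays cheap.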
2 years ago
ClosedPort22
38786a9593
[twitter] refactor extraction of TwitPic URLs
...
flattening
2 years ago
Mike Fährmann
3616adfc75
implement '--range' with Python ranges
2 years ago
enduser420
527bb2c4ab
[lynxchan/bbw-chan] add 'thread' and 'board' extractors
2 years ago
pi_allen
64902f518e
[docs] Update links and fix field typo
2 years ago
blankie
f82ee93676
[danbooru] extract uploader metadata (if metadata is set)
2 years ago
ClosedPort22
250d35107c
[twitter] prioritize tweet type checks ( #3439 )
...
Do not consider a tweet seen before applying 'retweet', 'quote' and
'reply' checks. Otherwise the original tweets will also be skipped if
the "derivative" tweets and the original tweets are from the same user.
2 years ago
Mike Fährmann
1800bd7d14
allow '*-filter' options to be a list of expressions
2 years ago
ClosedPort22
3eb352fcb0
[twitter] force HTTPS for TwitPic URLs
2 years ago
Mike Fährmann
73ab5d84c0
update docs/configuration.rst
2 years ago
Mike Fährmann
2d7d80d302
release version 1.24.2
2 years ago
Mike Fährmann
bee354c264
Merge pull request #3415 from enduser420/extractor/fapello
...
[fapello] add 'post', 'user' and 'path' extractors
2 years ago
Mike Fährmann
8d7585534e
Merge pull request #3367 from the-blank-x/deviantart-view
...
[deviantart] add /view URL support
2 years ago
blankie
6614d94b08
[deviantart] add /view URL support
2 years ago
Mike Fährmann
dd6eeb4336
Merge pull request #3366 from ClosedPort22/da-extra-stash
...
[deviantart] extract sta.sh URLs from `text_content`
2 years ago
Mike Fährmann
f36cbb3911
Merge pull request #3413 from ClosedPort22/e621-manual-pagination
...
[e621] implement manual pagination
2 years ago
ClosedPort22
dd4a4a3fa6
[e621] softcode the pagination threshold
2 years ago
ClosedPort22
9faa4ed738
[e621] refactor pagination control
...
as suggested by @mikf
2 years ago
Mike Fährmann
7851a2c520
[seiga] raise error when redirected to login page ( #3401 )
2 years ago
Mike Fährmann
68ce5f965d
[instagram] remove unused code
2 years ago
Mike Fährmann
4063563cd7
[zerochan] update for layout v3
...
- remove cookie disabling v3
- fix and improve metadata extraction
2 years ago
Mike Fährmann
1e6407ca98
Merge pull request #3414 from pubak42/master
...
[sex.com] Download videos from cdn (#3408 )
2 years ago
ClosedPort22
bf1649dadb
[imgur] add support for imgur.io URLs
2 years ago
enduser420
7e08e2d982
[fapello] set 'filename_fmt'
2 years ago
enduser420
e5076ba056
[fapello] add 'post', 'user' and 'path' extractors
2 years ago
pubak42
e7326cdf1d
[sex.com] Download videos from cdn ( #3408 )
...
The format of video sources was recently changed to a full URL beginning with https://.
The original extractor code appended the video source URL to the root URL of the website, yielding
an invalid URL of the form ...sex.comhttps... that failed to resolve.
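
A small sketch of the failure mode described above and one way to avoid it. This is not the actual patch; the function name `video_url` and the CDN address are hypothetical, and `urllib.parse.urljoin` is swapped in here as a generic way to handle both relative and absolute sources.

```python
from urllib.parse import urljoin

ROOT = "https://www.sex.com"


def video_url(source, root=ROOT):
    """Hypothetical helper: resolve a video source against the site root.

    urljoin() leaves absolute URLs untouched and only prefixes the
    root for relative paths, so both the old and new formats work.
    """
    return urljoin(root, source)


# Naive concatenation reproduces the broken '...sex.comhttps...' form
# (the CDN host below is a made-up placeholder).
absolute = "https://cdn.example.org/videos/1.mp4"
broken = ROOT + absolute
fixed = video_url(absolute)
```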
2 years ago
ClosedPort22
d0ad6d0e67
[e621] implement manual pagination mode
2 years ago
Mike Fährmann
6f0735568c
[2chen] fix file URLs
2 years ago
enduser420
a2be06d873
[2chen] add '.club' support ( #3406 )
2 years ago
Mike Fährmann
a6d4733e11
[pixiv] extract 'date_url' metadata ( #3405 )
...
i.e. the datetime encoded in each file URL.
https://i.pximg.net/img-master/img/2022/12/01/13/44/55/12345678_p0.jpg
->
2022-12-01 13:44:55 +09:00
->
2022-12-01 04:44:55
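
The conversion above can be sketched as follows. The function name `date_from_url` and the regular expression are assumptions for illustration, not the extractor's actual code; the only facts taken from the commit message are the URL layout and the +09:00 offset.

```python
import re
from datetime import datetime, timedelta, timezone

# Offset stated in the commit message: +09:00
JST = timezone(timedelta(hours=9))

# Assumed URL layout: .../img/YYYY/MM/DD/hh/mm/ss/<id>_p<N>.<ext>
DATE_PATTERN = re.compile(
    r"/img/(\d{4})/(\d{2})/(\d{2})/(\d{2})/(\d{2})/(\d{2})/")


def date_from_url(url):
    """Hypothetical helper: return the naive UTC datetime encoded in a
    file URL, or None if the URL does not contain a date path."""
    match = DATE_PATTERN.search(url)
    if match is None:
        return None
    local = datetime(*map(int, match.groups()), tzinfo=JST)
    return local.astimezone(timezone.utc).replace(tzinfo=None)


url = ("https://i.pximg.net/img-master/img/"
       "2022/12/01/13/44/55/12345678_p0.jpg")
print(date_from_url(url))  # 2022-12-01 04:44:55
```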
2 years ago
Mike Fährmann
1317625ec4
[webmshare] add 'video' extractor ( #2410 )
2 years ago
Mike Fährmann
90a9c0790f
[twitter] update 'search' pagination ( #544 )
...
Only stop when list of all returned Tweets is empty
instead of when no valid Tweet was found.
2 years ago
Mike Fährmann
1cbc234819
[mangafox] extract more metadata ( #3167 )
2 years ago
Mike Fährmann
3082544fff
misc fixes
...
- fix typo (#3399 )
- remove double assignment
- [bunkr] update things I forgot in 6b6f886d
- [soundgasm] adjust 'archive_fmt' (#3388 )
2 years ago
enduser420
41bf236d36
[lynxchan] add generic extractors for lynxchan imageboards ( #3394 )
...
* [lynxchan] add generic extractors for lynxchan imageboards
includes kohlchan.net, endchan.org
* [lynxchan] set pop default to empty tuple
* Apply suggestions from code review
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2 years ago
Mike Fährmann
3c75c3bbc4
[soundgasm] add 'user' extractor ( #3384 )
...
based on code from PR #3388 by @enduser420
2 years ago
Mike Fährmann
2952add4a8
[reddit] increase 'id-max' default value ( #3397 )
...
to float("inf")
2 years ago
Mike Fährmann
a001c9c06f
[instagram] prevent post 'date' overwriting file 'date' ( #3392 )
2 years ago
Mike Fährmann
6b6f886dcf
[bunkr] update domain ( #3391 )
...
and improve bunkr/app.bunkr handling
2 years ago
ClosedPort22
bf3fd5951a
Merge branch 'master' into da-extra-stash
2 years ago
Mike Fährmann
eb94568e1f
[soundgasm] add 'audio' extractor ( #3384 )
2 years ago
Mike Fährmann
dfe7b23579
support Firefox containers for --cookies-from-browser ( #3346 )
2 years ago
Mike Fährmann
cd931e1139
update extractor test results
2 years ago
Mike Fährmann
989ec9fc79
[khinsider] fix metadata extraction
2 years ago
Mike Fährmann
1c25cc7a3e
[warosu] fix and update
2 years ago
Mike Fährmann
79e52f3539
[imgth] rewrite
...
- inherit from GalleryExtractor
- fix image URLs
- better metadata
2 years ago
Mike Fährmann
202c1210d5
[exhentai] fix pagination
2 years ago
Mike Fährmann
ca4742200b
use util.NONE as 'keyword-default' default value
2 years ago
Mike Fährmann
43c211f1a7
extend and rename util.CustomNone
2 years ago
Mike Fährmann
6afb3cc766
restore paths for archived files ( #3362 )
2 years ago
Mike Fährmann
4a3a1f4c87
[komikcast] update domain and fix extraction
2 years ago
ClosedPort22
13d825731e
[deviantart] fix test for sta.sh URL extraction
...
Without the 'count' assertion, the test would be essentially useless.
2 years ago
ClosedPort22
6356c9be96
[deviantart] extract sta.sh URLs from 'text_content'
2 years ago
Mike Fährmann
5f57a27ba6
[imagetwist] fix extraction
2 years ago
Mike Fährmann
a42ba25ca1
[foolslide] remove 'kireicake'
...
site redirects to (unclaimed) mangadex group
2 years ago
Mike Fährmann
86f0597c95
[kissgoddess] remove module
...
site does not host albums anymore
2 years ago
Mike Fährmann
049d1bae9a
release version 1.24.1
2 years ago
Mike Fährmann
d0b160461a
terrible workaround for errors with 'http-metadata' ( #3334 )
2 years ago
Mike Fährmann
20e12b5d7c
[nitter] support '/i/user/' URLs ( #3310 )
...
as well as using 'id:<userid>' as username
not all nitter instances seem to support '/i/user/' ...
2 years ago
Mike Fährmann
fceaee3c4f
[lolisafe] remove zz.ht
2 years ago
Mike Fährmann
4554c43d5f
[bunkr] use 'media-files' servers for more file types
2 years ago
enduser420
4bc756dfe0
[2chen] fix extraction ( #3356 )
...
update 'archive_fmt'
update tests
update 'board' regex
2 years ago
enduser420
54844944ab
[pixhost] add 'gallery' support ( #3353 )
2 years ago
enduser420
213676c785
[fapachi] add 'post' and 'user' extractors ( #3347 )
...
* [fapachi] add 'post' and 'user' extractors
* [fapachi] add 'keyword' to test
* [fapachi] remove whitespaces
2 years ago
Mike Fährmann
a18511e346
[nitter] retry downloads on 404 ( #3313 )
2 years ago
Mike Fährmann
80102fa367
[downloader:http] add 'retry-codes' option ( #3313 )
2 years ago
Mike Fährmann
88610c3478
[patreon] update API query parameters
2 years ago
Mike Fährmann
c19b1f03b9
[patreon] fix '403 Forbidden' errors
...
send 'Content-Type' headers for API requests
2 years ago
Mike Fährmann
b4253f69c9
[downloader:http] fix ZeroDivisionError ( #3328 )
...
ensure 'time_elapsed' only gets used as divisor
when it is greater than zero
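
A minimal sketch of the guard described above. The function name `average_speed` and the choice to report 0.0 when no time has elapsed are assumptions; only the name 'time_elapsed' comes from the commit message.

```python
def average_speed(bytes_downloaded, time_elapsed):
    """Hypothetical helper: bytes per second, guarding the division.

    On very fast downloads or coarse clocks, the elapsed time can be
    exactly zero, so it is only used as divisor when positive.
    """
    if time_elapsed > 0:
        return bytes_downloaded / time_elapsed
    return 0.0
```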
2 years ago
Mike Fährmann
fc34f76cc5
[bunkr] fix video downloads ( #3326 )
...
by sending 'https://stream.bunkr.is/' as Referer header
2 years ago
Mike Fährmann
86a396e086
[bcy] fix JSONDecodeError ( #3321 )
2 years ago
Mike Fährmann
5b9a22af7f
[patreon] improve 'campaign_id' extraction ( #3235 )
2 years ago
Mike Fährmann
1bdd0e4338
[nitter] support '/i/web/' Tweet URLs ( #3310 )
2 years ago
Mike Fährmann
7e277d0f7d
[weibo] add 'count' metadata field ( #3305 )
...
or '{status[count]}', as most metadata for weibo is inside 'status'
2 years ago
Mike Fährmann
4287a93202
[nitter] handle base64-encoded filenames
2 years ago
ClosedPort22
b14b33f19e
Implement `version-metadata` option ( #3201 )
2 years ago
sudo
a6305d031c
[hitomi] apply format check for every image ( #3030 ) ( #3280 )
2 years ago
Steven Docherty
a7c7953107
[reddit] use 'dash_url' for videos ( #3258 ) ( #3306 )
...
* use fallback_url for reddit_video to fix issue 3258
* changed to dash_url to include audio
* update
- use [] instead of .get
- catch TypeErrors in case one of the elements is not a dict
Co-authored-by: InterruptSpeed <steven@docherty.ca>
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2 years ago
Mike Fährmann
0e75358af8
[twitter] fix using user IDs for suspended accounts
2 years ago
Mike Fährmann
c25905641e
[weibo] fix bug with empty 'playback_list' ( #3301 )
2 years ago