Mike Fährmann
b5affc62aa
[twitter] rename 'text-only' to 'text-tweets' ( #570 )
3 years ago
Mike Fährmann
724ca61f36
[twitter] add 'text-only' option ( #570 )
3 years ago
Mike Fährmann
8fd8126117
fix ISO 639-1 code for Japanese
...
"jp" -> "ja"
3 years ago
Mike Fährmann
2c60c7d798
[reactor] skip deleted/empty posts
3 years ago
Mike Fährmann
532ac79fb0
update extractor test results
3 years ago
Mike Fährmann
d7bc4a2b8b
[500px] update query hashes
3 years ago
Mike Fährmann
0f35aca728
[aryion] minor code updates
3 years ago
Mike Fährmann
2eb46452ad
[aryion] update 'needle' to not skip text posts ( fixes #1568 )
...
on "Latest Updates" pages
"class='thumb scrollthumb' href='/g4/view/" and
"class='thumb' href='/g4/view/" both end with
"thumb' href='/g4/view/"
3 years ago
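The shared-suffix observation in 2eb46452ad can be checked with a small sketch. The two anchor variants and the needle are copied from the commit message; the helper `extract_ids()` and the sample page are hypothetical and only illustrate why one shorter needle covers both cases.

```python
# Anchor variants and needle as quoted in the commit message.
VARIANTS = [
    "class='thumb scrollthumb' href='/g4/view/",
    "class='thumb' href='/g4/view/",
]
NEEDLE = "thumb' href='/g4/view/"


def extract_ids(html, needle=NEEDLE):
    """Yield the text between each needle and the next single quote.

    Hypothetical helper, not the extractor's actual code.
    """
    pos = 0
    while True:
        start = html.find(needle, pos)
        if start < 0:
            return
        start += len(needle)
        end = html.find("'", start)
        if end < 0:
            return
        yield html[start:end]
        pos = end


# Made-up sample page containing one link of each variant.
page = (
    "<a class='thumb scrollthumb' href='/g4/view/111'></a>"
    "<a class='thumb' href='/g4/view/222'></a>"
)
ids = list(extract_ids(page))
```

Searching for the full "class='thumb' href=..." string would skip the first link, which is the kind of miss the commit describes for text posts.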
Mike Fährmann
adf4d661b3
use '_extractor' info in UrlJobs
3 years ago
Mike Fährmann
4fc9668922
[imgur] update URL patterns ( #1561 )
3 years ago
Mike Fährmann
1eabfa5c7a
[pillowfort] implement login with username & password ( #846 )
3 years ago
Mike Fährmann
24dd10ac3c
[patreon] extract user defined 'tags' ( #1539 , closes #1540 )
3 years ago
Mike Fährmann
a7e4917ee1
[pillowfort] add 'inline' option ( #846 )
...
to support images present in a post's 'content',
but not listed in 'media'.
also separates the file hash present at the beginning
of each 'filename' into its own field.
3 years ago
Mike Fährmann
efa6cc8ec3
[pillowfort] add 'external' option ( #846 )
...
for links to external Twitter posts etc.
3 years ago
Mike Fährmann
394fbb5f56
[twitter] strip useless t.co links ( #1532 )
...
The 'full_text' of Tweets with media content usually ends with a t.co
link to itself. This commit removes those.
3 years ago
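A minimal sketch of the idea in 394fbb5f56, assuming the self-link is the last token of 'full_text'. The regular expression and the function name are assumptions made for illustration; the commit itself may rely on other data (for example entity indices) to locate the link.

```python
import re

# Assumed shape of a trailing t.co link: optional whitespace,
# the short URL, then the end of the text.
TRAILING_TCO = re.compile(r"\s*https?://t\.co/\w+$")


def strip_trailing_tco(full_text):
    """Remove a t.co link only when it is the last thing in the text."""
    return TRAILING_TCO.sub("", full_text)
```

Links that appear in the middle of the text are left alone, since only the trailing one is described as useless.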
Mike Fährmann
3a7c3ff138
support XDG_CONFIG_HOME ( closes #1545 )
...
This loads exactly one of the two files:
${XDG_CONFIG_HOME}/gallery-dl/config.json if XDG_CONFIG_HOME is non-empty, or
${HOME}/.config/gallery-dl/config.json if XDG_CONFIG_HOME is empty,
never both.
3 years ago
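The either/or rule from 3a7c3ff138 could look roughly like this. The function name and the `environ` parameter are hypothetical, and `posixpath` is used so the result does not depend on the platform running the sketch.

```python
import os
import posixpath


def user_config_path(environ=None):
    """Return the single per-user config path, never both candidates.

    Hypothetical helper; 'environ' is any mapping of environment variables.
    """
    if environ is None:
        environ = os.environ
    xdg = environ.get("XDG_CONFIG_HOME")
    if xdg:
        # XDG_CONFIG_HOME is set and non-empty: use it as the base.
        base = xdg
    else:
        # Unset or empty: fall back to ${HOME}/.config
        base = posixpath.join(environ.get("HOME", ""), ".config")
    return posixpath.join(base, "gallery-dl", "config.json")
```

Treating an unset variable the same as an empty one is an assumption of this sketch; the commit message only mentions the empty case.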
Mike Fährmann
41457dbb1b
[twitter] resolve t.co URLs in 'content' ( #1532 )
3 years ago
Mike Fährmann
2b5d80862e
[kemonoparty] add 'type' metadata field ( #1556 )
...
'file', 'attachment', or 'inline'
3 years ago
Mike Fährmann
17b0ccb071
[twitter] add missing retweet media entities ( fixes #1555 )
...
from the original tweets
3 years ago
Mike Fährmann
5eeaaee01d
[pixiv] add 'metadata' option ( #1551 )
3 years ago
Mike Fährmann
0717456b4e
[kemonoparty] add 'metadata' option ( closes #1548 )
...
to fetch creator names with an additional HTTP request
3 years ago
Mike Fährmann
b50b8e6cf4
refactor applying 'parent-…' options
3 years ago
Mike Fährmann
7ab8374385
add 'parent-skip' option ( #1399 )
3 years ago
Mike Fährmann
c693db5b1a
add '"skip": "terminate"' option
...
Stops not only the current extractor/job,
but all parent extractors/jobs as well.
3 years ago
Mike Fährmann
4835888acc
release version 1.17.4
3 years ago
Mike Fährmann
36ed1efcfb
[pixiv] rename "noop" value for 'tags' option to "original"
...
(#1507 )
3 years ago
Mike Fährmann
14f983eab6
[deviantart] use default ID when 'client-id' is None
3 years ago
Mike Fährmann
3e4ffb0821
[gelbooru] add extractor for '/redirect.php' URLs ( #1530 )
3 years ago
Mike Fährmann
5e54105ae4
[instagram] update query hashes
3 years ago
Mike Fährmann
b3ee10a7fb
[500px] update query hashes
3 years ago
Mike Fährmann
15b0241bbc
[imagebam] fix extraction
3 years ago
Mike Fährmann
38ae61edd4
[inkbunny] add 'favorite' extractor ( #1521 )
3 years ago
Mike Fährmann
577fffad5f
[nozomi] update 'archive_fmt' values for tag and search extractors
...
… so they actually work for posts with more than 1 file.
(fixes #1523 )
3 years ago
Mike Fährmann
e300da1424
add 'output.skip' option
3 years ago
Mike Fährmann
c5ca7905ce
add 'noop()' and 'identity()' functions
3 years ago
Mike Fährmann
755164b36a
improve --clear-cache ( #1230 )
...
Allow for an optional argument to only delete cached entries from
a specific module.
delete all cache entries
$ gallery-dl --clear-cache
or
$ gallery-dl --clear-cache all
only delete entries for instagram
$ gallery-dl --clear-cache instagram
3 years ago
HRXN
e13cae182b
[nozomi] Extend default archive-fmt for Tag and Search Extractor ( #1529 )
...
Closes #1523
3 years ago
Mike Fährmann
bc868e7bb8
consider apparently long extensions as part of the filename
...
(#1516 )
3 years ago
Mike Fährmann
2133f1d77f
[readcomiconline] change domain to 'readcomiconline.li'
...
(closes #1517 )
3 years ago
Mike Fährmann
66f28e471c
[kemonoparty] update file URLs directly linking to kemono.party
...
(#1514 )
3 years ago
Mike Fährmann
6fa20d456b
[sankaku] update invalid-token detection ( fixes #1515 )
3 years ago
Mike Fährmann
4b65ebf652
[kemonoparty] fix file URLs ( #1514 )
...
files are now hosted on https://data.kemono.party/
3 years ago
Mike Fährmann
fa519f9202
[pixiv] change 'translated-tags' option ( #1507 )
...
- rename to 'tags'
- use string-values: "japanese", "translated", "noop"
- remove duplicate entries for "translated" tags
3 years ago
Mike Fährmann
5b4da4b4bf
reorder config access in Job constructor
...
(#1111 )
3 years ago
Mike Fährmann
221015e586
[downloader:http] disable filename extension changes for ugoira
...
(#1507 )
3 years ago
Mike Fährmann
e5123f56c9
fix crash when using --no-download with --ugoira-conv ( #1507 )
3 years ago
Mike Fährmann
07b6661a87
release version 1.17.3
3 years ago
thatfuckingbird
e47952ac14
add extractors for fantia and fanbox ( #1459 )
...
* add extractors for fantia and fanbox
* appease linter
* make docstrings unique
* [fantia] refactor post extraction
* [fantia] capitalize
* [fantia] improve regex pattern
* code style
* capitalize
* [fanbox] use BASE_PATTERN for url regexes
* [fanbox] refactor metadata and post extraction
* [fanbox] improve url base pattern
* [fanbox] accept creator page links ending with /posts
* [fanbox] more tests
* [fantia] improved pagination
* [fanbox] misc. code logic improvements
* [fantia] finish restructuring pagination code
* [fanbox] avoid making a request for each individual post when processing a creator page
* [fanbox] support embedded videos
* [fanbox] fix errors
* [fanbox] document extractor.fanbox.videos
* [fanbox] handle "article" and "entry" post types, all embeds
* [fanbox] fix downloading of embedded fanbox posts
3 years ago
Mike Fährmann
d900edfcfb
[simplyhentai] fix extraction
3 years ago
Mike Fährmann
ba8180b5e6
[bcy] don't crash with deleted posts
3 years ago
Mike Fährmann
d108421461
[myportfolio] fix extraction
3 years ago
Mike Fährmann
8b22d4e667
[mangapark] use '"browser": "firefox"' by default
...
to get rid of Cloudflare CAPTCHA responses
3 years ago
Mike Fährmann
9514cb8c12
[exhentai] update 'limits' check ( #1487 )
...
Only use 'limits' to set a custom upper bound.
Checking if the actual maximum gets exceeded is not necessary.
3 years ago
thatfuckingbird
141ca4ac0a
[pixiv] also save untranslated tags when translated-tags is enabled ( #1501 )
3 years ago
Renan Vedovato Traba
9322c5e43b
[exhentai] restore limit config ( #1487 )
...
This partially reverts commit e9ec91c8
3 years ago
Mike Fährmann
cb86bb9cc9
[hentaicosplays] add 'slug' metadata field ( closes #1483 )
3 years ago
Mike Fährmann
b4ed7cb961
fix 'category-transfer' ( #1111 )
...
broken since commit 055c32e0
3 years ago
Mike Fährmann
dddda7d0e7
[hentaicosplays] use GalleryExtractor ( #1473 )
3 years ago
Mike Fährmann
d88e34f17e
[webtoons] use GalleryExtractor
3 years ago
Mike Fährmann
c4210b5371
[webtoons] update agegate/GDPR cookies
3 years ago
Mike Fährmann
d89eb7536b
[naverwebtoon] use GalleryExtractor
3 years ago
Mike Fährmann
9b52eb9bf1
[naverwebtoon] ignore non-comic images
3 years ago
Mike Fährmann
bdfcc9c4b1
update extractor test results
3 years ago
Hans Christian Gunawan
334d690687
[hentaicosplays] Add extractor ( #1473 )
3 years ago
Mike Fährmann
82c32d25af
[500px] update query hashes
3 years ago
Mike Fährmann
de14b7ad7a
[slideshare] fix extraction
3 years ago
Mike Fährmann
bef3105121
[komikcast] fix extraction
3 years ago
Mike Fährmann
086925e685
[shopify] support omgmiamiswimwear.com ( closes #1280 )
3 years ago
thatfuckingbird
224b883ff4
[danbooru] add option for extended metadata extraction ( #1458 )
...
* [danbooru] add option for extended metadata extraction
* appease linter
* [danbooru] update docs/configuration.rst
* [danbooru] rename extended-metadata -> metadata
3 years ago
thatfuckingbird
dff03a6605
[booru] add an option to extract notes (only gelbooru for now) ( #1457 )
...
* [booru] add an option to extract notes (currently implemented only for gelbooru)
* appease linter
* [gelbooru] rename "text" to "body" in note extraction
* add a code comment about reusing return value of _extended_tags
3 years ago
Mike Fährmann
78d7ee3ef4
[yuki] remove module for yuki.la
3 years ago
Mike Fährmann
a86ffb04bb
add 'output.fallback' option
...
to enable/disable fallback URLs for -g/--get-urls
3 years ago
Mike Fährmann
5a98bcec3a
[deviantart] improve folder name matching ( fixes #1451 )
3 years ago
thatfuckingbird
918b0441fb
[gelbooru] fix tag category extraction ( #1455 )
3 years ago
Mike Fährmann
fe6ce5b8f8
[erome] skip deleted albums ( fixes #1447 )
3 years ago
Mike Fährmann
457abf0e71
[deviantart] fix pagination for Eclipse results ( fixes #1444 )
...
- don't crash on missing keys
- use fallback for invalid 'nextOffset' values
3 years ago
Mike Fährmann
dee540050f
[8muses] fix JSON unobfuscation
...
limit the characters that get modified,
leave non-ASCII characters alone
4 years ago
Mike Fährmann
b869b3a9eb
[instagram] fetch media for incomplete GraphSidecar posts
...
GraphSidecar results from /tagged pages don't contain
all media elements, only the first one.
(#1439 )
4 years ago
Mike Fährmann
b0686d2174
[instagram] update query hashes
4 years ago
Mike Fährmann
e8e3717b71
[instagram] add extractor for /tagged posts ( #1439 )
4 years ago
Mike Fährmann
abafe71e04
[exhentai] fix image limit detection ( closes #1437 )
...
check for image limit message when downloading original files
4 years ago
Mike Fährmann
a75e485461
add archive format to InfoJob output ( #875 )
4 years ago
Mike Fährmann
52a7913abe
[artstation] download /4k/ images ( #1422 )
4 years ago
Mike Fährmann
37940193a6
build executables with SOCKS proxy support ( closes #1424 )
4 years ago
Christian Paul
41fbc20020
[webtoons]: Add cookie rstagGDPR_DE=true ( #1431 )
4 years ago
Mike Fährmann
583bee7725
release version 1.17.2
4 years ago
FollieHiyuki
e3b9f88540
Add manganelo extractor ( #1415 )
4 years ago
Mike Fährmann
fd858eed7b
[twitter] add 'user_likes' metadata field for liked tweets
...
i.e. the 'screen_name' of the user whose liked tweets get extracted.
Ideally this would replace 'user' or at least be in the same format,
but that would break backwards compatibility or be impossible/too
complicated thanks to API result differences.
(#1421 )
4 years ago
Mike Fährmann
8d124a3766
[twitter] rename variables
4 years ago
Mike Fährmann
105f3c9666
[twitter] add extractor for direct image links ( closes #1417 )
4 years ago
Mike Fährmann
ec3d5d58a8
[vk] improve extractor ( #474 )
...
- fetch all photos
- add 'metadata' option
- fix extracting photos without '?' in URL
4 years ago
Mike Fährmann
ebd142e2a8
[twitter] don't use youtube-dl for cards when videos are disabled
...
(#1416 )
4 years ago
Mike Fährmann
d5aad999dc
[tapas] implement login with username & password ( #692 )
4 years ago
Mike Fährmann
e9ec91c811
[exhentai] improve image limits check
...
- check if current image is the '509 Bandwidth Exceeded' notification
(https://ehgt.org/g/509.gif or https://exhentai.org/img/509.gif)
- remove 'limits' option
4 years ago
Mike Fährmann
387fe415d5
unescape items in text.split_html()
4 years ago
Mike Fährmann
36291176bc
[pinterest] add 'search' extractor ( #1411 )
4 years ago
Mike Fährmann
058cc47e9b
[bcy] improve pagination
4 years ago
Mike Fährmann
ddd48ceee5
update extractor test results
4 years ago
Mike Fährmann
1a540fbe00
[komikcast] fix extraction
4 years ago
Mike Fährmann
78fd63b8f0
remove 'text.clean_xml()'
...
was not used anywhere
4 years ago
Mike Fährmann
8553b218d9
replace calls to 'os.path.splitext()' with 'str.rpartition()'
...
Makes functions that used it more than twice as fast,
and we can get rid of an import as well.
4 years ago
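For reference, a sketch of the replacement described in 8553b218d9. The function name `split_ext()` is hypothetical. The speed claim comes from the commit message and is not measured here.

```python
import os.path


def split_ext(filename):
    """Split 'name.ext' into (name, ext) with str.rpartition()."""
    name, dot, ext = filename.rpartition(".")
    if not dot:
        # No "." present: rpartition() returns ("", "", filename).
        return filename, ""
    return name, ext


# os.path.splitext() keeps the dot in the extension ...
assert os.path.splitext("image.png") == ("image", ".png")
# ... while the rpartition() variant drops it.
assert split_ext("image.png") == ("image", "png")
```

The two are not drop-in equivalents: os.path.splitext() treats a leading dot (".bashrc") as part of the name and ignores dots in directory components, while str.rpartition() splits at the last dot wherever it is.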
Mike Fährmann
5aa30c3669
[tapas] add 'series' and 'episode' extractors ( #692 )
4 years ago
Mike Fährmann
ccfa5a8694
[twitter] better error message when logging in with 2FA ( #1409 )
4 years ago
Mike Fährmann
214ecf62ce
[deviantart] fix arguments for search/popular results ( #1408 )
4 years ago
Magnus Boman
522d0a834c
[aryion] Unescape paths too ( #1414 )
...
Without this you'll get paths like this:
- Starcross - Ch. 2 &quot;The Ins and Outs of Sarah&quot;
This commit changes it to:
- Starcross - Ch. 2 "The Ins and Outs of Sarah"
4 years ago
beesdotjson
5ad615f0db
fix PixivFavoriteExtractor regex ( #1405 )
...
* fix PixivFavoriteExtractor regex
* do not use lookbehind
4 years ago
Mike Fährmann
62cfee4d28
[vk] initial support for albums ( #474 )
4 years ago
Mike Fährmann
0e601de67b
[sankaku] simplify 'pool' tags ( #1388 )
...
normalize 'tags' and 'artist_tags' to a string-list
4 years ago
Mike Fährmann
d085ade9d5
[sankaku] add 'tag_string' metadata field ( #1388 )
...
The 'join()'ed version of 'tags'.
Handling lists in format strings isn't properly supported yet.
4 years ago
Mike Fährmann
2dffd231b7
[sankaku] add enumeration index for books ( #1388 )
4 years ago
Mike Fährmann
139fb84108
[deviantart] fix username for 'watch' results ( #794 )
...
before it'd use "/" as username
4 years ago
Mike Fährmann
91c2e15da9
[deviantart] add support for posts from watched users ( #794 )
4 years ago
Mike Fährmann
03c20d8c8e
[deviantart] update 'watch' URL pattern ( #794 )
4 years ago
Mike Fährmann
2846235669
[twitter] allow specifying a custom format for user results
...
(#1337 )
4 years ago
Mike Fährmann
bf241811dd
allow '_extractor' fields to be None or empty
4 years ago
Mike Fährmann
dc23cfd684
[deviantart] use fallback for /intermediary/ URLs
...
instead of checking availability with HEAD requests
4 years ago
Mike Fährmann
15daa62842
release version 1.17.1
4 years ago
Mike Fährmann
b0438c8f99
Revert "[deviantart] extend 'extra' option"
...
This reverts commit 5ad2b9c82b, 5c32a7bf58, and 83f465faca.
(#1387 , #1356 )
4 years ago
Mike Fährmann
58b93635ee
[architizer] add 'firm' extractor ( #1369 )
4 years ago
Mike Fährmann
204523611c
[imgclick] use 'http://' for image URLs
...
The TLS certificate for main.imgclick.net is invalid.
4 years ago
Mike Fährmann
0b55f5ad84
[imgur] fix/improve rate limit handling ( #1386 )
...
- also wait-and-retry on 429 status codes
- use infinite loop instead of recursive calls
- 'extractor.sleep()' -> 'extractor.wait()'
4 years ago
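The "infinite loop instead of recursive calls" change in 0b55f5ad84 follows a common pattern, sketched below. All names (`request_with_retry`, `send`, `wait`, `delay`, `max_tries`) and the dict-shaped response are assumptions for illustration, not the extractor's actual interface.

```python
import time


def request_with_retry(send, wait=time.sleep, delay=1.0, max_tries=None):
    """Call send() until it returns a response that is not rate limited.

    'send' is any callable returning a dict with a 'status' key
    (a stand-in for a real HTTP response object).
    """
    tries = 0
    while True:
        response = send()
        if response["status"] != 429:
            return response
        tries += 1
        if max_tries is not None and tries >= max_tries:
            raise RuntimeError("still rate limited after %d tries" % tries)
        # Wait, then go around the loop again instead of calling ourselves.
        wait(delay)
```

Unlike a recursive retry, the loop does not add a stack frame per attempt, so a long rate-limit period cannot run into Python's recursion limit.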
Mike Fährmann
69ca4e29f1
[deviantart] add 'watch' extractor ( #794 )
4 years ago
Mike Fährmann
fcdda6128c
[mangastream] remove module
4 years ago
Mike Fährmann
c677ea19dd
[mangareader] remove module
4 years ago
Mike Fährmann
71523aaab6
[architizer] add 'project' extractor ( #1369 )
4 years ago
Mike Fährmann
3378b39719
[twitter] implement 'users' option ( #1337 )
4 years ago
Mike Fährmann
847e9b0ed7
[philomena] support post URLs without '/images/'
...
e.g. 'derpibooru.org/1'
4 years ago
Mike Fährmann
466966bf83
[hentaicafe] remove module
4 years ago
Mike Fährmann
97641cd151
[hentainexus] remove module
4 years ago
Mike Fährmann
23641742a3
improve 'parent-directory' ( #1364 )
...
Allow forwarding metadata from the top-level extractor to all children
if 'parent-directory' is enabled for all extractors along the way.
For example 'reddit' -> 'gfycat' -> 'redgifs'
4 years ago
Mike Fährmann
c485d0a956
[philomena] add generalized extractors for philomena sites
...
(closes #1379 )
4 years ago
Mike Fährmann
6be7df53da
[hentaifox] improve metadata extraction ( fixes #1378 )
4 years ago
Mike Fährmann
72fe9ac0f3
[gelbooru_v01] support some more boorus by default
...
- https://drawfriends.booru.org/
- https://vidyart.booru.org/
- https://tlb.booru.org/
4 years ago
tux93
10c279f285
Weasyl: Drop the `&feature=submit` part of the favourite extractor URL ( #1374 )
...
It's optional and requiring it forces users to escape those URLs because
of the ampersand
4 years ago
Mike Fährmann
df94182e11
implement 'parent-metadata' option ( #1364 )
...
experimental, might not work as expected, etc.
4 years ago
Mike Fährmann
4be27ff0fe
[nozomi] support '/index-N.html' URLs ( closes #1365 )
...
and '/index-Popular-N.html'
4 years ago
Mike Fährmann
780bac4c8a
[gelbooru] update video server ( fixes #1368 )
...
from 'https://img2.gelbooru.com' to 'https://img3.gelbooru.com'
and provide fallback URLs
4 years ago
Mike Fährmann
f8441e851a
[hentaifox] improve image extraction ( fixes #1366 )
...
build image URLs from embedded JSON data
instead of rewriting thumbnail URLs
4 years ago
Mike Fährmann
c7c3fef0bc
[exhentai] support '/tag/' URLs ( closes #1363 )
4 years ago
Mike Fährmann
90830daf85
[exhentai] improve 'favorites' extraction ( closes #1360 )
...
add special cases for when the favorite count is 0 (Never) or 1 (Once)
4 years ago
Mike Fährmann
b6719becf1
ensure '-s/--simulate' always prints filenames ( #1360 )
...
by assuming a potentially wrong filename extension in cases where the
correct one would only be known after a download has started
4 years ago
Mike Fährmann
83f465faca
[deviantart] refactor 'extra' ( #1356 )
...
- change its expected type to string
- let users specify a list of sources (stash, posts) or 'all'
4 years ago
Mike Fährmann
5c32a7bf58
[deviantart] allow selecting source for 'extra' ( #1356 )
...
Setting 'extra' to "stash" or "deviations" will only download embedded
sta.sh content or deviations. 'true' still downloads both.
4 years ago
Mike Fährmann
a677123abb
[instagram] recognize 'reels' as option for 'include' ( #1329 )
4 years ago
Mike Fährmann
94faf8c85a
add type check before applying 'browser' option ( fixes #1358 )
4 years ago
Mike Fährmann
5cf593a00a
release version 1.17.0
4 years ago
Mike Fährmann
7440d1f112
[pixiv] add 'translated-tags' option ( closes #1354 )
...
(a lot more straightforward than I thought ...)
4 years ago
Ailothaen
2e8061091a
Adding handling of several input files ( #1353 )
...
* Adding handling of several input files
* Fixed flake8 error due to bad indenting
4 years ago
Mike Fährmann
106cdc37c0
[instagram] support '/user/reels/' URLs ( closes #1329 )
4 years ago
Mike Fährmann
524ebb133e
[instagram] refactor reel handling
4 years ago
Mike Fährmann
9785c551bc
[500px] skip unavailable photos ( #1335 )
...
instead of crashing with a KeyError exception
4 years ago
Mike Fährmann
6cfc9613fe
update some code in Extractor constructor
...
- combine '_init_headers' and '_emulate_browser' functionality
into new '_init_session'
- add 'headers' and 'ciphers' options
4 years ago
Mike Fährmann
f59e63669b
[hentaicafe] add 'search' and 'tag' extractors ( #1345 )
4 years ago
Mike Fährmann
38e66940c1
[tumblrgallery] simplify
4 years ago
Seonghyeon Cho
665499924d
Support naver webtoon ( #1331 )
...
* Support naver webtoon (WIP)
* Apply patch
* Change filename format
* Fill test results
* Fill test result
4 years ago
topozorra
a9119da4d4
support `tumblrgallery.xyz` ( #1298 )
...
* support `tumblrgallery.xyz`
* fix format issues
* Refactor and add post and search page support
* Fix warnings
* Few improvements
* Better file names
* Fix linting errors
* move id closer to the beginning of the file name
Co-authored-by: topozorra <none>
4 years ago
Mike Fährmann
c963741860
add '-E/--extractor-info' command-line option ( #875 )
4 years ago
Mike Fährmann
bff71cde80
implement 'util.unique_sequence()'
4 years ago
Mike Fährmann
bae874f370
replace 'wait-min/-max' with 'sleep-request'
...
on exhentai, idolcomplex, reactor
4 years ago
Mike Fährmann
e165e6c265
[wallhaven] add 'collections' extractor ( #1351 )
4 years ago
Mike Fährmann
faf561b6ca
[wallhaven] add 'collection' extractor ( #1351 )
4 years ago
Mike Fährmann
5d3d94ba14
[wallhaven] refactor
4 years ago
Mike Fährmann
1a38fae785
add option to use different youtube-dl modules ( fixes #1330 )
...
by setting the 'downloader.ytdl.module' value. For example
{
    "downloader": {
        "ytdl": {
            "module": "yt_dlp"
        }
    }
}
or '-o module=yt_dlp'
4 years ago
Mike Fährmann
8821dceb79
use __import__() to dynamically load modules
4 years ago
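Loading a module by name, as in 8821dceb79 and the 'downloader.ytdl.module' option from 1a38fae785, can be sketched as follows. The helper `import_first()` is hypothetical, and standard-library module names stand in for "yt_dlp" or "youtube_dl", which may not be installed.

```python
def import_first(names):
    """Return the first module from 'names' that can be imported."""
    for name in names:
        try:
            # __import__() takes the module name as a plain string,
            # so the name can come from a config value.
            return __import__(name)
        except ImportError:
            continue
    raise ImportError("none of %r could be imported" % (names,))


# The first name does not exist, so the second one is loaded.
module = import_first(["surely_not_a_real_module_xyz", "json"])
```

Note that `__import__("a.b")` returns the top-level package `a`; for dotted names, `importlib.import_module()` is the usual choice.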
Mike Fährmann
69ea781d32
[mangadex] improve caching of manga results
...
a 'manga_id' given as a string and the same ID given as an integer are treated as two different keys
4 years ago
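The caching issue named in 69ea781d32 can be reproduced with any cache keyed on its arguments. The sketch below uses `functools.lru_cache` as a stand-in for the project's own cache; the function names are hypothetical, and normalizing to `str` (rather than `int`) is an assumption, since the commit message does not say which form is used.

```python
import functools

calls = []  # records every cache miss, for demonstration only


@functools.lru_cache(maxsize=None)
def manga_data(manga_id):
    """Pretend to fetch manga data; runs only on a cache miss."""
    calls.append(manga_id)
    return {"id": manga_id}


def manga_data_normalized(manga_id):
    """Normalize the key first so 123 and "123" share one cache entry."""
    return manga_data(str(manga_id))
```

Calling `manga_data(123)` and `manga_data("123")` directly would create two entries and two fetches, because the integer and the string compare unequal as keys.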
Mike Fährmann
e58039358d
[mangadex] use 'api.mangadex.org' as default API server
...
The caching issues seem to be gone.
(#1290 , #1310 )
4 years ago
Mike Fährmann
fc15930266
[readcomiconline] download high quality image versions
...
(fixes #1347 )
4 years ago
Mike Fährmann
f360778e60
[komikcast] fix extraction
4 years ago
Mike Fährmann
3df527ee2c
update extractor test results
4 years ago
Mike Fährmann
1bd3d7cfb0
[postprocessor:metadata] call expand_path() on custom paths
...
(#1299 )
4 years ago
Mike Fährmann
fe2ec9cf68
[patreon] reduce redirects when fetching campaign ID
4 years ago
Mike Fährmann
29ea54dc41
[patreon] use '"browser": "firefox"' by default ( #1117 )
4 years ago
Mike Fährmann
61fbbd2dae
[exhentai] rename metadata fields to match API results ( #1325 )
...
- gallery_id -> gid
- gallery_token -> token
- title_jp -> title_jpn
- visible -> expunged
- gallery_size -> filesize
- count -> filecount
Also changes the function of the 'metadata' option.
It is now boolean and causes extra data fields from the API to be added
instead of completely replacing the data from HTML when activated.
4 years ago
Mike Fährmann
996bfe4d4b
[hentaicafe] fix manga extractor
...
was broken since 993856b8
4 years ago
Mike Fährmann
5d69e437d0
[twitter] add option to download all media from a conversation
...
(fixes #1319 )
4 years ago
Mike Fährmann
cf5fa75d4c
add 'browser' option ( #1117 )
...
- change default user agent to Firefox ESR 78 on Windows 10
- remove 'ciphers' option
4 years ago
Mike Fährmann
92071d02f4
fix crash when 'base-directory' is an empty string ( #1339 )
4 years ago
Mike Fährmann
970fc2b2b5
allow setting 'filename' & '(base-)directory' to default
...
by setting them to 'None'/'null'
4 years ago
Mike Fährmann
e5735361ed
[exhentai] add 'metadata' option ( #1325 )
...
to select between gallery metadata from 'api' or 'html'
4 years ago
Mike Fährmann
8f095a0980
[exhentai] extract more metadata from gallery pages ( #1325 )
4 years ago
Mike Fährmann
ffce8d85e7
[cyberdrop] update
...
- add test and archive_fmt
- extract more metadata
4 years ago
Mike Fährmann
de0656941b
[twitter] add extractor for followed users ( #1337 )
...
https://twitter.com/USER/following or
https://twitter.com/id:USERID/following
4 years ago
Mike Fährmann
e39aea42cd
fix supportedsites.py for modules without docstring
...
(fixes #1332 )
4 years ago
loragja
7b5ee922b7
cyberdrop extractor ( #1328 )
...
* create cyberdrop extractor
* add cyberdrop to list of extractors
* fix formatting
* change class name from CyberdropExtractor to CyberdropAlbumExtractor
* add cyberdrop to list of supported sites
* attempt to clean up diff of supportedsites.rst
* replace regex with functions from text library
4 years ago
Mike Fährmann
5ad2b9c82b
[deviantart] extend 'extra' option
...
also download from embedded DeviantArt posts
4 years ago
Mike Fährmann
560277394e
[downloader:http] add 'headers' option ( #1322 )
4 years ago
Mike Fährmann
6b0ecbf6bc
[hentainexus] add 'original' option ( #1322 )
4 years ago
Mike Fährmann
5542a11c46
[twitter] update GraphQL endpoints
4 years ago
Mike Fährmann
e1a12761d7
strip '/' from instance root URLs
4 years ago
Mike Fährmann
595bdaa4be
add extractors for gelbooru v0.1 sites
...
- support https://illusioncards.booru.org/ (closes #426 )
- support https://the-collection.booru.org/ (closes #767 )
- support https://allgirl.booru.org/
- closes #234 , closes #473 , closes #1238
To get gallery-dl to recognize other sites running Gelbooru v0.1
(most sites on booru.org), add one or more entries to the
'gelbooru_v01' block in your config file. For example:
{
    "extractor": {
        "gelbooru_v01": {
            "rozenmaidenbooru": {"root": "http://rm.booru.org"},
            "drawfriendsbooru": {"root": "http://drawfriends.booru.org"}
        }
    }
}
4 years ago
Mike Fährmann
59fd740b47
[tbib] add support for https://tbib.org/ ( #473 , closes #1082 )
4 years ago
Mike Fährmann
08d7934c6e
move extractors from booru.py into their own gelbooru_v02 module
4 years ago
Mike Fährmann
d656892670
remove cloudflare.py
...
The old IUAM challenge doesn't get used anymore, i.e. code to bypass it
is pointless, and the 'is_...()' checks are simple enough to directly
include them in 'extractor.request()'.
4 years ago
Mike Fährmann
65ca923b4e
fix 'whitelist' option for BaseExtractor instances
4 years ago
Mike Fährmann
fbfcbcbf57
Merge branch '1.17.0'
4 years ago
Mike Fährmann
6e40585fb1
release version 1.16.5
4 years ago
Mike Fährmann
ba693d8686
[patreon] skip posts without view permission ( #1316 )
4 years ago
Mike Fährmann
dcbd995346
[vanillarock] fix metadata extraction
4 years ago
Mike Fährmann
4b1cda4cf7
[paheal] fix metadata extraction
4 years ago
Mike Fährmann
2919d78bfc
update extractor test results
4 years ago
Mike Fährmann
8974f0361c
[pixiv] update ( #1304 )
...
- remove login with username & password
- require a refresh token
- add 'oauth:pixiv' functionality
See also:
- https://github.com/upbit/pixivpy/issues/158
- https://gist.github.com/ZipFile/c9ebedb224406f4f11845ab700124362
4 years ago
Mike Fährmann
79c0fc249b
[mangadex] add 'api-server' option ( #1309 )
...
and change the API server back to 'https://mangadex.org/api' for now
4 years ago
Mike Fährmann
96a51ff169
[sankaku] update invalid-token detection ( fixes #1309 )
4 years ago
Mike Fährmann
b3cd970d87
[postprocessor:metadata] fix crash with 'extension-format'
...
Using the 'extension-format' option for events where no filename
extension is available caused a crash.
(fixes #1285 )
4 years ago
Mike Fährmann
23be48427c
[deviantart] fix 'folders' option ( closes #1302 )
...
don't assume parent folders are listed before their children
4 years ago
Mike Fährmann
ca6b0fc2ac
[imagehosts] cleanup
4 years ago
Mike Fährmann
95a66bdad6
[imgclick] add 'image' extractor ( closes #1307 )
...
basically reverts b0e8daf415
4 years ago
Mike Fährmann
fc78210725
[kemonoparty] include 'service' in directories and archive keys
4 years ago
Mike Fährmann
c386a9fabf
[kemonoparty] fix absolute file URLs
4 years ago
Mike Fährmann
7e7158e7c0
[kemonoparty] support URLs with non-numeric user and post IDs
...
(fixes #1303 )
4 years ago
Mike Fährmann
e88d5bede8
[500px] update query hash
4 years ago
Mike Fährmann
280b1ac16d
[slideshare] fix extraction
4 years ago
Mike Fährmann
ae530f6365
[erome] add extractors for albums, users, searches ( closes #409 )
4 years ago
ticklebits
fa6d4d73c7
[hentaifox] support searching by group ( #1294 )
...
Groups on hentaifox list their items the same way as the other pages (artists, search, tag, etc.). Added 'group' to the search pattern, along with a test.
4 years ago
Mike Fährmann
2cc1e04fe5
[kemonoparty] extract inline images ( fixes #1286 )
4 years ago
Mike Fährmann
56a8968435
remove 'Message.Metadata' ( #866 )
4 years ago
Mike Fährmann
1d145a6186
[mastodon] use cache for OAuth tokens ( #616 )
4 years ago
Mike Fährmann
a041a017d1
[pillowfort] ignore files without download URL ( #846 )
4 years ago
Mike Fährmann
23ff936d46
[nsfwalbum] use fallback for deleted content ( fixes #1259 )
4 years ago
Mike Fährmann
a228bb3a5f
[downloader:http] support callbacks to validate responses
4 years ago
Mike Fährmann
6b2bce3b7d
[behance] support 'video' modules ( closes #1282 )
...
(requires youtube-dl to download from m3u8 manifests)
4 years ago
Mike Fährmann
b0cff979b1
[inkbunny] raise NotFoundError for invalid/private submissions
...
instead of crashing
4 years ago
Mike Fährmann
e61b125fcc
[inkbunny] add 'sid' parameter to private file downloads
...
(fixes #1281 )
4 years ago
Mike Fährmann
36bf76fa44
update 'oauth:mastodon:<instance>' code
4 years ago
Mike Fährmann
88fae99811
remove 'generate_extractors()'
4 years ago
Mike Fährmann
fa33f13453
[mastodon] update
...
- inherit from BaseExtractor
- remove custom generate_extractors() and config()
- improve layout of MastodonAPI internals
4 years ago
Mike Fährmann
231bcad614
[shopify] use BaseExtractor
4 years ago
Mike Fährmann
2de8ebc098
[moebooru] use BaseExtractor
4 years ago
Mike Fährmann
0978c1e184
[booru] use BaseExtractor
4 years ago
Mike Fährmann
c6cc86d7d0
[deviantart] update parameters for '/browse/popular'
...
- limit results to 50 when also querying metadata (fixes #1267 )
- remove deprecated 'category_path' parameter
4 years ago
Mike Fährmann
993856b866
[foolslide] use BaseExtractor
4 years ago
Mike Fährmann
671a95cae5
[foolfuuka] use BaseExtractor
4 years ago
Mike Fährmann
745a114c61
[common] implement BaseExtractor class
...
Should be used when the same extractor logic applies to different
instances/domains of several sites, e.g. FoolFuuka, Shopify, etc.
This will replace the functionality of 'generate_extractors()' in
a more efficient way, by condensing everything into 1 class and not
dynamically generating an extractor class for each instance.
4 years ago
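The "one class for all instances" idea from 745a114c61 might be pictured like this. It is only a sketch under stated assumptions: the class name, the `instances` table and the prefix matching are hypothetical and do not reproduce the real BaseExtractor; the two example roots are borrowed from the gelbooru_v01 config example elsewhere in this log.

```python
class InstanceExtractorSketch:
    """One class that serves several sites running the same software."""

    # category name -> root URL (hypothetical instance table)
    instances = {
        "rozenmaidenbooru": "http://rm.booru.org",
        "drawfriendsbooru": "http://drawfriends.booru.org",
    }

    def __init__(self, url):
        # Pick the instance at construction time instead of
        # generating a separate class per site.
        for category, root in self.instances.items():
            if url.startswith(root):
                self.category = category
                self.root = root
                return
        raise ValueError("no known instance matches " + url)


extractor = InstanceExtractorSketch("http://rm.booru.org/index.php?page=post")
```

The alternative the commit moves away from would build one class per instance at import time, for example with `type(name, (Base,), {...})`, which costs a class object for every configured site.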
Mike Fährmann
b549c53b36
add long option for '-G'
4 years ago
Mike Fährmann
c26de0929d
[deviantart] provide 'extension' for original file downloads
...
(#1272 )
4 years ago
Mike Fährmann
24e8e398e0
[twitter] skip login if 'auth_token' cookie is present
4 years ago
Mike Fährmann
cdb0b02e30
[pillowfort] add 'reblogs' option ( #846 )
4 years ago
Mike Fährmann
7ca3bf7cb0
[pillowfort] add 'user' and 'post' extractors ( #846 )
4 years ago
Mike Fährmann
ebf417f31f
remove support for deprecated options
...
- instagram.highlights
- metadata.bypost
- exec.final
4 years ago
Mike Fährmann
477ed010c1
release version 1.16.4
4 years ago
Mike Fährmann
1d13e48512
[unsplash] implement 'skip()'
4 years ago
Mike Fährmann
6cdbfb79e9
[photovogue] update ( #1253 )
4 years ago
Federico Ravasio
25297815bc
[photovogue] added portfolio extractor ( #1253 )
4 years ago
Mike Fährmann
0265fbda61
[mangakakalot] fix extraction
4 years ago
Mike Fährmann
7a096c443f
[unsplash] add 'format' option ( #1197 )
4 years ago
Mike Fährmann
3188ac16d1
[unsplash] add 'collection' extractor ( #1197 )
4 years ago
Mike Fährmann
247cc73446
[derpibooru] update 'date' parsing
4 years ago
Mike Fährmann
193dca2ce1
update extractor test results
4 years ago
Mike Fährmann
89ea1384fc
[unsplash] fix typo
4 years ago
Mike Fährmann
e5e591b848
[vipr] simplify and add test ( #1258 )
4 years ago
v-delta
e707e060cb
[vipr] add image extractor ( #1258 )
...
* [vipr] add image extractor
Adds support for images hosted on https://vipr.im
* Fix codestyle issues
4 years ago
Mike Fährmann
95e5911895
[twitter] match '/i/user/ID' URLs
4 years ago
Mike Fährmann
069b113cbf
[twitter] improve and fix retry after hitting rate limit
...
- replace recursive call with infinite loop
- fix function arguments for recursive call
4 years ago
Mike Fährmann
89a2bcbb2d
[furaffinity] add 'descriptions' option ( #1231 )
4 years ago
Mike Fährmann
36f281330a
[newgrounds] fix flash file extraction ( closes #1257 )
...
… and add a 'flash' option to choose between flash and video formats.
4 years ago
Mike Fährmann
534194bf92
[unsplash] add extractors ( #1197 )
...
for
- single photos (/photos/ID)
- user profiles (/@USER)
- user likes (/@USER/likes)
- search results (/s/photos/SEARCH)
4 years ago
Mike Fährmann
1fc16cb8c5
[instagram] fix regex for '/saved' URLs ( fixes #1251 )
...
The URL pattern erroneously had two '([^/?#]+)' capture groups,
which would split any username into 'usernam' for the first group
and 'e' for the ignored second group.
4 years ago
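The regex behavior described in 1fc16cb8c5 is easy to reproduce. The patterns below are simplified assumptions that keep only the relevant part; the extractor's real URL pattern is longer.

```python
import re

# Two adjacent greedy groups: the first one must give back one
# character so that the second group can match at all.
broken = re.compile(r"/([^/?#]+)([^/?#]+)/saved")
fixed = re.compile(r"/([^/?#]+)/saved")

url = "https://www.instagram.com/username/saved"
broken_groups = broken.search(url).groups()  # ('usernam', 'e')
fixed_groups = fixed.search(url).groups()    # ('username',)
```

Because each group requires at least one character, the first group backtracks by exactly one, which yields the 'usernam' / 'e' split from the commit message.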
Mike Fährmann
c008cb5100
[pixiv] add 'related' option ( #1237 )
4 years ago
Mike Fährmann
e9a75e27d9
[foolfuuka] stop search when results are exhausted ( #1174 )
4 years ago
Mike Fährmann
b0cf968115
[mangadex] update API URLs
...
https://mangadex.org/thread/351011/9/#post_4238014
4 years ago
Mike Fährmann
a6414c31d6
[kemonoparty] simplify ( #1216 )
...
Use metadata from API responses as is and
don't try to detect duplicates by their original filename.
4 years ago
Mike Fährmann
01b9ccd4de
[derpibooru] use the "Everything" filter by default ( #1243 )
...
when neither 'api-key' nor 'filter' is set
4 years ago
Mike Fährmann
91308140ec
make 'generate_token()' compatible with Python 3.4
4 years ago
Mike Fährmann
1fdecfa269
[kemonoparty] use API endpoints ( #1216 )
4 years ago
Mike Fährmann
318876e4dd
[nozomi] add 'num' enumeration index ( closes #1239 )
4 years ago
Mike Fährmann
2da9068ea8
[sankaku] simplify login process
4 years ago
Mike Fährmann
e07dfc4fe5
[kemonoparty] add 'user' and 'post' extractors ( #1216 )
4 years ago
Mike Fährmann
780b6adb91
rename 'generate_csrf_token()' to just 'generate_token()'
...
and add a 'size' argument
4 years ago
Mike Fährmann
f277e48c77
release version 1.16.3
4 years ago
Mike Fährmann
79501a356f
fix crash when 'path-restrict' is an object/dict
...
This basically reverts commit 5818c928
(#1234 )
4 years ago
Mike Fährmann
0fdaea00a3
[postprocessor:metadata] sanitize filenames
4 years ago
Mike Fährmann
32fcc61b84
release version 1.16.2
4 years ago
Mike Fährmann
02bc59d75c
[hentainexus] fix extraction ( fixes #1234 )
...
hentainexus is now hosting its images on wordpress, or at least it is
using wordpress' servers as cache:
https://i2.wp.com/images.hentainexus.com/gallery/2199754b23c191deb330c99c9bb43341/9576/002.png?filter=null
4 years ago
Mike Fährmann
5d4494b15f
add "ascii" as a special 'path-restrict' value
4 years ago
Mike Fährmann
5818c928c4
refactor 'path-restrict' parsing
4 years ago
Mike Fährmann
aac00a2024
add 'd' conversion for format strings
...
to convert a timestamp to a formattable 'datetime' object.
For example '{created_at!d:%Y-%m-%d}'
transforms the timestamp in 'created_at' into a 'datetime' object
and then formats its content using '%Y-%m-%d' as template.
1262304000 -> datetime(2010, 1, 1) -> "2010-01-01"
4 years ago
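The two steps behind the 'd' conversion described above (timestamp to 'datetime', then formatting with a template) can be sketched in plain Python. The helper name `format_timestamp` is hypothetical, and interpreting the timestamp as UTC is an assumption of this sketch, not something the commit message states.

```python
from datetime import datetime, timezone


def format_timestamp(timestamp, template):
    """Convert a Unix timestamp to a datetime and format it with template."""
    # Assumption: the timestamp is interpreted as UTC.
    dt = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    # format() on a datetime applies the template like strftime(), which is
    # how a spec such as ':%Y-%m-%d' behaves inside a format string.
    return format(dt, template)


print(format_timestamp(1262304000, "%Y-%m-%d"))  # 2010-01-01
```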
Mike Fährmann
20bd9cd296
[wikiart] add extractor for single paintings ( closes #1233 )
...
There is no API endpoint for single paintings from what I can tell,
so this uses the site's search.
4 years ago
Mike Fährmann
e2d4ca4955
[deviantart] improve '--range' for favorites ( closes #1226 )
4 years ago
Mike Fährmann
56ccb9951a
[gfycat] add 'date' metadata field ( #1138 )
4 years ago
Mike Fährmann
f2b83b8578
[gfycat] convert IDs to lowercase
...
Redgifs expects all IDs and names to be lowercase
and throws a 404 if an ID contains an uppercase letter.
Gfycat on the other hand doesn't care about case,
so it's fine to just convert all IDs.
(#1138 )
4 years ago
Mike Fährmann
b3bc646236
[redgifs] match embedded URLs
...
https://redgifs.com/ifr/<ID>
4 years ago
Mike Fährmann
98e0d21383
[instagram] categorize single highlight URLs as 'highlights'
...
They were categorized as 'stories' before.
(fixes #1222 )
4 years ago
Mike Fährmann
1c9435e0df
add '-G' command-line option ( #1217 )
...
A "stronger" version of '-g', resolving all intermediate URLs.
4 years ago
Mike Fährmann
fa8ee6eac4
[derpibooru] add search and gallery extractors ( #862 )
4 years ago
Mike Fährmann
3759d0cb42
[redgifs] fix search results
...
The metadata for Redgifs search results got stripped down to a bare
minimum, including download URLs. (Clicking on search results on the
website itself is broken as well)
As a workaround, we make an extra call to '/v1/gfycats/<ID>'
for each search result entry to fetch the missing data.
4 years ago
Mike Fährmann
8a88025dc4
[pinterest] support generic user URLs ( #1205 )
...
i.e. https://www.pinterest.com/USERNAME
also renames 'BoardsExtractor' to 'UserExtractor'
4 years ago
Mike Fährmann
56b460dcea
[foolfuuka] add 'search' extractors ( #1174 )
4 years ago
Mike Fährmann
fb64183d53
[foolfuuka] add 'board' extractors ( closes #1044 )
4 years ago
Mike Fährmann
0594821fcd
[downloader:http] add MIME type and signature for .ico files
...
(closes #1211 )
4 years ago
Mike Fährmann
b0beed7a06
[sankaku] add support for book searches ( closes #1204 )
4 years ago
Mike Fährmann
6cdbab07b5
[pinterest] add support for getting all boards of a user
...
(#1205 )
4 years ago
Mike Fährmann
25074aec47
[twitter] fetch media from pinned tweets ( #1203 )
4 years ago
Mike Fährmann
2475176d99
[twitter] fetch tweets from 'homeConversation' entries
...
When logged in, some entries returned by Twitter's API are so-called
'homeConversation's (they would be regular tweet entries otherwise.)
Those weren't picked up before and resulted in missing files compared
to accessing a timeline as guest.
('/media' timelines and search results were not affected)
4 years ago
Mike Fährmann
3af9350648
[twitter] update API calls
...
- use 'https://twitter.com/i/api' for all requests
except '/guest/activate.json'
- update (default) URL parameters
- update GraphQL endpoints
4 years ago
Mike Fährmann
b656b829db
[twitter] fix login with username & password
...
It is no longer possible to get an 'authenticity_token' from Twitter's
JavaScript-free login form, which got disabled a few days ago.
Generating a random 16 byte hex string client-side and sending that as
a cookie alongside the regular login form works just as well.
4 years ago
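Generating such a random hex string client-side could look like the sketch below. The function name `random_hex_token` is hypothetical, and this is not claimed to be the project's own 'generate_token()'. It reads "16 byte hex string" as 16 random bytes, which encode to 32 hexadecimal characters; that reading is an assumption. `binascii.hexlify` is used instead of `bytes.hex()` because the latter only exists from Python 3.5 on.

```python
import binascii
import os


def random_hex_token(size=16):
    """Return `size` random bytes encoded as a lowercase hex string."""
    # os.urandom() draws from the operating system's randomness source.
    data = os.urandom(size)
    # hexlify() returns bytes; decode to get a regular str.
    return binascii.hexlify(data).decode("ascii")


token = random_hex_token()
print(len(token))  # 32
```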
Mike Fährmann
d1903589a5
release version 1.16.1
4 years ago
Mike Fährmann
912eea29bc
update extractor test results
4 years ago
Mike Fährmann
47a7a51944
[sankaku] fix 'invalid_token' detection
4 years ago
Mike Fährmann
ba5df84f7e
[keenspot] improve redirect handling
...
Before, it would use http:// for all requests and
get redirected to the https:// version where that is supported.
Now the redirect only happens once during the first request.
4 years ago
Mike Fährmann
d781e6ac44
[e621] return pool posts in order ( closes #1195 )
...
… and add a 'num' enumeration index.
A bit more code than the PR version, but it prints some helpful messages
and doesn't call 'metadata()' twice.
4 years ago
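Returning pool posts in pool order with a 'num' index, as in the [e621] commit above, can be sketched as follows. This is not the extractor's code: the function `order_pool_posts`, the 'id' key, and the idea that the pool supplies a list of post IDs in reading order are all assumptions made for illustration.

```python
def order_pool_posts(post_ids, posts):
    """Return `posts` in the order given by `post_ids`, adding a 'num' index."""
    # Index the (possibly unordered) posts by their ID for quick lookup.
    by_id = {post["id"]: post for post in posts}

    ordered = []
    for num, post_id in enumerate(post_ids, 1):
        post = by_id.get(post_id)
        if post is None:
            # A post listed in the pool may be missing from the results.
            print("post", post_id, "is not available, skipping")
            continue
        # Copy the post so the input is left untouched; 'num' is the
        # 1-based position within the pool.
        ordered.append(dict(post, num=num))
    return ordered


posts = [{"id": 1}, {"id": 2}, {"id": 3}]
print(order_pool_posts([3, 1, 2], posts))
```

In this sketch, 'num' keeps the post's position in the pool even when an earlier post is missing, so gaps in the numbering show where a post was skipped.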