Mike Fährmann
4adc44df69
[furaffinity] improve metadata extraction (fixes #1630)
...
Fetch 'title' and 'artist' metadata from a different location,
since for posts with an empty title the <title> element is
completely empty and does not contain the artist's name.
3 years ago
Mike Fährmann
e98fa01c44
[hitomi] update image URL code (fixes #1637)
3 years ago
Mike Fährmann
e9ab97396f
[kemonoparty] update default filenames and archive IDs (#1514)
...
Add an enumeration index so that attachments and regular files with the
same filename still get downloaded and are not counted as duplicates
(even though for Patreon posts they usually are).
This invalidates all previously generated archive IDs.
To keep using old names and IDs, set
'filename' to "{id}_{title}_{filename}.{extension}" and
'archive-format' to "{service}_{user}_{id}_{filename}.{extension}".
3 years ago
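A minimal sketch of why the kemonoparty change above needs an enumeration index. The old 'archive-format' value is the one quoted in the commit message; the new-style template and its `{num}` field are hypothetical stand-ins, since the commit does not spell out the new defaults. Python's `str.format` is used here only as an approximation of how such templates expand, and the metadata values are made up.

```python
# Old archive-format, as quoted in the commit message above.
OLD_ARCHIVE_FORMAT = "{service}_{user}_{id}_{filename}.{extension}"
# Hypothetical new-style template: "{num}" stands in for the enumeration index.
NEW_ARCHIVE_FORMAT = "{service}_{user}_{id}_{num}"

# Two files of one post that share a filename (illustrative metadata only).
files = [
    {"service": "patreon", "user": "123", "id": "456",
     "filename": "image", "extension": "png", "type": "file"},
    {"service": "patreon", "user": "123", "id": "456",
     "filename": "image", "extension": "png", "type": "attachment"},
]


def archive_ids(template, entries):
    """Expand the template once per file, numbering files from 1."""
    return [template.format(num=num, **entry)
            for num, entry in enumerate(entries, 1)]


old_ids = archive_ids(OLD_ARCHIVE_FORMAT, files)
new_ids = archive_ids(NEW_ARCHIVE_FORMAT, files)

print(old_ids)  # identical entries: the second file looks like a duplicate
print(new_ids)  # the index keeps the two entries distinct
```

With the old template both files map to the same archive ID, so the second one would be treated as already downloaded; any per-file index avoids that collision.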
Mike Fährmann
fb4b4725ba
[hiperdex] match 'hiperdex2.com' URLs
...
still doesn't work properly due to the Cloudflare CAPTCHA and IUAM page
3 years ago
Mike Fährmann
95bc1139e0
[instagram] update query hashes
3 years ago
Mike Fährmann
23018a46f6
[instagram] fix login (fixes #1631)
3 years ago
Mike Fährmann
cac0110d8b
[redgifs] update API server address (fixes #1632)
...
napi.redgifs.com -> api.redgifs.com
3 years ago
Mike Fährmann
0d2961ae81
[500px] remove last query hash entry
...
forgot to include this in b56e2450
3 years ago
Mike Fährmann
7273cf8536
[pixiv] support fetching privately followed users (fixes #1628)
3 years ago
Mike Fährmann
e60962f7e5
[philomena] improve tag escapes handling (fixes #1629)
3 years ago
Mike Fährmann
d8908ca577
[unsplash] update collections URL pattern (fixes #1627)
3 years ago
Mike Fährmann
9ed13703cc
[sankaku] handle empty tags (fixes #1617)
3 years ago
Mike Fährmann
b56e245094
[500px] update GraphQL queries
...
500px changed its method from query hashes to sending the entire query
string for every request.
3 years ago
Mike Fährmann
a751afdfb3
[twitter] change some defaults
...
- 'retweets' option: true -> false
- 'quoted' option  : true -> false
  i.e. disable downloading tweets from other users' timelines by default
- search directory:
  '["{category}", "Search", "{search}"]' ->
  '["{category}", "{user[name]}"]'
  i.e. change it to the same as other twitter extractors (#1308)
3 years ago
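To keep the previous Twitter behavior, the old values listed above could be set explicitly. The sketch below builds such a configuration in Python and prints it as JSON. The option names and values come from the commit message; the nesting under "extractor", "twitter", and "search" is an assumption about the configuration file layout and should be checked against the project's documentation.

```python
import json

# Old defaults from the commit message; the key nesting is an assumption.
old_twitter_defaults = {
    "extractor": {
        "twitter": {
            "retweets": True,
            "quoted": True,
            "search": {
                "directory": ["{category}", "Search", "{search}"],
            },
        }
    }
}

config_text = json.dumps(old_twitter_defaults, indent=4)
print(config_text)
```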
Mike Fährmann
4e4ca3c330
[deviantart] pin API version (#1611)
...
'/gallery/folders' in the newest version doesn't include subfolders.
It probably only needs the right query parameter to do so, but that
doesn't seem to be documented anywhere.
3 years ago
Mike Fährmann
d09bc5bd34
[subscribestar] improve attachment filenames (#1609)
3 years ago
Mike Fährmann
2986bf63bf
[mangafox] update URL pattern (fixes #1608)
...
also accept non-numeric volume labels, e.g. vTBD
3 years ago
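The idea behind the mangafox change above can be sketched with a simplified pattern. This is a hypothetical regular expression, not the extractor's actual one, and the URL paths are made up. It only illustrates accepting any volume label (such as `vTBD`) rather than digits only.

```python
import re

# Hypothetical, simplified chapter pattern; not the extractor's real one.
# The volume part accepts any non-slash label instead of digits only.
CHAPTER_PATTERN = re.compile(
    r"/manga/([^/]+)/(?:v([^/]+)/)?c(\d+(?:\.\d+)?)"
)


def parse_chapter(path):
    """Return (manga, volume, chapter), or None if the path does not match."""
    match = CHAPTER_PATTERN.search(path)
    return match.groups() if match else None


print(parse_chapter("/manga/example/vTBD/c001/1.html"))
print(parse_chapter("/manga/example/v05/c006.2/1.html"))
print(parse_chapter("/manga/example/c010/1.html"))
```

A digits-only volume group such as `v(\d+)` would reject the first path, which is the kind of URL the commit message mentions.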
Mike Fährmann
53dab5c289
[mangadex] revert chapter handling (#1535)
...
Spawn a new ChapterExtractor for each individual chapter
instead of handling them directly with a MangaExtractor.
Doing it that way broke too many features like
--chapter-filter, --chapter-range, --zip, etc.
3 years ago
Mike Fährmann
1197ee2c20
[mangadex] add extractor for a user's followed feed (#1535)
3 years ago
Mike Fährmann
07c8adbd8b
[mangadex] implement login with username & password (#1535)
3 years ago
Mike Fährmann
3e332eaf53
[mangadex] update to API v5 (#1535)
3 years ago
Mike Fährmann
04f4f9badb
[oauth] prevent exceptions when reporting errors (#1603)
3 years ago
Mike Fährmann
a3bf878329
[idolcomplex] improve and fix pagination (#1601)
...
always rely on the 'next-page-url' value and its query parameters
3 years ago
Mike Fährmann
e39c4633ba
[cyberdrop] b64decode -> a2b_base64
3 years ago
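The cyberdrop commit above swaps one standard-library decoder for another. The commit does not state the motivation, so the sketch below only shows that, for well-formed input, `base64.b64decode` and `binascii.a2b_base64` return the same bytes. The sample string is illustrative.

```python
import base64
import binascii

# Illustrative input: base64 for the ASCII text "hello world".
encoded = "aGVsbG8gd29ybGQ="

via_base64 = base64.b64decode(encoded)
via_binascii = binascii.a2b_base64(encoded)

print(via_base64)
print(via_base64 == via_binascii)
```

`base64.b64decode` is a thin wrapper that adds handling for alternative alphabets and validation before delegating to `binascii.a2b_base64`, so calling the latter directly skips that extra layer.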
Mike Fährmann
407627ec86
[foolfuuka] support 'archive.wakarimasen.moe' (closes #1595)
3 years ago
Mike Fährmann
78f89d2e61
[idolcomplex] fix pagination (closes #1594)
3 years ago
Mike Fährmann
52052a0e1a
[manganelo] update domain to 'manganato.com'
3 years ago
Mike Fährmann
c80b18a477
[weibo] extend 'retweets' option (closes #1542)
...
Setting 'retweets' to "original" will use metadata from the
original posts, and not from the retweeted ones.
3 years ago
Mike Fährmann
c0fa5058da
[kemonoparty] actually add a 'type' metadata field (#1556)
3 years ago
thatfuckingbird
264beb8556
recognize v2.mangapark URLs (#1578)
...
* recognize v2.mangapark URLs
* update mangapark root url to use the v2 subdomain
3 years ago
thatfuckingbird
e6811c7450
[pixiv] implement 'max-posts' option (#1558)
...
* implement max-rank for pixiv
* rename to max-posts and make more generic
3 years ago
Mike Fährmann
8a909e478d
[imagebam] fix extraction of NSFW images (#1534)
3 years ago
Mike Fährmann
b5affc62aa
[twitter] rename 'text-only' to 'text-tweets' (#570)
3 years ago
Mike Fährmann
724ca61f36
[twitter] add 'text-only' option (#570)
3 years ago
Mike Fährmann
8fd8126117
fix ISO 639-1 code for Japanese
...
"jp" -> "ja"
3 years ago
Mike Fährmann
2c60c7d798
[reactor] skip deleted/empty posts
3 years ago
Mike Fährmann
532ac79fb0
update extractor test results
3 years ago
Mike Fährmann
d7bc4a2b8b
[500px] update query hashes
3 years ago
Mike Fährmann
0f35aca728
[aryion] minor code updates
3 years ago
Mike Fährmann
2eb46452ad
[aryion] update 'needle' to not skip text posts (fixes #1568)
...
on "Latest Updates" pages
"class='thumb scrollthumb' href='/g4/view/" and
"class='thumb' href='/g4/view/" both end with
"thumb' href='/g4/view/"
3 years ago
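The aryion commit above relies on both markup variants sharing a common suffix. The following sketch checks that with plain string operations. The sample HTML and the post IDs are made up, and the stricter needle is assumed only for comparison; the commit does not quote the previous value.

```python
# Shared suffix quoted in the commit message.
NEEDLE = "thumb' href='/g4/view/"
# A stricter needle, assumed here only for comparison.
STRICT_NEEDLE = "class='thumb' href='/g4/view/"

# Made-up sample markup containing both variants from the commit message.
html = (
    "<a class='thumb scrollthumb' href='/g4/view/111'>text post</a>"
    "<a class='thumb' href='/g4/view/222'>image post</a>"
)


def extract_ids(page, needle):
    """Collect the text between each needle occurrence and the next quote."""
    return [part.partition("'")[0] for part in page.split(needle)[1:]]


print(extract_ids(html, NEEDLE))
print(extract_ids(html, STRICT_NEEDLE))
```

Searching for the shorter suffix finds both links, while the stricter string misses the `scrollthumb` variant.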
Mike Fährmann
4fc9668922
[imgur] update URL patterns (#1561)
3 years ago
Mike Fährmann
1eabfa5c7a
[pillowfort] implement login with username & password (#846)
3 years ago
Mike Fährmann
24dd10ac3c
[patreon] extract user defined 'tags' (#1539, closes #1540)
3 years ago
Mike Fährmann
a7e4917ee1
[pillowfort] add 'inline' option (#846)
...
to support images present in a post's 'content',
but not listed in 'media'.
also separates the file hash present at the beginning
of each 'filename' into its own field.
3 years ago
Mike Fährmann
efa6cc8ec3
[pillowfort] add 'external' option (#846)
...
for links to external Twitter posts etc.
3 years ago
Mike Fährmann
394fbb5f56
[twitter] strip useless t.co links (#1532)
...
The 'full_text' of Tweets with media content usually ends with a t.co
link to itself. This commit removes those.
3 years ago
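One way to remove a trailing t.co link, as described in the commit above, is a regular expression anchored at the end of the text. This is a sketch of the idea and not necessarily how the extractor implements it. The helper name, the pattern, and the sample tweets are all assumptions.

```python
import re

# Matches a single t.co link (plus surrounding whitespace) at the very end.
TRAILING_TCO = re.compile(r"\s*https://t\.co/\w+\s*$")


def strip_trailing_tco(text):
    """Remove a t.co link from the end of the text, if there is one."""
    return TRAILING_TCO.sub("", text)


print(strip_trailing_tco("Look at this https://t.co/AbC123"))
print(strip_trailing_tco("see https://t.co/AbC123 for more"))
```

Anchoring the pattern at the end leaves links in the middle of the text untouched, so only the self-referencing link appended to media tweets is dropped.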
Mike Fährmann
41457dbb1b
[twitter] resolve t.co URLs in 'content' (#1532)
3 years ago
Mike Fährmann
2b5d80862e
[kemonoparty] add 'type' metadata field (#1556)
...
'file', 'attachment', or 'inline'
3 years ago
Mike Fährmann
17b0ccb071
[twitter] add missing retweet media entities (fixes #1555)
...
from the original tweets
3 years ago
Mike Fährmann
5eeaaee01d
[pixiv] add 'metadata' option (#1551)
3 years ago