Mike Fährmann
b788712844
[fallenangels] fix extraction of '.5' chapters
4 years ago
Mike Fährmann
28d8541cb3
[mangafox] ensure download URLs have a scheme
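A minimal sketch of what "ensuring a scheme" can look like, assuming the site hands out protocol-relative ('//host/path') or scheme-less download URLs. The helper name 'ensure_scheme' and its default value are hypothetical and not taken from this commit.

```python
def ensure_scheme(url, scheme="https://"):
    """Return 'url' with a scheme prepended if it has none (hypothetical helper)."""
    if url and not url.startswith(("https://", "http://")):
        # drop the leading '//' (or ':') of protocol-relative URLs
        return scheme + url.lstrip("/:")
    return url
```

URLs that already start with 'http://' or 'https://' are returned unchanged, so the helper is safe to apply to every extracted URL.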
4 years ago
Mike Fährmann
8e3a324c91
[mangakakalot] ignore "Go Home" buttons in chapter pages
4 years ago
Mike Fährmann
c14c5d82d6
[newgrounds] use generator for fallback URLs
4 years ago
Mike Fährmann
968d3e8465
remove '&' from URL patterns
'/?&#' -> '/?#' and '?&#' -> '?#'
According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
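The effect of the character-class change can be sketched as follows. The host 'example.org' and the '/tag/' path are made up for illustration; only the two character classes come from the commit message.

```python
import re

# Old style: '&' ends the path segment, although it is not a component delimiter.
OLD_PATTERN = re.compile(r"https?://example\.org/tag/([^/?&#]+)")
# New style: only '/', '?' and '#' end the segment.
NEW_PATTERN = re.compile(r"https?://example\.org/tag/([^/?#]+)")


def tag_from(pattern, url):
    """Return the captured path segment, or None if the URL does not match."""
    match = pattern.match(url)
    return match.group(1) if match else None
```

For a URL such as 'https://example.org/tag/black&white?page=2#top', the old pattern captures only 'black', while the new one captures 'black&white'.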
4 years ago
Mike Fährmann
1686dc1757
[twitter] support media from Cards (#1005, #937)
Can be enabled with 'extractor.twitter.cards', but for now disabled by
default because cards can redirect to rather large videos from YouTube
or Twitch.
4 years ago
Mike Fährmann
ffd38215a4
[hitomi] fix image URLs and URL pattern
- non-webp files are now hosted on [a-c]b.hitomi.la
- removed ampersand from invalid slug characters
4 years ago
Mike Fährmann
286718950c
[mangahere] ensure download URLs have a scheme (fixes #1070)
4 years ago
Mike Fährmann
76dfa11a65
[reddit] add 'date' metadata field (closes #1068)
4 years ago
Mike Fährmann
3f2ba629ea
[newgrounds] provide fallback URLs for video downloads (#1042)
4 years ago
Mike Fährmann
a3ca2f6080
update fallback URL handling
remove Message.Urllist and use a '_fallback' field inside a kwdict
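One way such a '_fallback' field might be consumed, as a rough sketch only: the primary URL is tried first, then each fallback in order. The function names, the 'fetch' callback, and the URL layout are assumptions for illustration and are not the project's actual downloader code.

```python
from itertools import chain


def fallback_urls(base, variants):
    """Lazily yield alternative URLs; nothing is built until it is requested."""
    for variant in variants:
        yield "{}.{}".format(base, variant)


def first_working_url(url, kwdict, fetch):
    """Try 'url', then each entry of kwdict['_fallback'], until 'fetch' succeeds."""
    for candidate in chain((url,), kwdict.get("_fallback", ())):
        if fetch(candidate):
            return candidate
    return None
```

Because '_fallback' may be a generator, alternatives after the first working one are never produced.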
4 years ago
Mike Fährmann
43dab3a228
[mangadex] unescape more metadata fields (fixes #1066)
like 'manga', 'author', 'artist', etc.
4 years ago
Mike Fährmann
5565025221
[xhamster] fix user profile extraction
4 years ago
Mike Fährmann
07432d6262
[seiga] fix flake8 and cookie test (#1063)
4 years ago
Mike Fährmann
b8daabc3ca
[pinterest] implement login support (closes #1055)
being logged in allows access to secret/protected boards
4 years ago
Mike Fährmann
1b1cf01d0d
add a general 'generate_csrf_token()' function
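The commit message does not show the function body. A minimal sketch of a general-purpose token generator, assuming a random 128-bit value rendered as hex is acceptable to the target sites, could look like this; the use of 'secrets' and the 'nbytes' parameter are assumptions.

```python
import secrets


def generate_csrf_token(nbytes=16):
    """Return a random token of 'nbytes' bytes as a lowercase hex string."""
    return secrets.token_hex(nbytes)
```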
4 years ago
Mike Fährmann
7a0ba370d1
[gelbooru] rewrite mp4 video URLs (fixes #1048)
4 years ago
Mike Fährmann
6491db3eaf
[blogger] handle URLs with specified width/height (closes #1061)
get highest quality for images with
/wXXX-hXXX/ instead of the usual /sXXX/
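A sketch of the URL rewrite described above. It assumes that replacing the size segment with '/s0/' requests the original image, which is a common convention for Blogger-hosted images but is not stated in this commit; the helper name 'original_size' is hypothetical.

```python
import re

# Matches either '/sXXX/' or '/wXXX-hXXX/' as a full path segment.
_SIZE_SEGMENT = re.compile(r"/(?:s\d+|w\d+-h\d+)/")


def original_size(url):
    """Replace the first size segment with '/s0/' (assumed to mean 'original')."""
    return _SIZE_SEGMENT.sub("/s0/", url, count=1)
```

URLs without a recognizable size segment are returned unchanged.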
4 years ago
Mike Fährmann
783e0af26d
[hentaifoundry] update and simplify
4 years ago
Mike Fährmann
5b844a72b7
[newgrounds] handle embeds without scheme (#1033)
4 years ago
kurumigi
7e0e872f4f
[seiga] Add metadata for single image downloads (#1063)
* [seiga] Support image metadata.
* [seiga] Update test data.
* [seiga] Fix cookie check.
* [test_cookies] [seiga] Adapt test_cookies.py to the last commit.
4 years ago
Zanny
3ec60e894a
[weasyl] api-key authentication (#1057)
* [weasyl] support api keys
* [weasyl] document api-key authentication
* [weasyl] usernames can contain ~
4 years ago
Mike Fährmann
844793847c
update extractor test results
4 years ago
Mike Fährmann
ddd6840509
[behance] fix 'collection' extraction
4 years ago
Mike Fährmann
c5e3971b18
[newgrounds] extract image embeds (closes #1033)
4 years ago
dawidsowa
43b156fb40
[reactor] match URLs without subdomain (#1053)
4 years ago
Mike Fährmann
3ebb174f2c
add missing extractor info when spawning new ones (fixes #1051)
Not having this information causes the blacklist/whitelist logic to
trigger and prevents things from functioning as intended when using
default settings.
Fixes issues for 8muses, deviantart, exhentai, and mangoxo.
4 years ago
Mike Fährmann
f9c1684af7
[newgrounds] restore original video URLs (#1042)
4 years ago
Mike Fährmann
73373c06ec
[weibo] handle posts with more than 9 images (closes #926)
Responses from '/api/container/getIndex' don't list more than
9 images per 'status' object, but the embedded JSON from a
'/detail/<ID>' page does.
4 years ago
Mike Fährmann
dd1e545597
[hentaifoundry] rename GalleryExtractor to PicturesExtractor
4 years ago
Mike Fährmann
c874071f5a
[kissmanga] remove module
4 years ago
Mike Fährmann
93e04bf9a9
[500px] update query hashes
4 years ago
Mike Fährmann
844502cad5
update extractor test results
4 years ago
Mike Fährmann
fad7748b6b
[xvideos] fix 'title' extraction
4 years ago
Mike Fährmann
5b927c15df
[newgrounds] fix video extraction (closes #1042)
4 years ago
Mike Fährmann
bdc6c8f074
improve message for 'oauth:deviantart' etc. (closes #989)
4 years ago
Mike Fährmann
430b6d6e2e
[twitter] extend 'retweets' option (closes #1026)
Setting 'retweets' to '"original"' will use metadata from the
original retweeted Tweets, and not from the Retweet entry.
4 years ago
Mike Fährmann
b9bdd2c564
[hentaifoundry] add support for stories (closes #734)
4 years ago
Mike Fährmann
9a9d1924d8
[hentaicafe] add 'manga_id' metadata field (closes #1036)
This field is only available when using a non-foolslide URL
like '/hc.fyi/9874' or '/hazuki-yuuto-summer-blues/'
4 years ago
Mike Fährmann
cc4ac80302
[weasyl] add 'favorite' extractor (#1032)
4 years ago
Mike Fährmann
e9cc719497
[weasyl] update and simplify
- simplify 'pattern' regexps
- parse 'posted_at' as 'date'
- use unaltered 'title' ({title!l:R /_/} to lowercase and replace spaces)
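For readers unfamiliar with the format string in the last item: '!l' lowercases the value and 'R /_/' replaces each space with an underscore. A plain-Python sketch of the same transformation follows; the function name is made up for illustration.

```python
def lowercase_underscored(title):
    """Mimic '{title!l:R /_/}': lowercase, then replace spaces with underscores."""
    return title.lower().replace(" ", "_")
```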
4 years ago
Mike Fährmann
6514312126
[nijie] add 'include' option (closes #1018)
4 years ago
Mike Fährmann
0d43456323
[hentaifoundry] add 'include' option
4 years ago
Zanny
ebb7737b9b
Weasyl Extractor (#977)
* weasyl extractor
* @kattjevfel suggested changes
* @mikf changes
4 years ago
Mike Fährmann
aeb0d32333
[twitter] improve twitpic extraction (fixes #1019)
- ignore twitpic.com/photos/… URLs
- ignore empty image URLs
4 years ago
Mike Fährmann
7cd383c0f9
update extractor test results
4 years ago
Mike Fährmann
1e313d5b84
implement 'sleep-request' option
4 years ago
Mike Fährmann
c43b3894be
[myhentaigallery] update and fix extraction (#1001)
- extract more metadata
- match "/show/" URLs
- complete test results
- fix missing images for lines starting with " <img"
- fix missing comma in supportedsites.py
4 years ago
choeronline
05b9ac8d37
[myhentaigallery] add extractor (#1001)
* adds support for myhentaigallery
* fixes linting issues in myhentaigallery extractor
4 years ago
Mike Fährmann
2626629117
[danbooru] handle posts without 'id' (fixes #1004)
4 years ago