Mike Fährmann
f0c5093812
[nsfwalbum] add album extractor ( closes #287 )
5 years ago
Mike Fährmann
61e413d85d
[hentaifoundry] stop disabling IPv6 addresses
The rogue address mentioned in a138d58
is no longer included in the DNS
results for www.hentai-foundry.com.
5 years ago
Mike Fährmann
76ae9957c2
[deviantart] force legacy version for single deviations
Let's see how long this works ...
DeviantArt is rolling out a new version of their website, including a
new internal and potentially usable API (rewrite incoming, yay).
The issue with the new layout is that it doesn't include the "old"
UUIDs for single deviations, i.e. mapping a numeric deviation ID to its
UUID counterpart is impossible with the new layout.
5 years ago
Mike Fährmann
a01f99728c
[postprocessor:zip] delete empty archives when done ( #316 )
5 years ago
Mike Fährmann
520c8ba106
[hentaicafe] extract 'tags' and 'artist' metadata ( closes #238 )
These metadata fields will only be filled in when using a top-level
URL, because that's the only place this information is available. Using
a Foolslide URL (1) will leave these fields empty.
(1) https://hentai.cafe/manga/read/.../en/0/1/
5 years ago
Mike Fährmann
b51baa9a4b
[hitomi] fix empty language detection; parse datetime
5 years ago
Mike Fährmann
258e8b2060
[deviantart] small code improvements
5 years ago
Mike Fährmann
a77340c647
[keenspot] fix extraction for "TwoKinds"
5 years ago
Mike Fährmann
03e6876fbe
[instagram] provide 'description' metadata ( #310 )
5 years ago
Mike Fährmann
b171befa87
implement 'parse_unicode_escapes()'
5 years ago
Mike Fährmann
3a36a0fa1e
release version 1.8.6
5 years ago
Mike Fährmann
ec3e8601f1
[slickpic] add user extractor ( #249 )
5 years ago
Mike Fährmann
97ef416218
[8muses] support multi-page listings ( #305 )
5 years ago
Mike Fährmann
f5961ac968
[deviantart] download deviations with no 'content' field
Some deviations (possibly only from sta.sh sources) are downloadable
(i.e. 'is_downloadable' is true and /deviation/download/ works), but
have no 'content' or similar field in their JSON representation.
(fixes #307 )
5 years ago
Mike Fährmann
4e07f99e3e
[mangoxo] change token message level to debug
The login page currently neither provides nor requires a login token
(logging in works without one), so printing a warning during
each login is unnecessary.
5 years ago
Mike Fährmann
d997c10320
[8muses] add album extractor ( #305 )
5 years ago
Mike Fährmann
e05a96db5e
[deviantart] rename 'stash' to 'extra' ( #302 )
'stash' is already used as a name for the StashExtractor and therefore
expected to be a dictionary.
5 years ago
Mike Fährmann
2184e3a86b
[slickpic] add album extractor ( #249 )
5 years ago
Mike Fährmann
c23bf263fe
[deviantart] rename 'external' to 'stash' ( #302 )
restrict extracted URLs to ones from https://sta.sh/ ...
5 years ago
Mike Fährmann
c73c2cda50
[pornhub] add gallery & user extractor ( #282 )
5 years ago
Mike Fährmann
7c6cb908f9
[xhamster] update test results
5 years ago
Mike Fährmann
2fb85178da
[deviantart] add 'external' option ( #302 )
If a description is available, this will extract URLs from the
description text and try to find Extractors for them.
5 years ago
Mike Fährmann
f85e42cffc
[deviantart] fix --range for deviation & stash extractor
5 years ago
Mike Fährmann
40c7eb3424
[livedoor] improve extraction ( fixes #301 )
5 years ago
Mike Fährmann
62335b9015
[paheal] adjust test results
5 years ago
Mike Fährmann
aa1ca4ed35
[shopify] skip deleted products ( #175 )
Product pages which return a 4xx status code will now be skipped instead
of raising an exception.
5 years ago
Mike Fährmann
096009367b
[xhamster] add gallery & user extractor ( #281 )
5 years ago
Mike Fährmann
208202b962
[tumblr] improve error handling ( #297 )
In some cases Tumblr's API responds with an HTML document.
Trying to decode it as JSON would raise an uncaught exception.
5 years ago
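The failure mode described in the tumblr commit above can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper name `decode_api_response` is hypothetical, and returning None is just one possible way to signal a non-JSON response.

```python
import json


def decode_api_response(text):
    """Return the decoded JSON value, or None if 'text' is not valid JSON.

    Hypothetical helper: an HTML error page sent instead of JSON
    makes json.loads() raise, so the exception is caught here
    rather than left to propagate.
    """
    try:
        return json.loads(text)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError
        return None
```

A caller can then check for None and report an error or skip the request instead of crashing.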
Mike Fährmann
c08c340178
[directlink] make pattern case insensitive ( fixes #296 )
5 years ago
Mike Fährmann
95b4a53b9c
[keenspot] improve pagination ( #223 )
The old code would skip the last comic page for some series.
5 years ago
Mike Fährmann
12c965d547
release version 1.8.5
5 years ago
Mike Fährmann
731c7cbd5b
[keenspot] support all comics and "random" access ( #223 )
5 years ago
Mike Fährmann
6a34f4b0c1
skip tests on read timeouts; print list of skipped tests
5 years ago
Mike Fährmann
1c36e65e9b
[exhentai] choose site version depending on input URL ( #278 )
Use e-hentai.org as root and cookiedomain if the input URL is from
e-hentai (or g.e-hentai), use exhentai.org otherwise.
5 years ago
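The selection rule from the exhentai commit above could look roughly like this. It is a sketch under assumptions: the function name `select_site`, the returned tuple, and the leading dot in the cookie domain are all hypothetical and not taken from the project.

```python
from urllib.parse import urlsplit


def select_site(url):
    """Return a (root, cookiedomain) pair for the given input URL.

    Hypothetical helper: e-hentai.org and g.e-hentai.org URLs map to
    e-hentai.org; everything else falls back to exhentai.org.
    """
    host = urlsplit(url).hostname or ""
    if host in ("e-hentai.org", "g.e-hentai.org"):
        domain = "e-hentai.org"
    else:
        domain = "exhentai.org"
    # The leading dot in the cookie domain is an assumption.
    return "https://" + domain, "." + domain
```

The input URL must include its scheme for `urlsplit` to recognize the host; a bare "e-hentai.org/..." string would fall through to the exhentai.org default.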
Mike Fährmann
6da3e21237
[downloader:ytdl] provide 'filename' metadata ( closes #291 )
5 years ago
Mike Fährmann
d33f5a7423
[wallhaven] rewrite
- use API
- remove login support, add 'api-key' option
- remove support for the "alpha" subdomain: alpha.wallhaven.cc used numeric
  IDs that can't be translated to the new ID system
- support direct links to wallpapers
5 years ago
Mike Fährmann
5499934ae2
[ngomik] fix extraction
5 years ago
Mike Fährmann
f1893b2b5b
[deviantart] add 'folders' option ( #276 )
5 years ago
Mike Fährmann
c849574def
[keenspot] add comic extractor ( #223 )
Doesn't work for
- http://brawlinthefamily.keenspot.com/
- http://flipside.keenspot.com/
- http://lastblood.keenspot.com/
- http://mysticrevolution.keenspot.com/
- http://porcelain.keenspot.com/
- http://twokinds.keenspot.com/
yet, because of custom layouts.
5 years ago
Mike Fährmann
2b1999476e
implement 'text.rextract()'
5 years ago
Mike Fährmann
8bd5a19515
[hentainexus] add '_extractor' data
5 years ago
Mike Fährmann
2a085a5e96
[sankakucomplex] fix 'date' values ( #258 )
5 years ago
Mike Fährmann
bcd1801aa8
[sankakucomplex] add 'tag' extractor ( #258 )
5 years ago
Mike Fährmann
74c2415138
[sankakucomplex] move article extractor to its own module ( #258 )
5 years ago
Mike Fährmann
4465a3ea68
[kissmanga][readcomiconline] add 'captcha' option ( #279 )
to configure how to handle CAPTCHA page redirects:
- either interactively wait for the user to solve the CAPTCHA
- or raise StopExtraction like before
5 years ago
Mike Fährmann
1e3e15c4f3
[sankaku] add article extractor ( #258 )
5 years ago
Mike Fährmann
48233f00c0
[readcomiconline] detect 'AreYouHuman' redirects ( #279 )
5 years ago
Mike Fährmann
1cde38110d
[livedoor] return 'date' as datetime object
5 years ago
Mike Fährmann
e88824e1a7
[livedoor] fix adjustments for https:// URLs
5 years ago
Mike Fährmann
2316e0ed3d
fix strptime workaround from b0e85a4
Don't return a modified version of 'date_time' if strptime fails.
5 years ago