Mike Fährmann
a666ddd16b
[tumblr] extend 'reblogs' functionality (#103)
...
Setting 'reblogs' to "deleted" will check if the parent post of a
reblog has been deleted and download its media content if that is the
case, otherwise it will be skipped.
This is a rather costly operation (1 API request per reblogged post)
and should therefore be used with care.
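The decision described above can be sketched roughly as follows. This is only an illustration, not the extractor's actual code: the function name, the 'reblogged_from_id' key and the 'parent_exists' callback are hypothetical, and the handling of True/False is an assumption about the option's earlier boolean behavior.

```python
def should_download(post, reblogs, parent_exists):
    """Decide whether the media of a post should be downloaded.

    post          -- dict; the 'reblogged_from_id' key is a hypothetical name
    reblogs       -- option value: True, False or "deleted"
    parent_exists -- callable doing the costly lookup (1 API request)
    """
    parent_id = post.get("reblogged_from_id")
    if parent_id is None:
        # Not a reblog: always download, no extra request needed.
        return True
    if reblogs == "deleted":
        # Download only if the parent post is gone.
        return not parent_exists(parent_id)
    # Assumed boolean behavior: include or skip all reblogs.
    return bool(reblogs)
```

The lookup is only performed for reblogs and only in "deleted" mode, which is where the one-request-per-reblog cost comes from.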
6 years ago
Mike Fährmann
c9b8e6aefc
[reddit] fix submission-ID parsing (#104)
...
Uppercase characters caused a ValueError exception
6 years ago
Mike Fährmann
488abeca0b
[hentaicafe] adjust default directory format
...
A separate folder for each chapter is rather pointless if almost all
manga have only one chapter each.
6 years ago
Mike Fährmann
b4eca2633e
[tumblr] support /archive URLs
6 years ago
Mike Fährmann
aa1de70da0
[tumblr] recognize inline videos (#102)
6 years ago
Mike Fährmann
3ecea4cf36
[hentaicafe] add chapter and manga extractors (#101)
6 years ago
Mike Fährmann
41249f3ead
improve extractor.get_downloader()
6 years ago
Mike Fährmann
eb3185d6a3
update exception hierarchy
6 years ago
Mike Fährmann
e9ae6fd080
improve downloader/postprocessor module loading
...
- handle arguments of any type without propagating an exception
- prevent potential security risk through relative imports
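A minimal sketch of the two points above, assuming a loader that receives the module name from user configuration. The function name 'find_module' and the 'allowed' set are hypothetical and not taken from the actual code; only 'importlib.import_module' is a real standard-library call.

```python
import importlib


def find_module(name, package, allowed):
    """Return the submodule 'name' of 'package', or None.

    Non-string arguments and names outside the 'allowed' set are
    rejected before any import happens, so a value like "..os"
    can never be resolved as a relative import.
    """
    if not isinstance(name, str) or name not in allowed:
        return None
    try:
        return importlib.import_module("." + name, package)
    except ImportError:
        return None


# Demonstration with a standard-library package:
print(find_module("decoder", "json", {"decoder", "encoder"}))
print(find_module("..os", "json", {"decoder", "encoder"}))  # None
print(find_module(None, "json", {"decoder", "encoder"}))    # None
```

Checking the type first also keeps unhashable values such as lists from raising a TypeError during the set lookup.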
6 years ago
Mike Fährmann
712b58a93b
[postprocessor] add black-/whitelist options
...
Each post-processor config dict now supports a list of extractor
categories for which it should or shouldn't be active.
For example:
    "postprocessors": [
        {"name": "classify",
         "whitelist": ["tumblr", "deviantart"],
         ...
        }
    ]
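How such a check might look, as a rough sketch: the "blacklist" key is inferred from the commit subject, and both the function name 'is_active' and the precedence of the whitelist over the blacklist are assumptions rather than the project's confirmed behavior.

```python
def is_active(pp_config, category):
    """Return True if a post-processor should run for 'category'."""
    whitelist = pp_config.get("whitelist")
    if whitelist is not None:
        # Only listed categories are allowed.
        return category in whitelist
    blacklist = pp_config.get("blacklist")
    if blacklist is not None:
        # Everything except the listed categories is allowed.
        return category not in blacklist
    # No list given: active for every category.
    return True


classify = {"name": "classify", "whitelist": ["tumblr", "deviantart"]}
print(is_active(classify, "tumblr"))  # True
print(is_active(classify, "reddit"))  # False
```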
6 years ago
Mike Fährmann
8a23b21d0e
[tests] let 'pattern' require at least 1 URL
6 years ago
Mike Fährmann
0bc8ef51c8
[smugmug] Handle albums with no explicit owner (#100)
6 years ago
Mike Fährmann
ff83ee22b0
release version 1.5.2
6 years ago
Mike Fährmann
b47af4637a
[mangadex] update URL pattern
...
Manga URLs now begin with /title/ instead of /manga/
6 years ago
Mike Fährmann
75862715ac
[behance] add user extractor
6 years ago
Mike Fährmann
a493fed376
[deviantart] fix journal creation if no 'username' is set
6 years ago
Mike Fährmann
6ecb36d88c
[postprocessor:ugoira] add 'ffmpeg-output' option
6 years ago
Mike Fährmann
02a4a67f6d
[postprocessor:ugoira] support danbooru sources
6 years ago
Mike Fährmann
5b8a314de7
[tumblr] replace inline URLs with higher quality ones (#98)
6 years ago
Mike Fährmann
2af2bb7911
[mangadex] fix relative page URLs
6 years ago
Mike Fährmann
590c0b3ad5
re-implement and improve filename formatter
...
A format string now gets parsed only once instead of re-parsing it each
time it is applied to a set of data.
The initial parsing makes directory path creation about 2x slower
than before, since each format string there is used only once,
but building a filename, the more common operation, is at least 2x
faster. The "directory slowness" is amortized after about 5 filenames,
and everything beyond that is significantly faster.
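The parse-once idea can be illustrated with the standard library's 'string.Formatter().parse'. This is a simplified sketch and not the formatter from this commit: the class name 'PrecompiledFormat' is hypothetical, only plain field names are supported, and conversions such as '!r' are ignored.

```python
import string


class PrecompiledFormat:
    """Parse a format string once, then apply it to many dicts."""

    def __init__(self, format_string):
        # Each entry is (is_field, text_or_field_name, format_spec).
        self.parts = []
        parsed = string.Formatter().parse(format_string)
        for literal, field, spec, _conversion in parsed:
            if literal:
                self.parts.append((False, literal, ""))
            if field is not None:
                self.parts.append((True, field, spec or ""))

    def apply(self, data):
        # No parsing happens here, only lookups and formatting.
        return "".join(
            format(data[value], spec) if is_field else value
            for is_field, value, spec in self.parts
        )


fmt = PrecompiledFormat("{category}_{id:03d}.{ext}")
print(fmt.apply({"category": "tumblr", "id": 7, "ext": "jpg"}))
# tumblr_007.jpg
```

The constructor carries the one-time cost; every later call to 'apply' reuses the parsed parts, which is why the approach pays off once the same format string is applied to several files.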
6 years ago
Mike Fährmann
34b556922d
update/restore tests
6 years ago
Mike Fährmann
ab2bfaeb46
[ngomik] add replacement for 'subapics'
...
http://subapics.com/ got discontinued and replaced by http://ngomik.in/.
ngomik.in still displays a link to the "old site", which shows a big
"Account Suspended" sign.
6 years ago
Mike Fährmann
a2eeef1f5e
[behance] replace test
...
The "UVMW Studio" account and their galleries are gone.
6 years ago
Mike Fährmann
e9dd2eff1d
[twitter] add extractor for media-tweet timelines (#96)
...
For example "https://twitter.com/PicturesEarth/media".
They are different from normal timelines in that they do not contain
any (re)tweets from other users and feature all media the user ever
posted, including responses to other tweets.
6 years ago
Mike Fährmann
f45c9f2141
[gfycat] test-updates and code-adjustments
6 years ago
Mike Fährmann
9b1c39032c
[twitter] changes and improvements
...
- rename User- to TimelineExtractor
- rename 'userid' to 'user_id' to conform to the other ..._id values
- adjust archive_fmt to deal with retweets
- emulate browser behavior for API calls
6 years ago
Mike Fährmann
10365394d7
[twitter] add support for user-timelines (closes #96)
...
also adds a 'retweets' option to filter retweeted content
6 years ago
Mike Fährmann
e3055d356c
release version 1.5.1
6 years ago
Mike Fährmann
d3f1eed2a6
[pinterest] improvements
...
- add stop condition for pin-related pins
- improve URL patterns
- make Pylint happy
6 years ago
Mike Fährmann
2801a0d997
[exhentai] skip "Content Warning" page when not logged in
...
(closes #97)
6 years ago
Mike Fährmann
63fa0b2006
[pinterest] add extractors for related pins
...
Related pins can now be accessed by adding a "#related" fragment
to the end of a Pinterest URL, for example:
- https://www.pinterest.com/pin/858146903966145189/#related
- https://www.pinterest.com/g1952849/test-/#related
There are no explicit real URLs for related pins,
using an option to enable them results in "clunky" code,
and a custom "related:<URL>" scheme doesn't feel right either.
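For illustration, matching on the "#related" fragment could look like the sketch below. The regular expressions and the 'classify' helper are hypothetical and are not the URL patterns used by the extractors.

```python
import re

BASE = r"(?:https?://)?(?:www\.)?pinterest\.com"

# Hypothetical patterns: a pin URL and a board URL, each ending in "#related".
PIN_RELATED = re.compile(BASE + r"/pin/(\d+)/?#related$")
BOARD_RELATED = re.compile(BASE + r"/(?!pin/)([^/?#]+)/([^/?#]+)/?#related$")


def classify(url):
    """Return ("pin", id), ("board", user, board) or None."""
    match = PIN_RELATED.match(url)
    if match:
        return ("pin", match.group(1))
    match = BOARD_RELATED.match(url)
    if match:
        return ("board", match.group(1), match.group(2))
    return None


print(classify("https://www.pinterest.com/pin/858146903966145189/#related"))
print(classify("https://www.pinterest.com/g1952849/test-/#related"))
```

URLs without the fragment fall through to None, so regular pin and board handling stays untouched.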
6 years ago
Mike Fährmann
1694039de0
[komikcast] update ad-filter
6 years ago
Mike Fährmann
f9ded38d89
[test:results] add support for "range" options in tests
6 years ago
Mike Fährmann
c9e6ccbd7c
[test:extractor] small fixes and improvements
6 years ago
Mike Fährmann
792135a339
enable Python 3.7 for Travis CI tests
6 years ago
Mike Fährmann
a74591b84b
[tumblr] remove "original image" functionality
...
Accessing higher/original quality images on
https://s3.amazonaws.com/data.tumblr.com and http://data.tumblr.com
is no longer possible and any HTTP request results in 403 Forbidden.
A few images can still be accessed through https://a.tumblr.com [1][2],
but not as "_raw", just "_1280", and that might also be "fixed" in
the near future.
[1] https://a.tumblr.com/tumblr_kzjlfiTnfe1qz4rgho1_1280.jpg
[2] https://a.tumblr.com/ee589c6345f29d2d5935cecb49b0a705/tumblr_oztu02dIHp1wgha4yo1_1280.png
6 years ago
Mike Fährmann
38d4f43cc0
[komikcast] skip ads
6 years ago
Mike Fährmann
4313c95bc9
improve error message for OAuth2 authentication
6 years ago
Mike Fährmann
7f4e41c989
increase timeout during extractor tests
...
Cloudflare's 522 response takes longer than 30 seconds
6 years ago
Mike Fährmann
b55e39d1ee
[mangadex] improve extraction
...
- cache manga API results
- add artist, author and date fields to chapter metadata
- remove Manga-/ChapterExtractor inheritance
- minor code simplifications and improvements
6 years ago
Mike Fährmann
b1c4c1e13c
[mangadex] fix extraction
6 years ago
Mike Fährmann
3c90df6635
[piczel] add user, folder and image extractors
6 years ago
Mike Fährmann
2a9f3341a2
[behance] fix title extraction
6 years ago
Mike Fährmann
3fc2f269fa
[behance] filter 'fields' list
6 years ago
Mike Fährmann
b67339155f
[rule34] update test results
...
'metadata' tag type has been removed
6 years ago
Mike Fährmann
a86f2bfc80
[pinterest] update not-found redirects
6 years ago
Mike Fährmann
7442d2940c
release version 1.5.0
6 years ago
Mike Fährmann
b040ca0718
[rule34] small unit test fixes
6 years ago
Mike Fährmann
b164231bca
[sankaku] increase default values for 'wait-min/-max'
6 years ago