Mike Fährmann
e9dd2eff1d
[twitter] add extractor for media-tweet timelines (#96)
...
For example "https://twitter.com/PicturesEarth/media".
They are different from normal timelines in that they do not contain
any (re)tweets from other users and feature all media the user ever
posted, including responses to other tweets.
6 years ago
Mike Fährmann
f45c9f2141
[gfycat] test-updates and code-adjustments
6 years ago
Mike Fährmann
9b1c39032c
[twitter] changes and improvements
...
- rename User- to TimelineExtractor
- rename 'userid' to 'user_id' to conform to the other ..._id values
- adjust archive_fmt to deal with retweets
- emulate browser behavior for API calls
6 years ago
Mike Fährmann
10365394d7
[twitter] add support for user-timelines (closes #96)
...
also adds a 'retweets' option to filter retweeted content
6 years ago
Mike Fährmann
d3f1eed2a6
[pinterest] improvements
...
- add stop condition for pin-related pins
- improve URL patterns
- make Pylint happy
6 years ago
Mike Fährmann
2801a0d997
[exhentai] skip "Content Warning" page when not logged in
...
(closes #97)
6 years ago
Mike Fährmann
63fa0b2006
[pinterest] add extractors for related pins
...
Related pins can now be accessed by adding a "#related" fragment
to the end of a Pinterest URL, for example:
- https://www.pinterest.com/pin/858146903966145189/#related
- https://www.pinterest.com/g1952849/test-/#related
There are no explicit real URLs for related pins,
using an option to enable them results in "clunky" code,
and a custom "related:<URL>" scheme doesn't feel right either.
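
As a rough illustration of the fragment-based approach (not the
extractor's actual code), a URL pattern can require the "#related"
suffix as part of the match. The regular expression and the function
name below are assumptions made for this sketch.

```python
import re

# Hypothetical pattern: a pin URL that must end in a "#related" fragment.
RELATED_PIN = re.compile(
    r"(?:https?://)?(?:www\.)?pinterest\.com/pin/(\d+)/?#related$"
)


def related_pin_id(url):
    """Return the pin ID if `url` asks for related pins, else None."""
    match = RELATED_PIN.match(url)
    return match.group(1) if match else None
```

A plain pin URL without the fragment does not match, so it would
still be handled by the regular pin extractor.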
6 years ago
Mike Fährmann
1694039de0
[komikcast] update ad-filter
6 years ago
Mike Fährmann
a74591b84b
[tumblr] remove "original image" functionality
...
Accessing higher/original quality images on
https://s3.amazonaws.com/data.tumblr.com and http://data.tumblr.com
is no longer possible and any HTTP request results in 403 Forbidden.
A few images can still be accessed through https://a.tumblr.com [1][2],
but not as "_raw", just "_1280", and that might also be "fixed" in
the near future.
[1] https://a.tumblr.com/tumblr_kzjlfiTnfe1qz4rgho1_1280.jpg
[2] https://a.tumblr.com/ee589c6345f29d2d5935cecb49b0a705/tumblr_oztu02dIHp1wgha4yo1_1280.png
6 years ago
Mike Fährmann
38d4f43cc0
[komikcast] skip ads
6 years ago
Mike Fährmann
4313c95bc9
improve error message for OAuth2 authentication
6 years ago
Mike Fährmann
b55e39d1ee
[mangadex] improve extraction
...
- cache manga API results
- add artist, author and date fields to chapter metadata
- remove Manga-/ChapterExtractor inheritance
- minor code simplifications and improvements
6 years ago
Mike Fährmann
b1c4c1e13c
[mangadex] fix extraction
6 years ago
Mike Fährmann
3c90df6635
[piczel] add user, folder and image extractors
6 years ago
Mike Fährmann
2a9f3341a2
[behance] fix title extraction
6 years ago
Mike Fährmann
3fc2f269fa
[behance] filter 'fields' list
6 years ago
Mike Fährmann
b67339155f
[rule34] update test results
...
'metadata' tag type has been removed
6 years ago
Mike Fährmann
a86f2bfc80
[pinterest] update not-found redirects
6 years ago
Mike Fährmann
b040ca0718
[rule34] small unit test fixes
6 years ago
Mike Fährmann
b164231bca
[sankaku] increase default values for 'wait-min/-max'
6 years ago
Mike Fährmann
68d6033a5d
use 'retries' and 'timeout' options for regular HTTP requests
6 years ago
Mike Fährmann
f3793660ef
update tests
6 years ago
Mike Fährmann
df082e923c
[behance] add gallery extractor (#95)
6 years ago
Mike Fährmann
5f27cfeff6
[deviantart] remove `prefer-public` option
...
All API requests now always use a public token and only switch to
a private token for pagination results if `refresh-token` is set
and fewer deviations than requested were returned.
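
The switching rule described above might look roughly like this.
It is a minimal sketch, not the extractor's real code: `fetch_page`
and the `call_api` callable are hypothetical names introduced only
for illustration.

```python
def fetch_page(call_api, limit, refresh_token=None):
    """Fetch one page of results, preferring the public token.

    `call_api(public)` is a hypothetical callable that performs the
    API request with a public (True) or private (False) access token
    and returns a list of deviations.
    """
    results = call_api(True)
    # Fewer results than requested may mean that some deviations are
    # hidden from the public token; retry with a private token only
    # if a refresh-token is configured.
    if refresh_token and len(results) < limit:
        results = call_api(False)
    return results
```

Without a refresh-token the public results are returned as they are,
even if the page is incomplete.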
6 years ago
Mike Fährmann
bb89a1e6d7
[mangahere] use http://
...
invalid SSL cert for quite some time now
6 years ago
Mike Fährmann
212130b048
[deviantart] improve public-private token switching
...
- rename option to `prefer-public`
- now also works for galleries with fewer than 24 items
6 years ago
Mike Fährmann
886d662582
[deviantart] add option to minimize refresh-token usage
...
Always trying with a public token first and repeating the API request
with a private token if deviations are missing doesn't quite work for
galleries and folders with fewer than 25 items, so it's an option and
not the default.
6 years ago
Mike Fährmann
d98e47817d
[deviantart] reduce refresh-token usage
...
Instead of using a refresh-token-based access-token for every API
request, they are now only used for paginated results.
API requests to get a user's profile and the original download URL
now always use a public access-token.
6 years ago
Mike Fährmann
84854fcad7
[myportfolio] add user and gallery extractors (#95)
6 years ago
Mike Fährmann
c9f70e0a19
[paheal] use HTTPS
6 years ago
Mike Fährmann
ff436692bf
[deviantart] add 'journals' option
6 years ago
Mike Fährmann
00032b828c
[deviantart] add 'wait-min' option
6 years ago
Mike Fährmann
a6fe2bb594
[whatisthisimnotgoodwithcomputers] remove extractor
6 years ago
Mike Fährmann
0ba93650e0
[8chan] replace unit test URL
...
the other thread is no longer accessible
6 years ago
Mike Fährmann
269dc2bbd5
[sankaku] add 'tags' option (#94)
6 years ago
Mike Fährmann
173add6935
[nijie] fix artist_id extraction
...
view_popup.php pages for older images or dojins either have the
artist_id value at a different place or not at all.
6 years ago
Mike Fährmann
6996f5c118
[mangahere] fix and improve chapter extraction
6 years ago
Mike Fährmann
1d43cbbf52
[gelbooru] tag-splitting for non-api mode
6 years ago
Mike Fährmann
2eefaa99a3
[mangapark] support .net and .com mirrors
6 years ago
Mike Fährmann
c20c0a4820
[safebooru] add pool extractor
6 years ago
Mike Fährmann
f916279ae6
[rule34] add pool extractor
6 years ago
Mike Fährmann
3dbc7c5f8d
[gelbooru] restore pool functionality
6 years ago
Mike Fährmann
a2c74bc6f0
[gelbooru] inherit from BooruExtractor class
...
Breaks pool functionality when using API calls (for now),
but reduces code clutter and enables the `tags` option.
6 years ago
Mike Fährmann
4a57509392
generalize tag-splitting option (#92)
...
- extend functionality to other booru sites:
  - http://behoimi.org/
  - https://konachan.com/
  - https://e621.net/
  - https://rule34.xxx/
  - https://safebooru.org/
  - https://yande.re/
6 years ago
Mike Fährmann
188e956c4e
[imagefap] use HTTPS + update test results
6 years ago
Mike Fährmann
87853538b4
[yandere] add option to split tags by type (#92)
6 years ago
Mike Fährmann
a699787d01
[deviantart] update URL patterns to new format
...
DeviantArt changed its URL format from
https://<name>.deviantart.com/...
to
https://www.deviantart.com/<name>/...
With this change both formats will be supported.
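
One way to accept both layouts with a single pattern is sketched
below. The regular expression, the group names and `user_from_url`
are assumptions for illustration and may differ from the patterns
actually used by the extractor.

```python
import re

# Hypothetical pattern covering both URL layouts:
#   old: https://<name>.deviantart.com/...
#   new: https://www.deviantart.com/<name>/...
USER_PATTERN = re.compile(
    r"(?:https?://)?(?:"
    r"www\.deviantart\.com/(?P<new>[\w-]+)"
    r"|(?!www\.)(?P<old>[\w-]+)\.deviantart\.com"
    r")(?:/|$)"
)


def user_from_url(url):
    """Return the user name from either URL layout, or None."""
    match = USER_PATTERN.match(url)
    if not match:
        return None
    return match.group("new") or match.group("old")
```

The negative lookahead keeps "www" from being mistaken for a user
name in the old subdomain layout.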
6 years ago
Mike Fährmann
9e3415886c
[senmanga] fix/update tests
6 years ago
Mike Fährmann
b8c97d2295
use 'extractor.request()' for more HTTP requests
6 years ago
Mike Fährmann
150a6b9064
[xvideos] fix metadata extraction
6 years ago