Mike Fährmann
fa7fa2f8ff
[deviantart] update tests
6 years ago
Mike Fährmann
b7b5456a32
[kissmanga] use HTTPS
6 years ago
Mike Fährmann
259123732f
[readcomiconline] improve comic-page parsing
6 years ago
Mike Fährmann
0328a04a65
[cloudflare] don't output the whole challenge page
...
thanks to the embedded animated gifs this is just a bit too much
6 years ago
Mike Fährmann
4ab0960083
[reddit] add metadata to extracted URLs
6 years ago
Mike Fährmann
2f4f60de33
[tumblr] add tests for each post type
6 years ago
Mike Fährmann
98314aa04c
[mangapark] detect non-existent chapters
6 years ago
Mike Fährmann
6c71e9cf5d
[deviantart] add separate 'sta.sh' extractor ( #113 )
...
- supports multiple stashed deviations per page
- explicitly mentions sta.sh support on supportedsites.rst
6 years ago
Mike Fährmann
f9ace0f4a3
[mangapark] fix manga extraction ... again
6 years ago
Mike Fährmann
28f9539551
[tumblr] change default values for post types and inline media
6 years ago
Mike Fährmann
5be95034ba
[tumblr] add option to download avatars ( #137 )
6 years ago
Mike Fährmann
7471933d5f
use extractor.request for all other API calls
...
- deviantart
- pawoo
- pixiv
- reddit
6 years ago
Mike Fährmann
995844c915
[instagram] relax test pattern even more
6 years ago
Mike Fährmann
2e5f82e59e
[tumblr] don't follow 'external' Tumblr URLs ( #139 )
6 years ago
Mike Fährmann
c5d4f558c9
allow missing field access keys in format strings ( #136 )
6 years ago
Mike Fährmann
0c9762f00e
[mangapark] fix extraction
6 years ago
Mike Fährmann
c9ef5ed364
[luscious] ensure URLs have a scheme
6 years ago
Mike Fährmann
851ee9f89f
[sensescans] replace tests
...
the old ones got removed
6 years ago
Mike Fährmann
c14d44e1bc
[downloader:common] retry downloads on SSL errors ( #130 )
6 years ago
Mike Fährmann
0be7ee3106
[hitomi] fix image subdomains ( closes #142 )
...
galleries with an ID ending in 1 need some special treatment
6 years ago
Mike Fährmann
fe96835d25
[kissmanga] add fallback for chapter-string parsing ( #20 )
6 years ago
Mike Fährmann
4d73cc785d
update test results
6 years ago
Mike Fährmann
049a9575c4
[tumblr] fix inline extraction #2
...
Using only the "comment" field isn't enough ...
[ci skip]
6 years ago
Mike Fährmann
f6bf66f72c
[pixiv] create directory for each "work" item ( #136 )
6 years ago
Mike Fährmann
79f6755c60
[postprocessor:classify] handle missing "extension" ( #138 )
6 years ago
Mike Fährmann
b7a9f6cc49
[tumblr] improve inline extraction ( #137 )
6 years ago
Mike Fährmann
010da8372a
[instagram] relax test pattern
6 years ago
Mike Fährmann
1c6b9ba322
[readcomiconline] use HTTPS
6 years ago
Leonardo Taccari
2655a2ea02
Add support for instagram.com user profiles and pages ( #134 )
...
* [instagram] Add extractor for instagram.com user profiles and pages
The extractor scrapes `instagram.com/<user>' timelines and
`instagram.com/p/<shortcode>' by mimicking the behaviour of a web
browser and extracting the sharedData JSON of the single pages.
Please note that this means that for user timelines we also make an
extra request to the `instagram.com/p/<shortcode>' page, but this
lets us have consistent (and complete) information about the media
fetched.
The MD5 logic used for X-Instagram-GIS was documented in
<https://stackoverflow.com/questions/49786980/>
* [instagram] Test for keywords, not url for GraphImage and GraphSidecar
URLs returned by Instagram do not seem to be stable, so avoid testing for
them and instead test for the keywords returned.
* [instagram] Improve test of InstagramProfilepageExtractor
Also check the count of media returned.
* [instagram] Several cleanups and improvements
- Change description, subcategories to generate a better description in
docs/supportedsites.rst
- Remove unneeded InstagramExtractor.__init__()
- Use text.parse_int() instead of directly using int() (the former is more
robust)
- Use self.request().json() instead of calling json.loads() on
self.request().text
- Add `pattern:' to check URLs where we do not have a stable URL.
It seems that only the subdomain is not stable.
Thanks to @mikf!
6 years ago
HRXN
e80ee77d71
tumblr.py: update regex for video ( #133 )
...
There seems to be another sub-domain for videos, apparently.
Not just
`vt(.media).tumblr`
`vtt(media).tumblr`
But also
`ve(.media).tumblr`
6 years ago
Mike Fährmann
9a98b6769d
use extractor.request for API calls ( #130 )
...
... at least for OAuth1.0 based APIs (flickr, smugmug, tumblr)
6 years ago
Mike Fährmann
0225d90078
add exception name and traceback for OSErrors
6 years ago
Mike Fährmann
ad2cefda6b
[tumblr] in case of exception use filename as 'hash' ( #129 )
...
While a filename might not be a real 'hash', or comparable to what
Tumblr usually provides, it is still better than an empty string.
At least as long as "alternatives" in format strings aren't implemented.
6 years ago
Mike Fährmann
95636418ad
[tumblr] catch exception for 'hash' extraction ( fixes #129 )
6 years ago
Mike Fährmann
40e30694f3
[pinterest] fix pin.it redirects
6 years ago
Mike Fährmann
770200888e
[gfycat] use public API endpoint
6 years ago
Mike Fährmann
b1e22e8354
release version 1.6.1
6 years ago
Mike Fährmann
be52069cbc
update CHANGELOG and docs/supportedsites
6 years ago
Mike Fährmann
5d6e219fb2
[joyreactor] update tests
6 years ago
Mike Fährmann
c59f56fe7e
[gfycat] fix extraction
...
/cajax/get/<id> doesn't work anymore
6 years ago
Mike Fährmann
ba56827f36
[newgrounds] add user-, video-, image-extractors ( #119 )
6 years ago
Mike Fährmann
15890930ea
[mangafox] fix extraction
...
use mobile version since desktop version is obfuscated
6 years ago
Mike Fährmann
a4263fb253
[luscious] add extractor for search results ( closes #127 )
6 years ago
Mike Fährmann
fb53b5dd55
fix control+c during -j and range tests
6 years ago
Mike Fährmann
a0ae156edc
[pornreactor] add tag-, user-, post-extractors ( #114 )
6 years ago
Mike Fährmann
bacbc2e7bd
[joyreactor] try to prevent JsonDecodeErrors ( #114 )
6 years ago
Mike Fährmann
503d42a1c2
[joyreactor] add tag-, user-, post-extractors ( #114 )
6 years ago
Mike Fährmann
59bb434ba5
[flickr] add ability to download all albums of a user
...
for example with 'https://www.flickr.com/photos/shona_s/albums'
6 years ago
Mike Fährmann
13cb270326
set target directory before postprocessor init ( fixes #126 )
6 years ago
Mike Fährmann
9e188f6a21
[4chan] support 4channel.org domain
6 years ago