Mike Fährmann
daeef8a5e3
[vsco] handle missing 'description' fields
4 years ago
Mike Fährmann
26a967cbd4
[pinterest] match 'pinterest.co.uk' URLs ( fixes #914 )
4 years ago
Mike Fährmann
c5aaa1de77
[inkbunny] simplify metadata structure ( #283 )
...
Just put everything at the top level,
instead of having a separate 'post' object.
4 years ago
Mike Fährmann
b921fee24d
[inkbunny] fix submission order ( #283 )
...
Getting detailed submission info via /api_submissions.php reordered the
input submissions and sorted them by ID. InkbunnyAPI.detail() now sorts
them back and ensures they are returned in their original order.
This commit also removes the 'metadata' option and always requests
submission descriptions.
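
The reordering fix described above can be illustrated with a minimal sketch. This is not the project's actual InkbunnyAPI.detail() code; the helper name `restore_order` and the 'submission_id' field are assumptions made for illustration only.

```python
def restore_order(requested_ids, submissions):
    """Return 'submissions' in the order given by 'requested_ids'.

    Hypothetical helper: assumes each submission is a dict with a
    'submission_id' key (an assumed field name, not confirmed by the log).
    """
    by_id = {sub["submission_id"]: sub for sub in submissions}
    # Walk the IDs in their original order and skip any the API did not return
    return [by_id[sid] for sid in requested_ids if sid in by_id]


requested = ["30", "10", "20"]
# Simulated API response, sorted by ID instead of request order
api_result = [
    {"submission_id": "10"},
    {"submission_id": "20"},
    {"submission_id": "30"},
]
ordered = restore_order(requested, api_result)
```

Building a dict keyed by ID keeps the reordering linear in the number of submissions instead of searching the response list once per requested ID.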
4 years ago
Mike Fährmann
e50c75628c
[subscribestar] update 'date' parsing
4 years ago
Mike Fährmann
c4ed9f4faa
[inkbunny] add 'metadata' option ( #283 )
4 years ago
Mike Fährmann
493cadb1e7
[inkbunny] add 'orderby' option ( #283 )
4 years ago
Mike Fährmann
336e682a7a
[inkbunny] handle gallery/scraps URLs ( #283 )
4 years ago
Mike Fährmann
8dbf827649
[bobx] remove module
4 years ago
Mike Fährmann
8f64585ff2
[twitter] handle 429 responses without x-rate-limit-reset header
4 years ago
Mike Fährmann
d2e17e16bf
[inkbunny] update tests ( #283 )
4 years ago
Mike Fährmann
57f7d9b790
[inkbunny] improve error handling ( #283 )
4 years ago
Mike Fährmann
baf5d0e3c1
[gfycat] skip malformed gfycat responses ( closes #902 )
4 years ago
Mike Fährmann
453f3bc519
[blogger] improve error messages for missing posts/blogs ( #903 )
4 years ago
Mike Fährmann
87202b8d74
[inkbunny] add 'user' and 'post' extractors ( #283 )
4 years ago
Mike Fährmann
2ecf1efb16
update extractor test results
...
- tumblr: remove deleted post
- jaiminisbox: replace removed manga/chapters
- smugmug: one inconsequential field got removed
4 years ago
Mike Fährmann
d5fcffcced
[subscribestar] add login capabilities ( #852 )
4 years ago
Mike Fährmann
ecaecc4064
[exhentai] add 'domain' option ( #897 )
4 years ago
Mike Fährmann
45c32213dc
[gfycat] retry 404'ed videos on redgifs ( closes #874 )
4 years ago
Mike Fährmann
cf44571fe0
[gfycat] add 'user' and 'search' extractors
4 years ago
Mike Fährmann
11b744d971
[mangakakalot] improve/fix chapter extraction
4 years ago
Mike Fährmann
2da71cb561
[twitter] raise proper exception if user doesn't exist ( #891 )
4 years ago
Leonardo Taccari
86e5a05e29
[twitter] add support for nitter.net URLs in pattern ( #890 )
...
Please note that URLs are only "translated"; all requests are still
made via the Twitter API.
4 years ago
Mike Fährmann
e17d4f44f6
[newgrounds] fix favorites extraction
4 years ago
Mike Fährmann
c51fbd72ba
update extractor test results
4 years ago
Mike Fährmann
9cd1bc6907
[mangakakalot] update URL patterns, fix flake8 errors ( #876 )
4 years ago
jakem72360
7dfdcc3fbf
[mangakakalot] Added extractors for MangaKakalot ( #876 )
4 years ago
Mike Fährmann
cb0132e441
[khinsider] add 'format' option ( closes #840 )
4 years ago
Mike Fährmann
d594977ca1
[artstation] add 'following' extractor ( closes #888 )
4 years ago
Mike Fährmann
3855d0dd3c
[twitter] add debug messages for all skipped Tweets ( #867 )
4 years ago
Mike Fährmann
27d163afb3
[imgur] support all '/t/...' URLs ( closes #880 )
...
… instead of just '/t/unmuted/'
4 years ago
Mike Fährmann
f5c9f1d066
[subscribestar] use current date instead of hard-coded '2020' ( #852 )
4 years ago
Mike Fährmann
5a6e750704
[reddit] fix AttributeError when using 'recursion' ( fixes #879 )
4 years ago
Mike Fährmann
94a08f0bcb
[reddit] limit title length in default filenames ( #873 )
4 years ago
Mike Fährmann
3424fb96c3
[redgifs] support gifsdeliverynetwork.com URLs ( #874 )
4 years ago
Mike Fährmann
f1344fe552
[patreon] yield images and attachments before postfiles ( #871 )
...
The reported filename of the 'postfile' entry of each post may differ
from the corresponding entry in the list of images or attachments,
and be outright "wrong".
4 years ago
Mike Fährmann
6e2af9a8d8
[twitter] improve error message formatting
4 years ago
Mike Fährmann
c28db7a6ea
[8muses] support 'comics.8muses.com' URLs
4 years ago
Mike Fährmann
d5bfb0b38c
set pseudo extension for Metadata messages ( #865 )
...
This prevents pathfmt.filename from potentially being empty.
4 years ago
Mike Fährmann
821524e4ee
[subscribestar] add 'user' and 'post' extractors ( #852 )
4 years ago
Mike Fährmann
e62ebb4643
update CHANGELOG before building sdist and wheel packages
4 years ago
Mike Fährmann
f1ddbff0b5
[aryion] add 'recursive' option ( fixes #832 )
...
This is enabled by default and will recursively go through all
(sub)folders in an artist's gallery.
The old method of using "Latest Updates" lists can be restored by
disabling this option.
4 years ago
Mike Fährmann
699062b91f
Revert "[kissmanga] workaround for CAPTCHAs ( #818 )"
...
This reverts commit 4cf3d54718.
4 years ago
Mike Fährmann
0cac14c3bd
update extractor test results
4 years ago
Mike Fährmann
5e5be67c26
[tumblr] prevent KeyErrors when using reblogs=same-blog
...
(fixes #851 )
4 years ago
Mike Fährmann
9da2bc67f8
[twitter] add option to filter media from quoted tweets ( #854 )
4 years ago
Mike Fährmann
56ab5fb8f4
[twitter] improve handling of quoted tweets ( #854 )
...
Split each "quote" into two parts:
- the original tweet
- the tweet that quoted the original
4 years ago
Mike Fährmann
bd0e1ca1a5
[imgur] build directory path for each file ( closes #842 )
4 years ago
Mike Fährmann
a8c2d997e8
[twitter] treat quoted tweets like retweets ( #833 )
...
- filter them when 'retweets' is disabled
- set 'author' to the creator of the quoted tweet
like it was before the rewrite
4 years ago
Mike Fährmann
aed1c63e51
[twitter] improve search results ( fixes #847 )
...
Adding 'tweet_search_mode=live' to the query parameters
is the most important part here.
4 years ago
Mike Fährmann
0e714b9a0e
[pinterest] add 'section' extractor ( #835 )
4 years ago
Mike Fährmann
53cc498d9c
improve config lookup when there are multiple possible locations
...
This specifically applies to all Mastodon extractors and all
extractors with a 'basecategory', i.e. 'booru', 'foolslide', etc.
Values inside those general config locations wouldn't be recognized
when a value with the same name was set on the 'extractor' level.
For example 'extractor.mastodon.directory' should be used over
'extractor.directory' when both are set, but this was impossible
with the previous implementation.
(fixes #843 )
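
The intended lookup order can be sketched as follows. This is a simplified illustration, not the project's real config module; the function name `lookup` and the nested-dict layout are assumptions.

```python
def lookup(config, paths, key, default=None):
    """Return 'key' from the first location in 'paths' that defines it.

    'paths' lists config locations from most to least specific,
    each given as a tuple of nested dict keys (hypothetical layout).
    """
    for path in paths:
        node = config
        for name in path:
            node = node.get(name)
            if not isinstance(node, dict):
                break  # this location does not exist; try the next one
        else:
            if key in node:
                return node[key]
    return default


config = {
    "extractor": {
        "directory": ["general"],
        "mastodon": {"directory": ["mastodon"]},
    }
}
# Most specific location first, general 'extractor' level last
paths = [("extractor", "mastodon"), ("extractor",)]
value = lookup(config, paths, "directory")
```

With this ordering, 'extractor.mastodon.directory' wins over 'extractor.directory' when both are set, and the general value is only used as a fallback.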
4 years ago
Mike Fährmann
d81a8e6544
[twitter] update tests
4 years ago
Mike Fährmann
d39eedd9bb
[twitter] improve handling of deleted tweets ( fixes #838 )
4 years ago
Mike Fährmann
1ae1df0d27
update '--write-pages' ( #737 )
...
- fix infinite recursion for responses with multiple entries in
'history'
- hide values of Set-Cookie headers
- only write the response content by default
(use '-o write-pages=all' to also include HTTP headers)
4 years ago
Mike Fährmann
dc16f73965
[twitter] move '_guest_token()' into TwitterAPI class
4 years ago
Mike Fährmann
3561d1020a
[twitter] always provide an 'author' field ( #831 , #833 )
...
The idea was to have less metadata clutter for most Tweets where
'author' and 'user' are the same (non-retweets), and only provide
a 'user' field.
The original Tweet author could be obtained with
{author[…]|user[…]}, but basically no one knows about that.
4 years ago
Mike Fährmann
7158bdd7c7
[weibo] improve extractor logic ( #829 )
4 years ago
Mike Fährmann
0371fd54a1
[artstation] add 'date' metadata field ( #839 )
4 years ago
Mike Fährmann
8c857052d7
[mastodon] ignore toots without media attachments
4 years ago
Mike Fährmann
de045d39b2
[mastodon] add 'date' metadata field ( #839 )
4 years ago
Mike Fährmann
d5d90a0450
[weibo] add 'date' field to 'status' objects ( #829 )
4 years ago
Mike Fährmann
5ba90f72ca
[pinterest] add support for sections ( closes #835 )
4 years ago
Mike Fährmann
c37a1c06c8
[twitter] add extractor for liked tweets ( closes #837 )
...
You need to be logged in to get access to anyone's liked tweets,
it seems.
4 years ago
Mike Fährmann
b94394104c
[twitter] don't download video previews ( #833 )
...
when 'videos' is set to False
4 years ago
Mike Fährmann
bb882b8cdb
improve output of '-K' for parent extractors ( #825 )
4 years ago
Mike Fährmann
4cf3d54718
[kissmanga] workaround for CAPTCHAs ( fixes #818 )
...
Requesting the same page again when being redirected to a CAPTCHA
lets us access that page without solving it.
4 years ago
Mike Fährmann
7daef6ee70
update extractor test results
...
- certain posts on Instagram now return
https://static.cdninstagram.com/rsrc.php/null.jpg
for public users
- MangaDex is deploying its new MangaDex@Home network similar to
exhentai's Hentai@Home
- realbooru has a new site layout, but the underlying booru API still
works like before
4 years ago
Mike Fährmann
ffb6c5277a
[furaffinity] add 'artist_url' metadata field ( closes #821 )
4 years ago
Mike Fährmann
be04e44e2c
[reddit] catch JSON decode errors ( #765 )
4 years ago
Mike Fährmann
cf863f60b3
[redgifs] add 'user' and 'search' extractors ( closes #724 )
4 years ago
Mike Fährmann
998d1d3a5c
[webtoons] generalize and improve comic extraction ( fixes #820 )
4 years ago
Mike Fährmann
036a40943a
[twitter] don't cache results of 'user_by_screen_name()'
...
A 'keyarg=1' argument to the memcache decorator would have worked as
well, but keeping the user object in memory isn't useful for the vast
majority of use cases and only wastes space.
(closes #817 )
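
To show what a 'keyarg' argument means for an in-memory cache decorator, here is a rough sketch. It is an assumption-laden stand-in, not the project's actual memcache implementation: the decorator below, the `API` class, and the returned dict are all hypothetical.

```python
import functools


def memcache(keyarg=None):
    """Cache results in memory, keyed by the positional argument at
    index 'keyarg' (or by a single shared key if 'keyarg' is None)."""
    def decorator(func):
        cache = {}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = "" if keyarg is None else args[keyarg]
            if key not in cache:
                cache[key] = func(*args, **kwargs)
            return cache[key]
        return wrapper
    return decorator


class API:
    """Hypothetical API client used only for this illustration."""

    def __init__(self):
        self.calls = 0

    @memcache(keyarg=1)  # index 1 is 'screen_name' (index 0 is 'self')
    def user_by_screen_name(self, screen_name):
        self.calls += 1
        return {"screen_name": screen_name}


api = API()
first = api.user_by_screen_name("alice")
again = api.user_by_screen_name("alice")  # served from the cache
other = api.user_by_screen_name("bob")    # different key, new call
```

Without a 'keyarg', every call in this sketch would share one cache entry regardless of the screen name, which is why either keying by argument or dropping the cache entirely avoids returning the wrong user.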
4 years ago
Mike Fährmann
4442dfe7b8
[twitter] add 'reply_to' metadata to replies
4 years ago
Mike Fährmann
83b7bd0413
[nhentai] fix extraction ( closes #819 )
4 years ago
Mike Fährmann
d769bb4b80
[twitter] improve pagination
4 years ago
Mike Fährmann
5bc1097f9d
[twitter] metadata cleanup #2
...
- remove useless clutter by creating new tweet-data dicts instead of
reusing the original Tweet objects
- rename fields to how they were named before
('id_str' -> 'tweet_id', etc.)
- only include 'author' if it would differ from 'user'
- restore 'archive_fmt'
4 years ago
Mike Fährmann
c6c06c41f6
[deviantart] don't add journal text to description ( #712 )
4 years ago
Mike Fährmann
4aea5138dd
[sensescans] use https://
4 years ago
Mike Fährmann
3eed5f52d7
[twitter] small metadata cleanup
...
- add 'date' field
- remove 'entities' and 'extended_entities'
- don't include 'focus_fields' from 'original_info'
4 years ago
Mike Fährmann
655c98cbef
[twitter] skip unavailable tweets
4 years ago
Mike Fährmann
41d03160ff
[deviantart] also search journals for sta.sh links ( #712 )
...
when 'extra' is enabled
4 years ago
Mike Fährmann
2132e5461a
[twitter] restore TwitPic support
4 years ago
Mike Fährmann
bd0f21478a
[twitter] login using the mobile nojs login page
4 years ago
Mike Fährmann
a10f31dde5
[twitter] rewrite; use new interface ( #740 , #806 )
...
Everything except logging in with username & password and TwitPic
embeds should be working again.
Metadata per Tweet is massively different than before (mostly raw API
responses - might need some cleaning up) and the default 'archive_fmt'
changed.
4 years ago
Mike Fährmann
3bad1579ee
update extractor test results
4 years ago
Mike Fährmann
864f4220d9
update output of 'oauth:…' ( #616 )
4 years ago
Mike Fährmann
0f459f340b
[instagram] fix and re-enable login with username&password
...
This reverts commit 3e0848a482.
(#756 , #771 , #797 , #803 )
https://github.com/althonos/InsaLooter/issues/287#issuecomment-630456522
4 years ago
Mike Fährmann
3e0848a482
[instagram] disable login with username&password ( #756 )
4 years ago
Mike Fährmann
a32aea41e1
[instagram] update 'query_hash' values
4 years ago
Mike Fährmann
2bff8dd465
[hentainexus] fix flake8 issues ( #787 )
4 years ago
Mike Fährmann
a63682a9c0
[instagram] simplify code & complete tests ( #743 )
4 years ago
墨焓
a4e3d40672
hentainexus.py minor fix ( #787 )
...
* fix the `join_title` code, plus some minor fixes
* add hentainexus self.data
* fixed: call staticmethod join_title with data
4 years ago
Vrihub
62b65e59d0
Add instagram metadata: post_pageurl, post_tags ( #743 )
...
* Add instagram metadata: post_pageurl, post_tags
Add the following metadata for instagram:
- post_pageurl: json string with url of the post page
- post_tags: json array with instagram tags extracted from the post description
* Oops: rename post_tags to tags for --write-tags
This way, --write-tags will pick up the post tags.
* Rename to post_url, improve regex
* Add post_url and tags to tests
* Remove duplicate tags and sort them
* Bugfix: don't create empty tag lists
* Metadata: add location
* Metadata: add tagged_users for each media
* Move self._find_tags() to base class
* Make flake happy
4 years ago
Mike Fährmann
275cceeb6a
[redgifs] fix extraction ( #724 )
...
… and prepare for more potential extractors
4 years ago
Mike Fährmann
45baa13615
update extractor test results
...
- don't run Instagram tests on Travis anymore
- replace Twitter test because timeline was made private
- update Hiperdex domain to '.com' (again ...)
4 years ago
Mike Fährmann
dfcf2a2c91
write OAuth token to cache by default ( #616 )
4 years ago
Mike Fährmann
15c3d29062
move dump_response() into a separate function ( #737 )
4 years ago
Mike Fährmann
a363da4b43
include redirects and headers in --write-pages dumps ( #737 )
4 years ago
Mike Fährmann
6bcdb264e0
[imgur] treat 't/unmuted' URLs as galleries
4 years ago