Mike Fährmann
5158cbb4c1
[weibo] rework pagination logic (#4168)
don't automatically stop when receiving an empty status list
shouldn't improve 'tabtype=feed' results, but at least 'tabtype=album'
ones and others using cursors won't end prematurely
6 months ago
Mike Fährmann
ace16f00f5
[weibo] fix retweets (#2825, #3874, #5263)
- handle 快转 retweets
- disable 'retweets' by default
- skip all retweet media when 'retweets' are disabled
- extract all retweet media when 'retweets' is set to "original"
7 months ago
Mike Fährmann
0676a9d6ec
[weibo] fix 'livephoto' filename extensions (#5287)
7 months ago
Mike Fährmann
c83d0a1596
[weibo] add 'gifs' option (#5183)
7 months ago
Mike Fährmann
b4bcf40278
[weibo] fix AttributeError in 'user' extractor (#5022)
yet another bug caused by a383eca7
9 months ago
Mike Fährmann
e8b5e59a08
[weibo] detect redirects to login page (#4773)
10 months ago
Mike Fährmann
56cd9d408d
[weibo] fix Sina Visitor request
11 months ago
Mike Fährmann
3ecb512722
send Referer headers by default
1 year ago
Mike Fährmann
a453335a9f
remove test results in extractor modules
and add generic example URLs
1 year ago
Mike Fährmann
a383eca7f6
decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This makes it possible, for example, to adjust config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
1 year ago
Mike Fährmann
d97b8c2fba
consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
1 year ago
Mike Fährmann
1d4db83d49
[weibo] fix end of cursor based pagination
1 year ago
Mike Fährmann
654267a335
[weibo] fix 'json' extension for some videos
1 year ago
Mike Fährmann
0a9aaa7a8d
[weibo] prevent fatal exception due to missing video (#4150)
1 year ago
Mike Fährmann
6b6bb4be73
[weibo] require numeric IDs to have length >= 10 (#4059)
1 year ago
Mike Fährmann
72f1f16eb2
[weibo] support 'mix_media_info' entries (#3793)
2 years ago
Mike Fährmann
dd884b02ee
replace json.loads with direct calls to JSONDecoder.decode
2 years ago
Mike Fährmann
7e277d0f7d
[weibo] add 'count' metadata field (#3305)
or '{status[count]}', as most metadata for weibo is inside 'status'
2 years ago
Mike Fährmann
c25905641e
[weibo] fix bug with empty 'playback_list' (#3301)
2 years ago
Mike Fährmann
e3abab8629
[weibo] send 'Referer' headers (#3188)
2 years ago
Mike Fährmann
b0cb4a1b9c
replace 'text.extract()' with 'text.extr()' where possible
2 years ago
Mike Fährmann
1c89ccb27d
[weibo] prevent errors when paginating over album entries (#2817)
2 years ago
Mike Fährmann
0f5826e884
[weibo] prevent exception for missing 'playback_list' (#2792)
2 years ago
Mike Fährmann
c6a9bab019
update extractor test results
2 years ago
Mike Fährmann
539e3bbed9
[weibo] handle invalid/broken status objects
2 years ago
Mike Fährmann
6db77d4656
[weibo] support '?tabtype=video' listings (#2601)
2 years ago
Mike Fährmann
45c980daf0
[weibo] fix retweets (#2601)
2 years ago
Mike Fährmann
61cbf8318c
[weibo] fix URLs generated by 'user' extractor (#2601)
2 years ago
Mike Fährmann
e59bcb8437
[weibo] ensure media URLs use https://
2 years ago
Mike Fährmann
73f673e3ca
[weibo] handle 'gif' pictures
2 years ago
Mike Fährmann
57508d3bb7
[weibo] support all different 'tabtype' listings (#686, #2601)
2 years ago
Mike Fährmann
7a9cba9c10
[weibo] add support for usernames in URLs (#1662)
2 years ago
Mike Fährmann
4bf5bc2403
[weibo] support 'livephoto' entries (#2146)
2 years ago
Mike Fährmann
a0692818af
[weibo] switch to desktop API (#2601)
2 years ago
Mike Fährmann
afde76269c
[weibo] fix infinite retries for deleted accounts (fixes #2521)
2 years ago
Mike Fährmann
e670dc518e
[weibo] update pagination code (fixes #2244)
- send proper headers and query parameters
- use 'since_id' instead of page numbers
- set a 1-2 second delay between requests
3 years ago
Mike Fährmann
c80b18a477
[weibo] extend 'retweets' option (closes #1542)
Setting 'retweets' to "original" will use metadata from the
original posts, and not from the retweeted ones.
3 years ago
Mike Fährmann
73373c06ec
[weibo] handle posts with more than 9 images (closes #926)
Responses from '/api/container/getIndex' don't list more than
9 images per 'status' object, but the embedded JSON from a
'/detail/<ID>' page does.
4 years ago
Mike Fährmann
c51fbd72ba
update extractor test results
4 years ago
Mike Fährmann
7158bdd7c7
[weibo] improve extractor logic (#829)
4 years ago
Mike Fährmann
d5d90a0450
[weibo] add 'date' field to 'status' objects (#829)
4 years ago
Mike Fährmann
5e2974d699
[weibo] add 'videos' option
4 years ago
Mike Fährmann
699036ea0c
[weibo] accept status URLs with non-numeric IDs (#664)
5 years ago
Mike Fährmann
e35c2ea1a6
[weibo] use youtube-dl to download from m3u8 manifests
5 years ago
Mike Fährmann
922b8a9595
[weibo] raise NotFoundError for unavailable/deleted statuses
5 years ago
Mike Fährmann
d1ea08c67d
[weibo] fixes and improvements
- ignore unavailable videos (fixes #427)
- handle empty 'geo' fields
- consistent metadata fields for images and videos
5 years ago
Mike Fährmann
17c11393f5
[weibo] allow user-ids in status URLs
6 years ago
Mike Fährmann
973a720a7a
[weibo] fix unit test URL patterns
6 years ago
Mike Fährmann
19860655a3
[weibo] add 'user' and 'status' extractors
6 years ago