Mike Fährmann
caf31e751c
[kemonoparty] limit 'title' length ( #4741 )
11 months ago
Mike Fährmann
d0effcae20
[kemonoparty] add 'revision_index' metadata field ( #4727 )
11 months ago
Mike Fährmann
3bbaa875f1
[kemonoparty] fix parsing of non-standard 'dates' ( #4676 )
11 months ago
Mike Fährmann
0d52b775cb
[kemonoparty] add 'revisions' option ( #4498 , #4597 )
11 months ago
Mike Fährmann
6e830ffc9e
[kemonoparty] support post searches ( #3385 , #4057 )
11 months ago
Mike Fährmann
aaf539009b
[kemonoparty] initial support for post revisions ( #4498 , #4597 )
...
- single revision
https://kemono.party/SERVICE/user/12345/post/12345/revision/12345
- all revisions
https://kemono.party/SERVICE/user/12345/post/12345/revisions
11 months ago
Mike Fährmann
174191cb79
[kemonoparty] restore discord pagination ( #4676 )
11 months ago
Mike Fährmann
c9a976d8a6
[kemonoparty] various updates and fixes ( #4676 , #4681 )
...
- fix pagination
- fix 'date' metadata
- fix discord channel API endpoint
11 months ago
Klion Xu
dc1c2139b1
fix line too long
11 months ago
Klion Xu
6b22af9720
[kemonoparty] update API endpoint ( #4676 )
11 months ago
Mike Fährmann
ade8347ead
[kemonoparty] fix DM dates
11 months ago
Mike Fährmann
6dfe200ae4
[kemonoparty] support discord URLs with channel IDs ( #4662 )
11 months ago
Mike Fährmann
3ecb512722
send Referer headers by default
1 year ago
Mike Fährmann
d13c82eff1
[kemonoparty] update favorites API endpoint ( #4522 )
1 year ago
Mike Fährmann
27ec653991
fix bug in test_init and update example URLs
1 year ago
Mike Fährmann
a453335a9f
remove test results in extractor modules
...
and add generic example URLs
1 year ago
Mike Fährmann
a383eca7f6
decouple extractor initialization
...
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This allows, for example, adjusting config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
1 year ago
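The split described in commit a383eca7f6 can be illustrated with a minimal sketch. The class below is a hypothetical stand-in, not the project's actual extractor class; the attribute names ('config', 'session', 'cookies', '_initialized') and the dict-based session are assumptions made only for illustration.

```python
class Extractor:
    """Hypothetical stand-in used to illustrate deferred initialization."""

    def __init__(self, url):
        # The constructor only records cheap state; no session, cookie,
        # or config work happens here.
        self.url = url
        self.config = {}
        self.session = None
        self.cookies = None
        self._initialized = False

    def initialize(self):
        # The actual init, callable separately from __init__().
        if self._initialized:
            return
        user_agent = self.config.get("user-agent", "default")
        self.session = {"headers": {"User-Agent": user_agent}}
        self.cookies = dict(self.config.get("cookies", {}))
        self._initialized = True


extractor = Extractor("https://kemono.party/SERVICE/user/12345")
# A caller (a Job, in the commit's terms) can adjust config first ...
extractor.config["user-agent"] = "custom-agent"
# ... and only then trigger the real initialization.
extractor.initialize()
```

Because the config is read inside 'initialize()' rather than in the constructor, changes made between construction and initialization still take effect.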
Mike Fährmann
d97b8c2fba
consistent cookie-related names
...
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
1 year ago
Mike Fährmann
4ae925c88f
[kemonoparty] support '.su' TLD ( #4139 )
1 year ago
Mike Fährmann
3516fdae74
[kemonoparty] fix kemono and coomer logins using the same cache
...
(#4098 )
1 year ago
Mike Fährmann
76b01b64cf
[kemonoparty] remove MD5 hash extraction ( #3531 )
...
This partially reverts commit 20d6194ffa.
2 years ago
ClosedPort22
20d6194ffa
[kemonoparty] improve hash extraction
...
- extract MD5 hash from URLs
- extract MD5 and SHA256 hash from Discord URLs (kemono.party only)
- minor optimization (do not call 'hashes.add' when 'duplicates' is
true)
- update tests accordingly
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2 years ago
Mike Fährmann
85bd1cbc89
[kemonoparty] fix regression from 473bd380 ( #3519 )
...
- do not access 'response.content' unless necessary
- only validate responses if filename extensions differ
2 years ago
Mike Fährmann
473bd380c8
[kemonoparty] reject invalid/empty files ( #3510 )
2 years ago
Mike Fährmann
b0cb4a1b9c
replace 'text.extract()' with 'text.extr()' where possible
2 years ago
Mike Fährmann
77173694d5
[kemonoparty] fix 'dms' extraction ( #3106 )
2 years ago
Mike Fährmann
94a2dfe205
[kemonoparty] update pagination offset
2 years ago
Mike Fährmann
78694a61bb
[kemonoparty] restore 'favorites' API endpoints ( #2994 )
2 years ago
Mike Fährmann
b84982b2f9
[kemonoparty] send Referer headers ( #2989 , #2990 )
2 years ago
Mike Fährmann
779e75c6f8
[kemonoparty] fix attachment IDs overwriting post IDs ( #2984 )
...
regression from 09a5cc61
2 years ago
Mike Fährmann
09a5cc6103
[kemonoparty] add 'count' metadata field ( #2952 )
2 years ago
enduser420
574e38a287
[kemonoparty] add 'favorites' option ( #2826 ) ( #2831 )
...
* [kemonoparty] add 'favorites' option (#2826 )
* [kemonoparty] add regex for the url parameter and fallback on the config
option
* [kemonoparty] simplify
2 years ago
Mike Fährmann
7c0505868c
[kemonoparty] ensure all files have an 'extension' ( #2740 )
2 years ago
Mike Fährmann
ba69fb669d
[kemonoparty] add 'duplicates' option ( closes #2440 )
3 years ago
Mike Fährmann
fac8047899
[kemonoparty] limit default filename length ( #2373 )
3 years ago
Mike Fährmann
bddcec49f1
implement 'text.root_from_url()'
...
use domain from input URL for kemono
3 years ago
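As a rough illustration of what "use domain from input URL" in commit bddcec49f1 means, the sketch below derives a root (scheme plus domain) from a URL using the standard library. It is not the project's 'text.root_from_url()' implementation; the function name 'root_from_url_sketch' and the default scheme parameter are assumptions.

```python
from urllib.parse import urlsplit


def root_from_url_sketch(url, scheme="https://"):
    """Return scheme and domain of a URL (illustrative helper only)."""
    parts = urlsplit(url)
    if parts.scheme and parts.netloc:
        # URL already carries a scheme: keep it together with the domain.
        return f"{parts.scheme}://{parts.netloc}"
    # No scheme given: treat everything before the first '/' as the domain
    # and prepend the assumed default scheme.
    return scheme + url.split("/", 1)[0]


root = root_from_url_sketch("https://beta.kemono.party/SERVICE/user/12345")
```

With a root derived this way, follow-up requests can stay on whichever domain the input URL used instead of a hard-coded one.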
Mike Fährmann
92c492dc09
[kemonoparty] match beta.kemono.party URLs ( #2348 )
3 years ago
Mike Fährmann
a57a44f510
[kemonoparty] handle files without 'name' ( fixes #2276 )
3 years ago
Mike Fährmann
d7b8e04b50
[kemonoparty] use 'Accept-Encoding: identity' for all downloads
...
(#2267 )
fixes issues when data sent with 'Content-Encoding: gzip' or other
encodings is larger than the actual file
3 years ago
Mike Fährmann
a2eecc6aa8
[kemonoparty] fix DMs extraction ( #2008 )
3 years ago
Mike Fährmann
6af8d71da6
[kemonoparty] use service as subcategory ( closes #2147 )
3 years ago
Mike Fährmann
8ed282f7f2
[kemonoparty] support coomer.party URLs ( #2100 )
3 years ago
Mike Fährmann
f1b142e993
[kemonoparty] change default 'files' order to attachments,file,inline
...
(#1991 )
3 years ago
Mike Fährmann
e298882acc
[kemonoparty] match URLs with www subdomain
3 years ago
Mike Fährmann
af6424f398
allow testing metadata in list elements
3 years ago
Mike Fährmann
c67756e187
[kemonoparty] add 'dms' option ( #2008 )
3 years ago
Mike Fährmann
9bc83af3a6
[kemonoparty] 'postfile' -> 'file' ( #1991 )
...
to stay consistent with the existing file types for kemono
3 years ago
Mike Fährmann
d433735750
[kemonoparty] skip duplicate files ( #2032 , #1991 , #1899 )
...
Extract the SHA-256 file hash from URLs
and skip files with the same hash in the same post.
- provide a 'hash' metadata field (empty string if not available)
- remove 'patreon-skip-file' option
3 years ago
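The duplicate handling described in commit d433735750 (extract the SHA-256 hash from the URL, skip files with the same hash within a post, expose a 'hash' field that is empty when unavailable) might look roughly like the sketch below. The URL layout, the regular expression, and the helper names 'extract_hash' and 'unique_files' are assumptions for illustration, not the extractor's actual code.

```python
import re

# Assumption: the hash appears in the URL path as 64 lowercase hex characters.
SHA256_IN_URL = re.compile(r"/([0-9a-f]{64})(?:\.|/|$)")


def extract_hash(url):
    """Return the SHA-256 hash found in a URL, or '' if there is none."""
    match = SHA256_IN_URL.search(url)
    return match.group(1) if match else ""


def unique_files(urls):
    """Yield one entry per URL of a single post, skipping repeated hashes."""
    seen = set()
    for url in urls:
        file_hash = extract_hash(url)
        if file_hash:
            if file_hash in seen:
                continue  # same content already yielded for this post
            seen.add(file_hash)
        # Files without a recognizable hash are always kept.
        yield {"url": url, "hash": file_hash}


h = "a" * 64
post_urls = [
    f"https://example.org/data/{h}.png",
    f"https://example.org/other/{h}.png",
    "https://example.org/no-hash.jpg",
]
files = list(unique_files(post_urls))
```

The 'seen' set is created per call, so deduplication is scoped to one post, matching the commit's "in the same post" wording.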
Mike Fährmann
d4ec245554
[kemonoparty] implement a 'files' option ( #1991 )
...
similar to 8d676151
3 years ago
Mike Fährmann
6e3658ef52
[kemonoparty] provide 'date' metadata for gumroad ( #2007 )
...
Not the 'published' or 'edited' values since they are 'null',
but still better than nothing at all.
3 years ago