Mike Fährmann
a453335a9f
remove test results in extractor modules
...
and add generic example URLs
1 year ago
Mike Fährmann
f856987297
[subscribestar] fix preview detection (#4468)
...
and show a warning message when posts contain previews
1 year ago
Mike Fährmann
d97b8c2fba
consistent cookie-related names
...
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
1 year ago
Mike Fährmann
dd884b02ee
replace json.loads with direct calls to JSONDecoder.decode
2 years ago
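A minimal sketch of what this replacement amounts to, assuming plain string input and no extra keyword arguments. The name `json_loads` is hypothetical; where the project actually binds the decoder is not shown in this log.

```python
import json

# Bind the decoder method once and reuse it (hypothetical name).
# For a plain str argument with no keyword options this returns the
# same result as json.loads(), minus its argument checks.
json_loads = json.JSONDecoder().decode

data = json_loads('{"id": 1, "tags": ["a", "b"]}')
same = json.loads('{"id": 1, "tags": ["a", "b"]}')
```

Both calls produce equal objects; the direct call only skips the per-call dispatch that `json.loads` performs.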
Mike Fährmann
b0cb4a1b9c
replace 'text.extract()' with 'text.extr()' where possible
2 years ago
Mike Fährmann
541a61d344
[subscribestar] fix 'date' metadata (#2642)
...
Handle instances where the actual datetime information
is preceded by "Updated on "
2 years ago
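The handling described in this commit could be sketched as follows. The date format string and the function name `parse_post_date` are assumptions for illustration; the site's real date format and the extractor's actual code are not part of this log.

```python
from datetime import datetime

PREFIX = "Updated on "
# Assumed format for illustration only; the real site format may differ.
DATE_FORMAT = "%Y-%m-%d %H:%M"


def parse_post_date(raw):
    """Parse a post date, ignoring an optional 'Updated on ' prefix."""
    raw = raw.strip()
    if raw.startswith(PREFIX):
        # Drop the prefix so only the datetime text remains.
        raw = raw[len(PREFIX):]
    return datetime.strptime(raw, DATE_FORMAT)
```

With this sketch, `parse_post_date("Updated on 2022-06-01 12:30")` and `parse_post_date("2022-06-01 12:30")` yield the same `datetime`.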
Mike Fährmann
d50a1ec2cc
[subscribestar] unescape attachment URLs (fixes #2370)
3 years ago
Mike Fährmann
522782c09d
[subscribestar] emit metadata for posts without media (#1569)
3 years ago
Mike Fährmann
1c8aaf9318
[subscribestar] add 'num' enumeration index (closes #2040)
3 years ago
Mike Fährmann
21c2da454f
update extractor test results
3 years ago
Mike Fährmann
d09bc5bd34
[subscribestar] improve attachment filenames (#1609)
3 years ago
Mike Fährmann
968d3e8465
remove '&' from URL patterns
...
'/?&#' -> '/?#' and '?&#' -> '?#'
According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
4 years ago
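The effect of this change on a negated character class can be sketched as below. The host `example.org`, the sample URL, and the helper `first_segment` are hypothetical; the actual extractor patterns are not shown in this log.

```python
import re

# Old style: '&' is treated as a delimiter inside the character class.
OLD_PATTERN = re.compile(r"https?://example\.org/([^/?&#]+)")
# New style: only '/', '?' and '#' delimit components (RFC 3986).
NEW_PATTERN = re.compile(r"https?://example\.org/([^/?#]+)")


def first_segment(pattern, url):
    """Return the first path segment captured by pattern, or None."""
    match = pattern.match(url)
    return match.group(1) if match else None


url = "https://example.org/tom&jerry/posts?page=2#top"
old_result = first_segment(OLD_PATTERN, url)  # "tom"
new_result = first_segment(NEW_PATTERN, url)  # "tom&jerry"
```

For path segments that contain no `&`, both patterns capture the same text, so the shorter class changes nothing for typical URLs.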
Mike Fährmann
69e4871005
update extractor test results
...
- sensescans: replace 404d chapters
- mangapark: replace 404d chapters
- subscribestar: update test for attached files
4 years ago
Mike Fährmann
0d84d3af55
[subscribestar] extract attached media files (#852)
4 years ago
Mike Fährmann
e50c75628c
[subscribestar] update 'date' parsing
4 years ago
Mike Fährmann
d5fcffcced
[subscribestar] add login capabilities (#852)
4 years ago
Mike Fährmann
f5c9f1d066
[subscribestar] use current date instead of hard-coded '2020' (#852)
4 years ago
Mike Fährmann
821524e4ee
[subscribestar] add 'user' and 'post' extractors (#852)
4 years ago