Mike Fährmann
a453335a9f
remove test results in extractor modules
and add generic example URLs
1 year ago
Mike Fährmann
c84397023a
[slideshare] fix extraction
1 year ago
Mike Fährmann
dd884b02ee
replace json.loads with direct calls to JSONDecoder.decode
2 years ago
Mike Fährmann
3cebf787c4
[slideshare] fix metadata extraction
2 years ago
Mike Fährmann
f2e59cc906
[slideshare] fix 'description' extraction
2 years ago
Mike Fährmann
7aa2e2cd84
[slideshare] fix extraction
3 years ago
Mike Fährmann
211de95dd0
update extractor test results
3 years ago
Mike Fährmann
bd08ee2859
remove most 'yield Message.Version' statements
only leave them in oauth.py as noop results
3 years ago
Mike Fährmann
de14b7ad7a
[slideshare] fix extraction
3 years ago
Mike Fährmann
280b1ac16d
[slideshare] fix extraction
4 years ago
Mike Fährmann
968d3e8465
remove '&' from URL patterns
'/?&#' -> '/?#' and '?&#' -> '?#'
According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
4 years ago
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor
6 years ago
Mike Fährmann
6284731107
simplify extractor constants
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
6 years ago
Mike Fährmann
f471161920
Merge branch 'master' into 1.4-dev
7 years ago
Mike Fährmann
cc36f88586
rename safe_int to parse_int; move parse_* to text module
7 years ago
Mike Fährmann
10cc59f3b5
fix extractor names
7 years ago
Mike Fährmann
34873dbd90
set 'archive_fmt' values
These are going to be used to create a unique ID for each image.
7 years ago
Mike Fährmann
4edb25346e
[slideshare] support mobile URLs (closes #67)
7 years ago
Mike Fährmann
0a9a07a6e1
[slideshare] improve metadata; flake8
- added 'views' and 'published' keywords
- fixed longer titles and descriptions
7 years ago
Leonardo Taccari
a8d2dde8b2
[slideshare] Add a new extractor for slideshare.net (#54)
7 years ago