Mike Fährmann
968d3e8465
remove '&' from URL patterns
...
'/?&#' -> '/?#' and '?&#' -> '?#'
According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
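
As a rough illustration of the effect of this change, the sketch below compares a character class that excludes '&' with one that does not. The two patterns, the example.org host, and the first_segment() helper are hypothetical and are not taken from the project's extractors.

```python
import re

# Hypothetical patterns for illustration only; not the project's own.
OLD = re.compile(r"https?://example\.org/([^/?&#]+)")  # '&' treated as a delimiter
NEW = re.compile(r"https?://example\.org/([^/?#]+)")   # only '/', '?' and '#' delimit


def first_segment(pattern, url):
    """Return the first path segment captured by 'pattern', or None."""
    match = pattern.match(url)
    return match.group(1) if match else None


url = "https://example.org/tom&jerry?page=2#top"
print(first_segment(OLD, url))  # tom
print(first_segment(NEW, url))  # tom&jerry
```

With the old class, a path segment that happens to contain '&' is cut short; the new class stops only at the delimiters named in the RFC quote above.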
4 years ago
Mike Fährmann
67ac6667af
[mangareader] fix extraction
4 years ago
Mike Fährmann
e6cd49e78b
update extractor test results
5 years ago
Mike Fährmann
5530871b5a
change results of text.nameext_from_url()
...
Instead of getting a complete 'filename' from a URL and splitting that
into 'name' and 'extension', the new approach gets rid of the complete
version and renames 'name' to 'filename'. (Using anything other than
{extension} for a filename extension doesn't really work anyway)
Example: "https://example.org/path/filename.ext"
before:
- filename : filename.ext
- name : filename
- extension: ext
now:
- filename : filename
- extension: ext
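
A minimal sketch of the "now" behaviour described above, using only the standard library. The function name nameext_sketch() and its handling of names without a dot are assumptions made for illustration; this is not the actual text.nameext_from_url() implementation.

```python
from urllib.parse import unquote, urlsplit


def nameext_sketch(url):
    """Split the last path component of 'url' into 'filename' and 'extension'.

    Hypothetical helper: mirrors the result layout described in the commit
    message, not the project's real code.
    """
    path = urlsplit(url).path                    # drops query string and fragment
    basename = unquote(path.rpartition("/")[2])  # text after the last '/'
    name, _, ext = basename.rpartition(".")
    if not name:
        # No dot (or only a leading dot): treat everything as the filename.
        return {"filename": basename, "extension": ""}
    return {"filename": name, "extension": ext}


print(nameext_sketch("https://example.org/path/filename.ext"))
# {'filename': 'filename', 'extension': 'ext'}
```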
6 years ago
Mike Fährmann
32edf4fc7b
add '_extractor' info to manga extractor results
6 years ago
Mike Fährmann
580baef72c
change Chapter and MangaExtractor classes
...
- unify and simplify constructors
- rename get_metadata and get_images to just metadata() and images()
- rename self.url to chapter_url and manga_url
6 years ago
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor
6 years ago
Mike Fährmann
6284731107
simplify extractor constants
...
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
6 years ago
Mike Fährmann
34bab080ae
rewrite URL patterns to use only 1 per extractor
6 years ago
Mike Fährmann
cc36f88586
rename safe_int to parse_int; move parse_* to text module
7 years ago
Mike Fährmann
6e38cf5aab
[mangareader] use 'https://'
...
The site now redirects from http://mangareader.net/
to https://mangareader.net/
7 years ago
Mike Fährmann
3cec533c28
Merge branch 'archive'
7 years ago
Mike Fährmann
5b3c34aa96
use generic chapter-extractor in more modules
7 years ago
Mike Fährmann
34873dbd90
set 'archive_fmt' values
...
These are going to be used to create a unique ID for each image.
7 years ago
Mike Fährmann
68a0a7579c
fix/improve some regular expressions
7 years ago
Mike Fährmann
633b376f35
improve/adjust default filename formats for manga sites
7 years ago
Mike Fährmann
9fc1d0c901
implement and use 'util.safe_int()'
...
same as Python's 'int()', except it doesn't raise any exceptions and
accepts a default value
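
A helper with this behaviour could look roughly like the sketch below. The default of 0 and the exact set of caught exceptions are assumptions, not a copy of the project's code.

```python
def safe_int(value, default=0):
    """Convert 'value' to int; return 'default' if the conversion fails.

    Sketch only: the default of 0 and the caught exception types are assumed.
    """
    try:
        return int(value)
    except (ValueError, TypeError, OverflowError):
        return default


print(safe_int("42"))      # 42
print(safe_int("abc"))     # 0
print(safe_int(None, -1))  # -1
```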
7 years ago
Mike Fährmann
a3e40734d1
[mangareader] extract manga metadata
7 years ago
Mike Fährmann
6f30cf4c64
change keyword names to valid Python identifiers
...
This commit mostly replaces all minus-signs ('-') in keyword names with
underscores ('_') to allow them to be used in filter-expressions. For
example 'gallery-id' got renamed to 'gallery_id'.
(It is theoretically possible to access any variable, regardless of its
name, with 'locals()["NAME"]', but that seems a bit too convoluted if
just 'NAME' could be enough)
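
The sketch below shows why the hyphen is a problem when a keyword dictionary is used as the namespace for a Python expression. The matches() helper is hypothetical and only stands in for whatever mechanism evaluates filter-expressions; it is not the project's implementation.

```python
def matches(expression, keywords):
    """Evaluate a filter expression with 'keywords' as its local names.

    Hypothetical helper for illustration only.
    """
    code = compile(expression, "<filter>", "eval")
    return bool(eval(code, {"__builtins__": {}}, keywords))


# With an underscore, the keyword is a valid identifier and resolves directly.
print(matches("gallery_id > 100", {"gallery_id": 12345}))  # True

# With a hyphen, Python parses 'gallery-id' as the subtraction 'gallery - id',
# so the lookup of 'gallery' fails even though the key exists.
try:
    matches("gallery-id > 100", {"gallery-id": 12345})
except NameError as exc:
    print("NameError:", exc)
```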
7 years ago
Mike Fährmann
f226417420
simplify code by using a MangaExtractor base class
7 years ago
Mike Fährmann
94e10f249a
code adjustments according to pep8 nr2
8 years ago
Mike Fährmann
56d810c896
update keyword hashes for tests
8 years ago
Mike Fährmann
19c2d4ff6f
remove explicit (sub)category keywords
8 years ago
Mike Fährmann
d7e168799d
consistent extractor naming scheme + docstrings
8 years ago
Mike Fährmann
ba99506c72
more extractor test-cases
9 years ago
Mike Fährmann
f7c47a6018
add subcategories to extractors
9 years ago
Mike Fährmann
f48712c9c9
docstrings
9 years ago
Mike Fährmann
914062d172
use text.extract_iter where applicable
9 years ago
Mike Fährmann
63d693866c
[mangareader] remove leading spaces from manga names
9 years ago
Mike Fährmann
20efe49f83
[mangareader] unify extractor metadata in base class
9 years ago
Mike Fährmann
d5349c8cb5
[mangareader] add manga-extractor (all chapters)
9 years ago
Mike Fährmann
4d56b76aa8
update all other extractors
9 years ago
Mike Fährmann
c2f0720184
code cleanup to use nameext_from_url
9 years ago
Mike Fährmann
009761fcd5
[mangapanda] add extractor
9 years ago
Mike Fährmann
1fa6a99f18
[mangareader] rewrite
9 years ago
Mike Fährmann
3c13548f29
rewrite extractors to use config-module
9 years ago
Mike Fährmann
42b8e81a68
rewrite extractors to use text-module
9 years ago
Mike Fährmann
13ebca2a48
[mangareader] supply correct width and height
9 years ago
Mike Fährmann
675937c77c
[mangareader] add extractor
9 years ago