gallery-dl

Author	SHA1	Message	Date
Mike Fährmann	8f38a35b91	[imgur] use API with "public" client_id (#446) Using the API endpoints makes it possible to access NSFW content without logging in.	5 years ago
Mike Fährmann	b23c822b23	[luscious] use GraphQL	5 years ago
Mike Fährmann	ef17d94469	update test results	5 years ago
Mike Fährmann	2057c6ba29	[naver] add blog and post extractors (closes #447)	5 years ago
Mike Fährmann	389d2d7e38	implement 'cookies-update' option (#445)	5 years ago
Mike Fährmann	fbc0a6a059	[nozomi] skip unavailable posts (#388)	5 years ago
Mike Fährmann	ae98dbcbb3	[nozomi] implement searching for negated terms (#388) It's incredibly slow and resource intensive (> 1GB of memory), but that is also how it is implemented on nozomi.la itself.	5 years ago
Mike Fährmann	1c03a389df	[twitter] small improvements to search extractor - put search results in separate directories - set 'max_position' to '-1' for first request -> prevent duplicate results - add a test - flake8	5 years ago
Mike Fährmann	c3042978b8	[deviantart] match "/gallery/all" (closes #449)	5 years ago
Alice	bcddcca6db	Add search downloading to twitter.py (#448) Adds the functionality to download search results on twitter.com/search. Since twitter only allows downloading of up to 3,200 of a user's most recent tweets, you will be unable to download old images from users with a lot of tweets. To bypass this, you can use the twitter search to get the tweets from the sections in time you were stopped at. An example search would be "from:user since:2015-01-01 until:2016-01-01 filter:images". The URL you would use will look something like this: https://twitter.com/search?f=tweets&q=from%3Asupernaturepics%20since%3A2015-01-01%20until%3A2016-01-01%20filter%3Aimages&src=typd&lang=en The _tweets_from_api function had to be changed because it would not get the next page of results using the last "data-tweet-id". It would return the same JSON but with a "min_position" string added. Using this string for the "max_position" param from the second page onwards correctly returned the next pages. This change does not interfere with how the other extractors work as far as I know. The 2 regex patterns in the extractors had to be changed to not match the search URL.	5 years ago
Mike Fährmann	1693d97bd3	update extractor class hierarchies - let the GalleryExtractor class inherit directly from Extractor - make ChapterExtractor a subclass of GalleryExtractor - change enumeration field names of GalleryExtractors to 'num'	5 years ago
Mike Fährmann	7ebd984e8d	[imgur] print error message if no JSON data is found (#446)	5 years ago
Mike Fährmann	5882b00f2f	[imgur] implement login support (#446)	5 years ago
Mike Fährmann	91643ca54b	[nozomi] add search extractor (#388)	5 years ago
Mike Fährmann	df2b3c6888	restore OAuth2 authentication error messages	5 years ago
Mike Fährmann	6779512fc7	[nozomi] add post and tag extractors (#388)	5 years ago
Mike Fährmann	6abe5f5bbb	[patreon] fix pagination (#444) The Patreon-provided URLs for the next set of posts aren't always complete, i.e. they can be missing their scheme and the subsequent double slash: "www.patreon.com/…"	5 years ago
Mike Fährmann	ff1e4a86aa	release version 1.10.6	5 years ago
Mike Fährmann	d4ffd6c952	[yaplog] improve metadata extraction (#443) - provide a fallback if there is no numerical image ID - add a 'filename' field - convert 'date' to an actual datetime object	5 years ago
Mike Fährmann	15af2f8464	[hitomi] fallback to /reader/ page if main page returns 404 Some galleries return a 404: Not Found error when trying to access them through the main gallery URL, but their content is still available on the respective /reader/ page.	5 years ago
Mike Fährmann	8af59a4bba	fix & update docs - update Requests links - add example for --exec - set '-dev' version	5 years ago
Mike Fährmann	dc6ad81e2e	[yaplog] prevent crash on empty posts (#443)	5 years ago
Mike Fährmann	94eb7c6cad	[deviantart] fix sta.sh extraction (436)	5 years ago
Mike Fährmann	1032cfa34b	[downloader:http] extend mimetype map with archive formats	5 years ago
Mike Fährmann	27b5b2497e	[deviantart] fix download URLs (#436) ... except for sta.sh content. Instead of using the old '/api/v1/oauth2/deviation/download' endpoint, which started delivering URLs to 404 pages a while ago, it is also possible to get a download URL from the relatively new '/_napi/da-browse/shared_api/deviation/extended_fetch' endpoint used by DeviantArt's Eclipse interface. The current strategy is therefore: - Iterate over deviations using the OAuth2 API - Fetch original download URLs with the new NAPI/Shared API	5 years ago
Mike Fährmann	93aac8dfea	[yaplog] fix incomplete image URLs (#443)	5 years ago
Mike Fährmann	a782b009b8	[yaplog] match blog names with '-' (#443)	5 years ago
Mike Fährmann	cf5e716b9d	[hitomi] fix image URLs	5 years ago
Mike Fährmann	ad81c07204	[postprocessor] match logger names of downloader modules The logger name for a postprocessor object got changed to "postprocessor.<module-name>" instead of just "postprocessor"	5 years ago
Mike Fährmann	03bc8adfc7	[postprocessor:exec] run after file moved to target location (#421)	5 years ago
Mike Fährmann	35958bebd4	[postprocessor:exec] fix filename quoting on Windows (#421)	5 years ago
Mike Fährmann	b06c372e4d	[postprocessor:exec] improve; add command-line option (#421)	5 years ago
Mike Fährmann	5a54efa025	[xhamster] unescape 'title' and 'description'	5 years ago
Mike Fährmann	1b9bf4fc6e	[behance] fix 'tags' extraction	5 years ago
Mike Fährmann	bb97e87989	[komikcast] ignore banner image	5 years ago
Mike Fährmann	0ff90a3f7d	[gfycat] include title in default filenames (closes #434)	5 years ago
Mike Fährmann	fabdc3b0c6	release version 1.10.5	5 years ago
Mike Fährmann	de4e2029d1	[nsfwalbum] update test album the old one is no longer available	5 years ago
Mike Fährmann	1faec285d1	[nijie] further improvements (closes #423) - provide a 'user_name' metadata field - usually the same as 'artist_id', except for favorite downloads - extract the whole description text and properly escape HTML entities - fixed an issue with titles or tags containing double quotes	5 years ago
Mike Fährmann	6d0a533d68	[reddit] respect 'comments:0' for single submissions (#429)	5 years ago
Mike Fährmann	803d8f814e	[oauth] update scope for reddit tokens (#428) '/user/<username>/...' requires the 'history' scope to be accessible (https://www.reddit.com/dev/api/#GET_user_{username}_{where})	5 years ago
Mike Fährmann	46ba173ded	[reddit] fix documentation inconsistencies (closes #429) - Require 'reddit.comments' to be a number and convert it to an integer to be extra sure - Link to the README's OAuth section where appropriate	5 years ago
Mike Fährmann	20eb6c401f	[nijie] improvements and fixes (#423) - ignore unavailable image pages - more metadata fields: artist_name, date, tags - rename 'index' to 'num' - improved code structure	5 years ago
Mike Fährmann	d1ea08c67d	[weibo] fixes and improvements - ignore unavailable videos (fixes #427) - handle empty 'geo' fields - consistent metadata fields for images and videos	5 years ago
Mike Fährmann	38d97f3da6	[deviantart] add debug message about API credentials (#424)	5 years ago
Mike Fährmann	80c2104fb5	[deviantart] fix 429 handling if 'fatal' is False (closes #424)	5 years ago
Mike Fährmann	913460240d	[reddit] fix 'extractor.blacklist()' arguments The second argument must support 'append()'.	5 years ago
Mike Fährmann	22bac14452	[pixiv] match '/artworks/' URLs	5 years ago
Mike Fährmann	66cac207ac	[twitter] match and use 'i/web' status URLs	5 years ago
Mike Fährmann	946f2751e2	[reddit] add 'user' extractor (closes #350)	5 years ago
Mike Fährmann	c14abb9fb8	[reddit] improve URL parameter handling for subreddit links	5 years ago
Mike Fährmann	ee8b654464	[instagram] implement 'highlights' option (closes #329)	5 years ago
Mike Fährmann	f63c3097a9	[instagram] rework some code paths - combine fetching an HTML page and extracting its 'shared_data' - move 'shared_data' and field access info out of '_extract_page()' - introduce a '_request_graphql()' method	5 years ago
Mike Fährmann	4330133114	[imgur] add 'favorite' extractor (closes #420) … and use a newer site-internal API endpoint for user posts	5 years ago
Mike Fährmann	ee5e20221f	[imgth] fix image URLs	5 years ago
Mike Fährmann	b63b126808	[hentaicafe] extend URL pattern	5 years ago
Mike Fährmann	d780f0357e	[imgur] add user extractor	5 years ago
Mike Fährmann	11ea689013	[simplyhentai] fix image and video URLs	5 years ago
Mike Fährmann	15632a1570	[tsumino] fix extraction	5 years ago
Mike Fährmann	d92802fd37	[luscious] fix detection of unavailable galleries	5 years ago
Mike Fährmann	f99da2b866	[imgbb] detect invalid album and user profile links and update test results, since the old album got deleted	5 years ago
Mike Fährmann	01bc7adadc	[deviantart] improve journal detection (#419) Some journal-like posts are not reported to be journals (isJournal is set to False), even though they have a textContent field. https://www.deviantart.com/gliitchlord/art/brashstrokes-812942668	5 years ago
Mike Fährmann	776e9e073f	close archive on job completion (#417)	5 years ago
Mike Fährmann	5ac9732adc	call 'sys.exit()' on Ctrl+c	5 years ago
Mike Fährmann	9178b54eae	handle errors when opening download archive file (#417)	5 years ago
Mike Fährmann	6e12907de6	[deviantart] improve handling of private deviations (#414) - don't try to call '/deviation/metadata' with an empty list of deviation ids - print a warning when detecting private deviations without having a 'refresh-token'	5 years ago
Mike Fährmann	4203931d79	release version 1.10.4	5 years ago
Mike Fährmann	e7690ac694	[vsco] update URL pattern (closes #410)	5 years ago
Mike Fährmann	1848788970	update test results etc	5 years ago
Mike Fährmann	d5fbb2d9de	[tumblr] ignore audio links from Spotify etc.	5 years ago
Mike Fährmann	b1cddce865	Revert "[simplyhentai] fix extraction; remove image+video extractors" This reverts commit `d1db5180ab`.	5 years ago
Mike Fährmann	d23660c04d	[hentaicafe] restore default 'request()' behavior	5 years ago
Mike Fährmann	9ae58a6b3e	[exhentai] update image limit checks - adjust cost of original images - delay limit initialization until gallery and first image page have been requested and all cookies are available	5 years ago
Mike Fährmann	6fe9a134bf	[lineblog] add blog and post extractors (closes #404)	5 years ago
Mike Fährmann	4e8a548a61	[livedoor] update metadata extraction	5 years ago
Mike Fährmann	f9285f99e6	[pixiv] fix authentication	5 years ago
Mike Fährmann	6f3df3999a	[fuskator] add gallery and search extractor (closes #407)	5 years ago
Mike Fährmann	bc0ca66c99	[twitter] small improvements - handle reply tweets (#403) - unset cookies in Tweet extractor to "force" the legacy interface	5 years ago
Mike Fährmann	682105b8ee	prevent crash when loading unavailable downloader (#405)	5 years ago
Mike Fährmann	5fcebb69c2	[postprocessor:ugoira] improve error messages (#406)	5 years ago
Mike Fährmann	f02a768b5c	[danbooru] add 'ugoira' option (#406) to choose between ZIP archives or converted video files for Ugoira posts	5 years ago
Mike Fährmann	9646ccb320	release version 1.10.3	5 years ago
Mike Fährmann	dedea3b4db	[deviantart] fix journal creation (#400)	5 years ago
Mike Fährmann	c6c5cb1898	improve 'deviantart.quality' description	5 years ago
Mike Fährmann	8eaae58045	[downloader:http] change log message level to 'debug'	5 years ago
Mike Fährmann	efb64ad031	[deviantart] generate filenames (#392, #400)	5 years ago
Mike Fährmann	0ce98169b8	improve path generation - fix 'abspath()' results for Python <3.7 (closes #402) - 'abspath()' in Python 3.7+ removes trailing path separators - in Python <3.7 it doesn't - filter empty path segments	5 years ago
Mike Fährmann	b2151f3928	[seiga] support mobile URLs (closes #401)	5 years ago
Mike Fährmann	20fd2d8450	[flickr] skip unavailable images/videos (fixes #398)	5 years ago
Mike Fährmann	60c8e090da	[postprocessor:zip] fix archive names (closes #397) Remove the trailing path separator introduced in `3284c62` before adding the archive's filename extension. [ci skip]	5 years ago
Mike Fährmann	7c09545f70	[downloader:ytdl] add 'outtmpl' option (#395)	5 years ago
Mike Fährmann	5cc7be2536	[piczel] update and improve - use proper pagination (fixes #396) - update API host and endpoints - "fix" double slash // in image URLs	5 years ago
Mike Fährmann	0c1c7abb4d	release version 1.10.2	5 years ago
Mike Fährmann	49f6d7176d	[deviantart] restore filenames (#392) <title>_by_<user>_<id> --> <title>_by_<user>-<id>	5 years ago
Mike Fährmann	63daa68d67	[deviantart] improvements (#392) - consistent 'filename' entries, at least as far as possible - GIFs and SWFs don't have a <title>_by_<artist>_<id> anywhere in their metadata - Generating <id> (from 'deviationid'?) might be something that needs to be figured out, so we can build those filenames ourselves - better code structure etc. - tests for videos, archives, and flash animations	5 years ago
Mike Fährmann	d1db5180ab	[simplyhentai] fix extraction; remove image+video extractors	5 years ago
Mike Fährmann	30d6e284b0	[deviantart] use NAPI for artworks and scraps (#392) TODO: - journal downloads - test for all media types	5 years ago
Mike Fährmann	7d6af936c5	[imgur] simplify gallery extraction	5 years ago
Mike Fährmann	3284c62f22	ensure PathFormat.directory ends with a path separator ... plus some other small optimizations	5 years ago
Mike Fährmann	ebabc5caf1	[downloader:http] treat 416 without downloaded data as error Downloading https://pbs.twimg.com/media/EB2cGUYX4AI2Vuu.jpg:orig (NSFW) sometimes returns a 416 status code, even though no 'Range' header was sent and no data was downloaded prior. This code usually means a file has already been downloaded completely and the download method indicates success, but in this case it causes an exception down the pipeline since no file was created.	5 years ago
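
Commit ebabc5caf1 describes a rule for HTTP 416 responses: the status normally means the file is already complete, but it should count as an error when nothing was downloaded beforehand. The following is a minimal sketch of that decision only. `classify_416` and its return values are hypothetical names chosen for illustration and are not taken from gallery-dl's downloader code.

```python
def classify_416(bytes_already_downloaded: int) -> str:
    """Decide how to treat an HTTP 416 (Range Not Satisfiable) response.

    Hypothetical helper for illustration; not part of gallery-dl.
    """
    if bytes_already_downloaded > 0:
        # Some data exists locally and the server rejects the requested
        # range: the file has most likely been downloaded completely.
        return "complete"
    # Nothing was downloaded, so no file exists on disk. Reporting
    # success here would break later steps that expect a file.
    return "error"


if __name__ == "__main__":
    print(classify_416(0))
    print(classify_416(2048))
```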

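Commit 6abe5f5bbb notes that Patreon's "next page" URLs can be missing their scheme and the following double slash. One way to repair such URLs might look like the sketch below. `ensure_scheme` is a hypothetical helper, the default of `https` is an assumption, and the example URL is made up; the actual fix in gallery-dl may differ.

```python
def ensure_scheme(url: str, scheme: str = "https") -> str:
    """Return the URL with a scheme, adding one if it is missing.

    Hypothetical helper for illustration; not part of gallery-dl.
    """
    if url.startswith(("http://", "https://")):
        # Already complete, return unchanged.
        return url
    if url.startswith("//"):
        # Protocol-relative URL: only the scheme itself is missing.
        return scheme + ":" + url
    # Scheme and double slash are both missing.
    return scheme + "://" + url


if __name__ == "__main__":
    # Made-up example URL, not a real API endpoint.
    print(ensure_scheme("www.example.org/posts?page=2"))
```
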

1893 Commits (b1f0609de5b6d9c54edca60e06bd83e10912f31f)
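
The message of commit bcddcca6db gives an example Twitter search query and the URL it produces. As a rough sketch, such a URL can be assembled with Python's standard library. `twitter_search_url` is a hypothetical helper; the parameter names `f`, `q`, `src`, and `lang` are copied from the example URL in that commit message, and whether the site still accepts them is not something this listing confirms.

```python
from urllib.parse import quote, urlencode


def twitter_search_url(user: str, since: str, until: str) -> str:
    """Build a search URL for a user's image tweets in a date range.

    Hypothetical helper for illustration; not part of gallery-dl.
    """
    query = f"from:{user} since:{since} until:{until} filter:images"
    params = {"f": "tweets", "q": query, "src": "typd", "lang": "en"}
    # quote_via=quote encodes spaces as %20; the default (quote_plus)
    # would encode them as '+'.
    return "https://twitter.com/search?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(twitter_search_url("supernaturepics", "2015-01-01", "2016-01-01"))
```

Splitting a long timeline into several date ranges and building one URL per range follows the workaround the commit message describes for the 3,200-tweet limit.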