[blogger] Fix lh*.googleusercontent.com forward slash bug, add support for lh*-**.googleusercontent.com

Some image URLs use the "lh(number)-(locale).googleusercontent.com" format, so I added support for those.

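Also, "lh(number).googleusercontent.com" URLs were not being matched: that alternative in the regex already ended in a forward slash and the group is followed by another one, so the pattern required two consecutive slashes after the hostname.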

Examples:
lh7.googleusercontent.com
lh7-us.googleusercontent.com
pull/5091/head
Wiiplay123 8 months ago committed by GitHub
parent 6f8592eaff
commit a6fed628dd
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

@@ -37,7 +37,8 @@ class BloggerExtractor(BaseExtractor):
     findall_image = re.compile(
         r'src="(https?://(?:'
         r'blogger\.googleusercontent\.com/img|'
-        r'lh\d+\.googleusercontent\.com/|'
+        r'lh\d+\.googleusercontent\.com|'
+        r'lh\d+-\w+\.googleusercontent\.com|'
         r'\d+\.bp\.blogspot\.com)/[^"]+)').findall
     findall_video = re.compile(
         r'src="(https?://www\.blogger\.com/video\.g\?token=[^"]+)').findall
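
A quick way to see the effect of this change is to run the old and new patterns against sample markup. The sketch below is only an illustration: the two regexes are copied from the diff, but the `page` string and its image paths are made up, and the variable names are not part of the extractor.

```python
import re

# Pattern before the change: the lh* alternative ends in "/" and the group
# is followed by another "/", so it only matches hostnames followed by "//".
old_findall = re.compile(
    r'src="(https?://(?:'
    r'blogger\.googleusercontent\.com/img|'
    r'lh\d+\.googleusercontent\.com/|'
    r'\d+\.bp\.blogspot\.com)/[^"]+)').findall

# Pattern after the change: the trailing "/" is dropped from the lh*
# alternative, and an lh(number)-(locale) alternative is added.
new_findall = re.compile(
    r'src="(https?://(?:'
    r'blogger\.googleusercontent\.com/img|'
    r'lh\d+\.googleusercontent\.com|'
    r'lh\d+-\w+\.googleusercontent\.com|'
    r'\d+\.bp\.blogspot\.com)/[^"]+)').findall

# Hypothetical markup; the paths after the hostnames are invented.
page = (
    '<img src="https://lh7.googleusercontent.com/abc123=s1600">'
    '<img src="https://lh7-us.googleusercontent.com/def456=s1600">'
)

old_matches = old_findall(page)
new_matches = new_findall(page)

print(old_matches)  # neither sample URL is matched by the old pattern
print(new_matches)  # both sample URLs are matched by the new pattern
```

With the old pattern, a URL would only have matched if the hostname were followed by two slashes, which is the bug described in the commit message.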
