[ytsearch] Fix extraction (closes #26920)

[afreecatv] Fix typo (#26970)
[23video] Relax _VALID_URL (#26870)
2020-10-23 21:31:37 +07:00 · 2020-10-22 19:15:05 +07:00 · 2020-10-20 00:56:23 +07:00 · 2020-10-18 00:10:41 +07:00 · 2020-10-17 23:14:46 +07:00 · 2020-10-17 23:02:17 +07:00
33 changed files with 552 additions and 180 deletions
--- a/.github/ISSUE_TEMPLATE/1_broken_site.md
+++ b/.github/ISSUE_TEMPLATE/1_broken_site.md
@ -18,7 +18,7 @@ title: ''
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.06. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
 - Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -26,7 +26,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->
 - [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **2020.09.06**
+- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
 - [ ] I've searched the bugtracker for similar issues including closed ones
@ -41,7 +41,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
- [debug] youtube-dl version 2020.09.06
+ [debug] youtube-dl version 2020.09.20
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/.github/ISSUE_TEMPLATE/2_site_support_request.md
+++ b/.github/ISSUE_TEMPLATE/2_site_support_request.md
@ -19,7 +19,7 @@ labels: 'site-support-request'
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.06. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
 - Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->
 - [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **2020.09.06**
+- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that none of provided URLs violate any copyrights
 - [ ] I've searched the bugtracker for similar site support requests including closed ones
--- a/.github/ISSUE_TEMPLATE/3_site_feature_request.md
+++ b/.github/ISSUE_TEMPLATE/3_site_feature_request.md
@ -18,13 +18,13 @@ title: ''
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.06. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
 - Finally, put x into all relevant boxes (like this [x])
 -->
 - [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **2020.09.06**
+- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
 - [ ] I've searched the bugtracker for similar site feature requests including closed ones
--- a/.github/ISSUE_TEMPLATE/4_bug_report.md
+++ b/.github/ISSUE_TEMPLATE/4_bug_report.md
@ -18,7 +18,7 @@ title: ''
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.06. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
 - Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->
 - [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **2020.09.06**
+- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
 - [ ] I've searched the bugtracker for similar bug reports including closed ones
@ -43,7 +43,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
- [debug] youtube-dl version 2020.09.06
+ [debug] youtube-dl version 2020.09.20
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/.github/ISSUE_TEMPLATE/5_feature_request.md
+++ b/.github/ISSUE_TEMPLATE/5_feature_request.md
@ -19,13 +19,13 @@ labels: 'request'
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.06. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
 - Finally, put x into all relevant boxes (like this [x])
 -->
 - [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **2020.09.06**
+- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
 - [ ] I've searched the bugtracker for similar feature requests including closed ones
--- a/ChangeLog
+++ b/ChangeLog
@ -1,3 +1,42 @@
 version 2020.09.20
 Core
 * [extractor/common] Relax interaction count extraction in _json_ld
 + [extractor/common] Extract author as uploader for VideoObject in _json_ld
 * [downloader/hls] Fix incorrect end byte in Range HTTP header for
  media segments with EXT-X-BYTERANGE (#14748, #24512)
 * [extractor/common] Handle ssl.CertificateError in _request_webpage (#26601)
 * [downloader/http] Improve timeout detection when reading block of data
  (#10935)
 * [downloader/http] Retry download when urlopen times out (#10935, #26603)
 Extractors
 * [redtube] Extend URL regular expression (#26506)
 * [twitch] Refactor
 * [twitch:stream] Switch to GraphQL and fix reruns (#26535)
 + [telequebec] Add support for brightcove videos (#25833)
 * [pornhub] Extract metadata from JSON-LD (#26614)
 * [pornhub] Fix view count extraction (#26621, #26614)
 version 2020.09.14
 Core
 + [postprocessor/embedthumbnail] Add support for non jpg/png thumbnails
  (#25687, #25717)
 Extractors
 * [rtlnl] Extend URL regular expression (#26549, #25821)
 * [youtube] Fix empty description extraction (#26575, #26006)
 * [srgssr] Extend URL regular expression (#26555, #26556, #26578)
 * [googledrive] Use redirect URLs for source format (#18877, #23919, #24689,
  #26565)
 * [svtplay] Fix id extraction (#26576)
 * [redbulltv] Improve support for rebull.com TV localized URLs (#22063)
 + [redbulltv] Add support for new redbull.com TV URLs (#22037, #22063)
 * [soundcloud:pagedplaylist] Reduce pagination limit (#26557)
 version 2020.09.06
 Core
--- a/README.md
+++ b/README.md
@ -545,7 +545,7 @@ The basic usage is not to set any template arguments when downloading a single f
 - `extractor` (string): Name of the extractor
 - `extractor_key` (string): Key name of the extractor
 - `epoch` (numeric): Unix epoch when creating the file
- - `autonumber` (numeric): Five-digit number that will be increased with each download, starting at zero
+ - `autonumber` (numeric): Number that will be increased with each download, starting at `--autonumber-start`
 - `playlist` (string): Name or id of the playlist that contains the video
 - `playlist_index` (numeric): Index of the video in the playlist padded with leading zeros according to the total length of the playlist
 - `playlist_id` (string): Playlist identifier
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@ -717,6 +717,8 @@
 - **RayWenderlichCourse**
 - **RBMARadio**
 - **RDS**: RDS.ca
 - **RedBull**
 - **RedBullEmbed**
 - **RedBullTV**
 - **RedBullTVRrnContent**
 - **Reddit**
--- a/test/test_utils.py
+++ b/test/test_utils.py
@ -994,6 +994,12 @@ class TestUtil(unittest.TestCase):
        on = js_to_json('{42:4.2e1}')
        self.assertEqual(json.loads(on), {'42': 42.0})
        on = js_to_json('{ "0x40": "0x40" }')
        self.assertEqual(json.loads(on), {'0x40': '0x40'})
        on = js_to_json('{ "040": "040" }')
        self.assertEqual(json.loads(on), {'040': '040'})
    def test_js_to_json_malformed(self):
        self.assertEqual(js_to_json('42a1'), '42"a1"')
        self.assertEqual(js_to_json('42a-1'), '42"a"-1')
--- a/youtube_dl/downloader/hls.py
+++ b/youtube_dl/downloader/hls.py
@ -141,7 +141,7 @@ class HlsFD(FragmentFD):
                    count = 0
                    headers = info_dict.get('http_headers', {})
                    if byte_range:
-                        headers['Range'] = 'bytes=%d-%d' % (byte_range['start'], byte_range['end'])
+                        headers['Range'] = 'bytes=%d-%d' % (byte_range['start'], byte_range['end'] - 1)
                    while count <= fragment_retries:
                        try:
                            success, frag_content = self._download_fragment(
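The one-character fix above follows from how HTTP byte ranges work: both ends of a `Range` header are inclusive, so a segment of length `n` at offset `o` covers bytes `o` through `o + n - 1`. A minimal sketch, assuming (as the patch implies) that `byte_range['end']` is an exclusive end offset; `range_header` is a hypothetical helper, not part of youtube-dl:

```python
def range_header(start, end_exclusive):
    """Build an HTTP Range header value for the half-open span [start, end_exclusive)."""
    if end_exclusive <= start:
        raise ValueError('empty or negative byte range')
    # HTTP byte ranges are inclusive on both ends, hence the -1.
    return 'bytes=%d-%d' % (start, end_exclusive - 1)


# A 100-byte segment at offset 0, followed by a 50-byte segment.
first = range_header(0, 100)
second = range_header(100, 150)
```

Without the `- 1`, each request would fetch one extra byte that belongs to the following segment.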
--- a/youtube_dl/downloader/http.py
+++ b/youtube_dl/downloader/http.py
@ -106,7 +106,12 @@ class HttpFD(FileDownloader):
                set_range(request, range_start, range_end)
            # Establish connection
            try:
-                ctx.data = self.ydl.urlopen(request)
+                try:
+                    ctx.data = self.ydl.urlopen(request)
+                except (compat_urllib_error.URLError, ) as err:
+                    if isinstance(err.reason, socket.timeout):
+                        raise RetryDownload(err)
+                    raise err
                # When trying to resume, Content-Range HTTP header of response has to be checked
                # to match the value of requested Range HTTP header. This is due to a webservers
                # that don't support resuming and serve a whole file with no Content-Range
@ -218,9 +223,10 @@ class HttpFD(FileDownloader):
            def retry(e):
                to_stdout = ctx.tmpfilename == '-'
-                if not to_stdout:
-                    ctx.stream.close()
-                ctx.stream = None
+                if ctx.stream is not None:
+                    if not to_stdout:
+                        ctx.stream.close()
+                    ctx.stream = None
                ctx.resume_len = byte_counter if to_stdout else os.path.getsize(encodeFilename(ctx.tmpfilename))
                raise RetryDownload(e)
@ -233,9 +239,11 @@ class HttpFD(FileDownloader):
                except socket.timeout as e:
                    retry(e)
                except socket.error as e:
-                    if e.errno not in (errno.ECONNRESET, errno.ETIMEDOUT):
-                        raise
-                    retry(e)
+                    # SSLError on python 2 (inherits socket.error) may have
+                    # no errno set but this error message
+                    if e.errno in (errno.ECONNRESET, errno.ETIMEDOUT) or getattr(e, 'message', None) == 'The read operation timed out':
+                        retry(e)
+                    raise
                byte_counter += len(data_block)
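The retry decision in this hunk can be read as a small predicate: retry on an explicit timeout, on a known transient `errno`, or on a timeout that is only identifiable by its message. The sketch below is a Python 3 adaptation under stated assumptions: `is_retryable` is a hypothetical name, and it checks `str(err)` rather than the Python 2 `message` attribute used in the patch.

```python
import errno
import socket

RETRYABLE_ERRNOS = (errno.ECONNRESET, errno.ETIMEDOUT)
READ_TIMEOUT_MESSAGE = 'The read operation timed out'


def is_retryable(err):
    """Return True if a socket-level error looks transient."""
    if isinstance(err, socket.timeout):
        return True
    if getattr(err, 'errno', None) in RETRYABLE_ERRNOS:
        return True
    # Fallback for errors that carry no errno, only a message.
    return READ_TIMEOUT_MESSAGE in str(err)
```

Anything the predicate rejects should be re-raised unchanged, which is what the bare `raise` at the end of the patched handler does.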
--- a/youtube_dl/extractor/afreecatv.py
+++ b/youtube_dl/extractor/afreecatv.py
@ -275,7 +275,7 @@ class AfreecaTVIE(InfoExtractor):
        video_element = video_xml.findall(compat_xpath('./track/video'))[-1]
        if video_element is None or video_element.text is None:
            raise ExtractorError(
-                'Video %s video does not exist' % video_id, expected=True)
+                'Video %s does not exist' % video_id, expected=True)
        video_url = video_element.text.strip()
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@ -10,6 +10,7 @@ import os
 import random
 import re
 import socket
+import ssl
 import sys
 import time
 import math
@ -67,6 +68,7 @@ from ..utils import (
    sanitized_Request,
    sanitize_filename,
    str_or_none,
+    str_to_int,
    strip_or_none,
    unescapeHTML,
    unified_strdate,
@ -623,9 +625,12 @@ class InfoExtractor(object):
                url_or_request = update_url_query(url_or_request, query)
            if data is not None or headers:
                url_or_request = sanitized_Request(url_or_request, data, headers)
+        exceptions = [compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error]
+        if hasattr(ssl, 'CertificateError'):
+            exceptions.append(ssl.CertificateError)
        try:
            return self._downloader.urlopen(url_or_request)
-        except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
+        except tuple(exceptions) as err:
            if isinstance(err, compat_urllib_error.HTTPError):
                if self.__can_accept_status_code(err, expected_status):
                    # Retain reference to error to prevent file object from
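The pattern in this hunk, building the list of exception classes at runtime and converting it to a tuple, is needed because `except` accepts a class or a tuple of classes but not a list. A standalone sketch using Python 3 standard library names in place of the `compat_*` aliases; `network_exceptions` and `guarded` are hypothetical helpers:

```python
import http.client
import socket
import ssl
import urllib.error


def network_exceptions():
    """Collect the exception classes to treat as network failures."""
    excs = [urllib.error.URLError, http.client.HTTPException, socket.error]
    # Only reference the class if this ssl module provides it.
    if hasattr(ssl, 'CertificateError'):
        excs.append(ssl.CertificateError)
    return tuple(excs)


NETWORK_ERRORS = network_exceptions()


def guarded(func):
    """Call func, turning network failures into a short description."""
    try:
        return func()
    except NETWORK_ERRORS as err:
        return 'network error: %s' % type(err).__name__
```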
@ -1244,7 +1249,10 @@ class InfoExtractor(object):
                interaction_type = is_e.get('interactionType')
                if not isinstance(interaction_type, compat_str):
                    continue
-                interaction_count = int_or_none(is_e.get('userInteractionCount'))
+                # For interaction count some sites provide string instead of
+                # an integer (as per spec) with non digit characters (e.g. ",")
+                # so extracting count with more relaxed str_to_int
+                interaction_count = str_to_int(is_e.get('userInteractionCount'))
                if interaction_count is None:
                    continue
                count_kind = INTERACTION_TYPE_MAP.get(interaction_type.split('/')[-1])
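To illustrate why a relaxed parser helps here, the sketch below accepts counts such as `"1,234,567"` that a strict integer conversion would reject. `relaxed_int` is a hypothetical stand-in written for this example; it is not youtube-dl's `str_to_int` and may differ from it in detail.

```python
import re


def relaxed_int(value):
    """Parse an integer, tolerating thousands separators in strings."""
    if isinstance(value, int):
        return value
    if not isinstance(value, str):
        return None
    # Drop common separators before checking for plain ASCII digits.
    digits = re.sub(r'[,.\s+]', '', value)
    return int(digits) if re.fullmatch(r'[0-9]+', digits) else None
```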
@ -1264,6 +1272,7 @@ class InfoExtractor(object):
                'thumbnail': url_or_none(e.get('thumbnailUrl') or e.get('thumbnailURL')),
                'duration': parse_duration(e.get('duration')),
                'timestamp': unified_timestamp(e.get('uploadDate')),
+                'uploader': str_or_none(e.get('author')),
                'filesize': float_or_none(e.get('contentSize')),
                'tbr': int_or_none(e.get('bitrate')),
                'width': int_or_none(e.get('width')),
--- a/youtube_dl/extractor/expressen.py
+++ b/youtube_dl/extractor/expressen.py
@ -15,7 +15,7 @@ from ..utils import (
 class ExpressenIE(InfoExtractor):
    _VALID_URL = r'''(?x)
                    https?://
-                        (?:www\.)?expressen\.se/
+                        (?:www\.)?(?:expressen|di)\.se/
                        (?:(?:tvspelare/video|videoplayer/embed)/)?
                        tv/(?:[^/]+/)*
                        (?P<id>[^/?#&]+)
@ -42,13 +42,16 @@ class ExpressenIE(InfoExtractor):
    }, {
        'url': 'https://www.expressen.se/videoplayer/embed/tv/ditv/ekonomistudion/experterna-har-ar-fragorna-som-avgor-valet/?embed=true&external=true&autoplay=true&startVolume=0&partnerId=di',
        'only_matching': True,
    }, {
        'url': 'https://www.di.se/videoplayer/embed/tv/ditv/borsmorgon/implantica-rusar-70--under-borspremiaren-hor-styrelsemedlemmen/?embed=true&external=true&autoplay=true&startVolume=0&partnerId=di',
        'only_matching': True,
    }]
    @staticmethod
    def _extract_urls(webpage):
        return [
            mobj.group('url') for mobj in re.finditer(
-                r'<iframe[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//(?:www\.)?expressen\.se/(?:tvspelare/video|videoplayer/embed)/tv/.+?)\1',
+                r'<iframe[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//(?:www\.)?(?:expressen|di)\.se/(?:tvspelare/video|videoplayer/embed)/tv/.+?)\1',
                webpage)]
    def _real_extract(self, url):
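The effect of widening the host alternation can be checked directly with `re`. The pattern below copies the `_VALID_URL` lines visible in the hunk and closes the string at that point, which is an assumption made for this sketch; the full pattern in the extractor may continue.

```python
import re

VALID_URL = r'''(?x)
    https?://
        (?:www\.)?(?:expressen|di)\.se/
        (?:(?:tvspelare/video|videoplayer/embed)/)?
        tv/(?:[^/]+/)*
        (?P<id>[^/?#&]+)
'''

di_url = ('https://www.di.se/videoplayer/embed/tv/ditv/borsmorgon/'
          'implantica-rusar-70--under-borspremiaren-hor-styrelsemedlemmen/'
          '?embed=true&external=true')
di_match = re.match(VALID_URL, di_url)
other_match = re.match(VALID_URL, 'https://www.example.se/tv/some-clip')
```

Here `di_match.group('id')` is the last path segment of the URL, and `other_match` is `None` because the host is neither `expressen.se` nor `di.se`.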
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@ -918,7 +918,9 @@ from .rbmaradio import RBMARadioIE
 from .rds import RDSIE
 from .redbulltv import (
    RedBullTVIE,
    RedBullEmbedIE,
    RedBullTVRrnContentIE,
    RedBullIE,
 )
 from .reddit import (
    RedditIE,
--- a/youtube_dl/extractor/googledrive.py
+++ b/youtube_dl/extractor/googledrive.py
@ -220,19 +220,27 @@ class GoogleDriveIE(InfoExtractor):
                'id': video_id,
                'export': 'download',
            })
-        urlh = self._request_webpage(
+
-            source_url, video_id, note='Requesting source file',
+        def request_source_file(source_url, kind):
-            errnote='Unable to request source file', fatal=False)
+            return self._request_webpage(
+                source_url, video_id, note='Requesting %s file' % kind,
+                errnote='Unable to request %s file' % kind, fatal=False)
+        urlh = request_source_file(source_url, 'source')
        if urlh:
-            def add_source_format(src_url):
+            def add_source_format(urlh):
                formats.append({
-                    'url': src_url,
+                    # Use redirect URLs as download URLs in order to calculate
+                    # correct cookies in _calc_cookies.
+                    # Using original URLs may result in redirect loop due to
+                    # google.com's cookies mistakenly used for googleusercontent.com
+                    # redirect URLs (see #23919).
+                    'url': urlh.geturl(),
                    'ext': determine_ext(title, 'mp4').lower(),
                    'format_id': 'source',
                    'quality': 1,
                })
            if urlh.headers.get('Content-Disposition'):
-                add_source_format(source_url)
+                add_source_format(urlh)
            else:
                confirmation_webpage = self._webpage_read_content(
                    urlh, url, video_id, note='Downloading confirmation page',
@ -242,9 +250,12 @@ class GoogleDriveIE(InfoExtractor):
                        r'confirm=([^&"\']+)', confirmation_webpage,
                        'confirmation code', fatal=False)
                    if confirm:
-                        add_source_format(update_url_query(source_url, {
+                        confirmed_source_url = update_url_query(source_url, {
                            'confirm': confirm,
-                        }))
+                        })
+                        urlh = request_source_file(confirmed_source_url, 'confirmed source')
+                        if urlh and urlh.headers.get('Content-Disposition'):
+                            add_source_format(urlh)
        if not formats:
            reason = self._search_regex(
--- a/youtube_dl/extractor/iprima.py
+++ b/youtube_dl/extractor/iprima.py
@ -86,7 +86,8 @@ class IPrimaIE(InfoExtractor):
            (r'<iframe[^>]+\bsrc=["\'](?:https?:)?//(?:api\.play-backend\.iprima\.cz/prehravac/embedded|prima\.iprima\.cz/[^/]+/[^/]+)\?.*?\bid=(p\d+)',
             r'data-product="([^"]+)">',
             r'id=["\']player-(p\d+)"',
-             r'playerId\s*:\s*["\']player-(p\d+)'),
+             r'playerId\s*:\s*["\']player-(p\d+)',
+             r'\bvideos\s*=\s*["\'](p\d+)'),
            webpage, 'real id')
        playerpage = self._download_webpage(
--- a/youtube_dl/extractor/iqiyi.py
+++ b/youtube_dl/extractor/iqiyi.py
@ -150,7 +150,7 @@ class IqiyiSDKInterpreter(object):
            elif function in other_functions:
                other_functions[function]()
            else:
-                raise ExtractorError('Unknown funcion %s' % function)
+                raise ExtractorError('Unknown function %s' % function)
        return sdk.target
--- a/youtube_dl/extractor/pornhub.py
+++ b/youtube_dl/extractor/pornhub.py
@ -17,6 +17,7 @@ from ..utils import (
    determine_ext,
    ExtractorError,
    int_or_none,
    merge_dicts,
    NO_DEFAULT,
    orderedSet,
    remove_quotes,
@ -59,13 +60,14 @@ class PornHubIE(PornHubBaseIE):
                    '''
    _TESTS = [{
        'url': 'http://www.pornhub.com/view_video.php?viewkey=648719015',
-        'md5': '1e19b41231a02eba417839222ac9d58e',
+        'md5': 'a6391306d050e4547f62b3f485dd9ba9',
        'info_dict': {
            'id': '648719015',
            'ext': 'mp4',
            'title': 'Seductive Indian beauty strips down and fingers her pink pussy',
            'uploader': 'Babes',
            'upload_date': '20130628',
            'timestamp': 1372447216,
            'duration': 361,
            'view_count': int,
            'like_count': int,
@ -82,8 +84,8 @@ class PornHubIE(PornHubBaseIE):
            'id': '1331683002',
            'ext': 'mp4',
            'title': '重庆婷婷女王足交',
            'uploader': 'Unknown',
            'upload_date': '20150213',
            'timestamp': 1423804862,
            'duration': 1753,
            'view_count': int,
            'like_count': int,
@ -121,6 +123,7 @@ class PornHubIE(PornHubBaseIE):
        'params': {
            'skip_download': True,
        },
        'skip': 'This video has been disabled',
    }, {
        'url': 'http://www.pornhub.com/view_video.php?viewkey=ph557bbb6676d2d',
        'only_matching': True,
@ -338,10 +341,10 @@ class PornHubIE(PornHubBaseIE):
        video_uploader = self._html_search_regex(
            r'(?s)From:&nbsp;.+?<(?:a\b[^>]+\bhref=["\']/(?:(?:user|channel)s|model|pornstar)/|span\b[^>]+\bclass=["\']username)[^>]+>(.+?)<',
-            webpage, 'uploader', fatal=False)
+            webpage, 'uploader', default=None)
        view_count = self._extract_count(
-            r'<span class="count">([\d,\.]+)</span> views', webpage, 'view')
+            r'<span class="count">([\d,\.]+)</span> [Vv]iews', webpage, 'view')
        like_count = self._extract_count(
            r'<span class="votesUp">([\d,\.]+)</span>', webpage, 'like')
        dislike_count = self._extract_count(
@ -356,7 +359,11 @@ class PornHubIE(PornHubBaseIE):
            if div:
                return re.findall(r'<a[^>]+\bhref=[^>]+>([^<]+)', div)
-        return {
+        info = self._search_json_ld(webpage, video_id, default={})
+        # description provided in JSON-LD is irrelevant
+        info['description'] = None
+        return merge_dicts({
            'id': video_id,
            'uploader': video_uploader,
            'upload_date': upload_date,
@ -372,7 +379,7 @@ class PornHubIE(PornHubBaseIE):
            'tags': extract_list('tags'),
            'categories': extract_list('categories'),
            'subtitles': subtitles,
-        }
+        }, info)
 class PornHubPlaylistBaseIE(PornHubBaseIE):
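In the hunk above, the dict scraped from the page is passed first and the JSON-LD `info` second, so JSON-LD only fills fields the page scraping left empty. The sketch below shows that merge order under a simplifying assumption: earlier dicts win and `None` values are skipped. `merge_preferring_first` is a hypothetical helper, not the real `merge_dicts`, whose exact rules may differ.

```python
def merge_preferring_first(*dicts):
    """Merge dicts left to right; the first non-None value for a key wins."""
    merged = {}
    for d in dicts:
        for key, value in d.items():
            if value is None:
                continue
            merged.setdefault(key, value)
    return merged


scraped = {'id': '1', 'uploader': None, 'view_count': 10}
json_ld = {'uploader': 'Babes', 'view_count': 99, 'description': None}
result = merge_preferring_first(scraped, json_ld)
```

Under this assumption, setting `info['description'] = None` before merging keeps the JSON-LD description out of the result, which matches the intent stated in the patch comment.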
--- a/youtube_dl/extractor/redbulltv.py
+++ b/youtube_dl/extractor/redbulltv.py
@ -1,6 +1,8 @@
 # coding: utf-8
 from __future__ import unicode_literals
 import re
 from .common import InfoExtractor
 from ..compat import compat_HTTPError
 from ..utils import (
@ -10,7 +12,7 @@ from ..utils import (
 class RedBullTVIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?redbull(?:\.tv|\.com(?:/[^/]+)?(?:/tv)?)(?:/events/[^/]+)?/(?:videos?|live)/(?P<id>AP-\w+)'
+    _VALID_URL = r'https?://(?:www\.)?redbull(?:\.tv|\.com(?:/[^/]+)?(?:/tv)?)(?:/events/[^/]+)?/(?:videos?|live|(?:film|episode)s)/(?P<id>AP-\w+)'
    _TESTS = [{
        # film
        'url': 'https://www.redbull.tv/video/AP-1Q6XCDTAN1W11',
@ -29,8 +31,8 @@ class RedBullTVIE(InfoExtractor):
            'id': 'AP-1PMHKJFCW1W11',
            'ext': 'mp4',
            'title': 'Grime - Hashtags S2E4',
-            'description': 'md5:b5f522b89b72e1e23216e5018810bb25',
+            'description': 'md5:5546aa612958c08a98faaad4abce484d',
-            'duration': 904.6,
+            'duration': 904,
        },
        'params': {
            'skip_download': True,
@ -44,11 +46,15 @@ class RedBullTVIE(InfoExtractor):
    }, {
        'url': 'https://www.redbull.com/us-en/events/AP-1XV2K61Q51W11/live/AP-1XUJ86FDH1W11',
        'only_matching': True,
    }, {
        'url': 'https://www.redbull.com/int-en/films/AP-1ZSMAW8FH2111',
        'only_matching': True,
    }, {
        'url': 'https://www.redbull.com/int-en/episodes/AP-1TQWK7XE11W11',
        'only_matching': True,
    }]
-    def _real_extract(self, url):
+    def extract_info(self, video_id):
-        video_id = self._match_id(url)
        session = self._download_json(
            'https://api.redbull.tv/v3/session', video_id,
            note='Downloading access token', query={
@ -105,24 +111,119 @@ class RedBullTVIE(InfoExtractor):
            'subtitles': subtitles,
        }
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        return self.extract_info(video_id)
 class RedBullEmbedIE(RedBullTVIE):
    _VALID_URL = r'https?://(?:www\.)?redbull\.com/embed/(?P<id>rrn:content:[^:]+:[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12}:[a-z]{2}-[A-Z]{2,3})'
    _TESTS = [{
        # HLS manifest accessible only using assetId
        'url': 'https://www.redbull.com/embed/rrn:content:episode-videos:f3021f4f-3ed4-51ac-915a-11987126e405:en-INT',
        'only_matching': True,
    }]
    _VIDEO_ESSENSE_TMPL = '''... on %s {
      videoEssence {
        attributes
      }
    }'''
    def _real_extract(self, url):
        rrn_id = self._match_id(url)
        asset_id = self._download_json(
            'https://edge-graphql.crepo-production.redbullaws.com/v1/graphql',
            rrn_id, headers={'API-KEY': 'e90a1ff11335423998b100c929ecc866'},
            query={
                'query': '''{
  resource(id: "%s", enforceGeoBlocking: false) {
    %s
    %s
  }
 }''' % (rrn_id, self._VIDEO_ESSENSE_TMPL % 'LiveVideo', self._VIDEO_ESSENSE_TMPL % 'VideoResource'),
            })['data']['resource']['videoEssence']['attributes']['assetId']
        return self.extract_info(asset_id)
 class RedBullTVRrnContentIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?redbull(?:\.tv|\.com(?:/[^/]+)?(?:/tv)?)/(?:video|live)/rrn:content:[^:]+:(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'
+    _VALID_URL = r'https?://(?:www\.)?redbull\.com/(?P<region>[a-z]{2,3})-(?P<lang>[a-z]{2})/tv/(?:video|live|film)/(?P<id>rrn:content:[^:]+:[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'
    _TESTS = [{
        'url': 'https://www.redbull.com/int-en/tv/video/rrn:content:live-videos:e3e6feb4-e95f-50b7-962a-c70f8fd13c73/mens-dh-finals-fort-william',
        'only_matching': True,
    }, {
        'url': 'https://www.redbull.com/int-en/tv/video/rrn:content:videos:a36a0f36-ff1b-5db8-a69d-ee11a14bf48b/tn-ts-style?playlist=rrn:content:event-profiles:83f05926-5de8-5389-b5e4-9bb312d715e8:extras',
        'only_matching': True,
    }, {
        'url': 'https://www.redbull.com/int-en/tv/film/rrn:content:films:d1f4d00e-4c04-5d19-b510-a805ffa2ab83/follow-me',
        'only_matching': True,
    }]
    def _real_extract(self, url):
-        display_id = self._match_id(url)
-        webpage = self._download_webpage(url, display_id)
-        video_url = self._og_search_url(webpage)
+        region, lang, rrn_id = re.search(self._VALID_URL, url).groups()
+        rrn_id += ':%s-%s' % (lang, region.upper())
+        return self.url_result(
+            'https://www.redbull.com/embed/' + rrn_id,
+            RedBullEmbedIE.ie_key(), rrn_id)
+class RedBullIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?redbull\.com/(?P<region>[a-z]{2,3})-(?P<lang>[a-z]{2})/(?P<type>(?:episode|film|(?:(?:recap|trailer)-)?video)s|live)/(?!AP-|rrn:content:)(?P<id>[^/?#&]+)'
    _TESTS = [{
        'url': 'https://www.redbull.com/int-en/episodes/grime-hashtags-s02-e04',
        'md5': 'db8271a7200d40053a1809ed0dd574ff',
        'info_dict': {
            'id': 'AA-1MT8DQWA91W14',
            'ext': 'mp4',
            'title': 'Grime - Hashtags S2E4',
            'description': 'md5:5546aa612958c08a98faaad4abce484d',
        },
    }, {
        'url': 'https://www.redbull.com/int-en/films/kilimanjaro-mountain-of-greatness',
        'only_matching': True,
    }, {
        'url': 'https://www.redbull.com/int-en/recap-videos/uci-mountain-bike-world-cup-2017-mens-xco-finals-from-vallnord',
        'only_matching': True,
    }, {
        'url': 'https://www.redbull.com/int-en/trailer-videos/kings-of-content',
        'only_matching': True,
    }, {
        'url': 'https://www.redbull.com/int-en/videos/tnts-style-red-bull-dance-your-style-s1-e12',
        'only_matching': True,
    }, {
        'url': 'https://www.redbull.com/int-en/live/mens-dh-finals-fort-william',
        'only_matching': True,
    }, {
        # only available on the int-en website so a fallback is need for the API
        # https://www.redbull.com/v3/api/graphql/v1/v3/query/en-GB>en-INT?filter[uriSlug]=fia-wrc-saturday-recap-estonia&rb3Schema=v1:hero
        'url': 'https://www.redbull.com/gb-en/live/fia-wrc-saturday-recap-estonia',
        'only_matching': True,
    }]
    _INT_FALLBACK_LIST = ['de', 'en', 'es', 'fr']
    _LAT_FALLBACK_MAP = ['ar', 'bo', 'car', 'cl', 'co', 'mx', 'pe']
    def _real_extract(self, url):
        region, lang, filter_type, display_id = re.search(self._VALID_URL, url).groups()
        if filter_type == 'episodes':
            filter_type = 'episode-videos'
        elif filter_type == 'live':
            filter_type = 'live-videos'
        regions = [region.upper()]
        if region != 'int':
            if region in self._LAT_FALLBACK_MAP:
                regions.append('LAT')
            if lang in self._INT_FALLBACK_LIST:
                regions.append('INT')
        locale = '>'.join(['%s-%s' % (lang, reg) for reg in regions])
        rrn_id = self._download_json(
            'https://www.redbull.com/v3/api/graphql/v1/v3/query/' + locale,
            display_id, query={
                'filter[type]': filter_type,
                'filter[uriSlug]': display_id,
                'rb3Schema': 'v1:hero',
            })['data']['id']
        return self.url_result(
-            video_url, ie=RedBullTVIE.ie_key(),
+            'https://www.redbull.com/embed/' + rrn_id,
-            video_id=RedBullTVIE._match_id(video_url))
+            RedBullEmbedIE.ie_key(), rrn_id)
--- a/youtube_dl/extractor/redtube.py
+++ b/youtube_dl/extractor/redtube.py
@ -15,7 +15,7 @@ from ..utils import (
 class RedTubeIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:(?:www\.)?redtube\.com/|embed\.redtube\.com/\?.*?\bid=)(?P<id>[0-9]+)'
+    _VALID_URL = r'https?://(?:(?:\w+\.)?redtube\.com/|embed\.redtube\.com/\?.*?\bid=)(?P<id>[0-9]+)'
    _TESTS = [{
        'url': 'http://www.redtube.com/66418',
        'md5': 'fc08071233725f26b8f014dba9590005',
@ -31,6 +31,9 @@ class RedTubeIE(InfoExtractor):
    }, {
        'url': 'http://embed.redtube.com/?bgcolor=000000&id=1443286',
        'only_matching': True,
    }, {
        'url': 'http://it.redtube.com/66418',
        'only_matching': True,
    }]
    @staticmethod
--- a/youtube_dl/extractor/rtlnl.py
+++ b/youtube_dl/extractor/rtlnl.py
@ -14,12 +14,27 @@ class RtlNlIE(InfoExtractor):
    _VALID_URL = r'''(?x)
        https?://(?:(?:www|static)\.)?
        (?:
-            rtlxl\.nl/[^\#]*\#!/[^/]+/|
+            rtlxl\.nl/(?:[^\#]*\#!|programma)/[^/]+/|
-            rtl\.nl/(?:(?:system/videoplayer/(?:[^/]+/)+(?:video_)?embed\.html|embed)\b.+?\buuid=|video/)
+            rtl\.nl/(?:(?:system/videoplayer/(?:[^/]+/)+(?:video_)?embed\.html|embed)\b.+?\buuid=|video/)|
            embed\.rtl\.nl/\#uuid=
        )
        (?P<id>[0-9a-f-]+)'''
    _TESTS = [{
        # new URL schema
        'url': 'https://www.rtlxl.nl/programma/rtl-nieuws/0bd1384d-d970-3086-98bb-5c104e10c26f',
        'md5': '490428f1187b60d714f34e1f2e3af0b6',
        'info_dict': {
            'id': '0bd1384d-d970-3086-98bb-5c104e10c26f',
            'ext': 'mp4',
            'title': 'RTL Nieuws',
            'description': 'md5:d41d8cd98f00b204e9800998ecf8427e',
            'timestamp': 1593293400,
            'upload_date': '20200627',
            'duration': 661.08,
        },
    }, {
        # old URL schema
        'url': 'http://www.rtlxl.nl/#!/rtl-nieuws-132237/82b1aad1-4a14-3d7b-b554-b0aed1b2c416',
        'md5': '473d1946c1fdd050b2c0161a4b13c373',
        'info_dict': {
@ -31,6 +46,7 @@ class RtlNlIE(InfoExtractor):
            'upload_date': '20160429',
            'duration': 1167.96,
        },
        'skip': '404',
    }, {
        # best format available a3t
        'url': 'http://www.rtl.nl/system/videoplayer/derden/rtlnieuws/video_embed.html#uuid=84ae5571-ac25-4225-ae0c-ef8d9efb2aed/autoplay=false',
@ -76,6 +92,10 @@ class RtlNlIE(InfoExtractor):
    }, {
        'url': 'https://static.rtl.nl/embed/?uuid=1a2970fc-5c0b-43ff-9fdc-927e39e6d1bc&autoplay=false&publicatiepunt=rtlnieuwsnl',
        'only_matching': True,
    }, {
        # new embed URL schema
        'url': 'https://embed.rtl.nl/#uuid=84ae5571-ac25-4225-ae0c-ef8d9efb2aed/autoplay=false',
        'only_matching': True,
    }]
    def _real_extract(self, url):
--- a/youtube_dl/extractor/soundcloud.py
+++ b/youtube_dl/extractor/soundcloud.py
@ -558,8 +558,10 @@ class SoundcloudSetIE(SoundcloudPlaylistBaseIE):
 class SoundcloudPagedPlaylistBaseIE(SoundcloudIE):
    def _extract_playlist(self, base_url, playlist_id, playlist_title):
        # Per the SoundCloud documentation, the maximum limit for a linked partitioning query is 200.
        # https://developers.soundcloud.com/blog/offset-pagination-deprecated
        COMMON_QUERY = {
-            'limit': 80000,
+            'limit': 200,
            'linked_partitioning': '1',
        }
--- a/youtube_dl/extractor/srgssr.py
+++ b/youtube_dl/extractor/srgssr.py
@ -114,7 +114,7 @@ class SRGSSRPlayIE(InfoExtractor):
                            [^/]+/(?P<type>video|audio)/[^?]+|
                            popup(?P<type_2>video|audio)player
                        )
-                        \?id=(?P<id>[0-9a-f\-]{36}|\d+)
+                        \?.*?\b(?:id=|urn=urn:[^:]+:video:)(?P<id>[0-9a-f\-]{36}|\d+)
                    '''
    _TESTS = [{
@ -175,6 +175,12 @@ class SRGSSRPlayIE(InfoExtractor):
    }, {
        'url': 'https://www.srf.ch/play/tv/popupvideoplayer?id=c4dba0ca-e75b-43b2-a34f-f708a4932e01',
        'only_matching': True,
    }, {
        'url': 'https://www.srf.ch/play/tv/10vor10/video/snowden-beantragt-asyl-in-russland?urn=urn:srf:video:28e1a57d-5b76-4399-8ab3-9097f071e6c5',
        'only_matching': True,
    }, {
        'url': 'https://www.rts.ch/play/tv/19h30/video/le-19h30?urn=urn:rts:video:6348260',
        'only_matching': True,
    }]
    def _real_extract(self, url):
--- a/youtube_dl/extractor/svt.py
+++ b/youtube_dl/extractor/svt.py
@ -231,7 +231,9 @@ class SVTPlayIE(SVTPlayBaseIE):
        if not svt_id:
            svt_id = self._search_regex(
                (r'<video[^>]+data-video-id=["\']([\da-zA-Z-]+)',
-                 r'"content"\s*:\s*{.*?"id"\s*:\s*"([\da-zA-Z-]+)"'),
+                 r'["\']videoSvtId["\']\s*:\s*["\']([\da-zA-Z-]+)',
                 r'"content"\s*:\s*{.*?"id"\s*:\s*"([\da-zA-Z-]+)"',
                 r'["\']svtId["\']\s*:\s*["\']([\da-zA-Z-]+)'),
                webpage, 'video id')
        return self._extract_by_video_id(svt_id, webpage)
--- a/youtube_dl/extractor/telequebec.py
+++ b/youtube_dl/extractor/telequebec.py
@ -13,14 +13,24 @@ from ..utils import (
 class TeleQuebecBaseIE(InfoExtractor):
    @staticmethod
-    def _limelight_result(media_id):
+    def _result(url, ie_key):
        return {
            '_type': 'url_transparent',
-            'url': smuggle_url(
-                'limelight:media:' + media_id, {'geo_countries': ['CA']}),
-            'ie_key': 'LimelightMedia',
+            'url': smuggle_url(url, {'geo_countries': ['CA']}),
+            'ie_key': ie_key,
        }
+    @staticmethod
+    def _limelight_result(media_id):
+        return TeleQuebecBaseIE._result(
+            'limelight:media:' + media_id, 'LimelightMedia')
+    @staticmethod
+    def _brightcove_result(brightcove_id):
+        return TeleQuebecBaseIE._result(
+            'http://players.brightcove.net/6150020952001/default_default/index.html?videoId=%s'
+            % brightcove_id, 'BrightcoveNew')
 class TeleQuebecIE(TeleQuebecBaseIE):
    _VALID_URL = r'''(?x)
@ -37,11 +47,27 @@ class TeleQuebecIE(TeleQuebecBaseIE):
            'id': '577116881b4b439084e6b1cf4ef8b1b3',
            'ext': 'mp4',
            'title': 'Un petit choc et puis repart!',
-            'description': 'md5:b04a7e6b3f74e32d7b294cffe8658374',
+            'description': 'md5:067bc84bd6afecad85e69d1000730907',
        },
        'params': {
            'skip_download': True,
        },
    }, {
        'url': 'https://zonevideo.telequebec.tv/media/55267/le-soleil/passe-partout',
        'info_dict': {
            'id': '6167180337001',
            'ext': 'mp4',
            'title': 'Le soleil',
            'description': 'md5:64289c922a8de2abbe99c354daffde02',
            'uploader_id': '6150020952001',
            'upload_date': '20200625',
            'timestamp': 1593090307,
        },
        'params': {
            'format': 'bestvideo',
            'skip_download': True,
        },
        'add_ie': ['BrightcoveNew'],
    }, {
        # no description
        'url': 'http://zonevideo.telequebec.tv/media/30261',
@ -58,7 +84,14 @@ class TeleQuebecIE(TeleQuebecBaseIE):
            'https://mnmedias.api.telequebec.tv/api/v2/media/' + media_id,
            media_id)['media']
-        info = self._limelight_result(media_data['streamInfo']['sourceId'])
+        source_id = media_data['streamInfo']['sourceId']
        source = (try_get(
            media_data, lambda x: x['streamInfo']['source'],
            compat_str) or 'limelight').lower()
        if source == 'brightcove':
            info = self._brightcove_result(source_id)
        else:
            info = self._limelight_result(source_id)
        info.update({
            'title': media_data.get('title'),
            'description': try_get(
--- a/youtube_dl/extractor/twentythreevideo.py
+++ b/youtube_dl/extractor/twentythreevideo.py
@ -8,8 +8,8 @@ from ..utils import int_or_none
 class TwentyThreeVideoIE(InfoExtractor):
    IE_NAME = '23video'
-    _VALID_URL = r'https?://video\.(?P<domain>twentythree\.net|23video\.com|filmweb\.no)/v\.ihtml/player\.html\?(?P<query>.*?\bphoto(?:_|%5f)id=(?P<id>\d+).*)'
+    _VALID_URL = r'https?://(?P<domain>[^.]+\.(?:twentythree\.net|23video\.com|filmweb\.no))/v\.ihtml/player\.html\?(?P<query>.*?\bphoto(?:_|%5f)id=(?P<id>\d+).*)'
-    _TEST = {
+    _TESTS = [{
        'url': 'https://video.twentythree.net/v.ihtml/player.html?showDescriptions=0&source=site&photo%5fid=20448876&autoPlay=1',
        'md5': '75fcf216303eb1dae9920d651f85ced4',
        'info_dict': {
@ -21,11 +21,14 @@ class TwentyThreeVideoIE(InfoExtractor):
            'uploader_id': '12258964',
            'uploader': 'Rasmus Bysted',
        }
-    }
+    }, {
        'url': 'https://bonnier-publications-danmark.23video.com/v.ihtml/player.html?token=f0dc46476e06e13afd5a1f84a29e31e8&source=embed&photo%5fid=36137620',
        'only_matching': True,
    }]
    def _real_extract(self, url):
        domain, query, photo_id = re.match(self._VALID_URL, url).groups()
-        base_url = 'https://video.%s' % domain
+        base_url = 'https://%s' % domain
        photo_data = self._download_json(
            base_url + '/api/photo/list?' + query, photo_id, query={
                'format': 'json',
--- a/youtube_dl/extractor/twitch.py
+++ b/youtube_dl/extractor/twitch.py
@ -24,7 +24,6 @@ from ..utils import (
    parse_duration,
    parse_iso8601,
    qualities,
-    str_or_none,
    try_get,
    unified_timestamp,
    update_url_query,
@ -337,19 +336,27 @@ def _make_video_result(node):
 class TwitchGraphQLBaseIE(TwitchBaseIE):
    _PAGE_LIMIT = 100
-    def _download_gql(self, video_id, op, variables, sha256_hash, note, fatal=True):
+    _OPERATION_HASHES = {
+        'CollectionSideBar': '27111f1b382effad0b6def325caef1909c733fe6a4fbabf54f8d491ef2cf2f14',
+        'FilterableVideoTower_Videos': 'a937f1d22e269e39a03b509f65a7490f9fc247d7f83d6ac1421523e3b68042cb',
+        'ClipsCards__User': 'b73ad2bfaecfd30a9e6c28fada15bd97032c83ec77a0440766a56fe0bd632777',
+        'ChannelCollectionsContent': '07e3691a1bad77a36aba590c351180439a40baefc1c275356f40fc7082419a84',
+        'StreamMetadata': '1c719a40e481453e5c48d9bb585d971b8b372f8ebb105b17076722264dfa5b3e',
+        'ComscoreStreamingQuery': 'e1edae8122517d013405f237ffcc124515dc6ded82480a88daef69c83b53ac01',
+        'VideoPreviewOverlay': '3006e77e51b128d838fa4e835723ca4dc9a05c5efd4466c1085215c6e437e65c',
+    }
+    def _download_gql(self, video_id, ops, note, fatal=True):
+        for op in ops:
+            op['extensions'] = {
+                'persistedQuery': {
+                    'version': 1,
+                    'sha256Hash': self._OPERATION_HASHES[op['operationName']],
+                }
+            }
        return self._download_json(
            'https://gql.twitch.tv/gql', video_id, note,
-            data=json.dumps({
-                'operationName': op,
-                'variables': variables,
-                'extensions': {
-                    'persistedQuery': {
-                        'version': 1,
-                        'sha256Hash': sha256_hash,
-                    }
-                }
-            }).encode(),
+            data=json.dumps(ops).encode(),
            headers={
                'Content-Type': 'text/plain;charset=UTF-8',
                'Client-ID': self._CLIENT_ID,
@ -369,14 +376,15 @@ class TwitchCollectionIE(TwitchGraphQLBaseIE):
    }]
    _OPERATION_NAME = 'CollectionSideBar'
-    _SHA256_HASH = '27111f1b382effad0b6def325caef1909c733fe6a4fbabf54f8d491ef2cf2f14'
    def _real_extract(self, url):
        collection_id = self._match_id(url)
        collection = self._download_gql(
-            collection_id, self._OPERATION_NAME,
-            {'collectionID': collection_id}, self._SHA256_HASH,
-            'Downloading collection GraphQL')['data']['collection']
+            collection_id, [{
+                'operationName': self._OPERATION_NAME,
+                'variables': {'collectionID': collection_id},
+            }],
+            'Downloading collection GraphQL')[0]['data']['collection']
        title = collection.get('title')
        entries = []
        for edge in collection['items']['edges']:
@ -403,14 +411,16 @@ class TwitchPlaylistBaseIE(TwitchGraphQLBaseIE):
            if cursor:
                variables['cursor'] = cursor
            page = self._download_gql(
-                channel_name, self._OPERATION_NAME, variables,
-                self._SHA256_HASH,
+                channel_name, [{
+                    'operationName': self._OPERATION_NAME,
+                    'variables': variables,
+                }],
                'Downloading %ss GraphQL page %s' % (self._NODE_KIND, page_num),
                fatal=False)
            if not page:
                break
            edges = try_get(
-                page, lambda x: x['data']['user'][entries_key]['edges'], list)
+                page, lambda x: x[0]['data']['user'][entries_key]['edges'], list)
            if not edges:
                break
            for edge in edges:
@ -553,7 +563,6 @@ class TwitchVideosIE(TwitchPlaylistBaseIE):
        'views': 'Popular',
    }
-    _SHA256_HASH = 'a937f1d22e269e39a03b509f65a7490f9fc247d7f83d6ac1421523e3b68042cb'
    _OPERATION_NAME = 'FilterableVideoTower_Videos'
    _ENTRY_KIND = 'video'
    _EDGE_KIND = 'VideoEdge'
@ -622,7 +631,6 @@ class TwitchVideosClipsIE(TwitchPlaylistBaseIE):
    # NB: values other than 20 result in skipped videos
    _PAGE_LIMIT = 20
-    _SHA256_HASH = 'b73ad2bfaecfd30a9e6c28fada15bd97032c83ec77a0440766a56fe0bd632777'
    _OPERATION_NAME = 'ClipsCards__User'
    _ENTRY_KIND = 'clip'
    _EDGE_KIND = 'ClipEdge'
@ -680,7 +688,6 @@ class TwitchVideosCollectionsIE(TwitchPlaylistBaseIE):
        'playlist_mincount': 3,
    }]
-    _SHA256_HASH = '07e3691a1bad77a36aba590c351180439a40baefc1c275356f40fc7082419a84'
    _OPERATION_NAME = 'ChannelCollectionsContent'
    _ENTRY_KIND = 'collection'
    _EDGE_KIND = 'CollectionsItemEdge'
@ -717,7 +724,7 @@ class TwitchVideosCollectionsIE(TwitchPlaylistBaseIE):
            playlist_title='%s - Collections' % channel_name)
-class TwitchStreamIE(TwitchBaseIE):
+class TwitchStreamIE(TwitchGraphQLBaseIE):
    IE_NAME = 'twitch:stream'
    _VALID_URL = r'''(?x)
                    https?://
@ -774,28 +781,43 @@ class TwitchStreamIE(TwitchBaseIE):
                else super(TwitchStreamIE, cls).suitable(url))
    def _real_extract(self, url):
-        channel_name = self._match_id(url)
-        access_token = self._download_access_token(channel_name)
-        token = access_token['token']
-        channel_id = self._extract_channel_id(token, channel_name)
-        stream = self._call_api(
-            'kraken/streams/%s?stream_type=all' % channel_id,
-            channel_id, 'Downloading stream JSON').get('stream')
+        channel_name = self._match_id(url).lower()
+        gql = self._download_gql(
+            channel_name, [{
+                'operationName': 'StreamMetadata',
+                'variables': {'channelLogin': channel_name},
+            }, {
+                'operationName': 'ComscoreStreamingQuery',
+                'variables': {
+                    'channel': channel_name,
+                    'clipSlug': '',
+                    'isClip': False,
+                    'isLive': True,
+                    'isVodOrCollection': False,
+                    'vodID': '',
+                },
+            }, {
+                'operationName': 'VideoPreviewOverlay',
+                'variables': {'login': channel_name},
+            }],
+            'Downloading stream GraphQL')
+        user = gql[0]['data']['user']
+        if not user:
+            raise ExtractorError(
+                '%s does not exist' % channel_name, expected=True)
+        stream = user['stream']
        if not stream:
-            raise ExtractorError('%s is offline' % channel_id, expected=True)
-        # Channel name may be typed if different case than the original channel name
-        # (e.g. http://www.twitch.tv/TWITCHPLAYSPOKEMON) that will lead to constructing
-        # an invalid m3u8 URL. Working around by use of original channel name from stream
-        # JSON and fallback to lowercase if it's not available.
-        channel_name = try_get(
-            stream, lambda x: x['channel']['name'],
-            compat_str) or channel_name.lower()
+            raise ExtractorError('%s is offline' % channel_name, expected=True)
+        access_token = self._download_access_token(channel_name)
+        token = access_token['token']
+        stream_id = stream.get('id') or channel_name
        query = {
            'allow_source': 'true',
            'allow_audio_only': 'true',
@ -808,41 +830,39 @@ class TwitchStreamIE(TwitchBaseIE):
            'token': token.encode('utf-8'),
        }
        formats = self._extract_m3u8_formats(
-            '%s/api/channel/hls/%s.m3u8?%s'
-            % (self._USHER_BASE, channel_name, compat_urllib_parse_urlencode(query)),
-            channel_id, 'mp4')
+            '%s/api/channel/hls/%s.m3u8' % (self._USHER_BASE, channel_name),
+            stream_id, 'mp4', query=query)
        self._prefer_source(formats)
        view_count = stream.get('viewers')
-        timestamp = parse_iso8601(stream.get('created_at'))
-        channel = stream['channel']
-        title = self._live_title(channel.get('display_name') or channel.get('name'))
-        description = channel.get('status')
-        thumbnails = []
-        for thumbnail_key, thumbnail_url in stream['preview'].items():
-            m = re.search(r'(?P<width>\d+)x(?P<height>\d+)\.jpg$', thumbnail_key)
-            if not m:
-                continue
-            thumbnails.append({
-                'url': thumbnail_url,
-                'width': int(m.group('width')),
-                'height': int(m.group('height')),
-            })
+        timestamp = unified_timestamp(stream.get('createdAt'))
+        sq_user = try_get(gql, lambda x: x[1]['data']['user'], dict) or {}
+        uploader = sq_user.get('displayName')
+        description = try_get(
+            sq_user, lambda x: x['broadcastSettings']['title'], compat_str)
+        thumbnail = url_or_none(try_get(
+            gql, lambda x: x[2]['data']['user']['stream']['previewImageURL'],
+            compat_str))
+        title = uploader or channel_name
+        stream_type = stream.get('type')
+        if stream_type in ['rerun', 'live']:
+            title += ' (%s)' % stream_type
        return {
-            'id': str_or_none(stream.get('_id')) or channel_id,
+            'id': stream_id,
            'display_id': channel_name,
-            'title': title,
+            'title': self._live_title(title),
            'description': description,
-            'thumbnails': thumbnails,
+            'thumbnail': thumbnail,
-            'uploader': channel.get('display_name'),
+            'uploader': uploader,
-            'uploader_id': channel.get('name'),
+            'uploader_id': channel_name,
            'timestamp': timestamp,
            'view_count': view_count,
            'formats': formats,
-            'is_live': True,
+            'is_live': stream_type == 'live',
        }
--- a/youtube_dl/extractor/ustream.py
+++ b/youtube_dl/extractor/ustream.py
@ -19,7 +19,7 @@ from ..utils import (
 class UstreamIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?ustream\.tv/(?P<type>recorded|embed|embed/recorded)/(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?(?:ustream\.tv|video\.ibm\.com)/(?P<type>recorded|embed|embed/recorded)/(?P<id>\d+)'
    IE_NAME = 'ustream'
    _TESTS = [{
        'url': 'http://www.ustream.tv/recorded/20274954',
@ -67,12 +67,15 @@ class UstreamIE(InfoExtractor):
        'params': {
            'skip_download': True,  # m3u8 download
        },
    }, {
        'url': 'https://video.ibm.com/embed/recorded/128240221?&autoplay=true&controls=true&volume=100',
        'only_matching': True,
    }]
    @staticmethod
    def _extract_url(webpage):
        mobj = re.search(
-            r'<iframe[^>]+?src=(["\'])(?P<url>http://www\.ustream\.tv/embed/.+?)\1', webpage)
+            r'<iframe[^>]+?src=(["\'])(?P<url>http://(?:www\.)?(?:ustream\.tv|video\.ibm\.com)/embed/.+?)\1', webpage)
        if mobj is not None:
            return mobj.group('url')
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@ -1264,7 +1264,23 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
            'params': {
                'skip_download': True,
            },
-        }
+        },
        {
            # empty description results in an empty string
            'url': 'https://www.youtube.com/watch?v=x41yOUIvK2k',
            'info_dict': {
                'id': 'x41yOUIvK2k',
                'ext': 'mp4',
                'title': 'IMG 3456',
                'description': '',
                'upload_date': '20170613',
                'uploader_id': 'ElevageOrVert',
                'uploader': 'ElevageOrVert',
            },
            'params': {
                'skip_download': True,
            },
        },
    ]
    def __init__(self, *args, **kwargs):
@ -1931,7 +1947,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
            ''', replace_url, video_description)
            video_description = clean_html(video_description)
        else:
-            video_description = video_details.get('shortDescription') or self._html_search_meta('description', video_webpage)
+            video_description = video_details.get('shortDescription')
            if video_description is None:
                video_description = self._html_search_meta('description', video_webpage)
        if not smuggled_data.get('force_singlefeed', False):
            if not self._downloader.params.get('noplaylist'):
@ -3163,54 +3181,94 @@ class YoutubeSearchIE(SearchInfoExtractor, YoutubeSearchBaseInfoExtractor):
    _MAX_RESULTS = float('inf')
    IE_NAME = 'youtube:search'
    _SEARCH_KEY = 'ytsearch'
-    _EXTRA_QUERY_ARGS = {}
+    _SEARCH_PARAMS = None
    _TESTS = []
    def _entries(self, query, n):
        data = {
            'context': {
                'client': {
                    'clientName': 'WEB',
                    'clientVersion': '2.20201021.03.00',
                }
            },
            'query': query,
        }
        if self._SEARCH_PARAMS:
            data['params'] = self._SEARCH_PARAMS
        total = 0
        for page_num in itertools.count(1):
            search = self._download_json(
                'https://www.youtube.com/youtubei/v1/search?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8',
                video_id='query "%s"' % query,
                note='Downloading page %s' % page_num,
                errnote='Unable to download API page', fatal=False,
                data=json.dumps(data).encode('utf8'),
                headers={'content-type': 'application/json'})
            if not search:
                break
            slr_contents = try_get(
                search,
                (lambda x: x['contents']['twoColumnSearchResultsRenderer']['primaryContents']['sectionListRenderer']['contents'],
                 lambda x: x['onResponseReceivedCommands'][0]['appendContinuationItemsAction']['continuationItems']),
                list)
            if not slr_contents:
                break
            isr_contents = try_get(
                slr_contents,
                lambda x: x[0]['itemSectionRenderer']['contents'],
                list)
            if not isr_contents:
                break
            for content in isr_contents:
                if not isinstance(content, dict):
                    continue
                video = content.get('videoRenderer')
                if not isinstance(video, dict):
                    continue
                video_id = video.get('videoId')
                if not video_id:
                    continue
                title = try_get(video, lambda x: x['title']['runs'][0]['text'], compat_str)
                description = try_get(video, lambda x: x['descriptionSnippet']['runs'][0]['text'], compat_str)
                duration = parse_duration(try_get(video, lambda x: x['lengthText']['simpleText'], compat_str))
                view_count_text = try_get(video, lambda x: x['viewCountText']['simpleText'], compat_str) or ''
                view_count = int_or_none(self._search_regex(
                    r'^(\d+)', re.sub(r'\s', '', view_count_text),
                    'view count', default=None))
                uploader = try_get(video, lambda x: x['ownerText']['runs'][0]['text'], compat_str)
                total += 1
                yield {
                    '_type': 'url_transparent',
                    'ie_key': YoutubeIE.ie_key(),
                    'id': video_id,
                    'url': video_id,
                    'title': title,
                    'description': description,
                    'duration': duration,
                    'view_count': view_count,
                    'uploader': uploader,
                }
                if total == n:
                    return
            token = try_get(
                slr_contents,
                lambda x: x[1]['continuationItemRenderer']['continuationEndpoint']['continuationCommand']['token'],
                compat_str)
            if not token:
                break
            data['continuation'] = token
    def _get_n_results(self, query, n):
        """Get a specified number of results for a query"""
-        videos = []
-        limit = n
-        url_query = {
-            'search_query': query.encode('utf-8'),
-        }
-        url_query.update(self._EXTRA_QUERY_ARGS)
-        result_url = 'https://www.youtube.com/results?' + compat_urllib_parse_urlencode(url_query)
-        for pagenum in itertools.count(1):
-            data = self._download_json(
-                result_url, video_id='query "%s"' % query,
-                note='Downloading page %s' % pagenum,
-                errnote='Unable to download API page',
-                query={'spf': 'navigate'})
-            html_content = data[1]['body']['content']
-            if 'class="search-message' in html_content:
-                raise ExtractorError(
-                    '[youtube] No video results', expected=True)
-            new_videos = list(self._process_page(html_content))
-            videos += new_videos
-            if not new_videos or len(videos) > limit:
-                break
-            next_link = self._html_search_regex(
-                r'href="(/results\?[^"]*\bsp=[^"]+)"[^>]*>\s*<span[^>]+class="[^"]*\byt-uix-button-content\b[^"]*"[^>]*>Next',
-                html_content, 'next link', default=None)
-            if next_link is None:
-                break
-            result_url = compat_urlparse.urljoin('https://www.youtube.com/', next_link)
-        if len(videos) > n:
-            videos = videos[:n]
-        return self.playlist_result(videos, query)
+        return self.playlist_result(self._entries(query, n), query)
 class YoutubeSearchDateIE(YoutubeSearchIE):
    IE_NAME = YoutubeSearchIE.IE_NAME + ':date'
    _SEARCH_KEY = 'ytsearchdate'
    IE_DESC = 'YouTube.com searches, newest videos first'
-    _EXTRA_QUERY_ARGS = {'search_sort': 'video_date_uploaded'}
+    _SEARCH_PARAMS = 'CAI%3D'
 class YoutubeSearchURLIE(YoutubeSearchBaseInfoExtractor):
--- a/youtube_dl/postprocessor/embedthumbnail.py
+++ b/youtube_dl/postprocessor/embedthumbnail.py
@ -13,6 +13,7 @@ from ..utils import (
    encodeFilename,
    PostProcessingError,
    prepend_extension,
    replace_extension,
    shell_quote
 )
@ -41,6 +42,38 @@ class EmbedThumbnailPP(FFmpegPostProcessor):
                'Skipping embedding the thumbnail because the file is missing.')
            return [], info
        def is_webp(path):
            with open(encodeFilename(path), 'rb') as f:
                b = f.read(12)
            return b[0:4] == b'RIFF' and b[8:] == b'WEBP'
        # Correct extension for WebP file with wrong extension (see #25687, #25717)
        _, thumbnail_ext = os.path.splitext(thumbnail_filename)
        if thumbnail_ext:
            thumbnail_ext = thumbnail_ext[1:].lower()
            if thumbnail_ext != 'webp' and is_webp(thumbnail_filename):
                self._downloader.to_screen(
                    '[ffmpeg] Correcting extension to webp and escaping path for thumbnail "%s"' % thumbnail_filename)
                thumbnail_webp_filename = replace_extension(thumbnail_filename, 'webp')
                os.rename(encodeFilename(thumbnail_filename), encodeFilename(thumbnail_webp_filename))
                thumbnail_filename = thumbnail_webp_filename
                thumbnail_ext = 'webp'
        # Convert unsupported thumbnail formats to JPEG (see #25687, #25717)
        if thumbnail_ext not in ['jpg', 'png']:
            # NB: % is supposed to be escaped with %% but this does not work
            # for input files so working around with standard substitution
            escaped_thumbnail_filename = thumbnail_filename.replace('%', '#')
            os.rename(encodeFilename(thumbnail_filename), encodeFilename(escaped_thumbnail_filename))
            escaped_thumbnail_jpg_filename = replace_extension(escaped_thumbnail_filename, 'jpg')
            self._downloader.to_screen('[ffmpeg] Converting thumbnail "%s" to JPEG' % escaped_thumbnail_filename)
            self.run_ffmpeg(escaped_thumbnail_filename, escaped_thumbnail_jpg_filename, ['-bsf:v', 'mjpeg2jpeg'])
            os.remove(encodeFilename(escaped_thumbnail_filename))
            thumbnail_jpg_filename = replace_extension(thumbnail_filename, 'jpg')
            # Rename back to unescaped for further processing
            os.rename(encodeFilename(escaped_thumbnail_jpg_filename), encodeFilename(thumbnail_jpg_filename))
            thumbnail_filename = thumbnail_jpg_filename
        if info['ext'] == 'mp3':
            options = [
                '-c', 'copy', '-map', '0', '-map', '1',
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@ -4088,12 +4088,12 @@ def js_to_json(code):
                '\\\n': '',
                '\\x': '\\u00',
            }.get(m.group(0), m.group(0)), v[1:-1])
-
+        else:
-        for regex, base in INTEGER_TABLE:
+            for regex, base in INTEGER_TABLE:
-            im = re.match(regex, v)
+                im = re.match(regex, v)
-            if im:
+                if im:
-                i = int(im.group(1), base)
+                    i = int(im.group(1), base)
-                return '"%d":' % i if v.endswith(':') else '%d' % i
+                    return '"%d":' % i if v.endswith(':') else '%d' % i
        return '"%s"' % v
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@ -1,3 +1,3 @@
 from __future__ import unicode_literals
-__version__ = '2020.09.06'
+__version__ = '2020.09.20'
Author	SHA1	Message	Date
Sergey M․	416da574ec	[ytsearch] Fix extraction (closes #26920 )	2020-10-23 21:31:37 +07:00
Toan Nguyen	48c5663c5f	[afreecatv] Fix typo (#26970 )	2020-10-22 19:15:05 +07:00
Hannu Hartikainen	7d740e7dc7	[23video] Relax _VALID_URL (#26870 )	2020-10-20 00:56:23 +07:00
Kevin O'Connor	4eda10499e	[utils] Don't attempt to coerce JS strings to numbers in js_to_json (#26851 ) The current logic in `js_to_json` tries to rewrite octal/hex numbers to decimal. However, when the logic actually happens the `"` or `'` have already been trimmed off. This causes what were originally strings, that happen to look like octal/hex numbers, to get rewritten to decimal and returned as a number rather than a string. In practice something like: ```js { "0x40": "foo", "040": "bar", } ``` would get rewritten as: ```json { 64: "foo", 32: "bar" } ``` This is problematic since this isn't valid JSON as you cannot have non-string keys.	2020-10-18 00:10:41 +07:00
Sergio Livi	605535776a	[ustream] Add support for video.ibm.com (#26894 )	2020-10-17 23:14:46 +07:00
Felix Yan	1050e0d09f	[iqiyi] Fix typo (#26884 )	2020-10-17 23:02:17 +07:00
Sergey M․	d65d89183f	[expressen] Add support for di.se (closes #26670 )	2020-09-24 07:37:10 +07:00
Surkal	0c92f1e96b	[iprima] Improve video id extraction (#26507 ) (closes #26494 )	2020-09-24 06:46:58 +07:00
Sergey M․	adae9e844b	[README.md] Fix autonumber sequence description (refs #26686 )	2020-09-24 06:36:07 +07:00
Sergey M․	c5764b3f89	[downloader/http] Properly handle missing message in SSLError (closes #26646 )	2020-09-22 07:01:59 +07:00
Sergey M․	0837992a22	[downloader/http] Fix access to not yet opened stream in retry	2020-09-22 06:44:14 +07:00
Sergey M․	b55715934b	release 2020.09.20	2020-09-20 12:30:45 +07:00
Sergey M․	bbc3b5b4bb	[ChangeLog] Actualize [ci skip]	2020-09-20 12:24:32 +07:00
nixxo	1ca5f821c8	[redtube] Extend _VALID_URL (#26506 )	2020-09-20 11:39:42 +07:00
Sergey M․	defc820b70	[twitch] Switch streams to GraphQL and refactor (closes #26535 )	2020-09-20 10:05:00 +07:00
Sergey M․	82ef02e936	[telequebec] Fix issues (closes #26368 )	2020-09-19 07:56:00 +07:00
Patrick Dessalle	b856b3997c	[telequebec] Add support for brightcove videos (closes #25833 )	2020-09-19 07:52:57 +07:00
Sergey M․	cd85a1bb8b	[pornhub] Extract metadata from JSON-LD (closes #26614 )	2020-09-19 06:34:34 +07:00
Sergey M․	ce5b904050	[extractor/common] Relax interaction count extraction in _json_ld	2020-09-19 06:33:17 +07:00
Sergey M․	ad06b99dd4	[extractor/common] Extract author as uploader for VideoObject in _json_ld	2020-09-19 06:13:42 +07:00
JChris246	540b9f5164	[pornhub] Fix view count extraction (#26621 ) (refs #26614 )	2020-09-19 05:59:19 +07:00
Stefan Pöschel	6e65a2a67e	[downloader/hls] Fix incorrect end byte in Range HTTP header for media segments with EXT-X-BYTERANGE (#24512 ) (closes #14748 ) The end of the byte range is the first byte that is NOT part of the to be downloaded range. So don't include it into the requested HTTP download range, as this additional byte leads to a broken TS packet and subsequently to e.g. visible video corruption. Fixes #14748.	2020-09-18 05:26:56 +07:00
Sergey M․	f8c7bed133	[extractor/common] Handle ssl.CertificateError in _request_webpage (closes #26601 ) ssl.CertificateError is raised on some python versions <= 3.7.x	2020-09-18 03:41:16 +07:00
Sergey M․	cdc55e666f	[downloader/http] Improve timeout detection when reading block of data (refs #10935 )	2020-09-18 03:32:54 +07:00
Ori Avtalion	86b7c00adc	[downloader/http] Retry download when urlopen times out (#26603 ) (refs #10935 )	2020-09-18 03:15:44 +07:00
Sergey M․	e8c5d40bc8	release 2020.09.14	2020-09-14 03:37:36 +07:00
Sergey M․	ca7ebc4e5e	[ChangeLog] Actualize [ci skip]	2020-09-14 03:35:18 +07:00
Sergey M․	bff857a8af	[postprocessor/embedthumbnail] Fix issues (closes #25717 ) * Fix WebP with wrong extension processing * Fix embedding of thumbnails with % character in path	2020-09-14 03:28:31 +07:00
Alex Merkel	a31a022efd	[postprocessor/embedthumbnail] Add support for non jpeg/png thumbnails (closes #25687 )	2020-09-14 03:10:01 +07:00
Sergey M․	45f6362464	[rtlnl] Extend _VALID_URL for new embed URL schema	2020-09-13 21:42:06 +07:00
Derek Land	97f34a48d7	[rtlnl] Extend _VALID_URL (#26549 ) (closes #25821 )	2020-09-13 21:38:16 +07:00
Daniel Peukert	ea74e00b3a	[youtube] Fix empty description extraction (#26575 ) (closes #26006 )	2020-09-13 21:23:21 +07:00
Sergey M․	06cd4cdb25	[srgssr] Extend _VALID_URL (closes #26555 , closes #26556 , closes #26578 )	2020-09-13 21:07:25 +07:00
Sergey M․	da2069fb22	[googledrive] Use redirect URLs for source format (closes #18877 , closes #23919 , closes #24689 , closes #26565 )	2020-09-13 20:49:32 +07:00
Sergey M․	95c9810015	[svtplay] Fix id extraction (closes #26576 )	2020-09-13 18:59:37 +07:00
Remita Amine	b03eebdb6a	[redbulltv] improve support for redbull.com TV localized URLs (#22063 )	2020-09-13 11:26:11 +01:00
Remita Amine	1f7675451c	[redbulltv] Add support for new redbull.com TV URLs(closes #22037 )(closes #22063 )	2020-09-12 19:27:58 +01:00
tfvlrue	aa27253556	[soundcloud] Reduce pagination limit to fix 502 Bad Gateway errors when listing a user's tracks. (#26557 ) Per the documentation here https://developers.soundcloud.com/blog/offset-pagination-deprecated the maximum limit is 200, so let's respect that (even if a higher value sometimes works). Co-authored-by: tfvlrue <tfvlrue>	2020-09-12 09:35:11 +00:00
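
The off-by-one described in commit 6e65a2a67e (`[downloader/hls]`) can be illustrated with a small sketch. HTTP `Range` headers use an inclusive last-byte position, while an end computed as offset plus length is exclusive. The helper name `range_header` is hypothetical and this is not the downloader's actual code:

```python
def range_header(offset, length):
    """Build an HTTP Range header value for an EXT-X-BYTERANGE sub-range.

    `offset` is the first byte of the segment and `length` its size in bytes.
    """
    start = offset
    end = offset + length  # exclusive: first byte NOT part of the segment
    # The Range header's last-byte position is inclusive, so subtract one.
    return 'bytes=%d-%d' % (start, end - 1)


# Two consecutive 188-byte TS packets do not overlap:
print(range_header(0, 188))
print(range_header(188, 188))
```

Requesting `end` instead of `end - 1` would fetch one extra byte per segment, which is the broken TS packet the commit message refers to.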