Compare commits

..

2 commits

Author SHA1 Message Date
Filippo Valsorda 97bc05116e
Merge branch 'master' into totalwebcasting 2018-01-07 15:03:28 +01:00
Filippo Valsorda 7608a91ee7 [totalwebcasting] Add new extractor 2017-01-11 18:51:25 -05:00
581 changed files with 19481 additions and 38826 deletions

60
.github/ISSUE_TEMPLATE.md vendored Normal file
View file

@ -0,0 +1,60 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`)
- Use the *Preview* tab to see what your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.12.31*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.12.31**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v <your command line>`), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2017.12.31
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/rg3/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.

View file

@ -1,63 +0,0 @@
---
name: Broken site support
about: Report broken or misfunctioning site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2020.09.20
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

View file

@ -1,54 +0,0 @@
---
name: Site support request
about: Request support for a new site
title: ''
labels: 'site-support-request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones
## Example URLs
<!--
Provide all kinds of example URLs support for which should be included. Replace following example URLs by yours.
-->
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
## Description
<!--
Provide any additional information.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

View file

@ -1,37 +0,0 @@
---
name: Site feature request
about: Request a new functionality for a site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones
## Description
<!--
Provide an explanation of your site feature request in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

View file

@ -1,65 +0,0 @@
---
name: Bug report
about: Report a bug unrelated to any particular site or extractor
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Read bugs section in FAQ: http://yt-dl.org/reporting
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
- [ ] I've read bugs section in FAQ
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2020.09.20
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

View file

@ -1,38 +0,0 @@
---
name: Feature request
about: Request a new functionality unrelated to any particular site or extractor
title: ''
labels: 'request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
- [ ] I've searched the bugtracker for similar feature requests including closed ones
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

View file

@ -1,38 +0,0 @@
---
name: Ask question
about: Ask youtube-dl related question
title: ''
labels: 'question'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- Look through the README (http://yt-dl.org/readme) and FAQ (http://yt-dl.org/faq) for similar questions
- Search the bugtracker for similar questions: http://yt-dl.org/search-issues
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm asking a question
- [ ] I've looked through the README and FAQ for similar questions
- [ ] I've searched the bugtracker for similar questions including closed ones
## Question
<!--
Ask your question in an arbitrary form. Please make sure it's worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient.
-->
WRITE QUESTION HERE

60
.github/ISSUE_TEMPLATE_tmpl.md vendored Normal file
View file

@ -0,0 +1,60 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`)
- Use the *Preview* tab to see what your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *%(version)s*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **%(version)s**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v <your command line>`), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/rg3/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.

View file

@ -1,63 +0,0 @@
---
name: Broken site support
about: Report broken or misfunctioning site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

View file

@ -1,54 +0,0 @@
---
name: Site support request
about: Request support for a new site
title: ''
labels: 'site-support-request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones
## Example URLs
<!--
Provide all kinds of example URLs support for which should be included. Replace following example URLs by yours.
-->
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
## Description
<!--
Provide any additional information.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

View file

@ -1,37 +0,0 @@
---
name: Site feature request
about: Request a new functionality for a site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones
## Description
<!--
Provide an explanation of your site feature request in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

View file

@ -1,65 +0,0 @@
---
name: Bug report
about: Report a bug unrelated to any particular site or extractor
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Read bugs section in FAQ: http://yt-dl.org/reporting
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
- [ ] I've read bugs section in FAQ
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

View file

@ -1,38 +0,0 @@
---
name: Feature request
about: Request a new functionality unrelated to any particular site or extractor
title: ''
labels: 'request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've searched the bugtracker for similar feature requests including closed ones
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

View file

@ -7,8 +7,8 @@
--- ---
### Before submitting a *pull request* make sure you have: ### Before submitting a *pull request* make sure you have:
- [ ] At least skimmed through [adding new extractor tutorial](https://github.com/ytdl-org/youtube-dl#adding-support-for-a-new-site) and [youtube-dl coding conventions](https://github.com/ytdl-org/youtube-dl#youtube-dl-coding-conventions) sections - [ ] At least skimmed through [adding new extractor tutorial](https://github.com/rg3/youtube-dl#adding-support-for-a-new-site) and [youtube-dl coding conventions](https://github.com/rg3/youtube-dl#youtube-dl-coding-conventions) sections
- [ ] [Searched](https://github.com/ytdl-org/youtube-dl/search?q=is%3Apr&type=Issues) the bugtracker for similar pull requests - [ ] [Searched](https://github.com/rg3/youtube-dl/search?q=is%3Apr&type=Issues) the bugtracker for similar pull requests
- [ ] Checked the code with [flake8](https://pypi.python.org/pypi/flake8) - [ ] Checked the code with [flake8](https://pypi.python.org/pypi/flake8)
### In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under [Unlicense](http://unlicense.org/). Check one of the following options: ### In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under [Unlicense](http://unlicense.org/). Check one of the following options:

4
.gitignore vendored
View file

@ -47,7 +47,3 @@ youtube-dl.zsh
*.iml *.iml
tmp/ tmp/
venv/
# VS Code related files
.vscode

View file

@ -9,37 +9,14 @@ python:
- "3.6" - "3.6"
- "pypy" - "pypy"
- "pypy3" - "pypy3"
dist: trusty sudo: false
env: env:
- YTDL_TEST_SET=core - YTDL_TEST_SET=core
- YTDL_TEST_SET=download - YTDL_TEST_SET=download
jobs: matrix:
include: include:
- python: 3.7
dist: xenial
env: YTDL_TEST_SET=core
- python: 3.7
dist: xenial
env: YTDL_TEST_SET=download
- python: 3.8
dist: xenial
env: YTDL_TEST_SET=core
- python: 3.8
dist: xenial
env: YTDL_TEST_SET=download
- python: 3.8-dev
dist: xenial
env: YTDL_TEST_SET=core
- python: 3.8-dev
dist: xenial
env: YTDL_TEST_SET=download
- env: JYTHON=true; YTDL_TEST_SET=core - env: JYTHON=true; YTDL_TEST_SET=core
- env: JYTHON=true; YTDL_TEST_SET=download - env: JYTHON=true; YTDL_TEST_SET=download
- name: flake8
python: 3.8
dist: xenial
install: pip install flake8
script: flake8 .
fast_finish: true fast_finish: true
allow_failures: allow_failures:
- env: YTDL_TEST_SET=download - env: YTDL_TEST_SET=download

15
AUTHORS
View file

@ -231,18 +231,3 @@ John Dong
Tatsuyuki Ishi Tatsuyuki Ishi
Daniel Weber Daniel Weber
Kay Bouché Kay Bouché
Yang Hongbo
Lei Wang
Petr Novák
Leonardo Taccari
Martin Weinelt
Surya Oktafendri
TingPing
Alexandre Macabies
Bastian de Groot
Niklas Haas
András Veres-Szentkirályi
Enes Solak
Nathan Rossi
Thomas van der Berg
Luca Cherubin

View file

@ -42,11 +42,11 @@ Before reporting any issue, type `youtube-dl -U`. This should report that you're
### Is the issue already documented? ### Is the issue already documented?
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/ytdl-org/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity. Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/rg3/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
### Why are existing options not enough? ### Why are existing options not enough?
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem. Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
### Is there enough context in your bug report? ### Is there enough context in your bug report?
@ -70,7 +70,7 @@ It may sound strange, but some bug reports we receive are completely unrelated t
# DEVELOPER INSTRUCTIONS # DEVELOPER INSTRUCTIONS
Most users do not need to build youtube-dl and can [download the builds](https://ytdl-org.github.io/youtube-dl/download.html) or get them from their distribution. Most users do not need to build youtube-dl and can [download the builds](https://rg3.github.io/youtube-dl/download.html) or get them from their distribution.
To run youtube-dl as a developer, you don't need to build anything either. Simply execute To run youtube-dl as a developer, you don't need to build anything either. Simply execute
@ -98,7 +98,7 @@ If you want to add support for a new site, first of all **make sure** this site
After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`): After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/ytdl-org/youtube-dl/fork) 1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with: 2. Check out the source code with:
git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git
@ -150,22 +150,18 @@ After you have ensured this site is distributing its content legally, you can fo
# TODO more properties (see youtube_dl/extractor/common.py) # TODO more properties (see youtube_dl/extractor/common.py)
} }
``` ```
5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/extractors.py). 5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. 6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want. 7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252). Add tests and code for as many as you want.
8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](https://flake8.pycqa.org/en/latest/index.html#quickstart): 8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](https://pypi.python.org/pypi/flake8). Also make sure your code works under all [Python](https://www.python.org/) versions claimed supported by youtube-dl, namely 2.6, 2.7, and 3.2+.
9. When the tests pass, [add](https://git-scm.com/docs/git-add) the new files and [commit](https://git-scm.com/docs/git-commit) them and [push](https://git-scm.com/docs/git-push) the result, like this:
$ flake8 youtube_dl/extractor/yourextractor.py
9. Make sure your code works under all [Python](https://www.python.org/) versions claimed supported by youtube-dl, namely 2.6, 2.7, and 3.2+.
10. When the tests pass, [add](https://git-scm.com/docs/git-add) the new files and [commit](https://git-scm.com/docs/git-commit) them and [push](https://git-scm.com/docs/git-push) the result, like this:
$ git add youtube_dl/extractor/extractors.py $ git add youtube_dl/extractor/extractors.py
$ git add youtube_dl/extractor/yourextractor.py $ git add youtube_dl/extractor/yourextractor.py
$ git commit -m '[yourextractor] Add new extractor' $ git commit -m '[yourextractor] Add new extractor'
$ git push origin yourextractor $ git push origin yourextractor
11. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it. 10. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
In any case, thank you very much for your contributions! In any case, thank you very much for your contributions!
@ -177,7 +173,7 @@ Extractors are very fragile by nature since they depend on the layout of the sou
### Mandatory and optional metafields ### Mandatory and optional metafields
For extraction to work youtube-dl relies on metadata your extractor extracts and provides to youtube-dl expressed by an [information dictionary](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303) or simply *info dict*. Only the following meta fields in the *info dict* are considered mandatory for a successful extraction process by youtube-dl: For extraction to work youtube-dl relies on metadata your extractor extracts and provides to youtube-dl expressed by an [information dictionary](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L75-L257) or simply *info dict*. Only the following meta fields in the *info dict* are considered mandatory for a successful extraction process by youtube-dl:
- `id` (media identifier) - `id` (media identifier)
- `title` (media title) - `title` (media title)
@ -185,7 +181,7 @@ For extraction to work youtube-dl relies on metadata your extractor extracts and
In fact only the last option is technically mandatory (i.e. if you can't figure out the download location of the media the extraction does not make any sense). But by convention youtube-dl also treats `id` and `title` as mandatory. Thus the aforementioned metafields are the critical data that the extraction does not make any sense without and if any of them fail to be extracted then the extractor is considered completely broken. In fact only the last option is technically mandatory (i.e. if you can't figure out the download location of the media the extraction does not make any sense). But by convention youtube-dl also treats `id` and `title` as mandatory. Thus the aforementioned metafields are the critical data that the extraction does not make any sense without and if any of them fail to be extracted then the extractor is considered completely broken.
[Any field](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L188-L303) apart from the aforementioned ones are considered **optional**. That means that extraction should be **tolerant** to situations when sources for these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. [Any field](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L149-L257) apart from the aforementioned ones are considered **optional**. That means that extraction should be **tolerant** to situations when sources for these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields.
#### Example #### Example
@ -261,33 +257,11 @@ title = meta.get('title') or self._og_search_title(webpage)
This code will try to extract from `meta` first and if it fails it will try extracting `og:title` from a `webpage`. This code will try to extract from `meta` first and if it fails it will try extracting `og:title` from a `webpage`.
### Regular expressions ### Make regular expressions flexible
#### Don't capture groups you don't use When using regular expressions try to write them fuzzy and flexible.
Capturing group must be an indication that it's used somewhere in the code. Any group that is not used must be non capturing.
##### Example
Don't capture id attribute name here since you can't use it for anything anyway.
Correct:
```python
r'(?:id|ID)=(?P<id>\d+)'
```
Incorrect:
```python
r'(id|ID)=(?P<id>\d+)'
```
#### Make regular expressions relaxed and flexible
When using regular expressions try to write them fuzzy, relaxed and flexible, skipping insignificant parts that are more likely to change, allowing both single and double quotes for quoted values and so on.
##### Example #### Example
Say you need to extract `title` from the following HTML code: Say you need to extract `title` from the following HTML code:
@ -320,115 +294,7 @@ title = self._search_regex(
webpage, 'title', group='title') webpage, 'title', group='title')
``` ```
### Long lines policy ### Use safe conversion functions
There is a soft limit to keep lines of code under 80 characters long. This means it should be respected if possible and if it does not make readability and code maintenance worse. Wrap all extracted numeric data into safe functions from `utils`: `int_or_none`, `float_or_none`. Use them for string to number conversions as well.
For example, you should **never** split long string literals like URLs or some other often copied entities over multiple lines to fit this limit:
Correct:
```python
'https://www.youtube.com/watch?v=FqZTN594JQw&list=PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
```
Incorrect:
```python
'https://www.youtube.com/watch?v=FqZTN594JQw&list='
'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
```
### Inline values
Extracting variables is acceptable for reducing code duplication and improving readability of complex expressions. However, you should avoid extracting variables used only once and moving them to opposite parts of the extractor file, which makes reading the linear flow difficult.
#### Example
Correct:
```python
title = self._html_search_regex(r'<title>([^<]+)</title>', webpage, 'title')
```
Incorrect:
```python
TITLE_RE = r'<title>([^<]+)</title>'
# ...some lines of code...
title = self._html_search_regex(TITLE_RE, webpage, 'title')
```
### Collapse fallbacks
Multiple fallback values can quickly become unwieldy. Collapse multiple fallback values into a single expression via a list of patterns.
#### Example
Good:
```python
description = self._html_search_meta(
['og:description', 'description', 'twitter:description'],
webpage, 'description', default=None)
```
Unwieldy:
```python
description = (
self._og_search_description(webpage, default=None)
or self._html_search_meta('description', webpage, default=None)
or self._html_search_meta('twitter:description', webpage, default=None))
```
Methods supporting list of patterns are: `_search_regex`, `_html_search_regex`, `_og_search_property`, `_html_search_meta`.
### Trailing parentheses
Always move trailing parentheses after the last argument.
#### Example
Correct:
```python
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list)
```
Incorrect:
```python
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list,
)
```
### Use convenience conversion and parsing functions
Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well.
Use `url_or_none` for safe URL processing.
Use `try_get` for safe metadata extraction from parsed JSON.
Use `unified_strdate` for uniform `upload_date` or any `YYYYMMDD` meta field extraction, `unified_timestamp` for uniform `timestamp` extraction, `parse_filesize` for `filesize` extraction, `parse_count` for count meta fields extraction, `parse_resolution`, `parse_duration` for `duration` extraction, `parse_age_limit` for `age_limit` extraction.
Explore [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py) for more useful convenience functions.
#### More examples
##### Safely extract optional description from parsed JSON
```python
description = try_get(response, lambda x: x['result']['video'][0]['summary'], compat_str)
```
##### Safely extract more optional metadata
```python
video = try_get(response, lambda x: x['result']['video'][0], dict) or {}
description = video.get('summary')
duration = float_or_none(video.get('durationMs'), scale=1000)
view_count = int_or_none(video.get('views'))
```

2346
ChangeLog

File diff suppressed because it is too large Load diff

View file

@ -1,7 +1,7 @@
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
clean: clean:
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
find . -name "*.pyc" -delete find . -name "*.pyc" -delete
find . -name "*.class" -delete find . -name "*.class" -delete
@ -14,9 +14,6 @@ PYTHON ?= /usr/bin/env python
# set SYSCONFDIR to /etc if PREFIX=/usr or PREFIX=/usr/local # set SYSCONFDIR to /etc if PREFIX=/usr or PREFIX=/usr/local
SYSCONFDIR = $(shell if [ $(PREFIX) = /usr -o $(PREFIX) = /usr/local ]; then echo /etc; else echo $(PREFIX)/etc; fi) SYSCONFDIR = $(shell if [ $(PREFIX) = /usr -o $(PREFIX) = /usr/local ]; then echo /etc; else echo $(PREFIX)/etc; fi)
# set markdown input format to "markdown-smart" for pandoc version 2 and to "markdown" for pandoc prior to version 2
MARKDOWN = $(shell if [ `pandoc -v | head -n1 | cut -d" " -f2 | head -c1` = "2" ]; then echo markdown-smart; else echo markdown; fi)
install: youtube-dl youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish install: youtube-dl youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish
install -d $(DESTDIR)$(BINDIR) install -d $(DESTDIR)$(BINDIR)
install -m 755 youtube-dl $(DESTDIR)$(BINDIR) install -m 755 youtube-dl $(DESTDIR)$(BINDIR)
@ -78,22 +75,18 @@ README.md: youtube_dl/*.py youtube_dl/*/*.py
CONTRIBUTING.md: README.md CONTRIBUTING.md: README.md
$(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md $(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md
issuetemplates: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md youtube_dl/version.py .github/ISSUE_TEMPLATE.md: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md youtube_dl/version.py
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE/1_broken_site.md $(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md .github/ISSUE_TEMPLATE.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE/2_site_support_request.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE/4_bug_report.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md .github/ISSUE_TEMPLATE/5_feature_request.md
supportedsites: supportedsites:
$(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md $(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md
README.txt: README.md README.txt: README.md
pandoc -f $(MARKDOWN) -t plain README.md -o README.txt pandoc -f markdown -t plain README.md -o README.txt
youtube-dl.1: README.md youtube-dl.1: README.md
$(PYTHON) devscripts/prepare_manpage.py youtube-dl.1.temp.md $(PYTHON) devscripts/prepare_manpage.py youtube-dl.1.temp.md
pandoc -s -f $(MARKDOWN) -t man youtube-dl.1.temp.md -o youtube-dl.1 pandoc -s -f markdown -t man youtube-dl.1.temp.md -o youtube-dl.1
rm -f youtube-dl.1.temp.md rm -f youtube-dl.1.temp.md
youtube-dl.bash-completion: youtube_dl/*.py youtube_dl/*/*.py devscripts/bash-completion.in youtube-dl.bash-completion: youtube_dl/*.py youtube_dl/*/*.py devscripts/bash-completion.in

258
README.md
View file

@ -1,4 +1,4 @@
[![Build Status](https://travis-ci.org/ytdl-org/youtube-dl.svg?branch=master)](https://travis-ci.org/ytdl-org/youtube-dl) [![Build Status](https://travis-ci.org/rg3/youtube-dl.svg?branch=master)](https://travis-ci.org/rg3/youtube-dl)
youtube-dl - download videos from youtube.com or other video platforms youtube-dl - download videos from youtube.com or other video platforms
@ -17,7 +17,7 @@ youtube-dl - download videos from youtube.com or other video platforms
# INSTALLATION # INSTALLATION
To install it right away for all UNIX users (Linux, macOS, etc.), type: To install it right away for all UNIX users (Linux, OS X, etc.), type:
sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl sudo chmod a+rx /usr/local/bin/youtube-dl
@ -35,7 +35,7 @@ You can also use pip:
This command will update youtube-dl if you have already installed it. See the [pypi page](https://pypi.python.org/pypi/youtube_dl) for more information. This command will update youtube-dl if you have already installed it. See the [pypi page](https://pypi.python.org/pypi/youtube_dl) for more information.
macOS users can install youtube-dl with [Homebrew](https://brew.sh/): OS X users can install youtube-dl with [Homebrew](https://brew.sh/):
brew install youtube-dl brew install youtube-dl
@ -43,10 +43,10 @@ Or with [MacPorts](https://www.macports.org/):
sudo port install youtube-dl sudo port install youtube-dl
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://ytdl-org.github.io/youtube-dl/download.html). Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
# DESCRIPTION # DESCRIPTION
**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like. **youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on Mac OS X. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
youtube-dl [OPTIONS] URL [URL...] youtube-dl [OPTIONS] URL [URL...]
@ -93,8 +93,8 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
## Network Options: ## Network Options:
--proxy URL Use the specified HTTP/HTTPS/SOCKS proxy. --proxy URL Use the specified HTTP/HTTPS/SOCKS proxy.
To enable SOCKS proxy, specify a proper To enable experimental SOCKS proxy, specify
scheme. For example a proper scheme. For example
socks5://127.0.0.1:1080/. Pass in an empty socks5://127.0.0.1:1080/. Pass in an empty
string (--proxy "") for direct connection string (--proxy "") for direct connection
--socket-timeout SECONDS Time to wait before giving up, in seconds --socket-timeout SECONDS Time to wait before giving up, in seconds
@ -106,18 +106,16 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
--geo-verification-proxy URL Use this proxy to verify the IP address for --geo-verification-proxy URL Use this proxy to verify the IP address for
some geo-restricted sites. The default some geo-restricted sites. The default
proxy specified by --proxy (or none, if the proxy specified by --proxy (or none, if the
option is not present) is used for the options is not present) is used for the
actual downloading. actual downloading.
--geo-bypass Bypass geographic restriction via faking --geo-bypass Bypass geographic restriction via faking
X-Forwarded-For HTTP header X-Forwarded-For HTTP header (experimental)
--no-geo-bypass Do not bypass geographic restriction via --no-geo-bypass Do not bypass geographic restriction via
faking X-Forwarded-For HTTP header faking X-Forwarded-For HTTP header
(experimental)
--geo-bypass-country CODE Force bypass geographic restriction with --geo-bypass-country CODE Force bypass geographic restriction with
explicitly provided two-letter ISO 3166-2 explicitly provided two-letter ISO 3166-2
country code country code (experimental)
--geo-bypass-ip-block IP_BLOCK Force bypass geographic restriction with
explicitly provided IP block in CIDR
notation
## Video Selection: ## Video Selection:
--playlist-start NUMBER Playlist video to start at (default is 1) --playlist-start NUMBER Playlist video to start at (default is 1)
@ -200,15 +198,10 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
size. By default, the buffer size is size. By default, the buffer size is
automatically resized from an initial value automatically resized from an initial value
of SIZE. of SIZE.
--http-chunk-size SIZE Size of a chunk for chunk-based HTTP
downloading (e.g. 10485760 or 10M) (default
is disabled). May be useful for bypassing
bandwidth throttling imposed by a webserver
(experimental)
--playlist-reverse Download playlist videos in reverse order --playlist-reverse Download playlist videos in reverse order
--playlist-random Download playlist videos in random order --playlist-random Download playlist videos in random order
--xattr-set-filesize Set file xattribute ytdl.filesize with --xattr-set-filesize Set file xattribute ytdl.filesize with
expected file size expected file size (experimental)
--hls-prefer-native Use the native HLS downloader instead of --hls-prefer-native Use the native HLS downloader instead of
ffmpeg ffmpeg
--hls-prefer-ffmpeg Use ffmpeg instead of the native HLS --hls-prefer-ffmpeg Use ffmpeg instead of the native HLS
@ -225,9 +218,7 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
## Filesystem Options: ## Filesystem Options:
-a, --batch-file FILE File containing URLs to download ('-' for -a, --batch-file FILE File containing URLs to download ('-' for
stdin), one URL per line. Lines starting stdin)
with '#', ';' or ']' are considered as
comments and ignored.
--id Use only video ID in file name --id Use only video ID in file name
-o, --output TEMPLATE Output filename template, see the "OUTPUT -o, --output TEMPLATE Output filename template, see the "OUTPUT
TEMPLATE" for all the info TEMPLATE" for all the info
@ -427,22 +418,22 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
default; fix file if we can, warn default; fix file if we can, warn
otherwise) otherwise)
--prefer-avconv Prefer avconv over ffmpeg for running the --prefer-avconv Prefer avconv over ffmpeg for running the
postprocessors
--prefer-ffmpeg Prefer ffmpeg over avconv for running the
postprocessors (default) postprocessors (default)
--prefer-ffmpeg Prefer ffmpeg over avconv for running the
postprocessors
--ffmpeg-location PATH Location of the ffmpeg/avconv binary; --ffmpeg-location PATH Location of the ffmpeg/avconv binary;
either the path to the binary or its either the path to the binary or its
containing directory. containing directory.
--exec CMD Execute a command on the file after --exec CMD Execute a command on the file after
downloading and post-processing, similar to downloading, similar to find's -exec
find's -exec syntax. Example: --exec 'adb syntax. Example: --exec 'adb push {}
push {} /sdcard/Music/ && rm {}' /sdcard/Music/ && rm {}'
--convert-subs FORMAT Convert the subtitles to other format --convert-subs FORMAT Convert the subtitles to other format
(currently supported: srt|ass|vtt|lrc) (currently supported: srt|ass|vtt|lrc)
# CONFIGURATION # CONFIGURATION
You can configure youtube-dl by placing any supported command line option to a configuration file. On Linux and macOS, the system wide configuration file is located at `/etc/youtube-dl.conf` and the user wide configuration file at `~/.config/youtube-dl/config`. On Windows, the user wide configuration file locations are `%APPDATA%\youtube-dl\config.txt` or `C:\Users\<user name>\youtube-dl.conf`. Note that by default configuration file may not exist so you may need to create it yourself. You can configure youtube-dl by placing any supported command line option to a configuration file. On Linux and OS X, the system wide configuration file is located at `/etc/youtube-dl.conf` and the user wide configuration file at `~/.config/youtube-dl/config`. On Windows, the user wide configuration file locations are `%APPDATA%\youtube-dl\config.txt` or `C:\Users\<user name>\youtube-dl.conf`. Note that by default configuration file may not exist so you may need to create it yourself.
For example, with the following configuration file youtube-dl will always extract the audio, not copy the mtime, use a proxy and save all videos under `Movies` directory in your home directory: For example, with the following configuration file youtube-dl will always extract the audio, not copy the mtime, use a proxy and save all videos under `Movies` directory in your home directory:
``` ```
@ -496,7 +487,7 @@ The `-o` option allows users to indicate a template for the output file names.
**tl;dr:** [navigate me to examples](#output-template-examples). **tl;dr:** [navigate me to examples](#output-template-examples).
The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dl -o funny_video.flv "https://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [python string formatting operations](https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by formatting operations. Allowed names along with sequence type are: The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dl -o funny_video.flv "https://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [python string formatting operations](https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by a formatting operations. Allowed names along with sequence type are:
- `id` (string): Video identifier - `id` (string): Video identifier
- `title` (string): Video title - `title` (string): Video title
@ -511,8 +502,6 @@ The basic usage is not to set any template arguments when downloading a single f
- `timestamp` (numeric): UNIX timestamp of the moment the video became available - `timestamp` (numeric): UNIX timestamp of the moment the video became available
- `upload_date` (string): Video upload date (YYYYMMDD) - `upload_date` (string): Video upload date (YYYYMMDD)
- `uploader_id` (string): Nickname or id of the video uploader - `uploader_id` (string): Nickname or id of the video uploader
- `channel` (string): Full name of the channel the video is uploaded on
- `channel_id` (string): Id of the channel
- `location` (string): Physical location where the video was filmed - `location` (string): Physical location where the video was filmed
- `duration` (numeric): Length of the video in seconds - `duration` (numeric): Length of the video in seconds
- `view_count` (numeric): How many users have watched the video on the platform - `view_count` (numeric): How many users have watched the video on the platform
@ -545,7 +534,7 @@ The basic usage is not to set any template arguments when downloading a single f
- `extractor` (string): Name of the extractor - `extractor` (string): Name of the extractor
- `extractor_key` (string): Key name of the extractor - `extractor_key` (string): Key name of the extractor
- `epoch` (numeric): Unix epoch when creating the file - `epoch` (numeric): Unix epoch when creating the file
- `autonumber` (numeric): Number that will be increased with each download, starting at `--autonumber-start` - `autonumber` (numeric): Five-digit number that will be increased with each download, starting at zero
- `playlist` (string): Name or id of the playlist that contains the video - `playlist` (string): Name or id of the playlist that contains the video
- `playlist_index` (numeric): Index of the video in the playlist padded with leading zeros according to the total length of the playlist - `playlist_index` (numeric): Index of the video in the playlist padded with leading zeros according to the total length of the playlist
- `playlist_id` (string): Playlist identifier - `playlist_id` (string): Playlist identifier
@ -642,7 +631,6 @@ The simplest case is requesting a specific format, for example with `-f 22` you
You can also use a file extension (currently `3gp`, `aac`, `flv`, `m4a`, `mp3`, `mp4`, `ogg`, `wav`, `webm` are supported) to download the best quality format of a particular file extension served as a single file, e.g. `-f webm` will download the best quality format with the `webm` extension served as a single file. You can also use a file extension (currently `3gp`, `aac`, `flv`, `m4a`, `mp3`, `mp4`, `ogg`, `wav`, `webm` are supported) to download the best quality format of a particular file extension served as a single file, e.g. `-f webm` will download the best quality format with the `webm` extension served as a single file.
You can also use special names to select particular edge case formats: You can also use special names to select particular edge case formats:
- `best`: Select the best quality format represented by a single file with video and audio. - `best`: Select the best quality format represented by a single file with video and audio.
- `worst`: Select the worst quality format represented by a single file with video and audio. - `worst`: Select the worst quality format represented by a single file with video and audio.
- `bestvideo`: Select the best quality video-only format (e.g. DASH video). May not be available. - `bestvideo`: Select the best quality video-only format (e.g. DASH video). May not be available.
@ -659,7 +647,6 @@ If you want to download several formats of the same video use a comma as a separ
You can also filter the video formats by putting a condition in brackets, as in `-f "best[height=720]"` (or `-f "[filesize>10M]"`). You can also filter the video formats by putting a condition in brackets, as in `-f "best[height=720]"` (or `-f "[filesize>10M]"`).
The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals): The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals):
- `filesize`: The number of bytes, if known in advance - `filesize`: The number of bytes, if known in advance
- `width`: Width of the video, if known - `width`: Width of the video, if known
- `height`: Height of the video, if known - `height`: Height of the video, if known
@ -669,8 +656,7 @@ The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `
- `asr`: Audio sampling rate in Hertz - `asr`: Audio sampling rate in Hertz
- `fps`: Frame rate - `fps`: Frame rate
Also filtering work for comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains) and following string meta fields: Also filtering work for comparisons `=` (equals), `!=` (not equals), `^=` (begins with), `$=` (ends with), `*=` (contains) and following string meta fields:
- `ext`: File extension - `ext`: File extension
- `acodec`: Name of the audio codec in use - `acodec`: Name of the audio codec in use
- `vcodec`: Name of the video codec in use - `vcodec`: Name of the video codec in use
@ -678,8 +664,6 @@ Also filtering work for comparisons `=` (equals), `^=` (starts with), `$=` (ends
- `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`) - `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`)
- `format_id`: A short description of the format - `format_id`: A short description of the format
Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain).
Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the video hoster. Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the video hoster.
Formats for which the value is not known are excluded unless you put a question mark (`?`) after the operator. You can combine format filters, so `-f "[height <=? 720][tbr>500]"` selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s. Formats for which the value is not known are excluded unless you put a question mark (`?`) after the operator. You can combine format filters, so `-f "[height <=? 720][tbr>500]"` selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s.
@ -688,7 +672,7 @@ You can merge the video and audio of two formats into a single file using `-f <v
Format selectors can also be grouped using parentheses, for example if you want to download the best mp4 and webm formats with a height lower than 480 you can use `-f '(mp4,webm)[height<480]'`. Format selectors can also be grouped using parentheses, for example if you want to download the best mp4 and webm formats with a height lower than 480 you can use `-f '(mp4,webm)[height<480]'`.
Since the end of April 2015 and version 2015.04.26, youtube-dl uses `-f bestvideo+bestaudio/best` as the default format selection (see [#5447](https://github.com/ytdl-org/youtube-dl/issues/5447), [#5456](https://github.com/ytdl-org/youtube-dl/issues/5456)). If ffmpeg or avconv are installed this results in downloading `bestvideo` and `bestaudio` separately and muxing them together into a single file giving the best overall quality available. Otherwise it falls back to `best` and results in downloading the best available quality served as a single file. `best` is also needed for videos that don't come from YouTube because they don't provide the audio and video in two different files. If you want to only download some DASH formats (for example if you are not interested in getting videos with a resolution higher than 1080p), you can add `-f bestvideo[height<=?1080]+bestaudio/best` to your configuration file. Note that if you use youtube-dl to stream to `stdout` (and most likely to pipe it to your media player then), i.e. you explicitly specify output template as `-o -`, youtube-dl still uses `-f best` format selection in order to start content delivery immediately to your player and not to wait until `bestvideo` and `bestaudio` are downloaded and muxed. Since the end of April 2015 and version 2015.04.26, youtube-dl uses `-f bestvideo+bestaudio/best` as the default format selection (see [#5447](https://github.com/rg3/youtube-dl/issues/5447), [#5456](https://github.com/rg3/youtube-dl/issues/5456)). If ffmpeg or avconv are installed this results in downloading `bestvideo` and `bestaudio` separately and muxing them together into a single file giving the best overall quality available. Otherwise it falls back to `best` and results in downloading the best available quality served as a single file. `best` is also needed for videos that don't come from YouTube because they don't provide the audio and video in two different files. If you want to only download some DASH formats (for example if you are not interested in getting videos with a resolution higher than 1080p), you can add `-f bestvideo[height<=?1080]+bestaudio/best` to your configuration file. Note that if you use youtube-dl to stream to `stdout` (and most likely to pipe it to your media player then), i.e. you explicitly specify output template as `-o -`, youtube-dl still uses `-f best` format selection in order to start content delivery immediately to your player and not to wait until `bestvideo` and `bestaudio` are downloaded and muxed.
If you want to preserve the old format selection behavior (prior to youtube-dl 2015.04.26), i.e. you want to download the best available quality media served as a single file, you should explicitly specify your choice with `-f best`. You may want to add it to the [configuration file](#configuration) in order not to type it every time you run youtube-dl. If you want to preserve the old format selection behavior (prior to youtube-dl 2015.04.26), i.e. you want to download the best available quality media served as a single file, you should explicitly specify your choice with `-f best`. You may want to add it to the [configuration file](#configuration) in order not to type it every time you run youtube-dl.
@ -700,7 +684,7 @@ Note that on Windows you may need to use double quotes instead of single.
# Download best mp4 format available or any other best if no mp4 available # Download best mp4 format available or any other best if no mp4 available
$ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best' $ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best'
# Download best format available but no better than 480p # Download best format available but not better that 480p
$ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]' $ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]'
# Download best video only format but no bigger than 50 MB # Download best video only format but no bigger than 50 MB
@ -739,7 +723,7 @@ $ youtube-dl --dateafter 20000101 --datebefore 20091231
### How do I update youtube-dl? ### How do I update youtube-dl?
If you've followed [our manual installation instructions](https://ytdl-org.github.io/youtube-dl/download.html), you can simply run `youtube-dl -U` (or, on Linux, `sudo youtube-dl -U`). If you've followed [our manual installation instructions](https://rg3.github.io/youtube-dl/download.html), you can simply run `youtube-dl -U` (or, on Linux, `sudo youtube-dl -U`).
If you have used pip, a simple `sudo pip install -U youtube-dl` is sufficient to update. If you have used pip, a simple `sudo pip install -U youtube-dl` is sufficient to update.
@ -749,11 +733,11 @@ As a last resort, you can also uninstall the version installed by your package m
sudo apt-get remove -y youtube-dl sudo apt-get remove -y youtube-dl
Afterwards, simply follow [our manual installation instructions](https://ytdl-org.github.io/youtube-dl/download.html): Afterwards, simply follow [our manual installation instructions](https://rg3.github.io/youtube-dl/download.html):
``` ```
sudo wget https://yt-dl.org/downloads/latest/youtube-dl -O /usr/local/bin/youtube-dl sudo wget https://yt-dl.org/latest/youtube-dl -O /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl sudo chmod a+x /usr/local/bin/youtube-dl
hash -r hash -r
``` ```
@ -783,7 +767,7 @@ Most people asking this question are not aware that youtube-dl now defaults to d
### I get HTTP error 402 when trying to download a video. What's this? ### I get HTTP error 402 when trying to download a video. What's this?
Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering to provide a way to let you solve the CAPTCHA](https://github.com/ytdl-org/youtube-dl/issues/154), but at the moment, your best course of action is pointing a web browser to the youtube URL, solving the CAPTCHA, and restart youtube-dl. Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering to provide a way to let you solve the CAPTCHA](https://github.com/rg3/youtube-dl/issues/154), but at the moment, your best course of action is pointing a web browser to the youtube URL, solving the CAPTCHA, and restart youtube-dl.
### Do I need any other programs? ### Do I need any other programs?
@ -835,9 +819,7 @@ In February 2015, the new YouTube player contained a character sequence in a str
### HTTP Error 429: Too Many Requests or 402: Payment Required ### HTTP Error 429: Too Many Requests or 402: Payment Required
These two error codes indicate that the service is blocking your IP address because of overuse. Usually this is a soft block meaning that you can gain access again after solving CAPTCHA. Just open a browser and solve a CAPTCHA the service suggests you and after that [pass cookies](#how-do-i-pass-cookies-to-youtube-dl) to youtube-dl. Note that if your machine has multiple external IPs then you should also pass exactly the same IP you've used for solving CAPTCHA with [`--source-address`](#network-options). Also you may need to pass a `User-Agent` HTTP header of your browser with [`--user-agent`](#workarounds). These two error codes indicate that the service is blocking your IP address because of overuse. Contact the service and ask them to unblock your IP address, or - if you have acquired a whitelisted IP address already - use the [`--proxy` or `--source-address` options](#network-options) to select another IP address.
If this is not the case (no CAPTCHA suggested to solve by the service) then you can contact the service and ask them to unblock your IP address, or - if you have acquired a whitelisted IP address already - use the [`--proxy` or `--source-address` options](#network-options) to select another IP address.
### SyntaxError: Non-ASCII character ### SyntaxError: Non-ASCII character
@ -850,7 +832,7 @@ means you're using an outdated version of Python. Please update to Python 2.6 or
### What is this binary file? Where has the code gone? ### What is this binary file? Where has the code gone?
Since June 2012 ([#342](https://github.com/ytdl-org/youtube-dl/issues/342)) youtube-dl is packed as an executable zipfile, simply unzip it (might need renaming to `youtube-dl.zip` first on some systems) or clone the git repository, as laid out above. If you modify the code, you can run it by executing the `__main__.py` file. To recompile the executable, run `make youtube-dl`. Since June 2012 ([#342](https://github.com/rg3/youtube-dl/issues/342)) youtube-dl is packed as an executable zipfile, simply unzip it (might need renaming to `youtube-dl.zip` first on some systems) or clone the git repository, as laid out above. If you modify the code, you can run it by executing the `__main__.py` file. To recompile the executable, run `make youtube-dl`.
### The exe throws an error due to missing `MSVCR100.dll` ### The exe throws an error due to missing `MSVCR100.dll`
@ -879,9 +861,9 @@ Either prepend `https://www.youtube.com/watch?v=` or separate the ID from the op
Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`. Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [cookies.txt](https://addons.mozilla.org/en-US/firefox/addon/cookies-txt/) (for Firefox). In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [Export Cookies](https://addons.mozilla.org/en-US/firefox/addon/export-cookies/) (for Firefox).
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, macOS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format. Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, Mac OS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly. Another use case is working around [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare). Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly. Another use case is working around [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare).
@ -909,7 +891,7 @@ When youtube-dl detects an HLS video, it can download it either with the built-i
When youtube-dl knows that one particular downloader works better for a given website, that downloader will be picked. Otherwise, youtube-dl will pick the best downloader for general compatibility, which at the moment happens to be ffmpeg. This choice may change in future versions of youtube-dl, with improvements of the built-in downloader and/or ffmpeg. When youtube-dl knows that one particular downloader works better for a given website, that downloader will be picked. Otherwise, youtube-dl will pick the best downloader for general compatibility, which at the moment happens to be ffmpeg. This choice may change in future versions of youtube-dl, with improvements of the built-in downloader and/or ffmpeg.
In particular, the generic extractor (used when your website is not in the [list of supported sites by youtube-dl](https://ytdl-org.github.io/youtube-dl/supportedsites.html) cannot mandate one specific downloader. In particular, the generic extractor (used when your website is not in the [list of supported sites by youtube-dl](https://rg3.github.io/youtube-dl/supportedsites.html) cannot mandate one specific downloader.
If you put either `--hls-prefer-native` or `--hls-prefer-ffmpeg` into your configuration, a different subset of videos will fail to download correctly. Instead, it is much better to [file an issue](https://yt-dl.org/bug) or a pull request which details why the native or the ffmpeg HLS downloader is a better choice for your use case. If you put either `--hls-prefer-native` or `--hls-prefer-ffmpeg` into your configuration, a different subset of videos will fail to download correctly. Instead, it is much better to [file an issue](https://yt-dl.org/bug) or a pull request which details why the native or the ffmpeg HLS downloader is a better choice for your use case.
@ -949,7 +931,7 @@ youtube-dl is an open-source project manned by too few volunteers, so we'd rathe
# DEVELOPER INSTRUCTIONS # DEVELOPER INSTRUCTIONS
Most users do not need to build youtube-dl and can [download the builds](https://ytdl-org.github.io/youtube-dl/download.html) or get them from their distribution. Most users do not need to build youtube-dl and can [download the builds](https://rg3.github.io/youtube-dl/download.html) or get them from their distribution.
To run youtube-dl as a developer, you don't need to build anything either. Simply execute To run youtube-dl as a developer, you don't need to build anything either. Simply execute
@ -977,7 +959,7 @@ If you want to add support for a new site, first of all **make sure** this site
After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`): After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/ytdl-org/youtube-dl/fork) 1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with: 2. Check out the source code with:
git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git
@ -1029,22 +1011,18 @@ After you have ensured this site is distributing its content legally, you can fo
# TODO more properties (see youtube_dl/extractor/common.py) # TODO more properties (see youtube_dl/extractor/common.py)
} }
``` ```
5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/extractors.py). 5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. 6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want. 7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252). Add tests and code for as many as you want.
8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](https://flake8.pycqa.org/en/latest/index.html#quickstart): 8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](https://pypi.python.org/pypi/flake8). Also make sure your code works under all [Python](https://www.python.org/) versions claimed supported by youtube-dl, namely 2.6, 2.7, and 3.2+.
9. When the tests pass, [add](https://git-scm.com/docs/git-add) the new files and [commit](https://git-scm.com/docs/git-commit) them and [push](https://git-scm.com/docs/git-push) the result, like this:
$ flake8 youtube_dl/extractor/yourextractor.py
9. Make sure your code works under all [Python](https://www.python.org/) versions claimed supported by youtube-dl, namely 2.6, 2.7, and 3.2+.
10. When the tests pass, [add](https://git-scm.com/docs/git-add) the new files and [commit](https://git-scm.com/docs/git-commit) them and [push](https://git-scm.com/docs/git-push) the result, like this:
$ git add youtube_dl/extractor/extractors.py $ git add youtube_dl/extractor/extractors.py
$ git add youtube_dl/extractor/yourextractor.py $ git add youtube_dl/extractor/yourextractor.py
$ git commit -m '[yourextractor] Add new extractor' $ git commit -m '[yourextractor] Add new extractor'
$ git push origin yourextractor $ git push origin yourextractor
11. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it. 10. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
In any case, thank you very much for your contributions! In any case, thank you very much for your contributions!
@ -1056,7 +1034,7 @@ Extractors are very fragile by nature since they depend on the layout of the sou
### Mandatory and optional metafields ### Mandatory and optional metafields
For extraction to work youtube-dl relies on metadata your extractor extracts and provides to youtube-dl expressed by an [information dictionary](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303) or simply *info dict*. Only the following meta fields in the *info dict* are considered mandatory for a successful extraction process by youtube-dl: For extraction to work youtube-dl relies on metadata your extractor extracts and provides to youtube-dl expressed by an [information dictionary](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L75-L257) or simply *info dict*. Only the following meta fields in the *info dict* are considered mandatory for a successful extraction process by youtube-dl:
- `id` (media identifier) - `id` (media identifier)
- `title` (media title) - `title` (media title)
@ -1064,7 +1042,7 @@ For extraction to work youtube-dl relies on metadata your extractor extracts and
In fact only the last option is technically mandatory (i.e. if you can't figure out the download location of the media the extraction does not make any sense). But by convention youtube-dl also treats `id` and `title` as mandatory. Thus the aforementioned metafields are the critical data that the extraction does not make any sense without and if any of them fail to be extracted then the extractor is considered completely broken. In fact only the last option is technically mandatory (i.e. if you can't figure out the download location of the media the extraction does not make any sense). But by convention youtube-dl also treats `id` and `title` as mandatory. Thus the aforementioned metafields are the critical data that the extraction does not make any sense without and if any of them fail to be extracted then the extractor is considered completely broken.
[Any field](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L188-L303) apart from the aforementioned ones are considered **optional**. That means that extraction should be **tolerant** to situations when sources for these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. [Any field](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L149-L257) apart from the aforementioned ones are considered **optional**. That means that extraction should be **tolerant** to situations when sources for these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields.
#### Example #### Example
@ -1140,33 +1118,11 @@ title = meta.get('title') or self._og_search_title(webpage)
This code will try to extract from `meta` first and if it fails it will try extracting `og:title` from a `webpage`. This code will try to extract from `meta` first and if it fails it will try extracting `og:title` from a `webpage`.
### Regular expressions ### Make regular expressions flexible
#### Don't capture groups you don't use When using regular expressions try to write them fuzzy and flexible.
Capturing group must be an indication that it's used somewhere in the code. Any group that is not used must be non capturing.
##### Example
Don't capture id attribute name here since you can't use it for anything anyway.
Correct:
```python
r'(?:id|ID)=(?P<id>\d+)'
```
Incorrect:
```python
r'(id|ID)=(?P<id>\d+)'
```
#### Make regular expressions relaxed and flexible
When using regular expressions try to write them fuzzy, relaxed and flexible, skipping insignificant parts that are more likely to change, allowing both single and double quotes for quoted values and so on.
##### Example #### Example
Say you need to extract `title` from the following HTML code: Say you need to extract `title` from the following HTML code:
@ -1199,121 +1155,13 @@ title = self._search_regex(
webpage, 'title', group='title') webpage, 'title', group='title')
``` ```
### Long lines policy ### Use safe conversion functions
There is a soft limit to keep lines of code under 80 characters long. This means it should be respected if possible and if it does not make readability and code maintenance worse. Wrap all extracted numeric data into safe functions from `utils`: `int_or_none`, `float_or_none`. Use them for string to number conversions as well.
For example, you should **never** split long string literals like URLs or some other often copied entities over multiple lines to fit this limit:
Correct:
```python
'https://www.youtube.com/watch?v=FqZTN594JQw&list=PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
```
Incorrect:
```python
'https://www.youtube.com/watch?v=FqZTN594JQw&list='
'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
```
### Inline values
Extracting variables is acceptable for reducing code duplication and improving readability of complex expressions. However, you should avoid extracting variables used only once and moving them to opposite parts of the extractor file, which makes reading the linear flow difficult.
#### Example
Correct:
```python
title = self._html_search_regex(r'<title>([^<]+)</title>', webpage, 'title')
```
Incorrect:
```python
TITLE_RE = r'<title>([^<]+)</title>'
# ...some lines of code...
title = self._html_search_regex(TITLE_RE, webpage, 'title')
```
### Collapse fallbacks
Multiple fallback values can quickly become unwieldy. Collapse multiple fallback values into a single expression via a list of patterns.
#### Example
Good:
```python
description = self._html_search_meta(
['og:description', 'description', 'twitter:description'],
webpage, 'description', default=None)
```
Unwieldy:
```python
description = (
self._og_search_description(webpage, default=None)
or self._html_search_meta('description', webpage, default=None)
or self._html_search_meta('twitter:description', webpage, default=None))
```
Methods supporting list of patterns are: `_search_regex`, `_html_search_regex`, `_og_search_property`, `_html_search_meta`.
### Trailing parentheses
Always move trailing parentheses after the last argument.
#### Example
Correct:
```python
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list)
```
Incorrect:
```python
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list,
)
```
### Use convenience conversion and parsing functions
Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well.
Use `url_or_none` for safe URL processing.
Use `try_get` for safe metadata extraction from parsed JSON.
Use `unified_strdate` for uniform `upload_date` or any `YYYYMMDD` meta field extraction, `unified_timestamp` for uniform `timestamp` extraction, `parse_filesize` for `filesize` extraction, `parse_count` for count meta fields extraction, `parse_resolution`, `parse_duration` for `duration` extraction, `parse_age_limit` for `age_limit` extraction.
Explore [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py) for more useful convenience functions.
#### More examples
##### Safely extract optional description from parsed JSON
```python
description = try_get(response, lambda x: x['result']['video'][0]['summary'], compat_str)
```
##### Safely extract more optional metadata
```python
video = try_get(response, lambda x: x['result']['video'][0], dict) or {}
description = video.get('summary')
duration = float_or_none(video.get('durationMs'), scale=1000)
view_count = int_or_none(video.get('views'))
```
# EMBEDDING YOUTUBE-DL # EMBEDDING YOUTUBE-DL
youtube-dl makes the best effort to be a good command-line program, and thus should be callable from any programming language. If you encounter any problems parsing its output, feel free to [create a report](https://github.com/ytdl-org/youtube-dl/issues/new). youtube-dl makes the best effort to be a good command-line program, and thus should be callable from any programming language. If you encounter any problems parsing its output, feel free to [create a report](https://github.com/rg3/youtube-dl/issues/new).
From a Python program, you can embed youtube-dl in a more powerful fashion, like this: From a Python program, you can embed youtube-dl in a more powerful fashion, like this:
@ -1326,7 +1174,7 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc']) ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
``` ```
Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/ytdl-org/youtube-dl/blob/3e4cedf9e8cd3157df2457df7274d0c842421945/youtube_dl/YoutubeDL.py#L137-L312). For a start, if you want to intercept youtube-dl's output, set a `logger` object. Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/3e4cedf9e8cd3157df2457df7274d0c842421945/youtube_dl/YoutubeDL.py#L137-L312). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file: Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:
@ -1367,7 +1215,7 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
# BUGS # BUGS
Bugs and suggestions should be reported at: <https://github.com/ytdl-org/youtube-dl/issues>. Unless you were prompted to or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the IRC channel [#youtube-dl](irc://chat.freenode.net/#youtube-dl) on freenode ([webchat](https://webchat.freenode.net/?randomnick=1&channels=youtube-dl)). Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues>. Unless you were prompted to or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the IRC channel [#youtube-dl](irc://chat.freenode.net/#youtube-dl) on freenode ([webchat](https://webchat.freenode.net/?randomnick=1&channels=youtube-dl)).
**Please include the full output of youtube-dl when run with `-v`**, i.e. **add** `-v` flag to **your command line**, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this: **Please include the full output of youtube-dl when run with `-v`**, i.e. **add** `-v` flag to **your command line**, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
``` ```
@ -1413,11 +1261,11 @@ Before reporting any issue, type `youtube-dl -U`. This should report that you're
### Is the issue already documented? ### Is the issue already documented?
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/ytdl-org/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity. Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/rg3/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
### Why are existing options not enough? ### Why are existing options not enough?
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem. Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
### Is there enough context in your bug report? ### Is there enough context in your bug report?

View file

@ -322,7 +322,7 @@ class GITBuilder(GITInfoBuilder):
class YoutubeDLBuilder(object): class YoutubeDLBuilder(object):
authorizedUsers = ['fraca7', 'phihag', 'rg3', 'FiloSottile', 'ytdl-org'] authorizedUsers = ['fraca7', 'phihag', 'rg3', 'FiloSottile']
def __init__(self, **kwargs): def __init__(self, **kwargs):
if self.repoName != 'youtube-dl': if self.repoName != 'youtube-dl':

View file

@ -45,12 +45,12 @@ for test in gettestcases():
RESULT = ('.' + domain + '\n' in LIST or '\n' + domain + '\n' in LIST) RESULT = ('.' + domain + '\n' in LIST or '\n' + domain + '\n' in LIST)
if RESULT and ('info_dict' not in test or 'age_limit' not in test['info_dict'] if RESULT and ('info_dict' not in test or 'age_limit' not in test['info_dict'] or
or test['info_dict']['age_limit'] != 18): test['info_dict']['age_limit'] != 18):
print('\nPotential missing age_limit check: {0}'.format(test['name'])) print('\nPotential missing age_limit check: {0}'.format(test['name']))
elif not RESULT and ('info_dict' in test and 'age_limit' in test['info_dict'] elif not RESULT and ('info_dict' in test and 'age_limit' in test['info_dict'] and
and test['info_dict']['age_limit'] == 18): test['info_dict']['age_limit'] == 18):
print('\nPotential false negative: {0}'.format(test['name'])) print('\nPotential false negative: {0}'.format(test['name']))
else: else:

View file

@ -1,6 +1,7 @@
#!/usr/bin/env python #!/usr/bin/env python
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
import io import io
import json import json
import mimetypes import mimetypes
@ -14,6 +15,7 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.compat import ( from youtube_dl.compat import (
compat_basestring, compat_basestring,
compat_input,
compat_getpass, compat_getpass,
compat_print, compat_print,
compat_urllib_request, compat_urllib_request,
@ -25,8 +27,8 @@ from youtube_dl.utils import (
class GitHubReleaser(object): class GitHubReleaser(object):
_API_URL = 'https://api.github.com/repos/ytdl-org/youtube-dl/releases' _API_URL = 'https://api.github.com/repos/rg3/youtube-dl/releases'
_UPLOADS_URL = 'https://uploads.github.com/repos/ytdl-org/youtube-dl/releases/%s/assets?name=%s' _UPLOADS_URL = 'https://uploads.github.com/repos/rg3/youtube-dl/releases/%s/assets?name=%s'
_NETRC_MACHINE = 'github.com' _NETRC_MACHINE = 'github.com'
def __init__(self, debuglevel=0): def __init__(self, debuglevel=0):
@ -38,20 +40,28 @@ class GitHubReleaser(object):
try: try:
info = netrc.netrc().authenticators(self._NETRC_MACHINE) info = netrc.netrc().authenticators(self._NETRC_MACHINE)
if info is not None: if info is not None:
self._token = info[2] self._username = info[0]
self._password = info[2]
compat_print('Using GitHub credentials found in .netrc...') compat_print('Using GitHub credentials found in .netrc...')
return return
else: else:
compat_print('No GitHub credentials found in .netrc') compat_print('No GitHub credentials found in .netrc')
except (IOError, netrc.NetrcParseError): except (IOError, netrc.NetrcParseError):
compat_print('Unable to parse .netrc') compat_print('Unable to parse .netrc')
self._token = compat_getpass( self._username = compat_input(
'Type your GitHub PAT (personal access token) and press [Return]: ') 'Type your GitHub username or email address and press [Return]: ')
self._password = compat_getpass(
'Type your GitHub password and press [Return]: ')
def _call(self, req): def _call(self, req):
if isinstance(req, compat_basestring): if isinstance(req, compat_basestring):
req = sanitized_Request(req) req = sanitized_Request(req)
req.add_header('Authorization', 'token %s' % self._token) # Authorizing manually since GitHub does not response with 401 with
# WWW-Authenticate header set (see
# https://developer.github.com/v3/#basic-authentication)
b64 = base64.b64encode(
('%s:%s' % (self._username, self._password)).encode('utf-8')).decode('ascii')
req.add_header('Authorization', 'Basic %s' % b64)
response = self._opener.open(req).read().decode('utf-8') response = self._opener.open(req).read().decode('utf-8')
return json.loads(response) return json.loads(response)

View file

@ -1,22 +1,27 @@
#!/usr/bin/env python3 #!/usr/bin/env python3
from __future__ import unicode_literals from __future__ import unicode_literals
import hashlib
import urllib.request
import json import json
versions_info = json.load(open('update/versions.json')) versions_info = json.load(open('update/versions.json'))
version = versions_info['latest'] version = versions_info['latest']
version_dict = versions_info['versions'][version] URL = versions_info['versions'][version]['bin'][0]
data = urllib.request.urlopen(URL).read()
# Read template page # Read template page
with open('download.html.in', 'r', encoding='utf-8') as tmplf: with open('download.html.in', 'r', encoding='utf-8') as tmplf:
template = tmplf.read() template = tmplf.read()
sha256sum = hashlib.sha256(data).hexdigest()
template = template.replace('@PROGRAM_VERSION@', version) template = template.replace('@PROGRAM_VERSION@', version)
template = template.replace('@PROGRAM_URL@', version_dict['bin'][0]) template = template.replace('@PROGRAM_URL@', URL)
template = template.replace('@PROGRAM_SHA256SUM@', version_dict['bin'][1]) template = template.replace('@PROGRAM_SHA256SUM@', sha256sum)
template = template.replace('@EXE_URL@', version_dict['exe'][0]) template = template.replace('@EXE_URL@', versions_info['versions'][version]['exe'][0])
template = template.replace('@EXE_SHA256SUM@', version_dict['exe'][1]) template = template.replace('@EXE_SHA256SUM@', versions_info['versions'][version]['exe'][1])
template = template.replace('@TAR_URL@', version_dict['tar'][0]) template = template.replace('@TAR_URL@', versions_info['versions'][version]['tar'][0])
template = template.replace('@TAR_SHA256SUM@', version_dict['tar'][1]) template = template.replace('@TAR_SHA256SUM@', versions_info['versions'][version]['tar'][1])
with open('download.html', 'w', encoding='utf-8') as dlf: with open('download.html', 'w', encoding='utf-8') as dlf:
dlf.write(template) dlf.write(template)

View file

@ -13,7 +13,7 @@ year = str(datetime.datetime.now().year)
for fn in glob.glob('*.html*'): for fn in glob.glob('*.html*'):
with io.open(fn, encoding='utf-8') as f: with io.open(fn, encoding='utf-8') as f:
content = f.read() content = f.read()
newc = re.sub(r'(?P<copyright>Copyright © 2011-)(?P<year>[0-9]{4})', 'Copyright © 2011-' + year, content) newc = re.sub(r'(?P<copyright>Copyright © 2006-)(?P<year>[0-9]{4})', 'Copyright © 2006-' + year, content)
if content != newc: if content != newc:
tmpFn = fn + '.part' tmpFn = fn + '.part'
with io.open(tmpFn, 'wt', encoding='utf-8') as outf: with io.open(tmpFn, 'wt', encoding='utf-8') as outf:

View file

@ -10,7 +10,7 @@ import textwrap
atom_template = textwrap.dedent("""\ atom_template = textwrap.dedent("""\
<?xml version="1.0" encoding="utf-8"?> <?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"> <feed xmlns="http://www.w3.org/2005/Atom">
<link rel="self" href="http://ytdl-org.github.io/youtube-dl/update/releases.atom" /> <link rel="self" href="http://rg3.github.io/youtube-dl/update/releases.atom" />
<title>youtube-dl releases</title> <title>youtube-dl releases</title>
<id>https://yt-dl.org/feed/youtube-dl-updates-feed</id> <id>https://yt-dl.org/feed/youtube-dl-updates-feed</id>
<updated>@TIMESTAMP@</updated> <updated>@TIMESTAMP@</updated>
@ -21,7 +21,7 @@ entry_template = textwrap.dedent("""
<entry> <entry>
<id>https://yt-dl.org/feed/youtube-dl-updates-feed/youtube-dl-@VERSION@</id> <id>https://yt-dl.org/feed/youtube-dl-updates-feed/youtube-dl-@VERSION@</id>
<title>New version @VERSION@</title> <title>New version @VERSION@</title>
<link href="http://ytdl-org.github.io/youtube-dl" /> <link href="http://rg3.github.io/youtube-dl" />
<content type="xhtml"> <content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml"> <div xmlns="http://www.w3.org/1999/xhtml">
Downloads available at <a href="https://yt-dl.org/downloads/@VERSION@/">https://yt-dl.org/downloads/@VERSION@/</a> Downloads available at <a href="https://yt-dl.org/downloads/@VERSION@/">https://yt-dl.org/downloads/@VERSION@/</a>

View file

@ -78,8 +78,8 @@ sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py
sed -i "s/<unreleased>/$version/" ChangeLog sed -i "s/<unreleased>/$version/" ChangeLog
/bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..." /bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..."
make README.md CONTRIBUTING.md issuetemplates supportedsites make README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md supportedsites
git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE/1_broken_site.md .github/ISSUE_TEMPLATE/2_site_support_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md .github/ISSUE_TEMPLATE/4_bug_report.md .github/ISSUE_TEMPLATE/5_feature_request.md .github/ISSUE_TEMPLATE/6_question.md docs/supportedsites.md youtube_dl/version.py ChangeLog git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md docs/supportedsites.md youtube_dl/version.py ChangeLog
git commit $gpg_sign_commits -m "release $version" git commit $gpg_sign_commits -m "release $version"
/bin/echo -e "\n### Now tagging, signing and pushing..." /bin/echo -e "\n### Now tagging, signing and pushing..."
@ -96,7 +96,7 @@ git push origin "$version"
REV=$(git rev-parse HEAD) REV=$(git rev-parse HEAD)
make youtube-dl youtube-dl.tar.gz make youtube-dl youtube-dl.tar.gz
read -p "VM running? (y/n) " -n 1 read -p "VM running? (y/n) " -n 1
wget "http://$buildserver/build/ytdl-org/youtube-dl/youtube-dl.exe?rev=$REV" -O youtube-dl.exe wget "http://$buildserver/build/rg3/youtube-dl/youtube-dl.exe?rev=$REV" -O youtube-dl.exe
mkdir -p "build/$version" mkdir -p "build/$version"
mv youtube-dl youtube-dl.exe "build/$version" mv youtube-dl youtube-dl.exe "build/$version"
mv youtube-dl.tar.gz "build/$version/youtube-dl-$version.tar.gz" mv youtube-dl.tar.gz "build/$version/youtube-dl-$version.tar.gz"

View file

@ -24,7 +24,7 @@ total_bytes = 0
for page in itertools.count(1): for page in itertools.count(1):
releases = json.loads(compat_urllib_request.urlopen( releases = json.loads(compat_urllib_request.urlopen(
'https://api.github.com/repos/ytdl-org/youtube-dl/releases?page=%s' % page 'https://api.github.com/repos/rg3/youtube-dl/releases?page=%s' % page
).read().decode('utf-8')) ).read().decode('utf-8'))
if not releases: if not releases:

File diff suppressed because it is too large Load diff

View file

@ -2,5 +2,5 @@
universal = True universal = True
[flake8] [flake8]
exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git,venv exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git
ignore = E402,E501,E731,E741,W503 ignore = E402,E501,E731

View file

@ -104,7 +104,7 @@ setup(
version=__version__, version=__version__,
description=DESCRIPTION, description=DESCRIPTION,
long_description=LONG_DESCRIPTION, long_description=LONG_DESCRIPTION,
url='https://github.com/ytdl-org/youtube-dl', url='https://github.com/rg3/youtube-dl',
author='Ricardo Garcia', author='Ricardo Garcia',
author_email='ytdl@yt-dl.org', author_email='ytdl@yt-dl.org',
maintainer='Sergey M.', maintainer='Sergey M.',
@ -124,8 +124,6 @@ setup(
'Development Status :: 5 - Production/Stable', 'Development Status :: 5 - Production/Stable',
'Environment :: Console', 'Environment :: Console',
'License :: Public Domain', 'License :: Public Domain',
'Programming Language :: Python',
'Programming Language :: Python :: 2',
'Programming Language :: Python :: 2.6', 'Programming Language :: Python :: 2.6',
'Programming Language :: Python :: 2.7', 'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3', 'Programming Language :: Python :: 3',
@ -134,13 +132,6 @@ setup(
'Programming Language :: Python :: 3.4', 'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5', 'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6', 'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.8',
'Programming Language :: Python :: Implementation',
'Programming Language :: Python :: Implementation :: CPython',
'Programming Language :: Python :: Implementation :: IronPython',
'Programming Language :: Python :: Implementation :: Jython',
'Programming Language :: Python :: Implementation :: PyPy',
], ],
cmdclass={'build_lazy_extractors': build_lazy_extractors}, cmdclass={'build_lazy_extractors': build_lazy_extractors},

View file

@ -7,7 +7,6 @@ import json
import os.path import os.path
import re import re
import types import types
import ssl
import sys import sys
import youtube_dl.extractor import youtube_dl.extractor
@ -153,27 +152,15 @@ def expect_value(self, got, expected, field):
isinstance(got, compat_str), isinstance(got, compat_str),
'Expected field %s to be a unicode object, but got value %r of type %r' % (field, got, type(got))) 'Expected field %s to be a unicode object, but got value %r of type %r' % (field, got, type(got)))
got = 'md5:' + md5(got) got = 'md5:' + md5(got)
elif isinstance(expected, compat_str) and re.match(r'^(?:min|max)?count:\d+', expected): elif isinstance(expected, compat_str) and expected.startswith('mincount:'):
self.assertTrue( self.assertTrue(
isinstance(got, (list, dict)), isinstance(got, (list, dict)),
'Expected field %s to be a list or a dict, but it is of type %s' % ( 'Expected field %s to be a list or a dict, but it is of type %s' % (
field, type(got).__name__)) field, type(got).__name__))
op, _, expected_num = expected.partition(':') expected_num = int(expected.partition(':')[2])
expected_num = int(expected_num) assertGreaterEqual(
if op == 'mincount':
assert_func = assertGreaterEqual
msg_tmpl = 'Expected %d items in field %s, but only got %d'
elif op == 'maxcount':
assert_func = assertLessEqual
msg_tmpl = 'Expected maximum %d items in field %s, but got %d'
elif op == 'count':
assert_func = assertEqual
msg_tmpl = 'Expected exactly %d items in field %s, but got %d'
else:
assert False
assert_func(
self, len(got), expected_num, self, len(got), expected_num,
msg_tmpl % (expected_num, field, len(got))) 'Expected %d items in field %s, but only got %d' % (expected_num, field, len(got)))
return return
self.assertEqual( self.assertEqual(
expected, got, expected, got,
@ -249,20 +236,6 @@ def assertGreaterEqual(self, got, expected, msg=None):
self.assertTrue(got >= expected, msg) self.assertTrue(got >= expected, msg)
def assertLessEqual(self, got, expected, msg=None):
if not (got <= expected):
if msg is None:
msg = '%r not less than or equal to %r' % (got, expected)
self.assertTrue(got <= expected, msg)
def assertEqual(self, got, expected, msg=None):
if not (got == expected):
if msg is None:
msg = '%r not equal to %r' % (got, expected)
self.assertTrue(got == expected, msg)
def expect_warnings(ydl, warnings_re): def expect_warnings(ydl, warnings_re):
real_warning = ydl.report_warning real_warning = ydl.report_warning
@ -271,12 +244,3 @@ def expect_warnings(ydl, warnings_re):
real_warning(w) real_warning(w)
ydl.report_warning = _report_warning ydl.report_warning = _report_warning
def http_server_port(httpd):
if os.name == 'java' and isinstance(httpd.socket, ssl.SSLSocket):
# In Jython SSLSocket is not a subclass of socket.socket
sock = httpd.socket.sock
else:
sock = httpd.socket
return sock.getsockname()[1]

View file

@ -9,30 +9,11 @@ import sys
import unittest import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import FakeYDL, expect_dict, expect_value, http_server_port from test.helper import FakeYDL, expect_dict, expect_value
from youtube_dl.compat import compat_etree_fromstring, compat_http_server from youtube_dl.compat import compat_etree_fromstring
from youtube_dl.extractor.common import InfoExtractor from youtube_dl.extractor.common import InfoExtractor
from youtube_dl.extractor import YoutubeIE, get_info_extractor from youtube_dl.extractor import YoutubeIE, get_info_extractor
from youtube_dl.utils import encode_data_uri, strip_jsonp, ExtractorError, RegexNotFoundError from youtube_dl.utils import encode_data_uri, strip_jsonp, ExtractorError, RegexNotFoundError
import threading
TEAPOT_RESPONSE_STATUS = 418
TEAPOT_RESPONSE_BODY = "<h1>418 I'm a teapot</h1>"
class InfoExtractorTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
def log_message(self, format, *args):
pass
def do_GET(self):
if self.path == '/teapot':
self.send_response(TEAPOT_RESPONSE_STATUS)
self.send_header('Content-Type', 'text/html; charset=utf-8')
self.end_headers()
self.wfile.write(TEAPOT_RESPONSE_BODY.encode())
else:
assert False
class TestIE(InfoExtractor): class TestIE(InfoExtractor):
@ -61,7 +42,6 @@ class TestInfoExtractor(unittest.TestCase):
<meta content='Foo' property=og:foobar> <meta content='Foo' property=og:foobar>
<meta name="og:test1" content='foo > < bar'/> <meta name="og:test1" content='foo > < bar'/>
<meta name="og:test2" content="foo >//< bar"/> <meta name="og:test2" content="foo >//< bar"/>
<meta property=og-test3 content='Ill-formatted opengraph'/>
''' '''
self.assertEqual(ie._og_search_title(html), 'Foo') self.assertEqual(ie._og_search_title(html), 'Foo')
self.assertEqual(ie._og_search_description(html), 'Some video\'s description ') self.assertEqual(ie._og_search_description(html), 'Some video\'s description ')
@ -70,7 +50,6 @@ class TestInfoExtractor(unittest.TestCase):
self.assertEqual(ie._og_search_property('foobar', html), 'Foo') self.assertEqual(ie._og_search_property('foobar', html), 'Foo')
self.assertEqual(ie._og_search_property('test1', html), 'foo > < bar') self.assertEqual(ie._og_search_property('test1', html), 'foo > < bar')
self.assertEqual(ie._og_search_property('test2', html), 'foo >//< bar') self.assertEqual(ie._og_search_property('test2', html), 'foo >//< bar')
self.assertEqual(ie._og_search_property('test3', html), 'Ill-formatted opengraph')
self.assertEqual(ie._og_search_property(('test0', 'test1'), html), 'foo > < bar') self.assertEqual(ie._og_search_property(('test0', 'test1'), html), 'foo > < bar')
self.assertRaises(RegexNotFoundError, ie._og_search_property, 'test0', html, None, fatal=True) self.assertRaises(RegexNotFoundError, ie._og_search_property, 'test0', html, None, fatal=True)
self.assertRaises(RegexNotFoundError, ie._og_search_property, ('test0', 'test00'), html, None, fatal=True) self.assertRaises(RegexNotFoundError, ie._og_search_property, ('test0', 'test00'), html, None, fatal=True)
@ -107,184 +86,6 @@ class TestInfoExtractor(unittest.TestCase):
self.assertRaises(ExtractorError, self.ie._download_json, uri, None) self.assertRaises(ExtractorError, self.ie._download_json, uri, None)
self.assertEqual(self.ie._download_json(uri, None, fatal=False), None) self.assertEqual(self.ie._download_json(uri, None, fatal=False), None)
def test_parse_html5_media_entries(self):
# from https://www.r18.com/
# with kpbs in label
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.r18.com/',
r'''
<video id="samplevideo_amateur" class="js-samplevideo video-js vjs-default-skin vjs-big-play-centered" controls preload="auto" width="400" height="225" poster="//pics.r18.com/digital/amateur/mgmr105/mgmr105jp.jpg">
<source id="video_source" src="https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_sm_w.mp4" type="video/mp4" res="240" label="300kbps">
<source id="video_source" src="https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_dm_w.mp4" type="video/mp4" res="480" label="1000kbps">
<source id="video_source" src="https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_dmb_w.mp4" type="video/mp4" res="740" label="1500kbps">
<p>Your browser does not support the video tag.</p>
</video>
''', None)[0],
{
'formats': [{
'url': 'https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_sm_w.mp4',
'ext': 'mp4',
'format_id': '300kbps',
'height': 240,
'tbr': 300,
}, {
'url': 'https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_dm_w.mp4',
'ext': 'mp4',
'format_id': '1000kbps',
'height': 480,
'tbr': 1000,
}, {
'url': 'https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_dmb_w.mp4',
'ext': 'mp4',
'format_id': '1500kbps',
'height': 740,
'tbr': 1500,
}],
'thumbnail': '//pics.r18.com/digital/amateur/mgmr105/mgmr105jp.jpg'
})
# from https://www.csfd.cz/
# with width and height
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.csfd.cz/',
r'''
<video width="770" height="328" preload="none" controls poster="https://img.csfd.cz/files/images/film/video/preview/163/344/163344118_748d20.png?h360" >
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327358_eac647.mp4" type="video/mp4" width="640" height="360">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327360_3d2646.mp4" type="video/mp4" width="1280" height="720">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327356_91f258.mp4" type="video/mp4" width="1920" height="1080">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327359_962b4a.webm" type="video/webm" width="640" height="360">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327361_6feee0.webm" type="video/webm" width="1280" height="720">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327357_8ab472.webm" type="video/webm" width="1920" height="1080">
<track src="https://video.csfd.cz/files/subtitles/163/344/163344115_4c388b.srt" type="text/x-srt" kind="subtitles" srclang="cs" label="cs">
</video>
''', None)[0],
{
'formats': [{
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327358_eac647.mp4',
'ext': 'mp4',
'width': 640,
'height': 360,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327360_3d2646.mp4',
'ext': 'mp4',
'width': 1280,
'height': 720,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327356_91f258.mp4',
'ext': 'mp4',
'width': 1920,
'height': 1080,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327359_962b4a.webm',
'ext': 'webm',
'width': 640,
'height': 360,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327361_6feee0.webm',
'ext': 'webm',
'width': 1280,
'height': 720,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327357_8ab472.webm',
'ext': 'webm',
'width': 1920,
'height': 1080,
}],
'subtitles': {
'cs': [{'url': 'https://video.csfd.cz/files/subtitles/163/344/163344115_4c388b.srt'}]
},
'thumbnail': 'https://img.csfd.cz/files/images/film/video/preview/163/344/163344118_748d20.png?h360'
})
# from https://tamasha.com/v/Kkdjw
# with height in label
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://tamasha.com/v/Kkdjw',
r'''
<video crossorigin="anonymous">
<source src="https://s-v2.tamasha.com/statics/videos_file/19/8f/Kkdjw_198feff8577d0057536e905cce1fb61438dd64e0_n_240.mp4" type="video/mp4" label="AUTO" res="0"/>
<source src="https://s-v2.tamasha.com/statics/videos_file/19/8f/Kkdjw_198feff8577d0057536e905cce1fb61438dd64e0_n_240.mp4" type="video/mp4"
label="240p" res="240"/>
<source src="https://s-v2.tamasha.com/statics/videos_file/20/00/Kkdjw_200041c66f657fc967db464d156eafbc1ed9fe6f_n_144.mp4" type="video/mp4"
label="144p" res="144"/>
</video>
''', None)[0],
{
'formats': [{
'url': 'https://s-v2.tamasha.com/statics/videos_file/19/8f/Kkdjw_198feff8577d0057536e905cce1fb61438dd64e0_n_240.mp4',
}, {
'url': 'https://s-v2.tamasha.com/statics/videos_file/19/8f/Kkdjw_198feff8577d0057536e905cce1fb61438dd64e0_n_240.mp4',
'ext': 'mp4',
'format_id': '240p',
'height': 240,
}, {
'url': 'https://s-v2.tamasha.com/statics/videos_file/20/00/Kkdjw_200041c66f657fc967db464d156eafbc1ed9fe6f_n_144.mp4',
'ext': 'mp4',
'format_id': '144p',
'height': 144,
}]
})
# from https://www.directvnow.com
# with data-src
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.directvnow.com',
r'''
<video id="vid1" class="header--video-masked active" muted playsinline>
<source data-src="https://cdn.directv.com/content/dam/dtv/prod/website_directvnow-international/videos/DTVN_hdr_HBO_v3.mp4" type="video/mp4" />
</video>
''', None)[0],
{
'formats': [{
'ext': 'mp4',
'url': 'https://cdn.directv.com/content/dam/dtv/prod/website_directvnow-international/videos/DTVN_hdr_HBO_v3.mp4',
}]
})
# from https://www.directvnow.com
# with data-src
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.directvnow.com',
r'''
<video id="vid1" class="header--video-masked active" muted playsinline>
<source data-src="https://cdn.directv.com/content/dam/dtv/prod/website_directvnow-international/videos/DTVN_hdr_HBO_v3.mp4" type="video/mp4" />
</video>
''', None)[0],
{
'formats': [{
'url': 'https://cdn.directv.com/content/dam/dtv/prod/website_directvnow-international/videos/DTVN_hdr_HBO_v3.mp4',
'ext': 'mp4',
}]
})
# from https://www.klarna.com/uk/
# with data-video-src
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.directvnow.com',
r'''
<video loop autoplay muted class="responsive-video block-kl__video video-on-medium">
<source src="" data-video-desktop data-video-src="https://www.klarna.com/uk/wp-content/uploads/sites/11/2019/01/KL062_Smooth3_0_DogWalking_5s_920x080_.mp4" type="video/mp4" />
</video>
''', None)[0],
{
'formats': [{
'url': 'https://www.klarna.com/uk/wp-content/uploads/sites/11/2019/01/KL062_Smooth3_0_DogWalking_5s_920x080_.mp4',
'ext': 'mp4',
}],
})
def test_extract_jwplayer_data_realworld(self): def test_extract_jwplayer_data_realworld(self):
# from http://www.suffolk.edu/sjc/ # from http://www.suffolk.edu/sjc/
expect_dict( expect_dict(
@ -379,7 +180,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
def test_parse_m3u8_formats(self): def test_parse_m3u8_formats(self):
_TEST_CASES = [ _TEST_CASES = [
( (
# https://github.com/ytdl-org/youtube-dl/issues/11507 # https://github.com/rg3/youtube-dl/issues/11507
# http://pluzz.francetv.fr/videos/le_ministere.html # http://pluzz.francetv.fr/videos/le_ministere.html
'pluzz_francetv_11507', 'pluzz_francetv_11507',
'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais', 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
@ -441,7 +242,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
}] }]
), ),
( (
# https://github.com/ytdl-org/youtube-dl/issues/11995 # https://github.com/rg3/youtube-dl/issues/11995
# http://teamcoco.com/video/clueless-gamer-super-bowl-for-honor # http://teamcoco.com/video/clueless-gamer-super-bowl-for-honor
'teamcoco_11995', 'teamcoco_11995',
'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8', 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
@ -515,7 +316,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
}] }]
), ),
( (
# https://github.com/ytdl-org/youtube-dl/issues/12211 # https://github.com/rg3/youtube-dl/issues/12211
# http://video.toggle.sg/en/series/whoopie-s-world/ep3/478601 # http://video.toggle.sg/en/series/whoopie-s-world/ep3/478601
'toggle_mobile_12211', 'toggle_mobile_12211',
'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8', 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
@ -677,64 +478,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'width': 1280, 'width': 1280,
'height': 720, 'height': 720,
}] }]
), )
(
# https://github.com/ytdl-org/youtube-dl/issues/18923
# https://www.ted.com/talks/boris_hesser_a_grassroots_healthcare_revolution_in_africa
'ted_18923',
'http://hls.ted.com/talks/31241.m3u8',
[{
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/audio/600k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '600k-Audio',
'vcodec': 'none',
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/audio/600k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '68',
'vcodec': 'none',
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/64k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '163',
'acodec': 'none',
'width': 320,
'height': 180,
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/180k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '481',
'acodec': 'none',
'width': 512,
'height': 288,
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/320k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '769',
'acodec': 'none',
'width': 512,
'height': 288,
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/450k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '984',
'acodec': 'none',
'width': 512,
'height': 288,
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/600k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '1255',
'acodec': 'none',
'width': 640,
'height': 360,
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/950k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '1693',
'acodec': 'none',
'width': 853,
'height': 480,
}, {
'url': 'http://hls.ted.com/videos/BorisHesser_2018S/video/1500k.m3u8?nobumpers=true&uniqueId=76011e2b',
'format_id': '2462',
'acodec': 'none',
'width': 1280,
'height': 720,
}]
),
] ]
for m3u8_file, m3u8_url, expected_formats in _TEST_CASES: for m3u8_file, m3u8_url, expected_formats in _TEST_CASES:
@ -748,12 +492,11 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
def test_parse_mpd_formats(self): def test_parse_mpd_formats(self):
_TEST_CASES = [ _TEST_CASES = [
( (
# https://github.com/ytdl-org/youtube-dl/issues/13919 # https://github.com/rg3/youtube-dl/issues/13919
# Also tests duplicate representation ids, see # Also tests duplicate representation ids, see
# https://github.com/ytdl-org/youtube-dl/issues/15111 # https://github.com/rg3/youtube-dl/issues/15111
'float_duration', 'float_duration',
'http://unknown/manifest.mpd', # mpd_url 'http://unknown/manifest.mpd',
None, # mpd_base_url
[{ [{
'manifest_url': 'http://unknown/manifest.mpd', 'manifest_url': 'http://unknown/manifest.mpd',
'ext': 'm4a', 'ext': 'm4a',
@ -831,10 +574,9 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 1080, 'height': 1080,
}] }]
), ( ), (
# https://github.com/ytdl-org/youtube-dl/pull/14844 # https://github.com/rg3/youtube-dl/pull/14844
'urls_only', 'urls_only',
'http://unknown/manifest.mpd', # mpd_url 'http://unknown/manifest.mpd',
None, # mpd_base_url
[{ [{
'manifest_url': 'http://unknown/manifest.mpd', 'manifest_url': 'http://unknown/manifest.mpd',
'ext': 'mp4', 'ext': 'mp4',
@ -913,68 +655,22 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'width': 1920, 'width': 1920,
'height': 1080, 'height': 1080,
}] }]
), (
# https://github.com/ytdl-org/youtube-dl/issues/20346
# Media considered unfragmented even though it contains
# Initialization tag
'unfragmented',
'https://v.redd.it/hw1x7rcg7zl21/DASHPlaylist.mpd', # mpd_url
'https://v.redd.it/hw1x7rcg7zl21', # mpd_base_url
[{
'url': 'https://v.redd.it/hw1x7rcg7zl21/audio',
'manifest_url': 'https://v.redd.it/hw1x7rcg7zl21/DASHPlaylist.mpd',
'ext': 'm4a',
'format_id': 'AUDIO-1',
'format_note': 'DASH audio',
'container': 'm4a_dash',
'acodec': 'mp4a.40.2',
'vcodec': 'none',
'tbr': 129.87,
'asr': 48000,
}, {
'url': 'https://v.redd.it/hw1x7rcg7zl21/DASH_240',
'manifest_url': 'https://v.redd.it/hw1x7rcg7zl21/DASHPlaylist.mpd',
'ext': 'mp4',
'format_id': 'VIDEO-2',
'format_note': 'DASH video',
'container': 'mp4_dash',
'acodec': 'none',
'vcodec': 'avc1.4d401e',
'tbr': 608.0,
'width': 240,
'height': 240,
'fps': 30,
}, {
'url': 'https://v.redd.it/hw1x7rcg7zl21/DASH_360',
'manifest_url': 'https://v.redd.it/hw1x7rcg7zl21/DASHPlaylist.mpd',
'ext': 'mp4',
'format_id': 'VIDEO-1',
'format_note': 'DASH video',
'container': 'mp4_dash',
'acodec': 'none',
'vcodec': 'avc1.4d401e',
'tbr': 804.261,
'width': 360,
'height': 360,
'fps': 30,
}]
) )
] ]
for mpd_file, mpd_url, mpd_base_url, expected_formats in _TEST_CASES: for mpd_file, mpd_url, expected_formats in _TEST_CASES:
with io.open('./test/testdata/mpd/%s.mpd' % mpd_file, with io.open('./test/testdata/mpd/%s.mpd' % mpd_file,
mode='r', encoding='utf-8') as f: mode='r', encoding='utf-8') as f:
formats = self.ie._parse_mpd_formats( formats = self.ie._parse_mpd_formats(
compat_etree_fromstring(f.read().encode('utf-8')), compat_etree_fromstring(f.read().encode('utf-8')),
mpd_base_url=mpd_base_url, mpd_url=mpd_url) mpd_url=mpd_url)
self.ie._sort_formats(formats) self.ie._sort_formats(formats)
expect_value(self, formats, expected_formats, None) expect_value(self, formats, expected_formats, None)
def test_parse_f4m_formats(self): def test_parse_f4m_formats(self):
_TEST_CASES = [ _TEST_CASES = [
( (
# https://github.com/ytdl-org/youtube-dl/issues/14660 # https://github.com/rg3/youtube-dl/issues/14660
'custom_base_url', 'custom_base_url',
'http://api.new.livestream.com/accounts/6115179/events/6764928/videos/144884262.f4m', 'http://api.new.livestream.com/accounts/6115179/events/6764928/videos/144884262.f4m',
[{ [{
@ -998,74 +694,6 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
self.ie._sort_formats(formats) self.ie._sort_formats(formats)
expect_value(self, formats, expected_formats, None) expect_value(self, formats, expected_formats, None)
def test_parse_xspf(self):
_TEST_CASES = [
(
'foo_xspf',
'https://example.org/src/foo_xspf.xspf',
[{
'id': 'foo_xspf',
'title': 'Pandemonium',
'description': 'Visit http://bigbrother404.bandcamp.com',
'duration': 202.416,
'formats': [{
'manifest_url': 'https://example.org/src/foo_xspf.xspf',
'url': 'https://example.org/src/cd1/track%201.mp3',
}],
}, {
'id': 'foo_xspf',
'title': 'Final Cartridge (Nichico Twelve Remix)',
'description': 'Visit http://bigbrother404.bandcamp.com',
'duration': 255.857,
'formats': [{
'manifest_url': 'https://example.org/src/foo_xspf.xspf',
'url': 'https://example.org/%E3%83%88%E3%83%A9%E3%83%83%E3%82%AF%E3%80%80%EF%BC%92.mp3',
}],
}, {
'id': 'foo_xspf',
'title': 'Rebuilding Nightingale',
'description': 'Visit http://bigbrother404.bandcamp.com',
'duration': 287.915,
'formats': [{
'manifest_url': 'https://example.org/src/foo_xspf.xspf',
'url': 'https://example.org/src/track3.mp3',
}, {
'manifest_url': 'https://example.org/src/foo_xspf.xspf',
'url': 'https://example.com/track3.mp3',
}]
}]
),
]
for xspf_file, xspf_url, expected_entries in _TEST_CASES:
with io.open('./test/testdata/xspf/%s.xspf' % xspf_file,
mode='r', encoding='utf-8') as f:
entries = self.ie._parse_xspf(
compat_etree_fromstring(f.read().encode('utf-8')),
xspf_file, xspf_url=xspf_url, xspf_base_url=xspf_url)
expect_value(self, entries, expected_entries, None)
for i in range(len(entries)):
expect_dict(self, entries[i], expected_entries[i])
def test_response_with_expected_status_returns_content(self):
# Checks for mitigations against the effects of
# <https://bugs.python.org/issue15002> that affect Python 3.4.1+, which
# manifest as `_download_webpage`, `_download_xml`, `_download_json`,
# or the underlying `_download_webpage_handle` returning no content
# when a response matches `expected_status`.
httpd = compat_http_server.HTTPServer(
('127.0.0.1', 0), InfoExtractorTestRequestHandler)
port = http_server_port(httpd)
server_thread = threading.Thread(target=httpd.serve_forever)
server_thread.daemon = True
server_thread.start()
(content, urlh) = self.ie._download_webpage_handle(
'http://127.0.0.1:%d/teapot' % port, None,
expected_status=TEAPOT_RESPONSE_STATUS)
self.assertEqual(content, TEAPOT_RESPONSE_BODY)
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()

View file

@ -239,76 +239,6 @@ class TestFormatSelection(unittest.TestCase):
downloaded = ydl.downloaded_info_dicts[0] downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'vid-vcodec-dot') self.assertEqual(downloaded['format_id'], 'vid-vcodec-dot')
def test_format_selection_string_ops(self):
formats = [
{'format_id': 'abc-cba', 'ext': 'mp4', 'url': TEST_URL},
{'format_id': 'zxc-cxz', 'ext': 'webm', 'url': TEST_URL},
]
info_dict = _make_result(formats)
# equals (=)
ydl = YDL({'format': '[format_id=abc-cba]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'abc-cba')
# does not equal (!=)
ydl = YDL({'format': '[format_id!=abc-cba]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'zxc-cxz')
ydl = YDL({'format': '[format_id!=abc-cba][format_id!=zxc-cxz]'})
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
# starts with (^=)
ydl = YDL({'format': '[format_id^=abc]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'abc-cba')
# does not start with (!^=)
ydl = YDL({'format': '[format_id!^=abc]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'zxc-cxz')
ydl = YDL({'format': '[format_id!^=abc][format_id!^=zxc]'})
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
# ends with ($=)
ydl = YDL({'format': '[format_id$=cba]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'abc-cba')
# does not end with (!$=)
ydl = YDL({'format': '[format_id!$=cba]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'zxc-cxz')
ydl = YDL({'format': '[format_id!$=cba][format_id!$=cxz]'})
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
# contains (*=)
ydl = YDL({'format': '[format_id*=bc-cb]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'abc-cba')
# does not contain (!*=)
ydl = YDL({'format': '[format_id!*=bc-cb]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'zxc-cxz')
ydl = YDL({'format': '[format_id!*=abc][format_id!*=zxc]'})
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
ydl = YDL({'format': '[format_id!*=-]'})
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
def test_youtube_format_selection(self): def test_youtube_format_selection(self):
order = [ order = [
'38', '37', '46', '22', '45', '35', '44', '18', '34', '43', '6', '5', '17', '36', '13', '38', '37', '46', '22', '45', '35', '44', '18', '34', '43', '6', '5', '17', '36', '13',
@ -411,7 +341,7 @@ class TestFormatSelection(unittest.TestCase):
# For extractors with incomplete formats (all formats are audio-only or # For extractors with incomplete formats (all formats are audio-only or
# video-only) best and worst should fallback to corresponding best/worst # video-only) best and worst should fallback to corresponding best/worst
# video-only or audio-only formats (as per # video-only or audio-only formats (as per
# https://github.com/ytdl-org/youtube-dl/pull/5556) # https://github.com/rg3/youtube-dl/pull/5556)
formats = [ formats = [
{'format_id': 'low', 'ext': 'mp3', 'preference': 1, 'vcodec': 'none', 'url': TEST_URL}, {'format_id': 'low', 'ext': 'mp3', 'preference': 1, 'vcodec': 'none', 'url': TEST_URL},
{'format_id': 'high', 'ext': 'mp3', 'preference': 2, 'vcodec': 'none', 'url': TEST_URL}, {'format_id': 'high', 'ext': 'mp3', 'preference': 2, 'vcodec': 'none', 'url': TEST_URL},
@ -442,7 +372,7 @@ class TestFormatSelection(unittest.TestCase):
self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy()) self.assertRaises(ExtractorError, ydl.process_ie_result, info_dict.copy())
def test_format_selection_issue_10083(self): def test_format_selection_issue_10083(self):
# See https://github.com/ytdl-org/youtube-dl/issues/10083 # See https://github.com/rg3/youtube-dl/issues/10083
formats = [ formats = [
{'format_id': 'regular', 'height': 360, 'url': TEST_URL}, {'format_id': 'regular', 'height': 360, 'url': TEST_URL},
{'format_id': 'video', 'height': 720, 'acodec': 'none', 'url': TEST_URL}, {'format_id': 'video', 'height': 720, 'acodec': 'none', 'url': TEST_URL},
@ -816,15 +746,11 @@ class TestYoutubeDL(unittest.TestCase):
'webpage_url': 'http://example.com', 'webpage_url': 'http://example.com',
} }
def get_downloaded_info_dicts(params):
ydl = YDL(params)
# make a deep copy because the dictionary and nested entries
# can be modified
ydl.process_ie_result(copy.deepcopy(playlist))
return ydl.downloaded_info_dicts
def get_ids(params): def get_ids(params):
return [int(v['id']) for v in get_downloaded_info_dicts(params)] ydl = YDL(params)
# make a copy because the dictionary can be modified
ydl.process_ie_result(playlist.copy())
return [int(v['id']) for v in ydl.downloaded_info_dicts]
result = get_ids({}) result = get_ids({})
self.assertEqual(result, [1, 2, 3, 4]) self.assertEqual(result, [1, 2, 3, 4])
@ -856,24 +782,8 @@ class TestYoutubeDL(unittest.TestCase):
result = get_ids({'playlist_items': '2-4,3-4,3'}) result = get_ids({'playlist_items': '2-4,3-4,3'})
self.assertEqual(result, [2, 3, 4]) self.assertEqual(result, [2, 3, 4])
# Tests for https://github.com/ytdl-org/youtube-dl/issues/10591
# @{
result = get_downloaded_info_dicts({'playlist_items': '2-4,3-4,3'})
self.assertEqual(result[0]['playlist_index'], 2)
self.assertEqual(result[1]['playlist_index'], 3)
result = get_downloaded_info_dicts({'playlist_items': '2-4,3-4,3'})
self.assertEqual(result[0]['playlist_index'], 2)
self.assertEqual(result[1]['playlist_index'], 3)
self.assertEqual(result[2]['playlist_index'], 4)
result = get_downloaded_info_dicts({'playlist_items': '4,2'})
self.assertEqual(result[0]['playlist_index'], 4)
self.assertEqual(result[1]['playlist_index'], 2)
# @}
def test_urlopen_no_file_protocol(self): def test_urlopen_no_file_protocol(self):
# see https://github.com/ytdl-org/youtube-dl/issues/8227 # see https://github.com/rg3/youtube-dl/issues/8227
ydl = YDL() ydl = YDL()
self.assertRaises(compat_urllib_error.URLError, ydl.urlopen, 'file:///etc/passwd') self.assertRaises(compat_urllib_error.URLError, ydl.urlopen, 'file:///etc/passwd')

View file

@ -1,51 +0,0 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
import os
import re
import sys
import tempfile
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.utils import YoutubeDLCookieJar
class TestYoutubeDLCookieJar(unittest.TestCase):
def test_keep_session_cookies(self):
cookiejar = YoutubeDLCookieJar('./test/testdata/cookies/session_cookies.txt')
cookiejar.load(ignore_discard=True, ignore_expires=True)
tf = tempfile.NamedTemporaryFile(delete=False)
try:
cookiejar.save(filename=tf.name, ignore_discard=True, ignore_expires=True)
temp = tf.read().decode('utf-8')
self.assertTrue(re.search(
r'www\.foobar\.foobar\s+FALSE\s+/\s+TRUE\s+0\s+YoutubeDLExpiresEmpty\s+YoutubeDLExpiresEmptyValue', temp))
self.assertTrue(re.search(
r'www\.foobar\.foobar\s+FALSE\s+/\s+TRUE\s+0\s+YoutubeDLExpires0\s+YoutubeDLExpires0Value', temp))
finally:
tf.close()
os.remove(tf.name)
def test_strip_httponly_prefix(self):
cookiejar = YoutubeDLCookieJar('./test/testdata/cookies/httponly_cookies.txt')
cookiejar.load(ignore_discard=True, ignore_expires=True)
def assert_cookie_has_value(key):
self.assertEqual(cookiejar._cookies['www.foobar.foobar']['/'][key].value, key + '_VALUE')
assert_cookie_has_value('HTTPONLY_COOKIE')
assert_cookie_has_value('JS_ACCESSIBLE_COOKIE')
def test_malformed_cookies(self):
cookiejar = YoutubeDLCookieJar('./test/testdata/cookies/malformed_cookies.txt')
cookiejar.load(ignore_discard=True, ignore_expires=True)
# Cookies should be empty since all malformed cookie file entries
# will be ignored
self.assertFalse(cookiejar._cookies)
if __name__ == '__main__':
unittest.main()

View file

@ -44,16 +44,16 @@ class TestAES(unittest.TestCase):
def test_decrypt_text(self): def test_decrypt_text(self):
password = intlist_to_bytes(self.key).decode('utf-8') password = intlist_to_bytes(self.key).decode('utf-8')
encrypted = base64.b64encode( encrypted = base64.b64encode(
intlist_to_bytes(self.iv[:8]) intlist_to_bytes(self.iv[:8]) +
+ b'\x17\x15\x93\xab\x8d\x80V\xcdV\xe0\t\xcdo\xc2\xa5\xd8ksM\r\xe27N\xae' b'\x17\x15\x93\xab\x8d\x80V\xcdV\xe0\t\xcdo\xc2\xa5\xd8ksM\r\xe27N\xae'
).decode('utf-8') ).decode('utf-8')
decrypted = (aes_decrypt_text(encrypted, password, 16)) decrypted = (aes_decrypt_text(encrypted, password, 16))
self.assertEqual(decrypted, self.secret_msg) self.assertEqual(decrypted, self.secret_msg)
password = intlist_to_bytes(self.key).decode('utf-8') password = intlist_to_bytes(self.key).decode('utf-8')
encrypted = base64.b64encode( encrypted = base64.b64encode(
intlist_to_bytes(self.iv[:8]) intlist_to_bytes(self.iv[:8]) +
+ b'\x0b\xe6\xa4\xd9z\x0e\xb8\xb9\xd0\xd4i_\x85\x1d\x99\x98_\xe5\x80\xe7.\xbf\xa5\x83' b'\x0b\xe6\xa4\xd9z\x0e\xb8\xb9\xd0\xd4i_\x85\x1d\x99\x98_\xe5\x80\xe7.\xbf\xa5\x83'
).decode('utf-8') ).decode('utf-8')
decrypted = (aes_decrypt_text(encrypted, password, 32)) decrypted = (aes_decrypt_text(encrypted, password, 32))
self.assertEqual(decrypted, self.secret_msg) self.assertEqual(decrypted, self.secret_msg)

View file

@ -110,7 +110,7 @@ class TestAllURLsMatching(unittest.TestCase):
self.assertMatch('https://vimeo.com/user7108434/videos', ['vimeo:user']) self.assertMatch('https://vimeo.com/user7108434/videos', ['vimeo:user'])
self.assertMatch('https://vimeo.com/user21297594/review/75524534/3c257a1b5d', ['vimeo:review']) self.assertMatch('https://vimeo.com/user21297594/review/75524534/3c257a1b5d', ['vimeo:review'])
# https://github.com/ytdl-org/youtube-dl/issues/1930 # https://github.com/rg3/youtube-dl/issues/1930
def test_soundcloud_not_matching_sets(self): def test_soundcloud_not_matching_sets(self):
self.assertMatch('http://soundcloud.com/floex/sets/gone-ep', ['soundcloud:set']) self.assertMatch('http://soundcloud.com/floex/sets/gone-ep', ['soundcloud:set'])
@ -119,10 +119,16 @@ class TestAllURLsMatching(unittest.TestCase):
self.assertMatch('http://tatianamaslanydaily.tumblr.com/post/54196191430', ['Tumblr']) self.assertMatch('http://tatianamaslanydaily.tumblr.com/post/54196191430', ['Tumblr'])
def test_pbs(self): def test_pbs(self):
# https://github.com/ytdl-org/youtube-dl/issues/2350 # https://github.com/rg3/youtube-dl/issues/2350
self.assertMatch('http://video.pbs.org/viralplayer/2365173446/', ['pbs']) self.assertMatch('http://video.pbs.org/viralplayer/2365173446/', ['pbs'])
self.assertMatch('http://video.pbs.org/widget/partnerplayer/980042464/', ['pbs']) self.assertMatch('http://video.pbs.org/widget/partnerplayer/980042464/', ['pbs'])
def test_yahoo_https(self):
# https://github.com/rg3/youtube-dl/issues/2701
self.assertMatch(
'https://screen.yahoo.com/smartwatches-latest-wearable-gadgets-163745379-cbs.html',
['Yahoo'])
def test_no_duplicated_ie_names(self): def test_no_duplicated_ie_names(self):
name_accu = collections.defaultdict(list) name_accu = collections.defaultdict(list)
for ie in self.ies: for ie in self.ies:

View file

@ -13,7 +13,6 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.compat import ( from youtube_dl.compat import (
compat_getenv, compat_getenv,
compat_setenv, compat_setenv,
compat_etree_Element,
compat_etree_fromstring, compat_etree_fromstring,
compat_expanduser, compat_expanduser,
compat_shlex_split, compat_shlex_split,
@ -40,7 +39,7 @@ class TestCompat(unittest.TestCase):
def test_compat_expanduser(self): def test_compat_expanduser(self):
old_home = os.environ.get('HOME') old_home = os.environ.get('HOME')
test_str = r'C:\Documents and Settings\тест\Application Data' test_str = 'C:\Documents and Settings\тест\Application Data'
compat_setenv('HOME', test_str) compat_setenv('HOME', test_str)
self.assertEqual(compat_expanduser('~'), test_str) self.assertEqual(compat_expanduser('~'), test_str)
compat_setenv('HOME', old_home or '') compat_setenv('HOME', old_home or '')
@ -91,12 +90,6 @@ class TestCompat(unittest.TestCase):
self.assertEqual(compat_shlex_split('-option "one\ntwo" \n -flag'), ['-option', 'one\ntwo', '-flag']) self.assertEqual(compat_shlex_split('-option "one\ntwo" \n -flag'), ['-option', 'one\ntwo', '-flag'])
self.assertEqual(compat_shlex_split('-val 中文'), ['-val', '中文']) self.assertEqual(compat_shlex_split('-val 中文'), ['-val', '中文'])
def test_compat_etree_Element(self):
try:
compat_etree_Element.items
except AttributeError:
self.fail('compat_etree_Element is not a type')
def test_compat_etree_fromstring(self): def test_compat_etree_fromstring(self):
xml = ''' xml = '''
<root foo="bar" spam="中文"> <root foo="bar" spam="中文">

View file

@ -92,8 +92,8 @@ class TestDownload(unittest.TestCase):
def generator(test_case, tname): def generator(test_case, tname):
def test_template(self): def test_template(self):
ie = youtube_dl.extractor.get_info_extractor(test_case['name'])() ie = youtube_dl.extractor.get_info_extractor(test_case['name'])
other_ies = [get_info_extractor(ie_key)() for ie_key in test_case.get('add_ie', [])] other_ies = [get_info_extractor(ie_key) for ie_key in test_case.get('add_ie', [])]
is_playlist = any(k.startswith('playlist') for k in test_case) is_playlist = any(k.startswith('playlist') for k in test_case)
test_cases = test_case.get( test_cases = test_case.get(
'playlist', [] if is_playlist else [test_case]) 'playlist', [] if is_playlist else [test_case])

View file

@ -1,115 +0,0 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
# Allow direct execution
import os
import re
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import http_server_port, try_rm
from youtube_dl import YoutubeDL
from youtube_dl.compat import compat_http_server
from youtube_dl.downloader.http import HttpFD
from youtube_dl.utils import encodeFilename
import threading
TEST_DIR = os.path.dirname(os.path.abspath(__file__))
TEST_SIZE = 10 * 1024
class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
def log_message(self, format, *args):
pass
def send_content_range(self, total=None):
range_header = self.headers.get('Range')
start = end = None
if range_header:
mobj = re.search(r'^bytes=(\d+)-(\d+)', range_header)
if mobj:
start = int(mobj.group(1))
end = int(mobj.group(2))
valid_range = start is not None and end is not None
if valid_range:
content_range = 'bytes %d-%d' % (start, end)
if total:
content_range += '/%d' % total
self.send_header('Content-Range', content_range)
return (end - start + 1) if valid_range else total
def serve(self, range=True, content_length=True):
self.send_response(200)
self.send_header('Content-Type', 'video/mp4')
size = TEST_SIZE
if range:
size = self.send_content_range(TEST_SIZE)
if content_length:
self.send_header('Content-Length', size)
self.end_headers()
self.wfile.write(b'#' * size)
def do_GET(self):
if self.path == '/regular':
self.serve()
elif self.path == '/no-content-length':
self.serve(content_length=False)
elif self.path == '/no-range':
self.serve(range=False)
elif self.path == '/no-range-no-content-length':
self.serve(range=False, content_length=False)
else:
assert False
class FakeLogger(object):
def debug(self, msg):
pass
def warning(self, msg):
pass
def error(self, msg):
pass
class TestHttpFD(unittest.TestCase):
def setUp(self):
self.httpd = compat_http_server.HTTPServer(
('127.0.0.1', 0), HTTPTestRequestHandler)
self.port = http_server_port(self.httpd)
self.server_thread = threading.Thread(target=self.httpd.serve_forever)
self.server_thread.daemon = True
self.server_thread.start()
def download(self, params, ep):
params['logger'] = FakeLogger()
ydl = YoutubeDL(params)
downloader = HttpFD(ydl, params)
filename = 'testfile.mp4'
try_rm(encodeFilename(filename))
self.assertTrue(downloader.real_download(filename, {
'url': 'http://127.0.0.1:%d/%s' % (self.port, ep),
}))
self.assertEqual(os.path.getsize(encodeFilename(filename)), TEST_SIZE)
try_rm(encodeFilename(filename))
def download_all(self, params):
for ep in ('regular', 'no-content-length', 'no-range', 'no-range-no-content-length'):
self.download(params, ep)
def test_regular(self):
self.download_all({})
def test_chunked(self):
self.download_all({
'http_chunk_size': 1000,
})
if __name__ == '__main__':
unittest.main()

View file

@ -8,7 +8,6 @@ import sys
import unittest import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import http_server_port
from youtube_dl import YoutubeDL from youtube_dl import YoutubeDL
from youtube_dl.compat import compat_http_server, compat_urllib_request from youtube_dl.compat import compat_http_server, compat_urllib_request
import ssl import ssl
@ -17,6 +16,15 @@ import threading
TEST_DIR = os.path.dirname(os.path.abspath(__file__)) TEST_DIR = os.path.dirname(os.path.abspath(__file__))
def http_server_port(httpd):
if os.name == 'java' and isinstance(httpd.socket, ssl.SSLSocket):
# In Jython SSLSocket is not a subclass of socket.socket
sock = httpd.socket.sock
else:
sock = httpd.socket
return sock.getsockname()[1]
class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler): class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
def log_message(self, format, *args): def log_message(self, format, *args):
pass pass
@ -39,7 +47,7 @@ class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
self.end_headers() self.end_headers()
return return
new_url = 'http://127.0.0.1:%d/中文.html' % http_server_port(self.server) new_url = 'http://localhost:%d/中文.html' % http_server_port(self.server)
self.send_response(302) self.send_response(302)
self.send_header(b'Location', new_url.encode('utf-8')) self.send_header(b'Location', new_url.encode('utf-8'))
self.end_headers() self.end_headers()
@ -66,7 +74,7 @@ class FakeLogger(object):
class TestHTTP(unittest.TestCase): class TestHTTP(unittest.TestCase):
def setUp(self): def setUp(self):
self.httpd = compat_http_server.HTTPServer( self.httpd = compat_http_server.HTTPServer(
('127.0.0.1', 0), HTTPTestRequestHandler) ('localhost', 0), HTTPTestRequestHandler)
self.port = http_server_port(self.httpd) self.port = http_server_port(self.httpd)
self.server_thread = threading.Thread(target=self.httpd.serve_forever) self.server_thread = threading.Thread(target=self.httpd.serve_forever)
self.server_thread.daemon = True self.server_thread.daemon = True
@ -78,15 +86,15 @@ class TestHTTP(unittest.TestCase):
return return
ydl = YoutubeDL({'logger': FakeLogger()}) ydl = YoutubeDL({'logger': FakeLogger()})
r = ydl.extract_info('http://127.0.0.1:%d/302' % self.port) r = ydl.extract_info('http://localhost:%d/302' % self.port)
self.assertEqual(r['entries'][0]['url'], 'http://127.0.0.1:%d/vid.mp4' % self.port) self.assertEqual(r['entries'][0]['url'], 'http://localhost:%d/vid.mp4' % self.port)
class TestHTTPS(unittest.TestCase): class TestHTTPS(unittest.TestCase):
def setUp(self): def setUp(self):
certfn = os.path.join(TEST_DIR, 'testcert.pem') certfn = os.path.join(TEST_DIR, 'testcert.pem')
self.httpd = compat_http_server.HTTPServer( self.httpd = compat_http_server.HTTPServer(
('127.0.0.1', 0), HTTPTestRequestHandler) ('localhost', 0), HTTPTestRequestHandler)
self.httpd.socket = ssl.wrap_socket( self.httpd.socket = ssl.wrap_socket(
self.httpd.socket, certfile=certfn, server_side=True) self.httpd.socket, certfile=certfn, server_side=True)
self.port = http_server_port(self.httpd) self.port = http_server_port(self.httpd)
@ -99,11 +107,11 @@ class TestHTTPS(unittest.TestCase):
ydl = YoutubeDL({'logger': FakeLogger()}) ydl = YoutubeDL({'logger': FakeLogger()})
self.assertRaises( self.assertRaises(
Exception, Exception,
ydl.extract_info, 'https://127.0.0.1:%d/video.html' % self.port) ydl.extract_info, 'https://localhost:%d/video.html' % self.port)
ydl = YoutubeDL({'logger': FakeLogger(), 'nocheckcertificate': True}) ydl = YoutubeDL({'logger': FakeLogger(), 'nocheckcertificate': True})
r = ydl.extract_info('https://127.0.0.1:%d/video.html' % self.port) r = ydl.extract_info('https://localhost:%d/video.html' % self.port)
self.assertEqual(r['entries'][0]['url'], 'https://127.0.0.1:%d/vid.mp4' % self.port) self.assertEqual(r['entries'][0]['url'], 'https://localhost:%d/vid.mp4' % self.port)
def _build_proxy_handler(name): def _build_proxy_handler(name):
@ -124,23 +132,23 @@ def _build_proxy_handler(name):
class TestProxy(unittest.TestCase): class TestProxy(unittest.TestCase):
def setUp(self): def setUp(self):
self.proxy = compat_http_server.HTTPServer( self.proxy = compat_http_server.HTTPServer(
('127.0.0.1', 0), _build_proxy_handler('normal')) ('localhost', 0), _build_proxy_handler('normal'))
self.port = http_server_port(self.proxy) self.port = http_server_port(self.proxy)
self.proxy_thread = threading.Thread(target=self.proxy.serve_forever) self.proxy_thread = threading.Thread(target=self.proxy.serve_forever)
self.proxy_thread.daemon = True self.proxy_thread.daemon = True
self.proxy_thread.start() self.proxy_thread.start()
self.geo_proxy = compat_http_server.HTTPServer( self.geo_proxy = compat_http_server.HTTPServer(
('127.0.0.1', 0), _build_proxy_handler('geo')) ('localhost', 0), _build_proxy_handler('geo'))
self.geo_port = http_server_port(self.geo_proxy) self.geo_port = http_server_port(self.geo_proxy)
self.geo_proxy_thread = threading.Thread(target=self.geo_proxy.serve_forever) self.geo_proxy_thread = threading.Thread(target=self.geo_proxy.serve_forever)
self.geo_proxy_thread.daemon = True self.geo_proxy_thread.daemon = True
self.geo_proxy_thread.start() self.geo_proxy_thread.start()
def test_proxy(self): def test_proxy(self):
geo_proxy = '127.0.0.1:{0}'.format(self.geo_port) geo_proxy = 'localhost:{0}'.format(self.geo_port)
ydl = YoutubeDL({ ydl = YoutubeDL({
'proxy': '127.0.0.1:{0}'.format(self.port), 'proxy': 'localhost:{0}'.format(self.port),
'geo_verification_proxy': geo_proxy, 'geo_verification_proxy': geo_proxy,
}) })
url = 'http://foo.com/bar' url = 'http://foo.com/bar'
@ -154,7 +162,7 @@ class TestProxy(unittest.TestCase):
def test_proxy_with_idn(self): def test_proxy_with_idn(self):
ydl = YoutubeDL({ ydl = YoutubeDL({
'proxy': '127.0.0.1:{0}'.format(self.port), 'proxy': 'localhost:{0}'.format(self.port),
}) })
url = 'http://中文.tw/' url = 'http://中文.tw/'
response = ydl.urlopen(url).read().decode('utf-8') response = ydl.urlopen(url).read().decode('utf-8')

View file

@ -14,4 +14,4 @@ from youtube_dl.postprocessor import MetadataFromTitlePP
class TestMetadataFromTitle(unittest.TestCase): class TestMetadataFromTitle(unittest.TestCase):
def test_format_to_regex(self): def test_format_to_regex(self):
pp = MetadataFromTitlePP(None, '%(title)s - %(artist)s') pp = MetadataFromTitlePP(None, '%(title)s - %(artist)s')
self.assertEqual(pp._titleregex, r'(?P<title>.+)\ \-\ (?P<artist>.+)') self.assertEqual(pp._titleregex, '(?P<title>.+)\ \-\ (?P<artist>.+)')

View file

@ -26,6 +26,7 @@ from youtube_dl.extractor import (
ThePlatformIE, ThePlatformIE,
ThePlatformFeedIE, ThePlatformFeedIE,
RTVEALaCartaIE, RTVEALaCartaIE,
FunnyOrDieIE,
DemocracynowIE, DemocracynowIE,
) )
@ -231,7 +232,7 @@ class TestNPOSubtitles(BaseTestSubtitles):
class TestMTVSubtitles(BaseTestSubtitles): class TestMTVSubtitles(BaseTestSubtitles):
url = 'http://www.cc.com/video-clips/p63lk0/adam-devine-s-house-party-chasing-white-swans' url = 'http://www.cc.com/video-clips/kllhuv/stand-up-greg-fitzsimmons--uncensored---too-good-of-a-mother'
IE = ComedyCentralIE IE = ComedyCentralIE
def getInfoDict(self): def getInfoDict(self):
@ -242,7 +243,7 @@ class TestMTVSubtitles(BaseTestSubtitles):
self.DL.params['allsubtitles'] = True self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles() subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), set(['en'])) self.assertEqual(set(subtitles.keys()), set(['en']))
self.assertEqual(md5(subtitles['en']), '78206b8d8a0cfa9da64dc026eea48961') self.assertEqual(md5(subtitles['en']), 'b9f6ca22a6acf597ec76f61749765e65')
class TestNRKSubtitles(BaseTestSubtitles): class TestNRKSubtitles(BaseTestSubtitles):
@ -321,6 +322,18 @@ class TestRtveSubtitles(BaseTestSubtitles):
self.assertEqual(md5(subtitles['es']), '69e70cae2d40574fb7316f31d6eb7fca') self.assertEqual(md5(subtitles['es']), '69e70cae2d40574fb7316f31d6eb7fca')
class TestFunnyOrDieSubtitles(BaseTestSubtitles):
url = 'http://www.funnyordie.com/videos/224829ff6d/judd-apatow-will-direct-your-vine'
IE = FunnyOrDieIE
def test_allsubtitles(self):
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), set(['en']))
self.assertEqual(md5(subtitles['en']), 'c5593c193eacd353596c11c2d4f9ecc4')
class TestDemocracynowSubtitles(BaseTestSubtitles): class TestDemocracynowSubtitles(BaseTestSubtitles):
url = 'http://www.democracynow.org/shows/2015/7/3' url = 'http://www.democracynow.org/shows/2015/7/3'
IE = DemocracynowIE IE = DemocracynowIE

View file

@ -34,8 +34,8 @@ def _make_testfunc(testfile):
def test_func(self): def test_func(self):
as_file = os.path.join(TEST_DIR, testfile) as_file = os.path.join(TEST_DIR, testfile)
swf_file = os.path.join(TEST_DIR, test_id + '.swf') swf_file = os.path.join(TEST_DIR, test_id + '.swf')
if ((not os.path.exists(swf_file)) if ((not os.path.exists(swf_file)) or
or os.path.getmtime(swf_file) < os.path.getmtime(as_file)): os.path.getmtime(swf_file) < os.path.getmtime(as_file)):
# Recompile # Recompile
try: try:
subprocess.check_call([ subprocess.check_call([

View file

@ -19,7 +19,6 @@ from youtube_dl.utils import (
age_restricted, age_restricted,
args_to_str, args_to_str,
encode_base_n, encode_base_n,
caesar,
clean_html, clean_html,
date_from_str, date_from_str,
DateRange, DateRange,
@ -34,18 +33,15 @@ from youtube_dl.utils import (
ExtractorError, ExtractorError,
find_xpath_attr, find_xpath_attr,
fix_xml_ampersands, fix_xml_ampersands,
float_or_none,
get_element_by_class, get_element_by_class,
get_element_by_attribute, get_element_by_attribute,
get_elements_by_class, get_elements_by_class,
get_elements_by_attribute, get_elements_by_attribute,
InAdvancePagedList, InAdvancePagedList,
int_or_none,
intlist_to_bytes, intlist_to_bytes,
is_html, is_html,
js_to_json, js_to_json,
limit_length, limit_length,
merge_dicts,
mimetype2ext, mimetype2ext,
month_by_name, month_by_name,
multipart_encode, multipart_encode,
@ -57,26 +53,20 @@ from youtube_dl.utils import (
parse_filesize, parse_filesize,
parse_count, parse_count,
parse_iso8601, parse_iso8601,
parse_resolution,
parse_bitrate,
pkcs1pad, pkcs1pad,
read_batch_urls, read_batch_urls,
sanitize_filename, sanitize_filename,
sanitize_path, sanitize_path,
sanitize_url,
expand_path, expand_path,
prepend_extension, prepend_extension,
replace_extension, replace_extension,
remove_start, remove_start,
remove_end, remove_end,
remove_quotes, remove_quotes,
rot47,
shell_quote, shell_quote,
smuggle_url, smuggle_url,
str_to_int, str_to_int,
strip_jsonp, strip_jsonp,
strip_or_none,
subtitles_filename,
timeconvert, timeconvert,
unescapeHTML, unescapeHTML,
unified_strdate, unified_strdate,
@ -85,7 +75,6 @@ from youtube_dl.utils import (
uppercase_escape, uppercase_escape,
lowercase_escape, lowercase_escape,
url_basename, url_basename,
url_or_none,
base_url, base_url,
urljoin, urljoin,
urlencode_postdata, urlencode_postdata,
@ -187,7 +176,7 @@ class TestUtil(unittest.TestCase):
self.assertEqual(sanitize_filename( self.assertEqual(sanitize_filename(
'ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖŐØŒÙÚÛÜŰÝÞßàáâãäåæçèéêëìíîïðñòóôõöőøœùúûüűýþÿ', restricted=True), 'ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖŐØŒÙÚÛÜŰÝÞßàáâãäåæçèéêëìíîïðñòóôõöőøœùúûüűýþÿ', restricted=True),
'AAAAAAAECEEEEIIIIDNOOOOOOOOEUUUUUYTHssaaaaaaaeceeeeiiiionooooooooeuuuuuythy') 'AAAAAAAECEEEEIIIIDNOOOOOOOOEUUUUUYPssaaaaaaaeceeeeiiiionooooooooeuuuuuypy')
def test_sanitize_ids(self): def test_sanitize_ids(self):
self.assertEqual(sanitize_filename('_n_cd26wFpw', is_id=True), '_n_cd26wFpw') self.assertEqual(sanitize_filename('_n_cd26wFpw', is_id=True), '_n_cd26wFpw')
@ -230,12 +219,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(sanitize_path('./abc'), 'abc') self.assertEqual(sanitize_path('./abc'), 'abc')
self.assertEqual(sanitize_path('./../abc'), '..\\abc') self.assertEqual(sanitize_path('./../abc'), '..\\abc')
def test_sanitize_url(self):
self.assertEqual(sanitize_url('//foo.bar'), 'http://foo.bar')
self.assertEqual(sanitize_url('httpss://foo.bar'), 'https://foo.bar')
self.assertEqual(sanitize_url('rmtps://foo.bar'), 'rtmps://foo.bar')
self.assertEqual(sanitize_url('https://foo.bar'), 'https://foo.bar')
def test_expand_path(self): def test_expand_path(self):
def env(var): def env(var):
return '%{0}%'.format(var) if sys.platform == 'win32' else '${0}'.format(var) return '%{0}%'.format(var) if sys.platform == 'win32' else '${0}'.format(var)
@ -264,11 +247,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(replace_extension('.abc', 'temp'), '.abc.temp') self.assertEqual(replace_extension('.abc', 'temp'), '.abc.temp')
self.assertEqual(replace_extension('.abc.ext', 'temp'), '.abc.temp') self.assertEqual(replace_extension('.abc.ext', 'temp'), '.abc.temp')
def test_subtitles_filename(self):
self.assertEqual(subtitles_filename('abc.ext', 'en', 'vtt'), 'abc.en.vtt')
self.assertEqual(subtitles_filename('abc.ext', 'en', 'vtt', 'ext'), 'abc.en.vtt')
self.assertEqual(subtitles_filename('abc.unexpected_ext', 'en', 'vtt', 'ext'), 'abc.unexpected_ext.en.vtt')
def test_remove_start(self): def test_remove_start(self):
self.assertEqual(remove_start(None, 'A - '), None) self.assertEqual(remove_start(None, 'A - '), None)
self.assertEqual(remove_start('A - B', 'A - '), 'B') self.assertEqual(remove_start('A - B', 'A - '), 'B')
@ -342,8 +320,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(unified_strdate('July 15th, 2013'), '20130715') self.assertEqual(unified_strdate('July 15th, 2013'), '20130715')
self.assertEqual(unified_strdate('September 1st, 2013'), '20130901') self.assertEqual(unified_strdate('September 1st, 2013'), '20130901')
self.assertEqual(unified_strdate('Sep 2nd, 2013'), '20130902') self.assertEqual(unified_strdate('Sep 2nd, 2013'), '20130902')
self.assertEqual(unified_strdate('November 3rd, 2019'), '20191103')
self.assertEqual(unified_strdate('October 23rd, 2005'), '20051023')
def test_unified_timestamps(self): def test_unified_timestamps(self):
self.assertEqual(unified_timestamp('December 21, 2010'), 1292889600) self.assertEqual(unified_timestamp('December 21, 2010'), 1292889600)
@ -368,7 +344,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(unified_timestamp('2017-03-30T17:52:41Q'), 1490896361) self.assertEqual(unified_timestamp('2017-03-30T17:52:41Q'), 1490896361)
self.assertEqual(unified_timestamp('Sep 11, 2013 | 5:49 AM'), 1378878540) self.assertEqual(unified_timestamp('Sep 11, 2013 | 5:49 AM'), 1378878540)
self.assertEqual(unified_timestamp('December 15, 2017 at 7:49 am'), 1513324140) self.assertEqual(unified_timestamp('December 15, 2017 at 7:49 am'), 1513324140)
self.assertEqual(unified_timestamp('2018-03-14T08:32:43.1493874+00:00'), 1521016363)
def test_determine_ext(self): def test_determine_ext(self):
self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4') self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4')
@ -376,7 +351,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(determine_ext('http://example.com/foo/bar.nonext/?download', None), None) self.assertEqual(determine_ext('http://example.com/foo/bar.nonext/?download', None), None)
self.assertEqual(determine_ext('http://example.com/foo/bar/mp4?download', None), None) self.assertEqual(determine_ext('http://example.com/foo/bar/mp4?download', None), None)
self.assertEqual(determine_ext('http://example.com/foo/bar.m3u8//?download'), 'm3u8') self.assertEqual(determine_ext('http://example.com/foo/bar.m3u8//?download'), 'm3u8')
self.assertEqual(determine_ext('foobar', None), None)
def test_find_xpath_attr(self): def test_find_xpath_attr(self):
testxml = '''<root> testxml = '''<root>
@ -481,30 +455,9 @@ class TestUtil(unittest.TestCase):
shell_quote(args), shell_quote(args),
"""ffmpeg -i 'ñ€ß'"'"'.mp4'""" if compat_os_name != 'nt' else '''ffmpeg -i "ñ€ß'.mp4"''') """ffmpeg -i 'ñ€ß'"'"'.mp4'""" if compat_os_name != 'nt' else '''ffmpeg -i "ñ€ß'.mp4"''')
def test_float_or_none(self):
self.assertEqual(float_or_none('42.42'), 42.42)
self.assertEqual(float_or_none('42'), 42.0)
self.assertEqual(float_or_none(''), None)
self.assertEqual(float_or_none(None), None)
self.assertEqual(float_or_none([]), None)
self.assertEqual(float_or_none(set()), None)
def test_int_or_none(self):
self.assertEqual(int_or_none('42'), 42)
self.assertEqual(int_or_none(''), None)
self.assertEqual(int_or_none(None), None)
self.assertEqual(int_or_none([]), None)
self.assertEqual(int_or_none(set()), None)
def test_str_to_int(self): def test_str_to_int(self):
self.assertEqual(str_to_int('123,456'), 123456) self.assertEqual(str_to_int('123,456'), 123456)
self.assertEqual(str_to_int('123.456'), 123456) self.assertEqual(str_to_int('123.456'), 123456)
self.assertEqual(str_to_int(523), 523)
# Python 3 has no long
if sys.version_info < (3, 0):
eval('self.assertEqual(str_to_int(123456L), 123456)')
self.assertEqual(str_to_int('noninteger'), None)
self.assertEqual(str_to_int([]), None)
def test_url_basename(self): def test_url_basename(self):
self.assertEqual(url_basename('http://foo.de/'), '') self.assertEqual(url_basename('http://foo.de/'), '')
@ -542,18 +495,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(urljoin('http://foo.de/', ''), None) self.assertEqual(urljoin('http://foo.de/', ''), None)
self.assertEqual(urljoin('http://foo.de/', ['foobar']), None) self.assertEqual(urljoin('http://foo.de/', ['foobar']), None)
self.assertEqual(urljoin('http://foo.de/a/b/c.txt', '.././../d.txt'), 'http://foo.de/d.txt') self.assertEqual(urljoin('http://foo.de/a/b/c.txt', '.././../d.txt'), 'http://foo.de/d.txt')
self.assertEqual(urljoin('http://foo.de/a/b/c.txt', 'rtmp://foo.de'), 'rtmp://foo.de')
self.assertEqual(urljoin(None, 'rtmp://foo.de'), 'rtmp://foo.de')
def test_url_or_none(self):
self.assertEqual(url_or_none(None), None)
self.assertEqual(url_or_none(''), None)
self.assertEqual(url_or_none('foo'), None)
self.assertEqual(url_or_none('http://foo.de'), 'http://foo.de')
self.assertEqual(url_or_none('https://foo.de'), 'https://foo.de')
self.assertEqual(url_or_none('http$://foo.de'), None)
self.assertEqual(url_or_none('http://foo.de'), 'http://foo.de')
self.assertEqual(url_or_none('//foo.de'), '//foo.de')
def test_parse_age_limit(self): def test_parse_age_limit(self):
self.assertEqual(parse_age_limit(None), None) self.assertEqual(parse_age_limit(None), None)
@ -568,8 +509,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_age_limit('PG-13'), 13) self.assertEqual(parse_age_limit('PG-13'), 13)
self.assertEqual(parse_age_limit('TV-14'), 14) self.assertEqual(parse_age_limit('TV-14'), 14)
self.assertEqual(parse_age_limit('TV-MA'), 17) self.assertEqual(parse_age_limit('TV-MA'), 17)
self.assertEqual(parse_age_limit('TV14'), 14)
self.assertEqual(parse_age_limit('TV_G'), 0)
def test_parse_duration(self): def test_parse_duration(self):
self.assertEqual(parse_duration(None), None) self.assertEqual(parse_duration(None), None)
@ -721,17 +660,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(dict_get(d, ('b', 'c', key, )), None) self.assertEqual(dict_get(d, ('b', 'c', key, )), None)
self.assertEqual(dict_get(d, ('b', 'c', key, ), skip_false_values=False), false_value) self.assertEqual(dict_get(d, ('b', 'c', key, ), skip_false_values=False), false_value)
def test_merge_dicts(self):
self.assertEqual(merge_dicts({'a': 1}, {'b': 2}), {'a': 1, 'b': 2})
self.assertEqual(merge_dicts({'a': 1}, {'a': 2}), {'a': 1})
self.assertEqual(merge_dicts({'a': 1}, {'a': None}), {'a': 1})
self.assertEqual(merge_dicts({'a': 1}, {'a': ''}), {'a': 1})
self.assertEqual(merge_dicts({'a': 1}, {}), {'a': 1})
self.assertEqual(merge_dicts({'a': None}, {'a': 1}), {'a': 1})
self.assertEqual(merge_dicts({'a': ''}, {'a': 1}), {'a': ''})
self.assertEqual(merge_dicts({'a': ''}, {'a': 'abc'}), {'a': 'abc'})
self.assertEqual(merge_dicts({'a': None}, {'a': ''}, {'a': 'abc'}), {'a': 'abc'})
def test_encode_compat_str(self): def test_encode_compat_str(self):
self.assertEqual(encode_compat_str(b'\xd1\x82\xd0\xb5\xd1\x81\xd1\x82', 'utf-8'), 'тест') self.assertEqual(encode_compat_str(b'\xd1\x82\xd0\xb5\xd1\x81\xd1\x82', 'utf-8'), 'тест')
self.assertEqual(encode_compat_str('тест', 'utf-8'), 'тест') self.assertEqual(encode_compat_str('тест', 'utf-8'), 'тест')
@ -765,22 +693,6 @@ class TestUtil(unittest.TestCase):
d = json.loads(stripped) d = json.loads(stripped)
self.assertEqual(d, {'status': 'success'}) self.assertEqual(d, {'status': 'success'})
stripped = strip_jsonp('({"status": "success"});')
d = json.loads(stripped)
self.assertEqual(d, {'status': 'success'})
def test_strip_or_none(self):
self.assertEqual(strip_or_none(' abc'), 'abc')
self.assertEqual(strip_or_none('abc '), 'abc')
self.assertEqual(strip_or_none(' abc '), 'abc')
self.assertEqual(strip_or_none('\tabc\t'), 'abc')
self.assertEqual(strip_or_none('\n\tabc\n\t'), 'abc')
self.assertEqual(strip_or_none('abc'), 'abc')
self.assertEqual(strip_or_none(''), '')
self.assertEqual(strip_or_none(None), None)
self.assertEqual(strip_or_none(42), None)
self.assertEqual(strip_or_none([]), None)
def test_uppercase_escape(self): def test_uppercase_escape(self):
self.assertEqual(uppercase_escape(''), '') self.assertEqual(uppercase_escape(''), '')
self.assertEqual(uppercase_escape('\\U0001d550'), '𝕐') self.assertEqual(uppercase_escape('\\U0001d550'), '𝕐')
@ -803,8 +715,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(mimetype2ext('text/vtt'), 'vtt') self.assertEqual(mimetype2ext('text/vtt'), 'vtt')
self.assertEqual(mimetype2ext('text/vtt;charset=utf-8'), 'vtt') self.assertEqual(mimetype2ext('text/vtt;charset=utf-8'), 'vtt')
self.assertEqual(mimetype2ext('text/html; charset=utf-8'), 'html') self.assertEqual(mimetype2ext('text/html; charset=utf-8'), 'html')
self.assertEqual(mimetype2ext('audio/x-wav'), 'wav')
self.assertEqual(mimetype2ext('audio/x-wav;codec=pcm'), 'wav')
def test_month_by_name(self): def test_month_by_name(self):
self.assertEqual(month_by_name(None), None) self.assertEqual(month_by_name(None), None)
@ -836,19 +746,6 @@ class TestUtil(unittest.TestCase):
'vcodec': 'h264', 'vcodec': 'h264',
'acodec': 'aac', 'acodec': 'aac',
}) })
self.assertEqual(parse_codecs('av01.0.05M.08'), {
'vcodec': 'av01.0.05M.08',
'acodec': 'none',
})
self.assertEqual(parse_codecs('theora, vorbis'), {
'vcodec': 'theora',
'acodec': 'vorbis',
})
self.assertEqual(parse_codecs('unknownvcodec, unknownacodec'), {
'vcodec': 'unknownvcodec',
'acodec': 'unknownacodec',
})
self.assertEqual(parse_codecs('unknown'), {})
def test_escape_rfc3986(self): def test_escape_rfc3986(self):
reserved = "!*'();:@&=+$,/?#[]" reserved = "!*'();:@&=+$,/?#[]"
@ -917,9 +814,6 @@ class TestUtil(unittest.TestCase):
inp = '''{"duration": "00:01:07"}''' inp = '''{"duration": "00:01:07"}'''
self.assertEqual(js_to_json(inp), '''{"duration": "00:01:07"}''') self.assertEqual(js_to_json(inp), '''{"duration": "00:01:07"}''')
inp = '''{segments: [{"offset":-3.885780586188048e-16,"duration":39.75000000000001}]}'''
self.assertEqual(js_to_json(inp), '''{"segments": [{"offset":-3.885780586188048e-16,"duration":39.75000000000001}]}''')
def test_js_to_json_edgecases(self): def test_js_to_json_edgecases(self):
on = js_to_json("{abc_def:'1\\'\\\\2\\\\\\'3\"4'}") on = js_to_json("{abc_def:'1\\'\\\\2\\\\\\'3\"4'}")
self.assertEqual(json.loads(on), {"abc_def": "1'\\2\\'3\"4"}) self.assertEqual(json.loads(on), {"abc_def": "1'\\2\\'3\"4"})
@ -991,19 +885,6 @@ class TestUtil(unittest.TestCase):
on = js_to_json('{/*comment\n*/42/*comment\n*/:/*comment\n*/42/*comment\n*/}') on = js_to_json('{/*comment\n*/42/*comment\n*/:/*comment\n*/42/*comment\n*/}')
self.assertEqual(json.loads(on), {'42': 42}) self.assertEqual(json.loads(on), {'42': 42})
on = js_to_json('{42:4.2e1}')
self.assertEqual(json.loads(on), {'42': 42.0})
on = js_to_json('{ "0x40": "0x40" }')
self.assertEqual(json.loads(on), {'0x40': '0x40'})
on = js_to_json('{ "040": "040" }')
self.assertEqual(json.loads(on), {'040': '040'})
def test_js_to_json_malformed(self):
self.assertEqual(js_to_json('42a1'), '42"a1"')
self.assertEqual(js_to_json('42a-1'), '42"a"-1')
def test_extract_attributes(self): def test_extract_attributes(self):
self.assertEqual(extract_attributes('<e x="y">'), {'x': 'y'}) self.assertEqual(extract_attributes('<e x="y">'), {'x': 'y'})
self.assertEqual(extract_attributes("<e x='y'>"), {'x': 'y'}) self.assertEqual(extract_attributes("<e x='y'>"), {'x': 'y'})
@ -1084,23 +965,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_count('1.1kk '), 1100000) self.assertEqual(parse_count('1.1kk '), 1100000)
self.assertEqual(parse_count('1.1kk views'), 1100000) self.assertEqual(parse_count('1.1kk views'), 1100000)
def test_parse_resolution(self):
self.assertEqual(parse_resolution(None), {})
self.assertEqual(parse_resolution(''), {})
self.assertEqual(parse_resolution('1920x1080'), {'width': 1920, 'height': 1080})
self.assertEqual(parse_resolution('1920×1080'), {'width': 1920, 'height': 1080})
self.assertEqual(parse_resolution('1920 x 1080'), {'width': 1920, 'height': 1080})
self.assertEqual(parse_resolution('720p'), {'height': 720})
self.assertEqual(parse_resolution('4k'), {'height': 2160})
self.assertEqual(parse_resolution('8K'), {'height': 4320})
def test_parse_bitrate(self):
self.assertEqual(parse_bitrate(None), None)
self.assertEqual(parse_bitrate(''), None)
self.assertEqual(parse_bitrate('300kbps'), 300)
self.assertEqual(parse_bitrate('1500kbps'), 1500)
self.assertEqual(parse_bitrate('300 kbps'), 300)
def test_version_tuple(self): def test_version_tuple(self):
self.assertEqual(version_tuple('1'), (1,)) self.assertEqual(version_tuple('1'), (1,))
self.assertEqual(version_tuple('10.23.344'), (10, 23, 344)) self.assertEqual(version_tuple('10.23.344'), (10, 23, 344))
@ -1179,18 +1043,6 @@ ffmpeg version 2.4.4 Copyright (c) 2000-2014 the FFmpeg ...'''), '2.4.4')
self.assertFalse(match_str( self.assertFalse(match_str(
'like_count > 100 & dislike_count <? 50 & description', 'like_count > 100 & dislike_count <? 50 & description',
{'like_count': 190, 'dislike_count': 10})) {'like_count': 190, 'dislike_count': 10}))
self.assertTrue(match_str('is_live', {'is_live': True}))
self.assertFalse(match_str('is_live', {'is_live': False}))
self.assertFalse(match_str('is_live', {'is_live': None}))
self.assertFalse(match_str('is_live', {}))
self.assertFalse(match_str('!is_live', {'is_live': True}))
self.assertTrue(match_str('!is_live', {'is_live': False}))
self.assertTrue(match_str('!is_live', {'is_live': None}))
self.assertTrue(match_str('!is_live', {}))
self.assertTrue(match_str('title', {'title': 'abc'}))
self.assertTrue(match_str('title', {'title': ''}))
self.assertFalse(match_str('!title', {'title': 'abc'}))
self.assertFalse(match_str('!title', {'title': ''}))
def test_parse_dfxp_time_expr(self): def test_parse_dfxp_time_expr(self):
self.assertEqual(parse_dfxp_time_expr(None), None) self.assertEqual(parse_dfxp_time_expr(None), None)
@ -1385,20 +1237,6 @@ Line 1
self.assertRaises(ValueError, encode_base_n, 0, 70) self.assertRaises(ValueError, encode_base_n, 0, 70)
self.assertRaises(ValueError, encode_base_n, 0, 60, custom_table) self.assertRaises(ValueError, encode_base_n, 0, 60, custom_table)
def test_caesar(self):
self.assertEqual(caesar('ace', 'abcdef', 2), 'cea')
self.assertEqual(caesar('cea', 'abcdef', -2), 'ace')
self.assertEqual(caesar('ace', 'abcdef', -2), 'eac')
self.assertEqual(caesar('eac', 'abcdef', 2), 'ace')
self.assertEqual(caesar('ace', 'abcdef', 0), 'ace')
self.assertEqual(caesar('xyz', 'abcdef', 2), 'xyz')
self.assertEqual(caesar('abc', 'acegik', 2), 'ebg')
self.assertEqual(caesar('ebg', 'acegik', -2), 'abc')
def test_rot47(self):
self.assertEqual(rot47('youtube-dl'), r'J@FEF36\5=')
self.assertEqual(rot47('YOUTUBE-DL'), r'*~&%&qt\s{')
def test_urshift(self): def test_urshift(self):
self.assertEqual(urshift(3, 1), 1) self.assertEqual(urshift(3, 1), 1)
self.assertEqual(urshift(-3, 1), 2147483646) self.assertEqual(urshift(-3, 1), 2147483646)

View file

@ -267,7 +267,7 @@ class TestYoutubeChapters(unittest.TestCase):
for description, duration, expected_chapters in self._TEST_CASES: for description, duration, expected_chapters in self._TEST_CASES:
ie = YoutubeIE() ie = YoutubeIE()
expect_value( expect_value(
self, ie._extract_chapters_from_description(description, duration), self, ie._extract_chapters(description, duration),
expected_chapters, None) expected_chapters, None)

View file

@ -61,7 +61,7 @@ class TestYoutubeLists(unittest.TestCase):
dl = FakeYDL() dl = FakeYDL()
dl.params['extract_flat'] = True dl.params['extract_flat'] = True
ie = YoutubePlaylistIE(dl) ie = YoutubePlaylistIE(dl)
result = ie.extract('https://www.youtube.com/playlist?list=PL-KKIb8rvtMSrAO9YFbeM6UQrAqoFTUWv') result = ie.extract('https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re')
self.assertIsPlaylist(result) self.assertIsPlaylist(result)
for entry in result['entries']: for entry in result['entries']:
self.assertTrue(entry.get('title')) self.assertTrue(entry.get('title'))

View file

@ -74,28 +74,6 @@ _TESTS = [
] ]
class TestPlayerInfo(unittest.TestCase):
def test_youtube_extract_player_info(self):
PLAYER_URLS = (
('https://www.youtube.com/s/player/64dddad9/player_ias.vflset/en_US/base.js', '64dddad9'),
# obsolete
('https://www.youtube.com/yts/jsbin/player_ias-vfle4-e03/en_US/base.js', 'vfle4-e03'),
('https://www.youtube.com/yts/jsbin/player_ias-vfl49f_g4/en_US/base.js', 'vfl49f_g4'),
('https://www.youtube.com/yts/jsbin/player_ias-vflCPQUIL/en_US/base.js', 'vflCPQUIL'),
('https://www.youtube.com/yts/jsbin/player-vflzQZbt7/en_US/base.js', 'vflzQZbt7'),
('https://www.youtube.com/yts/jsbin/player-en_US-vflaxXRn1/base.js', 'vflaxXRn1'),
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js', 'vflXGBaUN'),
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js', 'vflKjOTVq'),
('http://s.ytimg.com/yt/swfbin/watch_as3-vflrEm9Nq.swf', 'vflrEm9Nq'),
('https://s.ytimg.com/yts/swfbin/player-vflenCdZL/watch_as3.swf', 'vflenCdZL'),
)
for player_url, expected_player_id in PLAYER_URLS:
expected_player_type = player_url.split('.')[-1]
player_type, player_id = YoutubeIE._extract_player_info(player_url)
self.assertEqual(player_type, expected_player_type)
self.assertEqual(player_id, expected_player_id)
class TestSignature(unittest.TestCase): class TestSignature(unittest.TestCase):
def setUp(self): def setUp(self):
TEST_DIR = os.path.dirname(os.path.abspath(__file__)) TEST_DIR = os.path.dirname(os.path.abspath(__file__))

View file

@ -1,6 +0,0 @@
# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This is a generated file! Do not edit.
#HttpOnly_www.foobar.foobar FALSE / TRUE 2147483647 HTTPONLY_COOKIE HTTPONLY_COOKIE_VALUE
www.foobar.foobar FALSE / TRUE 2147483647 JS_ACCESSIBLE_COOKIE JS_ACCESSIBLE_COOKIE_VALUE

View file

@ -1,9 +0,0 @@
# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This is a generated file! Do not edit.
# Cookie file entry with invalid number of fields - 6 instead of 7
www.foobar.foobar FALSE / FALSE 0 COOKIE
# Cookie file entry with invalid expires at
www.foobar.foobar FALSE / FALSE 1.7976931348623157e+308 COOKIE VALUE

View file

@ -1,6 +0,0 @@
# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This is a generated file! Do not edit.
www.foobar.foobar FALSE / TRUE YoutubeDLExpiresEmpty YoutubeDLExpiresEmptyValue
www.foobar.foobar FALSE / TRUE 0 YoutubeDLExpires0 YoutubeDLExpires0Value

View file

@ -1,28 +0,0 @@
#EXTM3U
#EXT-X-VERSION:4
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=1255659,PROGRAM-ID=1,CODECS="avc1.42c01e,mp4a.40.2",RESOLUTION=640x360
/videos/BorisHesser_2018S/video/600k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=163154,PROGRAM-ID=1,CODECS="avc1.42c00c,mp4a.40.2",RESOLUTION=320x180
/videos/BorisHesser_2018S/video/64k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=481701,PROGRAM-ID=1,CODECS="avc1.42c015,mp4a.40.2",RESOLUTION=512x288
/videos/BorisHesser_2018S/video/180k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=769968,PROGRAM-ID=1,CODECS="avc1.42c015,mp4a.40.2",RESOLUTION=512x288
/videos/BorisHesser_2018S/video/320k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=984037,PROGRAM-ID=1,CODECS="avc1.42c015,mp4a.40.2",RESOLUTION=512x288
/videos/BorisHesser_2018S/video/450k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=1693925,PROGRAM-ID=1,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=853x480
/videos/BorisHesser_2018S/video/950k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=2462469,PROGRAM-ID=1,CODECS="avc1.640028,mp4a.40.2",RESOLUTION=1280x720
/videos/BorisHesser_2018S/video/1500k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-STREAM-INF:AUDIO="600k",BANDWIDTH=68101,PROGRAM-ID=1,CODECS="mp4a.40.2",DEFAULT=YES
/videos/BorisHesser_2018S/audio/600k.m3u8?nobumpers=true&uniqueId=76011e2b
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=74298,PROGRAM-ID=1,CODECS="avc1.42c00c",RESOLUTION=320x180,URI="/videos/BorisHesser_2018S/video/64k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=216200,PROGRAM-ID=1,CODECS="avc1.42c015",RESOLUTION=512x288,URI="/videos/BorisHesser_2018S/video/180k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=304717,PROGRAM-ID=1,CODECS="avc1.42c015",RESOLUTION=512x288,URI="/videos/BorisHesser_2018S/video/320k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=350933,PROGRAM-ID=1,CODECS="avc1.42c015",RESOLUTION=512x288,URI="/videos/BorisHesser_2018S/video/450k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=495850,PROGRAM-ID=1,CODECS="avc1.42c01e",RESOLUTION=640x360,URI="/videos/BorisHesser_2018S/video/600k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=810750,PROGRAM-ID=1,CODECS="avc1.4d401f",RESOLUTION=853x480,URI="/videos/BorisHesser_2018S/video/950k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=1273700,PROGRAM-ID=1,CODECS="avc1.640028",RESOLUTION=1280x720,URI="/videos/BorisHesser_2018S/video/1500k_iframe.m3u8?nobumpers=true&uniqueId=76011e2b"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="600k",LANGUAGE="en",NAME="Audio",AUTOSELECT=YES,DEFAULT=YES,URI="/videos/BorisHesser_2018S/audio/600k.m3u8?nobumpers=true&uniqueId=76011e2b",BANDWIDTH=614400

View file

@ -1,28 +0,0 @@
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<MPD mediaPresentationDuration="PT54.915S" minBufferTime="PT1.500S" profiles="urn:mpeg:dash:profile:isoff-on-demand:2011" type="static" xmlns="urn:mpeg:dash:schema:mpd:2011">
<Period duration="PT54.915S">
<AdaptationSet segmentAlignment="true" subsegmentAlignment="true" subsegmentStartsWithSAP="1">
<Representation bandwidth="804261" codecs="avc1.4d401e" frameRate="30" height="360" id="VIDEO-1" mimeType="video/mp4" startWithSAP="1" width="360">
<BaseURL>DASH_360</BaseURL>
<SegmentBase indexRange="915-1114" indexRangeExact="true">
<Initialization range="0-914"/>
</SegmentBase>
</Representation>
<Representation bandwidth="608000" codecs="avc1.4d401e" frameRate="30" height="240" id="VIDEO-2" mimeType="video/mp4" startWithSAP="1" width="240">
<BaseURL>DASH_240</BaseURL>
<SegmentBase indexRange="913-1112" indexRangeExact="true">
<Initialization range="0-912"/>
</SegmentBase>
</Representation>
</AdaptationSet>
<AdaptationSet>
<Representation audioSamplingRate="48000" bandwidth="129870" codecs="mp4a.40.2" id="AUDIO-1" mimeType="audio/mp4" startWithSAP="1">
<AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2"/>
<BaseURL>audio</BaseURL>
<SegmentBase indexRange="832-1007" indexRangeExact="true">
<Initialization range="0-831"/>
</SegmentBase>
</Representation>
</AdaptationSet>
</Period>
</MPD>

View file

@ -1,34 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<playlist version="1" xmlns="http://xspf.org/ns/0/">
<date>2018-03-09T18:01:43Z</date>
<trackList>
<track>
<location>cd1/track%201.mp3</location>
<title>Pandemonium</title>
<creator>Foilverb</creator>
<annotation>Visit http://bigbrother404.bandcamp.com</annotation>
<album>Pandemonium EP</album>
<trackNum>1</trackNum>
<duration>202416</duration>
</track>
<track>
<location>../%E3%83%88%E3%83%A9%E3%83%83%E3%82%AF%E3%80%80%EF%BC%92.mp3</location>
<title>Final Cartridge (Nichico Twelve Remix)</title>
<annotation>Visit http://bigbrother404.bandcamp.com</annotation>
<creator>Foilverb</creator>
<album>Pandemonium EP</album>
<trackNum>2</trackNum>
<duration>255857</duration>
</track>
<track>
<location>track3.mp3</location>
<location>https://example.com/track3.mp3</location>
<title>Rebuilding Nightingale</title>
<annotation>Visit http://bigbrother404.bandcamp.com</annotation>
<creator>Foilverb</creator>
<album>Pandemonium EP</album>
<trackNum>3</trackNum>
<duration>287915</duration>
</track>
</trackList>
</playlist>

View file

@ -7,7 +7,7 @@
# https://github.com/zsh-users/antigen # https://github.com/zsh-users/antigen
# Install youtube-dl: # Install youtube-dl:
# antigen bundle ytdl-org/youtube-dl # antigen bundle rg3/youtube-dl
# Bundles installed by antigen are available for use immediately. # Bundles installed by antigen are available for use immediately.
# Update youtube-dl (and all other antigen bundles): # Update youtube-dl (and all other antigen bundles):

View file

@ -82,17 +82,14 @@ from .utils import (
sanitize_url, sanitize_url,
sanitized_Request, sanitized_Request,
std_headers, std_headers,
str_or_none,
subtitles_filename, subtitles_filename,
UnavailableVideoError, UnavailableVideoError,
url_basename, url_basename,
version_tuple, version_tuple,
write_json_file, write_json_file,
write_string, write_string,
YoutubeDLCookieJar,
YoutubeDLCookieProcessor, YoutubeDLCookieProcessor,
YoutubeDLHandler, YoutubeDLHandler,
YoutubeDLRedirectHandler,
) )
from .cache import Cache from .cache import Cache
from .extractor import get_info_extractor, gen_extractor_classes, _LAZY_LOADER from .extractor import get_info_extractor, gen_extractor_classes, _LAZY_LOADER
@ -214,7 +211,7 @@ class YoutubeDL(object):
At the moment, this is only supported by YouTube. At the moment, this is only supported by YouTube.
proxy: URL of the proxy server to use proxy: URL of the proxy server to use
geo_verification_proxy: URL of the proxy to use for IP address verification geo_verification_proxy: URL of the proxy to use for IP address verification
on geo-restricted sites. on geo-restricted sites. (Experimental)
socket_timeout: Time to wait for unresponsive hosts, in seconds socket_timeout: Time to wait for unresponsive hosts, in seconds
bidi_workaround: Work around buggy terminals without bidirectional text bidi_workaround: Work around buggy terminals without bidirectional text
support, using fridibi support, using fridibi
@ -262,7 +259,7 @@ class YoutubeDL(object):
- "warn": only emit a warning - "warn": only emit a warning
- "detect_or_warn": check whether we can do anything - "detect_or_warn": check whether we can do anything
about it, warn otherwise (default) about it, warn otherwise (default)
source_address: Client-side IP address to bind to. source_address: (Experimental) Client-side IP address to bind to.
call_home: Boolean, true iff we are allowed to contact the call_home: Boolean, true iff we are allowed to contact the
youtube-dl servers for debugging. youtube-dl servers for debugging.
sleep_interval: Number of seconds to sleep before each download when sleep_interval: Number of seconds to sleep before each download when
@ -284,14 +281,11 @@ class YoutubeDL(object):
match_filter_func in utils.py is one example for this. match_filter_func in utils.py is one example for this.
no_color: Do not emit color codes in output. no_color: Do not emit color codes in output.
geo_bypass: Bypass geographic restriction via faking X-Forwarded-For geo_bypass: Bypass geographic restriction via faking X-Forwarded-For
HTTP header HTTP header (experimental)
geo_bypass_country: geo_bypass_country:
Two-letter ISO 3166-2 country code that will be used for Two-letter ISO 3166-2 country code that will be used for
explicit geographic restriction bypassing via faking explicit geographic restriction bypassing via faking
X-Forwarded-For HTTP header X-Forwarded-For HTTP header (experimental)
geo_bypass_ip_block:
IP range in CIDR notation that will be used similarly to
geo_bypass_country
The following options determine which downloader is picked: The following options determine which downloader is picked:
external_downloader: Executable of the external downloader to call. external_downloader: Executable of the external downloader to call.
@ -304,14 +298,11 @@ class YoutubeDL(object):
the downloader (see youtube_dl/downloader/common.py): the downloader (see youtube_dl/downloader/common.py):
nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test, nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test,
noresizebuffer, retries, continuedl, noprogress, consoletitle, noresizebuffer, retries, continuedl, noprogress, consoletitle,
xattr_set_filesize, external_downloader_args, hls_use_mpegts, xattr_set_filesize, external_downloader_args, hls_use_mpegts.
http_chunk_size.
The following options are used by the post processors: The following options are used by the post processors:
prefer_ffmpeg: If False, use avconv instead of ffmpeg if both are available, prefer_ffmpeg: If True, use ffmpeg instead of avconv if both are available,
otherwise prefer ffmpeg. otherwise prefer avconv.
ffmpeg_location: Location of the ffmpeg/avconv binary; either the path
to the binary or its containing directory.
postprocessor_args: A list of additional command-line arguments for the postprocessor_args: A list of additional command-line arguments for the
postprocessor. postprocessor.
@ -401,9 +392,9 @@ class YoutubeDL(object):
else: else:
raise raise
if (sys.platform != 'win32' if (sys.platform != 'win32' and
and sys.getfilesystemencoding() in ['ascii', 'ANSI_X3.4-1968'] sys.getfilesystemencoding() in ['ascii', 'ANSI_X3.4-1968'] and
and not params.get('restrictfilenames', False)): not params.get('restrictfilenames', False)):
# Unicode filesystem API will throw errors (#1474, #13027) # Unicode filesystem API will throw errors (#1474, #13027)
self.report_warning( self.report_warning(
'Assuming --restrict-filenames since file system encoding ' 'Assuming --restrict-filenames since file system encoding '
@ -441,9 +432,9 @@ class YoutubeDL(object):
if re.match(r'^-[0-9A-Za-z_-]{10}$', a)] if re.match(r'^-[0-9A-Za-z_-]{10}$', a)]
if idxs: if idxs:
correct_argv = ( correct_argv = (
['youtube-dl'] ['youtube-dl'] +
+ [a for i, a in enumerate(argv) if i not in idxs] [a for i, a in enumerate(argv) if i not in idxs] +
+ ['--'] + [argv[i] for i in idxs] ['--'] + [argv[i] for i in idxs]
) )
self.report_warning( self.report_warning(
'Long argument string detected. ' 'Long argument string detected. '
@ -540,8 +531,6 @@ class YoutubeDL(object):
def save_console_title(self): def save_console_title(self):
if not self.params.get('consoletitle', False): if not self.params.get('consoletitle', False):
return return
if self.params.get('simulate', False):
return
if compat_os_name != 'nt' and 'TERM' in os.environ: if compat_os_name != 'nt' and 'TERM' in os.environ:
# Save the title on stack # Save the title on stack
self._write_string('\033[22;0t', self._screen_file) self._write_string('\033[22;0t', self._screen_file)
@ -549,8 +538,6 @@ class YoutubeDL(object):
def restore_console_title(self): def restore_console_title(self):
if not self.params.get('consoletitle', False): if not self.params.get('consoletitle', False):
return return
if self.params.get('simulate', False):
return
if compat_os_name != 'nt' and 'TERM' in os.environ: if compat_os_name != 'nt' and 'TERM' in os.environ:
# Restore the title from stack # Restore the title from stack
self._write_string('\033[23;0t', self._screen_file) self._write_string('\033[23;0t', self._screen_file)
@ -563,7 +550,7 @@ class YoutubeDL(object):
self.restore_console_title() self.restore_console_title()
if self.params.get('cookiefile') is not None: if self.params.get('cookiefile') is not None:
self.cookiejar.save(ignore_discard=True, ignore_expires=True) self.cookiejar.save()
def trouble(self, message=None, tb=None): def trouble(self, message=None, tb=None):
"""Determine action to take when a download problem appears. """Determine action to take when a download problem appears.
@ -851,11 +838,10 @@ class YoutubeDL(object):
if result_type in ('url', 'url_transparent'): if result_type in ('url', 'url_transparent'):
ie_result['url'] = sanitize_url(ie_result['url']) ie_result['url'] = sanitize_url(ie_result['url'])
extract_flat = self.params.get('extract_flat', False) extract_flat = self.params.get('extract_flat', False)
if ((extract_flat == 'in_playlist' and 'playlist' in extra_info) if ((extract_flat == 'in_playlist' and 'playlist' in extra_info) or
or extract_flat is True): extract_flat is True):
self.__forced_printings( if self.params.get('forcejson', False):
ie_result, self.prepare_filename(ie_result), self.to_stdout(json.dumps(ie_result))
incomplete=True)
return ie_result return ie_result
if result_type == 'video': if result_type == 'video':
@ -893,7 +879,7 @@ class YoutubeDL(object):
# url_transparent. In such cases outer metadata (from ie_result) # url_transparent. In such cases outer metadata (from ie_result)
# should be propagated to inner one (info). For this to happen # should be propagated to inner one (info). For this to happen
# _type of info should be overridden with url_transparent. This # _type of info should be overridden with url_transparent. This
# fixes issue from https://github.com/ytdl-org/youtube-dl/pull/11163. # fixes issue from https://github.com/rg3/youtube-dl/pull/11163.
if new_result.get('_type') == 'url': if new_result.get('_type') == 'url':
new_result['_type'] = 'url_transparent' new_result['_type'] = 'url_transparent'
@ -991,7 +977,7 @@ class YoutubeDL(object):
'playlist_title': ie_result.get('title'), 'playlist_title': ie_result.get('title'),
'playlist_uploader': ie_result.get('uploader'), 'playlist_uploader': ie_result.get('uploader'),
'playlist_uploader_id': ie_result.get('uploader_id'), 'playlist_uploader_id': ie_result.get('uploader_id'),
'playlist_index': playlistitems[i - 1] if playlistitems else i + playliststart, 'playlist_index': i + playliststart,
'extractor': ie_result['extractor'], 'extractor': ie_result['extractor'],
'webpage_url': ie_result['webpage_url'], 'webpage_url': ie_result['webpage_url'],
'webpage_url_basename': url_basename(ie_result['webpage_url']), 'webpage_url_basename': url_basename(ie_result['webpage_url']),
@ -1046,7 +1032,7 @@ class YoutubeDL(object):
'!=': operator.ne, '!=': operator.ne,
} }
operator_rex = re.compile(r'''(?x)\s* operator_rex = re.compile(r'''(?x)\s*
(?P<key>width|height|tbr|abr|vbr|asr|filesize|filesize_approx|fps) (?P<key>width|height|tbr|abr|vbr|asr|filesize|fps)
\s*(?P<op>%s)(?P<none_inclusive>\s*\?)?\s* \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?\s*
(?P<value>[0-9.]+(?:[kKmMgGtTpPeEzZyY]i?[Bb]?)?) (?P<value>[0-9.]+(?:[kKmMgGtTpPeEzZyY]i?[Bb]?)?)
$ $
@ -1068,24 +1054,21 @@ class YoutubeDL(object):
if not m: if not m:
STR_OPERATORS = { STR_OPERATORS = {
'=': operator.eq, '=': operator.eq,
'!=': operator.ne,
'^=': lambda attr, value: attr.startswith(value), '^=': lambda attr, value: attr.startswith(value),
'$=': lambda attr, value: attr.endswith(value), '$=': lambda attr, value: attr.endswith(value),
'*=': lambda attr, value: value in attr, '*=': lambda attr, value: value in attr,
} }
str_operator_rex = re.compile(r'''(?x) str_operator_rex = re.compile(r'''(?x)
\s*(?P<key>ext|acodec|vcodec|container|protocol|format_id) \s*(?P<key>ext|acodec|vcodec|container|protocol|format_id)
\s*(?P<negation>!\s*)?(?P<op>%s)(?P<none_inclusive>\s*\?)? \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?
\s*(?P<value>[a-zA-Z0-9._-]+) \s*(?P<value>[a-zA-Z0-9._-]+)
\s*$ \s*$
''' % '|'.join(map(re.escape, STR_OPERATORS.keys()))) ''' % '|'.join(map(re.escape, STR_OPERATORS.keys())))
m = str_operator_rex.search(filter_spec) m = str_operator_rex.search(filter_spec)
if m: if m:
comparison_value = m.group('value') comparison_value = m.group('value')
str_op = STR_OPERATORS[m.group('op')] op = STR_OPERATORS[m.group('op')]
if m.group('negation'):
op = lambda attr, value: not str_op(attr, value)
else:
op = str_op
if not m: if not m:
raise ValueError('Invalid filter specification %r' % filter_spec) raise ValueError('Invalid filter specification %r' % filter_spec)
@ -1491,28 +1474,23 @@ class YoutubeDL(object):
if info_dict.get('%s_number' % field) is not None and not info_dict.get(field): if info_dict.get('%s_number' % field) is not None and not info_dict.get(field):
info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field]) info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field])
for cc_kind in ('subtitles', 'automatic_captions'):
cc = info_dict.get(cc_kind)
if cc:
for _, subtitle in cc.items():
for subtitle_format in subtitle:
if subtitle_format.get('url'):
subtitle_format['url'] = sanitize_url(subtitle_format['url'])
if subtitle_format.get('ext') is None:
subtitle_format['ext'] = determine_ext(subtitle_format['url']).lower()
automatic_captions = info_dict.get('automatic_captions')
subtitles = info_dict.get('subtitles') subtitles = info_dict.get('subtitles')
if subtitles:
for _, subtitle in subtitles.items():
for subtitle_format in subtitle:
if subtitle_format.get('url'):
subtitle_format['url'] = sanitize_url(subtitle_format['url'])
if subtitle_format.get('ext') is None:
subtitle_format['ext'] = determine_ext(subtitle_format['url']).lower()
if self.params.get('listsubtitles', False): if self.params.get('listsubtitles', False):
if 'automatic_captions' in info_dict: if 'automatic_captions' in info_dict:
self.list_subtitles( self.list_subtitles(info_dict['id'], info_dict.get('automatic_captions'), 'automatic captions')
info_dict['id'], automatic_captions, 'automatic captions')
self.list_subtitles(info_dict['id'], subtitles, 'subtitles') self.list_subtitles(info_dict['id'], subtitles, 'subtitles')
return return
info_dict['requested_subtitles'] = self.process_subtitles( info_dict['requested_subtitles'] = self.process_subtitles(
info_dict['id'], subtitles, automatic_captions) info_dict['id'], subtitles,
info_dict.get('automatic_captions'))
# We now pick which formats have to be downloaded # We now pick which formats have to be downloaded
if info_dict.get('formats') is None: if info_dict.get('formats') is None:
@ -1610,7 +1588,7 @@ class YoutubeDL(object):
# by extractor are incomplete or not (i.e. whether extractor provides only # by extractor are incomplete or not (i.e. whether extractor provides only
# video-only or audio-only formats) for proper formats selection for # video-only or audio-only formats) for proper formats selection for
# extractors with such incomplete formats (see # extractors with such incomplete formats (see
# https://github.com/ytdl-org/youtube-dl/pull/5556). # https://github.com/rg3/youtube-dl/pull/5556).
# Since formats may be filtered during format selection and may not match # Since formats may be filtered during format selection and may not match
# the original formats the results may be incorrect. Thus original formats # the original formats the results may be incorrect. Thus original formats
# or pre-calculated metrics should be passed to format selection routines # or pre-calculated metrics should be passed to format selection routines
@ -1618,12 +1596,12 @@ class YoutubeDL(object):
# We will pass a context object containing all necessary additional data # We will pass a context object containing all necessary additional data
# instead of just formats. # instead of just formats.
# This fixes incorrect format selection issue (see # This fixes incorrect format selection issue (see
# https://github.com/ytdl-org/youtube-dl/issues/10083). # https://github.com/rg3/youtube-dl/issues/10083).
incomplete_formats = ( incomplete_formats = (
# All formats are video-only or # All formats are video-only or
all(f.get('vcodec') != 'none' and f.get('acodec') == 'none' for f in formats) all(f.get('vcodec') != 'none' and f.get('acodec') == 'none' for f in formats) or
# all formats are audio-only # all formats are audio-only
or all(f.get('vcodec') == 'none' and f.get('acodec') != 'none' for f in formats)) all(f.get('vcodec') == 'none' and f.get('acodec') != 'none' for f in formats))
ctx = { ctx = {
'formats': formats, 'formats': formats,
@ -1695,36 +1673,6 @@ class YoutubeDL(object):
subs[lang] = f subs[lang] = f
return subs return subs
def __forced_printings(self, info_dict, filename, incomplete):
def print_mandatory(field):
if (self.params.get('force%s' % field, False)
and (not incomplete or info_dict.get(field) is not None)):
self.to_stdout(info_dict[field])
def print_optional(field):
if (self.params.get('force%s' % field, False)
and info_dict.get(field) is not None):
self.to_stdout(info_dict[field])
print_mandatory('title')
print_mandatory('id')
if self.params.get('forceurl', False) and not incomplete:
if info_dict.get('requested_formats') is not None:
for f in info_dict['requested_formats']:
self.to_stdout(f['url'] + f.get('play_path', ''))
else:
# For RTMP URLs, also include the playpath
self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
print_optional('thumbnail')
print_optional('description')
if self.params.get('forcefilename', False) and filename is not None:
self.to_stdout(filename)
if self.params.get('forceduration', False) and info_dict.get('duration') is not None:
self.to_stdout(formatSeconds(info_dict['duration']))
print_mandatory('format')
if self.params.get('forcejson', False):
self.to_stdout(json.dumps(info_dict))
def process_info(self, info_dict): def process_info(self, info_dict):
"""Process a single resolved IE result.""" """Process a single resolved IE result."""
@ -1735,8 +1683,9 @@ class YoutubeDL(object):
if self._num_downloads >= int(max_downloads): if self._num_downloads >= int(max_downloads):
raise MaxDownloadsReached() raise MaxDownloadsReached()
# TODO: backward compatibility, to be removed
info_dict['fulltitle'] = info_dict['title'] info_dict['fulltitle'] = info_dict['title']
if len(info_dict['title']) > 200:
info_dict['title'] = info_dict['title'][:197] + '...'
if 'format' not in info_dict: if 'format' not in info_dict:
info_dict['format'] = info_dict['ext'] info_dict['format'] = info_dict['ext']
@ -1751,7 +1700,29 @@ class YoutubeDL(object):
info_dict['_filename'] = filename = self.prepare_filename(info_dict) info_dict['_filename'] = filename = self.prepare_filename(info_dict)
# Forced printings # Forced printings
self.__forced_printings(info_dict, filename, incomplete=False) if self.params.get('forcetitle', False):
self.to_stdout(info_dict['fulltitle'])
if self.params.get('forceid', False):
self.to_stdout(info_dict['id'])
if self.params.get('forceurl', False):
if info_dict.get('requested_formats') is not None:
for f in info_dict['requested_formats']:
self.to_stdout(f['url'] + f.get('play_path', ''))
else:
# For RTMP URLs, also include the playpath
self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
if self.params.get('forcethumbnail', False) and info_dict.get('thumbnail') is not None:
self.to_stdout(info_dict['thumbnail'])
if self.params.get('forcedescription', False) and info_dict.get('description') is not None:
self.to_stdout(info_dict['description'])
if self.params.get('forcefilename', False) and filename is not None:
self.to_stdout(filename)
if self.params.get('forceduration', False) and info_dict.get('duration') is not None:
self.to_stdout(formatSeconds(info_dict['duration']))
if self.params.get('forceformat', False):
self.to_stdout(info_dict['format'])
if self.params.get('forcejson', False):
self.to_stdout(json.dumps(info_dict))
# Do nothing else if in simulate mode # Do nothing else if in simulate mode
if self.params.get('simulate', False): if self.params.get('simulate', False):
@ -1792,8 +1763,6 @@ class YoutubeDL(object):
annofn = replace_extension(filename, 'annotations.xml', info_dict.get('ext')) annofn = replace_extension(filename, 'annotations.xml', info_dict.get('ext'))
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(annofn)): if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(annofn)):
self.to_screen('[info] Video annotations are already present') self.to_screen('[info] Video annotations are already present')
elif not info_dict.get('annotations'):
self.report_warning('There are no annotations to write.')
else: else:
try: try:
self.to_screen('[info] Writing video annotations to: ' + annofn) self.to_screen('[info] Writing video annotations to: ' + annofn)
@ -1815,7 +1784,7 @@ class YoutubeDL(object):
ie = self.get_info_extractor(info_dict['extractor_key']) ie = self.get_info_extractor(info_dict['extractor_key'])
for sub_lang, sub_info in subtitles.items(): for sub_lang, sub_info in subtitles.items():
sub_format = sub_info['ext'] sub_format = sub_info['ext']
sub_filename = subtitles_filename(filename, sub_lang, sub_format, info_dict.get('ext')) sub_filename = subtitles_filename(filename, sub_lang, sub_format)
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(sub_filename)): if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(sub_filename)):
self.to_screen('[info] Video subtitle %s.%s is already present' % (sub_lang, sub_format)) self.to_screen('[info] Video subtitle %s.%s is already present' % (sub_lang, sub_format))
else: else:
@ -1823,7 +1792,7 @@ class YoutubeDL(object):
if sub_info.get('data') is not None: if sub_info.get('data') is not None:
try: try:
# Use newline='' to prevent conversion of newline characters # Use newline='' to prevent conversion of newline characters
# See https://github.com/ytdl-org/youtube-dl/issues/10268 # See https://github.com/rg3/youtube-dl/issues/10268
with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8', newline='') as subfile: with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8', newline='') as subfile:
subfile.write(sub_info['data']) subfile.write(sub_info['data'])
except (OSError, IOError): except (OSError, IOError):
@ -1879,7 +1848,7 @@ class YoutubeDL(object):
def compatible_formats(formats): def compatible_formats(formats):
video, audio = formats video, audio = formats
# Check extension # Check extension
video_ext, audio_ext = video.get('ext'), audio.get('ext') video_ext, audio_ext = audio.get('ext'), video.get('ext')
if video_ext and audio_ext: if video_ext and audio_ext:
COMPATIBLE_EXTS = ( COMPATIBLE_EXTS = (
('mp3', 'mp4', 'm4a', 'm4p', 'm4b', 'm4r', 'm4v', 'ismv', 'isma'), ('mp3', 'mp4', 'm4a', 'm4p', 'm4b', 'm4r', 'm4v', 'ismv', 'isma'),
@ -1958,8 +1927,8 @@ class YoutubeDL(object):
else: else:
assert fixup_policy in ('ignore', 'never') assert fixup_policy in ('ignore', 'never')
if (info_dict.get('requested_formats') is None if (info_dict.get('requested_formats') is None and
and info_dict.get('container') == 'm4a_dash'): info_dict.get('container') == 'm4a_dash'):
if fixup_policy == 'warn': if fixup_policy == 'warn':
self.report_warning( self.report_warning(
'%s: writing DASH m4a. ' '%s: writing DASH m4a. '
@ -1978,9 +1947,9 @@ class YoutubeDL(object):
else: else:
assert fixup_policy in ('ignore', 'never') assert fixup_policy in ('ignore', 'never')
if (info_dict.get('protocol') == 'm3u8_native' if (info_dict.get('protocol') == 'm3u8_native' or
or info_dict.get('protocol') == 'm3u8' info_dict.get('protocol') == 'm3u8' and
and self.params.get('hls_prefer_native')): self.params.get('hls_prefer_native')):
if fixup_policy == 'warn': if fixup_policy == 'warn':
self.report_warning('%s: malformed AAC bitstream detected.' % ( self.report_warning('%s: malformed AAC bitstream detected.' % (
info_dict['id'])) info_dict['id']))
@ -2006,10 +1975,10 @@ class YoutubeDL(object):
def download(self, url_list): def download(self, url_list):
"""Download a given list of URLs.""" """Download a given list of URLs."""
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL) outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
if (len(url_list) > 1 if (len(url_list) > 1 and
and outtmpl != '-' outtmpl != '-' and
and '%' not in outtmpl '%' not in outtmpl and
and self.params.get('max_downloads') != 1): self.params.get('max_downloads') != 1):
raise SameFileError(outtmpl) raise SameFileError(outtmpl)
for url in url_list: for url in url_list:
@ -2074,24 +2043,15 @@ class YoutubeDL(object):
self.report_warning('Unable to remove downloaded original file') self.report_warning('Unable to remove downloaded original file')
def _make_archive_id(self, info_dict): def _make_archive_id(self, info_dict):
video_id = info_dict.get('id')
if not video_id:
return
# Future-proof against any change in case # Future-proof against any change in case
# and backwards compatibility with prior versions # and backwards compatibility with prior versions
extractor = info_dict.get('extractor_key') or info_dict.get('ie_key') # key in a playlist extractor = info_dict.get('extractor_key')
if extractor is None: if extractor is None:
url = str_or_none(info_dict.get('url')) if 'id' in info_dict:
if not url: extractor = info_dict.get('ie_key') # key in a playlist
return if extractor is None:
# Try to find matching extractor for the URL and take its ie_key return None # Incomplete video information
for ie in self._ies: return extractor.lower() + ' ' + info_dict['id']
if ie.suitable(url):
extractor = ie.ie_key()
break
else:
return
return extractor.lower() + ' ' + video_id
def in_download_archive(self, info_dict): def in_download_archive(self, info_dict):
fn = self.params.get('download_archive') fn = self.params.get('download_archive')
@ -2099,7 +2059,7 @@ class YoutubeDL(object):
return False return False
vid_id = self._make_archive_id(info_dict) vid_id = self._make_archive_id(info_dict)
if not vid_id: if vid_id is None:
return False # Incomplete video information return False # Incomplete video information
try: try:
@ -2154,8 +2114,8 @@ class YoutubeDL(object):
if res: if res:
res += ', ' res += ', '
res += '%s container' % fdict['container'] res += '%s container' % fdict['container']
if (fdict.get('vcodec') is not None if (fdict.get('vcodec') is not None and
and fdict.get('vcodec') != 'none'): fdict.get('vcodec') != 'none'):
if res: if res:
res += ', ' res += ', '
res += fdict['vcodec'] res += fdict['vcodec']
@ -2242,7 +2202,7 @@ class YoutubeDL(object):
return return
if type('') is not compat_str: if type('') is not compat_str:
# Python 2.6 on SLES11 SP1 (https://github.com/ytdl-org/youtube-dl/issues/3326) # Python 2.6 on SLES11 SP1 (https://github.com/rg3/youtube-dl/issues/3326)
self.report_warning( self.report_warning(
'Your Python is broken! Update to a newer and supported version') 'Your Python is broken! Update to a newer and supported version')
@ -2324,9 +2284,10 @@ class YoutubeDL(object):
self.cookiejar = compat_cookiejar.CookieJar() self.cookiejar = compat_cookiejar.CookieJar()
else: else:
opts_cookiefile = expand_path(opts_cookiefile) opts_cookiefile = expand_path(opts_cookiefile)
self.cookiejar = YoutubeDLCookieJar(opts_cookiefile) self.cookiejar = compat_cookiejar.MozillaCookieJar(
opts_cookiefile)
if os.access(opts_cookiefile, os.R_OK): if os.access(opts_cookiefile, os.R_OK):
self.cookiejar.load(ignore_discard=True, ignore_expires=True) self.cookiejar.load()
cookie_processor = YoutubeDLCookieProcessor(self.cookiejar) cookie_processor = YoutubeDLCookieProcessor(self.cookiejar)
if opts_proxy is not None: if opts_proxy is not None:
@ -2336,7 +2297,7 @@ class YoutubeDL(object):
proxies = {'http': opts_proxy, 'https': opts_proxy} proxies = {'http': opts_proxy, 'https': opts_proxy}
else: else:
proxies = compat_urllib_request.getproxies() proxies = compat_urllib_request.getproxies()
# Set HTTPS proxy to HTTP one if given (https://github.com/ytdl-org/youtube-dl/issues/805) # Set HTTPS proxy to HTTP one if given (https://github.com/rg3/youtube-dl/issues/805)
if 'http' in proxies and 'https' not in proxies: if 'http' in proxies and 'https' not in proxies:
proxies['https'] = proxies['http'] proxies['https'] = proxies['http']
proxy_handler = PerRequestProxyHandler(proxies) proxy_handler = PerRequestProxyHandler(proxies)
@ -2344,13 +2305,12 @@ class YoutubeDL(object):
debuglevel = 1 if self.params.get('debug_printtraffic') else 0 debuglevel = 1 if self.params.get('debug_printtraffic') else 0
https_handler = make_HTTPS_handler(self.params, debuglevel=debuglevel) https_handler = make_HTTPS_handler(self.params, debuglevel=debuglevel)
ydlh = YoutubeDLHandler(self.params, debuglevel=debuglevel) ydlh = YoutubeDLHandler(self.params, debuglevel=debuglevel)
redirect_handler = YoutubeDLRedirectHandler()
data_handler = compat_urllib_request_DataHandler() data_handler = compat_urllib_request_DataHandler()
# When passing our own FileHandler instance, build_opener won't add the # When passing our own FileHandler instance, build_opener won't add the
# default FileHandler and allows us to disable the file protocol, which # default FileHandler and allows us to disable the file protocol, which
# can be used for malicious purposes (see # can be used for malicious purposes (see
# https://github.com/ytdl-org/youtube-dl/issues/8227) # https://github.com/rg3/youtube-dl/issues/8227)
file_handler = compat_urllib_request.FileHandler() file_handler = compat_urllib_request.FileHandler()
def file_open(*args, **kwargs): def file_open(*args, **kwargs):
@ -2358,11 +2318,11 @@ class YoutubeDL(object):
file_handler.file_open = file_open file_handler.file_open = file_open
opener = compat_urllib_request.build_opener( opener = compat_urllib_request.build_opener(
proxy_handler, https_handler, cookie_processor, ydlh, redirect_handler, data_handler, file_handler) proxy_handler, https_handler, cookie_processor, ydlh, data_handler, file_handler)
# Delete the default user-agent header, which would otherwise apply in # Delete the default user-agent header, which would otherwise apply in
# cases where our custom HTTP handler doesn't come into play # cases where our custom HTTP handler doesn't come into play
# (See https://github.com/ytdl-org/youtube-dl/issues/1309 for details) # (See https://github.com/rg3/youtube-dl/issues/1309 for details)
opener.addheaders = [] opener.addheaders = []
self._opener = opener self._opener = opener

View file

@ -48,7 +48,7 @@ from .YoutubeDL import YoutubeDL
def _real_main(argv=None): def _real_main(argv=None):
# Compatibility fixes for Windows # Compatibility fixes for Windows
if sys.platform == 'win32': if sys.platform == 'win32':
# https://github.com/ytdl-org/youtube-dl/issues/820 # https://github.com/rg3/youtube-dl/issues/820
codecs.register(lambda name: codecs.lookup('utf-8') if name == 'cp65001' else None) codecs.register(lambda name: codecs.lookup('utf-8') if name == 'cp65001' else None)
workaround_optparse_bug9161() workaround_optparse_bug9161()
@ -94,7 +94,7 @@ def _real_main(argv=None):
if opts.verbose: if opts.verbose:
write_string('[debug] Batch file urls: ' + repr(batch_urls) + '\n') write_string('[debug] Batch file urls: ' + repr(batch_urls) + '\n')
except IOError: except IOError:
sys.exit('ERROR: batch file %s could not be read' % opts.batchfile) sys.exit('ERROR: batch file could not be read')
all_urls = batch_urls + [url.strip() for url in args] # batch_urls are already striped in read_batch_urls all_urls = batch_urls + [url.strip() for url in args] # batch_urls are already striped in read_batch_urls
_enc = preferredencoding() _enc = preferredencoding()
all_urls = [url.decode(_enc, 'ignore') if isinstance(url, bytes) else url for url in all_urls] all_urls = [url.decode(_enc, 'ignore') if isinstance(url, bytes) else url for url in all_urls]
@ -166,8 +166,6 @@ def _real_main(argv=None):
if opts.max_sleep_interval is not None: if opts.max_sleep_interval is not None:
if opts.max_sleep_interval < 0: if opts.max_sleep_interval < 0:
parser.error('max sleep interval must be positive or 0') parser.error('max sleep interval must be positive or 0')
if opts.sleep_interval is None:
parser.error('min sleep interval must be specified, use --min-sleep-interval')
if opts.max_sleep_interval < opts.sleep_interval: if opts.max_sleep_interval < opts.sleep_interval:
parser.error('max sleep interval must be greater than or equal to min sleep interval') parser.error('max sleep interval must be greater than or equal to min sleep interval')
else: else:
@ -193,11 +191,6 @@ def _real_main(argv=None):
if numeric_buffersize is None: if numeric_buffersize is None:
parser.error('invalid buffer size specified') parser.error('invalid buffer size specified')
opts.buffersize = numeric_buffersize opts.buffersize = numeric_buffersize
if opts.http_chunk_size is not None:
numeric_chunksize = FileDownloader.parse_bytes(opts.http_chunk_size)
if not numeric_chunksize:
parser.error('invalid http chunk size specified')
opts.http_chunk_size = numeric_chunksize
if opts.playliststart <= 0: if opts.playliststart <= 0:
raise ValueError('Playlist start must be positive') raise ValueError('Playlist start must be positive')
if opts.playlistend not in (-1, None) and opts.playlistend < opts.playliststart: if opts.playlistend not in (-1, None) and opts.playlistend < opts.playliststart:
@ -230,14 +223,14 @@ def _real_main(argv=None):
if opts.allsubtitles and not opts.writeautomaticsub: if opts.allsubtitles and not opts.writeautomaticsub:
opts.writesubtitles = True opts.writesubtitles = True
outtmpl = ((opts.outtmpl is not None and opts.outtmpl) outtmpl = ((opts.outtmpl is not None and opts.outtmpl) or
or (opts.format == '-1' and opts.usetitle and '%(title)s-%(id)s-%(format)s.%(ext)s') (opts.format == '-1' and opts.usetitle and '%(title)s-%(id)s-%(format)s.%(ext)s') or
or (opts.format == '-1' and '%(id)s-%(format)s.%(ext)s') (opts.format == '-1' and '%(id)s-%(format)s.%(ext)s') or
or (opts.usetitle and opts.autonumber and '%(autonumber)s-%(title)s-%(id)s.%(ext)s') (opts.usetitle and opts.autonumber and '%(autonumber)s-%(title)s-%(id)s.%(ext)s') or
or (opts.usetitle and '%(title)s-%(id)s.%(ext)s') (opts.usetitle and '%(title)s-%(id)s.%(ext)s') or
or (opts.useid and '%(id)s.%(ext)s') (opts.useid and '%(id)s.%(ext)s') or
or (opts.autonumber and '%(autonumber)s-%(id)s.%(ext)s') (opts.autonumber and '%(autonumber)s-%(id)s.%(ext)s') or
or DEFAULT_OUTTMPL) DEFAULT_OUTTMPL)
if not os.path.splitext(outtmpl)[1] and opts.extractaudio: if not os.path.splitext(outtmpl)[1] and opts.extractaudio:
parser.error('Cannot download a video and extract audio into the same' parser.error('Cannot download a video and extract audio into the same'
' file! Use "{0}.%(ext)s" instead of "{0}" as the output' ' file! Use "{0}.%(ext)s" instead of "{0}" as the output'
@ -353,7 +346,6 @@ def _real_main(argv=None):
'keep_fragments': opts.keep_fragments, 'keep_fragments': opts.keep_fragments,
'buffersize': opts.buffersize, 'buffersize': opts.buffersize,
'noresizebuffer': opts.noresizebuffer, 'noresizebuffer': opts.noresizebuffer,
'http_chunk_size': opts.http_chunk_size,
'continuedl': opts.continue_dl, 'continuedl': opts.continue_dl,
'noprogress': opts.noprogress, 'noprogress': opts.noprogress,
'progress_with_newline': opts.progress_with_newline, 'progress_with_newline': opts.progress_with_newline,
@ -432,7 +424,6 @@ def _real_main(argv=None):
'config_location': opts.config_location, 'config_location': opts.config_location,
'geo_bypass': opts.geo_bypass, 'geo_bypass': opts.geo_bypass,
'geo_bypass_country': opts.geo_bypass_country, 'geo_bypass_country': opts.geo_bypass_country,
'geo_bypass_ip_block': opts.geo_bypass_ip_block,
# just for deprecation check # just for deprecation check
'autonumber': opts.autonumber if opts.autonumber is True else None, 'autonumber': opts.autonumber if opts.autonumber is True else None,
'usetitle': opts.usetitle if opts.usetitle is True else None, 'usetitle': opts.usetitle if opts.usetitle is True else None,

View file

@ -1,8 +1,8 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
from math import ceil from math import ceil
from .compat import compat_b64decode
from .utils import bytes_to_intlist, intlist_to_bytes from .utils import bytes_to_intlist, intlist_to_bytes
BLOCK_SIZE_BYTES = 16 BLOCK_SIZE_BYTES = 16
@ -180,7 +180,7 @@ def aes_decrypt_text(data, password, key_size_bytes):
""" """
NONCE_LENGTH_BYTES = 8 NONCE_LENGTH_BYTES = 8
data = bytes_to_intlist(compat_b64decode(data)) data = bytes_to_intlist(base64.b64decode(data.encode('utf-8')))
password = bytes_to_intlist(password.encode('utf-8')) password = bytes_to_intlist(password.encode('utf-8'))
key = password[:key_size_bytes] + [0] * (key_size_bytes - len(password)) key = password[:key_size_bytes] + [0] * (key_size_bytes - len(password))

View file

@ -1,7 +1,6 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
import binascii import binascii
import collections import collections
import ctypes import ctypes
@ -57,17 +56,6 @@ try:
except ImportError: # Python 2 except ImportError: # Python 2
import cookielib as compat_cookiejar import cookielib as compat_cookiejar
if sys.version_info[0] == 2:
class compat_cookiejar_Cookie(compat_cookiejar.Cookie):
def __init__(self, version, name, value, *args, **kwargs):
if isinstance(name, compat_str):
name = name.encode()
if isinstance(value, compat_str):
value = value.encode()
compat_cookiejar.Cookie.__init__(self, version, name, value, *args, **kwargs)
else:
compat_cookiejar_Cookie = compat_cookiejar.Cookie
try: try:
import http.cookies as compat_cookies import http.cookies as compat_cookies
except ImportError: # Python 2 except ImportError: # Python 2
@ -2375,7 +2363,7 @@ except ImportError: # Python 2
# HACK: The following are the correct unquote_to_bytes, unquote and unquote_plus # HACK: The following are the correct unquote_to_bytes, unquote and unquote_plus
# implementations from cpython 3.4.3's stdlib. Python 2's version # implementations from cpython 3.4.3's stdlib. Python 2's version
# is apparently broken (see https://github.com/ytdl-org/youtube-dl/pull/6244) # is apparently broken (see https://github.com/rg3/youtube-dl/pull/6244)
def compat_urllib_parse_unquote_to_bytes(string): def compat_urllib_parse_unquote_to_bytes(string):
"""unquote_to_bytes('abc%20def') -> b'abc def'.""" """unquote_to_bytes('abc%20def') -> b'abc def'."""
@ -2519,15 +2507,6 @@ class _TreeBuilder(etree.TreeBuilder):
pass pass
try:
# xml.etree.ElementTree.Element is a method in Python <=2.6 and
# the following will crash with:
# TypeError: isinstance() arg 2 must be a class, type, or tuple of classes and types
isinstance(None, xml.etree.ElementTree.Element)
from xml.etree.ElementTree import Element as compat_etree_Element
except TypeError: # Python <=2.6
from xml.etree.ElementTree import _ElementInterface as compat_etree_Element
if sys.version_info[0] >= 3: if sys.version_info[0] >= 3:
def compat_etree_fromstring(text): def compat_etree_fromstring(text):
return etree.XML(text, parser=etree.XMLParser(target=_TreeBuilder())) return etree.XML(text, parser=etree.XMLParser(target=_TreeBuilder()))
@ -2660,9 +2639,9 @@ else:
try: try:
args = shlex.split('中文') args = shlex.split('中文')
assert (isinstance(args, list) assert (isinstance(args, list) and
and isinstance(args[0], compat_str) isinstance(args[0], compat_str) and
and args[0] == '中文') args[0] == '中文')
compat_shlex_split = shlex.split compat_shlex_split = shlex.split
except (AssertionError, UnicodeEncodeError): except (AssertionError, UnicodeEncodeError):
# Working around shlex issue with unicode strings on some python 2 # Working around shlex issue with unicode strings on some python 2
@ -2765,17 +2744,6 @@ else:
compat_expanduser = os.path.expanduser compat_expanduser = os.path.expanduser
if compat_os_name == 'nt' and sys.version_info < (3, 8):
# os.path.realpath on Windows does not follow symbolic links
# prior to Python 3.8 (see https://bugs.python.org/issue9949)
def compat_realpath(path):
while os.path.islink(path):
path = os.path.abspath(os.readlink(path))
return path
else:
compat_realpath = os.path.realpath
if sys.version_info < (3, 0): if sys.version_info < (3, 0):
def compat_print(s): def compat_print(s):
from .utils import preferredencoding from .utils import preferredencoding
@ -2818,12 +2786,6 @@ except NameError: # Python 3
compat_numeric_types = (int, float, complex) compat_numeric_types = (int, float, complex)
try:
compat_integer_types = (int, long)
except NameError: # Python 3
compat_integer_types = (int, )
if sys.version_info < (2, 7): if sys.version_info < (2, 7):
def compat_socket_create_connection(address, timeout, source_address=None): def compat_socket_create_connection(address, timeout, source_address=None):
host, port = address host, port = address
@ -2850,7 +2812,7 @@ else:
compat_socket_create_connection = socket.create_connection compat_socket_create_connection = socket.create_connection
# Fix https://github.com/ytdl-org/youtube-dl/issues/4223 # Fix https://github.com/rg3/youtube-dl/issues/4223
# See http://bugs.python.org/issue9161 for what is broken # See http://bugs.python.org/issue9161 for what is broken
def workaround_optparse_bug9161(): def workaround_optparse_bug9161():
op = optparse.OptionParser() op = optparse.OptionParser()
@ -2934,24 +2896,9 @@ except TypeError:
if isinstance(spec, compat_str): if isinstance(spec, compat_str):
spec = spec.encode('ascii') spec = spec.encode('ascii')
return struct.unpack(spec, *args) return struct.unpack(spec, *args)
class compat_Struct(struct.Struct):
def __init__(self, fmt):
if isinstance(fmt, compat_str):
fmt = fmt.encode('ascii')
super(compat_Struct, self).__init__(fmt)
else: else:
compat_struct_pack = struct.pack compat_struct_pack = struct.pack
compat_struct_unpack = struct.unpack compat_struct_unpack = struct.unpack
if platform.python_implementation() == 'IronPython' and sys.version_info < (2, 7, 8):
class compat_Struct(struct.Struct):
def unpack(self, string):
if not isinstance(string, buffer): # noqa: F821
string = buffer(string) # noqa: F821
return super(compat_Struct, self).unpack(string)
else:
compat_Struct = struct.Struct
try: try:
from future_builtins import zip as compat_zip from future_builtins import zip as compat_zip
@ -2961,21 +2908,11 @@ except ImportError: # not 2.6+ or is 3.x
except ImportError: except ImportError:
compat_zip = zip compat_zip = zip
if sys.version_info < (3, 3):
def compat_b64decode(s, *args, **kwargs):
if isinstance(s, compat_str):
s = s.encode('ascii')
return base64.b64decode(s, *args, **kwargs)
else:
compat_b64decode = base64.b64decode
if platform.python_implementation() == 'PyPy' and sys.pypy_version_info < (5, 4, 0): if platform.python_implementation() == 'PyPy' and sys.pypy_version_info < (5, 4, 0):
# PyPy2 prior to version 5.4.0 expects byte strings as Windows function # PyPy2 prior to version 5.4.0 expects byte strings as Windows function
# names, see the original PyPy issue [1] and the youtube-dl one [2]. # names, see the original PyPy issue [1] and the youtube-dl one [2].
# 1. https://bitbucket.org/pypy/pypy/issues/2360/windows-ctypescdll-typeerror-function-name # 1. https://bitbucket.org/pypy/pypy/issues/2360/windows-ctypescdll-typeerror-function-name
# 2. https://github.com/ytdl-org/youtube-dl/pull/4392 # 2. https://github.com/rg3/youtube-dl/pull/4392
def compat_ctypes_WINFUNCTYPE(*args, **kwargs): def compat_ctypes_WINFUNCTYPE(*args, **kwargs):
real = ctypes.WINFUNCTYPE(*args, **kwargs) real = ctypes.WINFUNCTYPE(*args, **kwargs)
@ -2993,15 +2930,11 @@ __all__ = [
'compat_HTMLParseError', 'compat_HTMLParseError',
'compat_HTMLParser', 'compat_HTMLParser',
'compat_HTTPError', 'compat_HTTPError',
'compat_Struct',
'compat_b64decode',
'compat_basestring', 'compat_basestring',
'compat_chr', 'compat_chr',
'compat_cookiejar', 'compat_cookiejar',
'compat_cookiejar_Cookie',
'compat_cookies', 'compat_cookies',
'compat_ctypes_WINFUNCTYPE', 'compat_ctypes_WINFUNCTYPE',
'compat_etree_Element',
'compat_etree_fromstring', 'compat_etree_fromstring',
'compat_etree_register_namespace', 'compat_etree_register_namespace',
'compat_expanduser', 'compat_expanduser',
@ -3013,7 +2946,6 @@ __all__ = [
'compat_http_client', 'compat_http_client',
'compat_http_server', 'compat_http_server',
'compat_input', 'compat_input',
'compat_integer_types',
'compat_itertools_count', 'compat_itertools_count',
'compat_kwargs', 'compat_kwargs',
'compat_numeric_types', 'compat_numeric_types',
@ -3021,7 +2953,6 @@ __all__ = [
'compat_os_name', 'compat_os_name',
'compat_parse_qs', 'compat_parse_qs',
'compat_print', 'compat_print',
'compat_realpath',
'compat_setenv', 'compat_setenv',
'compat_shlex_quote', 'compat_shlex_quote',
'compat_shlex_split', 'compat_shlex_split',

View file

@ -45,12 +45,10 @@ class FileDownloader(object):
min_filesize: Skip files smaller than this size min_filesize: Skip files smaller than this size
max_filesize: Skip files larger than this size max_filesize: Skip files larger than this size
xattr_set_filesize: Set ytdl.filesize user xattribute with expected size. xattr_set_filesize: Set ytdl.filesize user xattribute with expected size.
(experimental)
external_downloader_args: A list of additional command-line arguments for the external_downloader_args: A list of additional command-line arguments for the
external downloader. external downloader.
hls_use_mpegts: Use the mpegts container for HLS videos. hls_use_mpegts: Use the mpegts container for HLS videos.
http_chunk_size: Size of a chunk for chunk-based HTTP downloading. May be
useful for bypassing bandwidth throttling imposed by
a webserver (experimental)
Subclasses of this one must re-define the real_download method. Subclasses of this one must re-define the real_download method.
""" """
@ -176,9 +174,7 @@ class FileDownloader(object):
return return
speed = float(byte_counter) / elapsed speed = float(byte_counter) / elapsed
if speed > rate_limit: if speed > rate_limit:
sleep_time = float(byte_counter) / rate_limit - elapsed time.sleep(max((byte_counter // rate_limit) - elapsed, 0))
if sleep_time > 0:
time.sleep(sleep_time)
def temp_name(self, filename): def temp_name(self, filename):
"""Returns a temporary filename for the given filename.""" """Returns a temporary filename for the given filename."""
@ -250,13 +246,12 @@ class FileDownloader(object):
if self.params.get('noprogress', False): if self.params.get('noprogress', False):
self.to_screen('[download] Download completed') self.to_screen('[download] Download completed')
else: else:
msg_template = '100%%' s['_total_bytes_str'] = format_bytes(s['total_bytes'])
if s.get('total_bytes') is not None:
s['_total_bytes_str'] = format_bytes(s['total_bytes'])
msg_template += ' of %(_total_bytes_str)s'
if s.get('elapsed') is not None: if s.get('elapsed') is not None:
s['_elapsed_str'] = self.format_seconds(s['elapsed']) s['_elapsed_str'] = self.format_seconds(s['elapsed'])
msg_template += ' in %(_elapsed_str)s' msg_template = '100%% of %(_total_bytes_str)s in %(_elapsed_str)s'
else:
msg_template = '100%% of %(_total_bytes_str)s'
self._report_progress_status( self._report_progress_status(
msg_template % s, is_last_line=True) msg_template % s, is_last_line=True)
@ -332,15 +327,15 @@ class FileDownloader(object):
""" """
nooverwrites_and_exists = ( nooverwrites_and_exists = (
self.params.get('nooverwrites', False) self.params.get('nooverwrites', False) and
and os.path.exists(encodeFilename(filename)) os.path.exists(encodeFilename(filename))
) )
if not hasattr(filename, 'write'): if not hasattr(filename, 'write'):
continuedl_and_exists = ( continuedl_and_exists = (
self.params.get('continuedl', True) self.params.get('continuedl', True) and
and os.path.isfile(encodeFilename(filename)) os.path.isfile(encodeFilename(filename)) and
and not self.params.get('nopart', False) not self.params.get('nopart', False)
) )
# Check file already present # Check file already present

View file

@ -2,10 +2,7 @@ from __future__ import unicode_literals
from .fragment import FragmentFD from .fragment import FragmentFD
from ..compat import compat_urllib_error from ..compat import compat_urllib_error
from ..utils import ( from ..utils import urljoin
DownloadError,
urljoin,
)
class DashSegmentsFD(FragmentFD): class DashSegmentsFD(FragmentFD):
@ -53,21 +50,13 @@ class DashSegmentsFD(FragmentFD):
except compat_urllib_error.HTTPError as err: except compat_urllib_error.HTTPError as err:
# YouTube may often return 404 HTTP error for a fragment causing the # YouTube may often return 404 HTTP error for a fragment causing the
# whole download to fail. However if the same fragment is immediately # whole download to fail. However if the same fragment is immediately
# retried with the same request data this usually succeeds (1-2 attempts # retried with the same request data this usually succeeds (1-2 attemps
# is usually enough) thus allowing to download the whole file successfully. # is usually enough) thus allowing to download the whole file successfully.
# To be future-proof we will retry all fragments that fail with any # To be future-proof we will retry all fragments that fail with any
# HTTP error. # HTTP error.
count += 1 count += 1
if count <= fragment_retries: if count <= fragment_retries:
self.report_retry_fragment(err, frag_index, count, fragment_retries) self.report_retry_fragment(err, frag_index, count, fragment_retries)
except DownloadError:
# Don't retry fragment if error occurred during HTTP downloading
# itself since it has own retry settings
if not fatal:
self.report_skip_fragment(frag_index)
break
raise
if count > fragment_retries: if count > fragment_retries:
if not fatal: if not fatal:
self.report_skip_fragment(frag_index) self.report_skip_fragment(frag_index)

View file

@ -1,10 +1,9 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import os.path import os.path
import re
import subprocess import subprocess
import sys import sys
import time import re
from .common import FileDownloader from .common import FileDownloader
from ..compat import ( from ..compat import (
@ -31,7 +30,6 @@ class ExternalFD(FileDownloader):
tmpfilename = self.temp_name(filename) tmpfilename = self.temp_name(filename)
try: try:
started = time.time()
retval = self._call_downloader(tmpfilename, info_dict) retval = self._call_downloader(tmpfilename, info_dict)
except KeyboardInterrupt: except KeyboardInterrupt:
if not info_dict.get('is_live'): if not info_dict.get('is_live'):
@ -43,20 +41,15 @@ class ExternalFD(FileDownloader):
self.to_screen('[%s] Interrupted by user' % self.get_basename()) self.to_screen('[%s] Interrupted by user' % self.get_basename())
if retval == 0: if retval == 0:
status = { fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('\r[%s] Downloaded %s bytes' % (self.get_basename(), fsize))
self.try_rename(tmpfilename, filename)
self._hook_progress({
'downloaded_bytes': fsize,
'total_bytes': fsize,
'filename': filename, 'filename': filename,
'status': 'finished', 'status': 'finished',
'elapsed': time.time() - started, })
}
if filename != '-':
fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('\r[%s] Downloaded %s bytes' % (self.get_basename(), fsize))
self.try_rename(tmpfilename, filename)
status.update({
'downloaded_bytes': fsize,
'total_bytes': fsize,
})
self._hook_progress(status)
return True return True
else: else:
self.to_stderr('\n') self.to_stderr('\n')
@ -121,11 +114,7 @@ class CurlFD(ExternalFD):
cmd += self._valueless_option('--silent', 'noprogress') cmd += self._valueless_option('--silent', 'noprogress')
cmd += self._valueless_option('--verbose', 'verbose') cmd += self._valueless_option('--verbose', 'verbose')
cmd += self._option('--limit-rate', 'ratelimit') cmd += self._option('--limit-rate', 'ratelimit')
retry = self._option('--retry', 'retries') cmd += self._option('--retry', 'retries')
if len(retry) == 2:
if retry[1] in ('inf', 'infinite'):
retry[1] = '2147483647'
cmd += retry
cmd += self._option('--max-filesize', 'max_filesize') cmd += self._option('--max-filesize', 'max_filesize')
cmd += self._option('--interface', 'source_address') cmd += self._option('--interface', 'source_address')
cmd += self._option('--proxy', 'proxy') cmd += self._option('--proxy', 'proxy')
@ -164,12 +153,6 @@ class WgetFD(ExternalFD):
cmd = [self.exe, '-O', tmpfilename, '-nv', '--no-cookies'] cmd = [self.exe, '-O', tmpfilename, '-nv', '--no-cookies']
for key, val in info_dict['http_headers'].items(): for key, val in info_dict['http_headers'].items():
cmd += ['--header', '%s: %s' % (key, val)] cmd += ['--header', '%s: %s' % (key, val)]
cmd += self._option('--limit-rate', 'ratelimit')
retry = self._option('--tries', 'retries')
if len(retry) == 2:
if retry[1] in ('inf', 'infinite'):
retry[1] = '0'
cmd += retry
cmd += self._option('--bind-address', 'source_address') cmd += self._option('--bind-address', 'source_address')
cmd += self._option('--proxy', 'proxy') cmd += self._option('--proxy', 'proxy')
cmd += self._valueless_option('--no-check-certificate', 'nocheckcertificate') cmd += self._valueless_option('--no-check-certificate', 'nocheckcertificate')
@ -194,7 +177,6 @@ class Aria2cFD(ExternalFD):
cmd += self._option('--interface', 'source_address') cmd += self._option('--interface', 'source_address')
cmd += self._option('--all-proxy', 'proxy') cmd += self._option('--all-proxy', 'proxy')
cmd += self._bool_option('--check-certificate', 'nocheckcertificate', 'false', 'true', '=') cmd += self._bool_option('--check-certificate', 'nocheckcertificate', 'false', 'true', '=')
cmd += self._bool_option('--remote-time', 'updatetime', 'true', 'false', '=')
cmd += ['--', info_dict['url']] cmd += ['--', info_dict['url']]
return cmd return cmd
@ -240,7 +222,7 @@ class FFmpegFD(ExternalFD):
# setting -seekable prevents ffmpeg from guessing if the server # setting -seekable prevents ffmpeg from guessing if the server
# supports seeking(by adding the header `Range: bytes=0-`), which # supports seeking(by adding the header `Range: bytes=0-`), which
# can cause problems in some cases # can cause problems in some cases
# https://github.com/ytdl-org/youtube-dl/issues/11800#issuecomment-275037127 # https://github.com/rg3/youtube-dl/issues/11800#issuecomment-275037127
# http://trac.ffmpeg.org/ticket/6125#comment:10 # http://trac.ffmpeg.org/ticket/6125#comment:10
args += ['-seekable', '1' if seekable else '0'] args += ['-seekable', '1' if seekable else '0']
@ -290,7 +272,6 @@ class FFmpegFD(ExternalFD):
tc_url = info_dict.get('tc_url') tc_url = info_dict.get('tc_url')
flash_version = info_dict.get('flash_version') flash_version = info_dict.get('flash_version')
live = info_dict.get('rtmp_live', False) live = info_dict.get('rtmp_live', False)
conn = info_dict.get('rtmp_conn')
if player_url is not None: if player_url is not None:
args += ['-rtmp_swfverify', player_url] args += ['-rtmp_swfverify', player_url]
if page_url is not None: if page_url is not None:
@ -305,11 +286,6 @@ class FFmpegFD(ExternalFD):
args += ['-rtmp_flashver', flash_version] args += ['-rtmp_flashver', flash_version]
if live: if live:
args += ['-rtmp_live', 'live'] args += ['-rtmp_live', 'live']
if isinstance(conn, list):
for entry in conn:
args += ['-rtmp_conn', entry]
elif isinstance(conn, compat_str):
args += ['-rtmp_conn', conn]
args += ['-i', url, '-c', 'copy'] args += ['-i', url, '-c', 'copy']
@ -341,7 +317,7 @@ class FFmpegFD(ExternalFD):
# mp4 file couldn't be played, but if we ask ffmpeg to quit it # mp4 file couldn't be played, but if we ask ffmpeg to quit it
# produces a file that is playable (this is mostly useful for live # produces a file that is playable (this is mostly useful for live
# streams). Note that Windows is not affected and produces playable # streams). Note that Windows is not affected and produces playable
# files (see https://github.com/ytdl-org/youtube-dl/issues/8300). # files (see https://github.com/rg3/youtube-dl/issues/8300).
if sys.platform != 'win32': if sys.platform != 'win32':
proc.communicate(b'q') proc.communicate(b'q')
raise raise

View file

@ -1,12 +1,12 @@
from __future__ import division, unicode_literals from __future__ import division, unicode_literals
import base64
import io import io
import itertools import itertools
import time import time
from .fragment import FragmentFD from .fragment import FragmentFD
from ..compat import ( from ..compat import (
compat_b64decode,
compat_etree_fromstring, compat_etree_fromstring,
compat_urlparse, compat_urlparse,
compat_urllib_error, compat_urllib_error,
@ -238,8 +238,8 @@ def write_metadata_tag(stream, metadata):
def remove_encrypted_media(media): def remove_encrypted_media(media):
return list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib return list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib and
and 'drmAdditionalHeaderSetId' not in e.attrib, 'drmAdditionalHeaderSetId' not in e.attrib,
media)) media))
@ -267,8 +267,8 @@ class F4mFD(FragmentFD):
media = doc.findall(_add_ns('media')) media = doc.findall(_add_ns('media'))
if not media: if not media:
self.report_error('No media found') self.report_error('No media found')
for e in (doc.findall(_add_ns('drmAdditionalHeader')) for e in (doc.findall(_add_ns('drmAdditionalHeader')) +
+ doc.findall(_add_ns('drmAdditionalHeaderSet'))): doc.findall(_add_ns('drmAdditionalHeaderSet'))):
# If id attribute is missing it's valid for all media nodes # If id attribute is missing it's valid for all media nodes
# without drmAdditionalHeaderId or drmAdditionalHeaderSetId attribute # without drmAdditionalHeaderId or drmAdditionalHeaderSetId attribute
if 'id' not in e.attrib: if 'id' not in e.attrib:
@ -312,7 +312,7 @@ class F4mFD(FragmentFD):
boot_info = self._get_bootstrap_from_url(bootstrap_url) boot_info = self._get_bootstrap_from_url(bootstrap_url)
else: else:
bootstrap_url = None bootstrap_url = None
bootstrap = compat_b64decode(node.text) bootstrap = base64.b64decode(node.text.encode('ascii'))
boot_info = read_bootstrap_info(bootstrap) boot_info = read_bootstrap_info(bootstrap)
return boot_info, bootstrap_url return boot_info, bootstrap_url
@ -324,8 +324,8 @@ class F4mFD(FragmentFD):
urlh = self.ydl.urlopen(self._prepare_url(info_dict, man_url)) urlh = self.ydl.urlopen(self._prepare_url(info_dict, man_url))
man_url = urlh.geturl() man_url = urlh.geturl()
# Some manifests may be malformed, e.g. prosiebensat1 generated manifests # Some manifests may be malformed, e.g. prosiebensat1 generated manifests
# (see https://github.com/ytdl-org/youtube-dl/issues/6215#issuecomment-121704244 # (see https://github.com/rg3/youtube-dl/issues/6215#issuecomment-121704244
# and https://github.com/ytdl-org/youtube-dl/issues/7823) # and https://github.com/rg3/youtube-dl/issues/7823)
manifest = fix_xml_ampersands(urlh.read().decode('utf-8', 'ignore')).strip() manifest = fix_xml_ampersands(urlh.read().decode('utf-8', 'ignore')).strip()
doc = compat_etree_fromstring(manifest) doc = compat_etree_fromstring(manifest)
@ -349,7 +349,7 @@ class F4mFD(FragmentFD):
live = boot_info['live'] live = boot_info['live']
metadata_node = media.find(_add_ns('metadata')) metadata_node = media.find(_add_ns('metadata'))
if metadata_node is not None: if metadata_node is not None:
metadata = compat_b64decode(metadata_node.text) metadata = base64.b64decode(metadata_node.text.encode('ascii'))
else: else:
metadata = None metadata = None
@ -409,7 +409,7 @@ class F4mFD(FragmentFD):
# In tests, segments may be truncated, and thus # In tests, segments may be truncated, and thus
# FlvReader may not be able to parse the whole # FlvReader may not be able to parse the whole
# chunk. If so, write the segment as is # chunk. If so, write the segment as is
# See https://github.com/ytdl-org/youtube-dl/issues/9214 # See https://github.com/rg3/youtube-dl/issues/9214
dest_stream.write(down_data) dest_stream.write(down_data)
break break
raise raise

View file

@ -74,14 +74,9 @@ class FragmentFD(FileDownloader):
return not ctx['live'] and not ctx['tmpfilename'] == '-' return not ctx['live'] and not ctx['tmpfilename'] == '-'
def _read_ytdl_file(self, ctx): def _read_ytdl_file(self, ctx):
assert 'ytdl_corrupt' not in ctx
stream, _ = sanitize_open(self.ytdl_filename(ctx['filename']), 'r') stream, _ = sanitize_open(self.ytdl_filename(ctx['filename']), 'r')
try: ctx['fragment_index'] = json.loads(stream.read())['downloader']['current_fragment']['index']
ctx['fragment_index'] = json.loads(stream.read())['downloader']['current_fragment']['index'] stream.close()
except Exception:
ctx['ytdl_corrupt'] = True
finally:
stream.close()
def _write_ytdl_file(self, ctx): def _write_ytdl_file(self, ctx):
frag_index_stream, _ = sanitize_open(self.ytdl_filename(ctx['filename']), 'w') frag_index_stream, _ = sanitize_open(self.ytdl_filename(ctx['filename']), 'w')
@ -163,17 +158,11 @@ class FragmentFD(FileDownloader):
if self.__do_ytdl_file(ctx): if self.__do_ytdl_file(ctx):
if os.path.isfile(encodeFilename(self.ytdl_filename(ctx['filename']))): if os.path.isfile(encodeFilename(self.ytdl_filename(ctx['filename']))):
self._read_ytdl_file(ctx) self._read_ytdl_file(ctx)
is_corrupt = ctx.get('ytdl_corrupt') is True if ctx['fragment_index'] > 0 and resume_len == 0:
is_inconsistent = ctx['fragment_index'] > 0 and resume_len == 0
if is_corrupt or is_inconsistent:
message = (
'.ytdl file is corrupt' if is_corrupt else
'Inconsistent state of incomplete fragment download')
self.report_warning( self.report_warning(
'%s. Restarting from the beginning...' % message) 'Inconsistent state of incomplete fragment download. '
'Restarting from the beginning...')
ctx['fragment_index'] = resume_len = 0 ctx['fragment_index'] = resume_len = 0
if 'ytdl_corrupt' in ctx:
del ctx['ytdl_corrupt']
self._write_ytdl_file(ctx) self._write_ytdl_file(ctx)
else: else:
self._write_ytdl_file(ctx) self._write_ytdl_file(ctx)
@ -190,13 +179,12 @@ class FragmentFD(FileDownloader):
}) })
def _start_frag_download(self, ctx): def _start_frag_download(self, ctx):
resume_len = ctx['complete_frags_downloaded_bytes']
total_frags = ctx['total_frags'] total_frags = ctx['total_frags']
# This dict stores the download progress, it's updated by the progress # This dict stores the download progress, it's updated by the progress
# hook # hook
state = { state = {
'status': 'downloading', 'status': 'downloading',
'downloaded_bytes': resume_len, 'downloaded_bytes': ctx['complete_frags_downloaded_bytes'],
'fragment_index': ctx['fragment_index'], 'fragment_index': ctx['fragment_index'],
'fragment_count': total_frags, 'fragment_count': total_frags,
'filename': ctx['filename'], 'filename': ctx['filename'],
@ -220,8 +208,8 @@ class FragmentFD(FileDownloader):
frag_total_bytes = s.get('total_bytes') or 0 frag_total_bytes = s.get('total_bytes') or 0
if not ctx['live']: if not ctx['live']:
estimated_size = ( estimated_size = (
(ctx['complete_frags_downloaded_bytes'] + frag_total_bytes) (ctx['complete_frags_downloaded_bytes'] + frag_total_bytes) /
/ (state['fragment_index'] + 1) * total_frags) (state['fragment_index'] + 1) * total_frags)
state['total_bytes_estimate'] = estimated_size state['total_bytes_estimate'] = estimated_size
if s['status'] == 'finished': if s['status'] == 'finished':
@ -235,8 +223,8 @@ class FragmentFD(FileDownloader):
state['downloaded_bytes'] += frag_downloaded_bytes - ctx['prev_frag_downloaded_bytes'] state['downloaded_bytes'] += frag_downloaded_bytes - ctx['prev_frag_downloaded_bytes']
if not ctx['live']: if not ctx['live']:
state['eta'] = self.calc_eta( state['eta'] = self.calc_eta(
start, time_now, estimated_size - resume_len, start, time_now, estimated_size,
state['downloaded_bytes'] - resume_len) state['downloaded_bytes'])
state['speed'] = s.get('speed') or ctx.get('speed') state['speed'] = s.get('speed') or ctx.get('speed')
ctx['speed'] = state['speed'] ctx['speed'] = state['speed']
ctx['prev_frag_downloaded_bytes'] = frag_downloaded_bytes ctx['prev_frag_downloaded_bytes'] = frag_downloaded_bytes
@ -253,16 +241,12 @@ class FragmentFD(FileDownloader):
if os.path.isfile(ytdl_filename): if os.path.isfile(ytdl_filename):
os.remove(ytdl_filename) os.remove(ytdl_filename)
elapsed = time.time() - ctx['started'] elapsed = time.time() - ctx['started']
self.try_rename(ctx['tmpfilename'], ctx['filename'])
if ctx['tmpfilename'] == '-': fsize = os.path.getsize(encodeFilename(ctx['filename']))
downloaded_bytes = ctx['complete_frags_downloaded_bytes']
else:
self.try_rename(ctx['tmpfilename'], ctx['filename'])
downloaded_bytes = os.path.getsize(encodeFilename(ctx['filename']))
self._hook_progress({ self._hook_progress({
'downloaded_bytes': downloaded_bytes, 'downloaded_bytes': fsize,
'total_bytes': downloaded_bytes, 'total_bytes': fsize,
'filename': ctx['filename'], 'filename': ctx['filename'],
'status': 'finished', 'status': 'finished',
'elapsed': elapsed, 'elapsed': elapsed,

View file

@ -64,7 +64,7 @@ class HlsFD(FragmentFD):
s = urlh.read().decode('utf-8', 'ignore') s = urlh.read().decode('utf-8', 'ignore')
if not self.can_download(s, info_dict): if not self.can_download(s, info_dict):
if info_dict.get('extra_param_to_segment_url') or info_dict.get('_decryption_key_url'): if info_dict.get('extra_param_to_segment_url'):
self.report_error('pycrypto not found. Please install it.') self.report_error('pycrypto not found. Please install it.')
return False return False
self.report_warning( self.report_warning(
@ -75,13 +75,8 @@ class HlsFD(FragmentFD):
fd.add_progress_hook(ph) fd.add_progress_hook(ph)
return fd.real_download(filename, info_dict) return fd.real_download(filename, info_dict)
def is_ad_fragment_start(s): def anvato_ad(s):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s return s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s
or s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad'))
def is_ad_fragment_end(s):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=master' in s
or s.startswith('#UPLYNK-SEGMENT') and s.endswith(',segment'))
media_frags = 0 media_frags = 0
ad_frags = 0 ad_frags = 0
@ -91,13 +86,12 @@ class HlsFD(FragmentFD):
if not line: if not line:
continue continue
if line.startswith('#'): if line.startswith('#'):
if is_ad_fragment_start(line): if anvato_ad(line):
ad_frags += 1
ad_frag_next = True ad_frag_next = True
elif is_ad_fragment_end(line):
ad_frag_next = False
continue continue
if ad_frag_next: if ad_frag_next:
ad_frags += 1 ad_frag_next = False
continue continue
media_frags += 1 media_frags += 1
@ -128,6 +122,7 @@ class HlsFD(FragmentFD):
if line: if line:
if not line.startswith('#'): if not line.startswith('#'):
if ad_frag_next: if ad_frag_next:
ad_frag_next = False
continue continue
frag_index += 1 frag_index += 1
if frag_index <= ctx['fragment_index']: if frag_index <= ctx['fragment_index']:
@ -141,7 +136,7 @@ class HlsFD(FragmentFD):
count = 0 count = 0
headers = info_dict.get('http_headers', {}) headers = info_dict.get('http_headers', {})
if byte_range: if byte_range:
headers['Range'] = 'bytes=%d-%d' % (byte_range['start'], byte_range['end'] - 1) headers['Range'] = 'bytes=%d-%d' % (byte_range['start'], byte_range['end'])
while count <= fragment_retries: while count <= fragment_retries:
try: try:
success, frag_content = self._download_fragment( success, frag_content = self._download_fragment(
@ -152,8 +147,8 @@ class HlsFD(FragmentFD):
except compat_urllib_error.HTTPError as err: except compat_urllib_error.HTTPError as err:
# Unavailable (possibly temporary) fragments may be served. # Unavailable (possibly temporary) fragments may be served.
# First we try to retry then either skip or abort. # First we try to retry then either skip or abort.
# See https://github.com/ytdl-org/youtube-dl/issues/10165, # See https://github.com/rg3/youtube-dl/issues/10165,
# https://github.com/ytdl-org/youtube-dl/issues/10448). # https://github.com/rg3/youtube-dl/issues/10448).
count += 1 count += 1
if count <= fragment_retries: if count <= fragment_retries:
self.report_retry_fragment(err, frag_index, count, fragment_retries) self.report_retry_fragment(err, frag_index, count, fragment_retries)
@ -169,7 +164,7 @@ class HlsFD(FragmentFD):
if decrypt_info['METHOD'] == 'AES-128': if decrypt_info['METHOD'] == 'AES-128':
iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence) iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence)
decrypt_info['KEY'] = decrypt_info.get('KEY') or self.ydl.urlopen( decrypt_info['KEY'] = decrypt_info.get('KEY') or self.ydl.urlopen(
self._prepare_url(info_dict, info_dict.get('_decryption_key_url') or decrypt_info['URI'])).read() self._prepare_url(info_dict, decrypt_info['URI'])).read()
frag_content = AES.new( frag_content = AES.new(
decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content) decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content)
self._append_fragment(ctx, frag_content) self._append_fragment(ctx, frag_content)
@ -200,10 +195,8 @@ class HlsFD(FragmentFD):
'start': sub_range_start, 'start': sub_range_start,
'end': sub_range_start + int(splitted_byte_range[0]), 'end': sub_range_start + int(splitted_byte_range[0]),
} }
elif is_ad_fragment_start(line): elif anvato_ad(line):
ad_frag_next = True ad_frag_next = True
elif is_ad_fragment_end(line):
ad_frag_next = False
self._finish_frag_download(ctx) self._finish_frag_download(ctx)

View file

@ -4,18 +4,13 @@ import errno
import os import os
import socket import socket
import time import time
import random
import re import re
from .common import FileDownloader from .common import FileDownloader
from ..compat import ( from ..compat import compat_urllib_error
compat_str,
compat_urllib_error,
)
from ..utils import ( from ..utils import (
ContentTooShortError, ContentTooShortError,
encodeFilename, encodeFilename,
int_or_none,
sanitize_open, sanitize_open,
sanitized_Request, sanitized_Request,
write_xattr, write_xattr,
@ -43,26 +38,21 @@ class HttpFD(FileDownloader):
add_headers = info_dict.get('http_headers') add_headers = info_dict.get('http_headers')
if add_headers: if add_headers:
headers.update(add_headers) headers.update(add_headers)
basic_request = sanitized_Request(url, None, headers)
request = sanitized_Request(url, None, headers)
is_test = self.params.get('test', False) is_test = self.params.get('test', False)
chunk_size = self._TEST_FILE_SIZE if is_test else (
info_dict.get('downloader_options', {}).get('http_chunk_size') if is_test:
or self.params.get('http_chunk_size') or 0) request.add_header('Range', 'bytes=0-%s' % str(self._TEST_FILE_SIZE - 1))
ctx.open_mode = 'wb' ctx.open_mode = 'wb'
ctx.resume_len = 0 ctx.resume_len = 0
ctx.data_len = None
ctx.block_size = self.params.get('buffersize', 1024)
ctx.start_time = time.time()
ctx.chunk_size = None
if self.params.get('continuedl', True): if self.params.get('continuedl', True):
# Establish possible resume length # Establish possible resume length
if os.path.isfile(encodeFilename(ctx.tmpfilename)): if os.path.isfile(encodeFilename(ctx.tmpfilename)):
ctx.resume_len = os.path.getsize( ctx.resume_len = os.path.getsize(encodeFilename(ctx.tmpfilename))
encodeFilename(ctx.tmpfilename))
ctx.is_resume = ctx.resume_len > 0
count = 0 count = 0
retries = self.params.get('retries', 0) retries = self.params.get('retries', 0)
@ -74,91 +64,50 @@ class HttpFD(FileDownloader):
def __init__(self, source_error): def __init__(self, source_error):
self.source_error = source_error self.source_error = source_error
class NextFragment(Exception):
pass
def set_range(req, start, end):
range_header = 'bytes=%d-' % start
if end:
range_header += compat_str(end)
req.add_header('Range', range_header)
def establish_connection(): def establish_connection():
ctx.chunk_size = (random.randint(int(chunk_size * 0.95), chunk_size) if ctx.resume_len != 0:
if not is_test and chunk_size else chunk_size) self.report_resuming_byte(ctx.resume_len)
if ctx.resume_len > 0: request.add_header('Range', 'bytes=%d-' % ctx.resume_len)
range_start = ctx.resume_len
if ctx.is_resume:
self.report_resuming_byte(ctx.resume_len)
ctx.open_mode = 'ab' ctx.open_mode = 'ab'
elif ctx.chunk_size > 0:
range_start = 0
else:
range_start = None
ctx.is_resume = False
range_end = range_start + ctx.chunk_size - 1 if ctx.chunk_size else None
if range_end and ctx.data_len is not None and range_end >= ctx.data_len:
range_end = ctx.data_len - 1
has_range = range_start is not None
ctx.has_range = has_range
request = sanitized_Request(url, None, headers)
if has_range:
set_range(request, range_start, range_end)
# Establish connection # Establish connection
try: try:
try: ctx.data = self.ydl.urlopen(request)
ctx.data = self.ydl.urlopen(request)
except (compat_urllib_error.URLError, ) as err:
if isinstance(err.reason, socket.timeout):
raise RetryDownload(err)
raise err
# When trying to resume, Content-Range HTTP header of response has to be checked # When trying to resume, Content-Range HTTP header of response has to be checked
# to match the value of requested Range HTTP header. This is due to a webservers # to match the value of requested Range HTTP header. This is due to a webservers
# that don't support resuming and serve a whole file with no Content-Range # that don't support resuming and serve a whole file with no Content-Range
# set in response despite of requested Range (see # set in response despite of requested Range (see
# https://github.com/ytdl-org/youtube-dl/issues/6057#issuecomment-126129799) # https://github.com/rg3/youtube-dl/issues/6057#issuecomment-126129799)
if has_range: if ctx.resume_len > 0:
content_range = ctx.data.headers.get('Content-Range') content_range = ctx.data.headers.get('Content-Range')
if content_range: if content_range:
content_range_m = re.search(r'bytes (\d+)-(\d+)?(?:/(\d+))?', content_range) content_range_m = re.search(r'bytes (\d+)-', content_range)
# Content-Range is present and matches requested Range, resume is possible # Content-Range is present and matches requested Range, resume is possible
if content_range_m: if content_range_m and ctx.resume_len == int(content_range_m.group(1)):
if range_start == int(content_range_m.group(1)): return
content_range_end = int_or_none(content_range_m.group(2))
content_len = int_or_none(content_range_m.group(3))
accept_content_len = (
# Non-chunked download
not ctx.chunk_size
# Chunked download and requested piece or
# its part is promised to be served
or content_range_end == range_end
or content_len < range_end)
if accept_content_len:
ctx.data_len = content_len
return
# Content-Range is either not present or invalid. Assuming remote webserver is # Content-Range is either not present or invalid. Assuming remote webserver is
# trying to send the whole file, resume is not possible, so wiping the local file # trying to send the whole file, resume is not possible, so wiping the local file
# and performing entire redownload # and performing entire redownload
self.report_unable_to_resume() self.report_unable_to_resume()
ctx.resume_len = 0 ctx.resume_len = 0
ctx.open_mode = 'wb' ctx.open_mode = 'wb'
ctx.data_len = int_or_none(ctx.data.info().get('Content-length', None))
return return
except (compat_urllib_error.HTTPError, ) as err: except (compat_urllib_error.HTTPError, ) as err:
if err.code == 416: if (err.code < 500 or err.code >= 600) and err.code != 416:
# Unexpected HTTP error
raise
elif err.code == 416:
# Unable to resume (requested range not satisfiable) # Unable to resume (requested range not satisfiable)
try: try:
# Open the connection again without the range header # Open the connection again without the range header
ctx.data = self.ydl.urlopen( ctx.data = self.ydl.urlopen(basic_request)
sanitized_Request(url, None, headers))
content_length = ctx.data.info()['Content-Length'] content_length = ctx.data.info()['Content-Length']
except (compat_urllib_error.HTTPError, ) as err: except (compat_urllib_error.HTTPError, ) as err:
if err.code < 500 or err.code >= 600: if err.code < 500 or err.code >= 600:
raise raise
else: else:
# Examine the reported length # Examine the reported length
if (content_length is not None if (content_length is not None and
and (ctx.resume_len - 100 < int(content_length) < ctx.resume_len + 100)): (ctx.resume_len - 100 < int(content_length) < ctx.resume_len + 100)):
# The file had already been fully downloaded. # The file had already been fully downloaded.
# Explanation to the above condition: in issue #175 it was revealed that # Explanation to the above condition: in issue #175 it was revealed that
# YouTube sometimes adds or removes a few bytes from the end of the file, # YouTube sometimes adds or removes a few bytes from the end of the file,
@ -181,9 +130,6 @@ class HttpFD(FileDownloader):
ctx.resume_len = 0 ctx.resume_len = 0
ctx.open_mode = 'wb' ctx.open_mode = 'wb'
return return
elif err.code < 500 or err.code >= 600:
# Unexpected HTTP error
raise
raise RetryDownload(err) raise RetryDownload(err)
except socket.error as err: except socket.error as err:
if err.errno != errno.ECONNRESET: if err.errno != errno.ECONNRESET:
@ -214,7 +160,7 @@ class HttpFD(FileDownloader):
return False return False
byte_counter = 0 + ctx.resume_len byte_counter = 0 + ctx.resume_len
block_size = ctx.block_size block_size = self.params.get('buffersize', 1024)
start = time.time() start = time.time()
# measure time over whole while-loop, so slow_down() and best_block_size() work together properly # measure time over whole while-loop, so slow_down() and best_block_size() work together properly
@ -222,28 +168,24 @@ class HttpFD(FileDownloader):
before = start # start measuring before = start # start measuring
def retry(e): def retry(e):
to_stdout = ctx.tmpfilename == '-' if ctx.tmpfilename != '-':
if ctx.stream is not None: ctx.stream.close()
if not to_stdout: ctx.stream = None
ctx.stream.close() ctx.resume_len = os.path.getsize(encodeFilename(ctx.tmpfilename))
ctx.stream = None
ctx.resume_len = byte_counter if to_stdout else os.path.getsize(encodeFilename(ctx.tmpfilename))
raise RetryDownload(e) raise RetryDownload(e)
while True: while True:
try: try:
# Download and write # Download and write
data_block = ctx.data.read(block_size if data_len is None else min(block_size, data_len - byte_counter)) data_block = ctx.data.read(block_size if not is_test else min(block_size, data_len - byte_counter))
# socket.timeout is a subclass of socket.error but may not have # socket.timeout is a subclass of socket.error but may not have
# errno set # errno set
except socket.timeout as e: except socket.timeout as e:
retry(e) retry(e)
except socket.error as e: except socket.error as e:
# SSLError on python 2 (inherits socket.error) may have if e.errno not in (errno.ECONNRESET, errno.ETIMEDOUT):
# no errno set but this error message raise
if e.errno in (errno.ECONNRESET, errno.ETIMEDOUT) or getattr(e, 'message', None) == 'The read operation timed out': retry(e)
retry(e)
raise
byte_counter += len(data_block) byte_counter += len(data_block)
@ -291,30 +233,25 @@ class HttpFD(FileDownloader):
# Progress message # Progress message
speed = self.calc_speed(start, now, byte_counter - ctx.resume_len) speed = self.calc_speed(start, now, byte_counter - ctx.resume_len)
if ctx.data_len is None: if data_len is None:
eta = None eta = None
else: else:
eta = self.calc_eta(start, time.time(), ctx.data_len - ctx.resume_len, byte_counter - ctx.resume_len) eta = self.calc_eta(start, time.time(), data_len - ctx.resume_len, byte_counter - ctx.resume_len)
self._hook_progress({ self._hook_progress({
'status': 'downloading', 'status': 'downloading',
'downloaded_bytes': byte_counter, 'downloaded_bytes': byte_counter,
'total_bytes': ctx.data_len, 'total_bytes': data_len,
'tmpfilename': ctx.tmpfilename, 'tmpfilename': ctx.tmpfilename,
'filename': ctx.filename, 'filename': ctx.filename,
'eta': eta, 'eta': eta,
'speed': speed, 'speed': speed,
'elapsed': now - ctx.start_time, 'elapsed': now - start,
}) })
if data_len is not None and byte_counter == data_len: if is_test and byte_counter == data_len:
break break
if not is_test and ctx.chunk_size and ctx.data_len is not None and byte_counter < ctx.data_len:
ctx.resume_len = byte_counter
# ctx.block_size = block_size
raise NextFragment()
if ctx.stream is None: if ctx.stream is None:
self.to_stderr('\n') self.to_stderr('\n')
self.report_error('Did not get any data blocks') self.report_error('Did not get any data blocks')
@ -339,7 +276,7 @@ class HttpFD(FileDownloader):
'total_bytes': byte_counter, 'total_bytes': byte_counter,
'filename': ctx.filename, 'filename': ctx.filename,
'status': 'finished', 'status': 'finished',
'elapsed': time.time() - ctx.start_time, 'elapsed': time.time() - start,
}) })
return True return True
@ -353,8 +290,6 @@ class HttpFD(FileDownloader):
if count <= retries: if count <= retries:
self.report_retry(e.source_error, count, retries) self.report_retry(e.source_error, count, retries)
continue continue
except NextFragment:
continue
except SucceedDownload: except SucceedDownload:
return True return True

View file

@ -1,27 +1,25 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import time import time
import struct
import binascii import binascii
import io import io
from .fragment import FragmentFD from .fragment import FragmentFD
from ..compat import ( from ..compat import compat_urllib_error
compat_Struct,
compat_urllib_error,
)
u8 = compat_Struct('>B') u8 = struct.Struct(b'>B')
u88 = compat_Struct('>Bx') u88 = struct.Struct(b'>Bx')
u16 = compat_Struct('>H') u16 = struct.Struct(b'>H')
u1616 = compat_Struct('>Hxx') u1616 = struct.Struct(b'>Hxx')
u32 = compat_Struct('>I') u32 = struct.Struct(b'>I')
u64 = compat_Struct('>Q') u64 = struct.Struct(b'>Q')
s88 = compat_Struct('>bx') s88 = struct.Struct(b'>bx')
s16 = compat_Struct('>h') s16 = struct.Struct(b'>h')
s1616 = compat_Struct('>hxx') s1616 = struct.Struct(b'>hxx')
s32 = compat_Struct('>i') s32 = struct.Struct(b'>i')
unity_matrix = (s32.pack(0x10000) + s32.pack(0) * 3) * 2 + s32.pack(0x40000000) unity_matrix = (s32.pack(0x10000) + s32.pack(0) * 3) * 2 + s32.pack(0x40000000)
@ -141,12 +139,12 @@ def write_piff_header(stream, params):
sample_entry_payload += u16.pack(0x18) # depth sample_entry_payload += u16.pack(0x18) # depth
sample_entry_payload += s16.pack(-1) # pre defined sample_entry_payload += s16.pack(-1) # pre defined
codec_private_data = binascii.unhexlify(params['codec_private_data'].encode('utf-8')) codec_private_data = binascii.unhexlify(params['codec_private_data'])
if fourcc in ('H264', 'AVC1'): if fourcc in ('H264', 'AVC1'):
sps, pps = codec_private_data.split(u32.pack(1))[1:] sps, pps = codec_private_data.split(u32.pack(1))[1:]
avcc_payload = u8.pack(1) # configuration version avcc_payload = u8.pack(1) # configuration version
avcc_payload += sps[1:4] # avc profile indication + profile compatibility + avc level indication avcc_payload += sps[1:4] # avc profile indication + profile compatibility + avc level indication
avcc_payload += u8.pack(0xfc | (params.get('nal_unit_length_field', 4) - 1)) # complete representation (1) + reserved (11111) + length size minus one avcc_payload += u8.pack(0xfc | (params.get('nal_unit_length_field', 4) - 1)) # complete represenation (1) + reserved (11111) + length size minus one
avcc_payload += u8.pack(1) # reserved (0) + number of sps (0000001) avcc_payload += u8.pack(1) # reserved (0) + number of sps (0000001)
avcc_payload += u16.pack(len(sps)) avcc_payload += u16.pack(len(sps))
avcc_payload += sps avcc_payload += sps

View file

@ -29,68 +29,66 @@ class RtmpFD(FileDownloader):
proc = subprocess.Popen(args, stderr=subprocess.PIPE) proc = subprocess.Popen(args, stderr=subprocess.PIPE)
cursor_in_new_line = True cursor_in_new_line = True
proc_stderr_closed = False proc_stderr_closed = False
try: while not proc_stderr_closed:
while not proc_stderr_closed: # read line from stderr
# read line from stderr line = ''
line = '' while True:
while True: char = proc.stderr.read(1)
char = proc.stderr.read(1) if not char:
if not char: proc_stderr_closed = True
proc_stderr_closed = True break
break if char in [b'\r', b'\n']:
if char in [b'\r', b'\n']: break
break line += char.decode('ascii', 'replace')
line += char.decode('ascii', 'replace') if not line:
if not line: # proc_stderr_closed is True
# proc_stderr_closed is True continue
continue mobj = re.search(r'([0-9]+\.[0-9]{3}) kB / [0-9]+\.[0-9]{2} sec \(([0-9]{1,2}\.[0-9])%\)', line)
mobj = re.search(r'([0-9]+\.[0-9]{3}) kB / [0-9]+\.[0-9]{2} sec \(([0-9]{1,2}\.[0-9])%\)', line) if mobj:
downloaded_data_len = int(float(mobj.group(1)) * 1024)
percent = float(mobj.group(2))
if not resume_percent:
resume_percent = percent
resume_downloaded_data_len = downloaded_data_len
time_now = time.time()
eta = self.calc_eta(start, time_now, 100 - resume_percent, percent - resume_percent)
speed = self.calc_speed(start, time_now, downloaded_data_len - resume_downloaded_data_len)
data_len = None
if percent > 0:
data_len = int(downloaded_data_len * 100 / percent)
self._hook_progress({
'status': 'downloading',
'downloaded_bytes': downloaded_data_len,
'total_bytes_estimate': data_len,
'tmpfilename': tmpfilename,
'filename': filename,
'eta': eta,
'elapsed': time_now - start,
'speed': speed,
})
cursor_in_new_line = False
else:
# no percent for live streams
mobj = re.search(r'([0-9]+\.[0-9]{3}) kB / [0-9]+\.[0-9]{2} sec', line)
if mobj: if mobj:
downloaded_data_len = int(float(mobj.group(1)) * 1024) downloaded_data_len = int(float(mobj.group(1)) * 1024)
percent = float(mobj.group(2))
if not resume_percent:
resume_percent = percent
resume_downloaded_data_len = downloaded_data_len
time_now = time.time() time_now = time.time()
eta = self.calc_eta(start, time_now, 100 - resume_percent, percent - resume_percent) speed = self.calc_speed(start, time_now, downloaded_data_len)
speed = self.calc_speed(start, time_now, downloaded_data_len - resume_downloaded_data_len)
data_len = None
if percent > 0:
data_len = int(downloaded_data_len * 100 / percent)
self._hook_progress({ self._hook_progress({
'status': 'downloading',
'downloaded_bytes': downloaded_data_len, 'downloaded_bytes': downloaded_data_len,
'total_bytes_estimate': data_len,
'tmpfilename': tmpfilename, 'tmpfilename': tmpfilename,
'filename': filename, 'filename': filename,
'eta': eta, 'status': 'downloading',
'elapsed': time_now - start, 'elapsed': time_now - start,
'speed': speed, 'speed': speed,
}) })
cursor_in_new_line = False cursor_in_new_line = False
else: elif self.params.get('verbose', False):
# no percent for live streams if not cursor_in_new_line:
mobj = re.search(r'([0-9]+\.[0-9]{3}) kB / [0-9]+\.[0-9]{2} sec', line) self.to_screen('')
if mobj: cursor_in_new_line = True
downloaded_data_len = int(float(mobj.group(1)) * 1024) self.to_screen('[rtmpdump] ' + line)
time_now = time.time() proc.wait()
speed = self.calc_speed(start, time_now, downloaded_data_len)
self._hook_progress({
'downloaded_bytes': downloaded_data_len,
'tmpfilename': tmpfilename,
'filename': filename,
'status': 'downloading',
'elapsed': time_now - start,
'speed': speed,
})
cursor_in_new_line = False
elif self.params.get('verbose', False):
if not cursor_in_new_line:
self.to_screen('')
cursor_in_new_line = True
self.to_screen('[rtmpdump] ' + line)
finally:
proc.wait()
if not cursor_in_new_line: if not cursor_in_new_line:
self.to_screen('') self.to_screen('')
return proc.returncode return proc.returncode
@ -165,15 +163,7 @@ class RtmpFD(FileDownloader):
RD_INCOMPLETE = 2 RD_INCOMPLETE = 2
RD_NO_CONNECT = 3 RD_NO_CONNECT = 3
started = time.time() retval = run_rtmpdump(args)
try:
retval = run_rtmpdump(args)
except KeyboardInterrupt:
if not info_dict.get('is_live'):
raise
retval = RD_SUCCESS
self.to_screen('\n[rtmpdump] Interrupted by user')
if retval == RD_NO_CONNECT: if retval == RD_NO_CONNECT:
self.report_error('[rtmpdump] Could not connect to RTMP server.') self.report_error('[rtmpdump] Could not connect to RTMP server.')
@ -181,7 +171,7 @@ class RtmpFD(FileDownloader):
while retval in (RD_INCOMPLETE, RD_FAILED) and not test and not live: while retval in (RD_INCOMPLETE, RD_FAILED) and not test and not live:
prevsize = os.path.getsize(encodeFilename(tmpfilename)) prevsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('[rtmpdump] Downloaded %s bytes' % prevsize) self.to_screen('[rtmpdump] %s bytes' % prevsize)
time.sleep(5.0) # This seems to be needed time.sleep(5.0) # This seems to be needed
args = basic_args + ['--resume'] args = basic_args + ['--resume']
if retval == RD_FAILED: if retval == RD_FAILED:
@ -198,14 +188,13 @@ class RtmpFD(FileDownloader):
break break
if retval == RD_SUCCESS or (test and retval == RD_INCOMPLETE): if retval == RD_SUCCESS or (test and retval == RD_INCOMPLETE):
fsize = os.path.getsize(encodeFilename(tmpfilename)) fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('[rtmpdump] Downloaded %s bytes' % fsize) self.to_screen('[rtmpdump] %s bytes' % fsize)
self.try_rename(tmpfilename, filename) self.try_rename(tmpfilename, filename)
self._hook_progress({ self._hook_progress({
'downloaded_bytes': fsize, 'downloaded_bytes': fsize,
'total_bytes': fsize, 'total_bytes': fsize,
'filename': filename, 'filename': filename,
'status': 'finished', 'status': 'finished',
'elapsed': time.time() - started,
}) })
return True return True
else: else:

View file

@ -13,7 +13,6 @@ from ..utils import (
int_or_none, int_or_none,
parse_iso8601, parse_iso8601,
try_get, try_get,
unescapeHTML,
update_url_query, update_url_query,
) )
@ -105,22 +104,21 @@ class ABCIE(InfoExtractor):
class ABCIViewIE(InfoExtractor): class ABCIViewIE(InfoExtractor):
IE_NAME = 'abc.net.au:iview' IE_NAME = 'abc.net.au:iview'
_VALID_URL = r'https?://iview\.abc\.net\.au/(?:[^/]+/)*video/(?P<id>[^/?#]+)' _VALID_URL = r'https?://iview\.abc\.net\.au/programs/[^/]+/(?P<id>[^/?#]+)'
_GEO_COUNTRIES = ['AU'] _GEO_COUNTRIES = ['AU']
# ABC iview programs are normally available for 14 days only. # ABC iview programs are normally available for 14 days only.
_TESTS = [{ _TESTS = [{
'url': 'https://iview.abc.net.au/show/gruen/series/11/video/LE1927H001S00', 'url': 'http://iview.abc.net.au/programs/call-the-midwife/ZW0898A003S00',
'md5': '67715ce3c78426b11ba167d875ac6abf', 'md5': 'cde42d728b3b7c2b32b1b94b4a548afc',
'info_dict': { 'info_dict': {
'id': 'LE1927H001S00', 'id': 'ZW0898A003S00',
'ext': 'mp4', 'ext': 'mp4',
'title': "Series 11 Ep 1", 'title': 'Series 5 Ep 3',
'series': "Gruen", 'description': 'md5:e0ef7d4f92055b86c4f33611f180ed79',
'description': 'md5:52cc744ad35045baf6aded2ce7287f67', 'upload_date': '20171228',
'upload_date': '20190925',
'uploader_id': 'abc1', 'uploader_id': 'abc1',
'timestamp': 1569445289, 'timestamp': 1514499187,
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
@ -129,16 +127,17 @@ class ABCIViewIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
video_params = self._download_json( webpage = self._download_webpage(url, video_id)
'https://iview.abc.net.au/api/programs/' + video_id, video_id) video_params = self._parse_json(self._search_regex(
title = unescapeHTML(video_params.get('title') or video_params['seriesTitle']) r'videoParams\s*=\s*({.+?});', webpage, 'video params'), video_id)
stream = next(s for s in video_params['playlist'] if s.get('type') in ('program', 'livestream')) title = video_params.get('title') or video_params['seriesTitle']
stream = next(s for s in video_params['playlist'] if s.get('type') == 'program')
house_number = video_params.get('episodeHouseNumber') or video_id house_number = video_params.get('episodeHouseNumber')
path = '/auth/hls/sign?ts={0}&hn={1}&d=android-tablet'.format( path = '/auth/hls/sign?ts={0}&hn={1}&d=android-mobile'.format(
int(time.time()), house_number) int(time.time()), house_number)
sig = hmac.new( sig = hmac.new(
b'android.content.res.Resources', 'android.content.res.Resources'.encode('utf-8'),
path.encode('utf-8'), hashlib.sha256).hexdigest() path.encode('utf-8'), hashlib.sha256).hexdigest()
token = self._download_webpage( token = self._download_webpage(
'http://iview.abc.net.au{0}&sig={1}'.format(path, sig), video_id) 'http://iview.abc.net.au{0}&sig={1}'.format(path, sig), video_id)
@ -148,7 +147,7 @@ class ABCIViewIE(InfoExtractor):
'hdnea': token, 'hdnea': token,
}) })
for sd in ('720', 'sd', 'sd-low'): for sd in ('sd', 'sd-low'):
sd_url = try_get( sd_url = try_get(
stream, lambda x: x['streams']['hls'][sd], compat_str) stream, lambda x: x['streams']['hls'][sd], compat_str)
if not sd_url: if not sd_url:
@ -168,26 +167,18 @@ class ABCIViewIE(InfoExtractor):
'ext': 'vtt', 'ext': 'vtt',
}] }]
is_live = video_params.get('livestream') == '1'
if is_live:
title = self._live_title(title)
return { return {
'id': video_id, 'id': video_id,
'title': title, 'title': title,
'description': video_params.get('description'), 'description': self._html_search_meta(['og:description', 'twitter:description'], webpage),
'thumbnail': video_params.get('thumbnail'), 'thumbnail': self._html_search_meta(['og:image', 'twitter:image:src'], webpage),
'duration': int_or_none(video_params.get('eventDuration')), 'duration': int_or_none(video_params.get('eventDuration')),
'timestamp': parse_iso8601(video_params.get('pubDate'), ' '), 'timestamp': parse_iso8601(video_params.get('pubDate'), ' '),
'series': unescapeHTML(video_params.get('seriesTitle')), 'series': video_params.get('seriesTitle'),
'series_id': video_params.get('seriesHouseNumber') or video_id[:7], 'series_id': video_params.get('seriesHouseNumber') or video_id[:7],
'season_number': int_or_none(self._search_regex( 'episode_number': int_or_none(self._html_search_meta('episodeNumber', webpage, default=None)),
r'\bSeries\s+(\d+)\b', title, 'season number', default=None)), 'episode': self._html_search_meta('episode_title', webpage, default=None),
'episode_number': int_or_none(self._search_regex(
r'\bEp\s+(\d+)\b', title, 'episode number', default=None)),
'episode_id': house_number,
'uploader_id': video_params.get('channel'), 'uploader_id': video_params.get('channel'),
'formats': formats, 'formats': formats,
'subtitles': subtitles, 'subtitles': subtitles,
'is_live': is_live,
} }

View file

@ -15,13 +15,10 @@ class AbcNewsVideoIE(AMPIE):
IE_NAME = 'abcnews:video' IE_NAME = 'abcnews:video'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?:// https?://
abcnews\.go\.com/
(?: (?:
abcnews\.go\.com/ [^/]+/video/(?P<display_id>[0-9a-z-]+)-|
(?: video/embed\?.*?\bid=
[^/]+/video/(?P<display_id>[0-9a-z-]+)-|
video/embed\?.*?\bid=
)|
fivethirtyeight\.abcnews\.go\.com/video/embed/\d+/
) )
(?P<id>\d+) (?P<id>\d+)
''' '''
@ -69,7 +66,7 @@ class AbcNewsIE(InfoExtractor):
_TESTS = [{ _TESTS = [{
'url': 'http://abcnews.go.com/Blotter/News/dramatic-video-rare-death-job-america/story?id=10498713#.UIhwosWHLjY', 'url': 'http://abcnews.go.com/Blotter/News/dramatic-video-rare-death-job-america/story?id=10498713#.UIhwosWHLjY',
'info_dict': { 'info_dict': {
'id': '10505354', 'id': '10498713',
'ext': 'flv', 'ext': 'flv',
'display_id': 'dramatic-video-rare-death-job-america', 'display_id': 'dramatic-video-rare-death-job-america',
'title': 'Occupational Hazards', 'title': 'Occupational Hazards',
@ -82,7 +79,7 @@ class AbcNewsIE(InfoExtractor):
}, { }, {
'url': 'http://abcnews.go.com/Entertainment/justin-timberlake-performs-stop-feeling-eurovision-2016/story?id=39125818', 'url': 'http://abcnews.go.com/Entertainment/justin-timberlake-performs-stop-feeling-eurovision-2016/story?id=39125818',
'info_dict': { 'info_dict': {
'id': '38897857', 'id': '39125818',
'ext': 'mp4', 'ext': 'mp4',
'display_id': 'justin-timberlake-performs-stop-feeling-eurovision-2016', 'display_id': 'justin-timberlake-performs-stop-feeling-eurovision-2016',
'title': 'Justin Timberlake Drops Hints For Secret Single', 'title': 'Justin Timberlake Drops Hints For Secret Single',

View file

@ -4,30 +4,29 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str
from ..utils import ( from ..utils import (
dict_get,
int_or_none, int_or_none,
try_get, parse_iso8601,
) )
class ABCOTVSIE(InfoExtractor): class ABCOTVSIE(InfoExtractor):
IE_NAME = 'abcotvs' IE_NAME = 'abcotvs'
IE_DESC = 'ABC Owned Television Stations' IE_DESC = 'ABC Owned Television Stations'
_VALID_URL = r'https?://(?P<site>abc(?:7(?:news|ny|chicago)?|11|13|30)|6abc)\.com(?:(?:/[^/]+)*/(?P<display_id>[^/]+))?/(?P<id>\d+)' _VALID_URL = r'https?://(?:abc(?:7(?:news|ny|chicago)?|11|13|30)|6abc)\.com(?:/[^/]+/(?P<display_id>[^/]+))?/(?P<id>\d+)'
_TESTS = [ _TESTS = [
{ {
'url': 'http://abc7news.com/entertainment/east-bay-museum-celebrates-vintage-synthesizers/472581/', 'url': 'http://abc7news.com/entertainment/east-bay-museum-celebrates-vintage-synthesizers/472581/',
'info_dict': { 'info_dict': {
'id': '472548', 'id': '472581',
'display_id': 'east-bay-museum-celebrates-vintage-synthesizers', 'display_id': 'east-bay-museum-celebrates-vintage-synthesizers',
'ext': 'mp4', 'ext': 'mp4',
'title': 'East Bay museum celebrates synthesized music', 'title': 'East Bay museum celebrates vintage synthesizers',
'description': 'md5:24ed2bd527096ec2a5c67b9d5a9005f3', 'description': 'md5:24ed2bd527096ec2a5c67b9d5a9005f3',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1421118520, 'timestamp': 1421123075,
'upload_date': '20150113', 'upload_date': '20150113',
'uploader': 'Jonathan Bloom',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
@ -38,63 +37,39 @@ class ABCOTVSIE(InfoExtractor):
'url': 'http://abc7news.com/472581', 'url': 'http://abc7news.com/472581',
'only_matching': True, 'only_matching': True,
}, },
{
'url': 'https://6abc.com/man-75-killed-after-being-struck-by-vehicle-in-chester/5725182/',
'only_matching': True,
},
] ]
_SITE_MAP = {
'6abc': 'wpvi',
'abc11': 'wtvd',
'abc13': 'ktrk',
'abc30': 'kfsn',
'abc7': 'kabc',
'abc7chicago': 'wls',
'abc7news': 'kgo',
'abc7ny': 'wabc',
}
def _real_extract(self, url): def _real_extract(self, url):
site, display_id, video_id = re.match(self._VALID_URL, url).groups() mobj = re.match(self._VALID_URL, url)
display_id = display_id or video_id video_id = mobj.group('id')
station = self._SITE_MAP[site] display_id = mobj.group('display_id') or video_id
data = self._download_json( webpage = self._download_webpage(url, display_id)
'https://api.abcotvs.com/v2/content', display_id, query={
'id': video_id,
'key': 'otv.web.%s.story' % station,
'station': station,
})['data']
video = try_get(data, lambda x: x['featuredMedia']['video'], dict) or data
video_id = compat_str(dict_get(video, ('id', 'publishedKey'), video_id))
title = video.get('title') or video['linkText']
formats = [] m3u8 = self._html_search_meta(
m3u8_url = video.get('m3u8') 'contentURL', webpage, 'm3u8 url', fatal=True).split('?')[0]
if m3u8_url:
formats = self._extract_m3u8_formats( formats = self._extract_m3u8_formats(m3u8, display_id, 'mp4')
video['m3u8'].split('?')[0], display_id, 'mp4', m3u8_id='hls', fatal=False)
mp4_url = video.get('mp4')
if mp4_url:
formats.append({
'abr': 128,
'format_id': 'https',
'height': 360,
'url': mp4_url,
'width': 640,
})
self._sort_formats(formats) self._sort_formats(formats)
image = video.get('image') or {} title = self._og_search_title(webpage).strip()
description = self._og_search_description(webpage).strip()
thumbnail = self._og_search_thumbnail(webpage)
timestamp = parse_iso8601(self._search_regex(
r'<div class="meta">\s*<time class="timeago" datetime="([^"]+)">',
webpage, 'upload date', fatal=False))
uploader = self._search_regex(
r'rel="author">([^<]+)</a>',
webpage, 'uploader', default=None)
return { return {
'id': video_id, 'id': video_id,
'display_id': display_id, 'display_id': display_id,
'title': title, 'title': title,
'description': dict_get(video, ('description', 'caption'), try_get(video, lambda x: x['meta']['description'])), 'description': description,
'thumbnail': dict_get(image, ('source', 'dynamicSource')), 'thumbnail': thumbnail,
'timestamp': int_or_none(video.get('date')), 'timestamp': timestamp,
'duration': int_or_none(video.get('length')), 'uploader': uploader,
'formats': formats, 'formats': formats,
} }

View file

@ -7,10 +7,7 @@ import functools
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import compat_str
from ..utils import ( from ..utils import (
clean_html,
float_or_none,
int_or_none, int_or_none,
try_get,
unified_timestamp, unified_timestamp,
OnDemandPagedList, OnDemandPagedList,
) )
@ -18,98 +15,65 @@ from ..utils import (
class ACastIE(InfoExtractor): class ACastIE(InfoExtractor):
IE_NAME = 'acast' IE_NAME = 'acast'
_VALID_URL = r'''(?x) _VALID_URL = r'https?://(?:www\.)?acast\.com/(?P<channel>[^/]+)/(?P<id>[^/#?]+)'
https?://
(?:
(?:(?:embed|www)\.)?acast\.com/|
play\.acast\.com/s/
)
(?P<channel>[^/]+)/(?P<id>[^/#?]+)
'''
_TESTS = [{ _TESTS = [{
# test with one bling
'url': 'https://www.acast.com/condenasttraveler/-where-are-you-taipei-101-taiwan',
'md5': 'ada3de5a1e3a2a381327d749854788bb',
'info_dict': {
'id': '57de3baa-4bb0-487e-9418-2692c1277a34',
'ext': 'mp3',
'title': '"Where Are You?": Taipei 101, Taiwan',
'timestamp': 1196172000,
'upload_date': '20071127',
'description': 'md5:a0b4ef3634e63866b542e5b1199a1a0e',
'duration': 211,
}
}, {
# test with multiple blings
'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna', 'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna',
'md5': '16d936099ec5ca2d5869e3a813ee8dc4', 'md5': 'e87d5b8516cd04c0d81b6ee1caca28d0',
'info_dict': { 'info_dict': {
'id': '2a92b283-1a75-4ad8-8396-499c641de0d9', 'id': '2a92b283-1a75-4ad8-8396-499c641de0d9',
'ext': 'mp3', 'ext': 'mp3',
'title': '2. Raggarmordet - Röster ur det förflutna', 'title': '2. Raggarmordet - Röster ur det förflutna',
'description': 'md5:4f81f6d8cf2e12ee21a321d8bca32db4',
'timestamp': 1477346700, 'timestamp': 1477346700,
'upload_date': '20161024', 'upload_date': '20161024',
'duration': 2766.602563, 'description': 'md5:4f81f6d8cf2e12ee21a321d8bca32db4',
'creator': 'Anton Berg & Martin Johnson', 'duration': 2766,
'series': 'Spår',
'episode': '2. Raggarmordet - Röster ur det förflutna',
} }
}, {
'url': 'http://embed.acast.com/adambuxton/ep.12-adam-joeschristmaspodcast2015',
'only_matching': True,
}, {
'url': 'https://play.acast.com/s/rattegangspodden/s04e09-styckmordet-i-helenelund-del-22',
'only_matching': True,
}, {
'url': 'https://play.acast.com/s/sparpodcast/2a92b283-1a75-4ad8-8396-499c641de0d9',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
channel, display_id = re.match(self._VALID_URL, url).groups() channel, display_id = re.match(self._VALID_URL, url).groups()
s = self._download_json(
'https://feeder.acast.com/api/v1/shows/%s/episodes/%s' % (channel, display_id),
display_id)
media_url = s['url']
if re.search(r'[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12}', display_id):
episode_url = s.get('episodeUrl')
if episode_url:
display_id = episode_url
else:
channel, display_id = re.match(self._VALID_URL, s['link']).groups()
cast_data = self._download_json( cast_data = self._download_json(
'https://play-api.acast.com/splash/%s/%s' % (channel, display_id), 'https://play-api.acast.com/splash/%s/%s' % (channel, display_id), display_id)
display_id)['result'] e = cast_data['result']['episode']
e = cast_data['episode']
title = e.get('name') or s['title']
return { return {
'id': compat_str(e['id']), 'id': compat_str(e['id']),
'display_id': display_id, 'display_id': display_id,
'url': media_url, 'url': e['mediaUrl'],
'title': title, 'title': e['name'],
'description': e.get('summary') or clean_html(e.get('description') or s.get('description')), 'description': e.get('description'),
'thumbnail': e.get('image'), 'thumbnail': e.get('image'),
'timestamp': unified_timestamp(e.get('publishingDate') or s.get('publishDate')), 'timestamp': unified_timestamp(e.get('publishingDate')),
'duration': float_or_none(e.get('duration') or s.get('duration')), 'duration': int_or_none(e.get('duration')),
'filesize': int_or_none(e.get('contentLength')),
'creator': try_get(cast_data, lambda x: x['show']['author'], compat_str),
'series': try_get(cast_data, lambda x: x['show']['name'], compat_str),
'season_number': int_or_none(e.get('seasonNumber')),
'episode': title,
'episode_number': int_or_none(e.get('episodeNumber')),
} }
class ACastChannelIE(InfoExtractor): class ACastChannelIE(InfoExtractor):
IE_NAME = 'acast:channel' IE_NAME = 'acast:channel'
_VALID_URL = r'''(?x) _VALID_URL = r'https?://(?:www\.)?acast\.com/(?P<id>[^/#?]+)'
https?:// _TEST = {
(?: 'url': 'https://www.acast.com/condenasttraveler',
(?:www\.)?acast\.com/|
play\.acast\.com/s/
)
(?P<id>[^/#?]+)
'''
_TESTS = [{
'url': 'https://www.acast.com/todayinfocus',
'info_dict': { 'info_dict': {
'id': '4efc5294-5385-4847-98bd-519799ce5786', 'id': '50544219-29bb-499e-a083-6087f4cb7797',
'title': 'Today in Focus', 'title': 'Condé Nast Traveler Podcast',
'description': 'md5:9ba5564de5ce897faeb12963f4537a64', 'description': 'md5:98646dee22a5b386626ae31866638fbd',
}, },
'playlist_mincount': 35, 'playlist_mincount': 20,
}, { }
'url': 'http://play.acast.com/s/ft-banking-weekly', _API_BASE_URL = 'https://www.acast.com/api/'
'only_matching': True,
}]
_API_BASE_URL = 'https://play.acast.com/api/'
_PAGE_SIZE = 10 _PAGE_SIZE = 10
@classmethod @classmethod
@ -122,7 +86,7 @@ class ACastChannelIE(InfoExtractor):
channel_slug, note='Download page %d of channel data' % page) channel_slug, note='Download page %d of channel data' % page)
for cast in casts: for cast in casts:
yield self.url_result( yield self.url_result(
'https://play.acast.com/s/%s/%s' % (channel_slug, cast['url']), 'https://www.acast.com/%s/%s' % (channel_slug, cast['url']),
'ACast', cast['id']) 'ACast', cast['id'])
def _real_extract(self, url): def _real_extract(self, url):

View file

@ -0,0 +1,95 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_str,
compat_urllib_parse_urlencode,
compat_urllib_parse_urlparse,
)
from ..utils import (
ExtractorError,
qualities,
)
class AddAnimeIE(InfoExtractor):
_VALID_URL = r'https?://(?:\w+\.)?add-anime\.net/(?:watch_video\.php\?(?:.*?)v=|video/)(?P<id>[\w_]+)'
_TESTS = [{
'url': 'http://www.add-anime.net/watch_video.php?v=24MR3YO5SAS9',
'md5': '72954ea10bc979ab5e2eb288b21425a0',
'info_dict': {
'id': '24MR3YO5SAS9',
'ext': 'mp4',
'description': 'One Piece 606',
'title': 'One Piece 606',
},
'skip': 'Video is gone',
}, {
'url': 'http://add-anime.net/video/MDUGWYKNGBD8/One-Piece-687',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
try:
webpage = self._download_webpage(url, video_id)
except ExtractorError as ee:
if not isinstance(ee.cause, compat_HTTPError) or \
ee.cause.code != 503:
raise
redir_webpage = ee.cause.read().decode('utf-8')
action = self._search_regex(
r'<form id="challenge-form" action="([^"]+)"',
redir_webpage, 'Redirect form')
vc = self._search_regex(
r'<input type="hidden" name="jschl_vc" value="([^"]+)"/>',
redir_webpage, 'redirect vc value')
av = re.search(
r'a\.value = ([0-9]+)[+]([0-9]+)[*]([0-9]+);',
redir_webpage)
if av is None:
raise ExtractorError('Cannot find redirect math task')
av_res = int(av.group(1)) + int(av.group(2)) * int(av.group(3))
parsed_url = compat_urllib_parse_urlparse(url)
av_val = av_res + len(parsed_url.netloc)
confirm_url = (
parsed_url.scheme + '://' + parsed_url.netloc +
action + '?' +
compat_urllib_parse_urlencode({
'jschl_vc': vc, 'jschl_answer': compat_str(av_val)}))
self._download_webpage(
confirm_url, video_id,
note='Confirming after redirect')
webpage = self._download_webpage(url, video_id)
FORMATS = ('normal', 'hq')
quality = qualities(FORMATS)
formats = []
for format_id in FORMATS:
rex = r"var %s_video_file = '(.*?)';" % re.escape(format_id)
video_url = self._search_regex(rex, webpage, 'video file URLx',
fatal=False)
if not video_url:
continue
formats.append({
'format_id': format_id,
'url': video_url,
'quality': quality(format_id),
})
self._sort_formats(formats)
video_title = self._og_search_title(webpage)
video_description = self._og_search_description(webpage)
return {
'_type': 'video',
'id': video_id,
'formats': formats,
'title': video_title,
'description': video_description
}

View file

@ -2,25 +2,18 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import base64 import base64
import binascii
import json import json
import os import os
import random
from .common import InfoExtractor from .common import InfoExtractor
from ..aes import aes_cbc_decrypt from ..aes import aes_cbc_decrypt
from ..compat import ( from ..compat import compat_ord
compat_b64decode,
compat_ord,
)
from ..utils import ( from ..utils import (
bytes_to_intlist, bytes_to_intlist,
bytes_to_long,
ExtractorError, ExtractorError,
float_or_none, float_or_none,
intlist_to_bytes, intlist_to_bytes,
long_to_bytes, srt_subtitles_timecode,
pkcs1pad,
strip_or_none, strip_or_none,
urljoin, urljoin,
) )
@ -40,19 +33,6 @@ class ADNIE(InfoExtractor):
} }
} }
_BASE_URL = 'http://animedigitalnetwork.fr' _BASE_URL = 'http://animedigitalnetwork.fr'
_RSA_KEY = (0xc35ae1e4356b65a73b551493da94b8cb443491c0aa092a357a5aee57ffc14dda85326f42d716e539a34542a0d3f363adf16c5ec222d713d5997194030ee2e4f0d1fb328c01a81cf6868c090d50de8e169c6b13d1675b9eeed1cbc51e1fffca9b38af07f37abd790924cd3bee59d0257cfda4fe5f3f0534877e21ce5821447d1b, 65537)
_POS_ALIGN_MAP = {
'start': 1,
'end': 3,
}
_LINE_ALIGN_MAP = {
'middle': 8,
'end': 4,
}
@staticmethod
def _ass_subtitles_timecode(seconds):
return '%01d:%02d:%02d.%02d' % (seconds / 3600, (seconds % 3600) / 60, seconds % 60, (seconds % 1) * 100)
def _get_subtitles(self, sub_path, video_id): def _get_subtitles(self, sub_path, video_id):
if not sub_path: if not sub_path:
@ -60,21 +40,17 @@ class ADNIE(InfoExtractor):
enc_subtitles = self._download_webpage( enc_subtitles = self._download_webpage(
urljoin(self._BASE_URL, sub_path), urljoin(self._BASE_URL, sub_path),
video_id, 'Downloading subtitles location', fatal=False) or '{}' video_id, fatal=False, headers={
subtitle_location = (self._parse_json(enc_subtitles, video_id, fatal=False) or {}).get('location') 'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0',
if subtitle_location: })
enc_subtitles = self._download_webpage(
urljoin(self._BASE_URL, subtitle_location),
video_id, 'Downloading subtitles data', fatal=False,
headers={'Origin': 'https://animedigitalnetwork.fr'})
if not enc_subtitles: if not enc_subtitles:
return None return None
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js # http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt( dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
bytes_to_intlist(compat_b64decode(enc_subtitles[24:])), bytes_to_intlist(base64.b64decode(enc_subtitles[24:])),
bytes_to_intlist(binascii.unhexlify(self._K + '4b8ef13ec1872730')), bytes_to_intlist(b'\x1b\xe0\x29\x61\x38\x94\x24\x00\x12\xbd\xc5\x80\xac\xce\xbe\xb0'),
bytes_to_intlist(compat_b64decode(enc_subtitles[:24])) bytes_to_intlist(base64.b64decode(enc_subtitles[:24]))
)) ))
subtitles_json = self._parse_json( subtitles_json = self._parse_json(
dec_subtitles[:-compat_ord(dec_subtitles[-1])].decode(), dec_subtitles[:-compat_ord(dec_subtitles[-1])].decode(),
@ -84,27 +60,23 @@ class ADNIE(InfoExtractor):
subtitles = {} subtitles = {}
for sub_lang, sub in subtitles_json.items(): for sub_lang, sub in subtitles_json.items():
ssa = '''[Script Info] srt = ''
ScriptType:V4.00 for num, current in enumerate(sub):
[V4 Styles] start, end, text = (
Format: Name,Fontname,Fontsize,PrimaryColour,SecondaryColour,TertiaryColour,BackColour,Bold,Italic,BorderStyle,Outline,Shadow,Alignment,MarginL,MarginR,MarginV,AlphaLevel,Encoding
Style: Default,Arial,18,16777215,16777215,16777215,0,-1,0,1,1,0,2,20,20,20,0,0
[Events]
Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
for current in sub:
start, end, text, line_align, position_align = (
float_or_none(current.get('startTime')), float_or_none(current.get('startTime')),
float_or_none(current.get('endTime')), float_or_none(current.get('endTime')),
current.get('text'), current.get('lineAlign'), current.get('text'))
current.get('positionAlign'))
if start is None or end is None or text is None: if start is None or end is None or text is None:
continue continue
alignment = self._POS_ALIGN_MAP.get(position_align, 2) + self._LINE_ALIGN_MAP.get(line_align, 0) srt += os.linesep.join(
ssa += os.linesep + 'Dialogue: Marked=0,%s,%s,Default,,0,0,0,,%s%s' % ( (
self._ass_subtitles_timecode(start), '%d' % num,
self._ass_subtitles_timecode(end), '%s --> %s' % (
'{\\a%d}' % alignment if alignment != 2 else '', srt_subtitles_timecode(start),
text.replace('\n', '\\N').replace('<i>', '{\\i1}').replace('</i>', '{\\i0}')) srt_subtitles_timecode(end)),
text,
os.linesep,
))
if sub_lang == 'vostf': if sub_lang == 'vostf':
sub_lang = 'fr' sub_lang = 'fr'
@ -112,8 +84,8 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
'ext': 'json', 'ext': 'json',
'data': json.dumps(sub), 'data': json.dumps(sub),
}, { }, {
'ext': 'ssa', 'ext': 'srt',
'data': ssa, 'data': srt,
}]) }])
return subtitles return subtitles
@ -121,15 +93,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
player_config = self._parse_json(self._search_regex( player_config = self._parse_json(self._search_regex(
r'playerConfig\s*=\s*({.+});', webpage, r'playerConfig\s*=\s*({.+});', webpage, 'player config'), video_id)
'player config', default='{}'), video_id, fatal=False)
if not player_config:
config_url = urljoin(self._BASE_URL, self._search_regex(
r'(?:id="player"|class="[^"]*adn-player-container[^"]*")[^>]+data-url="([^"]+)"',
webpage, 'config url'))
player_config = self._download_json(
config_url, video_id,
'Downloading player config JSON metadata')['player']
video_info = {} video_info = {}
video_info_str = self._search_regex( video_info_str = self._search_regex(
@ -141,44 +105,23 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
options = player_config.get('options') or {} options = player_config.get('options') or {}
metas = options.get('metas') or {} metas = options.get('metas') or {}
title = metas.get('title') or video_info['title']
links = player_config.get('links') or {} links = player_config.get('links') or {}
sub_path = player_config.get('subtitles')
error = None error = None
if not links: if not links:
links_url = player_config.get('linksurl') or options['videoUrl'] links_url = player_config['linksurl']
token = options['token'] links_data = self._download_json(urljoin(
self._K = ''.join([random.choice('0123456789abcdef') for _ in range(16)]) self._BASE_URL, links_url), video_id)
message = bytes_to_intlist(json.dumps({
'k': self._K,
'e': 60,
't': token,
}))
padded_message = intlist_to_bytes(pkcs1pad(message, 128))
n, e = self._RSA_KEY
encrypted_message = long_to_bytes(pow(bytes_to_long(padded_message), e, n))
authorization = base64.b64encode(encrypted_message).decode()
links_data = self._download_json(
urljoin(self._BASE_URL, links_url), video_id,
'Downloading links JSON metadata', headers={
'Authorization': 'Bearer ' + authorization,
})
links = links_data.get('links') or {} links = links_data.get('links') or {}
metas = metas or links_data.get('meta') or {}
sub_path = sub_path or links_data.get('subtitles') or \
'index.php?option=com_vodapi&task=subtitles.getJSON&format=json&id=' + video_id
sub_path += '&token=' + token
error = links_data.get('error') error = links_data.get('error')
title = metas.get('title') or video_info['title']
formats = [] formats = []
for format_id, qualities in links.items(): for format_id, qualities in links.items():
if not isinstance(qualities, dict): if not isinstance(qualities, dict):
continue continue
for quality, load_balancer_url in qualities.items(): for load_balancer_url in qualities.values():
load_balancer_data = self._download_json( load_balancer_data = self._download_json(
load_balancer_url, video_id, load_balancer_url, video_id, fatal=False) or {}
'Downloading %s %s JSON metadata' % (format_id, quality),
fatal=False) or {}
m3u8_url = load_balancer_data.get('location') m3u8_url = load_balancer_data.get('location')
if not m3u8_url: if not m3u8_url:
continue continue
@ -201,7 +144,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
'description': strip_or_none(metas.get('summary') or video_info.get('resume')), 'description': strip_or_none(metas.get('summary') or video_info.get('resume')),
'thumbnail': video_info.get('image'), 'thumbnail': video_info.get('image'),
'formats': formats, 'formats': formats,
'subtitles': self.extract_subtitles(sub_path, video_id), 'subtitles': self.extract_subtitles(player_config.get('subtitles'), video_id),
'episode': metas.get('subtitle') or video_info.get('videoTitle'), 'episode': metas.get('subtitle') or video_info.get('videoTitle'),
'series': video_info.get('playlistTitle'), 'series': video_info.get('playlistTitle'),
} }

View file

@ -1,37 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import (
compat_parse_qs,
compat_urlparse,
)
class AdobeConnectIE(InfoExtractor):
_VALID_URL = r'https?://\w+\.adobeconnect\.com/(?P<id>[\w-]+)'
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(r'<title>(.+?)</title>', webpage, 'title')
qs = compat_parse_qs(self._search_regex(r"swfUrl\s*=\s*'([^']+)'", webpage, 'swf url').split('?')[1])
is_live = qs.get('isLive', ['false'])[0] == 'true'
formats = []
for con_string in qs['conStrings'][0].split(','):
formats.append({
'format_id': con_string.split('://')[0],
'app': compat_urlparse.quote('?' + con_string.split('?')[1] + 'flvplayerapp/' + qs['appInstance'][0]),
'ext': 'flv',
'play_path': 'mp4:' + qs['streamName'][0],
'rtmp_conn': 'S:' + qs['ticket'][0],
'rtmp_live': is_live,
'url': con_string,
})
return {
'id': video_id,
'title': self._live_title(title) if is_live else title,
'formats': formats,
'is_live': is_live,
}

View file

@ -25,11 +25,6 @@ MSO_INFO = {
'username_field': 'username', 'username_field': 'username',
'password_field': 'password', 'password_field': 'password',
}, },
'ATT': {
'name': 'AT&T U-verse',
'username_field': 'userid',
'password_field': 'password',
},
'ATTOTT': { 'ATTOTT': {
'name': 'DIRECTV NOW', 'name': 'DIRECTV NOW',
'username_field': 'email', 'username_field': 'email',
@ -1330,8 +1325,8 @@ class AdobePassIE(InfoExtractor):
_DOWNLOADING_LOGIN_PAGE = 'Downloading Provider Login Page' _DOWNLOADING_LOGIN_PAGE = 'Downloading Provider Login Page'
def _download_webpage_handle(self, *args, **kwargs): def _download_webpage_handle(self, *args, **kwargs):
headers = self.geo_verification_headers() headers = kwargs.get('headers', {})
headers.update(kwargs.get('headers', {})) headers.update(self.geo_verification_headers())
kwargs['headers'] = headers kwargs['headers'] = headers
return super(AdobePassIE, self)._download_webpage_handle( return super(AdobePassIE, self)._download_webpage_handle(
*args, **compat_kwargs(kwargs)) *args, **compat_kwargs(kwargs))

View file

@ -1,119 +1,25 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import functools
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import compat_str
from ..utils import ( from ..utils import (
float_or_none,
int_or_none,
ISO639Utils,
OnDemandPagedList,
parse_duration, parse_duration,
str_or_none,
str_to_int,
unified_strdate, unified_strdate,
str_to_int,
int_or_none,
float_or_none,
ISO639Utils,
determine_ext,
) )
class AdobeTVBaseIE(InfoExtractor): class AdobeTVBaseIE(InfoExtractor):
def _call_api(self, path, video_id, query, note=None): _API_BASE_URL = 'http://tv.adobe.com/api/v4/'
return self._download_json(
'http://tv.adobe.com/api/v4/' + path,
video_id, note, query=query)['data']
def _parse_subtitles(self, video_data, url_key):
subtitles = {}
for translation in video_data.get('translations', []):
vtt_path = translation.get(url_key)
if not vtt_path:
continue
lang = translation.get('language_w3c') or ISO639Utils.long2short(translation['language_medium'])
subtitles.setdefault(lang, []).append({
'ext': 'vtt',
'url': vtt_path,
})
return subtitles
def _parse_video_data(self, video_data):
video_id = compat_str(video_data['id'])
title = video_data['title']
s3_extracted = False
formats = []
for source in video_data.get('videos', []):
source_url = source.get('url')
if not source_url:
continue
f = {
'format_id': source.get('quality_level'),
'fps': int_or_none(source.get('frame_rate')),
'height': int_or_none(source.get('height')),
'tbr': int_or_none(source.get('video_data_rate')),
'width': int_or_none(source.get('width')),
'url': source_url,
}
original_filename = source.get('original_filename')
if original_filename:
if not (f.get('height') and f.get('width')):
mobj = re.search(r'_(\d+)x(\d+)', original_filename)
if mobj:
f.update({
'height': int(mobj.group(2)),
'width': int(mobj.group(1)),
})
if original_filename.startswith('s3://') and not s3_extracted:
formats.append({
'format_id': 'original',
'preference': 1,
'url': original_filename.replace('s3://', 'https://s3.amazonaws.com/'),
})
s3_extracted = True
formats.append(f)
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': video_data.get('description'),
'thumbnail': video_data.get('thumbnail'),
'upload_date': unified_strdate(video_data.get('start_date')),
'duration': parse_duration(video_data.get('duration')),
'view_count': str_to_int(video_data.get('playcount')),
'formats': formats,
'subtitles': self._parse_subtitles(video_data, 'vtt'),
}
class AdobeTVEmbedIE(AdobeTVBaseIE):
IE_NAME = 'adobetv:embed'
_VALID_URL = r'https?://tv\.adobe\.com/embed/\d+/(?P<id>\d+)'
_TEST = {
'url': 'https://tv.adobe.com/embed/22/4153',
'md5': 'c8c0461bf04d54574fc2b4d07ac6783a',
'info_dict': {
'id': '4153',
'ext': 'flv',
'title': 'Creating Graphics Optimized for BlackBerry',
'description': 'md5:eac6e8dced38bdaae51cd94447927459',
'thumbnail': r're:https?://.*\.jpg$',
'upload_date': '20091109',
'duration': 377,
'view_count': int,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._call_api(
'episode/' + video_id, video_id, {'disclosure': 'standard'})[0]
return self._parse_video_data(video_data)
class AdobeTVIE(AdobeTVBaseIE): class AdobeTVIE(AdobeTVBaseIE):
IE_NAME = 'adobetv'
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?watch/(?P<show_urlname>[^/]+)/(?P<id>[^/]+)' _VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?watch/(?P<show_urlname>[^/]+)/(?P<id>[^/]+)'
_TEST = { _TEST = {
@ -136,33 +42,45 @@ class AdobeTVIE(AdobeTVBaseIE):
if not language: if not language:
language = 'en' language = 'en'
video_data = self._call_api( video_data = self._download_json(
'episode/get', urlname, { self._API_BASE_URL + 'episode/get/?language=%s&show_urlname=%s&urlname=%s&disclosure=standard' % (language, show_urlname, urlname),
'disclosure': 'standard', urlname)['data'][0]
'language': language,
'show_urlname': show_urlname, formats = [{
'urlname': urlname, 'url': source['url'],
})[0] 'format_id': source.get('quality_level') or source['url'].split('-')[-1].split('.')[0] or None,
return self._parse_video_data(video_data) 'width': int_or_none(source.get('width')),
'height': int_or_none(source.get('height')),
'tbr': int_or_none(source.get('video_data_rate')),
} for source in video_data['videos']]
self._sort_formats(formats)
return {
'id': compat_str(video_data['id']),
'title': video_data['title'],
'description': video_data.get('description'),
'thumbnail': video_data.get('thumbnail'),
'upload_date': unified_strdate(video_data.get('start_date')),
'duration': parse_duration(video_data.get('duration')),
'view_count': str_to_int(video_data.get('playcount')),
'formats': formats,
}
class AdobeTVPlaylistBaseIE(AdobeTVBaseIE): class AdobeTVPlaylistBaseIE(AdobeTVBaseIE):
_PAGE_SIZE = 25 def _parse_page_data(self, page_data):
return [self.url_result(self._get_element_url(element_data)) for element_data in page_data]
def _fetch_page(self, display_id, query, page): def _extract_playlist_entries(self, url, display_id):
page += 1 page = self._download_json(url, display_id)
query['page'] = page entries = self._parse_page_data(page['data'])
for element_data in self._call_api( for page_num in range(2, page['paging']['pages'] + 1):
self._RESOURCE, display_id, query, 'Download Page %d' % page): entries.extend(self._parse_page_data(
yield self._process_data(element_data) self._download_json(url + '&page=%d' % page_num, display_id)['data']))
return entries
def _extract_playlist_entries(self, display_id, query):
return OnDemandPagedList(functools.partial(
self._fetch_page, display_id, query), self._PAGE_SIZE)
class AdobeTVShowIE(AdobeTVPlaylistBaseIE): class AdobeTVShowIE(AdobeTVPlaylistBaseIE):
IE_NAME = 'adobetv:show'
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?show/(?P<id>[^/]+)' _VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?show/(?P<id>[^/]+)'
_TEST = { _TEST = {
@ -174,31 +92,26 @@ class AdobeTVShowIE(AdobeTVPlaylistBaseIE):
}, },
'playlist_mincount': 136, 'playlist_mincount': 136,
} }
_RESOURCE = 'episode'
_process_data = AdobeTVBaseIE._parse_video_data def _get_element_url(self, element_data):
return element_data['urls'][0]
def _real_extract(self, url): def _real_extract(self, url):
language, show_urlname = re.match(self._VALID_URL, url).groups() language, show_urlname = re.match(self._VALID_URL, url).groups()
if not language: if not language:
language = 'en' language = 'en'
query = { query = 'language=%s&show_urlname=%s' % (language, show_urlname)
'disclosure': 'standard',
'language': language,
'show_urlname': show_urlname,
}
show_data = self._call_api( show_data = self._download_json(self._API_BASE_URL + 'show/get/?%s' % query, show_urlname)['data'][0]
'show/get', show_urlname, query)[0]
return self.playlist_result( return self.playlist_result(
self._extract_playlist_entries(show_urlname, query), self._extract_playlist_entries(self._API_BASE_URL + 'episode/?%s' % query, show_urlname),
str_or_none(show_data.get('id')), compat_str(show_data['id']),
show_data.get('show_name'), show_data['show_name'],
show_data.get('show_description')) show_data['show_description'])
class AdobeTVChannelIE(AdobeTVPlaylistBaseIE): class AdobeTVChannelIE(AdobeTVPlaylistBaseIE):
IE_NAME = 'adobetv:channel'
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?channel/(?P<id>[^/]+)(?:/(?P<category_urlname>[^/]+))?' _VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?channel/(?P<id>[^/]+)(?:/(?P<category_urlname>[^/]+))?'
_TEST = { _TEST = {
@ -208,30 +121,24 @@ class AdobeTVChannelIE(AdobeTVPlaylistBaseIE):
}, },
'playlist_mincount': 96, 'playlist_mincount': 96,
} }
_RESOURCE = 'show'
def _process_data(self, show_data): def _get_element_url(self, element_data):
return self.url_result( return element_data['url']
show_data['url'], 'AdobeTVShow', str_or_none(show_data.get('id')))
def _real_extract(self, url): def _real_extract(self, url):
language, channel_urlname, category_urlname = re.match(self._VALID_URL, url).groups() language, channel_urlname, category_urlname = re.match(self._VALID_URL, url).groups()
if not language: if not language:
language = 'en' language = 'en'
query = { query = 'language=%s&channel_urlname=%s' % (language, channel_urlname)
'channel_urlname': channel_urlname,
'language': language,
}
if category_urlname: if category_urlname:
query['category_urlname'] = category_urlname query += '&category_urlname=%s' % category_urlname
return self.playlist_result( return self.playlist_result(
self._extract_playlist_entries(channel_urlname, query), self._extract_playlist_entries(self._API_BASE_URL + 'show/?%s' % query, channel_urlname),
channel_urlname) channel_urlname)
class AdobeTVVideoIE(AdobeTVBaseIE): class AdobeTVVideoIE(InfoExtractor):
IE_NAME = 'adobetv:video'
_VALID_URL = r'https?://video\.tv\.adobe\.com/v/(?P<id>\d+)' _VALID_URL = r'https?://video\.tv\.adobe\.com/v/(?P<id>\d+)'
_TEST = { _TEST = {
@ -253,36 +160,38 @@ class AdobeTVVideoIE(AdobeTVBaseIE):
video_data = self._parse_json(self._search_regex( video_data = self._parse_json(self._search_regex(
r'var\s+bridge\s*=\s*([^;]+);', webpage, 'bridged data'), video_id) r'var\s+bridge\s*=\s*([^;]+);', webpage, 'bridged data'), video_id)
title = video_data['title']
formats = [] formats = [{
sources = video_data.get('sources') or [] 'format_id': '%s-%s' % (determine_ext(source['src']), source.get('height')),
for source in sources: 'url': source['src'],
source_src = source.get('src') 'width': int_or_none(source.get('width')),
if not source_src: 'height': int_or_none(source.get('height')),
continue 'tbr': int_or_none(source.get('bitrate')),
formats.append({ } for source in video_data['sources']]
'filesize': int_or_none(source.get('kilobytes') or None, invscale=1000),
'format_id': '-'.join(filter(None, [source.get('format'), source.get('label')])),
'height': int_or_none(source.get('height') or None),
'tbr': int_or_none(source.get('bitrate') or None),
'width': int_or_none(source.get('width') or None),
'url': source_src,
})
self._sort_formats(formats) self._sort_formats(formats)
# For both metadata and downloaded files the duration varies among # For both metadata and downloaded files the duration varies among
# formats. I just pick the max one # formats. I just pick the max one
duration = max(filter(None, [ duration = max(filter(None, [
float_or_none(source.get('duration'), scale=1000) float_or_none(source.get('duration'), scale=1000)
for source in sources])) for source in video_data['sources']]))
subtitles = {}
for translation in video_data.get('translations', []):
lang_id = translation.get('language_w3c') or ISO639Utils.long2short(translation['language_medium'])
if lang_id not in subtitles:
subtitles[lang_id] = []
subtitles[lang_id].append({
'url': translation['vttPath'],
'ext': 'vtt',
})
return { return {
'id': video_id, 'id': video_id,
'formats': formats, 'formats': formats,
'title': title, 'title': video_data['title'],
'description': video_data.get('description'), 'description': video_data.get('description'),
'thumbnail': video_data.get('video', {}).get('poster'), 'thumbnail': video_data['video'].get('poster'),
'duration': duration, 'duration': duration,
'subtitles': self._parse_subtitles(video_data, 'vttPath'), 'subtitles': subtitles,
} }

View file

@ -1,19 +1,12 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import json
import re import re
from .turner import TurnerBaseIE from .turner import TurnerBaseIE
from ..utils import ( from ..utils import (
determine_ext,
float_or_none,
int_or_none, int_or_none,
mimetype2ext,
parse_age_limit,
parse_iso8601,
strip_or_none, strip_or_none,
try_get,
) )
@ -27,8 +20,8 @@ class AdultSwimIE(TurnerBaseIE):
'ext': 'mp4', 'ext': 'mp4',
'title': 'Rick and Morty - Pilot', 'title': 'Rick and Morty - Pilot',
'description': 'Rick moves in with his daughter\'s family and establishes himself as a bad influence on his grandson, Morty.', 'description': 'Rick moves in with his daughter\'s family and establishes himself as a bad influence on his grandson, Morty.',
'timestamp': 1543294800, 'timestamp': 1493267400,
'upload_date': '20181127', 'upload_date': '20170427',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
@ -49,7 +42,6 @@ class AdultSwimIE(TurnerBaseIE):
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
}, },
'skip': '404 Not Found',
}, { }, {
'url': 'http://www.adultswim.com/videos/decker/inside-decker-a-new-hero/', 'url': 'http://www.adultswim.com/videos/decker/inside-decker-a-new-hero/',
'info_dict': { 'info_dict': {
@ -68,9 +60,9 @@ class AdultSwimIE(TurnerBaseIE):
}, { }, {
'url': 'http://www.adultswim.com/videos/attack-on-titan', 'url': 'http://www.adultswim.com/videos/attack-on-titan',
'info_dict': { 'info_dict': {
'id': 'attack-on-titan', 'id': 'b7A69dzfRzuaXIECdxW8XQ',
'title': 'Attack on Titan', 'title': 'Attack on Titan',
'description': 'md5:41caa9416906d90711e31dc00cb7db7e', 'description': 'md5:6c8e003ea0777b47013e894767f5e114',
}, },
'playlist_mincount': 12, 'playlist_mincount': 12,
}, { }, {
@ -85,118 +77,83 @@ class AdultSwimIE(TurnerBaseIE):
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
}, },
'skip': '404 Not Found',
}] }]
def _real_extract(self, url): def _real_extract(self, url):
show_path, episode_path = re.match(self._VALID_URL, url).groups() show_path, episode_path = re.match(self._VALID_URL, url).groups()
display_id = episode_path or show_path display_id = episode_path or show_path
query = '''query { webpage = self._download_webpage(url, display_id)
getShowBySlug(slug:"%s") { initial_data = self._parse_json(self._search_regex(
%%s r'AS_INITIAL_DATA(?:__)?\s*=\s*({.+?});',
} webpage, 'initial data'), display_id)
}''' % show_path
if episode_path:
query = query % '''title
getVideoBySlug(slug:"%s") {
_id
auth
description
duration
episodeNumber
launchDate
mediaID
seasonNumber
poster
title
tvRating
}''' % episode_path
['getVideoBySlug']
else:
query = query % '''metaDescription
title
videos(first:1000,sort:["episode_number"]) {
edges {
node {
_id
slug
}
}
}'''
show_data = self._download_json(
'https://www.adultswim.com/api/search', display_id,
data=json.dumps({'query': query}).encode(),
headers={'Content-Type': 'application/json'})['data']['getShowBySlug']
if episode_path:
video_data = show_data['getVideoBySlug']
video_id = video_data['_id']
episode_title = title = video_data['title']
series = show_data.get('title')
if series:
title = '%s - %s' % (series, title)
info = {
'id': video_id,
'title': title,
'description': strip_or_none(video_data.get('description')),
'duration': float_or_none(video_data.get('duration')),
'formats': [],
'subtitles': {},
'age_limit': parse_age_limit(video_data.get('tvRating')),
'thumbnail': video_data.get('poster'),
'timestamp': parse_iso8601(video_data.get('launchDate')),
'series': series,
'season_number': int_or_none(video_data.get('seasonNumber')),
'episode': episode_title,
'episode_number': int_or_none(video_data.get('episodeNumber')),
}
auth = video_data.get('auth') is_stream = show_path == 'streams'
media_id = video_data.get('mediaID') if is_stream:
if media_id: if not episode_path:
info.update(self._extract_ngtv_info(media_id, { episode_path = 'live-stream'
# CDN_TOKEN_APP_ID from:
# https://d2gg02c3xr550i.cloudfront.net/assets/asvp.e9c8bef24322d060ef87.bundle.js
'appId': 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhcHBJZCI6ImFzLXR2ZS1kZXNrdG9wLXB0enQ2bSIsInByb2R1Y3QiOiJ0dmUiLCJuZXR3b3JrIjoiYXMiLCJwbGF0Zm9ybSI6ImRlc2t0b3AiLCJpYXQiOjE1MzI3MDIyNzl9.BzSCk-WYOZ2GMCIaeVb8zWnzhlgnXuJTCu0jGp_VaZE',
}, {
'url': url,
'site_name': 'AdultSwim',
'auth_required': auth,
}))
if not auth: video_data = next(stream for stream_path, stream in initial_data['streams'].items() if stream_path == episode_path)
extract_data = self._download_json( video_id = video_data.get('stream')
'https://www.adultswim.com/api/shows/v1/videos/' + video_id,
video_id, query={'fields': 'stream'}, fatal=False) or {} if not video_id:
assets = try_get(extract_data, lambda x: x['data']['video']['stream']['assets'], list) or [] entries = []
for asset in assets: for episode in video_data.get('archiveEpisodes', []):
asset_url = asset.get('url') episode_url = episode.get('url')
if not asset_url: if not episode_url:
continue continue
ext = determine_ext(asset_url, mimetype2ext(asset.get('mime_type'))) entries.append(self.url_result(
if ext == 'm3u8': episode_url, 'AdultSwim', episode.get('id')))
info['formats'].extend(self._extract_m3u8_formats( return self.playlist_result(
asset_url, video_id, 'mp4', m3u8_id='hls', fatal=False)) entries, video_data.get('id'), video_data.get('title'),
elif ext == 'f4m': strip_or_none(video_data.get('description')))
continue
# info['formats'].extend(self._extract_f4m_formats(
# asset_url, video_id, f4m_id='hds', fatal=False))
elif ext in ('scc', 'ttml', 'vtt'):
info['subtitles'].setdefault('en', []).append({
'url': asset_url,
})
self._sort_formats(info['formats'])
return info
else: else:
entries = [] show_data = initial_data['show']
for edge in show_data.get('videos', {}).get('edges', []):
video = edge.get('node') or {} if not episode_path:
slug = video.get('slug') entries = []
if not slug: for video in show_data.get('videos', []):
continue slug = video.get('slug')
entries.append(self.url_result( if not slug:
'http://adultswim.com/videos/%s/%s' % (show_path, slug), continue
'AdultSwim', video.get('_id'))) entries.append(self.url_result(
return self.playlist_result( 'http://adultswim.com/videos/%s/%s' % (show_path, slug),
entries, show_path, show_data.get('title'), 'AdultSwim', video.get('id')))
strip_or_none(show_data.get('metaDescription'))) return self.playlist_result(
entries, show_data.get('id'), show_data.get('title'),
strip_or_none(show_data.get('metadata', {}).get('description')))
video_data = show_data['sluggedVideo']
video_id = video_data['id']
info = self._extract_cvp_info(
'http://www.adultswim.com/videos/api/v0/assets?platform=desktop&id=' + video_id,
video_id, {
'secure': {
'media_src': 'http://androidhls-secure.cdn.turner.com/adultswim/big',
'tokenizer_src': 'http://www.adultswim.com/astv/mvpd/processors/services/token_ipadAdobe.do',
},
}, {
'url': url,
'site_name': 'AdultSwim',
'auth_required': video_data.get('auth'),
})
info.update({
'id': video_id,
'display_id': display_id,
'description': info.get('description') or strip_or_none(video_data.get('description')),
})
if not is_stream:
info.update({
'duration': info.get('duration') or int_or_none(video_data.get('duration')),
'timestamp': info.get('timestamp') or int_or_none(video_data.get('launch_date')),
'season_number': info.get('season_number') or int_or_none(video_data.get('season_number')),
'episode': info['title'],
'episode_number': info.get('episode_number') or int_or_none(video_data.get('episode_number')),
})
info['series'] = video_data.get('collection_title') or info.get('series')
if info['series'] and info['series'] != info['title']:
info['title'] = '%s - %s' % (info['series'], info['title'])
return info

View file

@ -1,15 +1,14 @@
# coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re import re
from .theplatform import ThePlatformIE from .theplatform import ThePlatformIE
from ..utils import ( from ..utils import (
extract_attributes,
ExtractorError,
int_or_none,
smuggle_url, smuggle_url,
update_url_query, update_url_query,
unescapeHTML,
extract_attributes,
get_element_by_attribute,
) )
from ..compat import ( from ..compat import (
compat_urlparse, compat_urlparse,
@ -20,76 +19,35 @@ class AENetworksBaseIE(ThePlatformIE):
_THEPLATFORM_KEY = 'crazyjava' _THEPLATFORM_KEY = 'crazyjava'
_THEPLATFORM_SECRET = 's3cr3t' _THEPLATFORM_SECRET = 's3cr3t'
def _extract_aen_smil(self, smil_url, video_id, auth=None):
query = {'mbr': 'true'}
if auth:
query['auth'] = auth
TP_SMIL_QUERY = [{
'assetTypes': 'high_video_ak',
'switch': 'hls_high_ak'
}, {
'assetTypes': 'high_video_s3'
}, {
'assetTypes': 'high_video_s3',
'switch': 'hls_ingest_fastly'
}]
formats = []
subtitles = {}
last_e = None
for q in TP_SMIL_QUERY:
q.update(query)
m_url = update_url_query(smil_url, q)
m_url = self._sign_url(m_url, self._THEPLATFORM_KEY, self._THEPLATFORM_SECRET)
try:
tp_formats, tp_subtitles = self._extract_theplatform_smil(
m_url, video_id, 'Downloading %s SMIL data' % (q.get('switch') or q['assetTypes']))
except ExtractorError as e:
last_e = e
continue
formats.extend(tp_formats)
subtitles = self._merge_subtitles(subtitles, tp_subtitles)
if last_e and not formats:
raise last_e
self._sort_formats(formats)
return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
}
class AENetworksIE(AENetworksBaseIE): class AENetworksIE(AENetworksBaseIE):
IE_NAME = 'aenetworks' IE_NAME = 'aenetworks'
IE_DESC = 'A+E Networks: A&E, Lifetime, History.com, FYI Network and History Vault' IE_DESC = 'A+E Networks: A&E, Lifetime, History.com, FYI Network'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?:// https?://
(?:www\.)? (?:www\.)?
(?P<domain> (?P<domain>
(?:history(?:vault)?|aetv|mylifetime|lifetimemovieclub)\.com| (?:history|aetv|mylifetime|lifetimemovieclub)\.com|
fyi\.tv fyi\.tv
)/ )/
(?: (?:
shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})| shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|
movies/(?P<movie_display_id>[^/]+)(?:/full-movie)?| movies/(?P<movie_display_id>[^/]+)(?:/full-movie)?|
specials/(?P<special_display_id>[^/]+)/(?:full-special|preview-)| specials/(?P<special_display_id>[^/]+)/full-special
collections/[^/]+/(?P<collection_display_id>[^/]+)
) )
''' '''
_TESTS = [{ _TESTS = [{
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1', 'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
'md5': 'a97a65f7e823ae10e9244bc5433d5fe6',
'info_dict': { 'info_dict': {
'id': '22253814', 'id': '22253814',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Winter is Coming', 'title': 'Winter Is Coming',
'description': 'md5:641f424b7a19d8e24f26dea22cf59d74', 'description': 'md5:641f424b7a19d8e24f26dea22cf59d74',
'timestamp': 1338306241, 'timestamp': 1338306241,
'upload_date': '20120529', 'upload_date': '20120529',
'uploader': 'AENE-NEW', 'uploader': 'AENE-NEW',
}, },
'params': {
# m3u8 download
'skip_download': True,
},
'add_ie': ['ThePlatform'], 'add_ie': ['ThePlatform'],
}, { }, {
'url': 'http://www.history.com/shows/ancient-aliens/season-1', 'url': 'http://www.history.com/shows/ancient-aliens/season-1',
@ -122,12 +80,6 @@ class AENetworksIE(AENetworksBaseIE):
}, { }, {
'url': 'http://www.history.com/specials/sniper-into-the-kill-zone/full-special', 'url': 'http://www.history.com/specials/sniper-into-the-kill-zone/full-special',
'only_matching': True 'only_matching': True
}, {
'url': 'https://www.historyvault.com/collections/america-the-story-of-us/westward',
'only_matching': True
}, {
'url': 'https://www.aetv.com/specials/hunting-jonbenets-killer-the-untold-story/preview-hunting-jonbenets-killer-the-untold-story',
'only_matching': True
}] }]
_DOMAIN_TO_REQUESTOR_ID = { _DOMAIN_TO_REQUESTOR_ID = {
'history.com': 'HISTORY', 'history.com': 'HISTORY',
@ -138,9 +90,9 @@ class AENetworksIE(AENetworksBaseIE):
} }
def _real_extract(self, url): def _real_extract(self, url):
domain, show_path, movie_display_id, special_display_id, collection_display_id = re.match(self._VALID_URL, url).groups() domain, show_path, movie_display_id, special_display_id = re.match(self._VALID_URL, url).groups()
display_id = show_path or movie_display_id or special_display_id or collection_display_id display_id = show_path or movie_display_id or special_display_id
webpage = self._download_webpage(url, display_id, headers=self.geo_verification_headers()) webpage = self._download_webpage(url, display_id)
if show_path: if show_path:
url_parts = show_path.split('/') url_parts = show_path.split('/')
url_parts_len = len(url_parts) url_parts_len = len(url_parts)
@ -168,6 +120,10 @@ class AENetworksIE(AENetworksBaseIE):
return self.playlist_result( return self.playlist_result(
entries, self._html_search_meta('aetn:SeasonId', webpage)) entries, self._html_search_meta('aetn:SeasonId', webpage))
query = {
'mbr': 'true',
'assetTypes': 'high_video_s3'
}
video_id = self._html_search_meta('aetn:VideoID', webpage) video_id = self._html_search_meta('aetn:VideoID', webpage)
media_url = self._search_regex( media_url = self._search_regex(
[r"media_url\s*=\s*'(?P<url>[^']+)'", [r"media_url\s*=\s*'(?P<url>[^']+)'",
@ -177,39 +133,64 @@ class AENetworksIE(AENetworksBaseIE):
theplatform_metadata = self._download_theplatform_metadata(self._search_regex( theplatform_metadata = self._download_theplatform_metadata(self._search_regex(
r'https?://link\.theplatform\.com/s/([^?]+)', media_url, 'theplatform_path'), video_id) r'https?://link\.theplatform\.com/s/([^?]+)', media_url, 'theplatform_path'), video_id)
info = self._parse_theplatform_metadata(theplatform_metadata) info = self._parse_theplatform_metadata(theplatform_metadata)
auth = None
if theplatform_metadata.get('AETN$isBehindWall'): if theplatform_metadata.get('AETN$isBehindWall'):
requestor_id = self._DOMAIN_TO_REQUESTOR_ID[domain] requestor_id = self._DOMAIN_TO_REQUESTOR_ID[domain]
resource = self._get_mvpd_resource( resource = self._get_mvpd_resource(
requestor_id, theplatform_metadata['title'], requestor_id, theplatform_metadata['title'],
theplatform_metadata.get('AETN$PPL_pplProgramId') or theplatform_metadata.get('AETN$PPL_pplProgramId_OLD'), theplatform_metadata.get('AETN$PPL_pplProgramId') or theplatform_metadata.get('AETN$PPL_pplProgramId_OLD'),
theplatform_metadata['ratings'][0]['rating']) theplatform_metadata['ratings'][0]['rating'])
auth = self._extract_mvpd_auth( query['auth'] = self._extract_mvpd_auth(
url, video_id, requestor_id, resource) url, video_id, requestor_id, resource)
info.update(self._search_json_ld(webpage, video_id, fatal=False)) info.update(self._search_json_ld(webpage, video_id, fatal=False))
info.update(self._extract_aen_smil(media_url, video_id, auth)) media_url = update_url_query(media_url, query)
media_url = self._sign_url(media_url, self._THEPLATFORM_KEY, self._THEPLATFORM_SECRET)
formats, subtitles = self._extract_theplatform_smil(media_url, video_id)
self._sort_formats(formats)
info.update({
'id': video_id,
'formats': formats,
'subtitles': subtitles,
})
return info return info
class HistoryTopicIE(AENetworksBaseIE): class HistoryTopicIE(AENetworksBaseIE):
IE_NAME = 'history:topic' IE_NAME = 'history:topic'
IE_DESC = 'History.com Topic' IE_DESC = 'History.com Topic'
_VALID_URL = r'https?://(?:www\.)?history\.com/topics/[^/]+/(?P<id>[\w+-]+?)-video' _VALID_URL = r'https?://(?:www\.)?history\.com/topics/(?:[^/]+/)?(?P<topic_id>[^/]+)(?:/[^/]+(?:/(?P<video_display_id>[^/?#]+))?)?'
_TESTS = [{ _TESTS = [{
'url': 'https://www.history.com/topics/valentines-day/history-of-valentines-day-video', 'url': 'http://www.history.com/topics/valentines-day/history-of-valentines-day/videos/bet-you-didnt-know-valentines-day?m=528e394da93ae&s=undefined&f=1&free=false',
'info_dict': { 'info_dict': {
'id': '40700995724', 'id': '40700995724',
'ext': 'mp4', 'ext': 'mp4',
'title': "History of Valentines Day", 'title': "Bet You Didn't Know: Valentine's Day",
'description': 'md5:7b57ea4829b391995b405fa60bd7b5f7', 'description': 'md5:7b57ea4829b391995b405fa60bd7b5f7',
'timestamp': 1375819729, 'timestamp': 1375819729,
'upload_date': '20130806', 'upload_date': '20130806',
'uploader': 'AENE-NEW',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
}, },
'add_ie': ['ThePlatform'], 'add_ie': ['ThePlatform'],
}, {
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history/videos',
'info_dict':
{
'id': 'world-war-i-history',
'title': 'World War I History',
},
'playlist_mincount': 23,
}, {
'url': 'http://www.history.com/topics/world-war-i-history/videos',
'only_matching': True,
}, {
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history',
'only_matching': True,
}, {
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history/speeches',
'only_matching': True,
}] }]
def theplatform_url_result(self, theplatform_url, video_id, query): def theplatform_url_result(self, theplatform_url, video_id, query):
@ -229,19 +210,27 @@ class HistoryTopicIE(AENetworksBaseIE):
} }
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) topic_id, video_display_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, display_id) if video_display_id:
video_id = self._search_regex( webpage = self._download_webpage(url, video_display_id)
r'<phoenix-iframe[^>]+src="[^"]+\btpid=(\d+)', webpage, 'tpid') release_url, video_id = re.search(r"_videoPlayer.play\('([^']+)'\s*,\s*'[^']+'\s*,\s*'(\d+)'\)", webpage).groups()
result = self._download_json( release_url = unescapeHTML(release_url)
'https://feeds.video.aetnd.com/api/v2/history/videos',
video_id, query={'filter[id]': video_id})['results'][0] return self.theplatform_url_result(
title = result['title'] release_url, video_id, {
info = self._extract_aen_smil(result['publicUrl'], video_id) 'mbr': 'true',
info.update({ 'switch': 'hls',
'title': title, 'assetTypes': 'high_video_ak',
'description': result.get('description'), })
'duration': int_or_none(result.get('duration')), else:
'timestamp': int_or_none(result.get('added'), 1000), webpage = self._download_webpage(url, topic_id)
}) entries = []
return info for episode_item in re.findall(r'<a.+?data-release-url="[^"]+"[^>]*>', webpage):
video_attributes = extract_attributes(episode_item)
entries.append(self.theplatform_url_result(
video_attributes['data-release-url'], video_attributes['data-id'], {
'mbr': 'true',
'switch': 'hls',
'assetTypes': 'high_video_ak',
}))
return self.playlist_result(entries, topic_id, get_element_by_attribute('class', 'show-title', webpage))

View file

@ -9,8 +9,6 @@ from ..utils import (
determine_ext, determine_ext,
ExtractorError, ExtractorError,
int_or_none, int_or_none,
url_or_none,
urlencode_postdata,
xpath_text, xpath_text,
) )
@ -30,7 +28,6 @@ class AfreecaTVIE(InfoExtractor):
) )
(?P<id>\d+) (?P<id>\d+)
''' '''
_NETRC_MACHINE = 'afreecatv'
_TESTS = [{ _TESTS = [{
'url': 'http://live.afreecatv.com:8079/app/index.cgi?szType=read_ucc_bbs&szBjId=dailyapril&nStationNo=16711924&nBbsNo=18605867&nTitleNo=36164052&szSkin=', 'url': 'http://live.afreecatv.com:8079/app/index.cgi?szType=read_ucc_bbs&szBjId=dailyapril&nStationNo=16711924&nBbsNo=18605867&nTitleNo=36164052&szSkin=',
'md5': 'f72c89fe7ecc14c1b5ce506c4996046e', 'md5': 'f72c89fe7ecc14c1b5ce506c4996046e',
@ -142,22 +139,22 @@ class AfreecaTVIE(InfoExtractor):
'skip_download': True, 'skip_download': True,
}, },
}, { }, {
# PARTIAL_ADULT # adult video
'url': 'http://vod.afreecatv.com/PLAYER/STATION/32028439', 'url': 'http://vod.afreecatv.com/PLAYER/STATION/26542731',
'info_dict': { 'info_dict': {
'id': '20180327_27901457_202289533_1', 'id': '20171001_F1AE1711_196617479_1',
'ext': 'mp4', 'ext': 'mp4',
'title': '[생]빨개요♥ (part 1)', 'title': '[생]서아 초심 찾기 방송 (part 1)',
'thumbnail': 're:^https?://(?:video|st)img.afreecatv.com/.*$', 'thumbnail': 're:^https?://(?:video|st)img.afreecatv.com/.*$',
'uploader': '[SA]서아', 'uploader': 'BJ서아',
'uploader_id': 'bjdyrksu', 'uploader_id': 'bjdyrksu',
'upload_date': '20180327', 'upload_date': '20171001',
'duration': 3601, 'duration': 3600,
'age_limit': 18,
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
'expected_warnings': ['adult content'],
}, { }, {
'url': 'http://www.afreecatv.com/player/Player.swf?szType=szBjId=djleegoon&nStationNo=11273158&nBbsNo=13161095&nTitleNo=36327652', 'url': 'http://www.afreecatv.com/player/Player.swf?szType=szBjId=djleegoon&nStationNo=11273158&nBbsNo=13161095&nTitleNo=36327652',
'only_matching': True, 'only_matching': True,
@ -175,107 +172,25 @@ class AfreecaTVIE(InfoExtractor):
video_key['part'] = int(m.group('part')) video_key['part'] = int(m.group('part'))
return video_key return video_key
def _real_initialize(self):
self._login()
def _login(self):
username, password = self._get_login_info()
if username is None:
return
login_form = {
'szWork': 'login',
'szType': 'json',
'szUid': username,
'szPassword': password,
'isSaveId': 'false',
'szScriptVar': 'oLoginRet',
'szAction': '',
}
response = self._download_json(
'https://login.afreecatv.com/app/LoginAction.php', None,
'Logging in', data=urlencode_postdata(login_form))
_ERRORS = {
-4: 'Your account has been suspended due to a violation of our terms and policies.',
-5: 'https://member.afreecatv.com/app/user_delete_progress.php',
-6: 'https://login.afreecatv.com/membership/changeMember.php',
-8: "Hello! AfreecaTV here.\nThe username you have entered belongs to \n an account that requires a legal guardian's consent. \nIf you wish to use our services without restriction, \nplease make sure to go through the necessary verification process.",
-9: 'https://member.afreecatv.com/app/pop_login_block.php',
-11: 'https://login.afreecatv.com/afreeca/second_login.php',
-12: 'https://member.afreecatv.com/app/user_security.php',
0: 'The username does not exist or you have entered the wrong password.',
-1: 'The username does not exist or you have entered the wrong password.',
-3: 'You have entered your username/password incorrectly.',
-7: 'You cannot use your Global AfreecaTV account to access Korean AfreecaTV.',
-10: 'Sorry for the inconvenience. \nYour account has been blocked due to an unauthorized access. \nPlease contact our Help Center for assistance.',
-32008: 'You have failed to log in. Please contact our Help Center.',
}
result = int_or_none(response.get('RESULT'))
if result != 1:
error = _ERRORS.get(result, 'You have failed to log in.')
raise ExtractorError(
'Unable to login: %s said: %s' % (self.IE_NAME, error),
expected=True)
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id) video_xml = self._download_xml(
'http://afbbs.afreecatv.com:8080/api/video/get_video_info.php',
if re.search(r'alert\(["\']This video has been deleted', webpage): video_id, query={
raise ExtractorError(
'Video %s has been deleted' % video_id, expected=True)
station_id = self._search_regex(
r'nStationNo\s*=\s*(\d+)', webpage, 'station')
bbs_id = self._search_regex(
r'nBbsNo\s*=\s*(\d+)', webpage, 'bbs')
video_id = self._search_regex(
r'nTitleNo\s*=\s*(\d+)', webpage, 'title', default=video_id)
partial_view = False
for _ in range(2):
query = {
'nTitleNo': video_id, 'nTitleNo': video_id,
'nStationNo': station_id, 'partialView': 'SKIP_ADULT',
'nBbsNo': bbs_id, })
}
if partial_view:
query['partialView'] = 'SKIP_ADULT'
video_xml = self._download_xml(
'http://afbbs.afreecatv.com:8080/api/video/get_video_info.php',
video_id, 'Downloading video info XML%s'
% (' (skipping adult)' if partial_view else ''),
video_id, headers={
'Referer': url,
}, query=query)
flag = xpath_text(video_xml, './track/flag', 'flag', default=None) flag = xpath_text(video_xml, './track/flag', 'flag', default=None)
if flag and flag == 'SUCCEED': if flag and flag != 'SUCCEED':
break
if flag == 'PARTIAL_ADULT':
self._downloader.report_warning(
'In accordance with local laws and regulations, underage users are restricted from watching adult content. '
'Only content suitable for all ages will be downloaded. '
'Provide account credentials if you wish to download restricted content.')
partial_view = True
continue
elif flag == 'ADULT':
error = 'Only users older than 19 are able to watch this video. Provide account credentials to download this content.'
else:
error = flag
raise ExtractorError( raise ExtractorError(
'%s said: %s' % (self.IE_NAME, error), expected=True) '%s said: %s' % (self.IE_NAME, flag), expected=True)
else:
raise ExtractorError('Unable to download video info')
video_element = video_xml.findall(compat_xpath('./track/video'))[-1] video_element = video_xml.findall(compat_xpath('./track/video'))[1]
if video_element is None or video_element.text is None: if video_element is None or video_element.text is None:
raise ExtractorError( raise ExtractorError('Specified AfreecaTV video does not exist',
'Video %s does not exist' % video_id, expected=True) expected=True)
video_url = video_element.text.strip() video_url = video_element.text.strip()
@ -305,7 +220,7 @@ class AfreecaTVIE(InfoExtractor):
file_elements = video_element.findall(compat_xpath('./file')) file_elements = video_element.findall(compat_xpath('./file'))
one = len(file_elements) == 1 one = len(file_elements) == 1
for file_num, file_element in enumerate(file_elements, start=1): for file_num, file_element in enumerate(file_elements, start=1):
file_url = url_or_none(file_element.text) file_url = file_element.text
if not file_url: if not file_url:
continue continue
key = file_element.get('key', '') key = file_element.get('key', '')

View file

@ -11,7 +11,7 @@ from ..utils import (
class AMCNetworksIE(ThePlatformIE): class AMCNetworksIE(ThePlatformIE):
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|(?:we|sundance)tv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)' _VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1', 'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
'md5': '', 'md5': '',
@ -51,9 +51,6 @@ class AMCNetworksIE(ThePlatformIE):
}, { }, {
'url': 'http://www.wetv.com/shows/la-hair/videos/season-05/episode-09-episode-9-2/episode-9-sneak-peek-3', 'url': 'http://www.wetv.com/shows/la-hair/videos/season-05/episode-09-episode-9-2/episode-9-sneak-peek-3',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.sundancetv.com/shows/riviera/full-episodes/season-1/episode-01-episode-1',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):

37
youtube_dl/extractor/americastestkitchen.py Normal file → Executable file
View file

@ -5,7 +5,6 @@ from .common import InfoExtractor
from ..utils import ( from ..utils import (
clean_html, clean_html,
int_or_none, int_or_none,
js_to_json,
try_get, try_get,
unified_strdate, unified_strdate,
) )
@ -14,21 +13,22 @@ from ..utils import (
class AmericasTestKitchenIE(InfoExtractor): class AmericasTestKitchenIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?americastestkitchen\.com/(?:episode|videos)/(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?americastestkitchen\.com/(?:episode|videos)/(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.americastestkitchen.com/episode/582-weeknight-japanese-suppers', 'url': 'https://www.americastestkitchen.com/episode/548-summer-dinner-party',
'md5': 'b861c3e365ac38ad319cfd509c30577f', 'md5': 'b861c3e365ac38ad319cfd509c30577f',
'info_dict': { 'info_dict': {
'id': '5b400b9ee338f922cb06450c', 'id': '1_5g5zua6e',
'title': 'Weeknight Japanese Suppers', 'title': 'Summer Dinner Party',
'ext': 'mp4', 'ext': 'mp4',
'description': 'md5:3d0c1a44bb3b27607ce82652db25b4a8', 'description': 'md5:858d986e73a4826979b6a5d9f8f6a1ec',
'thumbnail': r're:^https?://', 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1523664000, 'timestamp': 1497285541,
'upload_date': '20180414', 'upload_date': '20170612',
'release_date': '20180414', 'uploader_id': 'roger.metcalf@americastestkitchen.com',
'release_date': '20170617',
'series': "America's Test Kitchen", 'series': "America's Test Kitchen",
'season_number': 18, 'season_number': 17,
'episode': 'Weeknight Japanese Suppers', 'episode': 'Summer Dinner Party',
'episode_number': 15, 'episode_number': 24,
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
@ -43,19 +43,22 @@ class AmericasTestKitchenIE(InfoExtractor):
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
partner_id = self._search_regex(
r'src=["\'](?:https?:)?//(?:[^/]+\.)kaltura\.com/(?:[^/]+/)*(?:p|partner_id)/(\d+)',
webpage, 'kaltura partner id')
video_data = self._parse_json( video_data = self._parse_json(
self._search_regex( self._search_regex(
r'window\.__INITIAL_STATE__\s*=\s*({.+?})\s*;\s*</script>', r'window\.__INITIAL_STATE__\s*=\s*({.+?})\s*;\s*</script>',
webpage, 'initial context'), webpage, 'initial context'),
video_id, js_to_json) video_id)
ep_data = try_get( ep_data = try_get(
video_data, video_data,
(lambda x: x['episodeDetail']['content']['data'], (lambda x: x['episodeDetail']['content']['data'],
lambda x: x['videoDetail']['content']['data']), dict) lambda x: x['videoDetail']['content']['data']), dict)
ep_meta = ep_data.get('full_video', {}) ep_meta = ep_data.get('full_video', {})
external_id = ep_data.get('external_id') or ep_meta['external_id']
zype_id = ep_data.get('zype_id') or ep_meta['zype_id']
title = ep_data.get('title') or ep_meta.get('title') title = ep_data.get('title') or ep_meta.get('title')
description = clean_html(ep_meta.get('episode_description') or ep_data.get( description = clean_html(ep_meta.get('episode_description') or ep_data.get(
@ -69,8 +72,8 @@ class AmericasTestKitchenIE(InfoExtractor):
return { return {
'_type': 'url_transparent', '_type': 'url_transparent',
'url': 'https://player.zype.com/embed/%s.js?api_key=jZ9GUhRmxcPvX7M3SlfejB6Hle9jyHTdk2jVxG7wOHPLODgncEKVdPYBhuz9iWXQ' % zype_id, 'url': 'kaltura:%s:%s' % (partner_id, external_id),
'ie_key': 'Zype', 'ie_key': 'Kaltura',
'title': title, 'title': title,
'description': description, 'description': description,
'thumbnail': thumbnail, 'thumbnail': thumbnail,

View file

@ -3,12 +3,11 @@ from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
int_or_none,
parse_iso8601,
mimetype2ext,
determine_ext, determine_ext,
ExtractorError, ExtractorError,
int_or_none,
mimetype2ext,
parse_iso8601,
url_or_none,
) )
@ -36,7 +35,7 @@ class AMPIE(InfoExtractor):
media_thumbnail = [media_thumbnail] media_thumbnail = [media_thumbnail]
for thumbnail_data in media_thumbnail: for thumbnail_data in media_thumbnail:
thumbnail = thumbnail_data.get('@attributes', {}) thumbnail = thumbnail_data.get('@attributes', {})
thumbnail_url = url_or_none(thumbnail.get('url')) thumbnail_url = thumbnail.get('url')
if not thumbnail_url: if not thumbnail_url:
continue continue
thumbnails.append({ thumbnails.append({
@ -52,7 +51,7 @@ class AMPIE(InfoExtractor):
media_subtitle = [media_subtitle] media_subtitle = [media_subtitle]
for subtitle_data in media_subtitle: for subtitle_data in media_subtitle:
subtitle = subtitle_data.get('@attributes', {}) subtitle = subtitle_data.get('@attributes', {})
subtitle_href = url_or_none(subtitle.get('href')) subtitle_href = subtitle.get('href')
if not subtitle_href: if not subtitle_href:
continue continue
subtitles.setdefault(subtitle.get('lang') or 'en', []).append({ subtitles.setdefault(subtitle.get('lang') or 'en', []).append({
@ -66,7 +65,7 @@ class AMPIE(InfoExtractor):
media_content = [media_content] media_content = [media_content]
for media_data in media_content: for media_data in media_content:
media = media_data.get('@attributes', {}) media = media_data.get('@attributes', {})
media_url = url_or_none(media.get('url')) media_url = media.get('url')
if not media_url: if not media_url:
continue continue
ext = mimetype2ext(media.get('type')) or determine_ext(media_url) ext = mimetype2ext(media.get('type')) or determine_ext(media_url)
@ -80,7 +79,7 @@ class AMPIE(InfoExtractor):
else: else:
formats.append({ formats.append({
'format_id': media_data.get('media-category', {}).get('@attributes', {}).get('label'), 'format_id': media_data.get('media-category', {}).get('@attributes', {}).get('label'),
'url': media_url, 'url': media['url'],
'tbr': int_or_none(media.get('bitrate')), 'tbr': int_or_none(media.get('bitrate')),
'filesize': int_or_none(media.get('fileSize')), 'filesize': int_or_none(media.get('fileSize')),
'ext': ext, 'ext': ext,

View file

@ -8,7 +8,6 @@ from ..utils import (
determine_ext, determine_ext,
extract_attributes, extract_attributes,
ExtractorError, ExtractorError,
url_or_none,
urlencode_postdata, urlencode_postdata,
urljoin, urljoin,
) )
@ -53,7 +52,7 @@ class AnimeOnDemandIE(InfoExtractor):
}] }]
def _login(self): def _login(self):
username, password = self._get_login_info() (username, password) = self._get_login_info()
if username is None: if username is None:
return return
@ -166,7 +165,7 @@ class AnimeOnDemandIE(InfoExtractor):
}, fatal=False) }, fatal=False)
if not playlist: if not playlist:
continue continue
stream_url = url_or_none(playlist.get('streamurl')) stream_url = playlist.get('streamurl')
if stream_url: if stream_url:
rtmp = re.search( rtmp = re.search(
r'^(?P<url>rtmpe?://(?P<host>[^/]+)/(?P<app>.+/))(?P<playpath>mp[34]:.+)', r'^(?P<url>rtmpe?://(?P<host>[^/]+)/(?P<app>.+/))(?P<playpath>mp[34]:.+)',

View file

@ -0,0 +1,30 @@
from __future__ import unicode_literals
from .nuevo import NuevoBaseIE
class AnitubeIE(NuevoBaseIE):
IE_NAME = 'anitube.se'
_VALID_URL = r'https?://(?:www\.)?anitube\.se/video/(?P<id>\d+)'
_TEST = {
'url': 'http://www.anitube.se/video/36621',
'md5': '59d0eeae28ea0bc8c05e7af429998d43',
'info_dict': {
'id': '36621',
'ext': 'mp4',
'title': 'Recorder to Randoseru 01',
'duration': 180.19,
},
'skip': 'Blocked in the US',
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
key = self._search_regex(
r'src=["\']https?://[^/]+/embed/([A-Za-z0-9_-]+)', webpage, 'key')
return self._extract_nuevo(
'http://www.anitube.se/nuevo/econfig.php?key=%s' % key, video_id)

View file

@ -134,33 +134,9 @@ class AnvatoIE(InfoExtractor):
'telemundo': 'anvato_mcp_telemundo_web_prod_c5278d51ad46fda4b6ca3d0ea44a7846a054f582' 'telemundo': 'anvato_mcp_telemundo_web_prod_c5278d51ad46fda4b6ca3d0ea44a7846a054f582'
} }
_API_KEY = '3hwbSuqqT690uxjNYBktSQpa5ZrpYYR0Iofx7NcJHyA'
_ANVP_RE = r'<script[^>]+\bdata-anvp\s*=\s*(["\'])(?P<anvp>(?:(?!\1).)+)\1' _ANVP_RE = r'<script[^>]+\bdata-anvp\s*=\s*(["\'])(?P<anvp>(?:(?!\1).)+)\1'
_AUTH_KEY = b'\x31\xc2\x42\x84\x9e\x73\xa0\xce' _AUTH_KEY = b'\x31\xc2\x42\x84\x9e\x73\xa0\xce'
_TESTS = [{
# from https://www.boston25news.com/news/watch-humpback-whale-breaches-right-next-to-fishing-boat-near-nh/817484874
'url': 'anvato:8v9BEynrwx8EFLYpgfOWcG1qJqyXKlRM:4465496',
'info_dict': {
'id': '4465496',
'ext': 'mp4',
'title': 'VIDEO: Humpback whale breaches right next to NH boat',
'description': 'VIDEO: Humpback whale breaches right next to NH boat. Footage courtesy: Zach Fahey.',
'duration': 22,
'timestamp': 1534855680,
'upload_date': '20180821',
'uploader': 'ANV',
},
'params': {
'skip_download': True,
},
}, {
# from https://sanfrancisco.cbslocal.com/2016/06/17/source-oakland-cop-on-leave-for-having-girlfriend-help-with-police-reports/
'url': 'anvato:DVzl9QRzox3ZZsP9bNu5Li3X7obQOnqP:3417601',
'only_matching': True,
}]
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
super(AnvatoIE, self).__init__(*args, **kwargs) super(AnvatoIE, self).__init__(*args, **kwargs)
self.__server_time = None self.__server_time = None
@ -193,8 +169,7 @@ class AnvatoIE(InfoExtractor):
'api': { 'api': {
'anvrid': anvrid, 'anvrid': anvrid,
'anvstk': md5_text('%s|%s|%d|%s' % ( 'anvstk': md5_text('%s|%s|%d|%s' % (
access_key, anvrid, server_time, access_key, anvrid, server_time, self._ANVACK_TABLE[access_key])),
self._ANVACK_TABLE.get(access_key, self._API_KEY))),
'anvts': server_time, 'anvts': server_time,
}, },
} }
@ -302,13 +277,10 @@ class AnvatoIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {}) url, smuggled_data = unsmuggle_url(url, {})
self._initialize_geo_bypass({ self._initialize_geo_bypass(smuggled_data.get('geo_countries'))
'countries': smuggled_data.get('geo_countries'),
})
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
access_key, video_id = mobj.group('access_key_or_mcp', 'id') access_key, video_id = mobj.group('access_key_or_mcp', 'id')
if access_key not in self._ANVACK_TABLE: if access_key not in self._ANVACK_TABLE:
access_key = self._MCP_TO_ACCESS_KEY_TABLE.get( access_key = self._MCP_TO_ACCESS_KEY_TABLE[access_key]
access_key) or access_key
return self._get_anvato_videos(access_key, video_id) return self._get_anvato_videos(access_key, video_id)

View file

@ -0,0 +1,61 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
parse_duration,
int_or_none,
)
class AnySexIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?anysex\.com/(?P<id>\d+)'
_TEST = {
'url': 'http://anysex.com/156592/',
'md5': '023e9fbb7f7987f5529a394c34ad3d3d',
'info_dict': {
'id': '156592',
'ext': 'mp4',
'title': 'Busty and sexy blondie in her bikini strips for you',
'description': 'md5:de9e418178e2931c10b62966474e1383',
'categories': ['Erotic'],
'duration': 270,
'age_limit': 18,
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id)
video_url = self._html_search_regex(r"video_url\s*:\s*'([^']+)'", webpage, 'video URL')
title = self._html_search_regex(r'<title>(.*?)</title>', webpage, 'title')
description = self._html_search_regex(
r'<div class="description"[^>]*>([^<]+)</div>', webpage, 'description', fatal=False)
thumbnail = self._html_search_regex(
r'preview_url\s*:\s*\'(.*?)\'', webpage, 'thumbnail', fatal=False)
categories = re.findall(
r'<a href="http://anysex\.com/categories/[^"]+" title="[^"]*">([^<]+)</a>', webpage)
duration = parse_duration(self._search_regex(
r'<b>Duration:</b> (?:<q itemprop="duration">)?(\d+:\d+)', webpage, 'duration', fatal=False))
view_count = int_or_none(self._html_search_regex(
r'<b>Views:</b> (\d+)', webpage, 'view count', fatal=False))
return {
'id': video_id,
'url': video_url,
'ext': 'mp4',
'title': title,
'description': description,
'thumbnail': thumbnail,
'categories': categories,
'duration': duration,
'view_count': view_count,
'age_limit': 18,
}

View file

@ -4,24 +4,19 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import (
compat_parse_qs,
compat_urllib_parse_urlparse,
)
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
int_or_none, int_or_none,
url_or_none,
) )
class AolIE(InfoExtractor): class AolIE(InfoExtractor):
IE_NAME = 'aol.com' IE_NAME = 'on.aol.com'
_VALID_URL = r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)(?P<id>[0-9a-f]+)' _VALID_URL = r'(?:aol-video:|https?://(?:(?:www|on)\.)?aol\.com/(?:[^/]+/)*(?:[^/?#&]+-)?)(?P<id>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
# video with 5min ID # video with 5min ID
'url': 'https://www.aol.com/video/view/u-s--official-warns-of-largest-ever-irs-phone-scam/518167793/', 'url': 'http://on.aol.com/video/u-s--official-warns-of-largest-ever-irs-phone-scam-518167793?icid=OnHomepageC2Wide_MustSee_Img',
'md5': '18ef68f48740e86ae94b98da815eec42', 'md5': '18ef68f48740e86ae94b98da815eec42',
'info_dict': { 'info_dict': {
'id': '518167793', 'id': '518167793',
@ -38,7 +33,7 @@ class AolIE(InfoExtractor):
} }
}, { }, {
# video with vidible ID # video with vidible ID
'url': 'https://www.aol.com/video/view/netflix-is-raising-rates/5707d6b8e4b090497b04f706/', 'url': 'http://www.aol.com/video/view/netflix-is-raising-rates/5707d6b8e4b090497b04f706/',
'info_dict': { 'info_dict': {
'id': '5707d6b8e4b090497b04f706', 'id': '5707d6b8e4b090497b04f706',
'ext': 'mp4', 'ext': 'mp4',
@ -53,29 +48,17 @@ class AolIE(InfoExtractor):
'skip_download': True, 'skip_download': True,
} }
}, { }, {
'url': 'https://www.aol.com/video/view/park-bench-season-2-trailer/559a1b9be4b0c3bfad3357a7/', 'url': 'http://on.aol.com/partners/abc-551438d309eab105804dbfe8/sneak-peek-was-haley-really-framed-570eaebee4b0448640a5c944',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'https://www.aol.com/video/view/donald-trump-spokeswoman-tones-down-megyn-kelly-attacks/519442220/', 'url': 'http://on.aol.com/shows/park-bench-shw518173474-559a1b9be4b0c3bfad3357a7?context=SH:SHW518173474:PL4327:1460619712763',
'only_matching': True,
}, {
'url': 'http://on.aol.com/video/519442220',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'aol-video:5707d6b8e4b090497b04f706', 'url': 'aol-video:5707d6b8e4b090497b04f706',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.aol.com/video/playlist/PL8245/5ca79d19d21f1a04035db606/',
'only_matching': True,
}, {
'url': 'https://www.aol.ca/video/view/u-s-woman-s-family-arrested-for-murder-first-pinned-on-panhandler-police/5c7ccf45bc03931fa04b2fe1/',
'only_matching': True,
}, {
'url': 'https://www.aol.co.uk/video/view/-one-dead-and-22-hurt-in-bus-crash-/5cb3a6f3d21f1a072b457347/',
'only_matching': True,
}, {
'url': 'https://www.aol.de/video/view/eva-braun-privataufnahmen-von-hitlers-geliebter-werden-digitalisiert/5cb2d49de98ab54c113d3d5d/',
'only_matching': True,
}, {
'url': 'https://www.aol.jp/video/playlist/5a28e936a1334d000137da0c/5a28f3151e642219fde19831/',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@ -89,12 +72,12 @@ class AolIE(InfoExtractor):
video_data = response['data'] video_data = response['data']
formats = [] formats = []
m3u8_url = url_or_none(video_data.get('videoMasterPlaylist')) m3u8_url = video_data.get('videoMasterPlaylist')
if m3u8_url: if m3u8_url:
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', m3u8_id='hls', fatal=False)) m3u8_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
for rendition in video_data.get('renditions', []): for rendition in video_data.get('renditions', []):
video_url = url_or_none(rendition.get('url')) video_url = rendition.get('url')
if not video_url: if not video_url:
continue continue
ext = rendition.get('format') ext = rendition.get('format')
@ -112,12 +95,6 @@ class AolIE(InfoExtractor):
'width': int(mobj.group(1)), 'width': int(mobj.group(1)),
'height': int(mobj.group(2)), 'height': int(mobj.group(2)),
}) })
else:
qs = compat_parse_qs(compat_urllib_parse_urlparse(video_url).query)
f.update({
'width': int_or_none(qs.get('w', [None])[0]),
'height': int_or_none(qs.get('h', [None])[0]),
})
formats.append(f) formats.append(f)
self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id')) self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id'))

View file

@ -1,94 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
determine_ext,
js_to_json,
url_or_none,
)
class APAIE(InfoExtractor):
_VALID_URL = r'https?://[^/]+\.apa\.at/embed/(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'
_TESTS = [{
'url': 'http://uvp.apa.at/embed/293f6d17-692a-44e3-9fd5-7b178f3a1029',
'md5': '2b12292faeb0a7d930c778c7a5b4759b',
'info_dict': {
'id': 'jjv85FdZ',
'ext': 'mp4',
'title': '"Blau ist mysteriös": Die Blue Man Group im Interview',
'description': 'md5:d41d8cd98f00b204e9800998ecf8427e',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 254,
'timestamp': 1519211149,
'upload_date': '20180221',
},
}, {
'url': 'https://uvp-apapublisher.sf.apa.at/embed/2f94e9e6-d945-4db2-9548-f9a41ebf7b78',
'only_matching': True,
}, {
'url': 'http://uvp-rma.sf.apa.at/embed/70404cca-2f47-4855-bbb8-20b1fae58f76',
'only_matching': True,
}, {
'url': 'http://uvp-kleinezeitung.sf.apa.at/embed/f1c44979-dba2-4ebf-b021-e4cf2cac3c81',
'only_matching': True,
}]
@staticmethod
def _extract_urls(webpage):
return [
mobj.group('url')
for mobj in re.finditer(
r'<iframe[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//[^/]+\.apa\.at/embed/[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12}.*?)\1',
webpage)]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
jwplatform_id = self._search_regex(
r'media[iI]d\s*:\s*["\'](?P<id>[a-zA-Z0-9]{8})', webpage,
'jwplatform id', default=None)
if jwplatform_id:
return self.url_result(
'jwplatform:' + jwplatform_id, ie='JWPlatform',
video_id=video_id)
sources = self._parse_json(
self._search_regex(
r'sources\s*=\s*(\[.+?\])\s*;', webpage, 'sources'),
video_id, transform_source=js_to_json)
formats = []
for source in sources:
if not isinstance(source, dict):
continue
source_url = url_or_none(source.get('file'))
if not source_url:
continue
ext = determine_ext(source_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
source_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
else:
formats.append({
'url': source_url,
})
self._sort_formats(formats)
thumbnail = self._search_regex(
r'image\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1', webpage,
'thumbnail', fatal=False, group='url')
return {
'id': video_id,
'title': video_id,
'thumbnail': thumbnail,
'formats': formats,
}

View file

@ -4,92 +4,66 @@ from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
int_or_none, int_or_none,
merge_dicts,
mimetype2ext, mimetype2ext,
url_or_none,
) )
class AparatIE(InfoExtractor): class AparatIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?aparat\.com/(?:v/|video/video/embed/videohash/)(?P<id>[a-zA-Z0-9]+)' _VALID_URL = r'https?://(?:www\.)?aparat\.com/(?:v/|video/video/embed/videohash/)(?P<id>[a-zA-Z0-9]+)'
_TESTS = [{ _TEST = {
'url': 'http://www.aparat.com/v/wP8On', 'url': 'http://www.aparat.com/v/wP8On',
'md5': '131aca2e14fe7c4dcb3c4877ba300c89', 'md5': '131aca2e14fe7c4dcb3c4877ba300c89',
'info_dict': { 'info_dict': {
'id': 'wP8On', 'id': 'wP8On',
'ext': 'mp4', 'ext': 'mp4',
'title': 'تیم گلکسی 11 - زومیت', 'title': 'تیم گلکسی 11 - زومیت',
'description': 'md5:096bdabcdcc4569f2b8a5e903a3b3028', 'age_limit': 0,
'duration': 231,
'timestamp': 1387394859,
'upload_date': '20131218',
'view_count': int,
}, },
}, { # 'skip': 'Extremely unreliable',
# multiple formats }
'url': 'https://www.aparat.com/v/8dflw/',
'only_matching': True,
}]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
# Provides more metadata # Note: There is an easier-to-parse configuration at
webpage = self._download_webpage(url, video_id, fatal=False) # http://www.aparat.com/video/video/config/videohash/%video_id
# but the URL in there does not work
if not webpage: webpage = self._download_webpage(
# Note: There is an easier-to-parse configuration at 'http://www.aparat.com/video/video/embed/vt/frame/showvideo/yes/videohash/' + video_id,
# http://www.aparat.com/video/video/config/videohash/%video_id
# but the URL in there does not work
webpage = self._download_webpage(
'http://www.aparat.com/video/video/embed/vt/frame/showvideo/yes/videohash/' + video_id,
video_id)
options = self._parse_json(
self._search_regex(
r'options\s*=\s*JSON\.parse\(\s*(["\'])(?P<value>(?:(?!\1).)+)\1\s*\)',
webpage, 'options', group='value'),
video_id) video_id)
player = options['plugins']['sabaPlayerPlugin'] title = self._search_regex(r'\s+title:\s*"([^"]+)"', webpage, 'title')
file_list = self._parse_json(
self._search_regex(
r'fileList\s*=\s*JSON\.parse\(\'([^\']+)\'\)', webpage,
'file list'),
video_id)
formats = [] formats = []
for sources in player['multiSRC']: for item in file_list[0]:
for item in sources: file_url = item.get('file')
if not isinstance(item, dict): if not file_url:
continue continue
file_url = url_or_none(item.get('src')) ext = mimetype2ext(item.get('type'))
if not file_url: label = item.get('label')
continue formats.append({
item_type = item.get('type') 'url': file_url,
if item_type == 'application/vnd.apple.mpegurl': 'ext': ext,
formats.extend(self._extract_m3u8_formats( 'format_id': label or ext,
file_url, video_id, 'mp4', 'height': int_or_none(self._search_regex(
entry_protocol='m3u8_native', m3u8_id='hls', r'(\d+)[pP]', label or '', 'height', default=None)),
fatal=False)) })
else: self._sort_formats(formats)
ext = mimetype2ext(item.get('type'))
label = item.get('label')
formats.append({
'url': file_url,
'ext': ext,
'format_id': 'http-%s' % (label or ext),
'height': int_or_none(self._search_regex(
r'(\d+)[pP]', label or '', 'height',
default=None)),
})
self._sort_formats(
formats, field_preference=('height', 'width', 'tbr', 'format_id'))
info = self._search_json_ld(webpage, video_id, default={}) thumbnail = self._search_regex(
r'image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)
if not info.get('title'): return {
info['title'] = player['title']
return merge_dicts(info, {
'id': video_id, 'id': video_id,
'thumbnail': url_or_none(options.get('poster')), 'title': title,
'duration': int_or_none(player.get('duration')), 'thumbnail': thumbnail,
'age_limit': self._family_friendly_search(webpage),
'formats': formats, 'formats': formats,
}) }

View file

@ -41,7 +41,7 @@ class ArchiveOrgIE(InfoExtractor):
webpage = self._download_webpage( webpage = self._download_webpage(
'http://archive.org/embed/' + video_id, video_id) 'http://archive.org/embed/' + video_id, video_id)
jwplayer_playlist = self._parse_json(self._search_regex( jwplayer_playlist = self._parse_json(self._search_regex(
r"(?s)Play\('[^']+'\s*,\s*(\[.+\])\s*,\s*{.*?}\)", r"(?s)Play\('[^']+'\s*,\s*(\[.+\])\s*,\s*{.*?}\);",
webpage, 'jwplayer playlist'), video_id) webpage, 'jwplayer playlist'), video_id)
info = self._parse_jwplayer_data( info = self._parse_jwplayer_data(
{'playlist': jwplayer_playlist}, video_id, base_url=url) {'playlist': jwplayer_playlist}, video_id, base_url=url)

View file

@ -1,50 +1,101 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import json
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from .generic import GenericIE from .generic import GenericIE
from ..compat import compat_str
from ..utils import ( from ..utils import (
determine_ext, determine_ext,
ExtractorError, ExtractorError,
qualities,
int_or_none, int_or_none,
parse_duration, parse_duration,
qualities,
str_or_none,
try_get,
unified_strdate, unified_strdate,
unified_timestamp,
update_url_query,
url_or_none,
xpath_text, xpath_text,
update_url_query,
) )
from ..compat import compat_etree_fromstring from ..compat import compat_etree_fromstring
class ARDMediathekBaseIE(InfoExtractor): class ARDMediathekIE(InfoExtractor):
_GEO_COUNTRIES = ['DE'] IE_NAME = 'ARD:mediathek'
_VALID_URL = r'^https?://(?:(?:www\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?'
_TESTS = [{
'url': 'http://www.ardmediathek.de/tv/Dokumentation-und-Reportage/Ich-liebe-das-Leben-trotzdem/rbb-Fernsehen/Video?documentId=29582122&bcastId=3822114',
'info_dict': {
'id': '29582122',
'ext': 'mp4',
'title': 'Ich liebe das Leben trotzdem',
'description': 'md5:45e4c225c72b27993314b31a84a5261c',
'duration': 4557,
},
'params': {
# m3u8 download
'skip_download': True,
},
'skip': 'HTTP Error 404: Not Found',
}, {
'url': 'http://www.ardmediathek.de/tv/Tatort/Tatort-Scheinwelten-H%C3%B6rfassung-Video/Das-Erste/Video?documentId=29522730&bcastId=602916',
'md5': 'f4d98b10759ac06c0072bbcd1f0b9e3e',
'info_dict': {
'id': '29522730',
'ext': 'mp4',
'title': 'Tatort: Scheinwelten - Hörfassung (Video tgl. ab 20 Uhr)',
'description': 'md5:196392e79876d0ac94c94e8cdb2875f1',
'duration': 5252,
},
'skip': 'HTTP Error 404: Not Found',
}, {
# audio
'url': 'http://www.ardmediathek.de/tv/WDR-H%C3%B6rspiel-Speicher/Tod-eines-Fu%C3%9Fballers/WDR-3/Audio-Podcast?documentId=28488308&bcastId=23074086',
'md5': '219d94d8980b4f538c7fcb0865eb7f2c',
'info_dict': {
'id': '28488308',
'ext': 'mp3',
'title': 'Tod eines Fußballers',
'description': 'md5:f6e39f3461f0e1f54bfa48c8875c86ef',
'duration': 3240,
},
'skip': 'HTTP Error 404: Not Found',
}, {
'url': 'http://mediathek.daserste.de/sendungen_a-z/328454_anne-will/22429276_vertrauen-ist-gut-spionieren-ist-besser-geht',
'only_matching': True,
}, {
# audio
'url': 'http://mediathek.rbb-online.de/radio/Hörspiel/Vor-dem-Fest/kulturradio/Audio?documentId=30796318&topRessort=radio&bcastId=9839158',
'md5': '4e8f00631aac0395fee17368ac0e9867',
'info_dict': {
'id': '30796318',
'ext': 'mp3',
'title': 'Vor dem Fest',
'description': 'md5:c0c1c8048514deaed2a73b3a60eecacb',
'duration': 3287,
},
'skip': 'Video is no longer available',
}]
def _extract_media_info(self, media_info_url, webpage, video_id): def _extract_media_info(self, media_info_url, webpage, video_id):
media_info = self._download_json( media_info = self._download_json(
media_info_url, video_id, 'Downloading media JSON') media_info_url, video_id, 'Downloading media JSON')
return self._parse_media_info(media_info, video_id, '"fsk"' in webpage)
def _parse_media_info(self, media_info, video_id, fsk):
formats = self._extract_formats(media_info, video_id) formats = self._extract_formats(media_info, video_id)
if not formats: if not formats:
if fsk: if '"fsk"' in webpage:
raise ExtractorError( raise ExtractorError(
'This video is only available after 20:00', expected=True) 'This video is only available after 20:00', expected=True)
elif media_info.get('_geoblocked'): elif media_info.get('_geoblocked'):
self.raise_geo_restricted( raise ExtractorError('This video is not available due to geo restriction', expected=True)
'This video is not available due to geoblocking',
countries=self._GEO_COUNTRIES)
self._sort_formats(formats) self._sort_formats(formats)
duration = int_or_none(media_info.get('_duration'))
thumbnail = media_info.get('_previewImage')
is_live = media_info.get('_isLive') is True
subtitles = {} subtitles = {}
subtitle_url = media_info.get('_subtitleUrl') subtitle_url = media_info.get('_subtitleUrl')
if subtitle_url: if subtitle_url:
@ -55,9 +106,9 @@ class ARDMediathekBaseIE(InfoExtractor):
return { return {
'id': video_id, 'id': video_id,
'duration': int_or_none(media_info.get('_duration')), 'duration': duration,
'thumbnail': media_info.get('_previewImage'), 'thumbnail': thumbnail,
'is_live': media_info.get('_isLive') is True, 'is_live': is_live,
'formats': formats, 'formats': formats,
'subtitles': subtitles, 'subtitles': subtitles,
} }
@ -76,7 +127,7 @@ class ARDMediathekBaseIE(InfoExtractor):
quality = stream.get('_quality') quality = stream.get('_quality')
server = stream.get('_server') server = stream.get('_server')
for stream_url in stream_urls: for stream_url in stream_urls:
if not url_or_none(stream_url): if not isinstance(stream_url, compat_str) or '//' not in stream_url:
continue continue
ext = determine_ext(stream_url) ext = determine_ext(stream_url)
if quality != 'auto' and ext in ('f4m', 'm3u8'): if quality != 'auto' and ext in ('f4m', 'm3u8'):
@ -86,11 +137,11 @@ class ARDMediathekBaseIE(InfoExtractor):
update_url_query(stream_url, { update_url_query(stream_url, {
'hdcore': '3.1.1', 'hdcore': '3.1.1',
'plugin': 'aasp-3.1.1.69.124' 'plugin': 'aasp-3.1.1.69.124'
}), video_id, f4m_id='hds', fatal=False)) }),
video_id, f4m_id='hds', fatal=False))
elif ext == 'm3u8': elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
stream_url, video_id, 'mp4', 'm3u8_native', stream_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
m3u8_id='hls', fatal=False))
else: else:
if server and server.startswith('rtmp'): if server and server.startswith('rtmp'):
f = { f = {
@ -103,9 +154,7 @@ class ARDMediathekBaseIE(InfoExtractor):
'url': stream_url, 'url': stream_url,
'format_id': 'a%s-%s-%s' % (num, ext, quality) 'format_id': 'a%s-%s-%s' % (num, ext, quality)
} }
m = re.search( m = re.search(r'_(?P<width>\d+)x(?P<height>\d+)\.mp4$', stream_url)
r'_(?P<width>\d+)x(?P<height>\d+)\.mp4$',
stream_url)
if m: if m:
f.update({ f.update({
'width': int(m.group('width')), 'width': int(m.group('width')),
@ -116,48 +165,6 @@ class ARDMediathekBaseIE(InfoExtractor):
formats.append(f) formats.append(f)
return formats return formats
class ARDMediathekIE(ARDMediathekBaseIE):
IE_NAME = 'ARD:mediathek'
_VALID_URL = r'^https?://(?:(?:(?:www|classic)\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de|one\.ard\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?'
_TESTS = [{
# available till 26.07.2022
'url': 'http://www.ardmediathek.de/tv/S%C3%9CDLICHT/Was-ist-die-Kunst-der-Zukunft-liebe-Ann/BR-Fernsehen/Video?bcastId=34633636&documentId=44726822',
'info_dict': {
'id': '44726822',
'ext': 'mp4',
'title': 'Was ist die Kunst der Zukunft, liebe Anna McCarthy?',
'description': 'md5:4ada28b3e3b5df01647310e41f3a62f5',
'duration': 1740,
},
'params': {
# m3u8 download
'skip_download': True,
}
}, {
'url': 'https://one.ard.de/tv/Mord-mit-Aussicht/Mord-mit-Aussicht-6-39-T%C3%B6dliche-Nach/ONE/Video?bcastId=46384294&documentId=55586872',
'only_matching': True,
}, {
# audio
'url': 'http://www.ardmediathek.de/tv/WDR-H%C3%B6rspiel-Speicher/Tod-eines-Fu%C3%9Fballers/WDR-3/Audio-Podcast?documentId=28488308&bcastId=23074086',
'only_matching': True,
}, {
'url': 'http://mediathek.daserste.de/sendungen_a-z/328454_anne-will/22429276_vertrauen-ist-gut-spionieren-ist-besser-geht',
'only_matching': True,
}, {
# audio
'url': 'http://mediathek.rbb-online.de/radio/Hörspiel/Vor-dem-Fest/kulturradio/Audio?documentId=30796318&topRessort=radio&bcastId=9839158',
'only_matching': True,
}, {
'url': 'https://classic.ardmediathek.de/tv/Panda-Gorilla-Co/Panda-Gorilla-Co-Folge-274/Das-Erste/Video?bcastId=16355486&documentId=58234698',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if ARDBetaMediathekIE.suitable(url) else super(ARDMediathekIE, cls).suitable(url)
def _real_extract(self, url): def _real_extract(self, url):
# determine video id from url # determine video id from url
m = re.match(self._VALID_URL, url) m = re.match(self._VALID_URL, url)
@ -190,18 +197,13 @@ class ARDMediathekIE(ARDMediathekBaseIE):
title = self._html_search_regex( title = self._html_search_regex(
[r'<h1(?:\s+class="boxTopHeadline")?>(.*?)</h1>', [r'<h1(?:\s+class="boxTopHeadline")?>(.*?)</h1>',
r'<meta name="dcterms\.title" content="(.*?)"/>', r'<meta name="dcterms\.title" content="(.*?)"/>',
r'<h4 class="headline">(.*?)</h4>', r'<h4 class="headline">(.*?)</h4>'],
r'<title[^>]*>(.*?)</title>'],
webpage, 'title') webpage, 'title')
description = self._html_search_meta( description = self._html_search_meta(
'dcterms.abstract', webpage, 'description', default=None) 'dcterms.abstract', webpage, 'description', default=None)
if description is None: if description is None:
description = self._html_search_meta( description = self._html_search_meta(
'description', webpage, 'meta description', default=None) 'description', webpage, 'meta description')
if description is None:
description = self._html_search_regex(
r'<p\s+class="teasertext">(.+?)</p>',
webpage, 'teaser text', default=None)
# Thumbnail is sometimes not present. # Thumbnail is sometimes not present.
# It is in the mobile version, but that seems to use a different URL # It is in the mobile version, but that seems to use a different URL
@ -249,27 +251,21 @@ class ARDMediathekIE(ARDMediathekBaseIE):
class ARDIE(InfoExtractor): class ARDIE(InfoExtractor):
_VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos(?:extern)?/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html' _VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
_TESTS = [{ _TEST = {
# available till 14.02.2019 'url': 'http://www.daserste.de/information/reportage-dokumentation/dokus/videos/die-story-im-ersten-mission-unter-falscher-flagge-100.html',
'url': 'http://www.daserste.de/information/talk/maischberger/videos/das-groko-drama-zerlegen-sich-die-volksparteien-video-102.html', 'md5': 'd216c3a86493f9322545e045ddc3eb35',
'md5': '8e4ec85f31be7c7fc08a26cdbc5a1f49',
'info_dict': { 'info_dict': {
'display_id': 'das-groko-drama-zerlegen-sich-die-volksparteien-video', 'display_id': 'die-story-im-ersten-mission-unter-falscher-flagge',
'id': '102', 'id': '100',
'ext': 'mp4', 'ext': 'mp4',
'duration': 4435.0, 'duration': 2600,
'title': 'Das GroKo-Drama: Zerlegen sich die Volksparteien?', 'title': 'Die Story im Ersten: Mission unter falscher Flagge',
'upload_date': '20180214', 'upload_date': '20140804',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
}, },
}, { 'skip': 'HTTP Error 404: Not Found',
'url': 'https://www.daserste.de/information/reportage-dokumentation/erlebnis-erde/videosextern/woelfe-und-herdenschutzhunde-ungleiche-brueder-102.html', }
'only_matching': True,
}, {
'url': 'http://www.daserste.de/information/reportage-dokumentation/dokus/videos/die-story-im-ersten-mission-unter-falscher-flagge-100.html',
'only_matching': True,
}]
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
@ -310,113 +306,3 @@ class ARDIE(InfoExtractor):
'upload_date': upload_date, 'upload_date': upload_date,
'thumbnail': thumbnail, 'thumbnail': thumbnail,
} }
class ARDBetaMediathekIE(ARDMediathekBaseIE):
_VALID_URL = r'https://(?:(?:beta|www)\.)?ardmediathek\.de/(?P<client>[^/]+)/(?:player|live|video)/(?P<display_id>(?:[^/]+/)*)(?P<video_id>[a-zA-Z0-9]+)'
_TESTS = [{
'url': 'https://ardmediathek.de/ard/video/die-robuste-roswita/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
'md5': 'dfdc87d2e7e09d073d5a80770a9ce88f',
'info_dict': {
'display_id': 'die-robuste-roswita',
'id': '70153354',
'title': 'Die robuste Roswita',
'description': r're:^Der Mord.*trüber ist als die Ilm.',
'duration': 5316,
'thumbnail': 'https://img.ardmediathek.de/standard/00/70/15/33/90/-1852531467/16x9/960?mandant=ard',
'timestamp': 1577047500,
'upload_date': '20191222',
'ext': 'mp4',
},
}, {
'url': 'https://beta.ardmediathek.de/ard/video/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
'only_matching': True,
}, {
'url': 'https://ardmediathek.de/ard/video/saartalk/saartalk-gesellschaftsgift-haltung-gegen-hass/sr-fernsehen/Y3JpZDovL3NyLW9ubGluZS5kZS9TVF84MTY4MA/',
'only_matching': True,
}, {
'url': 'https://www.ardmediathek.de/ard/video/trailer/private-eyes-s01-e01/one/Y3JpZDovL3dkci5kZS9CZWl0cmFnLTE1MTgwYzczLWNiMTEtNGNkMS1iMjUyLTg5MGYzOWQxZmQ1YQ/',
'only_matching': True,
}, {
'url': 'https://www.ardmediathek.de/ard/player/Y3JpZDovL3N3ci5kZS9hZXgvbzEwNzE5MTU/',
'only_matching': True,
}, {
'url': 'https://www.ardmediathek.de/swr/live/Y3JpZDovL3N3ci5kZS8xMzQ4MTA0Mg',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('video_id')
display_id = mobj.group('display_id')
if display_id:
display_id = display_id.rstrip('/')
if not display_id:
display_id = video_id
player_page = self._download_json(
'https://api.ardmediathek.de/public-gateway',
display_id, data=json.dumps({
'query': '''{
playerPage(client:"%s", clipId: "%s") {
blockedByFsk
broadcastedOn
maturityContentRating
mediaCollection {
_duration
_geoblocked
_isLive
_mediaArray {
_mediaStreamArray {
_quality
_server
_stream
}
}
_previewImage
_subtitleUrl
_type
}
show {
title
}
synopsis
title
tracking {
atiCustomVars {
contentId
}
}
}
}''' % (mobj.group('client'), video_id),
}).encode(), headers={
'Content-Type': 'application/json'
})['data']['playerPage']
title = player_page['title']
content_id = str_or_none(try_get(
player_page, lambda x: x['tracking']['atiCustomVars']['contentId']))
media_collection = player_page.get('mediaCollection') or {}
if not media_collection and content_id:
media_collection = self._download_json(
'https://www.ardmediathek.de/play/media/' + content_id,
content_id, fatal=False) or {}
info = self._parse_media_info(
media_collection, content_id or video_id,
player_page.get('blockedByFsk'))
age_limit = None
description = player_page.get('synopsis')
maturity_content_rating = player_page.get('maturityContentRating')
if maturity_content_rating:
age_limit = int_or_none(maturity_content_rating.lstrip('FSK'))
if not age_limit and description:
age_limit = int_or_none(self._search_regex(
r'\(FSK\s*(\d+)\)\s*$', description, 'age limit', default=None))
info.update({
'age_limit': age_limit,
'display_id': display_id,
'title': title,
'description': description,
'timestamp': unified_timestamp(player_page.get('broadcastedOn')),
'series': try_get(player_page, lambda x: x['show']['title']),
})
return info

View file

@ -103,7 +103,7 @@ class ArkenaIE(InfoExtractor):
f_url, video_id, mpd_id=kind, fatal=False)) f_url, video_id, mpd_id=kind, fatal=False))
elif kind == 'silverlight': elif kind == 'silverlight':
# TODO: process when ism is supported (see # TODO: process when ism is supported (see
# https://github.com/ytdl-org/youtube-dl/issues/8118) # https://github.com/rg3/youtube-dl/issues/8118)
continue continue
else: else:
tbr = float_or_none(f.get('Bitrate'), 1000) tbr = float_or_none(f.get('Bitrate'), 1000)

View file

@ -4,10 +4,17 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import (
compat_parse_qs,
compat_str,
compat_urllib_parse_urlparse,
)
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
find_xpath_attr,
get_element_by_attribute,
int_or_none, int_or_none,
NO_DEFAULT,
qualities, qualities,
try_get, try_get,
unified_strdate, unified_strdate,
@ -18,7 +25,59 @@ from ..utils import (
# add tests. # add tests.
class ArteTvIE(InfoExtractor):
_VALID_URL = r'https?://videos\.arte\.tv/(?P<lang>fr|de|en|es)/.*-(?P<id>.*?)\.html'
IE_NAME = 'arte.tv'
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
lang = mobj.group('lang')
video_id = mobj.group('id')
ref_xml_url = url.replace('/videos/', '/do_delegate/videos/')
ref_xml_url = ref_xml_url.replace('.html', ',view,asPlayerXml.xml')
ref_xml_doc = self._download_xml(
ref_xml_url, video_id, note='Downloading metadata')
config_node = find_xpath_attr(ref_xml_doc, './/video', 'lang', lang)
config_xml_url = config_node.attrib['ref']
config = self._download_xml(
config_xml_url, video_id, note='Downloading configuration')
formats = [{
'format_id': q.attrib['quality'],
# The playpath starts at 'mp4:', if we don't manually
# split the url, rtmpdump will incorrectly parse them
'url': q.text.split('mp4:', 1)[0],
'play_path': 'mp4:' + q.text.split('mp4:', 1)[1],
'ext': 'flv',
'quality': 2 if q.attrib['quality'] == 'hd' else 1,
} for q in config.findall('./urls/url')]
self._sort_formats(formats)
title = config.find('.//name').text
thumbnail = config.find('.//firstThumbnailUrl').text
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'formats': formats,
}
class ArteTVBaseIE(InfoExtractor): class ArteTVBaseIE(InfoExtractor):
@classmethod
def _extract_url_info(cls, url):
mobj = re.match(cls._VALID_URL, url)
lang = mobj.group('lang')
query = compat_parse_qs(compat_urllib_parse_urlparse(url).query)
if 'vid' in query:
video_id = query['vid'][0]
else:
# This is not a real id, it can be for example AJT for the news
# http://www.arte.tv/guide/fr/emissions/AJT/arte-journal
video_id = mobj.group('id')
return video_id, lang
def _extract_from_json_url(self, json_url, video_id, lang, title=None): def _extract_from_json_url(self, json_url, video_id, lang, title=None):
info = self._download_json(json_url, video_id) info = self._download_json(json_url, video_id)
player_info = info['videoJsonPlayer'] player_info = info['videoJsonPlayer']
@ -49,15 +108,13 @@ class ArteTVBaseIE(InfoExtractor):
'upload_date': unified_strdate(upload_date_str), 'upload_date': unified_strdate(upload_date_str),
'thumbnail': player_info.get('programImage') or player_info.get('VTU', {}).get('IUR'), 'thumbnail': player_info.get('programImage') or player_info.get('VTU', {}).get('IUR'),
} }
qfunc = qualities(['MQ', 'HQ', 'EQ', 'SQ']) qfunc = qualities(['HQ', 'MQ', 'EQ', 'SQ'])
LANGS = { LANGS = {
'fr': 'F', 'fr': 'F',
'de': 'A', 'de': 'A',
'en': 'E[ANG]', 'en': 'E[ANG]',
'es': 'E[ESP]', 'es': 'E[ESP]',
'it': 'E[ITA]',
'pl': 'E[POL]',
} }
langcode = LANGS.get(lang, lang) langcode = LANGS.get(lang, lang)
@ -69,8 +126,8 @@ class ArteTVBaseIE(InfoExtractor):
l = re.escape(langcode) l = re.escape(langcode)
# Language preference from most to least priority # Language preference from most to least priority
# Reference: section 6.8 of # Reference: section 5.6.3 of
# https://www.arte.tv/sites/en/corporate/files/complete-technical-guidelines-arte-geie-v1-07-1.pdf # http://www.arte.tv/sites/en/corporate/files/complete-technical-guidelines-arte-geie-v1-05.pdf
PREFERENCES = ( PREFERENCES = (
# original version in requested language, without subtitles # original version in requested language, without subtitles
r'VO{0}$'.format(l), r'VO{0}$'.format(l),
@ -136,59 +193,274 @@ class ArteTVBaseIE(InfoExtractor):
class ArteTVPlus7IE(ArteTVBaseIE): class ArteTVPlus7IE(ArteTVBaseIE):
IE_NAME = 'arte.tv:+7' IE_NAME = 'arte.tv:+7'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/(?P<lang>fr|de|en|es|it|pl)/videos/(?P<id>\d{6}-\d{3}-[AF])' _VALID_URL = r'https?://(?:(?:www|sites)\.)?arte\.tv/(?:[^/]+/)?(?P<lang>fr|de|en|es)/(?:videos/)?(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.arte.tv/en/videos/088501-000-A/mexico-stealing-petrol-to-survive/', 'url': 'http://www.arte.tv/guide/de/sendungen/XEN/xenius/?vid=055918-015_PLUS7-D',
'only_matching': True,
}, {
'url': 'http://sites.arte.tv/karambolage/de/video/karambolage-22',
'only_matching': True,
}, {
'url': 'http://www.arte.tv/de/videos/048696-000-A/der-kluge-bauch-unser-zweites-gehirn',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if ArteTVPlaylistIE.suitable(url) else super(ArteTVPlus7IE, cls).suitable(url)
def _real_extract(self, url):
video_id, lang = self._extract_url_info(url)
webpage = self._download_webpage(url, video_id)
return self._extract_from_webpage(webpage, video_id, lang)
def _extract_from_webpage(self, webpage, video_id, lang):
patterns_templates = (r'arte_vp_url=["\'](.*?%s.*?)["\']', r'data-url=["\']([^"]+%s[^"]+)["\']')
ids = (video_id, '')
# some pages contain multiple videos (like
# http://www.arte.tv/guide/de/sendungen/XEN/xenius/?vid=055918-015_PLUS7-D),
# so we first try to look for json URLs that contain the video id from
# the 'vid' parameter.
patterns = [t % re.escape(_id) for _id in ids for t in patterns_templates]
json_url = self._html_search_regex(
patterns, webpage, 'json vp url', default=None)
if not json_url:
def find_iframe_url(webpage, default=NO_DEFAULT):
return self._html_search_regex(
r'<iframe[^>]+src=(["\'])(?P<url>.+\bjson_url=.+?)\1',
webpage, 'iframe url', group='url', default=default)
iframe_url = find_iframe_url(webpage, None)
if not iframe_url:
embed_url = self._html_search_regex(
r'arte_vp_url_oembed=\'([^\']+?)\'', webpage, 'embed url', default=None)
if embed_url:
player = self._download_json(
embed_url, video_id, 'Downloading player page')
iframe_url = find_iframe_url(player['html'])
# en and es URLs produce react-based pages with different layout (e.g.
# http://www.arte.tv/guide/en/053330-002-A/carnival-italy?zone=world)
if not iframe_url:
program = self._search_regex(
r'program\s*:\s*({.+?["\']embed_html["\'].+?}),?\s*\n',
webpage, 'program', default=None)
if program:
embed_html = self._parse_json(program, video_id)
if embed_html:
iframe_url = find_iframe_url(embed_html['embed_html'])
if iframe_url:
json_url = compat_parse_qs(
compat_urllib_parse_urlparse(iframe_url).query)['json_url'][0]
if json_url:
title = self._search_regex(
r'<h3[^>]+title=(["\'])(?P<title>.+?)\1',
webpage, 'title', default=None, group='title')
return self._extract_from_json_url(json_url, video_id, lang, title=title)
# Different kind of embed URL (e.g.
# http://www.arte.tv/magazine/trepalium/fr/episode-0406-replay-trepalium)
entries = [
self.url_result(url)
for _, url in re.findall(r'<iframe[^>]+src=(["\'])(?P<url>.+?)\1', webpage)]
return self.playlist_result(entries)
# It also uses the arte_vp_url url from the webpage to extract the information
class ArteTVCreativeIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:creative'
_VALID_URL = r'https?://creative\.arte\.tv/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://creative.arte.tv/fr/episode/osmosis-episode-1',
'info_dict': { 'info_dict': {
'id': '088501-000-A', 'id': '057405-001-A',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Mexico: Stealing Petrol to Survive', 'title': 'OSMOSIS - N\'AYEZ PLUS PEUR D\'AIMER (1)',
'upload_date': '20190628', 'upload_date': '20150716',
},
}, {
'url': 'http://creative.arte.tv/fr/Monty-Python-Reunion',
'playlist_count': 11,
'add_ie': ['Youtube'],
}, {
'url': 'http://creative.arte.tv/de/episode/agentur-amateur-4-der-erste-kunde',
'only_matching': True,
}]
class ArteTVInfoIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:info'
_VALID_URL = r'https?://info\.arte\.tv/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://info.arte.tv/fr/service-civique-un-cache-misere',
'info_dict': {
'id': '067528-000-A',
'ext': 'mp4',
'title': 'Service civique, un cache misère ?',
'upload_date': '20160403',
}, },
}] }]
class ArteTVFutureIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:future'
_VALID_URL = r'https?://future\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://future.arte.tv/fr/info-sciences/les-ecrevisses-aussi-sont-anxieuses',
'info_dict': {
'id': '050940-028-A',
'ext': 'mp4',
'title': 'Les écrevisses aussi peuvent être anxieuses',
'upload_date': '20140902',
},
}, {
'url': 'http://future.arte.tv/fr/la-science-est-elle-responsable',
'only_matching': True,
}]
class ArteTVDDCIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:ddc'
_VALID_URL = r'https?://ddc\.arte\.tv/(?P<lang>emission|folge)/(?P<id>[^/?#&]+)'
_TESTS = []
def _real_extract(self, url): def _real_extract(self, url):
lang, video_id = re.match(self._VALID_URL, url).groups() video_id, lang = self._extract_url_info(url)
return self._extract_from_json_url( if lang == 'folge':
'https://api.arte.tv/api/player/v1/config/%s/%s' % (lang, video_id), lang = 'de'
video_id, lang) elif lang == 'emission':
lang = 'fr'
webpage = self._download_webpage(url, video_id)
scriptElement = get_element_by_attribute('class', 'visu_video_block', webpage)
script_url = self._html_search_regex(r'src="(.*?)"', scriptElement, 'script url')
javascriptPlayerGenerator = self._download_webpage(script_url, video_id, 'Download javascript player generator')
json_url = self._search_regex(r"json_url=(.*)&rendering_place.*", javascriptPlayerGenerator, 'json url')
return self._extract_from_json_url(json_url, video_id, lang)
class ArteTVConcertIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:concert'
_VALID_URL = r'https?://concert\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://concert.arte.tv/de/notwist-im-pariser-konzertclub-divan-du-monde',
'md5': '9ea035b7bd69696b67aa2ccaaa218161',
'info_dict': {
'id': '186',
'ext': 'mp4',
'title': 'The Notwist im Pariser Konzertclub "Divan du Monde"',
'upload_date': '20140128',
'description': 'md5:486eb08f991552ade77439fe6d82c305',
},
}]
class ArteTVCinemaIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:cinema'
_VALID_URL = r'https?://cinema\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>.+)'
_TESTS = [{
'url': 'http://cinema.arte.tv/fr/article/les-ailes-du-desir-de-julia-reck',
'md5': 'a5b9dd5575a11d93daf0e3f404f45438',
'info_dict': {
'id': '062494-000-A',
'ext': 'mp4',
'title': 'Film lauréat du concours web - "Les ailes du désir" de Julia Reck',
'upload_date': '20150807',
},
}]
class ArteTVMagazineIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:magazine'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/magazine/[^/]+/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
# Embedded via <iframe src="http://www.arte.tv/arte_vp/index.php?json_url=..."
'url': 'http://www.arte.tv/magazine/trepalium/fr/entretien-avec-le-realisateur-vincent-lannoo-trepalium',
'md5': '2a9369bcccf847d1c741e51416299f25',
'info_dict': {
'id': '065965-000-A',
'ext': 'mp4',
'title': 'Trepalium - Extrait Ep.01',
'upload_date': '20160121',
},
}, {
# Embedded via <iframe src="http://www.arte.tv/guide/fr/embed/054813-004-A/medium"
'url': 'http://www.arte.tv/magazine/trepalium/fr/episode-0406-replay-trepalium',
'md5': 'fedc64fc7a946110fe311634e79782ca',
'info_dict': {
'id': '054813-004_PLUS7-F',
'ext': 'mp4',
'title': 'Trepalium (4/6)',
'description': 'md5:10057003c34d54e95350be4f9b05cb40',
'upload_date': '20160218',
},
}, {
'url': 'http://www.arte.tv/magazine/metropolis/de/frank-woeste-german-paris-metropolis',
'only_matching': True,
}]
class ArteTVEmbedIE(ArteTVPlus7IE): class ArteTVEmbedIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:embed' IE_NAME = 'arte.tv:embed'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https://www\.arte\.tv http://www\.arte\.tv
/player/v3/index\.php\?json_url= /(?:playerv2/embed|arte_vp/index)\.php\?json_url=
(?P<json_url> (?P<json_url>
https?://api\.arte\.tv/api/player/v1/config/ http://arte\.tv/papi/tvguide/videos/stream/player/
(?P<lang>[^/]+)/(?P<id>\d{6}-\d{3}-[AF]) (?P<lang>[^/]+)/(?P<id>[^/]+)[^&]*
) )
''' '''
_TESTS = [] _TESTS = []
def _real_extract(self, url): def _real_extract(self, url):
json_url, lang, video_id = re.match(self._VALID_URL, url).groups() mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
lang = mobj.group('lang')
json_url = mobj.group('json_url')
return self._extract_from_json_url(json_url, video_id, lang) return self._extract_from_json_url(json_url, video_id, lang)
class TheOperaPlatformIE(ArteTVPlus7IE):
IE_NAME = 'theoperaplatform'
_VALID_URL = r'https?://(?:www\.)?theoperaplatform\.eu/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.theoperaplatform.eu/de/opera/verdi-otello',
'md5': '970655901fa2e82e04c00b955e9afe7b',
'info_dict': {
'id': '060338-009-A',
'ext': 'mp4',
'title': 'Verdi - OTELLO',
'upload_date': '20160927',
},
}]
class ArteTVPlaylistIE(ArteTVBaseIE): class ArteTVPlaylistIE(ArteTVBaseIE):
IE_NAME = 'arte.tv:playlist' IE_NAME = 'arte.tv:playlist'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/(?P<lang>fr|de|en|es|it|pl)/videos/(?P<id>RC-\d{6})' _VALID_URL = r'https?://(?:www\.)?arte\.tv/guide/(?P<lang>fr|de|en|es)/[^#]*#collection/(?P<id>PL-\d+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.arte.tv/en/videos/RC-016954/earn-a-living/', 'url': 'http://www.arte.tv/guide/de/plus7/?country=DE#collection/PL-013263/ARTETV',
'info_dict': { 'info_dict': {
'id': 'RC-016954', 'id': 'PL-013263',
'title': 'Earn a Living', 'title': 'Areva & Uramin',
'description': 'md5:d322c55011514b3a7241f7fb80d494c2', 'description': 'md5:a1dc0312ce357c262259139cfd48c9bf',
}, },
'playlist_mincount': 6, 'playlist_mincount': 6,
}, {
'url': 'http://www.arte.tv/guide/de/playlists?country=DE#collection/PL-013190/ARTETV',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
lang, playlist_id = re.match(self._VALID_URL, url).groups() playlist_id, lang = self._extract_url_info(url)
collection = self._download_json( collection = self._download_json(
'https://api.arte.tv/api/player/v1/collectionData/%s/%s?source=videos' 'https://api.arte.tv/api/player/v1/collectionData/%s/%s?source=videos'
% (lang, playlist_id), playlist_id) % (lang, playlist_id), playlist_id)

View file

@ -5,12 +5,15 @@ import re
from .common import InfoExtractor from .common import InfoExtractor
from .kaltura import KalturaIE from .kaltura import KalturaIE
from ..utils import extract_attributes from ..utils import (
extract_attributes,
remove_end,
urlencode_postdata,
)
class AsianCrushIE(InfoExtractor): class AsianCrushIE(InfoExtractor):
_VALID_URL_BASE = r'https?://(?:www\.)?(?P<host>(?:(?:asiancrush|yuyutv|midnightpulp)\.com|cocoro\.tv))' _VALID_URL = r'https?://(?:www\.)?asiancrush\.com/video/(?:[^/]+/)?0+(?P<id>\d+)v\b'
_VALID_URL = r'%s/video/(?:[^/]+/)?0+(?P<id>\d+)v\b' % _VALID_URL_BASE
_TESTS = [{ _TESTS = [{
'url': 'https://www.asiancrush.com/video/012869v/women-who-flirt/', 'url': 'https://www.asiancrush.com/video/012869v/women-who-flirt/',
'md5': 'c3b740e48d0ba002a42c0b72857beae6', 'md5': 'c3b740e48d0ba002a42c0b72857beae6',
@ -18,7 +21,7 @@ class AsianCrushIE(InfoExtractor):
'id': '1_y4tmjm5r', 'id': '1_y4tmjm5r',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Women Who Flirt', 'title': 'Women Who Flirt',
'description': 'md5:7e986615808bcfb11756eb503a751487', 'description': 'md5:3db14e9186197857e7063522cb89a805',
'timestamp': 1496936429, 'timestamp': 1496936429,
'upload_date': '20170608', 'upload_date': '20170608',
'uploader_id': 'craig@crifkin.com', 'uploader_id': 'craig@crifkin.com',
@ -26,75 +29,29 @@ class AsianCrushIE(InfoExtractor):
}, { }, {
'url': 'https://www.asiancrush.com/video/she-was-pretty/011886v-pretty-episode-3/', 'url': 'https://www.asiancrush.com/video/she-was-pretty/011886v-pretty-episode-3/',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.yuyutv.com/video/013886v/the-act-of-killing/',
'only_matching': True,
}, {
'url': 'https://www.yuyutv.com/video/peep-show/013922v-warring-factions/',
'only_matching': True,
}, {
'url': 'https://www.midnightpulp.com/video/010400v/drifters/',
'only_matching': True,
}, {
'url': 'https://www.midnightpulp.com/video/mononoke/016378v-zashikiwarashi-part-1/',
'only_matching': True,
}, {
'url': 'https://www.cocoro.tv/video/the-wonderful-wizard-of-oz/008878v-the-wonderful-wizard-of-oz-ep01/',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) video_id = self._match_id(url)
host = mobj.group('host')
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id) data = self._download_json(
'https://www.asiancrush.com/wp-admin/admin-ajax.php', video_id,
data=urlencode_postdata({
'postid': video_id,
'action': 'get_channel_kaltura_vars',
}))
entry_id, partner_id, title = [None] * 3 entry_id = data['entry_id']
vars = self._parse_json( return self.url_result(
self._search_regex( 'kaltura:%s:%s' % (data['partner_id'], entry_id),
r'iEmbedVars\s*=\s*({.+?})', webpage, 'embed vars', ie=KalturaIE.ie_key(), video_id=entry_id,
default='{}'), video_id, fatal=False) video_title=data.get('vid_label'))
if vars:
entry_id = vars.get('entry_id')
partner_id = vars.get('partner_id')
title = vars.get('vid_label')
if not entry_id:
entry_id = self._search_regex(
r'\bentry_id["\']\s*:\s*["\'](\d+)', webpage, 'entry id')
player = self._download_webpage(
'https://api.%s/embeddedVideoPlayer' % host, video_id,
query={'id': entry_id})
kaltura_id = self._search_regex(
r'entry_id["\']\s*:\s*(["\'])(?P<id>(?:(?!\1).)+)\1', player,
'kaltura id', group='id')
if not partner_id:
partner_id = self._search_regex(
r'/p(?:artner_id)?/(\d+)', player, 'partner id',
default='513551')
description = self._html_search_regex(
r'(?s)<div[^>]+\bclass=["\']description["\'][^>]*>(.+?)</div>',
webpage, 'description', fatal=False)
return {
'_type': 'url_transparent',
'url': 'kaltura:%s:%s' % (partner_id, kaltura_id),
'ie_key': KalturaIE.ie_key(),
'id': video_id,
'title': title,
'description': description,
}
class AsianCrushPlaylistIE(InfoExtractor): class AsianCrushPlaylistIE(InfoExtractor):
_VALID_URL = r'%s/series/0+(?P<id>\d+)s\b' % AsianCrushIE._VALID_URL_BASE _VALID_URL = r'https?://(?:www\.)?asiancrush\.com/series/0+(?P<id>\d+)s\b'
_TESTS = [{ _TEST = {
'url': 'https://www.asiancrush.com/series/012481s/scholar-walks-night/', 'url': 'https://www.asiancrush.com/series/012481s/scholar-walks-night/',
'info_dict': { 'info_dict': {
'id': '12481', 'id': '12481',
@ -102,16 +59,7 @@ class AsianCrushPlaylistIE(InfoExtractor):
'description': 'md5:7addd7c5132a09fd4741152d96cce886', 'description': 'md5:7addd7c5132a09fd4741152d96cce886',
}, },
'playlist_count': 20, 'playlist_count': 20,
}, { }
'url': 'https://www.yuyutv.com/series/013920s/peep-show/',
'only_matching': True,
}, {
'url': 'https://www.midnightpulp.com/series/016375s/mononoke/',
'only_matching': True,
}, {
'url': 'https://www.cocoro.tv/series/008549s/the-wonderful-wizard-of-oz/',
'only_matching': True,
}]
def _real_extract(self, url): def _real_extract(self, url):
playlist_id = self._match_id(url) playlist_id = self._match_id(url)
@ -128,15 +76,15 @@ class AsianCrushPlaylistIE(InfoExtractor):
entries.append(self.url_result( entries.append(self.url_result(
mobj.group('url'), ie=AsianCrushIE.ie_key())) mobj.group('url'), ie=AsianCrushIE.ie_key()))
title = self._html_search_regex( title = remove_end(
r'(?s)<h1\b[^>]\bid=["\']movieTitle[^>]+>(.+?)</h1>', webpage, self._html_search_regex(
'title', default=None) or self._og_search_title( r'(?s)<h1\b[^>]\bid=["\']movieTitle[^>]+>(.+?)</h1>', webpage,
webpage, default=None) or self._html_search_meta( 'title', default=None) or self._og_search_title(
'twitter:title', webpage, 'title', webpage, default=None) or self._html_search_meta(
default=None) or self._search_regex( 'twitter:title', webpage, 'title',
r'<title>([^<]+)</title>', webpage, 'title', fatal=False) default=None) or self._search_regex(
if title: r'<title>([^<]+)</title>', webpage, 'title', fatal=False),
title = re.sub(r'\s*\|\s*.+?$', '', title) ' | AsianCrush')
description = self._og_search_description( description = self._og_search_description(
webpage, default=None) or self._html_search_meta( webpage, default=None) or self._html_search_meta(

View file

@ -1,118 +1,202 @@
# coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import time
import hmac
import hashlib
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_HTTPError from ..compat import compat_str
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
float_or_none,
int_or_none, int_or_none,
sanitized_Request,
urlencode_postdata, urlencode_postdata,
xpath_text,
) )
class AtresPlayerIE(InfoExtractor): class AtresPlayerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?atresplayer\.com/[^/]+/[^/]+/[^/]+/[^/]+/(?P<display_id>.+?)_(?P<id>[0-9a-f]{24})' _VALID_URL = r'https?://(?:www\.)?atresplayer\.com/television/[^/]+/[^/]+/[^/]+/(?P<id>.+?)_\d+\.html'
_NETRC_MACHINE = 'atresplayer' _NETRC_MACHINE = 'atresplayer'
_TESTS = [ _TESTS = [
{ {
'url': 'https://www.atresplayer.com/antena3/series/pequenas-coincidencias/temporada-1/capitulo-7-asuntos-pendientes_5d4aa2c57ed1a88fc715a615/', 'url': 'http://www.atresplayer.com/television/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_2014122100174.html',
'md5': 'efd56753cda1bb64df52a3074f62e38a',
'info_dict': { 'info_dict': {
'id': '5d4aa2c57ed1a88fc715a615', 'id': 'capitulo-10-especial-solidario-nochebuena',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Capítulo 7: Asuntos pendientes', 'title': 'Especial Solidario de Nochebuena',
'description': 'md5:7634cdcb4d50d5381bedf93efb537fbc', 'description': 'md5:e2d52ff12214fa937107d21064075bf1',
'duration': 3413, 'duration': 5527.6,
}, 'thumbnail': r're:^https?://.*\.jpg$',
'params': {
'format': 'bestvideo',
}, },
'skip': 'This video is only available for registered users' 'skip': 'This video is only available for registered users'
}, },
{ {
'url': 'https://www.atresplayer.com/lasexta/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_5ad08edf986b2855ed47adc4/', 'url': 'http://www.atresplayer.com/television/especial/videoencuentros/temporada-1/capitulo-112-david-bustamante_2014121600375.html',
'only_matching': True, 'md5': '6e52cbb513c405e403dbacb7aacf8747',
'info_dict': {
'id': 'capitulo-112-david-bustamante',
'ext': 'flv',
'title': 'David Bustamante',
'description': 'md5:f33f1c0a05be57f6708d4dd83a3b81c6',
'duration': 1439.0,
'thumbnail': r're:^https?://.*\.jpg$',
},
}, },
{ {
'url': 'https://www.atresplayer.com/antena3/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_5ad51046986b2886722ccdea/', 'url': 'http://www.atresplayer.com/television/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_2014122400174.html',
'only_matching': True, 'only_matching': True,
}, },
] ]
_API_BASE = 'https://api.atresplayer.com/'
_USER_AGENT = 'Dalvik/1.6.0 (Linux; U; Android 4.3; GT-I9300 Build/JSS15J'
_MAGIC = 'QWtMLXs414Yo+c#_+Q#K@NN)'
_TIMESTAMP_SHIFT = 30000
_TIME_API_URL = 'http://servicios.atresplayer.com/api/admin/time.json'
_URL_VIDEO_TEMPLATE = 'https://servicios.atresplayer.com/api/urlVideo/{1}/{0}/{1}|{2}|{3}.json'
_PLAYER_URL_TEMPLATE = 'https://servicios.atresplayer.com/episode/getplayer.json?episodePk=%s'
_EPISODE_URL_TEMPLATE = 'http://www.atresplayer.com/episodexml/%s'
_LOGIN_URL = 'https://servicios.atresplayer.com/j_spring_security_check'
_ERRORS = {
'UNPUBLISHED': 'We\'re sorry, but this video is not yet available.',
'DELETED': 'This video has expired and is no longer available for online streaming.',
'GEOUNPUBLISHED': 'We\'re sorry, but this video is not available in your region due to right restrictions.',
# 'PREMIUM': 'PREMIUM',
}
def _real_initialize(self): def _real_initialize(self):
self._login() self._login()
def _handle_error(self, e, code):
if isinstance(e.cause, compat_HTTPError) and e.cause.code == code:
error = self._parse_json(e.cause.read(), None)
if error.get('error') == 'required_registered':
self.raise_login_required()
raise ExtractorError(error['error_description'], expected=True)
raise
def _login(self): def _login(self):
username, password = self._get_login_info() (username, password) = self._get_login_info()
if username is None: if username is None:
return return
self._request_webpage( login_form = {
self._API_BASE + 'login', None, 'Downloading login page') 'j_username': username,
'j_password': password,
}
try: request = sanitized_Request(
target_url = self._download_json( self._LOGIN_URL, urlencode_postdata(login_form))
'https://account.atresmedia.com/api/login', None, request.add_header('Content-Type', 'application/x-www-form-urlencoded')
'Logging in', headers={ response = self._download_webpage(
'Content-Type': 'application/x-www-form-urlencoded' request, None, 'Logging in')
}, data=urlencode_postdata({
'username': username,
'password': password,
}))['targetUrl']
except ExtractorError as e:
self._handle_error(e, 400)
self._request_webpage(target_url, None, 'Following Target URL') error = self._html_search_regex(
r'(?s)<ul[^>]+class="[^"]*\blist_error\b[^"]*">(.+?)</ul>',
response, 'error', default=None)
if error:
raise ExtractorError(
'Unable to login: %s' % error, expected=True)
def _real_extract(self, url): def _real_extract(self, url):
display_id, video_id = re.match(self._VALID_URL, url).groups() video_id = self._match_id(url)
try: webpage = self._download_webpage(url, video_id)
episode = self._download_json(
self._API_BASE + 'client/v1/player/episode/' + video_id, video_id)
except ExtractorError as e:
self._handle_error(e, 403)
title = episode['titulo'] episode_id = self._search_regex(
r'episode="([^"]+)"', webpage, 'episode id')
request = sanitized_Request(
self._PLAYER_URL_TEMPLATE % episode_id,
headers={'User-Agent': self._USER_AGENT})
player = self._download_json(request, episode_id, 'Downloading player JSON')
episode_type = player.get('typeOfEpisode')
error_message = self._ERRORS.get(episode_type)
if error_message:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, error_message), expected=True)
formats = [] formats = []
for source in episode.get('sources', []): video_url = player.get('urlVideo')
src = source.get('src') if video_url:
if not src: format_info = {
'url': video_url,
'format_id': 'http',
}
mobj = re.search(r'(?P<bitrate>\d+)K_(?P<width>\d+)x(?P<height>\d+)', video_url)
if mobj:
format_info.update({
'width': int_or_none(mobj.group('width')),
'height': int_or_none(mobj.group('height')),
'tbr': int_or_none(mobj.group('bitrate')),
})
formats.append(format_info)
timestamp = int_or_none(self._download_webpage(
self._TIME_API_URL,
video_id, 'Downloading timestamp', fatal=False), 1000, time.time())
timestamp_shifted = compat_str(timestamp + self._TIMESTAMP_SHIFT)
token = hmac.new(
self._MAGIC.encode('ascii'),
(episode_id + timestamp_shifted).encode('utf-8'), hashlib.md5
).hexdigest()
request = sanitized_Request(
self._URL_VIDEO_TEMPLATE.format('windows', episode_id, timestamp_shifted, token),
headers={'User-Agent': self._USER_AGENT})
fmt_json = self._download_json(
request, video_id, 'Downloading windows video JSON')
result = fmt_json.get('resultDes')
if result.lower() != 'ok':
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, result), expected=True)
for format_id, video_url in fmt_json['resultObject'].items():
if format_id == 'token' or not video_url.startswith('http'):
continue continue
src_type = source.get('type') if 'geodeswowsmpra3player' in video_url:
if src_type == 'application/vnd.apple.mpegurl': # f4m_path = video_url.split('smil:', 1)[-1].split('free_', 1)[0]
formats.extend(self._extract_m3u8_formats( # f4m_url = 'http://drg.antena3.com/{0}hds/es/sd.f4m'.format(f4m_path)
src, video_id, 'mp4', 'm3u8_native', # this videos are protected by DRM, the f4m downloader doesn't support them
m3u8_id='hls', fatal=False)) continue
elif src_type == 'application/dash+xml': video_url_hd = video_url.replace('free_es', 'es')
formats.extend(self._extract_mpd_formats( formats.extend(self._extract_f4m_formats(
src, video_id, mpd_id='dash', fatal=False)) video_url_hd[:-9] + '/manifest.f4m', video_id, f4m_id='hds',
fatal=False))
formats.extend(self._extract_mpd_formats(
video_url_hd[:-9] + '/manifest.mpd', video_id, mpd_id='dash',
fatal=False))
self._sort_formats(formats) self._sort_formats(formats)
heartbeat = episode.get('heartbeat') or {} path_data = player.get('pathData')
omniture = episode.get('omniture') or {}
get_meta = lambda x: heartbeat.get(x) or omniture.get(x) episode = self._download_xml(
self._EPISODE_URL_TEMPLATE % path_data, video_id,
'Downloading episode XML')
duration = float_or_none(xpath_text(
episode, './media/asset/info/technical/contentDuration', 'duration'))
art = episode.find('./media/asset/info/art')
title = xpath_text(art, './name', 'title')
description = xpath_text(art, './description', 'description')
thumbnail = xpath_text(episode, './media/asset/files/background', 'thumbnail')
subtitles = {}
subtitle_url = xpath_text(episode, './media/asset/files/subtitle', 'subtitle')
if subtitle_url:
subtitles['es'] = [{
'ext': 'srt',
'url': subtitle_url,
}]
return { return {
'display_id': display_id,
'id': video_id, 'id': video_id,
'title': title, 'title': title,
'description': episode.get('descripcion'), 'description': description,
'thumbnail': episode.get('imgPoster'), 'thumbnail': thumbnail,
'duration': int_or_none(episode.get('duration')), 'duration': duration,
'formats': formats, 'formats': formats,
'channel': get_meta('channel'), 'subtitles': subtitles,
'season': get_meta('season'),
'episode_number': int_or_none(get_meta('episodeNumber')),
} }

View file

@ -28,10 +28,8 @@ class ATVAtIE(InfoExtractor):
display_id = self._match_id(url) display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
video_data = self._parse_json(unescapeHTML(self._search_regex( video_data = self._parse_json(unescapeHTML(self._search_regex(
[r'flashPlayerOptions\s*=\s*(["\'])(?P<json>(?:(?!\1).)+)\1', r'class="[^"]*jsb_video/FlashPlayer[^"]*"[^>]+data-jsb="([^"]+)"',
r'class="[^"]*jsb_video/FlashPlayer[^"]*"[^>]+data-jsb="(?P<json>[^"]+)"'], webpage, 'player data')), display_id)['config']['initial_video']
webpage, 'player data', group='json')),
display_id)['config']['initial_video']
video_id = video_data['id'] video_id = video_data['id']
video_title = video_data['title'] video_title = video_data['title']

View file

@ -5,12 +5,13 @@ from .common import InfoExtractor
from ..utils import ( from ..utils import (
int_or_none, int_or_none,
parse_iso8601, parse_iso8601,
sanitized_Request,
) )
class AudiMediaIE(InfoExtractor): class AudiMediaIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?audi-mediacenter\.com/(?:en|de)/audimediatv/(?:video/)?(?P<id>[^/?#]+)' _VALID_URL = r'https?://(?:www\.)?audi-mediacenter\.com/(?:en|de)/audimediatv/(?P<id>[^/?#]+)'
_TESTS = [{ _TEST = {
'url': 'https://www.audi-mediacenter.com/en/audimediatv/60-seconds-of-audi-sport-104-2015-wec-bahrain-rookie-test-1467', 'url': 'https://www.audi-mediacenter.com/en/audimediatv/60-seconds-of-audi-sport-104-2015-wec-bahrain-rookie-test-1467',
'md5': '79a8b71c46d49042609795ab59779b66', 'md5': '79a8b71c46d49042609795ab59779b66',
'info_dict': { 'info_dict': {
@ -23,46 +24,41 @@ class AudiMediaIE(InfoExtractor):
'duration': 74022, 'duration': 74022,
'view_count': int, 'view_count': int,
} }
}, { }
'url': 'https://www.audi-mediacenter.com/en/audimediatv/video/60-seconds-of-audi-sport-104-2015-wec-bahrain-rookie-test-2991', # extracted from https://audimedia.tv/assets/embed/embedded-player.js (dataSourceAuthToken)
'only_matching': True, _AUTH_TOKEN = 'e25b42847dba18c6c8816d5d8ce94c326e06823ebf0859ed164b3ba169be97f2'
}]
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
raw_payload = self._search_regex([ raw_payload = self._search_regex([
r'class="amtv-embed"[^>]+id="([0-9a-z-]+)"', r'class="amtv-embed"[^>]+id="([^"]+)"',
r'id="([0-9a-z-]+)"[^>]+class="amtv-embed"', r'class=\\"amtv-embed\\"[^>]+id=\\"([^"]+)\\"',
r'class=\\"amtv-embed\\"[^>]+id=\\"([0-9a-z-]+)\\"',
r'id=\\"([0-9a-z-]+)\\"[^>]+class=\\"amtv-embed\\"',
r'id=(?:\\)?"(amtve-[a-z]-\d+-[a-z]{2})',
], webpage, 'raw payload') ], webpage, 'raw payload')
_, stage_mode, video_id, _ = raw_payload.split('-') _, stage_mode, video_id, lang = raw_payload.split('-')
# TODO: handle s and e stage_mode (live streams and ended live streams) # TODO: handle s and e stage_mode (live streams and ended live streams)
if stage_mode not in ('s', 'e'): if stage_mode not in ('s', 'e'):
video_data = self._download_json( request = sanitized_Request(
'https://www.audimedia.tv/api/video/v1/videos/' + video_id, 'https://audimedia.tv/api/video/v1/videos/%s?embed[]=video_versions&embed[]=thumbnail_image&where[content_language_iso]=%s' % (video_id, lang),
video_id, query={ headers={'X-Auth-Token': self._AUTH_TOKEN})
'embed[]': ['video_versions', 'thumbnail_image'], json_data = self._download_json(request, video_id)['results']
})['results']
formats = [] formats = []
stream_url_hls = video_data.get('stream_url_hls') stream_url_hls = json_data.get('stream_url_hls')
if stream_url_hls: if stream_url_hls:
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
stream_url_hls, video_id, 'mp4', stream_url_hls, video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id='hls', fatal=False)) entry_protocol='m3u8_native', m3u8_id='hls', fatal=False))
stream_url_hds = video_data.get('stream_url_hds') stream_url_hds = json_data.get('stream_url_hds')
if stream_url_hds: if stream_url_hds:
formats.extend(self._extract_f4m_formats( formats.extend(self._extract_f4m_formats(
stream_url_hds + '?hdcore=3.4.0', stream_url_hds + '?hdcore=3.4.0',
video_id, f4m_id='hds', fatal=False)) video_id, f4m_id='hds', fatal=False))
for video_version in video_data.get('video_versions', []): for video_version in json_data.get('video_versions'):
video_version_url = video_version.get('download_url') or video_version.get('stream_url') video_version_url = video_version.get('download_url') or video_version.get('stream_url')
if not video_version_url: if not video_version_url:
continue continue
@ -83,11 +79,11 @@ class AudiMediaIE(InfoExtractor):
return { return {
'id': video_id, 'id': video_id,
'title': video_data['title'], 'title': json_data['title'],
'description': video_data.get('subtitle'), 'description': json_data.get('subtitle'),
'thumbnail': video_data.get('thumbnail_image', {}).get('file'), 'thumbnail': json_data.get('thumbnail_image', {}).get('file'),
'timestamp': parse_iso8601(video_data.get('publication_date')), 'timestamp': parse_iso8601(json_data.get('publication_date')),
'duration': int_or_none(video_data.get('duration')), 'duration': int_or_none(json_data.get('duration')),
'view_count': int_or_none(video_data.get('view_count')), 'view_count': int_or_none(json_data.get('view_count')),
'formats': formats, 'formats': formats,
} }

View file

@ -2,25 +2,22 @@
from __future__ import unicode_literals from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import float_or_none
clean_html,
float_or_none,
)
class AudioBoomIE(InfoExtractor): class AudioBoomIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?audioboom\.com/(?:boos|posts)/(?P<id>[0-9]+)' _VALID_URL = r'https?://(?:www\.)?audioboom\.com/(?:boos|posts)/(?P<id>[0-9]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://audioboom.com/posts/7398103-asim-chaudhry', 'url': 'https://audioboom.com/boos/4279833-3-09-2016-czaban-hour-3?t=0',
'md5': '7b00192e593ff227e6a315486979a42d', 'md5': '63a8d73a055c6ed0f1e51921a10a5a76',
'info_dict': { 'info_dict': {
'id': '7398103', 'id': '4279833',
'ext': 'mp3', 'ext': 'mp3',
'title': 'Asim Chaudhry', 'title': '3/09/2016 Czaban Hour 3',
'description': 'md5:2f3fef17dacc2595b5362e1d7d3602fc', 'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans',
'duration': 4000.99, 'duration': 2245.72,
'uploader': 'Sue Perkins: An hour or so with...', 'uploader': 'SB Nation A.M.',
'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/perkins', 'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
} }
}, { }, {
'url': 'https://audioboom.com/posts/4279833-3-09-2016-czaban-hour-3?t=0', 'url': 'https://audioboom.com/posts/4279833-3-09-2016-czaban-hour-3?t=0',
@ -35,8 +32,8 @@ class AudioBoomIE(InfoExtractor):
clip = None clip = None
clip_store = self._parse_json( clip_store = self._parse_json(
self._html_search_regex( self._search_regex(
r'data-new-clip-store=(["\'])(?P<json>{.+?})\1', r'data-new-clip-store=(["\'])(?P<json>{.*?"clipId"\s*:\s*%s.*?})\1' % video_id,
webpage, 'clip store', default='{}', group='json'), webpage, 'clip store', default='{}', group='json'),
video_id, fatal=False) video_id, fatal=False)
if clip_store: if clip_store:
@ -50,15 +47,14 @@ class AudioBoomIE(InfoExtractor):
audio_url = from_clip('clipURLPriorToLoading') or self._og_search_property( audio_url = from_clip('clipURLPriorToLoading') or self._og_search_property(
'audio', webpage, 'audio url') 'audio', webpage, 'audio url')
title = from_clip('title') or self._html_search_meta( title = from_clip('title') or self._og_search_title(webpage)
['og:title', 'og:audio:title', 'audio_title'], webpage) description = from_clip('description') or self._og_search_description(webpage)
description = from_clip('description') or clean_html(from_clip('formattedDescription')) or self._og_search_description(webpage)
duration = float_or_none(from_clip('duration') or self._html_search_meta( duration = float_or_none(from_clip('duration') or self._html_search_meta(
'weibo:audio:duration', webpage)) 'weibo:audio:duration', webpage))
uploader = from_clip('author') or self._html_search_meta( uploader = from_clip('author') or self._og_search_property(
['og:audio:artist', 'twitter:audio:artist_name', 'audio_artist'], webpage, 'uploader') 'audio:artist', webpage, 'uploader', fatal=False)
uploader_url = from_clip('author_url') or self._html_search_meta( uploader_url = from_clip('author_url') or self._html_search_meta(
'audioboo:channel', webpage, 'uploader url') 'audioboo:channel', webpage, 'uploader url')

Some files were not shown because too many files have changed in this diff Show more