Yen Chi Hsuan
79298173c5
[utils] Fix getheader in urlhandle_detect_ext
...
Fixes #7049, related to #9440
2016-05-15 15:34:50 +08:00
Sergey M․
cda6d47aad
[utils] Simplify integer conversion in js_to_json
2016-05-14 23:41:57 +06:00
Sergey M․
89ac4a19e6
[utils] Process non-base 10 integers in js_to_json
2016-05-14 20:39:58 +06:00
felix
bd1e484448
[utils] js_to_json: various improvements
...
now JS object literals like { /* " */ 0: ",]\xaa<\/p>", } will be correctly converted to JSON.
2016-05-14 20:12:39 +06:00
Yen Chi Hsuan
7581bfc958
[utils] Unquote credentials passed to SOCKS proxies
...
Fixes #9450
2016-05-13 00:27:25 +08:00
Yen Chi Hsuan
778a1ccca7
[utils] Add Œ and œ found in French to ACCENT_CHARS
...
Fixes #9463
2016-05-12 19:48:48 +08:00
Yen Chi Hsuan
702ccf2dc0
[compat] Rename shlex_quote and remove unused subprocess_check_output
2016-05-10 16:00:21 +08:00
Yen Chi Hsuan
edaa23f822
[compat] Rename struct_(un)pack to compat_struct_(un)pack
2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
d5ae6bb501
[utils] Add rationale for register_socks_protocols
2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
51fb4995a5
[utils] Register SOCKS protocols in urllib and support SOCKS4A
2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
71aff18809
[socks] Support SOCKS proxies
2016-05-10 14:51:38 +08:00
Yen Chi Hsuan
dab0daeeb0
[utils,compat] Move struct_pack and struct_unpack to compat.py
2016-05-10 14:51:38 +08:00
Sergey M․
abc97b5eda
[utils] Allow empty attribute values in get_element_by_attribute (Closes #9415)
2016-05-06 22:07:30 +06:00
Adam Thalhammer
c587cbb793
improved performance by extracting accented chars to top level
2016-05-03 10:40:30 +10:00
Adam Thalhammer
79a2e94e79
Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents. Fixes #9347
2016-05-02 13:21:39 +10:00
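The behaviour described in 79a2e94e79 (and the later move of the character table to module level in c587cbb793) could look roughly like the sketch below. It is not the code from utils.py: the name `sanitize_restricted`, the tiny `ACCENT_MAP` table, and the `OE`/`oe` replacements are assumptions made for illustration.

```python
# Hypothetical, deliberately small table built once at module level,
# so it is not rebuilt on every call.
ACCENT_MAP = {
    'é': 'e', 'è': 'e', 'à': 'a', 'ç': 'c', 'ü': 'u',
    'Œ': 'OE', 'œ': 'oe',  # assumed replacements for the French ligatures
}


def sanitize_restricted(name):
    """Map accented characters to ASCII; replace anything else unsafe with '_'."""
    out = []
    for ch in name:
        if ch in ACCENT_MAP:
            out.append(ACCENT_MAP[ch])
        elif ord(ch) < 128 and (ch.isalnum() or ch in '-_.'):
            out.append(ch)
        else:
            out.append('_')
    return ''.join(out)


print(sanitize_restricted('Café œuvre.mp4'))
```

Before this change, the accented characters in such a name would have become underscores like any other disallowed character.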
Sergey M․
eb9ee19422
[utils] Allow None mimetypes in mimetype2ext
2016-04-25 00:03:12 +06:00
Sergey M
b6c0d4f431
Merge pull request #9110 from remitamine/parse_duration
...
[utils] improve parse_duration to handle more formats
2016-04-21 22:53:16 +07:00
remitamine
acaff49575
[utils] improve parse_duration to handle more formats
2016-04-21 16:34:54 +01:00
Yen Chi Hsuan
cacd996662
[utils] Don't touch URLs if not necessary
...
Fix test_Generic_15 (Google redirect)
2016-04-09 19:27:54 +08:00
Jaime Marquínez Ferrándiz
5bf28d7864
[utils] dfxp2srt: add additional namespace
...
Used by the ZDF subtitles (#9081).
2016-04-04 20:46:35 +02:00
Sergey M․
15d260ebaa
[utils] Use update_Request in http_request
2016-03-31 22:55:49 +06:00
Sergey M․
ed0291d153
[utils] Add update_Request
2016-03-31 22:55:01 +06:00
Sergey M․
17bcc626bf
[utils] Extract sanitize_url routine
2016-03-26 19:33:57 +06:00
Sergey M․
15707c7e02
[compat] Add compat_urllib_parse_urlencode and eliminate encode_dict
...
encode_dict functionality has been improved and moved directly into compat_urllib_parse_urlencode
All occurrences of compat_urllib_parse.urlencode throughout the codebase have been replaced by compat_urllib_parse_urlencode
Closes #8974
2016-03-26 01:46:57 +06:00
Yen Chi Hsuan
622d19160b
[utils] Clarify Python versions affected by buggy struct module
2016-03-24 18:06:15 +08:00
Yen Chi Hsuan
efbed08dc2
[utils] Encode hostnames before passing to urllib
...
With IDN (Internationalized Domain Name) and a proxy, non-ascii URLs
are passed down to urllib/urllib2, causing UnicodeEncodeError
Fixes #8890
2016-03-23 22:24:52 +08:00
Jaime Marquínez Ferrándiz
782b1b5bd1
[utils] lookup_unit_table: Match word boundary instead of end of string
2016-03-19 11:44:49 +01:00
Jaime Marquínez Ferrándiz
09fc33198a
utils: lookup_unit_table: Use a stricter regex
...
In parse_count multiple units start with the same letter, so it would match different units depending on the order they were sorted when iterating over them.
2016-03-18 19:23:06 +01:00
Sergey M․
810c10baa1
[utils] Use compat_xpath
2016-03-18 02:52:23 +06:00
Sergey M․
c5229f3926
[utils] PEP 8
2016-03-16 21:50:04 +06:00
remitamine
83548824c2
Merge pull request #8092 from bpfoley/twitter-thumbnail
...
[utils] Add extract_attributes for extracting html tag attributes
2016-03-16 13:16:27 +01:00
Sergey M․
2f7ae819ac
[utils] PEP 8
2016-03-13 17:23:08 +06:00
Sergey M․
fb47597b09
[bbc] Generalize unit table lookup and add parse_count
2016-03-13 16:27:20 +06:00
Yen Chi Hsuan
25cb05bda9
[utils] Remove codec2ext
...
This function was originally used for determining file extensions of DASH
formats. Now in DASH, ext is determined by mime_type. See #8766 for more
information.
2016-03-11 23:51:42 +08:00
Yen Chi Hsuan
6d210f2090
[utils] Add more codecs to codec2ext
...
BBC uses avc3. Here's an example (thanks to @remitamine for this example)
http://rdmedia.bbc.co.uk/dash/ondemand/bbb/2/client_manifest-common_init.mpd
See also https://trac.ffmpeg.org/ticket/5217
2016-03-06 17:57:48 +08:00
Yen Chi Hsuan
19a17d4623
[utils] Add codec2ext
2016-03-05 18:18:28 +08:00
Jaime Marquínez Ferrándiz
3233a68fbb
[utils] update_url_query: Encode the strings in the query dict
...
The test case with {'test': '第二行тест'} was failing on python 2 (the non-ascii characters were replaced with '?').
2016-03-04 22:18:40 +01:00
remitamine
1255733945
Merge pull request #8739 from remitamine/update_url_params
...
[utils] add update_url_query function to create or update query string params
2016-03-03 19:24:04 +01:00
remitamine
38f9ef31dc
[utils] add update_url_query function
2016-03-03 18:34:52 +01:00
Yen Chi Hsuan
0cae023b24
Merge branch 'jython-support'
...
Closes #8302
2016-03-03 18:49:32 +08:00
Yen Chi Hsuan
8ee239e921
[utils] Jython support - handle filenames correctly
...
Now test:youtube downloads
2016-03-03 18:47:54 +08:00
Brian Foley
8bb56eeeea
[utils] Add extract_attributes for extracting html tag attributes
...
This is much more robust than just using regexps, and handles all
the common scenarios, such as empty/no values, repeated attributes,
entity decoding, mixed case names, and the different possible value
quoting schemes.
2016-03-03 10:11:37 +00:00
remitamine
e07237f640
[utils] remove check for val from find_xpath_attr
2016-03-02 21:40:21 +01:00
Yen Chi Hsuan
5eb6bdced4
[utils] Multiple changes to base_n()
...
1. Renamed to encode_base_n()
2. Allow tables longer than 62 characters
3. Raise ValueError instead of AssertionError for invalid input data
4. Return the first character in the table instead of '0' for number 0
5. Add tests
2016-02-27 03:22:52 +08:00
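The changes listed for encode_base_n() in 5eb6bdced4 can be illustrated with a minimal sketch. This is not the code from the commit: the default table and the argument names are assumptions chosen for illustration.

```python
import string

# Hypothetical default table: digits, then lowercase, then uppercase letters.
DEFAULT_TABLE = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base_n(num, n, table=None):
    """Encode a non-negative integer in base n using the characters of table."""
    if table is None:
        table = DEFAULT_TABLE
    # Invalid input raises ValueError rather than relying on assert.
    if n < 2 or n > len(table):
        raise ValueError('base %d cannot be used with a table of length %d' % (n, len(table)))
    if num < 0:
        raise ValueError('num must be non-negative')
    # Zero maps to the first table character, which need not be '0'.
    if num == 0:
        return table[0]
    digits = []
    while num:
        digits.append(table[num % n])
        num //= n
    return ''.join(reversed(digits))
```

Because the base is only checked against the length of the table, a caller can pass a custom table longer than 62 characters and use a correspondingly larger base.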
Yen Chi Hsuan
680079be39
[utils] Relaxing regex in decode_packed_codes for vidzi
2016-02-26 15:13:03 +08:00
Yen Chi Hsuan
f52354a889
[utils] Move codes for handling eval() from iqiyi.py
2016-02-26 14:58:29 +08:00
Yen Chi Hsuan
59f898b7a7
[utils] Merge base_n functions
2016-02-26 14:37:20 +08:00
Yen Chi Hsuan
481888294d
[utils] Add base36 for use in Vidzi
2016-02-26 14:26:26 +08:00
Yen Chi Hsuan
81bdc8fdf6
[utils] Move base62 to utils
2016-02-26 14:26:26 +08:00
Sergey M․
f160785c5c
[utils] Remove AM/PM from unified_strdate patterns
2016-02-25 00:52:49 +06:00
Yen Chi Hsuan
b95dc034ca
[utils] Implement cache for OnDemandPagedList
2016-02-23 13:11:20 +08:00
remitamine
cafcf657a4
add more subtitles mime types to mimetype2ext and fix the platform subtitle extraction
2016-02-20 22:02:03 +01:00
Yen Chi Hsuan
c1c05c67ea
[utils] Jython support - disable setproctitle() until ctypes is complete
2016-02-21 03:32:03 +08:00
Yen Chi Hsuan
399a76e67b
[utils] Jython support: tolerate missing fcntl module
2016-02-21 03:32:03 +08:00
Jaime Marquínez Ferrándiz
765ac263db
[utils] mimetype2ext: return 'm4a' for 'audio/mp4' (fixes #8620)
...
The youtube extractor was using 'mp4' for them, therefore filters like 'bestaudio[ext=m4a]' stopped working (94278f7202 broke it).
2016-02-20 19:55:10 +01:00
Yen Chi Hsuan
5bc880b988
[utils] Add OHDave's RSA encryption function
2016-02-20 19:54:58 +08:00
Sergey M․
611c1dd96e
[refactor] Single quotes consistency
2016-02-14 15:37:17 +06:00
Sergey M․
d800609c62
[refactor] Do not specify redundant None as second argument in dict.get()
2016-02-14 14:25:04 +06:00
Sergey M․
9c7b38981c
[utils] Bump Firefox version in User-Agent
...
Old version number causes Youtube not to serve some formats in ytplayer.config
2016-02-11 23:12:30 +06:00
Sergey M․
8411229bd5
[utils] Allow dot in strip_jsonp
2016-02-07 19:47:09 +06:00
Sergey M․
86296ad2cd
[utils] Add ability to control skipping false values in dict_get
2016-02-07 08:13:04 +06:00
Sergey M․
cbecc9b903
[utils] Add dict_get convenience method
2016-02-07 06:12:53 +06:00
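The two dict_get commits above (cbecc9b903 and 86296ad2cd) describe a lookup helper that tries several keys and can optionally skip false values. A minimal sketch of that idea follows; the signature and parameter names are assumptions for illustration and may differ from the ones in utils.py.

```python
def dict_get(d, key_or_keys, default=None, skip_false_values=True):
    """Return the first usable value among the given keys, or default."""
    if isinstance(key_or_keys, (list, tuple)):
        for key in key_or_keys:
            value = d.get(key)
            # None is always skipped; other false values only when requested.
            if value is None or (skip_false_values and not value):
                continue
            return value
        return default
    return d.get(key_or_keys, default)


info = {'title': '', 'alt_title': 'Fallback'}
print(dict_get(info, ('title', 'alt_title')))
print(repr(dict_get(info, ('title', 'alt_title'), skip_false_values=False)))
```

With the flag left at its assumed default, the empty title is skipped in favour of the fallback; with it disabled, the empty string is returned as found.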
Jaime Marquínez Ferrándiz
87de7069b9
[utils] dfxp2srt: make TTMLPElementParser inherit from object
...
For consistency between python 2 and 3.
2016-02-02 22:30:13 +01:00
remitamine
2b14cb566f
[utils] fix dfxp2srt text extraction (fixes #8055)
2016-01-28 12:38:34 +01:00
Yen Chi Hsuan
a0d8d704df
[utils] Reorder items in mimetype2ext alphabetically
2016-01-25 01:01:15 +08:00
Yen Chi Hsuan
f6861ec96f
[utils] Add more items to mimetype2ext (#8293)
...
These are used in Youtube formats
2016-01-25 00:58:53 +08:00
remitamine
6ec6cb4e95
Revert "fix typos"
...
This reverts commit 36a0e46c39.
2016-01-10 19:27:22 +01:00
remitamine
36a0e46c39
fix typos
2016-01-10 17:55:41 +01:00
Jakub Wilk
dfb1b1468c
Fix typos
...
Closes #8200.
2016-01-10 17:24:28 +01:00
Sergey M․
a7aaa39863
[utils] Extract known extensions for reuse
2016-01-04 01:08:34 +06:00
Yen Chi Hsuan
c047270c02
[utils] Remove Content-encoding from headers after decompression
...
With cn_verification_proxy, our http_response() is called twice: once from
PerRequestProxyHandler.proxy_open() and once from the normal
YoutubeDL.urlopen(). As a result, for proxies honoring Accept-Encoding, the
following bug occurs:
$ youtube-dl -vs --cn-verification-proxy https://secure.uku.im:993 "test:letv"
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-vs', '--cn-verification-proxy', 'https://secure.uku.im:993', 'test:letv']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2015.12.23
[debug] Git HEAD: 97f18fa
[debug] Python version 3.5.1 - Linux-4.3.3-1-ARCH-x86_64-with-arch-Arch-Linux
[debug] exe versions: ffmpeg 2.8.4, ffprobe 2.8.4, rtmpdump 2.4
[debug] Proxy map: {}
[TestURL] Test URL: http://www.letv.com/ptv/vplay/22005890.html
[Letv] 22005890: Downloading webpage
[Letv] 22005890: Downloading playJson data
ERROR: Unable to download JSON metadata: Not a gzipped file (b'{"') (caused by OSError('Not a gzipped file (b\'{"\')',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/common.py", line 330, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 1886, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.5/urllib/request.py", line 471, in open
    response = meth(req, response)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/utils.py", line 773, in http_response
    raise original_ioerror
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/utils.py", line 761, in http_response
    uncompressed = io.BytesIO(gz.read())
  File "/usr/lib/python3.5/gzip.py", line 274, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.5/gzip.py", line 461, in read
    if not self._read_gzip_header():
  File "/usr/lib/python3.5/gzip.py", line 409, in _read_gzip_header
    raise OSError('Not a gzipped file (%r)' % magic)
2015-12-28 01:09:18 +08:00
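The failure in c047270c02 comes from decompressing a body that has already been decompressed, because the Content-Encoding header still claims gzip on the second pass. The sketch below shows the general remedy of dropping the header once the body is decoded. It is not the handler code from utils.py: `decompress_once` is a hypothetical helper, and headers are modelled as a plain dict for simplicity.

```python
import gzip
import io


def decompress_once(headers, body):
    """Return (headers, body) with gzip decoding applied at most once."""
    if headers.get('Content-Encoding', '').lower() != 'gzip':
        # Nothing to do: the body is not (or no longer) marked as gzipped.
        return headers, body
    decoded = gzip.GzipFile(fileobj=io.BytesIO(body)).read()
    # Drop the header so a later pass does not try to decode again.
    new_headers = dict(headers)
    del new_headers['Content-Encoding']
    return new_headers, decoded


payload = b'{"ok": true}'
headers, body = decompress_once({'Content-Encoding': 'gzip'}, gzip.compress(payload))
# A second pass, as happens when two handlers see the same response, is a no-op.
headers, body = decompress_once(headers, body)
print(body)
```

Without the header removal, the second call would attempt to read the already decoded JSON as gzip data and fail with the "Not a gzipped file" error shown in the log above.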
Sergey M․
9b9c5355e4
Rename error_to_str to error_to_compat_str
2015-12-20 07:00:39 +06:00
Sergey M․
8e60dc7526
[utils] Add encode_compat_str
2015-12-20 06:26:26 +06:00
Sergey M․
fdae235858
[utils] Add error_to_str
2015-12-20 05:26:47 +06:00
Yen Chi Hsuan
db2fe38b55
[utils] Support alternative timestamp format in TTML
...
Fixes #7608
2015-12-19 19:29:51 +08:00
Yen Chi Hsuan
d631d5f9f2
[utils] Fix TTML conversion
...
Tolerate invalid timestamps (closes #7909)
2015-12-19 18:21:42 +08:00
Sergey M․
31b2051e21
[utils] Add remove_quotes
2015-12-14 21:30:58 +06:00
Yen Chi Hsuan
992fc9d6e1
[utils] Refactor handle_youtubedl_headers for future extension
2015-11-29 12:58:29 +08:00
Yen Chi Hsuan
0424ec307b
[utils] Correct docstring of YoutubeDLHandler
2015-11-29 12:46:04 +08:00
Yen Chi Hsuan
87f0e62d94
[utils] Separate codes for handling Youtubedl-* headers
2015-11-29 12:42:50 +08:00
Sergey M․
67dda51722
Rename compat_urllib_request_Request to sanitized_Request and move to utils
2015-11-23 21:55:15 +06:00
Sergey M․
9cb9a5df77
[utils] Check ext with trailing slash against the list of known extensions
2015-11-22 17:27:13 +06:00
Sergey M․
3e12bc583a
[utils] Improve determine_ext (Closes #7593)
2015-11-22 06:29:39 +06:00
Sergey M․
7e1f5447e7
[utils] Improve encode_dict
2015-11-21 20:46:33 +06:00
Sergey M․
7a3f0c00ad
[utils] Style
2015-11-16 20:24:09 +06:00
Sergey M․
7aefc49c40
[utils] Skip invalid/non HTML entities (Closes #7518)
2015-11-16 20:20:16 +06:00
Jaime Marquínez Ferrándiz
6a75040278
[utils] unified_strdate: Return None if the date format can't be recognized (fixes #7340)
...
This issue was introduced with ae12bc3ebb, which returned 'None'.
2015-11-02 14:08:38 +01:00
Sergey M․
c90d16cf36
[utils:sanitize_path] Disallow trailing whitespace in path segment (Closes #7332)
2015-11-02 04:26:20 +06:00
Sergey M
30eecc6a04
Merge pull request #7296 from jaimeMF/xml_attrib_unicode
...
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (…
2015-10-31 18:15:21 +00:00
Sergey M․
ae12bc3ebb
[utils] Make unified_strdate always return unicode string
2015-10-31 23:07:37 +06:00
Sergey M․
578c074575
[utils] Support list of xpath in xpath_element
2015-10-31 22:39:44 +06:00
Sergey M․
52c3a6e49d
[utils] Improve parse_iso8601
2015-10-28 21:40:22 +06:00
Jaime Marquínez Ferrándiz
f78546272c
[compat] compat_etree_fromstring: also decode the text attribute
...
Deletes parse_xml from utils, because it also does it.
2015-10-26 16:41:24 +01:00
Jaime Marquínez Ferrándiz
36e6f62cd0
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (#7178)
...
Attributes aren't unicode objects, so they couldn't be directly used in info_dict fields (for example '--write-description' doesn't work with bytes).
2015-10-25 20:13:16 +01:00
Sergey M․
d01949dc89
[utils:js_to_json] Fix bad escape in double quoted strings
2015-10-20 23:09:51 +06:00
Yen Chi Hsuan
1e399778ee
[letv] Fix extraction
...
Using data URIs for passing the decrypted M3U8 manifest, which is
supported by ffmpeg only.
2015-10-18 13:42:57 +08:00
Sergey M․
af98f8ff37
[utils] Return default on fail in int_or_none
2015-10-14 22:37:03 +06:00
Sergey M․
caf80631f0
[utils] Do not fail in float_or_none on non-numeric data
2015-10-14 22:36:37 +06:00
Sergey M․
1812afb7b3
[utils] Do not fail in int_or_none on non-numeric data (Closes #7175)
2015-10-14 22:35:01 +06:00
Sergey M․
5a1a2e9454
[utils] Fix kwargs on old python 2 (Closes #6905)
2015-09-20 21:08:29 +06:00