Issue #6268: More bugfixes about BOM, UTF-16 and UTF-32
* Fix seek() method of codecs.open(), don't write the BOM twice after seek(0)
* Fix reset() method of codecs, UTF-16, UTF-32 and StreamWriter classes
* test_codecs: use "w+" mode instead of "wt+". "t" mode is not supported by
Solaris or Windows, but does it really exist? I found it the in the issue.
........
r81472 | victor.stinner | 2010-05-22 15:44:25 +0200 (sam., 22 mai 2010) | 4 lines
Fix my last commit (r81471) about codecs
Rememder: don't touch the code just before a commit
........
................
Issue #5753: A new C API function, :cfunc:`PySys_SetArgvEx`, allows
embedders of the interpreter to set sys.argv without also modifying
sys.path. This helps fix `CVE-2008-5983
<http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-5983>`_.
........
................
Bug 7755: audiotest.au is arguably copyrighted material, but definitely makes
Debian unhappy. The actual contents of the audio clip are unimportant, so
replace it with something that we know is okay. Guido likes woodpeckers.
........
Issue #8766: Initialize _warnings module before importing the first module.
Fix a crash if an empty directory called "encodings" exists in sys.path.
........
Issue #8559: improve unicode support of (gdb) libpython.py
* Escape non printable characters (use locale.getpreferredencoding())
* Fix support of surrogate pairs
* test_gdb.py: use ascii() instead of repr() in gdb program arguments to avoid
encoding issues
* Fix test_strings() of test_gdb.py for encoding different than UTF-8
(eg. ACSII)
........
Merged revisions 80030,80067,80069,80080-80081,80084,80432-80433,80465-80470,81059,81065-81067 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
Add an x-ref to where the O_ constants are documented and move the SEEK_ constants after lseek().
........
r80080 | georg.brandl | 2010-04-14 21:16:38 +0200 (Mi, 14 Apr 2010) | 1 line
#8399: add note about Windows and O_BINARY.
........
r80081 | georg.brandl | 2010-04-14 23:34:44 +0200 (Mi, 14 Apr 2010) | 1 line
#5250: document __instancecheck__ and __subclasscheck__. I hope the part about the class/metaclass distinction is understandable.
........
r80084 | georg.brandl | 2010-04-14 23:46:45 +0200 (Mi, 14 Apr 2010) | 1 line
#7507: quote "!" in pipes.quote(); it is a special character for some shells.
........
r80465 | georg.brandl | 2010-04-25 12:29:17 +0200 (So, 25 Apr 2010) | 1 line
Remove LaTeXy index entry syntax.
........
r80466 | georg.brandl | 2010-04-25 12:54:42 +0200 (So, 25 Apr 2010) | 1 line
Patch from Tim Hatch: Better cross-referencing in socket and winreg docs.
........
r80467 | georg.brandl | 2010-04-25 12:55:16 +0200 (So, 25 Apr 2010) | 1 line
Patch from Tim Hatch: Remove reference to winreg being the fabled high-level registry interface.
........
r80468 | georg.brandl | 2010-04-25 12:55:58 +0200 (So, 25 Apr 2010) | 1 line
Patch from Tim Hatch: Minor spelling changes to _winreg docs.
........
r80469 | georg.brandl | 2010-04-25 12:56:41 +0200 (So, 25 Apr 2010) | 1 line
Fix code example to have valid syntax so that it can be highlighted.
........
r80470 | georg.brandl | 2010-04-25 12:57:15 +0200 (So, 25 Apr 2010) | 1 line
Patch from Tim Hatch: Make socket setblocking <-> settimeout examples symmetric.
........
r81059 | georg.brandl | 2010-05-10 23:02:51 +0200 (Mo, 10 Mai 2010) | 1 line
#8642: fix wrong function name.
........
r81065 | georg.brandl | 2010-05-10 23:46:50 +0200 (Mo, 10 Mai 2010) | 1 line
Fix reference direction.
........
r81066 | georg.brandl | 2010-05-10 23:50:57 +0200 (Mo, 10 Mai 2010) | 1 line
Consolidate deprecation messages.
........
r81067 | georg.brandl | 2010-05-10 23:51:33 +0200 (Mo, 10 Mai 2010) | 1 line
Patch from Tim Hatch: Better cross-referencing in socket and winreg docs.
........
r80467 | georg.brandl | 2010-04-25 12:55:16 +0200 (So, 25 Apr 2010) | 1 line
Patch from Tim Hatch: Remove reference to winreg being the fabled high-level registry interface.
........
r80468 | georg.brandl | 2010-04-25 12:55:58 +0200 (So, 25 Apr 2010) | 1 line
Patch from Tim Hatch: Minor spelling changes to _winreg docs.
........
r80469 | georg.brandl | 2010-04-25 12:56:41 +0200 (So, 25 Apr 2010) | 1 line
Fix code example to have valid syntax so that it can be highlighted.
........
................
Issue #8663: distutils.log emulates backslashreplace error handler. Fix
compilation in a non-ASCII directory if stdout encoding is ASCII (eg. if stdout
is not a TTY).
........
r81360 | victor.stinner | 2010-05-19 19:11:19 +0200 (mer., 19 mai 2010) | 5 lines
regrtest.py: call replace_stdout() before the first call to print()
print("== ", os.getcwd()) fails if the current working directory is not ASCII
whereas sys.stdout encoding is ASCII.
........
r81361 | victor.stinner | 2010-05-19 19:15:50 +0200 (mer., 19 mai 2010) | 2 lines
Oops, add the new test_log.py for distutils test suite (missing part of r81359)
........
Issue #8589: Decode PYTHONWARNINGS environment variable with the file system
encoding and surrogateespace error handler instead of the locale encoding to be
consistent with os.environ. Add PySys_AddWarnOptionUnicode() function.
........
Issue #8513: os.get_exec_path() supports b'PATH' key and bytes value.
subprocess.Popen() and os._execvpe() support bytes program name. Add
os.supports_bytes_environ flag: True if the native OS type of the environment
is bytes (eg. False on Windows).
........
r81292 | victor.stinner | 2010-05-18 19:24:09 +0200 (mar., 18 mai 2010) | 2 lines
Add versionadded (3.2) tag to os.supports_bytes_environ documentation
........
Fix issue #8573 (asyncore._strerror bug): fixed os.strerror typo; included NameError in the tuple of expected exception; added test case for asyncore._strerror.
........
................
Issue #8633: Support for POSIX.1-2008 binary pax headers.
tarfile is now able to read and write pax headers with a
"hdrcharset=BINARY" record. This record was introduced in
POSIX.1-2008 as a method to store unencoded binary strings that
cannot be translated to UTF-8. In practice, this is just a workaround
that allows a tar implementation to store filenames that do not
comply with the current filesystem encoding and thus cannot be
decoded correctly.
Additionally, tarfile works around a bug in current versions of GNU
tar: undecodable filenames are stored as-is in a pax header without a
"hdrcharset" record being added. Technically, these headers are
invalid, but tarfile manages to read them correctly anyway.
........
Issue #6697: Fix a crash if code of "python -c code" contains surrogates
........
r81251 | victor.stinner | 2010-05-17 03:26:01 +0200 (lun., 17 mai 2010) | 3 lines
PyObject_Dump() encodes unicode objects to utf8 with backslashreplace (instead
of strict) error handler to escape surrogates
........
r81252 | victor.stinner | 2010-05-17 10:58:51 +0200 (lun., 17 mai 2010) | 6 lines
handle_system_exit() flushs files to warranty the output order
PyObject_Print() writes into the C object stderr, whereas PySys_WriteStderr()
writes into the Python object sys.stderr. Each object has its own buffer, so
call sys.stderr.flush() and fflush(stderr).
........
r81253 | victor.stinner | 2010-05-17 11:33:42 +0200 (lun., 17 mai 2010) | 6 lines
Fix refleak in internal_print() introduced by myself in r81251
_PyUnicode_AsDefaultEncodedString() uses a magical PyUnicode attribute to
automatically destroy PyUnicode_EncodeUTF8() result when the unicode string is
destroyed.
........
test_os: cleanup test_internal_execvpe() and os._execvpe() mockup
* Replace os.defpath instead of os.get_exec_path() to test also
os.get_exec_path()
* Use contextlib.contextmanager, move the mockup outside the class, and
the mockup returns directly the call list object
* Use two different contexts for the two tests
* Use more revelant values and names
........
Clear the OpenSSL error queue each time an error is signalled.
When the error queue is not emptied, strange things can happen on the next SSL call, depending on the OpenSSL version.
........
................
Issue #8692: Improve performance of math.factorial:
(1) use a different algorithm that roughly halves the total number of
multiplications required and results in more balanced multiplications
(2) use a lookup table for small arguments
(3) fast accumulation of products in C integer arithmetic rather than
PyLong arithmetic when possible.
Typical speedup, from unscientific testing on a 64-bit laptop, is 4.5x
to 6.5x for arguments in the range 100 - 10000.
Patch by Daniel Stutzbach; extensive reviews by Alexander Belopolsky.
........
Issue #8715: Create PyUnicode_EncodeFSDefault() function: Encode a Unicode
object to Py_FileSystemDefaultEncoding with the "surrogateescape" error
handler, return a bytes object. If Py_FileSystemDefaultEncoding is not set,
fall back to UTF-8.
........
Issue #8610: Load file system codec at startup, and display a fatal error on
failure. Set the file system encoding to utf-8 (instead of None) if getting
the locale encoding failed, or if nl_langinfo(CODESET) function is missing.
........
Victor Stinner [Fri, 14 May 2010 20:08:55 +0000 (20:08 +0000)]
test/support.py: remove TESTFN if it is a directory
Because of my previous commit (r81171), test_os failed without removing TESTFN
directory (shutil.rmtree() was broken). Some buildbots still have a @test
directory and some tests fail because of that.
PyUnicode_DecodeFSDefault*() doesn't use surrogateescape error handler, and so
PyUnicode_FromEncodedObject(v, Py_FileSystemDefaultEncoding, "surrogateescape")
cannot be replaced by PyUnicode_DecodeFSDefault().
It's a bad idea to try to fix surrogates things in Python 3.1...
Use directly PyUnicode_DecodeFSDefaultAndSize() instead of
PyBytes_FromStringAndSize() + PyUnicode_FromEncodedObject() if the argument is
unicode.
........
* Add paragraph titles to c-api/unicode.rst.
* Fix PyUnicode_DecodeFSDefault*() comment: it now uses the "surrogateescape"
error handler (and not "replace")
* Remove "The function is intended to be used for paths and file names only
during bootstrapping process where the codecs are not set up." from
PyUnicode_FSConverter() comment: it is used after the bootstrapping and for
other purposes than file names
........
Issue #8681: Make the zlib module's error messages more informative when
the zlib itself doesn't give any detailed explanation.
........
................
Issue #8672: Add a zlib test ensuring that an incomplete stream can be
handled by a decompressor object without errors (it returns incomplete
uncompressed data).
........
................
#8575 - Update and reorganize some _winreg contents.
I've removed the hopeful note about a future higher-level module since
it's been in there for quite a long time and nothing of the sort has
come up. There are a few places where markup was added to cross-reference
other sections, and many of the external links have been removed and now
point to newly created sections containing previously undocumented
information.
The Value Types section was created and it's contents were taken from
a function-specific area, since it applies to more than just that
function. It fits in better with the other newly documented constants.
........
Merged revisions 79911,79916-79917,80018,80418,80572-80573,80635-80639,80668,80922 via svnmerge from
svn+ssh://pythondev@svn.python.org/sandbox/trunk/2to3/lib2to3
use absolute import
........
r79916 | benjamin.peterson | 2010-04-09 16:05:21 -0500 (Fri, 09 Apr 2010) | 1 line
generalize detection of __future__ imports and attach them to the tree
........
r79917 | benjamin.peterson | 2010-04-09 16:11:44 -0500 (Fri, 09 Apr 2010) | 1 line
don't try to 'fix' relative imports when absolute_import is enabled #8858
........
r80018 | benjamin.peterson | 2010-04-12 16:12:12 -0500 (Mon, 12 Apr 2010) | 4 lines
prevent diffs from being mangled is multiprocess mode #6409
Patch by George Boutsioukis.
........
r80418 | benjamin.peterson | 2010-04-23 16:00:03 -0500 (Fri, 23 Apr 2010) | 1 line
Don't transform imports that are already relative. 2to3 turned
from . import refactor
into
from .. import refactor
which broke the transformation of 2to3 itself.
........
r80635 | benjamin.peterson | 2010-04-29 16:02:23 -0500 (Thu, 29 Apr 2010) | 1 line
pass string to Number
........
r80668 | jeffrey.yasskin | 2010-04-30 18:02:47 -0500 (Fri, 30 Apr 2010) | 4 lines
Make 2to3 run under Python 2.5 so that the benchmark suite at
http://hg.python.org/benchmarks/ can use it and still run on implementations
that haven't gotten to 2.6 yet. Fixes issue 8566.
........
r80922 | benjamin.peterson | 2010-05-07 11:06:25 -0500 (Fri, 07 May 2010) | 1 line
prevent xrange transformation from wrapping range calls it produces in list
........
................
................
PyErr_SetFromErrnoWithFilename() decodes the filename using
PyUnicode_DecodeFSDefault() instead of PyUnicode_FromString()
........
r80950 | victor.stinner | 2010-05-08 02:36:42 +0200 (sam., 08 mai 2010) | 5 lines
posix_error_with_allocated_filename() decodes the filename with
PyUnicode_DecodeFSDefaultAndSize() and call
PyErr_SetFromErrnoWithFilenameObject() instead of
PyErr_SetFromErrnoWithFilename()
........
r80971 | victor.stinner | 2010-05-08 13:10:09 +0200 (sam., 08 mai 2010) | 3 lines
Issue #8514: Add os.fsencode() function (Unix only): encode a string to bytes
for use in the file system, environment variables or the command line.
........
Issue #8571: Fix an internal error when compressing or decompressing a
chunk larger than 1GB with the zlib module's compressor and decompressor
objects.
........
................
Issue #8603: Create a bytes version of os.environ for Unix
Create os.environb mapping and os.getenvb() function, os.unsetenv() encodes str
argument to the file system encoding with the surrogateescape error handler
(instead of utf8/strict) and accepts bytes, and posix.environ keys and values
are bytes.
........
Fix asyncore issues 8573 and 8483: _strerror might throw ValueError; asyncore.__getattr__ cheap inheritance caused confusing error messages when accessing undefined class attributes; added an alias for __str__ which now is used as a fallback for __repr__
........
................
Issue #7472: ISO-2022 charsets now consistently use 7bit CTE.
Fixed a typo in the email.encoders module so that messages output using
an ISO-2022 character set will use a content-transfer-encoding of
7bit consistently. Previously if the input data had any eight bit
characters the output data would get marked as 8bit even though it
was actually 7bit.
........
................
r80855 | r.david.murray | 2010-05-05 21:41:14 -0400 (Wed, 05 May 2010) | 24 lines
Merged revisions 80800 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
It turns out that email5 (py3k), because it is using unicode for the
payload, doesn't do the encoding to the output character set until later
in the process. Specifically, charset.body_encode no longer does the
input-to-output charset conversion. So the if test in the exception
clause in encoders.encode_7or8bit really is needed in email5.
So, this merge only merges the test, not the removal of the 'if'.
Issue #7472: remove unused code from email.encoders.encode_7or8bit.
Yukihiro Nakadaira noticed a typo in encode_7or8bit that was trying
to special case iso-2022 codecs. It turns out that the code in
question is never used, because whereas it was designed to trigger
if the payload encoding was eight bit but its output encoding was
7 bit, in practice the payload is always converted to the 7bit
encoding before encode_7or8bit is called. Patch by Shawat Anand.
........
................