Pablo Galindo [Sun, 13 Oct 2019 15:48:59 +0000 (16:48 +0100)]
bpo-38379: Don't block collection of unreachable objects when some objects resurrect (GH-16687)
Currently, if any finalizer invoked during garbage collection resurrects any object, the gc gives up and aborts the collection. Although finalizers are guaranteed to run only once per object, this behaviour of the gc can lead to ever-increasing memory usage if new resurrecting objects are allocated in every new gc collection.
To avoid this, recompute which objects among the unreachable set need to be resurrected and which objects can be safely collected. In this way, resurrecting objects will not block the collection of other objects in the unreachable set.
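A minimal sketch of the scenario, assuming an interpreter that includes this change (CPython 3.9 or later). The names `Resurrector`, `Plain`, `survivors`, `count_instances` and `demo` are hypothetical and exist only for illustration. One object resurrects itself from its finalizer while an unrelated reference cycle becomes unreachable in the same collection; with the recomputation described above, the unrelated cycle should still be freed.

```python
import gc

survivors = []  # hypothetical global that keeps resurrected objects alive


class Resurrector:
    def __init__(self):
        self.me = self  # reference cycle: only the cyclic GC can free it

    def __del__(self):
        survivors.append(self)  # the finalizer resurrects the object


class Plain:
    def __init__(self):
        self.me = self  # ordinary cyclic garbage, no finalizer


def count_instances(cls):
    """Count objects of exactly this class that the GC still tracks."""
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)


def demo():
    survivors.clear()
    gc.collect()      # start from a clean state
    gc.disable()      # make sure both cycles are seen by the same collection
    try:
        Resurrector()  # becomes unreachable immediately
        Plain()        # becomes unreachable immediately
        gc.collect()
    finally:
        gc.enable()
    return len(survivors), count_instances(Plain)


print(demo())
```

On an interpreter with this change, `demo()` should report one survivor and zero remaining `Plain` instances after a single collection.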
Gregory P. Smith [Sat, 12 Oct 2019 23:35:53 +0000 (16:35 -0700)]
bpo-38456: Use /bin/true in test_subprocess (GH-16736)
* bpo-38456: Use /bin/true in test_subprocess.
Instead of sys.executable, "-c", "pass" or "import sys; sys.exit(0)"
use /bin/true when it is available. On a reasonable machine this
shaves up to two seconds wall time off the otherwise ~40sec execution
on a --with-pydebug build. It should be more notable on many
buildbots or overloaded slower I/O systems (CI, etc).
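A rough sketch of the idea, not the code from the commit: pick the cheapest available command that exits with status 0 and fall back to starting an interpreter. The helper name `zero_return_cmd` is an assumption made for this example.

```python
import os
import subprocess
import sys


def zero_return_cmd():
    """Return the cheapest available command that exits with status 0."""
    if os.access("/bin/true", os.X_OK):
        return ("/bin/true",)
    # Fallback: start a full interpreter, which is much slower.
    return (sys.executable, "-c", "pass")


result = subprocess.run(zero_return_cmd())
print(result.returncode)
```

The saving comes from skipping interpreter startup in tests that only need a child process that exits successfully.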
Victor Stinner [Thu, 10 Oct 2019 19:30:20 +0000 (21:30 +0200)]
bpo-38282: Rewrite getsockaddrarg() helper function (GH-16698)
Rewrite getsockaddrarg() helper function of socketmodule.c (_socket
module) to prevent a false alarm when compiling code using GCC with
_FORTIFY_SOURCE=2. Pass a pointer to the sock_addr_t union, rather
than passing a pointer to a sockaddr structure.
Add "struct sockaddr_tipc tipc;" to the sock_addr_t union.
M. Eric Irrgang [Thu, 10 Oct 2019 11:11:33 +0000 (14:11 +0300)]
bpo-32996: Documentation fix-up. (GH-16646)
PR #4906 changed the typing.Generic class hierarchy, leaving an
outdated comment in the library reference. User-defined Generic ABCs now
must get an abc.ABCMeta metaclass from something other than typing.Generic
inheritance.
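A short illustration of the point, assuming Python 3.7 or later. The class names `Repository` and `IntRepository` are made up for this example. Because typing.Generic no longer brings a metaclass of its own, the ABC behaviour has to come from abc.ABC (or an explicit `metaclass=abc.ABCMeta`).

```python
import abc
from typing import Generic, TypeVar

T = TypeVar("T")


class Repository(Generic[T], abc.ABC):
    """A user-defined generic ABC; abc.ABC supplies the ABCMeta metaclass."""

    @abc.abstractmethod
    def get(self, key: str) -> T:
        ...


class IntRepository(Repository[int]):
    def get(self, key: str) -> int:
        return len(key)


print(type(Generic))     # plain `type`, no ABCMeta involved
print(type(Repository))  # ABCMeta, inherited through abc.ABC
print(IntRepository().get("abc"))
```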
Ronan Lamy [Thu, 10 Oct 2019 07:34:46 +0000 (09:34 +0200)]
bpo-38109: Add missing constants to Lib/stat.py (GH-16665)
Add missing stat.S_IFDOOR, stat.S_IFPORT, stat.S_IFWHT,
stat.S_ISDOOR, stat.S_ISPORT, and stat.S_ISWHT values to
the Python implementation of the stat module.
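A small usage sketch of the predicates named above. The helper `special_type_flags` is a hypothetical name. On platforms without doors, event ports or whiteouts, these predicates are expected to return False for every mode.

```python
import os
import stat


def special_type_flags(path):
    """Report the door/port/whiteout predicates for one path."""
    mode = os.stat(path).st_mode
    return {
        name: bool(getattr(stat, name)(mode))
        for name in ("S_ISDOOR", "S_ISPORT", "S_ISWHT")
    }


print(special_type_flags(os.curdir))
```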
Victor Stinner [Tue, 8 Oct 2019 16:45:43 +0000 (18:45 +0200)]
bpo-37531: regrtest ignores output on timeout (GH-16659)
bpo-37531, bpo-38207: On timeout, regrtest no longer attempts to call
popen.communicate() again: it can hang until all child processes
using stdout and stderr pipes complete. Kill the worker process and
ignore its output.
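The pattern can be sketched as follows. This is an illustrative helper, not regrtest's code, and the name `run_with_timeout` is an assumption. On timeout the child is killed and its pipe is closed without a second communicate() call, so a grandchild that still holds the pipe open cannot make the caller hang.

```python
import subprocess
import sys


def run_with_timeout(args, timeout):
    """Run a command; on timeout, kill it and discard its output."""
    popen = subprocess.Popen(
        args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    try:
        out, _ = popen.communicate(timeout=timeout)
        return popen.returncode, out
    except subprocess.TimeoutExpired:
        popen.kill()
        popen.wait()          # reap the killed process
        popen.stdout.close()  # drop the pipe instead of reading it again
        return None, None
```

For example, `run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 0.5)` gives up after half a second and returns no output.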
Pablo Galindo [Tue, 8 Oct 2019 15:30:50 +0000 (16:30 +0100)]
bpo-38395: Fix ownership in weakref.proxy methods (GH-16632)
The implementation of weakref.proxy's methods calls back into the Python
API using a borrowed reference to the weakly referenced object
(acquired via PyWeakref_GET_OBJECT). This API call may delete the last
reference to the object (either directly or via GC), leaving a dangling
pointer, which can be subsequently dereferenced.
To fix this, claim temporary ownership of the referenced object when
calling the appropriate method. Some functions are not affected at the moment because they
do not need to access the borrowed referent, but to protect against
future changes to these functions, ownership needs to be fixed in
all potentially affected methods.
subtract_refs() now passes the parent object to visit_decref(), which
passes it to _PyObject_ASSERT(). So if the "is freed" assertion fails,
the parent is used in the debug trace, rather than the freed object. The
parent object is more likely to contain useful information. Freed
objects cannot be inspected and are displayed as "<object at xxx is
freed>" with no other detail.
Pablo Galindo [Mon, 7 Oct 2019 23:43:14 +0000 (00:43 +0100)]
bpo-38400 Don't check for NULL linked list pointers in _PyObject_IsFreed (GH-16630)
Some objects like Py_None are not initialized with conventional means
that prepare the circular linked list pointers, leaving them unlinked
from the rest of the objects. For those objects, NULL pointers do
not mean that they are freed, so we need to skip the check in those
cases.
Victor Stinner [Mon, 7 Oct 2019 22:09:31 +0000 (00:09 +0200)]
bpo-38392: PyObject_GC_Track() validates object in debug mode (GH-16615)
In debug mode, PyObject_GC_Track() now calls tp_traverse() of the
object type to ensure that the object is valid: test that objects
visited by tp_traverse() are valid.
Fix pyexpat.c: only track the parser in the GC once the parser is
fully initialized.
Ricardo Bánffy [Mon, 7 Oct 2019 20:54:35 +0000 (21:54 +0100)]
bpo-38294: Add list of no-longer-escaped chars to re.escape documentation. (GH-16442)
Prior to 3.7, re.escape escaped many characters that don't have
special meaning in Python, but that used to require escaping in other
tools and languages. This commit aims to make it clear which characters
were, but no longer are, escaped.
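A quick check of the behaviour, assuming Python 3.7 or later. The constant name `NO_LONGER_ESCAPED` is made up for this example; its characters are the ones listed in the re.escape documentation as no longer escaped.

```python
import re

# Characters the re documentation lists as no longer escaped since 3.7.
NO_LONGER_ESCAPED = "!\"%',/:;<=>@`"

# Any character that re.escape() still changes would show up here.
still_escaped = [ch for ch in NO_LONGER_ESCAPED if re.escape(ch) != ch]
print(still_escaped)

# Characters with special meaning in a pattern are still escaped.
print(re.escape("https://www.python.org"))
```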
Victor Stinner [Mon, 7 Oct 2019 16:42:01 +0000 (18:42 +0200)]
bpo-36389: _PyObject_CheckConsistency() available in release mode (GH-16612)
bpo-36389, bpo-38376: The _PyObject_CheckConsistency() function is
now also available in release mode. For example, it can be used to
debug a crash in the visit_decref() function of the GC.
Modify the following functions to also work in release mode:
* _PyMem_IsPtrFreed(ptr) now also returns 1 if ptr is NULL
(equal to 0).
* _PyBytesWriter_CheckConsistency() now returns 1 and is only used
with assert().
* Reorder _PyObject_Dump() to write safe fields first, and only
attempt to render repr() at the end.
Victor Stinner [Fri, 4 Oct 2019 17:53:43 +0000 (19:53 +0200)]
bpo-38353: getpath.c: allocates strings on the heap (GH-16585)
* _Py_FindEnvConfigValue() now returns a string allocated
by PyMem_RawMalloc().
* calculate_init() now decodes VPATH macro.
* Add calculate_open_pyenv() function.
* Add substring() and joinpath2() functions.
Victor Stinner [Thu, 3 Oct 2019 14:15:16 +0000 (16:15 +0200)]
bpo-36670: Enhance regrtest (GH-16556)
* Add log() method: add timestamp and load average prefixes
to main messages.
* WindowsLoadTracker:
  * LOAD_FACTOR_1 is now computed using SAMPLING_INTERVAL
  * Initialize the load to the arithmetic mean of the first 5 values
    of the Processor Queue Length value (so over 5 seconds), rather
    than 0.0.
  * Handle BrokenPipeError and the case when typeperf exits.
* format_duration(1.5) now returns '1.5 sec', rather than
'1 sec 500 ms'
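The last item can be illustrated with a sketch of a duration formatter that behaves as described. This is written for illustration under that assumption and is not necessarily identical to regrtest's function.

```python
import math


def format_duration(seconds):
    """Format a duration, using a decimal fraction for plain seconds."""
    ms = math.ceil(seconds * 1e3)
    seconds, ms = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)

    parts = []
    if hours:
        parts.append("%s hour" % hours)
    if minutes:
        parts.append("%s min" % minutes)
    if seconds:
        if parts:
            # A larger unit is already shown: keep whole seconds.
            parts.append("%s sec" % seconds)
        else:
            # Seconds only: fold the milliseconds into one decimal value.
            parts.append("%.1f sec" % (seconds + ms / 1000))
    if not parts:
        return "%s ms" % ms
    # Keep at most the two most significant units.
    return " ".join(parts[:2])


print(format_duration(1.5))
```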
Victor Stinner [Wed, 2 Oct 2019 15:52:35 +0000 (17:52 +0200)]
bpo-38338, test.pythoninfo: add more ssl infos (GH-16539)
test.pythoninfo now logs environment variables used by OpenSSL and
Python ssl modules, and logs attributes of 3 SSL contexts
(SSLContext, default HTTPS context, stdlib context).
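A rough sketch of this kind of logging, not the code in test.pythoninfo. The function name `collect_ssl_info`, the environment variable prefixes and the attribute list are all assumptions made for illustration, and only two contexts built through public ssl APIs are shown.

```python
import os
import ssl


def collect_ssl_info():
    """Gather SSL-related environment variables and context attributes."""
    info = {}
    # Assumption: variables with these prefixes are the relevant ones.
    for name, value in os.environ.items():
        if name.startswith(("OPENSSL", "SSL")):
            info["os.environ[%s]" % name] = value

    contexts = {
        "SSLContext": ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT),
        "default_context": ssl.create_default_context(),
    }
    attributes = (
        "minimum_version",
        "maximum_version",
        "options",
        "protocol",
        "verify_mode",
    )
    for ctx_name, ctx in contexts.items():
        for attr in attributes:
            # getattr() with a default tolerates attributes that are missing.
            info["%s.%s" % (ctx_name, attr)] = getattr(ctx, attr, None)
    return info


for key, value in sorted(collect_ssl_info().items()):
    print("%s: %s" % (key, value))
```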
Victor Stinner [Wed, 2 Oct 2019 11:35:11 +0000 (13:35 +0200)]
bpo-36670: regrtest bug fixes (GH-16537)
* Fix TestWorkerProcess.__repr__(): start_time is only valid
if _popen is not None.
* Fix _kill(): don't set _killed to True if _popen is None.
* _run_process(): only set _killed to False after calling
run_test_in_subprocess().
Victor Stinner [Tue, 1 Oct 2019 11:12:29 +0000 (13:12 +0200)]
bpo-37474: Don't call fedisableexcept() on FreeBSD (GH-16515)
On FreeBSD, Python no longer calls fedisableexcept() at startup to
control the floating point control mode. The call became useless
since FreeBSD 6: it became the default mode.
Victor Stinner [Tue, 1 Oct 2019 10:29:36 +0000 (12:29 +0200)]
bpo-36670: Multiple regrtest bugfixes (GH-16511)
* Windows: Fix counter name in WindowsLoadTracker. Counter names are
localized: use the registry to get the counter name. Original
change written by Lorenz Mende.
* Regrtest.main() now ensures that the Windows load tracker is also
killed if an exception is raised
* TestWorkerProcess now ensures that worker processes are no longer
running before exiting: also kill worker processes when an
exception is raised.
* Enhance regrtest messages and warnings: include test name,
duration, add a worker identifier, etc.
* Rename MultiprocessRunner to TestWorkerProcess
* Use print_warning() to display warnings.
Co-Authored-By: Lorenz Mende <Lorenz.mende@gmail.com>
bpo-32689: Updates shutil.move to allow for Path objects to be used as source arg (GH-15326)
Important work originally done by @emilyemorehouse two years ago and nearly ready to go in.
This bug has affected many people and in some cases has been a dealbreaker to the adoption of the otherwise wonderful pathlib and PEP519. https://stackoverflow.com/questions/33625931/copy-file-with-pathlib-in-python.
This adds the outstanding test request from that PR @vstinner (https://github.com/python/cpython/pull/5393).
Test fails without the change, passes with it, along with every other test in test_shutil.
Some variants were experimented with to make the one line change and the most performant one was picked.
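The failure addressed here is that the _basename helper calls str.rstrip() on its argument, which a pathlib.Path does not provide. A minimal sketch of one way to accept both kinds of argument, assuming conversion through os.fspath(); the name `basename_pathlike` is hypothetical and this is not necessarily the exact change that was merged:

```python
import os
import pathlib


def basename_pathlike(path):
    """Return the last path component for a str or os.PathLike argument."""
    path = os.fspath(path)  # a str is returned as-is, a Path becomes a str
    sep = os.path.sep + (os.path.altsep or "")
    return os.path.basename(path.rstrip(sep))


print(basename_pathlike("a/b/"))
print(basename_pathlike(pathlib.Path("a/b/")))
```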
# Added Test for PathLike directory destination, the current fail case
```
def test_move_file_pathlike(self):
# Move a file to another location on the same filesystem.
src = pathlib.Path(self.src_file)
> self._check_move_file(src, self.dst_dir, self.dst_file)
def _basename(path):
# A basename() variant which first strips the trailing slash, if present.
# Thus we always get the last component of the path, even for directories.
sep = os.path.sep + (os.path.altsep or '')
> return os.path.basename(path.rstrip(sep))
E AttributeError: 'PosixPath' object has no attribute 'rstrip'
================================================ 96 passed, 7 skipped in 1.25 seconds =================================================
```
# Performance Considerations
Is it considered poor form to get rid of _basename altogether and make use of pathlib in the move function? I'm not sure if the idea is for all these modules to strictly avoid circular dependencies. They are already using os.path which is just as much a citizen in 3.8 as pathlib right?
I've looked around and familiarized myself, and I now think importing pathlib here is fine. My only remaining concern is that of performance.
Here's the performance difference for this step.
```
In [46]: %timeit real_dst = os.path.join("a/b/c", _basename('b/'))
2.71 µs ± 62.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [47]: %timeit real_dst = Path("a/b/c") / Path('b/').name
12.4 µs ± 65.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```
Is 10us significant or insignificant compared to the least expensive operation this function will do? I don't know. Let's find out.
```
In [55]: %timeit os.rename('/tmp/a/a.txt', '/tmp/a/b.txt'); os.rename('/tmp/a/b.txt', '/tmp/a/a.txt')
124 µs ± 2.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
62us per rename (the 124us above covers two renames). 10us seems significant enough that we wouldn't want to favor the Path sugar suggestion. 16% speed decrease from adding the 10us.
What do people think? I was hoping to get to use pathlib.Path here, but I suspect for this low level move, it should be as fast as possible, and 16% is not worth one line of sugary code to me.
Neil Schemenauer [Mon, 30 Sep 2019 17:06:45 +0000 (10:06 -0700)]
Clear weakrefs in garbage found by the GC (#16495)
Fix a bug due to the interaction of weakrefs and the cyclic garbage
collector. We must clear any weakrefs in garbage in order to prevent
their callbacks from executing and causing a crash.
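The general rule can be observed from Python. This is an illustrative sketch with hypothetical names (`Node`, `demo`), not a reproduction of the crash: a weakref that is itself part of the garbage is cleared without its callback running, while a weakref that is still alive outside the garbage gets its callback.

```python
import gc
import weakref


class Node:
    pass


def demo():
    calls = []
    node = Node()
    node.me = node  # reference cycle, so only the cyclic GC can free it
    # This weakref is stored on the object, so it is garbage too.
    node.wr = weakref.ref(node, lambda ref: calls.append("garbage wr"))
    # This weakref is held by a local variable and stays alive.
    outside = weakref.ref(node, lambda ref: calls.append("live wr"))
    del node
    gc.collect()
    return calls, outside()


print(demo())
```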
Victor Stinner [Mon, 30 Sep 2019 12:49:34 +0000 (14:49 +0200)]
bpo-38322: Fix gotlandmark() of PC/getpathp.c (GH-16489)
Write the filename into a temporary buffer instead of reusing prefix.
The problem is that join() modifies prefix in place. If prefix is not
normalized, join() can make prefix shorter, and so gotlandmark()
modifies prefix instead of returning it unmodified.