bpo-37936: Systematically distinguish rooted vs. unrooted in .gitignore (GH-15823)
A root cause of bpo-37936 is that it's easy to write a .gitignore
rule that's intended to apply to a specific file (e.g., the
`pyconfig.h` generated by `./configure`) but actually applies to all
similarly-named files in the tree (e.g., `PC/pyconfig.h`.)
Specifically, any rule with no non-trailing slashes is applied in an
"unrooted" way, to files anywhere in the tree. This means that if we
write the rules in the most obvious-looking way, then
* for specific files we want to ignore that happen to be in
subdirectories (like `Modules/config.c`), the rule will work
as intended, staying "rooted" to the top of the tree; but
* when a specific file we want to ignore happens to be at the root of
the repo (like `platform`), then the obvious rule (`platform`) will
apply much more broadly than intended: if someone tries to add a
file or directory named `platform` somewhere else in the tree, it
will unexpectedly get ignored.
That's surprising behavior that can make the .gitignore file's
behavior feel finicky and unpredictable.
To avoid it, we can simply always give a rule "rooted" behavior when
that's what's intended, by systematically using leading slashes.
Further, to help make the pattern obvious when looking at the file and
minimize any need for thinking about the syntax when adding new rules:
separate the rules into one group for each type, with brief comments
identifying them.
For most of these rules it's clear whether they're meant to be rooted
or unrooted, but in a handful of cases I've only guessed. In that
case the safer default (the choice that won't hide information) is the
narrower, rooted meaning, with a leading slash. If for some of these
the unrooted meaning is desired after all, it'll be easy to move them
to the unrooted section at the top.
Gregory P. Smith [Wed, 11 Sep 2019 09:23:05 +0000 (04:23 -0500)]
bpo-37424: Avoid a hang in subprocess.run timeout output capture (GH-14490)
Fixes a possible hang when using a timeout on subprocess.run() while
capturing output. If the child process spawned its own children or otherwise
connected its stdout or stderr handles with another process, we could hang
after the timeout was reached and our child was killed when attempting to read
final output from the pipes.
This is a restructuring of the datetime documentation to hopefully make
them more user-friendly and approachable to new users without losing any
of the detail.
Changes include:
- Creating dedicated subsections for some concepts such as:
- "Constants"
- "Naive vs Aware"
- "Determining if an Object is Aware"
- Give 'naive vs aware' its own subsection
- Give 'constants' their own subsection
- Overhauling the strftime-strptime section by:
- Breaking it into logical, linkable, and digestable parts
- Adding a high-level comparison table
- Moving the technical detail to bottom: readers come to this
section primarily to remind themselves to things:
- How do I write the format code for X?
- strptime/strftime: which one is which again?
- Touching up fromisoformat + isoformat sections by:
- Revising fromisoformat + isoformat for date, time, and
datetime
- Adding basic examples
- Enforcing consistency about putting formats (i.e. ``HH:MM``)
in double backticks. This was previously done in some places
but not all
- Putting long 'supported formats', on their own line to improve
readability
- Moving the 'seealso' section to the top and add a link to dateutil
Rationale: This doesn't really belong nested under the
'constants' section. Let readers know right away that
datetime is one of several related tools.
- Moving common features of several types into one place:
Previously, each type went out of its way to note separately
that it was hashable and picklable. These can be brought
into one single place that is more prominent.
- Reducing some verbose explanations to improve readability
- Breaking up long paragraphs into digestable chunks
- Displaying longer "equivalent to" examples, as short code blocks
- Using the dot notation for datetime/time classes:
Use :class:`.time` and :class:`.datetime` rather than :class:`time` and
:class:`datetime`; otherwise, the generated links will route to the
respective modules, not classes.
- Rewording the tzinfo class description
The top paragraph should get straight to the point of telling the reader
what subclasses of tzinfo _do_. Previously, that was hidden in a later
paragraph.
- Adding a note on .today() versus .now()
- Rearranging and expanding example blocks, including:
- Moved long, multiline inline examples to standalone examples
- Simplified the example block for timedelta arithmetic:
- Broke the example into two logical sections:
1. normalization/parameter 'merging'
2. timedelta arithmetic
- Reduced the complexity of the some of the examples. Show
reasonable, real-world uses cases that are easy to follow
along with and progres in difficult slightly.
- Broke up the example sections for date and datetime sections by putting
the easy examples first, progressing to more esoteric situations and
breaking it up into logical sections based on what the methods are
doing at a high level.
- Simplified the KabulTz example:
- Put the class definition itself into a non-REPL block since there is
no interactive output involved there
- Briefly explained what's happening before launching into the code
- Broke the example section into visually separate chunks
- Various whitespace, formatting, style and grammar fixes including:
- Consistently using backctics for 'date_string' formats
- Consistently using one space after periods.
- Consistently using bold for vocab terms
- Consistently using italics when referring to params:
See https://devguide.python.org/documenting/#id4
- Using '::' to lead into code blocks
Per https://devguide.python.org/documenting/#source-code, this will
let the reader use the 'expand/collapse' top-right button for REPL
blocks to hide or show the prompt.
- Using consistent captialization schemes
- Removing use of the default role
- Put 'example' blocks in Markdown subsections
Eddie Elizondo [Wed, 11 Sep 2019 09:17:13 +0000 (05:17 -0400)]
bpo-37879: Suppress subtype_dealloc decref when base type is a C heap type (GH-15323)
The instance destructor for a type is responsible for preparing
an instance for deallocation by decrementing the reference counts
of its referents.
If an instance belongs to a heap type, the type object of an instance
has its reference count decremented while for static types, which
are permanently allocated, the type object is unaffected by the
instance destructor.
Previously, the default instance destructor searched the class
hierarchy for an inherited instance destructor and, if present,
would invoke it.
Then, if the instance type is a heap type, it would decrement the
reference count of that heap type. However, this could result in the
premature destruction of a type because the inherited instance
destructor should have already decremented the reference count
of the type object.
This change avoids the premature destruction of the type object
by suppressing the decrement of its reference count when an
inherited, non-default instance destructor has been invoked.
Finally, an assertion on the Py_SIZE of a type was deleted. Heap
types have a non zero size, making this into an incorrect assertion.
Petr Viktorin [Tue, 10 Sep 2019 11:21:09 +0000 (12:21 +0100)]
bpo-37499: Test various C calling conventions (GH-15776)
Add functions with various calling conventions to `_testcapi`, expose them as module-level functions, bound methods, class methods, and static methods, and test calling them and introspecting them through GDB.
bpo-38076: Make struct module PEP-384 compatible (#15805)
* PEP-384 _struct
* More PEP-384 fixes for _struct
Summary: Add a couple of more fixes for `_struct` that were previously missed such as removing `tp_*` accessors and using `PyBytesWriter` instead of calling `PyBytes_FromStringAndSize` with `NULL`. Also added a test to confirm that `iter_unpack` type is still uninstantiable.
Neil Schemenauer [Tue, 10 Sep 2019 09:44:20 +0000 (02:44 -0700)]
bpo-37725: have "make clean" remove PGO task data (#15033)
Change "clean" makefile target to also clean the program guided
optimization (PGO) data. Previously you would have to use "make
clean" and "make profile-removal", or "make clobber".
bpo-38043: Move unicodedata.normalize tests into test_unicodedata. (GH-15712)
Having these in a separate file from the one that's named after the
module in the usual way makes it very easy to miss them when looking
for tests for these two functions.
(In fact when working recently on is_normalized, I'd been surprised to
see no tests for it here and concluded the function had evaded being
tested at all. I'd gone as far as to write up some tests myself
before I spotted this other file.)
Mostly this just means moving all the one file's code into the other,
and moving code from the module toplevel to inside the test class to
keep it tidily separate from the rest of the file's code.
There's one substantive change, which reduces by a bit the amount of
code to be moved: we drop the `x > sys.maxunicode` conditional and all
the `RangeError` logic behind it. Now if that condition ever occurs
it will cause an error at `chr(x)`, and a test failure. That's the
right result because, since PEP 393 in Python 3.3, there is no longer
such a thing as an "unsupported character".
Cut tricky `goto` that isn't needed, in _PyBytes_DecodeEscape. (GH-15825)
This is the sort of `goto` that requires the reader to stare hard at
the code to unpick what it's doing.
On doing so, the answer is... not very much!
* It jumps from the bottom of the loop to almost the top; the effect
is to bypass the loop condition `s < end` and also the
`if`-condition `*s != '\\'`, acting as if both are true.
* We've just decremented `s`, after incrementing it in the `switch`
condition. So it has the same value as when `s == end` failed.
Before that was another increment... and before that we had
`s < end`. So `s < end` true, then increment, then `s == end`
false... that means `s < end` is still true.
* Also this means `s` points to the same character as it did for the
`switch` condition. And there was a `case '\\'`, which we didn't
hit -- so `*s != '\\'` is also true.
* That means this has no effect on the behavior! The most it might do
is an optimization -- we get to skip those two checks, because (as
just proven above) we know they're true.
* But gosh, this is the *invalid escape sequence* path. This does not
seem like the kind of code path that calls for extreme optimization
tricks.
So, take the `goto` and the label out.
Perhaps the compiler will notice the exact same facts we showed above,
and generate identical code. Or perhaps it won't! That'll be OK.
But then, crucially, if some future edit to this loop causes the
reasoning above to *stop* holding true... the compiler will adjust
this jump accordingly. One of us fallible humans might not.
bpo-36502: Update link to UAX #44, the Unicode doc on the UCD. (GH-15301)
The link we have points to the version from Unicode 6.0.0, dated 2010.
There have been numerous updates to it since then:
https://www.unicode.org/reports/tr44/#Modifications
Change the link to one that points to the current version. Also, use HTTPS.
bpo-35941: Fix performance regression in new code (GH-12610)
Accumulate certificates in a set instead of doing a costly list contain
operation. A Windows cert store can easily contain over hundred
certificates. The old code would result in way over 5,000 comparison
operations
Signed-off-by: Christian Heimes <christian@python.org>
In debug mode, visit_decref() now calls _PyObject_IsFreed() to ensure
that the object is not freed. If it's freed, the program fails with
an assertion error and Python dumps informations about the freed
object.
bpo-37758: Cut always-constant conditionals on sys.maxunicode. (GH-15302)
Since PEP 393 in Python 3.3, this value is always 0x10ffff, the
maximum codepoint in Unicode; there's no longer such a thing as a
UCS-2 build of Python, which couldn't properly represent some
characters.
There are a couple of spots left where we still condition on the value
of this constant. Take them out.
Victor Stinner [Mon, 9 Sep 2019 14:55:58 +0000 (16:55 +0200)]
bpo-38006: Avoid closure in weakref.WeakValueDictionary (GH-15641)
weakref.WeakValueDictionary defines a local remove() function used as
callback for weak references. This function was created with a
closure. Modify the implementation to avoid the closure.
Mark files as executable that are meant as scripts. (GH-15354)
This is the converse of GH-15353 -- in addition to plenty of
scripts in the tree that are marked with the executable bit
(and so can be directly executed), there are a few that have
a leading `#!` which could let them be executed, but it doesn't
do anything because they don't have the executable bit set.
Here's a command which finds such files and marks them. The
first line finds files in the tree with a `#!` line *anywhere*;
the next-to-last step checks that the *first* line is actually of
that form. In between we filter out files that already have the
bit set, and some files that are meant as fragments to be
consumed by one or another kind of preprocessor.
bpo-26185: Fix repr() on empty ZipInfo object (#13441)
* bpo-26185: Fix repr() on empty ZipInfo object
It was failing on AttributeError due to inexistant
but required attributes file_size and compress_size.
They are now initialized to 0 in ZipInfo.__init__().
* Remove useless hasattr() in ZipInfo._open_to_write()
* Completely remove file_size setting in _open_to_write().
bpo-37702: Fix SSL's certificate-store leak on Windows (GH-15632)
ssl_collect_certificates function in _ssl.c has a memory leak.
Calling CertOpenStore() and CertAddStoreToCollection(), a store's refcnt gets incremented by 2.
But CertCloseStore() is called only once and the refcnt leaves 1.