Tim Peters [Tue, 9 Jul 2002 18:35:34 +0000 (18:35 +0000)]
New file to try to document the "special build" preprocessor symbols.
Incomplete. Add to it! Once it settles down, it would make a nice
appendix in the real docs.
Tim Peters [Tue, 9 Jul 2002 02:57:01 +0000 (02:57 +0000)]
The Py_REF_DEBUG/COUNT_ALLOCS/Py_TRACE_REFS macro minefield: added
more trivial lexical helper macros so that uses of these guys expand
to nothing at all when they're not enabled. This should help sub-
standard compilers that can't do a good job of optimizing away the
previous "(void)0" expressions.
Py_DECREF: There's only one definition of this now. Yay! That
was that last one in the family defined multiple times in an #ifdef
maze.
Py_FatalError(): Changed the char* signature to const char*.
_Py_NegativeRefcount(): New helper function for the Py_REF_DEBUG
expansion of Py_DECREF. Calling an external function cuts down on
the volume of generated code. The previous inline expansion of abort()
didn't work as intended on Windows (the program often kept going, and
the error msg scrolled off the screen unseen). _Py_NegativeRefcount
calls Py_FatalError instead, which captures our best knowledge of
how to abort effectively across platforms.
Barry Warsaw [Tue, 9 Jul 2002 02:50:02 +0000 (02:50 +0000)]
Anthony Baxter's patch for non-strict parsing. This adds a `strict'
argument to the constructor -- defaulting to true -- which is
different than Anthony's approach of using global state.
parse(), parsestr(): Grow a `headersonly' argument which stops parsing
once the header block has been seen, i.e. it does /not/ parse or even
read the body of the message. This is used for parsing message/rfc822
type messages.
We need test cases for the non-strict parsing. Anthony will supply
these.
_parsebody(): We can get rid of the isdigest end-of-line kludges,
although we still need to know if we're parsing a multipart/digest so
we can set the default type accordingly.
Barry Warsaw [Tue, 9 Jul 2002 02:46:12 +0000 (02:46 +0000)]
Add the concept of a "default type". Normally the default type is
text/plain but the RFCs state that inside a multipart/digest, the
default type is message/rfc822. To preserve idempotency, we need a
separate place to define the default type than the Content-Type:
header.
get_default_type(), set_default_type(): Accessor and mutator methods
for the default type.
Barry Warsaw [Tue, 9 Jul 2002 02:43:47 +0000 (02:43 +0000)]
clone(): A new method for creating a clone of this generator (for
recursive generation).
_dispatch(): If the message object doesn't have a Content-Type:
header, check its default type instead of assuming it's text/plain.
This makes for correct generation of message/rfc822 containers.
_handle_multipart(): We can get rid of the isdigest kludge. Just
print the message as normal and everything will work out correctly.
_handle_mulitpart_digest(): We don't need this anymore either.
Tim Peters [Mon, 8 Jul 2002 22:11:52 +0000 (22:11 +0000)]
SF bug 578752: COUNT_ALLOCS vs heap types
Repair segfaults and infinite loops in COUNT_ALLOCS builds in the
presence of new-style (heap-allocated) classes/types.
Bugfix candidate. I'll backport this to 2.2. It's irrelevant in 2.1.
Jack Jansen [Mon, 8 Jul 2002 21:39:36 +0000 (21:39 +0000)]
The readme file said that OSX Carbon modules were only built for
-enable-framework builds, but setup.py built them anyway. Fixed.
Also normalized whitespace.
Fred Drake [Mon, 8 Jul 2002 14:08:48 +0000 (14:08 +0000)]
Added font-setting line (and associated comments) to the A4 version of
this file; the lack of this was causing the A4 version of tutorial to
use really bad Type 3 fonts instead of Type 1 fonts, which also
bloated the file size substantially.
I thought there was a SourceForge bug for this, but couldn't find it.
Tim Peters [Mon, 8 Jul 2002 06:32:09 +0000 (06:32 +0000)]
PyNode_AddChild(): Do aggressive over-allocation when the number of
children gets large, to avoid severe platform realloc() degeneration
in extreme cases (like test_longexp).
Bugfix candidate.
This was doing extremely timid over-allocation, just rounding up to the
nearest multiple of 3. Now so long as the number of children is <= 128,
it rounds up to a multiple of 4 but via a much faster method. When the
number of children exceeds 128, though, and more space is needed, it
doubles the capacity. This is aggressive over-allocation.
SF patch <http://www.python.org/sf/578297> has Andrew MacIntyre using
PyMalloc in the parser to overcome platform malloc problems in
test_longexp on OS/2 EMX. Jack Jansen notes there that it didn't help
him on the Mac, because the Mac has problems with frequent ever-growing
reallocs, not just with gazillions of teensy mallocs. Win98 has no
visible problems with test_longexp, but I tried boosting the test-case
size and soon got "senseless" MemoryErrors out of it, and soon after
crashed the OS: as I've seen in many other contexts before, while the
Win98 realloc remains zippy in bad cases, it leads to extreme
fragmentation of user address space, to the point that the OS barfs.
I don't yet know whether this fixes Jack's Mac problems, but it does cure
Win98's problems when boosting the test case size. It also speeds
test_longexp in its unaltered state.
Tim Peters [Sun, 7 Jul 2002 19:59:50 +0000 (19:59 +0000)]
Rearranged and added comments to object.h, to clarify many things
that have taken me "too long" to reverse-engineer over the years.
Vastly reduced the nesting level and redundancy of #ifdef-ery.
Took a light stab at repairing comments that are no longer true.
sys_gettotalrefcount(): Changed to enable under Py_REF_DEBUG.
It was enabled under Py_TRACE_REFS, which was much heavier than
necessary. sys.gettotalrefcount() is now available in a
Py_REF_DEBUG-only build.
Jeremy Hylton [Sun, 7 Jul 2002 16:51:37 +0000 (16:51 +0000)]
Fix for SF bug #432621: httplib: multiple Set-Cookie headers
If multiple header fields with the same name occur, they are combined
according to the rules in RFC 2616 sec 4.2:
Appending each subsequent field-value to the first, each separated by
a comma. The order in which header fields with the same field-name are
received is significant to the interpretation of the combined field
value.
Tim Peters [Sun, 7 Jul 2002 05:13:56 +0000 (05:13 +0000)]
Trashcan cleanup: Now that cyclic gc is always there, the trashcan
mechanism is no longer evil: it no longer plays dangerous games with
the type pointer or refcounts, and objects in extension modules can play
along too without needing to edit the core first.
Rewrote all the comments to explain this, and (I hope) give clear
guidance to extension authors who do want to play along. Documented
all the functions. Added more asserts (it may no longer be evil, but
it's still dangerous <0.9 wink>). Rearranged the generated code to
make it clearer, and to tolerate either the presence or absence of a
semicolon after the macros. Rewrote _PyTrash_destroy_chain() to call
tp_dealloc directly; it was doing a Py_DECREF again, and that has all
sorts of obscure distorting effects in non-release builds (Py_DECREF
was already called on the object!). Removed Christian's little "embedded
change log" comments -- that's what checkin messages are for, and since
it was impossible to correlate the comments with the code that changed,
I found them merely distracting.
Jeremy Hylton [Sat, 6 Jul 2002 18:48:07 +0000 (18:48 +0000)]
Handle HTTP/0.9 responses.
Section 19.6 of RFC 2616 (HTTP/1.1):
It is beyond the scope of a protocol specification to mandate
compliance with previous versions. HTTP/1.1 was deliberately
designed, however, to make supporting previous versions easy....
And we would expect HTTP/1.1 clients to:
- recognize the format of the Status-Line for HTTP/1.0 and 1.1
responses;
- understand any valid response in the format of HTTP/0.9, 1.0, or
1.1.
The changes to the code do handle response in the format of HTTP/0.9.
Some users may consider this a bug because all responses with a
sufficiently corrupted status line will look like an HTTP/0.9
response. These users can pass strict=1 to the HTTP constructors to
get a BadStatusLine exception instead.
While this is a new feature of sorts, it enhances the robustness of
the code (be tolerant in what you accept). Thus, I consider it a bug
fix candidate.
Combine OldStackViewer.py with Debugger.py, removing dead code.
M Debugger.py : Incorporate StackViewer, NamespaceViewer classes
M StackViewer.py : remove import OldStackViewer
U OldStackViewer.py : remove file
Docstring improvements. In particular, added docstrings for the
standalone wrap() and fill() functions. This should address the
misunderstanding that led to SF bug 577106.
Tim Peters [Wed, 3 Jul 2002 03:31:20 +0000 (03:31 +0000)]
Stop trying to cater to platforms with a broken HUGE_VAL definition. It
breaks other platforms (in this case, the hack for broken Cray systems in
turn caused failure on a Mac system broken in a different way).
Fred Drake [Tue, 2 Jul 2002 22:34:44 +0000 (22:34 +0000)]
Update the documentation of the errors and failures attributes of the
TestResult object. Add an example of how to get even more information for
apps that can use it.
Closes SF bug #558278.
Tim Peters [Tue, 2 Jul 2002 22:24:50 +0000 (22:24 +0000)]
Another stab at SF 576327: zipfile when sizeof(long) == 8
binascii_crc32(): The previous patch forced this to return the same
result across platforms. This patch deals with that, on a 64-bit box,
the *entry* value may have "unexpected" bits in the high four bytes.
Don't list all the keyword args to the TextWrapper constructor in the
classdesc -- just use "..." with prose explaining the correspondence
between keyword args and instance attributes.
Document 'width' along with the other instance attributes.
Fred Drake [Tue, 2 Jul 2002 20:32:50 +0000 (20:32 +0000)]
Abstract the creation of signature lines for callable things; the new
\py@sigline macro will wrap the argument list so it will not extend into
the right margin.
Substantially based on a contribution from Dave Cole.
This addresses one of the comments in SF bug #574742.
Tim Peters [Tue, 2 Jul 2002 20:20:08 +0000 (20:20 +0000)]
Fix for SF bug #576327: zipfile when sizeof(long) == 8
binascii_crc32(): Make this return a signed 4-byte result across
platforms. The other way to make this platform-independent would be to
make it return an unsigned unbounded int, but the evidence suggests
other code out there treats it like a signed 4-byte int (e.g., existing
code writing the result with struct.pack "l" format).
Tim Peters [Tue, 2 Jul 2002 18:12:35 +0000 (18:12 +0000)]
Finished transitioning to using gc_refs to track gc objects' states.
This was mostly a matter of adding comments and light code rearrangement.
Upon untracking, gc_next is still set to NULL. It's a cheap way to
provoke memory faults if calling code is insane. It's also used in some
way by the trashcan mechanism.
Tim Peters [Tue, 2 Jul 2002 00:52:30 +0000 (00:52 +0000)]
Reserved another gc_refs value for untracked objects. Every live gc
object should now have a well-defined gc_refs value, with clear transitions
among gc_refs states. As a result, none of the visit_XYZ traversal
callbacks need to check IS_TRACKED() anymore, and those tests were removed.
(They were already looking for objects with specific gc_refs states, and
the gc_refs state of an untracked object can no longer match any other
gc_refs state by accident.)
Added more asserts.
I expect that the gc_next == NULL indicator for an untracked object is
now redundant and can also be removed, but I ran out of time for this.
Tim Peters [Mon, 1 Jul 2002 03:52:19 +0000 (03:52 +0000)]
OK, I couldn't stand it <0.5 wink>: removed all uncertainty about what's
in gc_refs, even at the cost of putting back a test+branch in
visit_decref.
The good news: since gc_refs became utterly tame then, it became
clear that another special value could be useful. The move_roots() and
move_root_reachable() passes have now been replaced by a single
move_unreachable() pass. Besides saving a pass over the generation, this
has a better effect: most of the time everything turns out to be
reachable, so we were breaking the generation list apart and moving it
into into the reachable list, one element at a time. Now the reachable
stuff stays in the generation list, and the unreachable stuff is moved
instead. This isn't quite as good as it sounds, since sometimes we
guess wrongly that a thing is unreachable, and have to move it back again.
Still, overall, it yields a significant (but not dramatic) boost in
collection speed.
Tim Peters [Sun, 30 Jun 2002 21:31:03 +0000 (21:31 +0000)]
visit_decref(): Two optimizations.
1. You're not supposed to call this with a NULL argument, although the
docs could be clearer about that. The other visit_XYZ() functions
don't bother to check. This doesn't either now, although it does
assert non-NULL-ness now.
2. It doesn't matter whether the object is currently tracked, so don't
bother checking that either (if it isn't currently tracked, it may
have some nonsense value in gc_refs, but it doesn't hurt to
decrement gibberish, and it's cheaper to do so than to make everyone
test for trackedness).
It would be nice to get rid of the other tests on IS_TRACKED. Perhaps
trackedness should not be a matter of not being in any gc list, but
should be a matter of being in a new "untracked" gc list. This list
simply wouldn't be involved in the collection mechanism. A newly
created object would be put in the untracked list. Tracking would
simply unlink it and move it into the gen0 list. Untracking would do
the reverse. No test+branch needed then. visit_move() may be vulnerable
then, though, and I don't know how this would work with the trashcan.
Tim Peters [Sun, 30 Jun 2002 17:56:40 +0000 (17:56 +0000)]
SF bug #574132: Major GC related performance regression
"The regression" is actually due to that 2.2.1 had a bug that prevented
the regression (which isn't a regression at all) from showing up. "The
regression" is actually a glitch in cyclic gc that's been there forever.
As the generation being collected is analyzed, objects that can't be
collected (because, e.g., we find they're externally referenced, or
are in an unreachable cycle but have a __del__ method) are moved out
of the list of candidates. A tricksy scheme uses negative values of
gc_refs to mark such objects as being moved. However, the exact
negative value set at the start may become "more negative" over time
for objects not in the generation being collected, and the scheme was
checking for an exact match on the negative value originally assigned.
As a result, objects in generations older than the one being collected
could get scanned too, and yanked back into a younger generation. Doing
so doesn't lead to an error, but doesn't do any good, and can burn an
unbounded amount of time doing useless work.
A test case is simple (thanks to Kevin Jacobs for finding it!):
x = []
for i in xrange(200000):
x.append((1,))
Without the patch, this ends up scanning all of x on every gen0 collection,
scans all of x twice on every gen1 collection, and x gets yanked back into
gen1 on every gen0 collection. With the patch, once x gets to gen2, it's
never scanned again until another gen2 collection, and stays in gen2.
Bugfix candidate, although the code has changed enough that I think I'll
need to port it by hand. 2.2.1 also has a different bug that causes
bound method objects not to get tracked at all (so the test case doesn't
burn absurd amounts of time in 2.2.1, but *should* <wink>).
Martin v. Löwis [Sun, 30 Jun 2002 07:38:50 +0000 (07:38 +0000)]
Merge from PyXML:
[1.3] Added documentation of the namespace URI for elements with no namespace.
[1.4] New property http://www.python.org/sax/properties/encoding.
[1.5] Support optional string interning in pyexpat.
Martin v. Löwis [Sun, 30 Jun 2002 07:21:24 +0000 (07:21 +0000)]
Merge changes from PyXML:
[1.15]
Added understanding of the feature_validation, feature_external_pes,
and feature_string_interning features.
Added support for the feature_external_ges feature.
Added support for the property_xml_string property.
[1.16]
Made it recognize the namespace prefixes feature.
[1.17]
removed erroneous first line
[1.19]
Support optional string interning in pyexpat.
[1.21]
Restore compatibility with versions of Python that did not support weak
references. These do not get the cyclic reference fix, but they will
continue to work as they did before.
[1.22]
Activate entity processing unless standalone.