Fred Drake [Tue, 2 Jul 2002 22:34:44 +0000 (22:34 +0000)]
Update the documentation of the errors and failures attributes of the
TestResult object. Add an example of how to get even more information for
apps that can use it.
Closes SF bug #558278.
Tim Peters [Tue, 2 Jul 2002 22:24:50 +0000 (22:24 +0000)]
Another stab at SF 576327: zipfile when sizeof(long) == 8
binascii_crc32(): The previous patch forced this to return the same
result across platforms. This patch deals with the fact that, on a 64-bit box,
the *entry* value may have "unexpected" bits in the high four bytes.
Don't list all the keyword args to the TextWrapper constructor in the
classdesc -- just use "..." with prose explaining the correspondence
between keyword args and instance attributes.
Document 'width' along with the other instance attributes.
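A small sketch of the correspondence between constructor keyword args and instance attributes, assuming the current textwrap module. The sample text and the chosen widths are made up for illustration.

```python
import textwrap

text = "The quick brown fox jumps over the lazy dog near the quiet river bank."

# Keyword args passed to the constructor become instance attributes of the
# same name.
wrapper = textwrap.TextWrapper(width=30, initial_indent="* ", subsequent_indent="  ")
first = wrapper.wrap(text)

# The attributes can be changed afterwards; 'width' is one of them.
wrapper.width = 20
second = wrapper.wrap(text)

print("\n".join(first))
print("\n".join(second))
```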
Fred Drake [Tue, 2 Jul 2002 20:32:50 +0000 (20:32 +0000)]
Abstract the creation of signature lines for callable things; the new
\py@sigline macro will wrap the argument list so it will not extend into
the right margin.
Substantially based on a contribution from Dave Cole.
This addresses one of the comments in SF bug #574742.
Tim Peters [Tue, 2 Jul 2002 20:20:08 +0000 (20:20 +0000)]
Fix for SF bug #576327: zipfile when sizeof(long) == 8
binascii_crc32(): Make this return a signed 4-byte result across
platforms. The other way to make this platform-independent would be to
make it return an unsigned unbounded int, but the evidence suggests
other code out there treats it like a signed 4-byte int (e.g., existing
code writing the result with struct.pack "l" format).
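A sketch of the signed 4-byte convention described above, written against present-day Python, where binascii.crc32() returns an unsigned value. The helper crc32_signed() is hypothetical and is not the C code touched by this fix. It masks both the starting value and the result to 32 bits, then reinterprets the result as a signed int.

```python
import binascii
import struct


def crc32_signed(data, start=0):
    """Return the CRC-32 of data as a signed 4-byte int (hypothetical helper)."""
    # Mask the entry value so stray high bits cannot influence the result.
    raw = binascii.crc32(data, start & 0xFFFFFFFF) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as two's-complement signed.
    return raw - 0x100000000 if raw & 0x80000000 else raw


checksum = crc32_signed(b"hello, zipfile")
# "<l" is the standard-size (4-byte) signed long, independent of sizeof(long).
packed = struct.pack("<l", checksum)
```

A running checksum computed in two steps, crc32_signed(b"zipfile", crc32_signed(b"hello, ")), should give the same value, since the signed intermediate is masked on entry.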
Tim Peters [Tue, 2 Jul 2002 18:12:35 +0000 (18:12 +0000)]
Finished transitioning to using gc_refs to track gc objects' states.
This was mostly a matter of adding comments and light code rearrangement.
Upon untracking, gc_next is still set to NULL. It's a cheap way to
provoke memory faults if calling code is insane. It's also used in some
way by the trashcan mechanism.
Tim Peters [Tue, 2 Jul 2002 00:52:30 +0000 (00:52 +0000)]
Reserved another gc_refs value for untracked objects. Every live gc
object should now have a well-defined gc_refs value, with clear transitions
among gc_refs states. As a result, none of the visit_XYZ traversal
callbacks need to check IS_TRACKED() anymore, and those tests were removed.
(They were already looking for objects with specific gc_refs states, and
the gc_refs state of an untracked object can no longer match any other
gc_refs state by accident.)
Added more asserts.
I expect that the gc_next == NULL indicator for an untracked object is
now redundant and can also be removed, but I ran out of time for this.
Tim Peters [Mon, 1 Jul 2002 03:52:19 +0000 (03:52 +0000)]
OK, I couldn't stand it <0.5 wink>: removed all uncertainty about what's
in gc_refs, even at the cost of putting back a test+branch in
visit_decref.
The good news: since gc_refs became utterly tame then, it became
clear that another special value could be useful. The move_roots() and
move_root_reachable() passes have now been replaced by a single
move_unreachable() pass. Besides saving a pass over the generation, this
has a better effect: most of the time everything turns out to be
reachable, so we were breaking the generation list apart and moving it
into the reachable list, one element at a time. Now the reachable
stuff stays in the generation list, and the unreachable stuff is moved
instead. This isn't quite as good as it sounds, since sometimes we
guess wrongly that a thing is unreachable, and have to move it back again.
Still, overall, it yields a significant (but not dramatic) boost in
collection speed.
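A toy model of the single move_unreachable() pass, as a rough sketch only. The real code is C and works on linked lists; the Node class, the sentinel values and the dict of gc_refs here are all assumptions made for illustration. It assumes the internal references have already been subtracted, so gc_refs starts out as the count of references from outside the generation.

```python
REACHABLE = -1                 # hypothetical sentinel values, not the real ones
TENTATIVELY_UNREACHABLE = -2


class Node:
    """Toy stand-in for a gc-tracked object."""
    def __init__(self, name, external=0):
        self.name = name
        self.external = external   # references from outside the generation
        self.referents = []        # objects this node refers to


def move_unreachable(generation):
    members = set(generation)
    # After subtracting internal references, only external ones remain.
    gc_refs = {node: node.external for node in generation}
    young = list(generation)       # reachable objects stay in this list
    unreachable = []
    reachable = []
    i = 0
    while i < len(young):
        node = young[i]
        i += 1
        if gc_refs[node] > 0:
            # Known reachable: everything it refers to is reachable too.
            gc_refs[node] = REACHABLE
            reachable.append(node)
            for ref in node.referents:
                if ref not in members:
                    continue
                if gc_refs[ref] == 0:
                    gc_refs[ref] = 1           # not scanned yet; mark reachable
                elif gc_refs[ref] == TENTATIVELY_UNREACHABLE:
                    unreachable.remove(ref)    # we guessed wrong: move it back
                    young.append(ref)
                    gc_refs[ref] = 1
        else:
            # gc_refs == 0: guess that it is unreachable and move it out.
            gc_refs[node] = TENTATIVELY_UNREACHABLE
            unreachable.append(node)
    return reachable, unreachable


a, b = Node("a"), Node("b")            # an isolated cycle
a.referents.append(b)
b.referents.append(a)
c, d = Node("c", external=1), Node("d")
c.referents.append(d)                  # d is kept alive only through c

live, dead = move_unreachable([d, a, b, c])
```

In this run d is scanned before c, so it is first moved to the unreachable list and then moved back, which is the wrong-guess case mentioned above.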
Tim Peters [Sun, 30 Jun 2002 21:31:03 +0000 (21:31 +0000)]
visit_decref(): Two optimizations.
1. You're not supposed to call this with a NULL argument, although the
docs could be clearer about that. The other visit_XYZ() functions
don't bother to check. This doesn't either now, although it does
assert non-NULL-ness now.
2. It doesn't matter whether the object is currently tracked, so don't
bother checking that either (if it isn't currently tracked, it may
have some nonsense value in gc_refs, but it doesn't hurt to
decrement gibberish, and it's cheaper to do so than to make everyone
test for trackedness).
It would be nice to get rid of the other tests on IS_TRACKED. Perhaps
trackedness should not be a matter of not being in any gc list, but
should be a matter of being in a new "untracked" gc list. This list
simply wouldn't be involved in the collection mechanism. A newly
created object would be put in the untracked list. Tracking would
simply unlink it and move it into the gen0 list. Untracking would do
the reverse. No test+branch needed then. visit_move() may be vulnerable
then, though, and I don't know how this would work with the trashcan.
Tim Peters [Sun, 30 Jun 2002 17:56:40 +0000 (17:56 +0000)]
SF bug #574132: Major GC related performance regression
"The regression" is actually due to 2.2.1 having had a bug that prevented
the regression (which isn't a regression at all) from showing up. "The
regression" is actually a glitch in cyclic gc that's been there forever.
As the generation being collected is analyzed, objects that can't be
collected (because, e.g., we find they're externally referenced, or
are in an unreachable cycle but have a __del__ method) are moved out
of the list of candidates. A tricksy scheme uses negative values of
gc_refs to mark such objects as being moved. However, the exact
negative value set at the start may become "more negative" over time
for objects not in the generation being collected, and the scheme was
checking for an exact match on the negative value originally assigned.
As a result, objects in generations older than the one being collected
could get scanned too, and yanked back into a younger generation. Doing
so doesn't lead to an error, but doesn't do any good, and can burn an
unbounded amount of time doing useless work.
A test case is simple (thanks to Kevin Jacobs for finding it!):
    x = []
    for i in xrange(200000):
        x.append((1,))
Without the patch, this ends up scanning all of x on every gen0 collection,
scans all of x twice on every gen1 collection, and x gets yanked back into
gen1 on every gen0 collection. With the patch, once x gets to gen2, it's
never scanned again until another gen2 collection, and stays in gen2.
Bugfix candidate, although the code has changed enough that I think I'll
need to port it by hand. 2.2.1 also has a different bug that causes
bound method objects not to get tracked at all (so the test case doesn't
burn absurd amounts of time in 2.2.1, but *should* <wink>).
Martin v. Löwis [Sun, 30 Jun 2002 07:38:50 +0000 (07:38 +0000)]
Merge from PyXML:
[1.3] Added documentation of the namespace URI for elements with no namespace.
[1.4] New property http://www.python.org/sax/properties/encoding.
[1.5] Support optional string interning in pyexpat.
Martin v. Löwis [Sun, 30 Jun 2002 07:21:24 +0000 (07:21 +0000)]
Merge changes from PyXML:
[1.15]
Added understanding of the feature_validation, feature_external_pes,
and feature_string_interning features.
Added support for the feature_external_ges feature.
Added support for the property_xml_string property.
[1.16]
Made it recognize the namespace prefixes feature.
[1.17]
removed erroneous first line
[1.19]
Support optional string interning in pyexpat.
[1.21]
Restore compatibility with versions of Python that did not support weak
references. These do not get the cyclic reference fix, but they will
continue to work as they did before.
[1.22]
Activate entity processing unless standalone.
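A short sketch of how these features are queried and set from calling code, assuming the xml.sax package as shipped with current Python and that make_parser() returns the expat-based reader. The feature constants come from xml.sax.handler; the variable names are made up for illustration.

```python
import xml.sax
from xml.sax.handler import feature_namespaces, feature_validation

parser = xml.sax.make_parser()

# A recognized feature can be switched on and read back.
parser.setFeature(feature_namespaces, True)
namespaces_on = parser.getFeature(feature_namespaces)

# A reader may understand a feature without being able to enable it;
# in that case setFeature() is expected to raise a SAX exception.
try:
    parser.setFeature(feature_validation, True)
    validation_accepted = True
except (xml.sax.SAXNotSupportedException, xml.sax.SAXNotRecognizedException):
    validation_accepted = False

print(namespaces_on, validation_accepted)
```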
Barry Warsaw [Fri, 28 Jun 2002 23:49:33 +0000 (23:49 +0000)]
Lots of new and updated tests to check for proper ascii header
folding. Note that some of the Japanese tests have changed, but I
don't really know if they are correct or not. :(
Someone with Japanese and RFC 2047 expertise, please take a look!
Barry Warsaw [Fri, 28 Jun 2002 23:48:23 +0000 (23:48 +0000)]
_max_append(): When adding the string `s' to its own line, it should
be lstrip'd so that old continuation whitespace is replaced by that
specified in Header's continuation_ws parameter.
Barry Warsaw [Fri, 28 Jun 2002 23:46:53 +0000 (23:46 +0000)]
Teach this class about "highest-level syntactic breaks" but only for
headers with no charset or 'us-ascii' charsets. Actually this is only
partially true: we know about semicolons (but not true parameters) and
we know about whitespace (but not technically folding whitespace).
Still it should be good enough for all practical purposes.
Other changes include:
__init__(): Add a continuation_ws argument, which defaults to a single
space. Set this to change the whitespace used for continuation lines
when a header must be split. Also, changed the way header line
lengths are calculated, so that they take into account continuation_ws
(when tabs-expanded) and any provided header_name parameter. This
should do much better on returning split headers for which the first
and subsequent lines must fit into a specified width.
guess_maxlinelen(): Removed. I don't think we need this method as
part of the public API.
encode_chunks() -> _encode_chunks(): I don't think we need this one as
part of the public API either.
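A rough sketch of splitting a plain ASCII header, assuming the email.header.Header class of current Python, whose constructor still accepts maxlinelen, header_name and continuation_ws. The header value is invented. Current versions may reuse the whitespace already present at the split point for ASCII headers, so the sketch makes no claim about which whitespace character starts each continuation line.

```python
from email.header import Header

value = "text/plain; charset=us-ascii; format=flowed; delsp=yes; name=example.txt"

# header_name lets the first line leave room for "Content-Type: ".
header = Header(
    value,
    maxlinelen=40,
    header_name="Content-Type",
    continuation_ws=" ",   # the default, spelled out here
)

folded = header.encode()
lines = folded.split("\n")
for line in lines:
    print(repr(line))
```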
Barry Warsaw [Fri, 28 Jun 2002 23:41:42 +0000 (23:41 +0000)]
_split_header(): The code here was terminally broken because it didn't
know anything about RFC 2047 encoded headers. Fortunately we have a
perfectly good header splitter in Header.encode(). So we just call
that to give us a properly formatted and split header.
Header.encode() didn't know about "highest-level syntactic breaks" but
that's been fixed now too.
Jeremy Hylton [Fri, 28 Jun 2002 23:32:51 +0000 (23:32 +0000)]
Close SF patch 523944: importing modules with foreign newlines.
Didn't use the patch, because universal newlines support made it easy.
It might be worth fixing the actual problem in the 2.2 maintenance
branch, in which case the patch is still needed.
Fred Drake [Fri, 28 Jun 2002 22:56:48 +0000 (22:56 +0000)]
Added character data buffering to pyexpat parser objects.
Setting the buffer_text attribute to true causes the parser to collect
character data, waiting as long as possible to report it to the Python
callback. This can save an enormous number of callbacks from C to
Python, which can be a substantial performance improvement.
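A minimal sketch of the effect of buffer_text, assuming the xml.parsers.expat module of current Python. The helper collect_chunks() and the sample document are hypothetical. Without buffering, the parser is free to report the text in several pieces; with buffering, the pieces are collected before the callback is made.

```python
import xml.parsers.expat


def collect_chunks(document, buffer_text):
    """Parse document and return the list of character data callbacks made."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = buffer_text
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return chunks


doc = "<doc>fish &amp; chips</doc>"
unbuffered = collect_chunks(doc, False)
buffered = collect_chunks(doc, True)

print(len(unbuffered), "callbacks without buffering")
print(len(buffered), "callbacks with buffering")
```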
Jeremy Hylton [Fri, 28 Jun 2002 22:38:01 +0000 (22:38 +0000)]
Fixes for two separate HTTP/1.1 bugs: 100 responses and HTTPS connections.
The HTTPResponse class now handles 100 continue responses, instead of
choking on them. It detects them internally in the _begin() method
and ignores them. Based on a patch by Bob Kline.
This closes SF bugs 498149 and 551273.
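An illustrative sketch of skipping 100 responses, not the actual HTTPResponse code. The function read_final_status() and the canned byte stream are assumptions; it reads status lines from a file-like object and discards any interim 100 response together with its headers.

```python
import io


def read_final_status(stream):
    """Return (code, reason) of the first non-100 response (hypothetical helper)."""
    while True:
        status_line = stream.readline().decode("iso-8859-1").rstrip("\r\n")
        if not status_line:
            raise ValueError("connection closed before a final response")
        version, code, reason = status_line.split(" ", 2)
        # Skip the header block that follows, up to the blank line.
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
        if int(code) != 100:
            return int(code), reason
        # A 100 response is only interim; loop to read the real one.


wire = io.BytesIO(
    b"HTTP/1.1 100 Continue\r\n\r\n"
    b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"
)
final = read_final_status(wire)
```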
The FakeSocket class (for SSL) is now usable with HTTP/1.1
connections. The old version of the code could not work with
persistent connections, because the makefile() implementation read
until EOF before returning. If the connection is persistent, the
server sends a response and leaves the connection open. A client that
reads until EOF will block until the server gives up on the connection
-- more than a minute in my test case.
The problem was fixed by implementing a reasonable makefile(). It
reads data only when it is needed by the layers above it. Its
implementation uses an internal buffer with a default size of 8192.
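A sketch of the read-on-demand idea, under the assumption that the underlying connection exposes a recv-like callable. LazyReader and make_fake_recv() are hypothetical names and this is not the FakeSocket code. Nothing is read until a caller asks for data, and never all the way to EOF unless the caller needs it.

```python
class LazyReader:
    """File-like reader that pulls from recv() only when it runs short."""

    def __init__(self, recv, bufsize=8192):
        self._recv = recv          # callable taking a max byte count
        self._bufsize = bufsize
        self._buf = b""

    def _fill(self):
        chunk = self._recv(self._bufsize)
        self._buf += chunk
        return bool(chunk)         # False means EOF

    def readline(self):
        while b"\n" not in self._buf:
            if not self._fill():
                break
        end = self._buf.find(b"\n") + 1 or len(self._buf)
        line, self._buf = self._buf[:end], self._buf[end:]
        return line

    def read(self, size):
        while len(self._buf) < size:
            if not self._fill():
                break
        data, self._buf = self._buf[:size], self._buf[size:]
        return data


def make_fake_recv(chunks, calls):
    """Stand-in for a socket: hands out canned chunks and counts calls."""
    pending = list(chunks)

    def recv(bufsize):
        calls.append(bufsize)
        return pending.pop(0) if pending else b""

    return recv


calls = []
reader = LazyReader(make_fake_recv([b"HTTP/1.1 200 OK\r\nbo", b"dy"], calls))
status = reader.readline()
calls_after_status = len(calls)
body = reader.read(4)
```

The status line comes back after a single recv() call; the second call happens only once the body is requested.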
Also, rename begin() method of HTTPResponse to _begin() because it
should only be called by the HTTPConnection.
Fred Drake [Fri, 28 Jun 2002 22:29:01 +0000 (22:29 +0000)]
pyexpat code cleanup and minor refactorings:
The handlers array on each parser now has the invariant that None will
never be set as a handler; it will always be NULL or a Python-level
value passed in for the specific handler.
have_handler(): Return true if there is a Python handler for a
particular event.
get_handler_name(): Return a string object giving the name of a
particular handler. This caches the string object so it doesn't
need to be created more than once.
get_parse_result(): Helper to allow the Parse() and ParseFile()
methods to share the same logic for determining the return value
or exception state.
PyUnknownEncodingHandler(), PyModule_AddIntConstant():
Made these helpers static. (The latter is only defined for older
versions of Python.)
pyxml_UpdatePairedHandlers(), pyxml_SetStartElementHandler(),
pyxml_SetEndElementHandler(), pyxml_SetStartNamespaceDeclHandler(),
pyxml_SetEndNamespaceDeclHandler(), pyxml_SetStartCdataSection(),
pyxml_SetEndCdataSection(), pyxml_SetStartDoctypeDeclHandler(),
pyxml_SetEndDoctypeDeclHandler():
Removed. These are no longer needed with Expat 1.95.x.
handler_info:
Use the setter functions provided by Expat 1.95.x instead of the
pyxml_Set*Handler() functions which have been removed.
Minor code formatting changes for consistency.
Trailing whitespace removed.
Jack Jansen [Wed, 26 Jun 2002 20:37:40 +0000 (20:37 +0000)]
Changed some prototypes to match the exact definition in some faraway Apple
header files. If we're building with precompiled headers these are in scope.
Jack Jansen [Wed, 26 Jun 2002 15:44:30 +0000 (15:44 +0000)]
Fixed a few showstoppers in the process of making MacPython use setup.py to build its extension modules (instead of relying on a private mechanism). It definitely doesn't work yet, but it looks promising.
Jack Jansen [Wed, 26 Jun 2002 15:00:29 +0000 (15:00 +0000)]
This module broke on the Mac (where it can't work, but distutils seems to import it anyway) because it imported pwd and grp. Moved the imports inside the routine where they're used.
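The pattern can be sketched as follows. The function owner_and_group() is hypothetical; the commit does not name the routine. Because pwd and grp are imported inside the function, merely importing the enclosing module does not fail on a platform that lacks them.

```python
import os


def owner_and_group(path):
    """Return (user name, group name) owning path (hypothetical helper)."""
    # Deferred imports: only needed, and only attempted, when this is called.
    import grp
    import pwd

    info = os.stat(path)
    return pwd.getpwuid(info.st_uid).pw_name, grp.getgrgid(info.st_gid).gr_name
```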
Kurt B. Kaiser [Wed, 26 Jun 2002 02:32:09 +0000 (02:32 +0000)]
Shutdown subprocess debugger and associated Proxies/Adapters when closing
the Idle debugger.
M PyShell.py : Call RemoteDebugger.close_remote_debugger()
M RemoteDebugger.py: Add close_remote_debugger(); further polish code used
to start the debugger sections.
M rpc.py : Add comments on Idlefork methods register(), unregister();
comment out unused methods
M run.py : Add stop_the_debugger(); polish code
Fred Drake [Tue, 25 Jun 2002 17:10:50 +0000 (17:10 +0000)]
Talk about interfaces rather than implementation classes where appropriate.
Add hyperlinks to make the documentation on the Attributes and AttributesNS
interfaces more discoverable.
Closes SF bug #484603.