Tim Peters [Sun, 21 Jul 2002 17:37:03 +0000 (17:37 +0000)]
New test "+sort", tacking 10 random floats on to the end of a sorted
array. Our samplesort special-cases the snot out of this, running about
12x faster than *sort. The experimental mergesort runs it about 8x
faster than *sort without special-casing, but should really do better
than that (when merging runs of different lengths, right now it only
does something clever about finding where the second run begins in
the first and where the first run ends in the second, and that's more
of a temp-memory optimization).
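
A minimal sketch of the kind of input the "+sort" test describes: a sorted array of random floats with 10 more random floats tacked on the end. The helper name `plus_sort_data` and its parameters are hypothetical, chosen for illustration; this is not the code in sortperf.py.

```python
import random

def plus_sort_data(n, extra=10, seed=None):
    """Return n sorted random floats followed by `extra` unsorted ones.

    Hypothetical helper; the name and signature are assumptions.
    """
    rng = random.Random(seed)
    head = sorted(rng.random() for _ in range(n))
    tail = [rng.random() for _ in range(extra)]
    return head + tail

data = plus_sort_data(1000, seed=12345)
result = sorted(data)  # nearly all of the input is already in order
```

Almost all of the work here is placing the 10 trailing elements, which is why a sort that exploits existing order can beat the random case by a wide margin.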
it's looking for, and reports the skip as a crash failure instead of
as a skipped test.
I suppose this will make it harder to run this test outside of
regrtest, but under the assumption only Barry does that, better to
make it skip cleanly for everyone else.
Barry Warsaw [Fri, 19 Jul 2002 22:31:10 +0000 (22:31 +0000)]
The email package's tests live much better in a subpackage
(i.e. email.test), so move the guts of them here from Lib/test. The
latter directory will retain stubs to run the email.test tests using
Python's standard regression test.
test_email_torture.py is a torture tester which will not run under
Python's test suite because I don't want to commit megs of data to
that project (it will fail cleanly there). When run under the mimelib
project it'll stress test the package with megs of message samples
collected from various locations in the wild.
Barry Warsaw [Fri, 19 Jul 2002 22:29:49 +0000 (22:29 +0000)]
The email package's tests live much better in a subpackage
(i.e. email.test), so move the guts of them here from Lib/test. The
latter directory will retain stubs to run the email.test tests using
Python's standard regression test.
test_email_torture.py is a torture tester which will not run under
Python's test suite because I don't want to commit megs of data to
that project (it will fail cleanly there). When run under the mimelib
project it'll stress test the package with megs of message samples
collected from various locations in the wild.
email/test/data is a copy of Lib/test/data. The fate of the latter is
still undecided.
Barry Warsaw [Fri, 19 Jul 2002 22:24:55 +0000 (22:24 +0000)]
To better support default content types, fix an API wart, and preserve
backwards compatibility, we're silently deprecating get_type(),
get_subtype() and get_main_type(). We may eventually noisily
deprecate these. For now, we'll just fix a bug in the splitting of
the main and subtypes.
get_content_type(), get_content_maintype(), get_content_subtype(): New
methods which replace the above. These /always/ return a content type
string and do not take a failobj, because an email message always at
least has a default content type.
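
A short illustration of the new accessors, using the email package as it exists in current Python. The message text is made up for the example. Because a default content type always applies, no failobj argument is needed.

```python
import email

# A message with no Content-Type header at all (sample text is invented).
plain = email.message_from_string("Subject: hello\n\nJust a body.\n")
default_type = plain.get_content_type()
default_main = plain.get_content_maintype()
default_sub = plain.get_content_subtype()

# A message with an explicit header; parameters are not part of the type.
html = email.message_from_string(
    "Content-Type: text/html; charset=us-ascii\n\n<p>hi</p>\n"
)
html_type = html.get_content_type()
```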
set_default_type(): Someday there may be additional default content
types, so don't hard code an assertion about the value of the ctype
argument.
Alas, roll back the definition of _XOPEN_SOURCE. It breaks the tests
for the time module, because somehow configure won't define the
symbols HAVE_STRUCT_TM_TM_ZONE, HAVE_TM_ZONE, and HAVE_TZNAME in this
case.
I've got no time to research this further, so I leave it in Jeremy and
Martin's capable hands to find a different solution for Tru64 (or to
devise a way to get the time tests to succeed while defining
_XOPEN_SOURCE).
Remove a few lines that aren't used and cause problems on platforms
where recvfrom() on a TCP stream returns None for the address.
This should address the remaining problems on FreeBSD.
Patch to call the Pure python strptime implementation if there's no
C implementation. See SF patch 474274, by Brett Cannon.
(As an experiment, I'm adding a line that #undefs HAVE_STRPTIME,
so that you'll always get the Python version. This is so that it
gets some good exercise. We should eventually delete that line.)
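
From the caller's side the call looks the same whichever implementation ends up behind it. A small sketch, parsing a timestamp taken from this log:

```python
import time

# Parse a timestamp; the format string uses standard strptime directives.
parsed = time.strptime("2002-07-19 07:05:44", "%Y-%m-%d %H:%M:%S")
year, month, day = parsed.tm_year, parsed.tm_mon, parsed.tm_mday
weekday = parsed.tm_wday  # Monday is 0, so Friday is 4
```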
Tim Peters [Fri, 19 Jul 2002 07:05:44 +0000 (07:05 +0000)]
More sort cleanup: Moved the special cases from samplesortslice into
listsort. If the former calls itself recursively, they're a waste of
time, since it's called on a random permutation of a random subset of
elements. OTOH, for exactly the same reason, they're an immeasurably
small waste of time (the odds of finding exploitable order in a random
permutation are ~= 0, so the special-case loops looking for order give
up quickly). The point is more for conceptual clarity.
Also changed some "assert comments" into real asserts; when this code
was first written, Python.h didn't supply assert.h.
Tim Peters [Fri, 19 Jul 2002 03:30:57 +0000 (03:30 +0000)]
Cleanup yielding a small speed boost: before rich comparisons were
introduced, list.sort() was rewritten to use only the "< or not <?"
distinction. After rich comparisons were introduced, docompare() was
fiddled to translate a Py_LT Boolean result into the old "-1 for <,
0 for ==, 1 for >" flavor of outcome, and the sorting code was left
alone. This left things more obscure than they should be, and turns
out it also cost measurable cycles.
So: The old CMPERROR novelty is gone. docompare() is renamed to islt(),
and now has the same return conditions as PyObject_RichCompareBool. The
SETK macro is renamed to ISLT, and is even weirder than before (don't
complain unless you want to maintain the sort code <wink>).
Overall, this yields a 1-2% speedup in the usual (no explicit function
passed to list.sort()) case when sorting arrays of floats (as sortperf.py
does). The boost is higher for arrays of ints.
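
To illustrate the "< or not <?" distinction at the Python level, here is a sketch of a sort that asks only that one question. The `Version` class and `insertion_sort` function are hypothetical and stand in for the idea; they are not the C code in listobject.c.

```python
class Version:
    """Hypothetical type that defines only __lt__ (no __eq__, no __gt__)."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __lt__(self, other):
        return self.key < other.key


def insertion_sort(items):
    """Return a sorted copy, comparing elements only with `<`."""
    out = list(items)
    for i in range(1, len(out)):
        item = out[i]
        j = i
        # Shift larger elements right; "not <" is enough to stop.
        while j > 0 and item < out[j - 1]:
            out[j] = out[j - 1]
            j -= 1
        out[j] = item
    return out


versions = [Version(2, 3), Version(1, 5), Version(2, 2)]
by_hand = [v.key for v in insertion_sort(versions)]
builtin = [v.key for v in sorted(versions)]
```

The built-in sort agrees with the hand-written one even though `Version` never says what "equal" or "greater" means.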
Jeremy Hylton [Thu, 18 Jul 2002 22:39:34 +0000 (22:39 +0000)]
Define _XOPEN_SOURCE in configure and Python.h.
This gets compilation of posixmodule.c to succeed on Tru64 and does no
harm on Linux. We may need to undefine it on some platforms, but
let's wait and see.
Martin says:
> I think it is generally the right thing to define _XOPEN_SOURCE on
> Unix, providing a negative list of systems that cannot support this
> setting (or preferably solving whatever problems remain).
>
> I'd put an (unconditional) AC_DEFINE into configure.in early on; it
> *should* go into confdefs.h as configure proceeds, and thus be active
> when other tests are performed.
Fred Drake [Thu, 18 Jul 2002 19:20:23 +0000 (19:20 +0000)]
Simplify; the low-level log reader is now always a modern iterator,
and should never return None. (It only did this for an old version of
HotShot that was trying to still work with a patched Python 2.1.)
Fred Drake [Thu, 18 Jul 2002 19:11:44 +0000 (19:11 +0000)]
- When the log reader detects end-of-file, close the file.
- The log reader now provides a "closed" attribute similar to the
profiler.
- Both the profiler and log reader now provide a fileno() method.
- Use METH_NOARGS where possible, allowing simpler code in the method
implementations.
Add default timeout functionality. This adds setdefaulttimeout() and
getdefaulttimeout() functions to the socket and _socket modules, and
appropriate tests.
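
A small usage sketch of the two functions: sockets created after setdefaulttimeout() start with that timeout. The 5.0 second value is arbitrary, and the previous default is restored afterwards so the example has no lasting effect.

```python
import socket

previous = socket.getdefaulttimeout()  # None means no timeout (blocking)
socket.setdefaulttimeout(5.0)
try:
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        inherited = sock.gettimeout()  # picked up from the module default
    finally:
        sock.close()
finally:
    socket.setdefaulttimeout(previous)  # restore whatever was set before
```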
Tim Peters [Thu, 18 Jul 2002 15:53:32 +0000 (15:53 +0000)]
Gave this a facelift: "/" vs "//", whrandom vs random, etc. Boosted
the default range to end at 2**20 (machines are much faster now).
Fixed what was quite arguably a bug, explaining an old mystery: the
"!sort" case here constructs what *was* a quadratic-time disaster for
the old quicksort implementation. But under the current samplesort, it
always ran much faster than *sort (the random case). This never made
sense. Turns out it was because !sort was sorting an integer array,
while all the other cases sort floats; and comparing ints goes much
quicker than comparing floats in Python. After changing !sort to chew
on floats instead, it's now slower than the random sort case, which
makes more sense (but is just a few percent slower; samplesort is
massively less sensitive to "bad patterns" than quicksort).
Tim Peters [Thu, 18 Jul 2002 14:54:28 +0000 (14:54 +0000)]
Gave hotshot.LogReader a close() method, to allow users to close the
file object that LogReader opens. Used it then in test_hotshot; the
test passes again on Windows. Thank Guido for the analysis.
Fred Drake [Wed, 17 Jul 2002 18:54:20 +0000 (18:54 +0000)]
Added a docstring for the closed attribute.
write_header(): When we encounter a non-string object in sys.path, record
a fairly mindless placeholder rather than dying. Possibly could record
the repr of the object found, but not clear whether that matters.
Jeremy Hylton [Wed, 17 Jul 2002 16:30:39 +0000 (16:30 +0000)]
staticforward bites the dust.
The staticforward define was needed to support certain broken C
compilers (notably SCO ODT 3.0, perhaps early AIX as well) that botched the
static keyword when it was used with a forward declaration of a static
initialized structure. Standard C allows the forward declaration with
static, and we've decided to stop catering to broken C compilers. (In
fact, we expect that the compilers are all fixed eight years later.)
I'm leaving staticforward and statichere defined in object.h as
static. This is only for backwards compatibility with C extensions
that might still use it.
Some modernization. Get rid of the redundant next() method. Always
assume tp_iter and later fields exist. Use PyObject_GenericGetAttr
instead of providing our own tp_getattr hook.
Jeremy Hylton [Tue, 16 Jul 2002 21:21:11 +0000 (21:21 +0000)]
Send HTTP requests with a single send() call instead of many.
The implementation now stores all the lines of the request in a buffer
and makes a single send() call when the request is finished,
specifically when endheaders() is called.
This appears to improve performance. The old code called send() for
each line. The sends are all short, so they caused bad interactions
with the Nagle algorithm and delayed acknowledgements. In simple
tests, the second packet was delayed by 100s of ms. The second send was
delayed by the Nagle algorithm, waiting for the ack. The delayed ack
strategy delays the ack in hopes of piggybacking it on a data packet,
but the server won't send any data until it receives the complete
request.
This change reduces the chance that Nagle + delayed ack will cause
a problem, although a request large enough to be broken into two
packets will still suffer some delay. Luckily the MSS is large enough
to accommodate most requests in a single packet.
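
The buffering idea can be sketched as follows. Both classes are hypothetical stand-ins written for this example (they are not the httplib classes), and a real client would likely use sendall() on a real socket.

```python
class RecordingSocket:
    """Hypothetical stand-in for a socket; records each send() call."""

    def __init__(self):
        self.sends = []

    def send(self, data):
        self.sends.append(data)
        return len(data)


class BufferedRequest:
    """Hypothetical request builder: collect lines, send them once."""

    def __init__(self, sock):
        self._sock = sock
        self._lines = []

    def putrequest(self, method, path):
        self._lines.append(f"{method} {path} HTTP/1.1")

    def putheader(self, name, value):
        self._lines.append(f"{name}: {value}")

    def endheaders(self):
        # Two empty strings yield the terminating blank line after joining.
        self._lines.extend(["", ""])
        payload = "\r\n".join(self._lines).encode("ascii")
        self._lines = []
        self._sock.send(payload)  # one call for the whole request head


sock = RecordingSocket()
request = BufferedRequest(sock)
request.putrequest("GET", "/")
request.putheader("Host", "example.com")
request.endheaders()
```

With one send() per line, the same request would have produced three short writes, which is the pattern that interacts badly with Nagle and delayed acks.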
Remove the next() method -- one is supplied automatically by
PyType_Ready() because the tp_iternext slot is set (fortunately,
because using the tp_iternext implementation for the next()
implementation is buggy). Also changed the allocation order in
enum_next() so that the underlying iterator is only moved ahead when
we have successfully allocated the result tuple and index.
Remove the next() method -- one is supplied automatically by
PyType_Ready() because the tp_iternext slot is set. Also removed the
redundant (and expensive!) call to raise StopIteration from
rangeiter_next().
Make StopIteration a sink state. This is done by clearing out the
di_dict field when the end of the list is reached. Also make the
error ("dictionary changed size during iteration") a sticky state.
Also remove the next() method -- one is supplied automatically by
PyType_Ready() because the tp_iternext slot is set. That's a good
thing, because the implementation given here was buggy (it never
raised StopIteration).
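
Both behaviours can be observed from Python. The sketch below reflects what current CPython does; the dictionary contents are made up for the example.

```python
# Sticky error: once the size change is detected, every later call fails too.
d = {"a": 1, "b": 2}
it = iter(d)
first = next(it)
d["c"] = 3  # change the size mid-iteration

errors = []
for _ in range(2):
    try:
        next(it)
    except RuntimeError as exc:
        errors.append(str(exc))

# Sink state: an exhausted iterator stays exhausted even if the dict grows.
d2 = {"x": 1}
it2 = iter(d2)
seen = list(it2)
d2["y"] = 2
after = list(it2)
```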
Make StopIteration a sink state. This is done by clearing out the
object references (it_seq for seqiterobject, it_callable and
it_sentinel for calliterobject) when the end of the list is reached.
Also remove the next() methods -- one is supplied automatically by
PyType_Ready() because the tp_iternext slot is set. That's a good
thing, because the implementation given here was buggy (it never
raised StopIteration).
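
For the callable-iterator case, the sink state means that once the sentinel has been seen the callable is never consulted again. A small demonstration with the two-argument form of iter(); the values are invented for the example.

```python
values = iter([1, 2, 0, 3])

# Call next(values) repeatedly until it returns the sentinel 0.
it = iter(lambda: next(values), 0)
before_sentinel = list(it)        # stops when 0 comes back

# The iterator is finished for good; the callable is not called again,
# so the trailing 3 is still waiting in `values`.
again = next(it, "done")
leftover = next(values)
```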
Make StopIteration a sink state. This is done by clearing out the
it_seq field when the end of the list is reached.
Also remove the next() method -- one is supplied automatically by
PyType_Ready() because the tp_iternext slot is set. That's a good
thing, because the implementation given here was buggy (it never
raised StopIteration).
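
The same idea in pure Python, as a sketch: the hypothetical `SeqIter` class below mimics the approach of dropping the sequence reference at the end, with `_seq` playing the role of it_seq. It is an illustration, not the C implementation.

```python
class SeqIter:
    """Hypothetical index-based iterator with a StopIteration sink state."""

    def __init__(self, seq):
        self._seq = seq
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._seq is None:        # already exhausted: stay exhausted
            raise StopIteration
        try:
            item = self._seq[self._index]
        except IndexError:
            self._seq = None         # drop the reference; enter sink state
            raise StopIteration from None
        self._index += 1
        return item


items = [1, 2]
sketch = SeqIter(items)
builtin = iter(items)
consumed = (list(sketch), list(builtin))
items.append(3)                      # grow the list after exhaustion
revived = (list(sketch), list(builtin))
```

Without the `self._seq = None` step, the index check alone would let the iterator "come back to life" after the append.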