Tim Peters [Sun, 6 May 2001 01:05:02 +0000 (01:05 +0000)]
Generalize zip() to work with iterators.
NEEDS DOC CHANGES.
More AttributeErrors transmuted into TypeErrors, in test_b2.py, and,
again, this strikes me as a good thing.
This checkin completes the iterator generalization work that obviously
needed to be done. Can anyone think of others that should be changed?
Tim Peters [Sat, 5 May 2001 21:05:01 +0000 (21:05 +0000)]
Reimplement PySequence_Contains() and instance_contains(), so they work
safely together and don't duplicate logic (the common logic was factored
out into new private API function _PySequence_IterContains()).
Visible change:
some_complex_number in some_instance
no longer blows up if some_instance has __getitem__ but neither
__contains__ nor __iter__. test_iter changed to ensure that remains true.
Skeletal version; I'm checking this in now so I can keep a list of changes,
but don't plan on actually writing any text until, ooh, say, July or
thereabouts.
Tim Peters [Sat, 5 May 2001 10:06:17 +0000 (10:06 +0000)]
Make 'x in y' and 'x not in y' (PySequence_Contains) play nice w/ iterators.
NEEDS DOC CHANGES
A few more AttributeErrors turned into TypeErrors, but in test_contains
this time.
The full story for instance objects is pretty much unexplainable, because
instance_contains() tries its own flavor of iteration-based containment
testing first, and PySequence_Contains doesn't get a chance at it unless
instance_contains() blows up. A consequence is that
some_complex_number in some_instance
dies with a TypeError unless some_instance.__class__ defines __iter__ but
does not define __getitem__.
Tim Peters [Sat, 5 May 2001 05:36:48 +0000 (05:36 +0000)]
Make unicode.join() work nice with iterators. This also required a change
to string.join(), so that when the latter figures out in midstream that
it really needs unicode.join() instead, unicode.join() can actually get
all the sequence elements (i.e., there's no guarantee that the sequence
passed to string.join() can be iterated over *again* by unicode.join(),
so string.join() must not pass on the original sequence object anymore).
Tim Peters [Sat, 5 May 2001 04:24:43 +0000 (04:24 +0000)]
Mark string.join() as done. Turns out string_join() works "for free" now,
because PySequence_Fast() started working for free as soon as
PySequence_Tuple() learned how to work with iterators. For some reason
unicode.join() still doesn't work, though.
Tim Peters [Sat, 5 May 2001 03:56:37 +0000 (03:56 +0000)]
Generalize tuple() to work nicely with iterators.
NEEDS DOC CHANGES.
This one surprised me! While I expected tuple() to be a no-brainer, turns
out it's actually dripping with consequences:
1. It will *allow* the popular PySequence_Fast() to work with any iterable
object (code for that not yet checked in, but should be trivial).
2. It caused two std tests to fail. This because some places used
PyTuple_Sequence() (the C spelling of tuple()) as an indirect way to test
whether something *is* a sequence. But tuple() code only looked for the
existence of sq->item to determine that, and e.g. an instance passed
that test whether or not it supported the other operations tuple()
needed (e.g., __len__). So some things the tests *expected* to fail
with an AttributeError now fail with a TypeError instead. This looks
like an improvement to me; e.g., test_coercion used to produce 559
TypeErrors and 2 AttributeErrors, and now they're all TypeErrors. The
error details are more informative too, because the places calling this
were *looking* for TypeErrors in order to replace the generic tuple()
"not a sequence" msg with their own more specific text, and
AttributeErrors snuck by that.
Tim Peters [Thu, 3 May 2001 23:54:49 +0000 (23:54 +0000)]
Generalize map() to work with iterators.
NEEDS DOC CHANGES.
Possibly contentious: The first time s.next() yields StopIteration (for
a given map argument s) is the last time map() *tries* s.next(). That
is, if other sequence args are longer, s will never again contribute
anything but None values to the result, even if trying s.next() again
could yield another result. This is the same behavior map() used to have
wrt IndexError, so it's the only way to be wholly backward-compatible.
I'm not a fan of letting StopIteration mean "try again later" anyway.
Fred Drake [Thu, 3 May 2001 19:44:50 +0000 (19:44 +0000)]
Remove unnecessary intialization for the case of weakly-referencable objects;
the code necessary to accomplish this is simpler and faster if confined to
the object implementations, so we only do this there.
This causes no behaviorial changes beyond a (very slight) speedup.
Fred Drake [Thu, 3 May 2001 16:04:13 +0000 (16:04 +0000)]
Since Py_TPFLAGS_HAVE_WEAKREFS is set in Py_TPFLAGS_DEFAULT, it does not
need to be specified in the type structures independently. The flag
exists only for binary compatibility.
This is a "source cleanliness" issue and introduces no behavioral changes.
Fred Drake [Thu, 3 May 2001 04:58:49 +0000 (04:58 +0000)]
InteractiveInterpreter.showsyntaxerror():
When replacing the exception object, be sure we stuff the new value
in sys.last_value (which we already did for the original value).
Fred Drake [Wed, 2 May 2001 20:20:53 +0000 (20:20 +0000)]
Make the Mailbox objects support iteration -- they already had the
appropriate next() method, and this is what people really want to do with
these objects in practice.
Mchael Hudson pointed out that the code for detecting changes in
dictionary size was comparing ma_size, the hash table size, which is
always a power of two, rather than ma_used, wich changes on each
insertion or deletion. Fixed this.
Tim Peters [Tue, 1 May 2001 20:45:31 +0000 (20:45 +0000)]
Generalize list(seq) to work with iterators. This also generalizes list()
to no longer insist that len(seq) be defined.
NEEDS DOC CHANGES.
This is meant to be a model for how other functions of this ilk (max,
filter, etc) can be generalized similarly. Feel encouraged to grab your
favorite and convert it!
Note some cute consequences:
list(file) == file.readlines() == list(file.xreadlines())
list(dict) == dict.keys()
list(dict.iteritems()) = dict.items()
list(xrange(i, j, k)) == range(i, j, k)
Printing objects to a real file still wasn't done right: if the
object's type didn't define tp_print, there were still cases where the
full "print uses str() which falls back to repr()" semantics weren't
honored. This resulted in
>>> print None
<None object at 0x80bd674>
>>> print type(u'')
<type object at 0x80c0a80>
Fixed this by always using the appropriate PyObject_Repr() or
PyObject_Str() call, rather than trying to emulate what they would do.
Also simplified PyObject_Str() to always fall back on PyObject_Repr()
when tp_str is not defined (rather than making an extra check for
instances with a __str__ method). And got rid of the special case for
strings.
Tim Peters [Sun, 29 Apr 2001 22:21:25 +0000 (22:21 +0000)]
SF bug #417093: Case sensitive import: dir and .py file w/ same name
Directory containing
Spam.py
spam/__init__.py
Then "import Spam" caused a SystemError, because code checking for
the existence of "Spam/__init__.py" finds it on a case-insensitive
filesystem, but then bails because the directory it finds it in
doesn't match case, and then old code assumed that was still an error
even though it isn't anymore. Changed the code to just continue
looking in this case (instead of calling it an error). So
import Spam
and
import spam
both work now.
Tim Peters [Sat, 28 Apr 2001 08:20:22 +0000 (08:20 +0000)]
Fix buglet reported on c.l.py: map(fnc, file.xreadlines()) blows up.
Also a 2.1 bugfix candidate (am I supposed to do something with those?).
Took away map()'s insistence that sequences support __len__, and cleaned
up the convoluted code that made it *look* like it really cared about
__len__ (in fact the old ->len field was only *used* as a flag bit, as
the main loop only looked at its sign bit, setting the field to -1 when
IndexError got raised; renamed the field to ->saw_IndexError instead).
Tim Peters [Sat, 28 Apr 2001 05:38:26 +0000 (05:38 +0000)]
A different approach to the problem reported in
Patch #419651: Metrowerks on Mac adds 0x itself
C std says %#x and %#X conversion of 0 do not add the 0x/0X base marker.
Metrowerks apparently does. Mark Favas reported the same bug under a
Compaq compiler on Tru64 Unix, but no other libc broken in this respect
is known (known to be OK under MSVC and gcc).
So just try the damn thing at runtime and see what the platform does.
Note that we've always had bugs here, but never knew it before because
a relevant test case didn't exist before 2.1.
Fix a very old flaw in PyObject_Print(). Amazing! When an object
type defines tp_str but not tp_repr, 'print x' to a real file
object would not call the tp_str slot but rather print a default style
representation: <foo object at 0x....>. This even though 'print x' to
a file-like-object would correctly call the tp_str slot.
Jack Jansen [Fri, 27 Apr 2001 20:43:27 +0000 (20:43 +0000)]
Got rid of the whole event filtering mess again, I can't get it to work. Simply disabling the Tk event handling hook in _tkinter is not as nice, but at least it works.
Jeremy Hylton [Fri, 27 Apr 2001 02:29:40 +0000 (02:29 +0000)]
Fix 2.1 nested scopes crash reported by Evan Simpson
The new test case demonstrates the bug. Be more careful in
symtable_resolve_free() to add a var to cells or frees only if it
won't be added under some other rule.
Jack Jansen [Wed, 25 Apr 2001 22:07:27 +0000 (22:07 +0000)]
- Raise console window on input. Fixes Carbon hang.
- Better handling of menu bar save/restore.
- Override abort() so it honours the "keep console window" flag.
Fred Drake [Wed, 25 Apr 2001 16:01:30 +0000 (16:01 +0000)]
ParserCreate(): Allow an empty string for the namespace_separator argument;
while not generally a good idea, this is used by RDF users, and works
to implement RDF-style namespace+localname concatenation as defined
in the RDF specifications. (This also corrects a backwards-compatibility
bug.)
Be more conservative while clearing out handlers; set the slot in the
self->handlers array to NULL before DECREFing the callback.
Still more adjustments to make the code style internally consistent.
Tim Peters [Wed, 25 Apr 2001 03:43:14 +0000 (03:43 +0000)]
SF bug 418615: regular expression bug in pipes.py.
Obviously bad regexps, spotted by Jeffery Collins.
HELP! I can't run this on Windows, and the module test() function
probably doesn't work on anyone's box. Could a Unixoid please write
an at least minimal working test and add it to the std test suite?
This patch originated from an idea by Martin v. Loewis who submitted a
patch for sharing single character Unicode objects.
Martin's patch had to be reworked in a number of ways to take Unicode
resizing into consideration as well. Here's what the updated patch
implements:
* Single character Unicode strings in the Latin-1 range are shared
(not only ASCII chars as in Martin's original patch).
* The ASCII and Latin-1 codecs make use of this optimization,
providing a noticable speedup for single character strings. Most
Unicode methods can use the optimization as well (by virtue
of using PyUnicode_FromUnicode()).
* Some code cleanup was done (replacing memcpy with Py_UNICODE_COPY)
* The PyUnicode_Resize() can now also handle the case of resizing
unicode_empty which previously resulted in an error.
* Modified the internal API _PyUnicode_Resize() and
the public PyUnicode_Resize() API to handle references to
shared objects correctly. The _PyUnicode_Resize() signature
changed due to this.
* Callers of PyUnicode_FromUnicode() may now only modify the Unicode
object contents of the returned object in case they called the API
with NULL as content template.
Note that even though this patch passes the regression tests, there
may still be subtle bugs in the sharing code.
Mondo changes to the iterator stuff, without changing how Python code
sees it (test_iter.py is unchanged).
- Added a tp_iternext slot, which calls the iterator's next() method;
this is much faster for built-in iterators over built-in types
such as lists and dicts, speeding up pybench's ForLoop with about
25% compared to Python 2.1. (Now there's a good argument for
iterators. ;-)
- Renamed the built-in sequence iterator SeqIter, affecting the C API
functions for it. (This frees up the PyIter prefix for generic
iterator operations.)
- Added PyIter_Check(obj), which checks that obj's type has a
tp_iternext slot and that the proper feature flag is set.
- Added PyIter_Next(obj) which calls the tp_iternext slot. It has a
somewhat complex return condition due to the need for speed: when it
returns NULL, it may not have set an exception condition, meaning
the iterator is exhausted; when the exception StopIteration is set
(or a derived exception class), it means the same thing; any other
exception means some other error occurred.
Fred Drake [Sun, 22 Apr 2001 06:20:31 +0000 (06:20 +0000)]
Update publish-to-SourceForge scripts to automatically determine if the
branch is the head (development) branch or a maintenance brach, and use
the appropriate target directory for each.
Fred Drake [Sat, 21 Apr 2001 06:00:51 +0000 (06:00 +0000)]
Add support for <memberline/> (needs markup improvement!).
Update <versionadded/> to recent addition of optional explanatory text;
make the explanation text take the same attribute name for both
<versionadded/> and <versionchanged/>.
new slot tp_iter in type object, plus new flag Py_TPFLAGS_HAVE_ITER
new C API PyObject_GetIter(), calls tp_iter
new builtin iter(), with two forms: iter(obj), and iter(function, sentinel)
new internal object types iterobject and calliterobject
new exception StopIteration
new opcodes for "for" loops, GET_ITER and FOR_ITER (also supported by dis.py)
new magic number for .pyc files
new special method for instances: __iter__() returns an iterator
iteration over dictionaries: "for x in dict" iterates over the keys
iteration over files: "for x in file" iterates over lines
TODO:
documentation
test suite
decide whether to use a different way to spell iter(function, sentinal)
decide whether "for key in dict" is a good idea
use iterators in map/filter/reduce, min/max, and elsewhere (in/not in?)
speed tuning (make next() a slot tp_next???)
Jeremy Hylton [Fri, 20 Apr 2001 19:04:55 +0000 (19:04 +0000)]
dispatcher.__repr__() was unprepared to handle the address for a Unix
domain socket. Fix that and make the error message for failures a
little more helpful by including the class name.
Implement, test and document "key in dict" and "key not in dict".
I know some people don't like this -- if it's really controversial,
I'll take it out again. (If it's only Alex Martelli who doesn't like
it, that doesn't count as "real controversial" though. :-)
That's why this is a separate checkin from the iterators stuff I'm
about to check in next.
Fred Drake [Thu, 19 Apr 2001 16:26:06 +0000 (16:26 +0000)]
Weak*Dictionary: Added docstrings to the classes.
Weak*Dictionary.update(): No longer create a temporary list to hold the
things that will be stuffed into the underlying dictionary. This had
been done so that if any of the objects used as the weakly-held value
was not weakly-referencable, no updates would take place (TypeError
would be raised). With this change, TypeError will still be raised
but a partial update could occur. This is more like other .update()
implementations.
Thoughout, use of the name "ref" as a local variable has been removed. The
original use of the name occurred when the function to create a weak
reference was called "new"; the overloaded use of the name could be
confusing for someone reading the code. "ref" used as a variable name
has been replaced with "wr" (for 'weak reference').
Fred Drake [Thu, 19 Apr 2001 04:55:23 +0000 (04:55 +0000)]
Add versioning notes: many of the signatures changed to allow the time
used to be omitted (meaning use the current time) as of Python 2.1.
Users who need cross-version portability need to know things like this.