Guido van Rossum [Fri, 23 Aug 2002 14:11:35 +0000 (14:11 +0000)]
The error messages in err_args() -- which is only called when the
required number of args is 0 or 1 -- were reversed. Also change "1"
into "exactly one", the same words as used elsewhere for this
condition.
Guido van Rossum [Fri, 23 Aug 2002 01:36:01 +0000 (01:36 +0000)]
Rewritten using the tokenize module, which gives us a real tokenizer
rather than a number of approximating regular expressions.
Alas, it is 3-4 times slower. Let that be a challenge for the
tokenize module.
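
For illustration only, here is a minimal sketch of why a real tokenizer beats approximating regular expressions. It uses the tokenize API of a current Python 3, not the 2002 code from this checkin; the helper name names_in() is hypothetical.

```python
import io
import tokenize


def names_in(source):
    """Return the NAME tokens (identifiers and keywords) in source, in order."""
    readline = io.StringIO(source).readline
    return [tok.string
            for tok in tokenize.generate_tokens(readline)
            if tok.type == tokenize.NAME]


# A naive regex looking for "def" would be fooled by the string and the
# comment; the tokenizer reports them as STRING and COMMENT tokens instead.
source = 's = "def fake(): pass"  # def also_fake()\n'
print(names_in(source))
```

Only the assignment target is reported as a name, because the tokenizer knows where string literals and comments begin and end.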
Greg Ward [Thu, 22 Aug 2002 21:04:21 +0000 (21:04 +0000)]
Fix SF bug #596434: tweak wordsep_re so "--foo-bar" now splits
into /--foo-/bar/ rather than /--/foo-/bar/. Needed for Optik and
Docutils to handle Unix-style command-line options properly.
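
A hedged way to inspect the chunking in a current Python 3: TextWrapper._split() is a private helper and an implementation detail, so this is only a sketch for poking at the behavior, and the wrapper function split_chunks() is hypothetical. The regular expression in today's textwrap is not the one from this checkin.

```python
import textwrap


def split_chunks(text):
    """Return the chunks TextWrapper would consider as possible break points.

    Relies on the private _split() method, which may change without notice.
    """
    return textwrap.TextWrapper()._split(text)


chunks = split_chunks("--foo-bar")
print(chunks)
```

Whatever the exact chunk boundaries are, joining the chunks always gives back the original text; the fix above is about where a long option is allowed to break.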
Guido van Rossum [Thu, 22 Aug 2002 20:02:03 +0000 (20:02 +0000)]
Standardize behavior: no docstrings in test functions. Also get rid
of dummy_test_TemporaryFile class; when NamedTemporaryFile and
TemporaryFile are the same, simply don't add a test suite for
TemporaryFile.
Greg Ward [Thu, 22 Aug 2002 19:47:27 +0000 (19:47 +0000)]
Add test_em_dash() to WrapTestCase to make sure that TextWrapper handles
em-dashes -- like this -- properly. (Also--like this. Although this
usage may be incompatible with fixing bug #596434; we shall see.)
Greg Ward [Thu, 22 Aug 2002 19:02:37 +0000 (19:02 +0000)]
Factor LongWordTestCase out of WrapTestCase, and rename its methods
(tests) from test_funky_punc() to test_break_long() and
test_long_words() to test_nobreak_long().
Greg Ward [Thu, 22 Aug 2002 18:55:38 +0000 (18:55 +0000)]
Ditch the whole loop-over-subcases way of working. Add check_wrap() to
base class (WrapperTestCase) instead, and call it repeatedly in the
methods that used to have a loop-over-subcases. Much simpler.
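
The shape of this refactoring can be sketched as follows. This is an assumed reconstruction for illustration, not the actual test_textwrap.py source; the sample text and expected lists are made up for the sketch.

```python
import textwrap
import unittest


class WrapperTestCase(unittest.TestCase):
    """Base class: holds the shared helper, defines no tests itself."""

    def check_wrap(self, text, width, expect):
        result = textwrap.wrap(text, width)
        self.assertEqual(result, expect,
                         "wrap(%r, %d) gave %r" % (text, width, result))


class WrapTestCase(WrapperTestCase):

    def test_simple(self):
        text = "Hello there, how are you?"
        # One check_wrap() call per case replaces a loop over 'subcases'.
        self.check_wrap(text, 12, ["Hello there,", "how are you?"])
        self.check_wrap(text, 80, [text])
```

Each failing case now reports its own line number in the traceback, which a loop over a list of subcases cannot do.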
Greg Ward [Thu, 22 Aug 2002 18:45:02 +0000 (18:45 +0000)]
Simplify and reformat the use of 'subcases' lists (and following
for-loops) in test_simple(), test_wrap_short(), test_hyphenated(), and
test_funky_punc().
Greg Ward [Thu, 22 Aug 2002 18:35:49 +0000 (18:35 +0000)]
Conform to standards documented in README:
* lowercase test*() methods
* define test_main() and use it instead of unittest.main()
Kill #! line.
Improve some test names and docstrings.
Greg Ward [Thu, 22 Aug 2002 18:11:10 +0000 (18:11 +0000)]
Test script for the textwrap module. Kindly provided by Peter Hansen
<peter@engcorp.com> based on a test script that's been kicking around my
home directory for a couple of months now and only saw the light of day
because I included it when I sent textwrap.py to python-dev for review.
Guido van Rossum [Thu, 22 Aug 2002 17:23:33 +0000 (17:23 +0000)]
Change the binary operators |, &, ^, - to return NotImplemented rather
than raising TypeError when the other argument is not a BaseSet. This
made it necessary to separate the implementation of e.g. __or__ from
the union method; the latter should not return NotImplemented but
raise TypeError. This is accomplished by making union(self, other)
return self|other, etc.; Python's binary operator machinery will raise
TypeError.
The idea behind this change is to allow other set implementations with
an incompatible internal structure; these can provide union (etc.) with
standard sets by implementing __ror__ etc.
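
A minimal sketch of the protocol described above, using toy classes rather than the real sets module. ListSet is a hypothetical foreign implementation with an incompatible internal structure (a plain list).

```python
class BaseSet:
    """Toy stand-in for sets.BaseSet; elements are stored as dict keys."""

    def __init__(self, iterable=()):
        self._data = dict.fromkeys(iterable, True)

    def __or__(self, other):
        if not isinstance(other, BaseSet):
            # Let Python try other.__ror__ before giving up.
            return NotImplemented
        result = BaseSet(self._data)
        result._data.update(other._data)
        return result

    def union(self, other):
        # Going through the operator turns NotImplemented into TypeError.
        return self | other

    def elements(self):
        return sorted(self._data)


class ListSet:
    """Hypothetical foreign set type that cooperates via __ror__."""

    def __init__(self, items):
        self.items = list(items)

    def __ror__(self, other):
        if not isinstance(other, BaseSet):
            return NotImplemented
        return BaseSet(list(other._data) + self.items)


print((BaseSet([1, 2]) | ListSet([3])).elements())
```

With an operand that implements neither side, such as an int, both BaseSet([1]) | 42 and BaseSet([1]).union(42) end in a TypeError raised by the binary operator machinery.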
I wish I could do this for comparisons too, but the default comparison
implementation allows comparing anything to anything else (returning
false); we don't want that (at least the test suite makes sure
e.g. Set()==42 raises TypeError). That's probably fine; otherwise
other set implementations would be constrained to implementing a hash
that's compatible with ours.
Fred Drake [Wed, 21 Aug 2002 20:23:22 +0000 (20:23 +0000)]
Refactor: Remove some code that was obsoleted when this module was
changed to use universal newlines.
Remove all imports from the compile() function; these are
now done at the top of the module ("Python normal form"),
and define a helper based on the platform instead of
testing the platform in the compile() function.
Fred Drake [Wed, 21 Aug 2002 19:24:21 +0000 (19:24 +0000)]
Clarify that even though some of the relevant specifications define the
order in which form variables should be encoded in a request, a CGI script
should not rely on that since a client may not conform to those specs, or
they may not be relevant to the request.
Closes SF bug #596866.
Now that __init__ transforms set elements, we know that all of the
elements are hashable, so we can use dict.update() or dict.copy()
for a C speed Set.copy().
Guido van Rossum [Wed, 21 Aug 2002 03:20:44 +0000 (03:20 +0000)]
Ouch. The test suite *really* needs work!!!!! There were several
superficial errors and one deep one that aren't currently caught. I'm
headed for bed after this checkin.
- Fixed several typos introduced by Raymond Hettinger (through
cut-n-paste from my template): it's _as_temporarily_immutable, not
_as_temporary_immutable, and moreover when the element is added, we
should use _as_immutable.
- Made the seq argument to ImmutableSet.__init__ optional, so we can
write ImmutableSet() to create an immutable empty set.
- Rename the seq argument to Set and ImmutableSet to iterable.
- Add a Set.__hash__ method that raises a TypeError. We inherit a
default __hash__ implementation from object, and we don't want that.
We can then catch this in update(), so that
e.g. s.update([Set([1])]) will transform the Set([1]) to
ImmutableSet([1]).
- Added the dance to catch TypeError and try _as_immutable in the
constructors too (by calling _update()). This is needed so that
Set([Set([1])]) is correctly interpreted as
Set([ImmutableSet([1])]). (I was puzzled by a side effect of this
and the inherited __hash__ when comparing two sets of sets while
testing different powerset implementations: the Set element passed
to a Set constructor wasn't transformed to an ImmutableSet, and then
the dictionary didn't believe that the Set found in one dict was the
same as the ImmutableSet in the other, because the hashes were
different.)
- Refactored Set.update() and both __init__() methods; moved the body
of update() into BaseSet as _update(), and call this from __init__()
and update().
- Changed the NotImplementedError in BaseSet.__init__ to TypeError,
both for consistency with basestring() and because we have to use
TypeError when denying Set.__hash__. Together those provide
sufficient evidence that an unimplemented method needs to raise
TypeError.
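
The changes listed above fit together roughly as in the following sketch. These are toy classes written for a current Python 3 under stated assumptions; they are not the sets.py source, and details such as the __eq__ and __hash__ bodies are invented for the illustration.

```python
class BaseSet:
    def __init__(self):
        # Abstract: only the subclasses can be instantiated.
        raise TypeError("BaseSet is an abstract class")

    def __iter__(self):
        return iter(self._data)

    def __eq__(self, other):
        return isinstance(other, BaseSet) and self._data == other._data

    def _update(self, iterable):
        data = self._data
        for element in iterable:
            try:
                data[element] = True
            except TypeError:
                # Unhashable element: retry with its immutable form, if any.
                transform = getattr(element, "_as_immutable", None)
                if transform is None:
                    raise
                data[transform()] = True


class ImmutableSet(BaseSet):
    def __init__(self, iterable=()):
        self._data = {}
        self._update(iterable)

    def __hash__(self):
        return hash(frozenset(self._data))


class Set(BaseSet):
    def __init__(self, iterable=()):
        self._data = {}
        self._update(iterable)

    def __hash__(self):
        # Without this, a default hash would let a mutable Set be a dict key.
        raise TypeError("Can't hash a Set, only an ImmutableSet")

    def update(self, iterable):
        self._update(iterable)

    def _as_immutable(self):
        return ImmutableSet(self)


nested = Set([Set([1])])
print(type(next(iter(nested))).__name__)
```

Because Set.__hash__ raises TypeError, storing a Set as a dict key fails, _update() catches that and stores the ImmutableSet equivalent instead, so two sets of sets built either way compare equal.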
Guido van Rossum [Tue, 20 Aug 2002 21:38:37 +0000 (21:38 +0000)]
Move __init__ from BaseSet into Set and ImmutableSet. This causes a
tiny amount of code duplication, but makes it possible to give BaseSet
an __init__ that raises an exception.
Tim Peters [Tue, 20 Aug 2002 19:00:22 +0000 (19:00 +0000)]
long_format(), long_lshift(): Someone on c.l.py is trying to boost
SHIFT and MASK, and widen digit. One problem is that code of the form
digit << small_integer
implicitly assumes that the result fits in an int or unsigned int
(platform-dependent, but "int sized" in any case), since digit is
promoted "just" to int or unsigned via the usual integer promotions.
But if digit is typedef'ed as unsigned int, this loses information.
The cure for this is just to cast digit to twodigits first.
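
The loss of information can be simulated in Python by masking to a fixed width. The widths here are assumptions made for the sketch (a 32-bit unsigned int and a 64-bit twodigits); the real values are platform- and build-dependent, and the function names are hypothetical.

```python
UINT_BITS = 32        # assumed width of unsigned int
TWODIGITS_BITS = 64   # assumed width of twodigits after widening digit


def shift_in_uint(digit, n):
    """Model 'digit << n' evaluated in an unsigned int: high bits vanish."""
    return (digit << n) & ((1 << UINT_BITS) - 1)


def shift_in_twodigits(digit, n):
    """Model '(twodigits)digit << n': the wider type keeps the high bits."""
    return (digit << n) & ((1 << TWODIGITS_BITS) - 1)


digit = 0x40000000  # a value that only fits once digit has been widened
print(hex(shift_in_uint(digit, 2)), hex(shift_in_twodigits(digit, 2)))
```

With these assumed widths the first shift wraps around to zero while the second keeps the full result, which is the difference the cast makes.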
Guido van Rossum [Tue, 20 Aug 2002 17:29:29 +0000 (17:29 +0000)]
Fix some endcase bugs in unicode rfind()/rindex() and endswith().
These were reported and fixed by Inyeol Lee in SF bug 595350. The
endswith() bug was already fixed in 2.3, but this adds some more test
cases.
Barry Warsaw [Tue, 20 Aug 2002 14:50:09 +0000 (14:50 +0000)]
get_content_type(), get_content_maintype(), get_content_subtype(): RFC
2045, section 5.2 states that if the Content-Type: header is
syntactically invalid, the default type should be text/plain.
Implement minimal sanity checking of the header -- it must have
exactly one slash in it. This closes SF patch #597593 by Skip, but in
a different way.
Note that these methods used to raise ValueError for invalid ctypes,
but now they won't.
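
The resulting behavior can be observed with the email package of a current Python 3, which still applies the exactly-one-slash check. The helper content_type_of() is hypothetical and exists only for this sketch.

```python
from email.message import Message


def content_type_of(header_value):
    """Return get_content_type() for a message with the given header."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type()


for value in ("text/html; charset=us-ascii", "garbage", "a/b/c"):
    print(repr(value), "->", content_type_of(value))
```

A well-formed header yields its own type in lower case; headers with no slash or with more than one fall back to text/plain instead of raising ValueError.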
Barry Warsaw [Tue, 20 Aug 2002 14:47:30 +0000 (14:47 +0000)]
_dispatch(): Use get_content_maintype() and get_content_subtype() to
get the MIME main and sub types, instead of getting the whole ctype
and splitting it here. The two more specific methods now correctly
implement RFC 2045, section 5.2.
Create two subsections of the "Core Language Changes" section, because
the list is getting awfully long.
Mention Karatsuba multiplication and some other items.
Guido van Rossum [Mon, 19 Aug 2002 21:43:18 +0000 (21:43 +0000)]
SF patch 576101, by Oren Tirosh: alternative implementation of
interning. I modified Oren's patch significantly, but the basic idea
and most of the implementation is unchanged. Interned strings created
with PyString_InternInPlace() are now mortal, and you must keep a
reference to the resulting string around; use the new function
PyString_InternImmortal() to create immortal interned strings.
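
The C API change itself cannot be shown from Python, but the user-visible idea of interning can. This sketch uses sys.intern() from a current Python 3 (in 2002 the equivalent was the intern() builtin); the strings are built at run time so that they start out as separate objects.

```python
import sys

# Two equal strings built separately at run time.
a = "".join(["inter", "ned string"])
b = "".join(["interned", " string"])

# Interning maps equal strings to one shared object. Keep the returned
# reference: a mortal interned string can be freed once nobody holds it.
ia = sys.intern(a)
ib = sys.intern(b)

print(a == b, ia is ib)
```

The point of "mortal" is in the comment: the interned string stays shared only as long as some reference to it is kept alive.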
Guido van Rossum [Mon, 19 Aug 2002 20:24:07 +0000 (20:24 +0000)]
Another ugly inlining hack, expanding the two PyDict_GetItem() calls
in LOAD_GLOBAL. Besides saving a C function call, it saves checks
whether f_globals and f_builtins are dicts, and extracting and testing
the string object's hash code is done only once. We bail out of the
inlining if the name is not exactly a string, or when its hash is -1;
because of interning, neither should ever happen. I believe interning
guarantees that the hash code is set, and I believe that the 'names'
tuple of a code object always contains interned strings, but I'm not
assuming that -- I'm simply testing hash != -1.
On my home machine, this makes a pystone variant with new-style
classes and slots run at the same speed as classic pystone! (With
new-style classes but without slots, it is still a lot slower.)
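
In Python terms, the lookup that LOAD_GLOBAL performs is roughly the following. This is only a model of the lookup order; the actual change is C code in the interpreter loop, and the function name load_global() is hypothetical.

```python
import builtins


def load_global(name, f_globals, f_builtins):
    """Model of LOAD_GLOBAL: try the globals dict, then the builtins dict.

    The inlined C version additionally reuses the name's cached hash for
    both lookups; that detail has no direct Python equivalent.
    """
    try:
        return f_globals[name]
    except KeyError:
        pass
    try:
        return f_builtins[name]
    except KeyError:
        raise NameError("global name %r is not defined" % name) from None


print(load_global("len", {}, vars(builtins)))
```

A name defined in the globals dict shadows the builtin of the same name, and a name found in neither dict raises NameError.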
Guido van Rossum [Mon, 19 Aug 2002 19:26:42 +0000 (19:26 +0000)]
Call me anal, but there was a particular phrase that was spreading to
comments everywhere that bugged me: /* Foo is inlined */ instead of
/* Inline Foo */. Somehow the "is inlined" phrase always confused me
for half a second (thinking, "No it isn't" until I added the missing
"here"). The new phrase is hopefully unambiguous.
Guido van Rossum [Mon, 19 Aug 2002 16:02:33 +0000 (16:02 +0000)]
Simple but important optimization for descr_check(): instead of the
expensive and overly general PyObject_IsInstance(), call
PyObject_TypeCheck() which is a macro that often avoids a call, and if
it does make a call, calls the much more efficient PyType_IsSubtype().
This saved 6% on a benchmark for slot lookups.
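
A rough Python analogue of the cheaper check, for illustration only. The real PyObject_TypeCheck() is a C macro; the function name type_check() and the use of __mro__ to stand in for PyType_IsSubtype() are assumptions of this sketch.

```python
def type_check(obj, tp):
    """Model of PyObject_TypeCheck(obj, tp).

    Fast path: the object's type is exactly tp, so no call is needed.
    Slow path: walk the type's MRO, as PyType_IsSubtype() does, without
    the extra generality of a full isinstance() check.
    """
    ob_type = type(obj)
    return ob_type is tp or tp in ob_type.__mro__


print(type_check(1, int), type_check(True, int), type_check(1, bool))
```

The exact-type comparison handles the common case, and only subclass instances pay for the MRO walk.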
Tim Peters [Mon, 19 Aug 2002 00:42:29 +0000 (00:42 +0000)]
SF bug 595919: popenN return only text mode pipes
popen2() and popen3() created text-mode pipes even when binary mode
was asked for. This was specific to Windows.
Jack Jansen [Sun, 18 Aug 2002 21:57:09 +0000 (21:57 +0000)]
Refuse to run if the last bit of the destination path contains a # character.
This is a silly workaround for a rather serious bug in MacOSX: if you take
a long filename and convert it to an FSSpec the fsspec gets a magic
cookie (containing a #, indeed). If you then massage the extension of this
fsspec and convert back to a pathname you may end up referring to the
same file. This could destroy your sourcefile. The problem only occurs
in MacPython-OS9, not MacPython-OSX (I think).
Modify splituser() method to allow an @ in the userinfo field.
Jeremy reported that this is not allowed by RFC 2396; however,
other tools support unescaped @'s so we should also.
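
One way to get this behavior is to split on the last @ rather than the first. The sketch below is a hypothetical stand-alone function, not the actual splituser() code from this checkin.

```python
def split_user(host):
    """Split 'user[:passwd]@host[:port]' into (userinfo, hostport).

    Splitting at the LAST '@' lets the userinfo part contain unescaped
    '@' characters. Returns (None, host) when there is no userinfo.
    """
    user, delim, hostport = host.rpartition("@")
    return (user if delim else None), hostport


print(split_user("joe:p@ss@example.com:80"))
print(split_user("example.com"))
```

Everything before the final @ is treated as userinfo, so a password containing @ no longer truncates the host part.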