I was in this module anyway, so I did some janitorial things.
METH_NOARGS functions are still called with two arguments, one NULL,
so put that back into the function definitions (I didn't know this
until recently).
Guido van Rossum [Thu, 30 Jan 2003 06:37:41 +0000 (06:37 +0000)]
There was a subtle bug in save_newobj(): it used self.save_global(t)
on the type instead of self.save(t). This defeated the purpose of
NEWOBJ, because it didn't generate a BINGET opcode when t was already
memoized; but moreover, it would generate multiple BINPUT opcodes for
the same type! pickletools.dis() doesn't like this.
How did I find this? I was playing with picklesize.py in the datetime
sandbox, and noticed that protocol 2 pickles for multiple objects were
in fact larger than protocol 1 pickles! That was suspicious, so I
decided to disassemble one of the pickles.
This really needs a unit test, but I'm exhausted. I'll be late for
work as it is. :-(
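A rough way to check the memo behavior described above, assuming a
current CPython 3 (this is not the missing unit test, and it is not
part of the checkin). argparse.Namespace is only a convenient stdlib
stand-in for "some class pickled via NEWOBJ"; the variable names are
made up for the illustration. The class should be written once as a
GLOBAL and fetched from the memo for the second instance.

```python
import argparse
import pickle
import pickletools

# Two instances of the same class. argparse.Namespace is an arbitrary
# stand-in chosen for this sketch, not something the checkin mentions.
objs = [argparse.Namespace(x=1), argparse.Namespace(x=2)]
data = pickle.dumps(objs, protocol=2)

# Tally opcode names without executing the pickle.
names = [op.name for op, arg, pos in pickletools.genops(data)]
global_count = names.count("GLOBAL")   # the class, written by name
newobj_count = names.count("NEWOBJ")   # one per instance
binget_count = names.count("BINGET")   # memo fetches, incl. the class

print(global_count, newobj_count, binget_count)

# The symbolic disassembly shows where the memo is written and read.
pickletools.dis(data)
```

If the class were saved with save_global() each time, GLOBAL would
show up once per instance instead of once per pickle.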
Guido van Rossum [Thu, 30 Jan 2003 05:39:04 +0000 (05:39 +0000)]
In save_newobj(), if an object's __getnewargs__ and __getstate__ are
the same function, don't save the state or write a BUILD opcode. This
is so that a type (e.g. datetime :-) can support protocol 2 using
__getnewargs__ while also supporting protocol 0 and 1 using
__getstate__. (Without this, the state would be pickled twice with
protocol 2, unless __getstate__ is defined to return None, which
breaks protocol 0 and 1.)
Tim Peters [Wed, 29 Jan 2003 20:12:21 +0000 (20:12 +0000)]
dis(): This had a problem with proto 0 pickles, in that POP sometimes
popped a MARK, but without stack emulation the disassembler couldn't
know that, and subsequent indentation got hosed.
Now the disassembler does do enough stack emulation to catch this. While
I was at it, also added lots of sanity checks for other stack operations,
and correct use of the memo. This goes (I think) a long way toward being
a "pickle verifier" now too.
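A small sketch of the "verifier" side of dis(), assuming a current
CPython 3 where pickletools.dis() raises ValueError on stack or memo
misuse. The check() helper and the hand-written byte strings are
illustrative only.

```python
import io
import pickletools

def check(data):
    """Return None if dis() accepts the pickle, else the error text."""
    try:
        # Send the listing to a throwaway buffer; only errors matter here.
        pickletools.dis(data, out=io.StringIO())
    except ValueError as exc:
        return str(exc)
    return None

# NONE, MARK, POP, STOP: the POP pops the MARK, leaving None for STOP.
pop_of_mark = check(b"N(0.")

# POP with nothing on the stack.
pop_of_nothing = check(b"0.")

# BINGET of memo slot 0, which was never stored into.
get_unstored = check(b"h\x00.")

print(pop_of_mark, pop_of_nothing, get_unstored, sep="\n")
```

The first pickle is accepted; the other two are rejected with a
message naming the offending stack or memo operation.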
Guido van Rossum [Wed, 29 Jan 2003 17:58:45 +0000 (17:58 +0000)]
Implement appropriate __getnewargs__ for all immutable subclassable builtin
types. The special handling for these can now be removed from save_newobj().
Add some testing for this.
Also add support for setting the 'fast' flag on the Python Pickler class,
which suppresses use of the memo.
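A quick illustration of both points, assuming a current CPython 3
(where the C Pickler also exposes a 'fast' attribute). The put_count()
helper is made up for this sketch.

```python
import io
import pickle
import pickletools

# Each immutable builtin reports the arguments needed to rebuild its value.
newargs = {
    "int": (5).__getnewargs__(),
    "float": (1.5).__getnewargs__(),
    "str": "abc".__getnewargs__(),
    "tuple": (1, 2).__getnewargs__(),
}
print(newargs)

def put_count(obj, fast):
    """Count memo-writing opcodes in a protocol 2 pickle of obj."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf, protocol=2)
    pickler.fast = fast          # true value: do not use the memo
    pickler.dump(obj)
    ops = pickletools.genops(buf.getvalue())
    return sum(1 for op, arg, pos in ops if "PUT" in op.name)

nested = [[1], [2]]
with_memo = put_count(nested, fast=False)
without_memo = put_count(nested, fast=True)
print(with_memo, without_memo)
```

Note that with the memo suppressed, shared or recursive objects are no
longer detected, so 'fast' is only safe for data without cycles.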
Tim Peters [Wed, 29 Jan 2003 00:35:32 +0000 (00:35 +0000)]
Expect test_macostools and test_macfs to get skipped whenever
sys.platform != mac. Likewise expect test_win{reg,sound} to get skipped
on non-win32 platforms.
Tim Peters [Tue, 28 Jan 2003 22:34:11 +0000 (22:34 +0000)]
Temporary hacks to arrange that the pickle tests relying on protocol 2
only get run by test_pickle.py now (& not by test_cpickle.py). This
should be undone when protocol 2 is implemented in cPickle too.
test_cpickle should pass again.
Guido van Rossum [Tue, 28 Jan 2003 22:01:16 +0000 (22:01 +0000)]
Instead of bad hacks trying to worm around the inherited
object.__reduce__, do a getattr() on the class so we can explicitly
test for it. The reduce()-calling code becomes a bit more regular as
a result.
Also add support for slots: if an object has slots, the default state is
(dict, slots) where dict is the __dict__ or None, and slots is a dict
mapping slot names to slot values. We do a best-effort approach to
find slot names, assuming the __slots__ fields of classes aren't
modified after class definition time to misrepresent the actual list
of slots defined by a class.
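The (dict, slots) shape can be inspected directly, assuming a current
CPython 3 where object.__reduce_ex__(2) reports the default state as
the third item of the reduce tuple. Point and Tagged are hypothetical
classes invented for this sketch.

```python
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

class Tagged(Point):
    # No __slots__ here, so instances also get a __dict__.
    pass

p = Point(1, 2)
slots_only_state = p.__reduce_ex__(2)[2]

t = Tagged(3, 4)
t.label = "corner"
dict_and_slots_state = t.__reduce_ex__(2)[2]

print(slots_only_state)       # (None, {'x': 1, 'y': 2})
print(dict_and_slots_state)   # ({'label': 'corner'}, {'x': 3, 'y': 4})
```

Only slots that actually hold a value end up in the slots dict.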
Jack Jansen [Tue, 28 Jan 2003 21:45:44 +0000 (21:45 +0000)]
Install "python$(VERSION)" into /usr/local as the symlink to the framework,
and also create a symlink "python" pointing to "python$(VERSION)".
Fixes #675745.
Tim Peters [Tue, 28 Jan 2003 20:37:45 +0000 (20:37 +0000)]
Added new private API function _PyLong_NumBits. This will be used at the
start for the C implementation of new pickle LONG1 and LONG4 opcodes (the
linear-time way to pickle a long is to call _PyLong_AsByteArray, but
the caller has no idea how big an array to allocate, and correct
calculation is a bit subtle).
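The sizing problem can be sketched in Python, where int.bit_length()
plays roughly the role a bit-count helper plays in C. This is an
analogue under that assumption, not the C code; encode_long_sketch()
is a made-up name, and it is compared against pickle.encode_long()
from a current CPython 3.

```python
import pickle

def encode_long_sketch(x):
    """Little-endian two's-complement bytes for x, as LONG1/LONG4 want."""
    if x == 0:
        return b""
    # bit_length() ignores the sign, so always leave room for a sign bit.
    nbytes = (x.bit_length() >> 3) + 1
    result = x.to_bytes(nbytes, "little", signed=True)
    # For negative values the estimate can be one byte too generous:
    # drop a trailing 0xff when the byte before it already has its
    # sign bit set.
    if x < 0 and nbytes > 1 and result[-1] == 0xFF and result[-2] & 0x80:
        result = result[:-1]
    return result

samples = [0, 1, -1, 127, 128, 255, -128, -129, 1 << 64, -(1 << 64)]
matches = all(encode_long_sketch(v) == pickle.encode_long(v) for v in samples)
round_trips = all(
    int.from_bytes(encode_long_sketch(v), "little", signed=True) == v
    for v in samples
)
print(matches, round_trips)
```

The subtle part is exactly the sign handling: 255 needs two bytes,
while -128 fits in one.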
Guido van Rossum [Tue, 28 Jan 2003 19:48:18 +0000 (19:48 +0000)]
The default __reduce__ on the base object type obscured any
possibility of calling save_reduce(). Add a special hack for this.
The tests for this are much simpler now (no __getstate__ or
__getnewargs__ needed).
Tim Peters [Tue, 28 Jan 2003 15:27:57 +0000 (15:27 +0000)]
dis(): Not all opcodes are printable anymore, so print the repr
of the opcode character instead (but stripping the quotes).
Added a proto 2 test section for the canonical recursive-tuple case.
Note that since pickle's save_tuple() takes different paths depending on
tuple length now, beefier tests are really needed (but not in pickletools);
the "short tuple" case tried here was actually broken yesterday, and it's
subtle stuff so needs to be tested.
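For reference, the recursive-tuple case looks like this as a round
trip (a sketch only, run against a current CPython 3; it is not the
test added by this checkin). A tuple cannot contain itself directly,
so the cycle goes through a list.

```python
import pickle

# Build t = ([t],): the tuple holds a list that holds the tuple.
inner = []
t = (inner,)
inner.append(t)

results = {}
for proto in (0, 1, 2):
    clone = pickle.loads(pickle.dumps(t, proto))
    # The cycle must survive: clone[0][0] has to be clone itself.
    results[proto] = (
        type(clone) is tuple
        and len(clone) == 1
        and clone[0][0] is clone
    )

print(results)
```

A one-element tuple exercises the short-tuple path in proto 2; longer
tuples take a different path and would need their own cases.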
Tim Peters [Tue, 28 Jan 2003 05:34:53 +0000 (05:34 +0000)]
save_tuple(): I believe the new code for TUPLE{1,2,3} in proto 2 was
incorrect for recursive tuples. Tried to repair; seems to work OK, but
there are no checked-in tests for this yet.
Tim Peters [Tue, 28 Jan 2003 03:51:36 +0000 (03:51 +0000)]
save_inst(): Rewrote to have only one branch on self.bin. Also got rid
of my recent XXX comment, taking a (what appears to be vanishingly small)
chance and calling self.memoize() instead.
Guido van Rossum [Tue, 28 Jan 2003 03:17:21 +0000 (03:17 +0000)]
Add a comment explaining that struct.pack() beats marshal.dumps(), but
marshal.loads() beats struct.unpack()! Possibly because the latter
creates a one-tuple. :-(
Guido van Rossum [Tue, 28 Jan 2003 03:03:08 +0000 (03:03 +0000)]
Got rid of mdumps; I timed it, and struct.pack("<i", x) is more than
40% faster than marshal.dumps(x)[1:]! (That's not counting the
module attribute lookups, which can be avoided in either case.)
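A sketch showing that the two spellings produce the same four bytes
for 32-bit ints, assuming a current CPython 3, plus a small harness
for repeating the timing. No timing result is claimed here; the 40%
figure above is from 2003. The time_both() helper is made up for the
illustration.

```python
import marshal
import struct
import timeit

# For ints that fit in 32 bits, marshal writes a one-byte type code
# followed by the same little-endian 4 bytes that "<i" produces.
values = [0, 1, -1, 255, 65536, 2**31 - 1, -(2**31)]
same = all(struct.pack("<i", v) == marshal.dumps(v)[1:] for v in values)
print(same)

def time_both(x=123456, number=100_000):
    """Return (struct_seconds, marshal_seconds) for `number` calls each."""
    pack = struct.pack        # hoist the attribute lookups out of the loop
    dumps = marshal.dumps
    t_struct = timeit.timeit(lambda: pack("<i", x), number=number)
    t_marshal = timeit.timeit(lambda: dumps(x)[1:], number=number)
    return t_struct, t_marshal

print(time_both(number=10_000))
```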
Tim Peters [Tue, 28 Jan 2003 01:03:10 +0000 (01:03 +0000)]
save_pers(): Switched the order of cases, to get rid of a "not", and to
make the bin-vs-not-bin order consistent with what other routines try to
do (they almost all handle the bin case first).
Tim Peters [Tue, 28 Jan 2003 01:00:38 +0000 (01:00 +0000)]
Several routines appeared to inline the guts of memoize(), possibly for
some notion of low-level efficiency. Undid that, but left one routine
alone: save_inst() claims it has a reason for not using memoize().
I don't understand that comment, so added an XXX comment there.
Tim Peters [Tue, 28 Jan 2003 00:48:09 +0000 (00:48 +0000)]
save(): Fiddled the control flow to put the normal case where it
belongs. This is a much smaller change than it may appear: the bulk
of the function merely got unindented by one level.
Tim Peters [Tue, 28 Jan 2003 00:13:19 +0000 (00:13 +0000)]
Removed the new LONG2 opcode: it's extravagant. If LONG1 isn't enough,
then the embedded argument consumes at least 256 bytes. The difference
between a 3-byte prefix (LONG2 + 2 bytes) and a 5-byte prefix (LONG4 +
4 bytes) is at worst less than 1%. Note that binary strings and binary
Unicode strings also have only "size is 1 byte, or size is 4 bytes?"
flavors, and I expect for the same reason. The only place a 2-byte
thingie was used was in BININT2, where the 2 bytes make up the *entire*
embedded argument (and now EXT2 also does this); that's a large savings
over 4 bytes, because the total opcode+argument size is so small in
the BININT2/EXT2 case.
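The arithmetic above, spelled out as a sketch against a current
CPython 3 (long_opcode() is a made-up helper). LONG1 covers payloads
up to 255 bytes, so the smallest payload that would have used LONG2
is 256 bytes, and the two extra prefix bytes are measured against
that.

```python
import pickle
import pickletools

def long_opcode(n):
    """Name of the LONG* opcode used for n in a protocol 2 pickle."""
    data = pickle.dumps(n, protocol=2)
    for op, arg, pos in pickletools.genops(data):
        if op.name.startswith("LONG"):
            return op.name
    return None

small = long_opcode(1 << 100)     # 13-byte payload
large = long_opcode(1 << 2047)    # 257-byte payload, too big for LONG1

# Worst case for dropping LONG2: a 256-byte payload pays a 5-byte
# prefix instead of a 3-byte one.
worst_case_overhead = (5 - 3) / (3 + 256)

print(small, large, f"{worst_case_overhead:.2%}")
```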
Removed the TAKEN_FROM_ARGUMENT "number of bytes" code, and bifurcated it
into TAKEN_FROM_ARGUMENT1 and TAKEN_FROM_ARGUMENT4. Now there's enough
info in ArgumentDescriptor objects to deduce the # of bytes consumed by
each opcode.
Rearranged the order in which proto2 opcodes are listed in pickle.py.