Adjust TAS assembly as per recent discussions: use "+m"(*lock) everywhere
to reference the spinlock variable, and specify "memory" as a clobber
operand to be sure gcc does not try to keep shared-memory values in
registers across a spinlock acquisition. Also tighten the S/390 asm
sequence, which was apparently written with only minimal study of the
gcc asm documentation. I have personally tested i386, ia64, ppc, hppa,
and s390 variants --- there is some small chance that I broke the others,
but I doubt it.