Fix #69 by making the NASM preprocessor and parser use the yasm built-in
alignment bytecode rather than repeating a NOP with "times". This generates
better padding code than a run of single-byte NOPs.

The new align only triggers when the NASM align directive is used unadorned
or with nop as the parameter (e.g. "align 16" or "align 16, nop"). Other
uses, including all uses of balign, keep their old NASM behavior. This is
useful if you still want a string of NOPs rather than more optimized
instruction patterns: just use "balign X, nop" rather than "align X". The
new align also follows the GAS behavior of raising the section's alignment
to the specified alignment (if it is not already at least that large).
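
The rules above can be sketched in C roughly as follows. This is only an
illustration, not the code in this commit: the helper names are hypothetical,
exact case-sensitive matching of "nop" is an assumption, and the sketch only
models the align directive (balign always takes the old path).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero when "align" should use the built-in
 * alignment bytecode, i.e. no fill argument was given or it is "nop".
 * Exact, case-sensitive matching is an assumption of this sketch. */
static int uses_builtin_align(const char *fill)
{
    return fill == NULL || strcmp(fill, "nop") == 0;
}

/* Padding bytes needed to bring offset up to a power-of-two boundary. */
static unsigned long align_padding(unsigned long offset,
                                   unsigned long boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (boundary - (offset & (boundary - 1))) & (boundary - 1);
}

/* GAS-like rule: raise the section alignment, never lower it. */
static unsigned long raise_section_align(unsigned long current,
                                         unsigned long requested)
{
    return requested > current ? requested : current;
}
```

For example, align_padding(5, 16) is 11, and raise_section_align(32, 16)
leaves the section alignment at 32.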

While I was in here, I found and fixed a bug (a typo) in 16-bit alignment
fill generation. I also changed the x86 32-bit code alignment fill pattern
per suggestions in the AMD x86 code optimization manual.

* nasm-bison.y: Implement a new [align] directive that takes a single
  parameter (the alignment) and generates a nop-generating align bytecode.
* standard.mac: Change align macro to generate [align] if the second
  macro parameter is nonexistent or "nop".
* x86arch.c (x86_get_fill): Update 32-bit fill pattern and fix bug in
  16-bit fill pattern.