Peter Johnson [Fri, 2 Jan 2009 07:27:39 +0000 (07:27 -0000)]
Gas parser: Move instruction/prefix lookup from tokenizer to parser.
Use the single token of lookahead to detect the label case.
This is significantly cleaner as it removes the special-casing of labels
in the tokenizer (so there is just a single identifier rule) and removes
the INSTDIR parser state (as this was only used to prevent instruction
lookup within other locations).
Also, ID and LABEL now provide the string length to the parser. We needed
to do this for ID due to parse_check_insnprefix() needing the length, so
both were folded in for consistency.
Peter Johnson [Sun, 21 Dec 2008 10:57:36 +0000 (10:57 -0000)]
Legalize effective addresses such as [eax*2+ebx*2-ebx].
These were incorrectly identified as invalid. The code would see the
ebx*2 term and identify ebx as the index register. After the ebx was
subtracted, the ebx remained in the index register slot, so eax*2 had
nowhere to go. The code now recognizes this case and frees the slot
when the -ebx is processed, leaving the index register selection up
to the main part of the code.
Peter Johnson [Fri, 5 Dec 2008 07:13:33 +0000 (07:13 -0000)]
gen_x86_insn.py: Handle invalid rcstag.
This can happen if somehow this file is retrieved without expanded keywords
(e.g. directly from the webpage, or via something like git-svn).
Reported by: Patrick Walton <pcwalton@cs.ucla.edu>
Peter Johnson [Tue, 7 Oct 2008 05:38:11 +0000 (05:38 -0000)]
Add core TASM syntax support.
Contributed by: Samuel Thibault <samuel.thibault@ens-lyon.org>
It is built on top of the NASM parser and preproc, with the following
notable extensions for TASM syntax:
- case insensitive symbols and filenames,
- support for segment and size of labels, which permits to avoid giving
them on each memory dereference,
- support for data reservation (i.e. e.g. "var dd ?"),
- support for multiples (i.e. e.g. "var dd 1 dup 10"),
- little endian string integer constants,
- additional expression operators: shl, shr, and, or, low, high,
- additional offset keyword,
- additional fword and df,
- support for doubled quotes within quotes,
- support for array-like and structure-like notations: t[eax] and
[var].field,
- support for tasm directives: macro, rept, irp, locals, proc, struc,
segment, assume.
Notes:
- Almost all extensions are only effective when tasm_compatible_mode is
set, so we should have very reduced possible breakage.
- Because the "and" keyword can be an expression operator and an
instruction name, the data pseudo-instructions explicitly switch the
lexer state to INSTRUCTION state to fix the ambiguity.
- In gen_x86_insn.py, several instructions (namely lds and lea) now take
relaxed memory sizes. The reason is that in the case of tasm, the size
of the actual pointed data is passed up to there, and thus any type of
data should be accepted.
With all of this, loadlin can be compiled by yasm with quite reduced
modifications.
Peter Johnson [Sun, 5 Oct 2008 08:31:04 +0000 (08:31 -0000)]
Optimize non-strict push with 66 override to byte size if possible in NASM
syntax.
Previously, the forms of push that did this optimization were disabled in
NASM syntax due to conflicting with the size=BITS case. Fix this via
reordering to allow these forms to be active in NASM syntax.
Peter Johnson [Tue, 30 Sep 2008 03:56:37 +0000 (03:56 -0000)]
Fix expr simplification bug.
If an expression of the form INT+(a+b)+INT was simplified, constant folding
would combine the ints, but then the level stage (to make INT+a+b) would
start reading from the second (deleted due to folding) INT rather than the
new end of the expression.
Peter Johnson [Tue, 30 Sep 2008 02:52:56 +0000 (02:52 -0000)]
Nasm lexer: Don't read past end of passed string.
This was because the re2c-generated code always reads the next character
prior to user code being executed. Instead, check for the \0 marker prior
to entering the re2c code. Retain the re2c check just for sanity.
Peter Johnson [Tue, 15 Jul 2008 05:49:29 +0000 (05:49 -0000)]
Actually implement yasm__abspath() according to its documentation.
Formerly, it would not correctly handle absolute paths. Now it uses
yasm__combpath() to do absolute-path aware path combination.
As a benefit, it no longer needs to be OS-specific.
Peter Johnson [Sun, 6 Jul 2008 22:26:49 +0000 (22:26 -0000)]
Generated files listed in SOURCES (rather than included by other files)
should not be listed in BUILT_SOURCES but rather have nodist_ prepended.
They still need to be separately listed as CLEANFILES as they're built
at make time.
Reported by: David Harvey <dmharvey@math.harvard.edu>
Peter Johnson [Thu, 3 Jul 2008 04:19:12 +0000 (04:19 -0000)]
Bin map file: Fix incorrect address printing for symbols.
We were printing the previous bytecode's start rather than the label's
bytecode. Use yasm_bc_next_offset() to get the correct offset.
Peter Johnson [Sun, 8 Jun 2008 09:06:05 +0000 (09:06 -0000)]
Fix #132: Add --prefix and --suffix (aka --postfix) options.
These allow arbitrary prefixes and/or suffixes to be added to
externally-visible (GLOBAL, EXTERN, or COMMON) symbol names.
Peter Johnson [Thu, 5 Jun 2008 08:48:21 +0000 (08:48 -0000)]
Fix #141: Add macho64 PIC support.
The way PIC relocations are generated for macho64 requires a bit of a hack
to detect MOV opcodes and generate GOT_LOAD relocs.
GAS contains a similar hack.
Peter Johnson [Fri, 23 May 2008 06:46:51 +0000 (06:46 -0000)]
Enable DLL/plugin builds with cmake on Windows.
Add proper declspec dllimport/dllexport to all libyasm functions.
Use macros to make these do nothing on non-cmake and Unix builds.
Peter Johnson [Thu, 22 May 2008 09:08:03 +0000 (09:08 -0000)]
Add cmake build infrastructure.
Not default nor even distributed in the .tar.gz, the cmake build allows for
loadable yasm plugins by building libyasm as a shared library.
Example plugins are in the plugins/ directory, and may be loaded into a
cmake-built yasm using the -N command line option (non-cmake builds will
not have this option).
Tested only on Linux so far, but should be relatively painless to port to
Windows thanks to the use of cmake rather than libtool to create shared
libraries.
The only modification to the main source tree is some conditional-compiled
additions to yasm.c.
Peter Johnson [Fri, 9 May 2008 06:46:02 +0000 (06:46 -0000)]
Split NASM preprocessor standard macro set between various modules.
Standard macro sets are looked up based on parser and preprocessor keyword
from individual modules.
The "standard" NASM parser macros now reside in the NASM parser, so when
the GAS parser is used with the NASM preprocessor, the NASM-specific macros
are no longer defined.
Object-format specific macros are now individually defined by each object
formatm module. This allows for the object formats to be independent of the
NASM preprocessor module and yields a small optimization benefit as unused
object format macros don't need to be skipped over.
Also add GAS macro equivalents for the Win64 SEH more complex directives [1].
[1] Requested by Brian Gladman <brg@gladman.plus.com>
Peter Johnson [Sat, 12 Apr 2008 08:30:22 +0000 (08:30 -0000)]
Allow underscores in the middle of binary, octal, and hex constants.
This makes things like 00_11_22_33h okay.
Allow 0X as well as 0x in directives (already allowed for normal case).
Peter Johnson [Fri, 11 Apr 2008 01:37:46 +0000 (01:37 -0000)]
Move BITS==64 condition out of the CPU field (where it really doesn't belong)
and into a new misc_flags field. This doesn't take any more space,
simplifies the code, and allows adding additional conditions like this in
the future.
Found and fixed a bug in CPU field generation (not copying a set variable
in gen_x86_insn.py).
Peter Johnson [Fri, 4 Apr 2008 02:10:24 +0000 (02:10 -0000)]
PCOMUcc should have been PCOMccU.
Fix up condition code naming and add aliases to match SSE5 added to GNU
binutils as follows:
COMcc:
- swapped uneq, unlt, and unle with neq, nlt, and nle
- added unge, ungt, ne, uge, ugt, nge, ngt, une, ge, and gt aliases
PCOMcc:
- added ne alias (for neq)
Peter Johnson [Thu, 27 Mar 2008 05:35:36 +0000 (05:35 -0000)]
Fix #136: Unbreak ..@ non-local-label mechanism.
Add testcase for this.
Also fix $-prefixed labels to match non-$-prefixed label behavior
(this has been broken for a very long time).
Peter Johnson [Fri, 15 Feb 2008 06:06:57 +0000 (06:06 -0000)]
Pass size directly to yasm_value_finalize_expr() rather than setting
afterwards. This fixes the value full-mask case for immediate operands:
e.g. "mov ax, word (label & 0xffff)".
Peter Johnson [Sat, 9 Feb 2008 03:35:07 +0000 (03:35 -0000)]
Enable use of sym@FOO constructs in GAS parser.
To do this, restructure how special symbols are handled between the parser
and object format. Instead of creating special symbols with the right
names, instead have the parser call the object format to see if a match
is found into the special symbols, which are no longer stored in the
symbol table.