Peter Johnson [Tue, 7 Oct 2008 05:38:11 +0000 (05:38 -0000)]
Add core TASM syntax support.
Contributed by: Samuel Thibault <samuel.thibault@ens-lyon.org>
It is built on top of the NASM parser and preproc, with the following
notable extensions for TASM syntax:
- case insensitive symbols and filenames,
- support for segment and size of labels, which avoids having to specify
them on each memory dereference,
- support for data reservation (e.g. "var dd ?"),
- support for multiples (e.g. "var dd 1 dup 10"),
- little endian string integer constants,
- additional expression operators: shl, shr, and, or, low, high,
- additional offset keyword,
- additional fword and df,
- support for doubled quotes within quotes,
- support for array-like and structure-like notations: t[eax] and
[var].field,
- support for tasm directives: macro, rept, irp, locals, proc, struc,
segment, assume.
Notes:
- Almost all extensions are only effective when tasm_compatible_mode is
set, so the risk of breaking existing behavior should be very low.
- Because the "and" keyword can be an expression operator and an
instruction name, the data pseudo-instructions explicitly switch the
lexer state to INSTRUCTION state to fix the ambiguity.
- In gen_x86_insn.py, several instructions (namely lds and lea) now take
relaxed memory sizes. The reason is that in the case of tasm, the size
of the data actually pointed to is passed up to that point, and thus any
type of data should be accepted.
With all of this, loadlin can be compiled by yasm with only minor
modifications.
Peter Johnson [Sun, 5 Oct 2008 08:31:04 +0000 (08:31 -0000)]
Optimize non-strict push with 66 override to byte size if possible in NASM
syntax.
Previously, the forms of push that did this optimization were disabled in
NASM syntax due to conflicting with the size=BITS case. Fix this via
reordering to allow these forms to be active in NASM syntax.
Peter Johnson [Tue, 30 Sep 2008 03:56:37 +0000 (03:56 -0000)]
Fix expr simplification bug.
If an expression of the form INT+(a+b)+INT was simplified, constant folding
would combine the ints, but then the level stage (to make INT+a+b) would
start reading from the second (deleted due to folding) INT rather than the
new end of the expression.
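The fold-then-level order described above can be illustrated with a minimal
Python sketch. It is not yasm's C implementation; the list-based expression
form and the name simplify_add are assumptions made for illustration. The
point it shows is that the leveling pass has to walk the terms that remain
after folding, not the original term positions.

```python
def simplify_add(terms):
    """Simplify a sum given as a list of terms.

    Each term is an int, a symbol name (str), or a nested list standing
    for a parenthesized sub-sum. Hypothetical representation, not yasm's.
    """
    # Stage 1, constant folding: merge every top-level int into the
    # slot of the first int, dropping the later ones.
    folded = []
    int_slot = None
    for term in terms:
        if isinstance(term, int):
            if int_slot is None:
                int_slot = len(folded)
                folded.append(term)
            else:
                folded[int_slot] += term
        else:
            folded.append(term)

    # Stage 2, leveling: splice nested sums into the parent. This walks
    # the folded list, so it never looks at a term removed in stage 1.
    leveled = []
    for term in folded:
        if isinstance(term, list):
            leveled.extend(simplify_add(term))
        else:
            leveled.append(term)
    return leveled
```

Calling simplify_add([1, ["a", "b"], 2]) gives [3, "a", "b"]. Ints inside a
nested sum are not merged with the outer ones in this sketch.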
Peter Johnson [Tue, 30 Sep 2008 02:52:56 +0000 (02:52 -0000)]
Nasm lexer: Don't read past end of passed string.
The overread happened because the re2c-generated code always reads the next character
prior to user code being executed. Instead, check for the \0 marker prior
to entering the re2c code. Retain the re2c check just for sanity.
Peter Johnson [Tue, 15 Jul 2008 05:49:29 +0000 (05:49 -0000)]
Actually implement yasm__abspath() according to its documentation.
Formerly, it would not correctly handle absolute paths. Now it uses
yasm__combpath() to do absolute-path aware path combination.
As a benefit, it no longer needs to be OS-specific.
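As an analogy only, the idea of building abspath on top of an
absolute-path aware combine step can be sketched in Python. This is not
yasm's C code; the function name and the choice of posixpath (POSIX-style
paths only) are assumptions for illustration.

```python
import posixpath


def abspath(path, curdir):
    """Return an absolute, normalized form of path relative to curdir.

    posixpath.join is absolute-path aware: if path is already absolute,
    curdir is discarded, so no separate special case is needed.
    """
    return posixpath.normpath(posixpath.join(curdir, path))
```

With curdir "/home/user", the absolute input "/usr/include/a.inc" comes
back unchanged, while "src/../a.asm" becomes "/home/user/a.asm".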
Peter Johnson [Sun, 6 Jul 2008 22:26:49 +0000 (22:26 -0000)]
Generated files listed in SOURCES (rather than included by other files)
should not be listed in BUILT_SOURCES but rather have nodist_ prepended.
They still need to be separately listed as CLEANFILES as they're built
at make time.
Reported by: David Harvey <dmharvey@math.harvard.edu>
Peter Johnson [Thu, 3 Jul 2008 04:19:12 +0000 (04:19 -0000)]
Bin map file: Fix incorrect address printing for symbols.
We were printing the previous bytecode's start rather than the label's
bytecode. Use yasm_bc_next_offset() to get the correct offset.
Peter Johnson [Sun, 8 Jun 2008 09:06:05 +0000 (09:06 -0000)]
Fix #132: Add --prefix and --suffix (aka --postfix) options.
These allow arbitrary prefixes and/or suffixes to be added to
externally-visible (GLOBAL, EXTERN, or COMMON) symbol names.
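The renaming rule can be sketched in Python as follows. The function and
constant names are hypothetical and the sketch does not reflect how yasm
stores symbols internally; it only shows that the prefix and suffix apply to
externally-visible symbols and leave other symbols alone.

```python
# Visibility classes named in the commit message.
EXTERNALLY_VISIBLE = {"GLOBAL", "EXTERN", "COMMON"}


def decorate_symbol(name, visibility, prefix="", suffix=""):
    """Apply prefix/suffix to a symbol name if it is externally visible."""
    if visibility in EXTERNALLY_VISIBLE:
        return prefix + name + suffix
    # Local symbols keep their original name.
    return name
```

For instance, decorate_symbol("main", "GLOBAL", prefix="_") returns "_main",
while a symbol with any other visibility is returned as-is.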
Peter Johnson [Thu, 5 Jun 2008 08:48:21 +0000 (08:48 -0000)]
Fix #141: Add macho64 PIC support.
The way PIC relocations are generated for macho64 requires a bit of a hack
to detect MOV opcodes and generate GOT_LOAD relocs.
GAS contains a similar hack.
Peter Johnson [Fri, 23 May 2008 06:46:51 +0000 (06:46 -0000)]
Enable DLL/plugin builds with cmake on Windows.
Add proper declspec dllimport/dllexport to all libyasm functions.
Use macros to make these do nothing on non-cmake and Unix builds.
Peter Johnson [Thu, 22 May 2008 09:08:03 +0000 (09:08 -0000)]
Add cmake build infrastructure.
Although it is neither the default nor distributed in the .tar.gz, the cmake
build allows for loadable yasm plugins by building libyasm as a shared library.
Example plugins are in the plugins/ directory, and may be loaded into a
cmake-built yasm using the -N command line option (non-cmake builds will
not have this option).
Tested only on Linux so far, but should be relatively painless to port to
Windows thanks to the use of cmake rather than libtool to create shared
libraries.
The only modification to the main source tree is some conditional-compiled
additions to yasm.c.
Peter Johnson [Fri, 9 May 2008 06:46:02 +0000 (06:46 -0000)]
Split NASM preprocessor standard macro set between various modules.
Standard macro sets are looked up based on parser and preprocessor keyword
from individual modules.
The "standard" NASM parser macros now reside in the NASM parser, so when
the GAS parser is used with the NASM preprocessor, the NASM-specific macros
are no longer defined.
Object-format specific macros are now individually defined by each object
format module. This allows for the object formats to be independent of the
NASM preprocessor module and yields a small optimization benefit as unused
object format macros don't need to be skipped over.
Also add GAS macro equivalents for the more complex Win64 SEH directives [1].
[1] Requested by Brian Gladman <brg@gladman.plus.com>
Peter Johnson [Sat, 12 Apr 2008 08:30:22 +0000 (08:30 -0000)]
Allow underscores in the middle of binary, octal, and hex constants.
This makes things like 00_11_22_33h okay.
Allow 0X as well as 0x in directives (already allowed for normal case).
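A rough Python sketch of the accepted forms follows. It is not the yasm
lexer; the function name, the set of suffixes handled (h, q/o, b), and the
treatment of decimal constants are assumptions for illustration.

```python
def parse_int_constant(text):
    """Parse a NASM-style integer constant, allowing inner underscores."""
    s = text.lower()  # makes 0X equivalent to 0x
    if s.startswith("0x"):
        digits, base = s[2:], 16
    elif s.endswith("h"):  # checked before "b": hex digits include b
        digits, base = s[:-1], 16
    elif s.endswith(("q", "o")):
        digits, base = s[:-1], 8
    elif s.endswith("b"):
        digits, base = s[:-1], 2
    else:
        digits, base = s, 10  # decimal handling here is an assumption
    # Underscores are only accepted in the middle of the digit string.
    if not digits or digits.startswith("_") or digits.endswith("_"):
        raise ValueError("malformed constant: " + text)
    return int(digits.replace("_", ""), base)
```

Under these assumptions, parse_int_constant("00_11_22_33h") returns
0x00112233, and "0XFF" parses the same as "0xff".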
Peter Johnson [Fri, 11 Apr 2008 01:37:46 +0000 (01:37 -0000)]
Move BITS==64 condition out of the CPU field (where it really doesn't belong)
and into a new misc_flags field. This doesn't take any more space,
simplifies the code, and allows adding additional conditions like this in
the future.
Found and fixed a bug in CPU field generation (not copying a set variable
in gen_x86_insn.py).
Peter Johnson [Fri, 4 Apr 2008 02:10:24 +0000 (02:10 -0000)]
PCOMUcc should have been PCOMccU.
Fix up condition code naming and add aliases to match SSE5 added to GNU
binutils as follows:
COMcc:
- swapped uneq, unlt, and unle with neq, nlt, and nle
- added unge, ungt, ne, uge, ugt, nge, ngt, une, ge, and gt aliases
PCOMcc:
- added ne alias (for neq)
Peter Johnson [Thu, 27 Mar 2008 05:35:36 +0000 (05:35 -0000)]
Fix #136: Unbreak ..@ non-local-label mechanism.
Add testcase for this.
Also fix $-prefixed labels to match non-$-prefixed label behavior
(this has been broken for a very long time).
Peter Johnson [Fri, 15 Feb 2008 06:06:57 +0000 (06:06 -0000)]
Pass size directly to yasm_value_finalize_expr() rather than setting it
afterwards. This fixes the value full-mask case for immediate operands:
e.g. "mov ax, word (label & 0xffff)".
Peter Johnson [Sat, 9 Feb 2008 03:35:07 +0000 (03:35 -0000)]
Enable use of sym@FOO constructs in GAS parser.
To do this, restructure how special symbols are handled between the parser
and object format. Rather than creating special symbols with the right
names, have the parser call the object format to check for a match
among the special symbols, which are no longer stored in the
symbol table.
Peter Johnson [Fri, 8 Feb 2008 18:59:46 +0000 (18:59 -0000)]
Support masking of relocatable values with an AND of the full value width to
avoid warnings. This is primarily useful in bin object format output.
db label ; if label is >255, warns.
db label & 0xff ; okay, no warning.
Masks other than full-sized 1s are still not supported:
db label & 0x7f ; too complex error
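The distinction between the accepted and rejected masks above comes down to
whether the mask is all 1s across the value width. A minimal Python sketch of
that test, with a hypothetical function name and no claim to mirror yasm's
internal check:

```python
def is_full_width_mask(mask, size_bits):
    """Return True if mask is all 1s over exactly size_bits bits."""
    return mask == (1 << size_bits) - 1
```

For an 8-bit value such as a db operand, 0xff passes and 0x7f does not; for a
16-bit value, 0xffff passes.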
Peter Johnson [Fri, 8 Feb 2008 18:26:40 +0000 (18:26 -0000)]
Fix #130: Add SAFESEH directive for indicating SEH handlers in win32 output.
Unlike in MASM, no command-line switch is required.
Usage:
extern handler (or handler: to define locally)
safeseh handler
Peter Johnson [Sat, 2 Feb 2008 19:23:17 +0000 (19:23 -0000)]
Revert r2029. According to both AMD64 and Intel 64 instruction set
references, REX + 90h opcode is not NOP, but a valid XCHG:
"The x86 architecture commonly uses the XCHG EAX, EAX instruction (opcode
90h) as a one-byte NOP. In 64-bit mode, the processor treats opcode 90h as
a true NOP only if it would exchange rAX with itself. Without this special
handling, the instruction would zero-extend the upper 32 bits of RAX, and
thus it would not be a true no-operation. Opcode 90h can still be used to
exchange rAX and r8 if the appropriate REX prefix is used."
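The rule in the quoted passage can be summarized with a small Python sketch
of how opcode 90h is interpreted in 64-bit mode. The function name and the
returned mnemonic strings are illustrative assumptions; this is not code from
yasm.

```python
REX_B = 0x01  # extends the register encoded in the opcode (rAX becomes r8)
REX_W = 0x08  # selects 64-bit operand size


def decode_opcode_90(rex=None):
    """Describe opcode 90h in 64-bit mode, given an optional REX byte."""
    if rex is not None and rex & REX_B:
        # With REX.B the operand is r8, so this is a real exchange.
        return "xchg r8, rax" if rex & REX_W else "xchg r8d, eax"
    # Otherwise rAX would be exchanged with itself: a true NOP.
    return "nop"
```

So a bare 90h, or 90h with a REX prefix that leaves REX.B clear (such as
48h), is still a NOP, while 41h or 49h followed by 90h is an XCHG with r8.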
Peter Johnson [Sat, 19 Jan 2008 08:59:19 +0000 (08:59 -0000)]
Make jmp with seg:off equ behave the same as NASM.
Formerly:
foo equ 1:2
jmp foo
would result in a far jump. Now, an explicit "far" is required:
jmp far foo
to generate a far jump.
In addition, the direct use of seg:off in immediates and effective
addresses will result in an error; the use of EQU'ed seg:off values
is still legal (and will still result in just the offset). This
behavior is more sane and also matches NASM behavior.
Thus:
foo equ 1:2
mov ax, foo ; okay, just 2
mov ax, [foo] ; okay, just 2
mov ax, 1:2 ; illegal
mov ax, [1:2] ; illegal
Peter Johnson [Tue, 4 Dec 2007 06:55:11 +0000 (06:55 -0000)]
Fix #125: Improve reporting of operand and expression syntax errors.
Now instead of the generic "expression syntax error", more informative
error messages such as the following are reported:
- unexpected `:' after instruction
- expected expression after `%'
- expected operand, got `%'
Peter Johnson [Wed, 28 Nov 2007 07:21:08 +0000 (07:21 -0000)]
Fix #119. Quite a few SSE/SSE2 instructions assumed 128-bit memory sizes
instead of the correct 64-bit or 32-bit sizes (e.g. xmm/m64 or similar).
It worked fine when no memory size was specified, but it should also work
with the correct size modifier.