Peter Johnson [Sun, 16 Apr 2006 08:41:41 +0000 (08:41 -0000)]
* symrec.pxi: Revamp to more correctly generate Symbol objects, support
most dictionary functions (but not yet iterator protocol).
* yasm.pyx: Add Py_INCREF, Py_DECREF, malloc, free externals.
* coretype.pxi: Rename c_print to print_, add some missing yasm_expr_op
enum values.
Peter Johnson [Sat, 15 Apr 2006 08:03:42 +0000 (08:03 -0000)]
* python-yasm/Makefile.inc: Since we aren't using buildtools to call Pyrex,
we shouldn't need to remove the build directory. Not removing it speeds up
the build considerably (we don't have to rebuild all of the Python bindings
just because the Makefile changed, for example).
Peter Johnson [Sat, 15 Apr 2006 04:06:44 +0000 (04:06 -0000)]
Change the NASM preprocessor to use yasm_intnum and yasm_expr. This
decreases bloat and more importantly makes it possible to use >32-bit values
in the preprocessor.
This has NOT been heavily tested, so there may easily be bugs. I've not yet
decided whether to merge this to 0.5.0 final for this reason.
Inspired by: Jason Chen <jchen@centtech.com> patches to use "long long"
rather than "long" in the preprocessor to enable >32-bit values.
* coretype.h (yasm_op): Add XNOR, LXOR, LXNOR, LNOR.
* intnum.c (yasm_intnum_calc): Calculate the above.
* expr.c (expr_is_constant, expr_can_destroy_int_left)
(expr_can_destroy_int_right, ...): Actually handle logical operations,
including the new ones.
* intnum.c (yasm_intnum_get_str): New; gets a signed decimal string
representation of an intnum.
* intnum.h (yasm_intnum_get_str): Prototype.
* nasm-preproc.c (nasm_preproc_input): Close a memory leak.
* nasm.h: Clean out a lot of cruft we don't use.
(tokenval): Change t_integer and t_inttwo to yasm_intnums.
(evalfunc): Remove fwref and hints parameters (cleanup), change return value
to yasm_expr.
* nasm-eval.c: Massively rewrite to just call appropriate yasm_expr creation
functions rather than calculating the value here. Overall recursive descent
parsing structure is unchanged.
* nasmlib.c: Remove a lot of now-unused functions.
(nasm_readnum): Use yasm_intnum functions and return that.
(nasm_readstrnum): Likewise.
* nasmlib.h: Update prototypes.
* nasm-pp.c: Change to use intnum/expr as necessary.
* nasm-preproc.c (nasm_preproc_destroy): Don't call nasm_eval_cleanup,
it's been deleted.
* nasmpp-bigint.asm: New test for >32-bit preproc values.
* ifcritical-err.errwarn: The new code doesn't generate a duplicate warning.
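For illustration, the motivation for allowing >32-bit values can be sketched as below. This is not the yasm_intnum implementation; it uses fixed-width C types as a stand-in, and both helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: evaluate (1 << shift) the way an evaluator limited
 * to 32 bits would, by truncating the result. Requires shift < 64. */
static uint32_t shl_32bit(unsigned shift)
{
    return (uint32_t)((uint64_t)1 << shift);
}

/* The same expression carried in a wider type keeps the upper bits.
 * Requires shift < 64. */
static uint64_t shl_wide(unsigned shift)
{
    return (uint64_t)1 << shift;
}
```

With shift = 40 the truncating version yields 0, while the wide version yields 2^40; this is the kind of value a >32-bit preprocessor test needs to exercise.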
Peter Johnson [Wed, 12 Apr 2006 04:06:44 +0000 (04:06 -0000)]
Correctly handle input characters >127 by using unsigned char in the re2c
tokenizers. Signed chars >127 are negative, and thus aren't caught by the
[\000-\377] range.
* gas-parser.h (YYCTYPE): Change to unsigned char.
* gas-bison.y, gas-token.re: Cast as necessary to char.
* nasm-parser.h, nasm-bison.y, nasm-token.re: Likewise.
* lc3bid.re: Likewise.
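A minimal sketch of the signedness problem, assuming a typical target where converting a byte >127 to signed char gives a negative value (the helper names are hypothetical, not taken from the tokenizers):

```c
/* Value a tokenizer works with when the input byte is held in a signed char.
 * The conversion is implementation-defined for bytes >127; on common
 * two's complement targets the result is negative. */
static int as_signed(unsigned char byte)
{
    return (int)(signed char)byte;
}

/* Value a tokenizer works with when the input byte is an unsigned char. */
static int as_unsigned(unsigned char byte)
{
    return (int)byte;
}

/* Range check equivalent to the [\000-\377] character class. */
static int in_range(int v)
{
    return v >= 0 && v <= 0377;
}
```

For the byte 0xE9, as_signed() gives a negative number that fails in_range(), while as_unsigned() gives 233, which passes.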
Peter Johnson [Sun, 9 Apr 2006 06:05:37 +0000 (06:05 -0000)]
* ax_create_stdint_h.m4: Don't bother using head -1 to get the first line of
gcc --version; this is just extra information. POSIX needs -n 1 instead of
-1; instead of dealing with this, just disable this function.
Peter Johnson [Thu, 6 Apr 2006 05:55:41 +0000 (05:55 -0000)]
Take #2 on reversioning: decouple version and build from autoconf version
(which is used for the tar.gz name). Also clean up Mkfiles/config.h.
* configure.ac: Change autoconf version back to HEAD, add new PACKAGE_INTVER
and PACKAGE_BUILD config.h defines.
* cv-symline.c, yasm.c: Use PACKAGE_INTVER and PACKAGE_BUILD instead of
PACKAGE_STRING.
* genversion.c: Likewise.
* Mkfiles: Clean up and add PACKAGE_INTVER and PACKAGE_BUILD.
Peter Johnson [Wed, 5 Apr 2006 07:14:41 +0000 (07:14 -0000)]
Implement better versioning. From now on, trunk's version will be the
current major and minor version but with subminor version = 99. All releases
and snapshots will have a build version of the current svn version. Version
information is put into predefined macros in the NASM preproc:
__YASM_MAJOR__, __YASM_MINOR__, __YASM_SUBMINOR__, __YASM_BUILD__,
__YASM_VERSION_ID__, __YASM_VER__.
__YASM_VER__ does not have the build version as part of the string, and
__YASM_VERSION_ID__ does not incorporate the build version.
If the build version is "HEAD" (or other non-numeric), __YASM_BUILD__ is
set to 0.
* configure.ac: Set version to 0.4.99.HEAD for trunk.
* genversion.c: Generate version.mac for the NASM preproc version macros.
* Makefile.inc: Hook into build.
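The rule for __YASM_BUILD__ could be sketched as follows. This is only an illustration of the "non-numeric build becomes 0" behavior described above; build_number() is a hypothetical name and is not claimed to match genversion.c.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: numeric value for a build string. Any string that
 * is empty or not purely decimal digits (e.g. "HEAD") maps to 0. */
static unsigned long build_number(const char *build)
{
    const char *p;

    if (*build == '\0')
        return 0;
    for (p = build; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return 0;
    }
    return strtoul(build, NULL, 10);
}
```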
Peter Johnson [Wed, 5 Apr 2006 05:39:23 +0000 (05:39 -0000)]
* COPYING: Update verbiage, add list of contributors and copyright notice.
* genstring.c: Generate string array from source file.
* Makefile.am: Build genstring.
* Makefile.inc: Use genstring to create license.c from COPYING.
* yasm.c: Include license.c and use it for new --license option. Shorten
--version option to just displaying version, compile date, and very short
copyright message.
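As a rough sketch of what generating a string array from a text file involves, each input line has to become a quoted, escaped C string literal. The helper below is hypothetical and is not taken from genstring.c.

```c
#include <stddef.h>

/* Hypothetical helper: write `line` into `out` as a quoted C string literal
 * followed by a comma, escaping double quotes and backslashes.
 * Returns the number of characters written (excluding the NUL), or 0 if
 * `out` is too small. */
static size_t quote_line(const char *line, char *out, size_t outsize)
{
    size_t n = 0;
    const char *p;

    if (outsize < 4)            /* room for two quotes, comma, and NUL */
        return 0;
    out[n++] = '"';
    for (p = line; *p != '\0'; p++) {
        if (n + 5 > outsize)    /* escape pair + closing quote, comma, NUL */
            return 0;
        if (*p == '"' || *p == '\\')
            out[n++] = '\\';
        out[n++] = *p;
    }
    out[n++] = '"';
    out[n++] = ',';
    out[n] = '\0';
    return n;
}
```

Emitting one such literal per line of COPYING between an array declaration and a closing brace would give a file that can be included directly from C.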
Peter Johnson [Wed, 5 Apr 2006 02:26:19 +0000 (02:26 -0000)]
Quiet warnings in Pyrex code by adding "-w" if GCC detected.
* configure.ac: Pass GCC value to Makefile.
* Makefile.inc: Pass GCC value to python-setup.txt.
* setup.py: Check GCC value and append -w if necessary.
Peter Johnson [Tue, 4 Apr 2006 07:51:42 +0000 (07:51 -0000)]
Be much smarter at checking for and running Pyrex. Pyrex is a Python module
and the "pyrexc" command is just a wrapper, so instead of looking for
"pyrexc" look for the module instead.
* pyrex.m4: New macro to check Pyrex version.
* m4/Makefile.inc: Add to dist.
* configure.ac: Use macro to check for Pyrex >= 0.9.3.
* python-yasm/Makefile.inc: Run Python directly instead of pyrexc wrapper.
Peter Johnson [Tue, 4 Apr 2006 06:07:32 +0000 (06:07 -0000)]
Hook Python module into the build (even though it's pretty incomplete),
enabled with --enable-python (defaults to auto-detect).
Thanks to some magic in python-yasm/Makefile.inc and setup.py, this actually
plays nicely with automake distcheck.
* m4/pythonhead.m4: Script to find Python.h.
* m4/Makefile.inc: Include it in distfiles.
* configure.ac: Add --enable-python option and Python and Pyrex detection.
* tools/Makefile.inc: Pull tools/python-yasm/Makefile.inc into build.
* python-yasm/Makefile.inc: Add automake build magic for extension.
* setup.py: Likewise.
Peter Johnson [Mon, 3 Apr 2006 04:54:27 +0000 (04:54 -0000)]
* value.pxi, expr.pxi: Copy instead of NULL'ing origin; this will look much
more pythonic on the python side, although it will cause much additional
object churn, and will not be fast.
Peter Johnson [Mon, 3 Apr 2006 04:32:25 +0000 (04:32 -0000)]
python-yasm: Modularize and clean up. Note the modularization is a little
bit broken: you need to remove yasm.c before running python setup.py build
or the pyrex step may not actually run.
Michael Urman [Sun, 2 Apr 2006 08:13:31 +0000 (08:13 -0000)]
Checkin of initial work on a pyrex python binding for yasm. Very little
works so far. Build it with the command {{{python setup.py build}}}, and
optionally symlink to the built yasm.so to enable importing it from a
python started in the same directory.
Peter Johnson [Sat, 1 Apr 2006 09:09:53 +0000 (09:09 -0000)]
* coff-objfmt.c (coff_objfmt_section_switch): Fix up handling of GAS flags
a bit; add support for .rodata and .debug sections.
* dwarf2-line.c (yasm_dwarf2__generate_line): Set alignment for .debug_line
explicitly to 1 rather than leaving it implicit.
Peter Johnson [Wed, 29 Mar 2006 07:24:51 +0000 (07:24 -0000)]
Fix a couple bugs in CV8 that generate bad debug info.
* cv-symline.c (cv_sym_bc_tobytes): Output sym relocation at correct offset
relative to bc start.
(cv8_add_sym_compile): Compile flag "symbol" has list of key/values
terminated with 0/0; we weren't doing this. Like MASM, just output an empty
list.
(cv_generate_sym): Default data type to UBYTE rather than nothing.
Peter Johnson [Tue, 28 Mar 2006 09:05:53 +0000 (09:05 -0000)]
Finally fix the broken include path handling in the imported NASM
preprocessor. The new way of doing things follows standard C compiler path
searching rules in terms of relative and absolute paths in relation to
source location and the current working directory. There are probably some
latent bugs in this as I've not tested it on Windows yet.
This should make CodeView actually usable with included files.
* file.c (yasm__abspath_win, yasm__abspath_unix): Convert a relative path
into an absolute path. Code moved from:
* cv-symline.c (cv_make_pathname): Here (deleted).
(cv_dbgfmt_add_file): Use yasm__abspath() instead.
* file.h (yasm__abspath_win, yasm__abspath_unix): Prototype.
(yasm__abspath): Macro pointing to right version.
* file.c (yasm__combpath_win, yasm__combpath_unix): Combine two possibly
relative paths (also handles absolute paths).
* file.h (yasm__combpath_win, yasm__combpath_unix): Prototype.
(yasm__combpath): Macro pointing to right version.
* file.c (yasm__fopen_include): Where the new include search magic happens,
using yasm__combpath heavily.
* file.h (yasm__fopen_include): Prototype.
* nasm-preproc.c: Update to use yasm__fopen_include().
* nasmlib.c (nasm_src_get_fname): Helper.
* nasmlib.h (nasm_src_get_fname): Helper prototype.
* nasm-preproc.c (nasm_preproc_add_dep): Don't free pointer, we need it.
* splitpath_test.c: Update.
* combpath_test.c: New test for yasm__combpath_*.
* Mkfiles: Update config.h for yasm__abspath and yasm__combpath.
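The idea behind combining two possibly relative paths can be sketched as below. This is a simplified, Unix-only stand-in with a hypothetical name and signature; unlike a full implementation it does not collapse "." or ".." components and is not the code in file.c.

```c
#include <string.h>

/* Hypothetical, simplified path combine: resolve `to` relative to the
 * directory containing `from`. An absolute `to` is returned unchanged.
 * Returns 0 on success, -1 if `out` is too small. */
static int comb_path_sketch(char *out, size_t outsize,
                            const char *from, const char *to)
{
    const char *slash = strrchr(from, '/');
    size_t dirlen = (slash != NULL) ? (size_t)(slash - from) + 1 : 0;

    if (to[0] == '/')
        dirlen = 0;             /* absolute path wins outright */
    if (dirlen + strlen(to) + 1 > outsize)
        return -1;
    memcpy(out, from, dirlen);  /* directory part of `from`, with slash */
    strcpy(out + dirlen, to);
    return 0;
}
```

An include search along these lines would try the combined path relative to the including source file first and then fall back to other locations.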
Peter Johnson [Tue, 28 Mar 2006 02:37:21 +0000 (02:37 -0000)]
* yasm.c (main): Fix longstanding bug of putting the default object file
output into the source directory rather than the current directory (like
C compilers do).
* coretype.h (YASM_PATHSEP): New define for path separator.
* vc8/config.h, vc/config.h, dj/config.h: Override.
* dbgfmts/codeview: New, adds "cv8" dbgfmt.
* dbgfmts/Makefile.inc: Hook into build.
* coff-objfmt.c: Enable cv8 dbgfmt for win32 and win64. Change a few things
to more closely match MASM output.
(coff_section_data): Add isdebug flag so that section flags are set properly.
(coff_objfmt_section_switch): Initialize to 0 here.
(coff_objfmt_init_remaining_section): Initialize to 1 and set various section
flags (DATA, DISCARD, READ) if section name starts with ".debug".
(coff_objfmt_output): If non-NULL dbgfmt, set object flags to say line numbers
are included.
* dwarfwin64_testhd.hex: Minor update due to coff_objfmt_output() change.
Peter Johnson [Mon, 20 Mar 2006 00:43:18 +0000 (00:43 -0000)]
Add a feature and fix a long-standing bug in Win64 output. The new feature
is support for cross-section relative symbol references using "sym-$".
This generates a 32-bit relative relocation similar to those used for
cross-section jumps and calls.
The bugfix is that in Win64 output, RIP-relative relocations do something
special when there is an immediate value (or anything else) between the
value being relocated and the next instruction. E.g.
"shl dword [sym wrt rip], 5" needs to generate a REL32_1 relocation thanks
to the immediate byte following the RIP-relative value.
* symrec.c (sym_type): add SYM_CURPOS to track labels representing the
current assembly position (e.g. $ in NASM, . in GAS).
(yasm_symtab_define_curpos): New function to create symbols of this type.
(yasm_symrec_is_curpos): Check to see if symbol is SYM_CURPOS type.
(yasm_symrec_get_label, yasm_symrec_print): Update to handle SYM_CURPOS.
* symrec.h (yasm_symtab_define_curpos): Prototype.
(yasm_symrec_is_curpos): Prototype.
* gas-bison.y: Use yasm_symtab_define_curpos when defining '.'.
* nasm-bison.y: Use yasm_symtab_define_curpos when defining '$'.
* value.c (yasm_value_finalize_expr): Look for cross-section
"sym-SYM_CURPOS" combinations and generate curpos-relative value if found.
* coretype.h (yasm_value): Add ip_rel member to designate that curpos_rel
is set due to the value being IP-relative (rather than sym-curpos).
* value.h (yasm_value_initialize, yasm_value_init_sym): Initialize ip_rel.
* x86expr.c (yasm_x86__check_ea): Set ip_rel in addition to curpos_rel if
detected WRT rip.
* x86bc.c (x86_bc_insn_tobytes): Use ip_rel instead of curpos_rel when
adjusting for RIP-relative.
* coff-objfmt.c (coff_objfmt_output_value): Properly adjust output and
generate correct relocations for both curpos_rel and ip_rel. This
includes new generation of REL32_1, REL32_2, etc. relocations.
* symrec.c, symrec.h (yasm_symtab_define_label2): Delete.
* coff-objfmt.c (stabs_dbgfmt_generate_sections): Change to use
yasm_symtab_define_label() instead.
* win32-curpos.asm, win64-curpos.asm, curpos.asm, curpos-err.asm,
elf_gas64_curpos.asm: New tests for above.
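The REL32_n selection can be pictured with the sketch below. The enum values are the AMD64 relocation type numbers from the PE/COFF specification; the helper and its name are hypothetical and are not the code in coff_objfmt_output_value().

```c
/* COFF AMD64 relocation type values (PE/COFF specification). */
enum {
    REL_AMD64_REL32   = 0x0004,
    REL_AMD64_REL32_1 = 0x0005,
    REL_AMD64_REL32_2 = 0x0006,
    REL_AMD64_REL32_3 = 0x0007,
    REL_AMD64_REL32_4 = 0x0008,
    REL_AMD64_REL32_5 = 0x0009
};

/* Hypothetical helper: choose the relocation type from the number of bytes
 * between the end of the 32-bit relocated field and the start of the next
 * instruction. Returns -1 if no REL32_n type covers that distance. */
static int rel32_type_for_trailing_bytes(unsigned trailing)
{
    if (trailing > 5)
        return -1;
    return REL_AMD64_REL32 + (int)trailing;
}
```

For "shl dword [sym wrt rip], 5" one immediate byte follows the displacement, so trailing is 1 and the result is REL32_1.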
Peter Johnson [Sun, 19 Mar 2006 18:31:00 +0000 (18:31 -0000)]
* value.c (value_finalize_expr): Check for purely -1*symrec cases; these
are also invalid.
* fwdequ64.asm: This test actually had a "4-label" expression; the correct
way to write this is "4-(label-$$)" (same output generated).
Peter Johnson [Sun, 19 Mar 2006 04:18:10 +0000 (04:18 -0000)]
Massive cleanup of relocation and WRT handling. Closes #49 and lays the
groundwork for further features and possible cleanups.
Note: this commit changes the way in which relocations in the
COFF/Win32/Win64 target can be forced to reference a different symbol than
is being pointed to; instead of the ambiguous "trap+(trap.end-trap)" to get
the reloc to point at trap.end but reference the trap symbol, after this
commit "trap.end wrt trap" is the way to say this. This also reads a lot
more clearly and is not ambiguous. This should really only affect people
who write .pdata sections for Win64. See the
objfmts/win64/tests/win64-dataref.asm testcase for an example of usage.
This cleanup adds a new data structure, yasm_value, which is used for all
expressions that can be potentially relocatable. This data structure
splits the absolute portion of the expression away from the relative
portion and any modifications to the relative portion (SEG, WRT,
PC-relative, etc). A large amount of code in the new value module breaks
a general expression into its absolute and relative parts
(yasm_value_finalize_expr) and provides a common set of code for writing
out non-relocated values (yasm_value_output_basic).
All bytecode handling in both libyasm and the architecture modules was
rewritten to use yasm_values when appropriate (e.g. data values,
immediates, and effective addresses). The yasm_output_expr_func is now
yasm_output_value_func and all users and implementors (mainly in object
formats) have been updated to handle yasm_values.
Simultaneously with this change, yasm_effaddr and yasm_immval full
structure definitions have been moved from bc-int.h to bytecode.h.
The data hiding provided by bc-int.h was relatively minimal and probably
overkill. Also, great simplifications have been made to x86 effective
address expression handling.
Peter Johnson [Sat, 18 Mar 2006 22:36:21 +0000 (22:36 -0000)]
Eliminate some signed/unsigned character mismatches in GAP build.
* phash.c, phash.h (phash_checksum): Change to take signed char.
* perfect.h (key): Change name_k to char.
* perfect.c (initinl): Cast name_k.
Peter Johnson [Tue, 14 Mar 2006 08:52:41 +0000 (08:52 -0000)]
More gracefully handle absolute section reference expansion, and allow for
correct detection of absolute section reference loops (fixing a crash case).
This is also needed for an ongoing rewrite of reloc/value handling.
* expr.c (expr_xform_bc_dist): Remove transformation of absolute section
references; move in changed form into...
(yasm_expr__level_tree): Here. The new code doesn't immediately calculate
the distance from the start of the absolute section to the referenced symbol;
rather it generates an expression for this quantity. As this actually adds
new absolute section refs to the tree, we can't expand with YASM_EXPR_SYMs,
otherwise we would expand multiple times. Thus we need a new
YASM_EXPR_SYMEXP type that does not get expanded. Unfortunately this
ripples changes a bit because everywhere *else* we look for YASM_EXPR_SYM,
we now need to look for YASM_EXPR_SYMEXP as well...
(expr_xform_bc_dist): Here.
(yasm_expr__copy_except): Here.
(yasm_expr_extract_symrec): Here.
(yasm_expr_get_symrec): Here.
(yasm_expr_print): Here.
* bin-objfmt.c (bin_objfmt_expr_xform): And here.
* expr-int.h (yasm_expr__type): Define new YASM_EXPR_SYMEXP.
* section.h (yasm_section_abs_get_sym): To implement above, we need to get
a symbol referencing the first bytecode in the absolute section. To avoid
creating redundant symrecs, one is generated for us now. This function
lets us get it in yasm_expr__level_tree().
* section.c (yasm_section_abs_get_sym): Implement.
(yasm_section): Add necessary SECTION_ABSOLUTE data.
(yasm_section_create_absolute): Create the symrec here.
* absloop-err.asm: New test for absolute section reference loops.
Peter Johnson [Sun, 5 Mar 2006 21:15:59 +0000 (21:15 -0000)]
* hamt.c: Add stopgap fix for GAP in cross-build situations by typedefing
uintptr_t to unsigned long for the build platform (instead of trying to
pull in the host platform's _stdint.h).
Peter Johnson [Sun, 5 Mar 2006 01:07:38 +0000 (01:07 -0000)]
* phash.c (phash_lookup), perfect.c (initnorm): Mask upper bits in hash
calculations so that these functions behave identically on 32-bit and >32-bit
architectures.
* gap.c: Fix warnings.
* perfect.c: Change K&R to prototypes to avoid warnings.
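A minimal sketch of the masking idea, using a generic multiply-by-33 string hash as a stand-in (this is not the hash used by phash.c or perfect.c, and the function names are hypothetical):

```c
/* One hash step held in an unsigned long, which may be 32 or 64 bits wide
 * depending on the platform. */
static unsigned long mix_step(unsigned long h, unsigned char c)
{
    h = (h << 5) + h + c;       /* h * 33 + c */
    return h & 0xffffffffUL;    /* drop bits a 32-bit long would not have */
}

/* Hash a NUL-terminated string; the result always fits in 32 bits. */
static unsigned long hash_sketch(const char *s)
{
    unsigned long h = 5381;

    while (*s != '\0')
        h = mix_step(h, (unsigned char)*s++);
    return h;
}
```

Without the mask, a 64-bit unsigned long would keep carry bits that a 32-bit one silently discards, so tables generated on one platform would not match lookups on the other.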
Peter Johnson [Sat, 4 Mar 2006 22:09:26 +0000 (22:09 -0000)]
Rewrite x86 identifier recognition to use a minimal perfect hash table
instead of re2c-generated code. This gives identifier recognition a
significant speedup and also drastically shortens compilation time of yasm
itself. This rewrite encouraged combining instruction and prefix
recognition into one function and register and target modifier
recognition into a second function (rather than having 5 or so separate
functions).
Also created a state in the NASM parser (as was done in the GAS parser),
so instructions/prefixes are only looked for until an instruction is
recognized. This avoids search time in the instructions hash for operands.
The tool used to generate the new identifier recognition is called GAP.
Someday we might extend this to generate more code than just the perfect
hash lookup.
* tools/gap: New tool to Generate Architecture Parser (aka perfect hashes).
* phash.c, phash.h: Helper functions used by GAP-generated code.
* x86id.re: Delete. Split into...
* x86parse.gap: Contains all identifier recognition portions.
* x86id.c: Contains instruction operand tables and code and higher-level
entry points into x86parse.gap perfect hash recognizers. Chose to flow
history of x86id.re into this file.
* arch.h: Combine instruction/prefix entry points and register/target
modifier entry points.
* lc3barch.c, lc3bid.re, lc3barch.h: Update to match.
* x86arch.c, x86arch.h: Update to match.
* Makefile.am, various Makefile.inc: Update.
* POTFILES.in: Update due to numerous file changes (not just this commit).
* Mkfiles: Update. VC build files untested at the moment.
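To illustrate what a minimal perfect hash lookup looks like, here is a hand-built toy for four mnemonics. The hash function was picked by hand for this key set only; it is an assumption for illustration and bears no relation to what GAP actually generates.

```c
#include <string.h>

/* Four keys stored at the slot their hash maps to; no empty slots and no
 * collisions, so the table is both perfect and minimal for this key set. */
static const char *const keywords[4] = { "sub", "mov", "add", "ret" };

/* Toy hash, valid only for strings of exactly three characters. */
static unsigned slot_for(const char *s)
{
    return (((unsigned char)s[0] ^ (unsigned char)s[2]) >> 1) & 3u;
}

/* Returns the table index of `s`, or -1 if `s` is not one of the keys. */
static int lookup_keyword(const char *s)
{
    unsigned slot;

    if (strlen(s) != 3)
        return -1;
    slot = slot_for(s);
    /* One comparison settles it: a non-key can only land on a wrong slot. */
    return strcmp(keywords[slot], s) == 0 ? (int)slot : -1;
}
```

The point of the structure is that recognition costs one hash computation and one string comparison regardless of how many identifiers are in the table.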
Peter Johnson [Thu, 2 Mar 2006 02:53:11 +0000 (02:53 -0000)]
* x86id.re (retnf_insn, yasm_x86__parse_check_insn): Fix handling of retf
(NASM syntax) in 64-bit mode. While I'm here, make all ret forms in GAS mode
match GAS output.
Peter Johnson [Tue, 28 Feb 2006 07:59:29 +0000 (07:59 -0000)]
* coff-objfmt.c: Fix crash when sections are generated inside of the
assembler that don't call section_switch(), e.g. for dwarf2 internally
generated debug sections.
Peter Johnson [Sat, 25 Feb 2006 19:39:57 +0000 (19:39 -0000)]
Fix #70 by allowing overrides on the default (usually ".text") section.
The fix for this rippled into a lot of places, and I'm starting to see some
opportunities for cleaning up some of the object and objfmt structures.
* objfmt.h (yasm_objfmt_add_default_section): Move from standalone function
into objfmt-specific function.
(yasm_objfmt_module): Remove default_section_name string, and add objfmt
specific add_default_section function.
* yasm.c (main): Use slightly updated parameters when calling.
* xdf-objfmt.c, bin-objfmt.c, dbg-objfmt.c, coff-objfmt.c: Implement.
Usually this required refactoring the objfmt-specific section data creation
into a separate function that could be used by both section_switch() and
the new add_default_section() functions, and changing section_switch() to
update changes to the section data if section was new or previously just a
default section, instead of the previous behavior of warning if the section
was not new.
* objfmt.c: Delete (no longer needed).
* Makefile.inc, Makefile.flat, libyasm.vcproj,
Makefile.dj: Update to reflect removal.
Peter Johnson [Fri, 24 Feb 2006 04:29:39 +0000 (04:29 -0000)]
Fix #69 by making the NASM preproc and parser use the yasm built-in
alignment bytecode rather than just times'ing a NOP. This generates better
NOP code.
The new align only triggers when the NASM align directive is used unadorned
or with nop as the parameter (e.g. "align 16" or "align 16, nop"). Other
uses, including all uses of balign, maintain their old NASM behavior. This
is somewhat useful if you still want a string of NOPs rather than more
optimized instruction patterns: just use "balign X, nop" rather than
"align X". The new align also follows the GAS behavior of increasing the
section's alignment to be the specified alignment (if not already larger).
While I was in here, I found and fixed a bug in 16-bit alignment generation
(typo). I also changed the x86 32-bit code alignment fill pattern per
suggestions in the AMD x86 code optimization manual.
* nasm-bison.y: Implement a new [align] directive that can take a single
parameter (the alignment) and generate a nop-generating align bytecode.
* standard.mac: Change align macro to generate [align] if the second
macro parameter is nonexistent or "nop".
* x86arch.c (x86_get_fill): Update 32-bit fill pattern and fix bug in 16-bit
fill pattern.
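The arithmetic behind an alignment bytecode can be sketched as below. Both helpers are hypothetical names for illustration; the first computes how many fill bytes are needed, the second mirrors the GAS-like rule that the section's alignment is raised but never lowered.

```c
/* Number of fill bytes needed to advance `offset` to the next multiple of
 * `align`. `align` must be nonzero. */
static unsigned long align_padding(unsigned long offset, unsigned long align)
{
    return (align - offset % align) % align;
}

/* Section alignment after an align directive: raised to the requested
 * value only if it is not already larger. */
static unsigned long raise_section_align(unsigned long current,
                                         unsigned long requested)
{
    return requested > current ? requested : current;
}
```

For "align 16" at offset 5 this gives 11 fill bytes; how those 11 bytes are split into NOP instructions is up to the architecture's fill pattern.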
Peter Johnson [Sun, 12 Feb 2006 23:12:10 +0000 (23:12 -0000)]
* intnum.c (yasm_intnum_create_leb128): Create an intnum from a LEB128
encoded value.
* intnum.h (yasm_intnum_create_leb128): Prototype.
* leb128_test.c: Test above.
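For reference, decoding an unsigned LEB128 value works roughly as in the sketch below. It decodes into a fixed 64-bit integer rather than an intnum, handles only the unsigned form, and uses a hypothetical name and signature rather than those of yasm_intnum_create_leb128().

```c
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned LEB128 value from `buf` (at most `len` bytes).
 * Each byte contributes its low 7 bits, least significant group first;
 * a clear high bit marks the final byte.
 * Returns the number of bytes consumed, or 0 if the input is truncated or
 * too long for 64 bits. */
static size_t uleb128_decode(const unsigned char *buf, size_t len,
                             uint64_t *out)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t i;

    for (i = 0; i < len && shift < 64; i++) {
        result |= (uint64_t)(buf[i] & 0x7f) << shift;
        if ((buf[i] & 0x80) == 0) {
            *out = result;
            return i + 1;
        }
        shift += 7;
    }
    return 0;
}
```

The byte sequence E5 8E 26 decodes to 624485 (101 + 14*128 + 38*16384), consuming three bytes.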
Peter Johnson [Sat, 11 Feb 2006 21:58:35 +0000 (21:58 -0000)]
Generate DWARF2 information from asm source, closing #43. Asm-level source
debug information is generated if no file/loc directives are used. Also will
generate basic DWARF2 info/abbrev/aranges if not specified in source. What's
not handled is multiple code sections; I need to figure out how DWARF2 expects
these to be generated.
The implementation of this refactors all the line generation into
dwarf2-line.c and makes dwarf2-dbgfmt.c contain only the core DWARF2
functions.
One challenge yet to be taken care of is how to test the automatic generation,
as the current working directory is saved into the output object file.
* dwarf64_2loc: Update so built-in info generation doesn't happen.
Peter Johnson [Sat, 11 Feb 2006 21:52:20 +0000 (21:52 -0000)]
* elf-objfmt.c (elf_objfmt_output): Create all missing section headers using
initial traversal of object sections. This ensures any symbols generated by
elf_objfmt_create_dbg_secthead() are properly accounted for.
(elf_objfmt_create_dbg_secthead): Update to be properly called from traversal.