Peter Johnson [Thu, 3 Nov 2005 05:29:42 +0000 (05:29 -0000)]
Handle instruction and prefix identifiers properly when used in other
places in GAS input. Do this by adding a tokenizer state that turns off
insn and prefix generation when inside an instruction or directive, and
adding a special case for labels.
* gas-parser.h (state): Add INSTDIR.
* gas-token.re: Switch state back to INITIAL on ';' or newline; set state
when entering instruction or directive, add special case for labels.
* gas-bison.y: Remove non-working attempt at translating INSN and PREFIX
into string token; add special case for LABEL identifiers (generated from
special case for labels in tokenizer).
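The state switch described in this entry can be pictured as a small state machine. The following is only a sketch: INITIAL and INSTDIR are the state names from the entry, while tok_kind, is_insn_name, classify_word, and end_statement are hypothetical names, and the real lexer is generated by re2c and also handles directives and labels.

```c
#include <assert.h>
#include <string.h>

/* Tokenizer states; INITIAL and INSTDIR are named in the entry above. */
typedef enum { INITIAL, INSTDIR } lex_state;

/* Token kinds for this sketch only (hypothetical). */
typedef enum { TOK_INSN, TOK_ID } tok_kind;

/* Stand-in for the arch lookup; treats only "mov" as an instruction name. */
static int is_insn_name(const char *word)
{
    return strcmp(word, "mov") == 0;
}

/* Classify one identifier and advance the tokenizer state. */
static tok_kind classify_word(lex_state *state, const char *word)
{
    assert(state != NULL && word != NULL);
    if (*state == INITIAL && is_insn_name(word)) {
        *state = INSTDIR;   /* now inside an instruction */
        return TOK_INSN;
    }
    return TOK_ID;          /* in INSTDIR, insn names are plain identifiers */
}

/* ';' or newline ends the statement and re-enables insn recognition. */
static void end_statement(lex_state *state)
{
    assert(state != NULL);
    *state = INITIAL;
}
```

With this shape, "mov" is an instruction at the start of a statement but an ordinary identifier when it appears as an operand later on the same line.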
Peter Johnson [Thu, 3 Nov 2005 04:38:21 +0000 (04:38 -0000)]
* symrec.c (symrec_define): Don't error if a symbol is declared common and
then defined, and warn instead of error if a symbol is declared global and
then defined.
Peter Johnson [Thu, 3 Nov 2005 03:49:10 +0000 (03:49 -0000)]
Add warning class (YASM_WARN_UNINIT_CONTENTS) to turn off the
"uninitialized data in code/data section: zeroing" warning. This can now
be turned off using -Wno-uninit-contents on the command line.
* errwarn.h (yasm_warn_class): Add warning class.
* errwarn.c (yasm_errwarn_initialize): Default it to enabled.
* yasm.c (opt_warning_handler): Add as option.
* xdf-objfmt.c, elf-objfmt.c, bin-objfmt.c, coff-objfmt.c: Change warning
class for this warning.
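A per-class warning switch of this kind can be sketched as a bitmask plus an option handler. This is a minimal illustration, not the yasm implementation: the enum, warn_initialize, warn_option, and warn_is_enabled are hypothetical names; only the option text "uninit-contents" and the "no-" prefix come from the entry.

```c
#include <assert.h>
#include <string.h>

/* Warning classes for this sketch (hypothetical names). */
typedef enum {
    WARN_GENERAL = 0,
    WARN_UNINIT_CONTENTS,
    WARN_CLASS_COUNT
} warn_class;

static unsigned long warn_enabled = 0;

/* Every class starts out enabled. */
static void warn_initialize(void)
{
    warn_enabled = (1UL << WARN_CLASS_COUNT) - 1;
}

/* Handle the text following "-W". Returns 0 if handled, 1 if unrecognized. */
static int warn_option(const char *arg)
{
    int enable = 1;
    assert(arg != NULL);
    if (strncmp(arg, "no-", 3) == 0) {
        enable = 0;         /* "-Wno-xxx" turns class xxx off */
        arg += 3;
    }
    if (strcmp(arg, "uninit-contents") != 0)
        return 1;
    if (enable)
        warn_enabled |= 1UL << WARN_UNINIT_CONTENTS;
    else
        warn_enabled &= ~(1UL << WARN_UNINIT_CONTENTS);
    return 0;
}

/* Nonzero if warnings of class c should be reported. */
static int warn_is_enabled(warn_class c)
{
    return (warn_enabled & (1UL << c)) != 0;
}
```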
Peter Johnson [Wed, 2 Nov 2005 08:24:19 +0000 (08:24 -0000)]
Add support for single-level GAS .rept directive. Nested .rept's are not
allowed at the moment. The implementation works mostly like a preproc; it
copies source lines and replays them to the lexer. A new .line directive
was added to fix up line numbers for errors and warnings.
* gas-parser.h (yasm_parser_gas): Add rept structure storage.
(gas_rept): New data structure for .rept state.
(gas_rept_line): Data structure to store source lines within .rept block.
* gas-parser.c: Initialize rept to NULL and check for unclosed rept.
* gas-bison.y: Add support for .line, .rept, and .endr directives. The
DIR_REPT handler just creates the rept structure, and the DIR_ENDR handler
just errors (.endr without .rept). All the real work is done in the lexer.
* gas-token.re (rept_input): Replays captured .rept block source lines back
to fill().
(fill): Call rept_input() instead of yasm_preproc_input() if expanding a
rept block.
(gas_parser_lex): Capture source lines and store into rept data structures.
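The capture-and-replay approach for a single-level .rept can be sketched as a cursor over the stored lines. The structure layout and the name rept_next_line are assumptions made for illustration; the actual gas_rept and gas_rept_line structures are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Captured block plus replay cursor (hypothetical layout). */
typedef struct {
    const char **lines;   /* captured source lines, in order */
    size_t numlines;      /* number of captured lines */
    size_t numrept;       /* total repetitions requested */
    size_t numdone;       /* repetitions fully replayed so far */
    size_t cur;           /* next line within the current repetition */
} rept_state;

/* Return the next line to feed the lexer, or NULL when replay is complete. */
static const char *rept_next_line(rept_state *r)
{
    assert(r != NULL);
    if (r->numlines == 0)
        return NULL;
    if (r->cur == r->numlines) {    /* finished one pass over the block */
        r->cur = 0;
        r->numdone++;
    }
    if (r->numdone >= r->numrept)
        return NULL;
    return r->lines[r->cur++];
}
```

A fill routine would call something like rept_next_line() in place of the preprocessor input function for as long as it returns a line, then fall back to normal input.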
Peter Johnson [Tue, 1 Nov 2005 08:26:19 +0000 (08:26 -0000)]
Fix the use of ELF type/size directives with local variables.
* elf.h (elf_symtab_entry): Add in_table flag.
(elf_sym_in_table): New.
* elf.c (elf_symtab_entry_create): Initialize in_table to 0.
(elf_symtab_append_entry, elf_symtab_insert_local_sym): Set flag to 1.
* elf.c (elf_symtab_insert_local_sym): Don't create the entry here, instead
take it as a parameter.
* elf-objfmt.c (elf_objfmt_symtab_append): Only add if not in table by
checking new in_table flag.
(elf_objfmt_append_local_sym): Likewise, and pull some of the logic from
the old elf_symtab_insert_local_sym function to do it.
(elf_objfmt_directive): Don't append to ELF symbol table here, as we don't
know yet if the variable is global or local.
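The in_table idea can be illustrated with a small linked list in which an entry records whether it has already been linked. Apart from the in_table field name, everything here (sym_entry, sym_table, sym_table_append) is hypothetical and much simpler than the real ELF symbol table code.

```c
#include <assert.h>
#include <stddef.h>

/* One symbol table entry (hypothetical layout). */
typedef struct sym_entry {
    const char *name;
    int in_table;              /* 1 once linked into the table */
    struct sym_entry *next;
} sym_entry;

/* Singly linked table with a tail pointer for cheap appends. */
typedef struct {
    sym_entry *head, *tail;
    size_t count;
} sym_table;

/* Append e unless it is already linked; a second call is a no-op. */
static void sym_table_append(sym_table *t, sym_entry *e)
{
    assert(t != NULL && e != NULL);
    if (e->in_table)
        return;                /* appending twice would duplicate the entry */
    e->next = NULL;
    if (t->tail != NULL)
        t->tail->next = e;
    else
        t->head = e;
    t->tail = e;
    e->in_table = 1;
    t->count++;
}
```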
Peter Johnson [Tue, 1 Nov 2005 04:48:15 +0000 (04:48 -0000)]
* elf.c (elf_symtab_create): Default type to STT_NOTYPE.
* elf-objfmt.c (elf_objfmt_append_local_sym, elf_objfmt_extern_declare)
(elf_objfmt_global_declare, elf_objfmt_common_declare)
(elf_objfmt_directive): Only override if actually required.
Peter Johnson [Tue, 1 Nov 2005 04:23:54 +0000 (04:23 -0000)]
* elf-objfmt.c (elf_objfmt_symtab_append): Don't append symbol a second
time if it already has associated data. This keeps global followed by
extern from generating duplicate symbol table entries.
Peter Johnson [Tue, 1 Nov 2005 03:57:42 +0000 (03:57 -0000)]
Fix implementation of r1298 and fixup testcases.

* gas-bison.y (gas_get_section, gas_switch_section): Add parameter builtin
to indicate gas flags should not be generated.
Peter Johnson [Tue, 1 Nov 2005 03:37:44 +0000 (03:37 -0000)]
Fix linker errors with GAS parser directives .data/.text/etc.
* gas-bison.y (gas_get_section): Don't create empty gas flags unless type
is also specified. This fixes .data/.text/etc section flags.
* elf-objfmt.c (elf_objfmt_section_switch): Add default flags for .comment
section; this is needed so the above change doesn't break .ident.
Peter Johnson [Wed, 26 Oct 2005 03:22:44 +0000 (03:22 -0000)]
Enhance builtin bytecode_data to support embedded NULs in character strings.
While NASM doesn't allow this, GAS does.
While we're here, greatly clean up GAS data bytecode creation by no longer
building intermediate valparam list.
* bytecode.h (yasm_dv_create_string): Add length parameter.
(yasm_bc_create_data): Add append_zero parameter for new ability to append
a single zero byte after each data value. This is used by the GAS .asciz
directive.
* bytecode.c (bytecode_data, ...): Implement the above.
* gas-bison.y (gas_define_strings, gas_define_data)
(gas_define_leb128): Remove; replace in usage with direct calls to bytecode
functions. Add str, dataval, and datavalhead to parser union. Add new
dirvals, which has valparams type, and change strvals and datavals to
datavals type.
* gas-token.re: Use new str type where STRING token is generated.
* nasm-bison.y: Add str type to union, and use for STRING token.
* nasm-token.re: Use new str type where STRING token is generated.
* coff-objfmt.c (win32_objfmt_directive): Adjust for updates to
bytecode_data.
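Supporting embedded NULs means carrying an explicit length instead of relying on strlen(). The sketch below shows that idea together with an append_zero switch in the spirit of .asciz; counted_str and emit_strings are hypothetical names and do not reflect the real yasm_dv_create_string or yasm_bc_create_data signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* String with explicit length, so NUL bytes may appear inside it. */
typedef struct {
    const char *contents;
    size_t len;
} counted_str;

/* Copy each string into out, optionally appending one zero byte after
 * each value. Returns the number of bytes written, or 0 if the output
 * buffer is too small. */
static size_t emit_strings(unsigned char *out, size_t outsize,
                           const counted_str *strs, size_t nstrs,
                           int append_zero)
{
    size_t pos = 0, i;
    assert(out != NULL && strs != NULL);
    for (i = 0; i < nstrs; i++) {
        size_t need = strs[i].len + (append_zero ? 1 : 0);
        if (need > outsize - pos)
            return 0;
        memcpy(out + pos, strs[i].contents, strs[i].len);
        pos += strs[i].len;
        if (append_zero)
            out[pos++] = 0;
    }
    return pos;
}
```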
Peter Johnson [Mon, 24 Oct 2005 04:48:37 +0000 (04:48 -0000)]
Support standalone, segment, and REX prefixes in GAS mode.
* bytecode.c (yasm_bc_create_empty_insn): New function to create empty
instruction that can have prefixes applied to it, for standalone prefixes.
* bytecode.h (yasm_bc_create_empty_insn): Prototype.
* x86arch.h (x86_parse_insn_prefix): Add prefix types for segment registers
(X86_SEGREG) and REX bytes (X86_REX).
(yasm_x86__bc_apply_prefixes): Adjust prototype to include REX pointer (as
this isn't in the x86_common structure).
* x86bc.c (yasm_x86__bc_apply_prefixes): Support the new prefix types.
* x86id.re (x86_finalize_*): Use const x86_insn_info; all insn_infos are
const so these pointers should be as well.
(yasm_x86__finalize_insn): Handle empty instruction case by pointing to new
empty_insn info.
(empty_insn): New.
(yasm_x86__parse_check_prefix): Support GAS prefix naming, and REX and jump
hint prefixes (only in GAS mode at the moment).
* gas-bison.y: Add rules to handle segreg prefixes as well as standalone
prefixes (both segreg and others).
* gas-prefix.asm: New testcase that also hits the warning cases in
yasm_x86__bc_apply_prefixes X86_REX case.
Peter Johnson [Wed, 19 Oct 2005 07:44:59 +0000 (07:44 -0000)]
* gas-bison.y: Add support for .value alias for .2byte (GAS-x86/amd64).
This is generated by GCC in debug sections.
* gas-token.re: Likewise.
* gas-bison.y: Add support for 4th parameter on .section directive, for use
with M (SHF_MERGE) ELF section flag.
* elf-objfmt.c: Add support for M, S (SHF_STRINGS), G (SHF_GROUP), and T
(SHF_TLS) section flags.
* elf.h: Declare additional SHF_* flags.
With these changes, debug information generated by GCC in GAS format is
passed through successfully. Should just need line number generation to
have full debugging for ELF-DWARF2 coming from GCC.
Only remaining thing to handle that I see at the moment for full GCC output
support is multiple instructions on one line (separated by semicolons).
Peter Johnson [Wed, 19 Oct 2005 07:18:20 +0000 (07:18 -0000)]
* elf-x86-amd64.c (elf_x86_amd64_write_reloc): Fix a crash with ELF: when an
invalid relocation is generated, this still gets called but with a NULL
addend.
* expr.c (expr_xform_bc_dist): Check return value of yasm_symrec_get_label()
to avoid crash.
* intnum.c (yasm_intnum_get_leb128, yasm_intnum_size_leb128): New.
* intnum.h (yasm_intnum_get_leb128, yasm_intnum_size_leb128): Prototype.
* leb128_test.c: New test for intnum-level LEB128 functions.
* bytecode.c (bytecode_leb128): New bytecode and supporting functions.
(yasm_bc_create_leb128): New creation function.
* bytecode.h (yasm_bc_create_leb128): Prototype.
* gas-token.re: Recognize .uleb128 and .sleb128.
* gas-bison.y: Ditto.
(gas_define_leb128): New.
* leb128.asm: New test for GAS .uleb128 and .sleb128 directives.
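For reference, LEB128 stores 7 value bits per byte and uses the high bit as a continuation marker. The sketch below encodes plain C integers; the function names are hypothetical, and the real yasm_intnum_get_leb128 operates on yasm_intnum values rather than on long.

```c
#include <assert.h>
#include <stddef.h>

/* Encode an unsigned value; returns the number of bytes written. */
static size_t uleb128_encode(unsigned long val, unsigned char *out)
{
    size_t n = 0;
    assert(out != NULL);
    do {
        unsigned char byte = (unsigned char)(val & 0x7F);
        val >>= 7;
        if (val != 0)
            byte |= 0x80;           /* more bytes follow */
        out[n++] = byte;
    } while (val != 0);
    return n;
}

/* Encode a signed value; returns the number of bytes written. */
static size_t sleb128_encode(long val, unsigned char *out)
{
    size_t n = 0;
    int more = 1;
    assert(out != NULL);
    while (more) {
        unsigned char byte = (unsigned char)((unsigned long)val & 0x7F);
        /* Portable arithmetic shift right by 7 (floor division by 128). */
        val = (val < 0) ? ~(~val >> 7) : (val >> 7);
        /* Stop once the remaining value is pure sign extension of bit 6. */
        if ((val == 0 && !(byte & 0x40)) || (val == -1 && (byte & 0x40)))
            more = 0;
        else
            byte |= 0x80;
        out[n++] = byte;
    }
    return n;
}
```

As a check, 624485 encodes as E5 8E 26 in the unsigned form and -123456 as C0 BB 78 in the signed form.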
Peter Johnson [Mon, 10 Oct 2005 03:47:58 +0000 (03:47 -0000)]
Update all re2c input files to use case-insensitive strings. The code
generated for this is identical to the old A=[aA] way of doing this, but
this way is easier to read.
Peter Johnson [Sun, 9 Oct 2005 07:11:45 +0000 (07:11 -0000)]
Continue re2c updates. This one gets rid of the unused label warnings.
Going ahead and removing the cleanup script; a later commit will get rid
of the remaining unused variable warnings that the cleanup script took care
of as well.
Peter Johnson [Sun, 9 Oct 2005 06:08:02 +0000 (06:08 -0000)]
Update re2c to May 12, 2004 version. This adds an output file option, so
also adjust cleanup program to take input/output file name, and update
Makefiles to use it in this fashion.
Peter Johnson [Fri, 7 Oct 2005 05:15:52 +0000 (05:15 -0000)]
* x86arch.h (x86_insn): Combine shift_op, signext_imm8_op, shortmov_op, and
address16_op flags into a single postop enum.
* x86id.re (yasm_x86__finalize_insn): Set new enum rather than flags.
* x86bc.c: Use new combined enum.
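Folding several flags into one enum only works if at most one of them applies to a given instruction, which is assumed here. The flag names below come from the entry; the enum constants and postop_from_flags are hypothetical and only illustrate the transformation.

```c
#include <assert.h>
#include <stddef.h>

/* Before: four independent one-bit flags. */
typedef struct {
    unsigned int shift_op:1;
    unsigned int signext_imm8_op:1;
    unsigned int shortmov_op:1;
    unsigned int address16_op:1;
} insn_flags_old;

/* After: a single enum naming the one post-operation, if any. */
typedef enum {
    POSTOP_NONE = 0,
    POSTOP_SHIFT,
    POSTOP_SIGNEXT_IMM8,
    POSTOP_SHORTMOV,
    POSTOP_ADDRESS16
} insn_postop;

/* Convert the old flag set into the combined enum. */
static insn_postop postop_from_flags(const insn_flags_old *f)
{
    assert(f != NULL);
    /* The conversion is only lossless if at most one flag is set. */
    assert(f->shift_op + f->signext_imm8_op + f->shortmov_op
           + f->address16_op <= 1);
    if (f->shift_op)
        return POSTOP_SHIFT;
    if (f->signext_imm8_op)
        return POSTOP_SIGNEXT_IMM8;
    if (f->shortmov_op)
        return POSTOP_SHORTMOV;
    if (f->address16_op)
        return POSTOP_ADDRESS16;
    return POSTOP_NONE;
}
```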
Peter Johnson [Wed, 5 Oct 2005 06:57:37 +0000 (06:57 -0000)]
* x86id.re: Implement string, protection, SSE2 instructions for GAS mode.
While we're here, add 64-bit register versions of SSE2 instructions movmskps,
pextrw, pinsrw, and pmovmskb that are documented by Intel but not AMD.
Peter Johnson [Mon, 3 Oct 2005 06:49:15 +0000 (06:49 -0000)]
* bytecode.c (bc_align_tobytes): Handle cases where some code fills don't
exist (this happens in LC3b).
* lc3barch.c (lc3b_get_fill): NOP pattern is actually all 0's.
* gas-parser.h (yasm_parser_gas): Add code_section flag to indicate when to
use code fill vs. data fill.
* gas-parser.c: Initialize flag.
* gas-bison.y: Update flag in various places. Generate org bytecode.
Call gas_parser_align to generate align bytecode.
(gas_parser_align): Generate align bytecode.
Peter Johnson [Mon, 3 Oct 2005 00:12:08 +0000 (00:12 -0000)]
* x86id.re (push_insn): Turn on signed 8-bit optimization for GAS mode.
Don't do this yet for NASM mode; this could be done e.g. through use of
the strict modifier.
Peter Johnson [Sat, 1 Oct 2005 05:47:54 +0000 (05:47 -0000)]
Revert [1251]. From further investigation, ML64's generation of REL32 in
these cases seems to be a bug. If you get a linker error about ADDR32, it
means you aren't using RIP-relative instructions. Note this means to access
an array you need to do:
    lea rax, [var wrt rip]  ; generates RIP-relative insn and REL32 reloc
    mov rcx, [rax+rbx]      ; rbx is index
and not:
    mov rcx, [var+rbx]      ; generates ADDR32 reloc
This matters at least when building a DLL (the ADDR32 reloc fails the DLL
link). When building statically, ADDR32 should work okay and thus the latter
form can be used.
Peter Johnson [Fri, 30 Sep 2005 04:03:59 +0000 (04:03 -0000)]
* coff-objfmt.c (coff_objfmt_output_expr): Try to match the new ML64's
output better by generating relocs directly to the symbol being relocated
rather than to the section. Use a new coff_objfmt->win64 flag to
conditionalize this rather than just COFF_MACHINE_AMD64.
(coff_objfmt): New win64 flag.
(coff_objfmt_create, win32_objfmt_create, win64_objfmt_create): Initialize
flag.
(coff_objfmt_output): Turn on outputting all symbols in win64 mode so they
can be referenced by relocs. This isn't quite correct: we should only turn
on the symbols that are actually used by relocs, but having them there
doesn't hurt linking; it only exposes all of the internal symbol names.
With these changes, yasm output matches the new ML64 output except for a
very few cases:
- ML64 generates REL32 relocs when referencing objects in the same .text
section. I cannot see how this is necessary because call instructions
don't generate REL32 relocs! I currently do not plan on fixing this unless
it causes a problem.
- ML64 generates ADDR32 relocs instead of REL32 relocs when loading a
32-bit register with the address of an object. I will probably try to fix
this.
Peter Johnson [Thu, 29 Sep 2005 05:13:26 +0000 (05:13 -0000)]
* x86id.re: Unbreak movq for NASM parser. I accidentally overrode it when
defining the mov forms for GAS. While I'm here, fix movq so it also
supports the 64-bit move registers (per Intel's spec, AMD has it under movd)
and copy the MMX/SSE2 versions of movq into mov so they're visible to the
GAS parser (and only the GAS parser).
Add a whole bunch of testcases to test movd and movq in both 32 bit and 64
bit modes for both GAS and NASM parsers.
Peter Johnson [Wed, 28 Sep 2005 05:50:51 +0000 (05:50 -0000)]
- Add win64 as an alias for -f win32 -m amd64.
- Add elf32 as an alias for -f elf.
- Add elf64 as an alias for -f elf -m amd64.
Note the old command lines still work.
Add a testcase for win64 (includes masm -> yasm mapping, look at
win64-dataref.masm and win64-dataref.asm files respectively).
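The three aliases amount to a small lookup table from alias name to an object format and an optional machine. The table contents follow the list in this entry; the struct and find_alias are hypothetical names for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One alias and what it expands to. */
typedef struct {
    const char *alias;
    const char *objfmt;    /* value that would be given to -f */
    const char *machine;   /* value that would be given to -m, or NULL */
} objfmt_alias;

static const objfmt_alias aliases[] = {
    { "win64", "win32", "amd64" },
    { "elf32", "elf",   NULL    },
    { "elf64", "elf",   "amd64" }
};

/* Look up an alias by name; returns NULL if name is not an alias. */
static const objfmt_alias *find_alias(const char *name)
{
    size_t i;
    assert(name != NULL);
    for (i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++) {
        if (strcmp(aliases[i].alias, name) == 0)
            return &aliases[i];
    }
    return NULL;
}
```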
Peter Johnson [Wed, 28 Sep 2005 03:23:24 +0000 (03:23 -0000)]
* coff-objfmt.c (coff_objfmt_output_expr): Change relocations output for
instructions in Win64 to always be REL32 regardless of whether they're
RIP relative or not. I don't understand this behavior, but it matches how
ML64 generates relocs and unbreaks linking. This case must be handled at a
higher level somehow (either at the linker or the compiler).
Note that this REL32 generation behavior of ML64 happens only with the
latest version (VC8); the 2003 SP1 SDK ML64 doesn't do this.
* bytecode.c (yasm_bc_is_data): New supporting function.
* bytecode.h (yasm_bc_is_data): Prototype.
Testcase pending.
Reported by and much debugging support contributed by:
Brian Gladman <brg@gladman.plus.com>. Thanks!
Peter Johnson [Tue, 27 Sep 2005 07:07:07 +0000 (07:07 -0000)]
Split arch module parse_check_id into parse_check_reg, parse_check_reggroup,
parse_check_segreg, parse_check_insn, parse_check_prefix, and
parse_check_targetmod. This will allow for future improvements to
identifier handling in the various parsers.
Peter Johnson [Tue, 27 Sep 2005 03:46:34 +0000 (03:46 -0000)]
* gas-bison.y: Allow .data, .text, and .bss to be used in expressions (they
come through as unique directive tokens, not as DIR_ID).
* dataref-imm.*: Test for this.
Peter Johnson [Mon, 26 Sep 2005 07:52:25 +0000 (07:52 -0000)]
* x86id.re (DEF_INSN_DATA): OR in data[3] rather than setting it directly.
(yasm_x86__finalize_insn): Ignore special suffix value 0x80 when matching
in info, but use strict matching. This unbreaks jmp/call broken in previous
commit.
(yasm_x86__parse_check_insn): Initialize data[3] and for FLDT and FSTPT, set
special suffix value 0x80.
Peter Johnson [Mon, 26 Sep 2005 07:06:27 +0000 (07:06 -0000)]
* x86id.re: Implement GAS handling for floating point, some extensions,
interrupts, conditional movs/sets, and a few other minor things. What's
left: string insns, loop insns, other jmp/call forms, protection control,
SSE/SSE2, and odds and ends (like prefixes-as-instructions).
Peter Johnson [Mon, 26 Sep 2005 04:17:09 +0000 (04:17 -0000)]
* x86expr.c (yasm_x86__expr_checkea): Add address16_op parameter to avoid
errors when using enter in 64-bit mode.
* x86arch.h (yasm_x86__expr_checkea): Update prototype.
* x86bc.c (x86_bc_insn_resolve, x86_bc_insn_tobytes): Pass flag to above.
* x86id.re: Implement mul, imul, div, idiv, enter, leave for GAS mode.
Add tests for above.
Peter Johnson [Sun, 25 Sep 2005 04:25:26 +0000 (04:25 -0000)]
Preliminary GAS parser. Only a few instructions are supported at present.
This work is being done under contract with a company that has requested
to remain unnamed at the present time.
* bc-int.h (yasm_effaddr): Add strong flag to indicate if the effective
address is definitely an effective address; GAS does not use [] to designate
effective addresses so it's otherwise impossible to tell the difference
between "expr(,1)" and just "expr" (important for the relative jump
instructions).
* bytecode.h (yasm_ea_set_strong): New function to set the strong flag.
* bytecode.c (yasm_ea_set_strong): Implementation.
* x86bc.c (yasm_x86__ea_create_reg): Initialize strong flag.
* arch.h (yasm_insn_operand): Add deref flag to indicate use of "*foo" in
GAS syntax.
* arch.c (yasm_operand_create_reg, yasm_operand_create_segreg)
(yasm_operand_create_mem, yasm_operand_create_imm): Set deref flag to 0.
* gas: GAS syntax lexer and parser. Not all directives are implemented yet
(some will require additional core bytecodes).
* elf-objfmt.c (elf_objfmt_section_switch): Add support for GAS-style
section flags.
* x86arch.h (yasm_arch_x86): Add parser setting.
* x86arch.c (x86_create): Check for gas parser and initialize setting.
* x86bc.c (yasm_x86__ea_create_expr): Transform val+RIP to val wrt RIP when
using the GAS parser (this is how GAS interprets "expr(%rip)").
* x86id.re: Too many changes to enumerate in detail. Add new modifiers for
GAS suffixes. Start using them in a couple instructions. Split check_id
into subfunctions (still one entry point at present).
(yasm_x86__finalize_insn): Support new modifiers, reverse operands, derefs.
* yasm.c (main): Change all undef to extern when using GAS parser (this is
default GAS behavior).
Peter Johnson [Sun, 25 Sep 2005 03:20:54 +0000 (03:20 -0000)]
* arch.h (yasm_arch_check_id_retval): Add YASM_ARCH_CHECK_ID_REGGROUP to
represent a register group (e.g. indexed registers).
(yasm_arch_reggroup_get_reg): New function to get a specific register from
a register group and index.
(yasm_arch_module): Add module version of yasm_arch_reggroup_get_reg().
* lc3barch.c (lc3b_reggroup_get_reg): Implement.
(yasm_lc3b_LTX_arch): Point to implementation.
* x86arch.c (x86_reggroup_get_reg, yasm_x86_LTX_arch): Likewise.
Peter Johnson [Sun, 25 Sep 2005 01:01:02 +0000 (01:01 -0000)]
* symrec.h (yasm_symtab_parser_finalize): Add function to declare all
undefined symbols extern if unused rather than causing undef errors.
* symrec.c (yasm_symtab_parser_finalize): Implement.
(symtab_finalize_info): New (more data to pass to the finalize helper).
(symtab_parser_finalize_checksym): Update finalize helper.
* yasm.c (main): Update call to yasm_symtab_parser_finalize().
Peter Johnson [Sun, 25 Sep 2005 00:41:04 +0000 (00:41 -0000)]
* arch.h (yasm_arch_syntax_flavor): Remove.
(yasm_arch_create): Add parser and error parameters; now the arch is given
the keyword of the parser in use. The error parameter allows the caller to
find out whether it was the machine name or the parser name that was in
error.
(yasm_arch_module): Change create definition to match yasm_arch_create().
(yasm_arch_create_error): New error typedef for yasm_arch_create() errors.
* lc3barch.c (lc3b_create): Update to match new yasm_arch_create().
* x86arch.c (x86_create): Likewise.
* yasm.c (main): Use new yasm_arch_create() and handle the two kinds of
errors it can now generate. Move parser creation up in the sequence so it
happens before the arch is created.
Peter Johnson [Sun, 25 Sep 2005 00:09:30 +0000 (00:09 -0000)]
* yasm.c (main): Add workaround for when -m amd64 is specified to override
the object format default BITS setting. With this, [bits 64] no longer needs
to be specified explicitly in the source file.
Peter Johnson [Sat, 24 Sep 2005 23:50:09 +0000 (23:50 -0000)]
* x86bc.c (x86_bc_insn_resolve): Actually support the flag to allow
shortening to signed 8-bit immediate from a larger immediate size. This
yields much smaller code for many arithmetic instructions.
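The size saving comes from a simple range test: an immediate that fits in a sign-extended byte can use the 1-byte form. The helpers below are a hypothetical sketch of that decision and do not mirror the actual logic in x86_bc_insn_resolve.

```c
#include <assert.h>

/* Return 1 if val is representable as a sign-extended 8-bit immediate. */
static int fits_signed_imm8(long val)
{
    return val >= -128 && val <= 127;
}

/* Size in bytes of the immediate field: 1 if shortening is allowed and
 * the value fits, otherwise the full operand size (2 or 4 here). */
static unsigned int imm_size(long val, unsigned int full_size,
                             int allow_signext_imm8)
{
    assert(full_size == 2 || full_size == 4);
    if (allow_signext_imm8 && fits_signed_imm8(val))
        return 1;
    return full_size;
}
```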
Peter Johnson [Thu, 8 Sep 2005 05:01:32 +0000 (05:01 -0000)]
* elf-objfmt.c (elf_objfmt_output_section): Don't try to skip empty
sections. This breaks section numbering between the file section headers
and the section numbering used by symbols to reference sections.
While we're here, don't even try to number sections during parse... this
numbering is getting overwritten anyway.
Peter Johnson [Wed, 7 Sep 2005 03:53:38 +0000 (03:53 -0000)]
* expr.c (expr_level_op): Fix corruption with certain types of complex
expressions by adjusting level_numterms if expr_simplify_identity changes
fold_numterms.
Peter Johnson [Mon, 5 Sep 2005 20:53:07 +0000 (20:53 -0000)]
* hamt.c: Use uintptr_t to correctly cast to integer from pointer. On some
platforms (notably Win64), unsigned long is not big enough to hold a
pointer.
* Makefile.am, configure.ac: Use ax_create_stdint_h to get us uintptr_t.
* ax_create_stdint_h.m4: Implementation of ax_create_stdint_h autoconf
macro from http://ac-archive.sourceforge.net/guidod/ax_create_stdint_h.html
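The portability point is that uintptr_t, where available, is wide enough to hold a void pointer, while unsigned long is 32 bits on Win64. A minimal sketch of the cast, with pointer_key as a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a pointer to an integer key, e.g. for hashing. */
static uintptr_t pointer_key(void *p)
{
    uintptr_t key = (uintptr_t)p;
    /* C99 guarantees a void pointer survives this round trip. */
    assert((void *)key == p);
    return key;
}
```

Casting through unsigned long instead would silently drop the upper 32 bits of the pointer on platforms where long is narrower than a pointer.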
Peter Johnson [Sat, 27 Aug 2005 02:05:36 +0000 (02:05 -0000)]
* bytecode.c (bc_incbin_tobytes): Fix fread call so that return value check
works (was broken for >1 byte files).
(yasm_bc_tobytes): Fix handling of bytecodes that are larger than provided
buffer.
Patch by: Stephen Polkowski <stephen@centtech.com>
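The log does not show the exact fread() change, so the following is only a guess at the kind of bug involved: fread() returns the number of elements read, so the element size and count must be ordered to match what the return value is compared against. read_exact is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>

/* Read exactly len bytes from f into buf.
 * Returns 0 on success, -1 on a short read or error. */
static int read_exact(FILE *f, unsigned char *buf, size_t len)
{
    assert(f != NULL && buf != NULL);
    /* Element size 1, count len: the return value is a byte count, so
     * comparing it with len is correct. With the arguments swapped
     * (size len, count 1) fread would return at most 1, and the same
     * comparison would fail for any file longer than one byte. */
    if (fread(buf, 1, len, f) != len)
        return -1;
    return 0;
}
```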
Peter Johnson [Thu, 4 Aug 2005 07:53:10 +0000 (07:53 -0000)]
coff-objfmt.c: Add support for ADDR32NB relocations, and enable by default for
the .pdata section. This is needed for structured exception handling on AMD64.
Yasm in the long run should generate this info itself via the use of objfmt
specific directives.
Noticed By: Andrew Dunstan <a_dunstan@hotmail.com>
Peter Johnson [Wed, 29 Jun 2005 06:16:10 +0000 (06:16 -0000)]
Add -M option for Makefile dependency generation.
Initial patch by: Thomas Weidenmueller <thomas@reactsoft.com>
The NASM preprocessor implementation of this is ugly; the preprocessor
really needs a rewrite to clean it up, but there are other higher-priority
items on the TODO list.