From 265595baff1d73f00aedba3ecbc33c0559d388af Mon Sep 17 00:00:00 2001
From: Peter Johnson
Date: Sat, 21 Oct 2006 08:07:09 +0000
Subject: [PATCH] Revamp and update man pages. Still need to add in some missing ones.

svn path=/trunk/yasm/; revision=1660
---
 frontends/yasm/yasm.xml    | 1044 +++++++++++++++++++-----------------
 modules/arch/yasm_arch.xml |  582 ++++++++++----------
 yasm.1                     |  447 ++++++++-------
 yasm_arch.7                |  439 +++++++++++----
 4 files changed, 1447 insertions(+), 1065 deletions(-)

diff --git a/frontends/yasm/yasm.xml b/frontends/yasm/yasm.xml
index 56fc9153..a0d86001 100644
--- a/frontends/yasm/yasm.xml
+++ b/frontends/yasm/yasm.xml
@@ -6,503 +6,551 @@
- - YASM Modular Assembler - September 2004 - YASM - - Peter - Johnson - -
peter@tortall.net
-
-
- - - 2004 - Peter Johnson - -
- - - yasm - 1 - - - - yasm - The YASM Modular Assembler - - - - - yasm - - - - - - - - - - infile - - - - yasm - - - - - - -Description - - The YASM Modular Assembler is a portable, retargetable assembler - written under the new (2 or 3 clause) BSD license. It - is designed from the ground up to allow for multiple assembler - syntaxes (parsers) to be supported in addition to multiple output - object formats and multiple instruction sets. Another primary module - of the overall design is an optimizer module. - - YASM consists of the yasm command, libyasm, the - core backend library, and a large number of loadable modules. On some - platforms, libyasm and the loadable modules are statically built into - the yasm executable rather than being dynamically - loaded. + + The Yasm Modular Assembler + October 2006 + Yasm + + Peter + Johnson + +
peter@tortall.net
+
+
+ + + 2004 + 2005 + 2006 + Peter Johnson + +
+ + + yasm + 1 + + + + yasm + The Yasm Modular Assembler + + + + + yasm + + + + + + + + + + infile + + + + yasm + + + + + + + + Description + + The Yasm Modular Assembler is a portable, retargetable + assembler written under the new (2 or 3 clause) BSD + license. Yasm currently supports the x86 and AMD64 instruction + sets, accepts NASM and GAS assembler syntaxes, outputs binary, + ELF32, ELF64, COFF, Win32, and Win64 object formats, and generates + source debugging information in STABS, DWARF 2, and CodeView 8 + formats. + + YASM consists of the yasm command, libyasm, + the core backend library, and a large number of modules. + Currently, libyasm and the loadable modules are statically built + into the yasm executable. - The yasm command assembles the file infile and - directs output to the file outfile if - specified. If outfile is not specified, - yasm will derive a default output file name from - the name of its input file, usually by appending - .o or .obj, or by removing - all extensions for a raw binary file. Failing that, the output file - name will be yasm.out. - - If called without an infile, - yasm assembles the standard input and directs - output to the file outfile, or - yasm.out if no outfile - is specified. - - - -Options - - Many options may be given in one of two forms: either a dash - followed by a single letter, or two dashes followed by a long option - name. - - The following general options are available: - - - - - - - - - Prints yasm version information and license summary to - standard output. All other options are ignored, and no - output file is generated. - - - - - - or - - - - Prints a summary of invocation options. All other - options are ignored, and no output file is - generated. - - - - - - or - - - - - Selects the target architecture. The default - architecture is x86, which supports both - the IA-32 and derivatives and AMD64 instruction sets. To - print a list of available architectures to standard - output, use help as - arch. 
See - yasm_arch - 7 - for more details. - - - - - - or - - - - - Selects the parser (the assembler syntax). The default - parser is nasm, which emulates the syntax - of NASM, the Netwide Assembler. To print a list of - available parsers to standard output, use - help as - parser. - - - - - - or - - - - - Selects the preprocessor to use on the input file before - passing it to the parser. Preprocessors often provide - macro functionality that is not included in the main - parser. The default preprocessor is nasm, - which is an imported version of the actual NASM - preprocessor. A raw preprocessor is also - available, which simply skips the preprocessing step, - passing the input file directly to the parser. To print a - list of available preprocessors to standard output, use - help as - preproc. - - - - - - or - - - - - Selects the output object format. The default - object format is bin, which is a flat - format binary with no relocation. To print a list of - available object formats to standard output, use - help as - format. - - - - - - or - - - - - Selects the debugging format for debug information. - Debugging information can be used by a debugger to - associate executable code back to the source file or get - data structure and type information. Available debug - formats vary between different object formats; - yasm will error when an invalid - combination is selected. The default object format is - selected by the object format. To print a list of - available debugging formats to standard output, use - help as - debug. - - - - - - or - - - - - Selects the format/style of the output list file. List - files typically intermix the original source with the - machine code generated by the assembler. The default list - format is nasm, which mimics the NASM list - file format. To print a list of available list file - formats to standard output, use help as - list. 
- - - - - - or - - - - - Specifies the name of the output file, overriding any - default name selected by yasm. - - - - - - or - - - - - Specifies the name of the output list file. If this - option is not used, no list file is generated. - - - - - - or - - - - - Selects the target machine architecture. Essentially a - subtype of the selected architecture, the machine type - selects between major subsets of an architecture. For - example, for the x86 architecture, the two - available machines are x86, which is used - for the IA-32 and derivative 32-bit instruction set, and - amd64, which is used for the 64-bit - instruction set. This differentiation is required to - generate the proper object file for relocatable object - formats such as COFF and ELF. To print a list of - available machines for a given architecture to standard - output, use help as - machine and the given - architecture using . See - - yasm_arch - 7 - for more details. - - - - - - -Warning Options - - options have two contrary forms: - and - . Only the - non-default forms are shown here. - - - - - - - - - Inhibits all warning messages. - - - - - - - - - - Treats warnings as errors. - - - - - - - - - - Causes yasm to not warn on - unrecognized characters found in the input. - - - - - - - - - - Causes yasm to warn about labels found - alone on a line without a trailing colon. While these are - legal labels in the nasm parser, they may be - unintentional, due to typos or macro definition - ordering. - - - - - - - - - - Selects a specific output style of error and warning - messages. The default is gnu style, which - mimics the output of gcc. The - vc style is also available, which mimics the - output of Microsoft's Visual C++ compiler. - - - - - - -Preprocessor Options - - - - or - - - - Stops assembly after the preprocessing stage; - preprocessed output is sent to the specified output name - or, if no output name is specified, the standard output. - No object file is produced. 
- - - - - - - - - - Adds directory path to the - search path for include files. - - - - - - - - - - Pre-includes file filename, - making it look as though - filename was prepended to the - input. - - - - - - - - - - Pre-defines a single-line macro. - - - - - - - - - - Undefines a single-line macro. - - - - - - -Examples - - To assemble NASM syntax, 32-bit x86 source - source.asm into ELF file - source.o, warning on orphan labels: - - yasm -f elf -Worphan-labels source.asm - - To assemble NASM syntax AMD64 source x.asm into - AMD64 Win32 file object.obj: - - yasm -m amd64 -f win32 -o object.obj x.asm - - To assemble already preprocessed NASM syntax 32-bit x86 source - y.asm into flat binary file - y.com: - - yasm -f bin -r raw -o y.com y.asm - - - -Diagnostics - - The yasm command exits 0 on success, and nonzero - if an error occurs. - - - -Compatibility - - YASM's NASM parser and preprocessor, while they strive to be as - compatible as possible with NASM, have a few incompatibilities due to - YASM's different internal structure. - - - -Restrictions - - As object files are often architecture and machine dependent, not - all combinations of object formats, architectures, and machines are - legal; trying to use an invalid combination will result in an - error. - - There is no support for list files or symbol maps. - - Relocatable object formats are limited to static linking - applications, as YASM cannot generate relocations for dynamic - linking. - - - -See Also - - - as - 1 - , - - ld - 1 - , - - nasm - 1 - , - - yasm_arch - 7 - - - - -Bugs - - When using the x86 architecture, it is overly easy to - generate AMD64 code (using the BITS 64 - directive) and generate a 32-bit object file (by failing to specify - on the command line). Similarly, specifying - does not default the BITS setting to - 64. - - - + The yasm command assembles the file infile + and directs output to the file outfile + if specified. 
If outfile is not + specified, yasm will derive a default output + file name from the name of its input file, usually by appending + .o or .obj, or by + removing all extensions for a raw binary file. Failing that, the + output file name will be yasm.out. + + If called with an infile of + -, yasm assembles the standard + input and directs output to the file + outfile, or + yasm.out if no + outfile is specified. +
+ + + Options + + Many options may be given in one of two forms: either a dash + followed by a single letter, or two dashes followed by a long + option name. Options are listed in alphabetical order. + + + General Options + + + + or + : Select + target architecture + + + Selects the target architecture. The default architecture + is x86, which supports both the IA-32 and + derivatives and AMD64 instruction sets. To print a list of + available architectures to standard output, use + help as arch. See + + + yasm_arch + 7 + + + for a list of supported architectures. + + + + + or + : + Select object format + + + Selects the output object format. The default object + format is bin, which is a flat format binary + with no relocation. To print a list of available object + formats to standard output, use help as + format. See + + + yasm_objfmt + 7 + + + for a list of supported object formats. + + + + + or + : + Select debugging format + + + Selects the debugging format for debug information. + Debugging information can be used by a debugger to associate + executable code back to the source file or get data structure + and type information. Available debug formats vary between + different object formats; yasm will report an error + when an invalid combination is selected. The default debugging + format is selected by the object format. To print a list of + available debugging formats to standard output, use + help as debug. See + + + yasm_dbgfmt + 7 + + + for a list of supported debugging formats. + + + + + or : Print a + summary of options + + + Prints a summary of invocation options. All other options + are ignored, and no output file is generated. + + + + + or + : + Select list file format + + + Selects the format/style of the output list file. List + files typically intermix the original source with the machine + code generated by the assembler. The default list format is + nasm, which mimics the NASM list file format. 
+ To print a list of available list file formats to standard + output, use help as + list. + + + + + or + : + Specify list filename + + + Specifies the name of the output list file. If this + option is not used, no list file is generated. + + + + + or + : + Select target machine architecture + + + Selects the target machine architecture. Essentially a + subtype of the selected architecture, the machine type selects + between major subsets of an architecture. For example, for the + x86 architecture, the two available machines are + x86, which is used for the IA-32 and derivative + 32-bit instruction set, and amd64, which is used + for the 64-bit instruction set. This differentiation is + required to generate the proper object file for relocatable + object formats such as COFF and ELF. To print a list of + available machines for a given architecture to standard output, + use help as machine + and the given architecture using . See + + + yasm_arch + 7 + + + for more details. + + + + + or + : + Specify object filename + + + Specifies the name of the output file, overriding any + default name generated by Yasm. + + + + + or + : + Select parser + + + Selects the parser (the assembler syntax). The default + parser is nasm, which emulates the syntax of + NASM, the Netwide Assembler. Another available parser is + gas, which emulates the syntax of GNU AS. To + print a list of available parsers to standard output, use + help as parser. See + + + yasm_parsers + 7 + + + for a list of supported parsers. + + + + + or + : + Select preprocessor + + + Selects the preprocessor to use on the input file before + passing it to the parser. Preprocessors often provide macro + functionality that is not included in the main parser. The + default preprocessor is nasm, which is an + imported version of the actual NASM preprocessor. A + raw preprocessor is also available, which simply + skips the preprocessing step, passing the input file directly + to the parser. 
To print a list of available preprocessors to + standard output, use help as + preproc. + + + + + : Get the Yasm version + + + This option causes Yasm to print the version number of + Yasm as well as a license summary to standard output. All + other options are ignored, and no output file is + generated. + + + + + + + Warning Options + + options have two contrary forms: + and + . Only the + non-default forms are shown here. + + The warning options are handled in the order given on the + command line, so if is followed by + , all warnings are turned off + except for orphan-labels. + + + + : Inhibit all warning messages + + + This option causes Yasm to inhibit all warning messages. + As discussed above, this option may be followed by other + options to re-enable specified warnings. + + + + + : Treat warnings as errors + + + This option causes Yasm to treat all warnings as errors. + Normally warnings do not prevent an object file from being + generated and do not result in a failure exit status from + yasm, whereas errors do. This option makes + warnings equivalent to errors in terms of this behavior. + + + + + : Do not warn on + unrecognized input characters + + + Causes Yasm to not warn on unrecognized characters found + in the input. Normally Yasm will generate a warning for any + non-ASCII character found in the input file. + + + + + : Warn on labels lacking a + trailing colon + + + When using the NASM-compatible parser, causes Yasm to warn + about labels found alone on a line without a trailing colon. + While these are legal labels in NASM syntax, they may be + unintentional, due to typos or macro definition + ordering. + + + + + : + Change error/warning reporting style + + + Selects a specific output style for error and warning + messages. The default is gnu style, which + mimics the output of gcc. The + vc style is also available, which mimics the + output of Microsoft's Visual C++ compiler. 
+ + This option is available so that Yasm integrates more + naturally into IDE environments such as Visual Studio or Emacs, allowing the IDE to + correctly recognize the error/warning message as such and link + back to the offending line of source code. + + + + + + + Preprocessor Options + + While these preprocessor options theoretically will affect + any preprocessor, the only preprocessor currently in Yasm is the + nasm preprocessor. + + + + : Pre-define a + macro + + + Pre-defines a single-line macro. The value is optional + (if no value is given, the macro is still defined, but to an + empty value). + + + + + or : + Only preprocess + + + Stops assembly after the preprocessing stage; preprocessed + output is sent to the specified output name or, if no output + name is specified, the standard output. No object file is + produced. + + + + + : Add + include file path + + + Adds directory path to the + search path for include files. The search path defaults to + only including the directory in which the source file + resides. + + + + + : + Pre-include a file + + + Pre-includes file filename, + making it look as though filename + was prepended to the input. Can be useful for prepending + multi-line macros that the macro pre-define option can't + support. + + + + + : + Undefine a macro + + + Undefines a single-line macro (may be either a built-in + macro or one defined earlier on the command line with + the macro pre-define option). + + + + + + + + + Examples + + To assemble NASM syntax, 32-bit x86 source + source.asm into ELF file + source.o, warning on orphan labels: + + yasm -f elf32 -Worphan-labels source.asm + + To assemble NASM syntax AMD64 source + x.asm into Win64 file + object.obj: + + yasm -f win64 -o object.obj x.asm + + To assemble already preprocessed NASM syntax x86 source + y.asm into flat binary file + y.com: + + yasm -f bin -r raw -o y.com y.asm + + + + Diagnostics + + The yasm command exits 0 on success, and + nonzero if an error occurs. 
+ + + + Compatibility + + Yasm's NASM parser and preprocessor, while they strive to be + as compatible as possible with NASM, have a few incompatibilities + due to Yasm's different internal structure. + + Yasm's GAS parser and preprocessor are missing a number of + features present in GNU AS. + + + + Restrictions + + As object files are often architecture and machine dependent, + not all combinations of object formats, architectures, and machines + are legal; trying to use an invalid combination will result in an + error. + + There is no support for symbol maps. + + + + See Also + + + yasm_arch + 7 + , + + yasm_dbgfmt + 7 + , + + yasm_objfmt + 7 + , + + yasm_parsers + 7 + + + Related tools: + + as + 1 + , + + ld + 1 + , + + nasm + 1 + + + + + Bugs + + When using the x86 architecture, it is overly + easy to generate AMD64 code (using the BITS + 64 directive) and generate a 32-bit object file (by + failing to specify or selecting a 64-bit + object format such as ELF64 on the command line). Similarly, + specifying does not default the BITS + setting to 64. An easy way to avoid this is by directly specifying + a 64-bit object format such as . +
diff --git a/modules/arch/yasm_arch.xml b/modules/arch/yasm_arch.xml
index 4b0cc0fd..91c57d77 100644
--- a/modules/arch/yasm_arch.xml
+++ b/modules/arch/yasm_arch.xml
@@ -6,278 +6,314 @@
- - YASM Architectures - September 2004 - YASM - - Peter - Johnson - -
peter@tortall.net
-
-
- - - 2004 - Peter Johnson - -
- - - yasm_arch - 7 - - - - yasm_arch - YASM Architectures - - - - - yasm - - - - - - - - - - - - -Description - - The standard YASM distribution includes a number of loadable modules - for different target architectures. Additional target architectures - may be installed as third-party modules. Each target architecture can - support one or more machine architectures. - - The architecture and machine are selected on the - yasm 1 - command line by use of the and command line options, - respectively. - - - -x86 Architecture - - The x86 architecture supports the IA-32 instruction - set and derivatives and the AMD64 instruction set. It consists of two - machines: x86 (for the IA-32 and derivatives) and - amd64 (for the AMD64 and derivatives). The default - machine for the x86 architecture is the - x86 machine. - - BITS Setting - - The x86 architecture BITS setting specifies to YASM the - processor mode in which the generated code is intended to execute. - x86 processors can run in three different major execution modes: - 16-bit, 32-bit, and on AMD64-supporting processors, 64-bit. As - the x86 instruction set contains portions whose function is - execution-mode dependent (such as operand-size and address-size - override prefixes), YASM cannot assemble x86 instructions - correctly unless it is told by the user in what processor mode the - code will execute. - - The BITS setting can be changed in a variety of ways. When - using the NASM-compatible parser, the BITS setting can be changed - directly via the use of the BITS xx - assembler directive. The default BITS setting is determined by - the object format in use. - - - - BITS 64 Extensions - - When an AMD64-supporting processor is executing in 64-bit mode, - a number of additional extensions are available, including extra - general purpose registers, extra SSE2 registers, and RIP-relative - addressing. - - Register Changes - - The additional 64-bit general purpose registers are named - r8-r15. 
There are also 8-bit (rXb), 16-bit (rXw), and 32-bit - (rXd) subregisters that map to the least significant 8, 16, or - 32 bits of the 64-bit register. The original 8 general - purpose registers have also been extended to 64-bits: eax, - edx, ecx, ebx, esi, edi, esp, and ebp have new 64-bit versions - called rax, rdx, rcx, rbx, rsi, rdi, rsp, and rbp - respectively. The old 32-bit registers map to the least - significant bits of the new 64-bit registers. - - New 8-bit registers are also available that map to the 8 - least significant bits of rsi, rdi, rsp, and rbp. These are - called sil, dil, spl, and bpl respectively. Unfortunately, - due to the way instructions are encoded, these new 8-bit - registers are encoded the same as the old 8-bit registers ah, - dh, ch, and bh. The processor tells which is being used by - the presence of the new REX prefix that is used to specify the - other extended registers. This means it is illegal to mix the - use of ah, dh, ch, and bh with an instruction that requires - the REX prefix for other reasons. For instance: - - add ah, [r10] + + Yasm Supported Target Architectures + October 2006 + YASM + + Peter + Johnson + +
peter@tortall.net
+
+
+ + + 2004 + 2005 + 2006 + Peter Johnson + +
+ + + yasm_arch + 7 + + + + yasm_arch + Yasm Supported Target Architectures + + + + + yasm + + + + + + + + + + + + + + Description + + The standard Yasm distribution includes a number of modules + for different target architectures. Each target architecture can + support one or more machine architectures. + + The architecture and machine are selected on the + + + yasm + 1 + + + command line by use of the and command line options, + respectively. + + The machine architecture may also automatically be selected by + certain object formats. For example, the elf32 + object format selects the x86 machine architecture + by default, while the elf64 object format selects + the amd64 machine architecture by default. + + + + x86 Architecture + + The x86 architecture supports the IA-32 + instruction set and derivatives and the AMD64 instruction set. It + consists of two machines: x86 (for the IA-32 and + derivatives) and amd64 (for the AMD64 and + derivatives). The default machine for the x86 + architecture is the x86 machine. + + + BITS Setting + + The x86 architecture BITS setting specifies to Yasm the + processor mode in which the generated code is intended to execute. + x86 processors can run in three different major execution modes: + 16-bit, 32-bit, and on AMD64-supporting processors, 64-bit. As + the x86 instruction set contains portions whose function is + execution-mode dependent (such as operand-size and address-size + override prefixes), Yasm cannot assemble x86 instructions + correctly unless it is told by the user in what processor mode the + code will execute. + + The BITS setting can be changed in a variety of ways. When + using the NASM-compatible parser, the BITS setting can be changed + directly via the use of the BITS xx + assembler directive. The default BITS setting is determined by + the object format in use. + + + + BITS 64 Extensions + + The AMD64 architecture is a new 64-bit architecture developed + by AMD, based on the 32-bit x86 architecture. 
It extends the + original x86 architecture by doubling the number of general + purpose and SIMD registers, extending the arithmetic operations + and address space to 64 bits, as well as other features. + + Recently, Intel has introduced an essentially identical + version of AMD64 called EM64T. + + When an AMD64-supporting processor is executing in 64-bit + mode, a number of additional extensions are available, including + extra general purpose registers, extra SSE2 registers, and + RIP-relative addressing. + + Yasm extends the base NASM syntax to support AMD64 as + follows. To enable assembly of instructions for the 64-bit mode + of AMD64 processors, use the directive BITS + 64. As with NASM's BITS directive, this does not + change the format of the output object file to 64 bits; it only + changes the assembler mode to assume that the instructions being + assembled will be run in 64-bit mode. To specify an AMD64 object + file, use on the Yasm command line, or + explicitly target a 64-bit object format such as or . + + + Register Changes + + The additional 64-bit general purpose registers are named + r8-r15. There are also 8-bit (rXb), 16-bit (rXw), and 32-bit + (rXd) subregisters that map to the least significant 8, 16, or 32 + bits of the 64-bit register. The original 8 general purpose + registers have also been extended to 64-bits: eax, edx, ecx, ebx, + esi, edi, esp, and ebp have new 64-bit versions called rax, rdx, + rcx, rbx, rsi, rdi, rsp, and rbp respectively. The old 32-bit + registers map to the least significant bits of the new 64-bit + registers. + + New 8-bit registers are also available that map to the 8 + least significant bits of rsi, rdi, rsp, and rbp. These are + called sil, dil, spl, and bpl respectively. Unfortunately, due + to the way instructions are encoded, these new 8-bit registers + are encoded the same as the old 8-bit registers ah, dh, ch, and + bh. 
The processor tells which is being used by the presence of + the new REX prefix that is used to specify the other extended + registers. This means it is illegal to mix the use of ah, dh, + ch, and bh with an instruction that requires the REX prefix for + other reasons. For instance: + + add ah, [r10] - (NASM syntax) is not a legal instruction because the use of - r10 requires a REX prefix, making it impossible to use - ah. - - In 64-bit mode, an additional 8 SSE2 registers are also - available. These are named xmm8-xmm15. - - - - 64 Bit Instructions - - By default, most operations in 64-bit mode remain 32-bit; - operations that are 64-bit usually require a REX prefix (one - bit in the REX prefix determines whether an operation is - 64-bit or 32-bit). Thus, essentially all 32-bit instructions - have a 64-bit version, and the 64-bit versions of instructions - can use extended registers for free (as the REX - prefix is already present). Examples in NASM syntax: - - mov eax, 1 ; 32-bit instruction - mov rcx, 1 ; 64-bit instruction - - Instructions that modify the stack (push, pop, call, ret, - enter, and leave) are implicitly 64-bit. Their 32-bit - counterparts are not available, but their 16-bit counterparts - are. Examples in NASM syntax: - - push eax ; illegal instruction - push rbx ; 1-byte instruction - push r11 ; 2-byte instruction with REX prefix - - - - Implicit Zero Extension - - Results of 32-bit operations are implicitly zero-extended to - the upper 32 bits of the corresponding 64-bit register. 16 - and 8 bit operations, on the other hand, do not affect upper - bits of the register (just as in 32-bit and 16-bit modes). - This can be used to generate smaller code in some instances. 
- Examples in NASM syntax: - - mov ecx, 1 ; 1 byte shorter than mov rcx, 1 - and edx, 3 ; equivalent to and rdx, 3 - - - - Immediates - - For most instructions in 64-bit mode, immediate values - remain 32 bits; their value is sign-extended into the upper 32 - bits of the target register prior to being used. The - exception is the mov instruction, which can take a 64-bit - immediate when the destination is a 64-bit register. Examples - in NASM syntax: - - add rax, 1 ; legal - add rax, 0xffffffff ; sign-extended - add rax, -1 ; same as above - add rax, 0xffffffffffffffff ; warning (>32 bit) - mov eax, 1 ; 5 byte instruction - mov rax, 1 ; 10 byte instruction - mov rbx, 0x1234567890abcdef ; 10 byte instruction - mov rcx, 0xffffffff ; 10 byte instruction - mov ecx, -1 ; 5 byte instruction equivalent to above - - - - Displacements - - Just like immediates, displacements, for the most part, - remain 32 bits and are sign extended prior to use. Again, the - exception is one restricted form of the mov instruction: - between the al/ax/eax/rax register and a 64-bit absolute - address (no registers allowed in the effective address). In - NASM syntax, use of the 64-bit absolute form requires - [qword]. Examples in NASM - syntax: - - mov eax, [1] ; 32 bit, with sign extension - mov al, [rax-1] ; 32 bit, with sign extension - mov al, [qword 0x1122334455667788] ; 64-bit absolute - mov al, [0x1122334455667788] ; truncated to 32-bit (warning) - - - - RIP Relative Addressing - - In 64-bit mode, a new form of effective addressing is - available to make it easier to write position-independent - code. Any memory reference may be made RIP relative (RIP is - the instruction pointer register, which contains the address - of the location immediately following the current - instruction). - - In NASM syntax, there are two ways to specify RIP-relative - addressing: - - mov dword [rip+10], 1 - - stores the value 1 ten bytes after the end of the - instruction. 
10 can also be a symbolic - constant, and will be treated the same way. On the other - hand, - - mov dword [symb wrt rip], 1 - - stores the value 1 into the address of symbol - symb. This is distinctly different - than the behavior of: - - mov dword [symb+rip], 1 - - which takes the address of the end of the instruction, adds - the address of symb to it, then stores - the value 1 there. If symb is a - variable, this will NOT store the value 1 into the - symb variable! - - - - - -lc3b Architecture - - The lc3b architecture supports the LC-3b ISA as used - in the ECE 312 (now ECE 411) course at the University of Illinois, - Urbana-Champaign, as well as other university courses. See for more details and - example code. The lc3b architecture consists of only - one machine: lc3b. - - - -See Also - - - yasm - 1 - - - - -Bugs - - When using the x86 architecture, it is overly easy to - generate AMD64 code (using the BITS 64 - directive) and generate a 32-bit object file (by failing to specify - on the command line). Similarly, specifying - does not default the BITS setting to - 64. - - - + (NASM syntax) is not a legal instruction because the use of + r10 requires a REX prefix, making it impossible to use ah. + + In 64-bit mode, an additional 8 SSE2 registers are also + available. These are named xmm8-xmm15. +
+ + + 64 Bit Instructions + + By default, most operations in 64-bit mode remain 32-bit; + operations that are 64-bit usually require a REX prefix (one bit + in the REX prefix determines whether an operation is 64-bit or + 32-bit). Thus, essentially all 32-bit instructions have a 64-bit + version, and the 64-bit versions of instructions can use extended + registers for free (as the REX prefix is already + present). Examples in NASM syntax: + + mov eax, 1 ; 32-bit instruction + mov rcx, 1 ; 64-bit instruction + + Instructions that modify the stack (push, pop, call, ret, + enter, and leave) are implicitly 64-bit. Their 32-bit + counterparts are not available, but their 16-bit counterparts + are. Examples in NASM syntax: + + push eax ; illegal instruction + push rbx ; 1-byte instruction + push r11 ; 2-byte instruction with REX prefix + + + + Implicit Zero Extension + + Results of 32-bit operations are implicitly zero-extended to + the upper 32 bits of the corresponding 64-bit register. 16 and 8 + bit operations, on the other hand, do not affect upper bits of + the register (just as in 32-bit and 16-bit modes). This can be + used to generate smaller code in some instances. Examples in + NASM syntax: + + mov ecx, 1 ; 1 byte shorter than mov rcx, 1 + and edx, 3 ; equivalent to and rdx, 3 + + + + Immediates + + For most instructions in 64-bit mode, immediate values + remain 32 bits; their value is sign-extended into the upper 32 + bits of the target register prior to being used. The exception + is the mov instruction, which can take a 64-bit immediate when + the destination is a 64-bit register. 
Examples in NASM + syntax: + + add rax, 1 ; optimized down to signed 8-bit + add rax, dword 1 ; force size to 32-bit + add rax, 0xffffffff ; sign-extended 32-bit + add rax, -1 ; same as above + add rax, 0xffffffffffffffff ; truncated to 32-bit (warning) + mov eax, 1 ; 5 byte + mov rax, 1 ; 5 byte (optimized to signed 32-bit) + mov rax, qword 1 ; 10 byte (forced 64-bit) + mov rbx, 0x1234567890abcdef ; 10 byte + mov rcx, 0xffffffff ; 10 byte (does not fit in signed 32-bit) + mov ecx, -1 ; 5 byte, equivalent to above + mov rcx, sym ; 5 byte, 32-bit size default for symbols + mov rcx, qword sym ; 10 byte, override default size + + + + Displacements + + Just like immediates, displacements, for the most part, + remain 32 bits and are sign extended prior to use. Again, the + exception is one restricted form of the mov instruction: between + the al/ax/eax/rax register and a 64-bit absolute address (no + registers allowed in the effective address). In NASM syntax, use + of the 64-bit absolute form requires + [qword]. Examples in NASM syntax: + + mov eax, [1] ; 32 bit, with sign extension + mov al, [rax-1] ; 32 bit, with sign extension + mov al, [qword 0x1122334455667788] ; 64-bit absolute + mov al, [0x1122334455667788] ; truncated to 32-bit (warning) + + + + RIP Relative Addressing + + In 64-bit mode, a new form of effective addressing is + available to make it easier to write position-independent code. + Any memory reference may be made RIP relative (RIP is the + instruction pointer register, which contains the address of the + location immediately following the current instruction). + + In NASM syntax, there are two ways to specify RIP-relative + addressing: + + mov dword [rip+10], 1 + + stores the value 1 ten bytes after the end of the + instruction. 10 can also be a symbolic + constant, and will be treated the same way. On the other + hand, + + mov dword [symb wrt rip], 1 + + stores the value 1 into the address of symbol + symb. 
This is distinctly different than + the behavior of: + + mov dword [symb+rip], 1 + + which takes the address of the end of the instruction, adds + the address of symb to it, then stores the + value 1 there. If symb is a variable, + this will not store the value 1 into the + symb variable! + +
+
+ + + lc3b Architecture + + The lc3b architecture supports the LC-3b ISA as + used in the ECE 312 (now ECE 411) course at the University of + Illinois, Urbana-Champaign, as well as other university courses. + See for more + details and example code. The lc3b architecture + consists of only one machine: lc3b. + + + + See Also + + + yasm + 1 + + + + + Bugs + + When using the x86 architecture, it is overly + easy to generate AMD64 code (using the BITS + 64 directive) and generate a 32-bit object file (by + failing to specify on the command line or + selecting a 64-bit object format). Similarly, specifying + does not default the BITS setting to + 64. An easy way to avoid this is by directly specifying + a 64-bit object format such as . +
diff --git a/yasm.1 b/yasm.1 index d9f21498..9f91f5f4 100644 --- a/yasm.1 +++ b/yasm.1 @@ -1,202 +1,291 @@ -.\"Generated by db2man.xsl. Don't modify this, modify the source. -.de Sh \" Subsection -.br -.if t .Sp -.ne 5 -.PP -\fB\\$1\fR -.PP -.. -.de Sp \" Vertical space (when we can't use .PP) -.if t .sp .5v -.if n .sp -.. -.de Ip \" List item -.br -.ie \\n(.$>=3 .ne \\$3 -.el .ne 3 -.IP "\\$1" \\$2 -.. -.TH "YASM" 1 "September 2004" "YASM" "YASM Modular Assembler" -.SH NAME -yasm \- The YASM Modular Assembler -.SH "SYNOPSIS" +.\" Title: yasm +.\" Author: Peter Johnson +.\" Generator: DocBook XSL Stylesheets v1.70.1 +.\" Date: October 2006 +.\" Manual: The Yasm Modular Assembler +.\" Source: Yasm +.\" +.TH "YASM" "1" "October 2006" "Yasm" "The Yasm Modular Assembler" +.\" disable hyphenation +.nh +.\" disable justification (adjust text to left margin only) .ad l -.hy 0 +.SH "NAME" +yasm \- The Yasm Modular Assembler +.SH "SYNOPSIS" .HP 5 -\fByasm\fR [\fB\-f\ \fIformat\fR\fR] [\fB\-o\ \fIoutfile\fR\fR] [\fB\fIoptions\fR\fR...] [\fIinfile\fR] -.ad -.hy -.ad l -.hy 0 +\fByasm\fR [\fB\-f\ \fR\fB\fIformat\fR\fR] [\fB\-o\ \fR\fB\fIoutfile\fR\fR] [\fB\fIother\ options\fR\fR...] {\fIinfile\fR} .HP 5 \fByasm\fR \fB\-h\fR -.ad -.hy - .SH "DESCRIPTION" - .PP -The YASM Modular Assembler is a portable, retargetable assembler written under the ``new'' (2 or 3 clause) BSD license\&. It is designed from the ground up to allow for multiple assembler syntaxes (parsers) to be supported in addition to multiple output object formats and multiple instruction sets\&. Another primary module of the overall design is an optimizer module\&. - +The Yasm Modular Assembler is a portable, retargetable assembler written under the +\(lqnew\(rq +(2 or 3 clause) BSD license. 
Yasm currently supports the x86 and AMD64 instruction sets, accepts NASM and GAS assembler syntaxes, outputs binary, ELF32, ELF64, COFF, Win32, and Win64 object formats, and generates source debugging information in STABS, DWARF 2, and CodeView 8 formats. .PP -YASM consists of the \fByasm\fR command, libyasm, the core backend library, and a large number of loadable modules\&. On some platforms, libyasm and the loadable modules are statically built into the \fByasm\fR executable rather than being dynamically loaded\&. - +Yasm consists of the +\fByasm\fR +command, libyasm, the core backend library, and a large number of modules. Currently, libyasm and the loadable modules are statically built into the +\fByasm\fR +executable. .PP -The \fByasm\fR command assembles the file infile and directs output to the file \fIoutfile\fR if specified\&. If \fIoutfile\fR is not specified, \fByasm\fR will derive a default output file name from the name of its input file, usually by appending \fI\&.o\fR or \fI\&.obj\fR, or by removing all extensions for a raw binary file\&. Failing that, the output file name will be \fIyasm\&.out\fR\&. - +The +\fByasm\fR +command assembles the file infile and directs output to the file +\fIoutfile\fR +if specified. If +\fIoutfile\fR +is not specified, +\fByasm\fR +will derive a default output file name from the name of its input file, usually by appending +\fI.o\fR +or +\fI.obj\fR, or by removing all extensions for a raw binary file. Failing that, the output file name will be +\fIyasm.out\fR. .PP -If called without an \fIinfile\fR, \fByasm\fR assembles the standard input and directs output to the file \fIoutfile\fR, or \fIyasm\&.out\fR if no \fIoutfile\fR is specified\&. - +If called with an +\fIinfile\fR +of +\(lq\-\(rq, +\fByasm\fR +assembles the standard input and directs output to the file +\fIoutfile\fR, or +\fIyasm.out\fR +if no +\fIoutfile\fR +is specified.
.SH "OPTIONS" - -.PP -Many options may be given in one of two forms: either a dash followed by a single letter, or two dashes followed by a long option name\&. - -.PP -The following general options are available: - -.TP -\fB\-\-version\fR -Prints yasm version information and license summary to standard output\&. All other options are ignored, and no output file is generated\&. - -.TP -\fB\-h\fR or \fB\-\-help\fR -Prints a summary of invocation options\&. All other options are ignored, and no output file is generated\&. - -.TP -\fB\-a \fIarch\fR\fR or \fB\-\-arch=\fIarch\fR\fR -Selects the target architecture\&. The default architecture is ``x86'', which supports both the IA\-32 and derivatives and AMD64 instruction sets\&. To print a list of available architectures to standard output, use ``help'' as \fIarch\fR\&. See \fByasm_arch\fR(7) for more details\&. - -.TP -\fB\-p \fIparser\fR\fR or \fB\-\-parser=\fIparser\fR\fR -Selects the parser (the assembler syntax)\&. The default parser is ``nasm'', which emulates the syntax of NASM, the Netwide Assembler\&. To print a list of available parsers to standard output, use ``help'' as \fIparser\fR\&. - -.TP -\fB\-r \fIpreproc\fR\fR or \fB\-\-preproc=\fIpreproc\fR\fR -Selects the preprocessor to use on the input file before passing it to the parser\&. Preprocessors often provide macro functionality that is not included in the main parser\&. The default preprocessor is ``nasm'', which is an imported version of the actual NASM preprocessor\&. A ``raw'' preprocessor is also available, which simply skips the preprocessing step, passing the input file directly to the parser\&. To print a list of available preprocessors to standard output, use ``help'' as \fIpreproc\fR\&. - -.TP -\fB\-f \fIformat\fR\fR or \fB\-\-oformat=\fIformat\fR\fR -Selects the output object format\&. The default object format is ``bin'', which is a flat format binary with no relocation\&. 
To print a list of available object formats to standard output, use ``help'' as \fIformat\fR\&. - -.TP -\fB\-g \fIdebug\fR\fR or \fB\-\-dformat=\fIdebug\fR\fR -Selects the debugging format for debug information\&. Debugging information can be used by a debugger to associate executable code back to the source file or get data structure and type information\&. Available debug formats vary between different object formats; \fByasm\fR will error when an invalid combination is selected\&. The default object format is selected by the object format\&. To print a list of available debugging formats to standard output, use ``help'' as \fIdebug\fR\&. - -.TP -\fB\-L \fIlist\fR\fR or \fB\-\-lformat=\fIlist\fR\fR -Selects the format/style of the output list file\&. List files typically intermix the original source with the machine code generated by the assembler\&. The default list format is ``nasm'', which mimics the NASM list file format\&. To print a list of available list file formats to standard output, use ``help'' as \fIlist\fR\&. - -.TP -\fB\-o \fIfilename\fR\fR or \fB\-\-objfile=\fIfilename\fR\fR -Specifies the name of the output file, overriding any default name selected by \fByasm\fR\&. - -.TP -\fB\-l \fIlistfile\fR\fR or \fB\-\-list=\fIlistfile\fR\fR -Specifies the name of the output list file\&. If this option is not used, no list file is generated\&. - -.TP -\fB\-m \fImachine\fR\fR or \fB\-\-machine=\fImachine\fR\fR -Selects the target machine architecture\&. Essentially a subtype of the selected architecture, the machine type selects between major subsets of an architecture\&. For example, for the ``x86'' architecture, the two available machines are ``x86'', which is used for the IA\-32 and derivative 32\-bit instruction set, and ``amd64'', which is used for the 64\-bit instruction set\&. This differentiation is required to generate the proper object file for relocatable object formats such as COFF and ELF\&. 
To print a list of available machines for a given architecture to standard output, use ``help'' as \fImachine\fR and the given architecture using \fB\-a \fIarch\fR\fR\&. See \fByasm_arch\fR(7) for more details\&. - -.SH "WARNING OPTIONS" - -.PP -\fB\-W\fR options have two contrary forms: \fB\-W\fIname\fR\fR and \fB\-Wno\-\fIname\fR\fR\&. Only the non\-default forms are shown here\&. - -.TP +.PP +Many options may be given in one of two forms: either a dash followed by a single letter, or two dashes followed by a long option name. Options are listed in alphabetical order. +.SS "General Options" +.TP 3n +\fB\-a \fR\fB\fIarch\fR\fR or \fB\-\-arch=\fR\fB\fIarch\fR\fR: Select target architecture +Selects the target architecture. The default architecture is +\(lqx86\(rq, which supports both the IA\-32 and derivatives and AMD64 instruction sets. To print a list of available architectures to standard output, use +\(lqhelp\(rq +as +\fIarch\fR. See +\fByasm_arch\fR(7) +for a list of supported architectures. +.TP 3n +\fB\-f \fR\fB\fIformat\fR\fR or \fB\-\-oformat=\fR\fB\fIformat\fR\fR: Select object format +Selects the output object format. The default object format is +\(lqbin\(rq, which is a flat format binary with no relocation. To print a list of available object formats to standard output, use +\(lqhelp\(rq +as +\fIformat\fR. See +\fByasm_objfmt\fR(7) +for a list of supported object formats. +.TP 3n +\fB\-g \fR\fB\fIdebug\fR\fR or \fB\-\-dformat=\fR\fB\fIdebug\fR\fR: Select debugging format +Selects the debugging format for debug information. Debugging information can be used by a debugger to associate executable code back to the source file or get data structure and type information. Available debug formats vary between different object formats; +\fByasm\fR +will error when an invalid combination is selected. The default debugging format is selected by the object format. To print a list of available debugging formats to standard output, use +\(lqhelp\(rq +as +\fIdebug\fR.
See +\fByasm_dbgfmt\fR(7) +for a list of supported debugging formats. +.TP 3n +\fB\-h\fR or \fB\-\-help\fR: Print a summary of options +Prints a summary of invocation options. All other options are ignored, and no output file is generated. +.TP 3n +\fB\-L \fR\fB\fIlist\fR\fR or \fB\-\-lformat=\fR\fB\fIlist\fR\fR: Select list file format +Selects the format/style of the output list file. List files typically intermix the original source with the machine code generated by the assembler. The default list format is +\(lqnasm\(rq, which mimics the NASM list file format. To print a list of available list file formats to standard output, use +\(lqhelp\(rq +as +\fIlist\fR. +.TP 3n +\fB\-l \fR\fB\fIlistfile\fR\fR or \fB\-\-list=\fR\fB\fIlistfile\fR\fR: Specify list filename +Specifies the name of the output list file. If this option is not used, no list file is generated. +.TP 3n +\fB\-m \fR\fB\fImachine\fR\fR or \fB\-\-machine=\fR\fB\fImachine\fR\fR: Select target machine architecture +Selects the target machine architecture. Essentially a subtype of the selected architecture, the machine type selects between major subsets of an architecture. For example, for the +\(lqx86\(rq +architecture, the two available machines are +\(lqx86\(rq, which is used for the IA\-32 and derivative 32\-bit instruction set, and +\(lqamd64\(rq, which is used for the 64\-bit instruction set. This differentiation is required to generate the proper object file for relocatable object formats such as COFF and ELF. To print a list of available machines for a given architecture to standard output, use +\(lqhelp\(rq +as +\fImachine\fR +and the given architecture using +\fB\-a \fR\fB\fIarch\fR\fR. See +\fByasm_arch\fR(7) +for more details. +.TP 3n +\fB\-o \fR\fB\fIfilename\fR\fR or \fB\-\-objfile=\fR\fB\fIfilename\fR\fR: Specify object filename +Specifies the name of the output file, overriding any default name generated by Yasm. 
+.TP 3n +\fB\-p \fR\fB\fIparser\fR\fR or \fB\-\-parser=\fR\fB\fIparser\fR\fR: Select parser +Selects the parser (the assembler syntax). The default parser is +\(lqnasm\(rq, which emulates the syntax of NASM, the Netwide Assembler. Another available parser is +\(lqgas\(rq, which emulates the syntax of GNU AS. To print a list of available parsers to standard output, use +\(lqhelp\(rq +as +\fIparser\fR. See +\fByasm_parsers\fR(7) +for a list of supported parsers. +.TP 3n +\fB\-r \fR\fB\fIpreproc\fR\fR or \fB\-\-preproc=\fR\fB\fIpreproc\fR\fR: Select preprocessor +Selects the preprocessor to use on the input file before passing it to the parser. Preprocessors often provide macro functionality that is not included in the main parser. The default preprocessor is +\(lqnasm\(rq, which is an imported version of the actual NASM preprocessor. A +\(lqraw\(rq +preprocessor is also available, which simply skips the preprocessing step, passing the input file directly to the parser. To print a list of available preprocessors to standard output, use +\(lqhelp\(rq +as +\fIpreproc\fR. +.TP 3n +\fB\-\-version\fR: Get the Yasm version +This option causes Yasm to print the version number of Yasm as well as a license summary to standard output. All other options are ignored, and no output file is generated. +.\" end of SS subsection "General Options" +.SS "Warning Options" +.PP +\fB\-W\fR +options have two contrary forms: +\fB\-W\fR\fB\fIname\fR\fR +and +\fB\-Wno\-\fR\fB\fIname\fR\fR. Only the non\-default forms are shown here. +.PP +The warning options are handled in the order given on the command line, so if \fB\-w\fR -Inhibits all warning messages\&. - -.TP -\fB\-Werror\fR -Treats warnings as errors\&. - -.TP -\fB\-Wno\-unrecognized\-char\fR -Causes \fByasm\fR to not warn on unrecognized characters found in the input\&. - -.TP -\fB\-Worphan\-labels\fR -Causes \fByasm\fR to warn about labels found alone on a line without a trailing colon\&.
While these are legal labels in the ``nasm'' parser, they may be unintentional, due to typos or macro definition ordering\&. - -.TP -\fB\-X \fIstyle\fR\fR -Selects a specific output style of error and warning messages\&. The default is ``gnu'' style, which mimics the output of \fBgcc\fR\&. The ``vc'' style is also available, which mimics the output of Microsoft's Visual C++ compiler\&. - -.SH "PREPROCESSOR OPTIONS" - -.TP -\fB\-e\fR or \fB\-\-preproc\-only\fR -Stops assembly after the preprocessing stage; preprocessed output is sent to the specified output name or, if no output name is specified, the standard output\&. No object file is produced\&. - -.TP -\fB\-I \fIpath\fR\fR -Adds directory \fIpath\fR to the search path for include files\&. - -.TP -\fB\-P \fIfilename\fR\fR -Pre\-includes file \fIfilename\fR, making it look as though \fIfilename\fR was prepended to the input\&. - -.TP -\fB\-D \fImacro[=value]\fR\fR -Pre\-defines a single\-line macro\&. - -.TP -\fB\-U \fImacro\fR\fR -Undefines a single\-line macro\&. - +is followed by +\fB\-Worphan\-labels\fR, all warnings are turned off +\fIexcept\fR +for orphan\-labels. +.TP 3n +\fB\-w\fR: Inhibit all warning messages +This option causes Yasm to inhibit all warning messages. As discussed above, this option may be followed by other options to re\-enable specified warnings. +.TP 3n +\fB\-Werror\fR: Treat warnings as errors +This option causes Yasm to treat all warnings as errors. Normally warnings do not prevent an object file from being generated and do not result in a failure exit status from +\fByasm\fR, whereas errors do. This option makes warnings equivalent to errors in terms of this behavior. +.TP 3n +\fB\-Wno\-unrecognized\-char\fR: Do not warn on unrecognized input characters +Causes Yasm to not warn on unrecognized characters found in the input. Normally Yasm will generate a warning for any non\-ASCII character found in the input file. 
+.TP 3n +\fB\-Worphan\-labels\fR: Warn on labels lacking a trailing colon +When using the NASM\-compatible parser, causes Yasm to warn about labels found alone on a line without a trailing colon. While these are legal labels in NASM syntax, they may be unintentional, due to typos or macro definition ordering. +.TP 3n +\fB\-X \fR\fB\fIstyle\fR\fR: Change error/warning reporting style +Selects a specific output style for error and warning messages. The default is +\(lqgnu\(rq +style, which mimics the output of +\fBgcc\fR. The +\(lqvc\(rq +style is also available, which mimics the output of Microsoft's Visual C++ compiler. +.sp +This option is available so that Yasm integrates more naturally into IDE environments such as +Visual Studio +or +Emacs, allowing the IDE to correctly recognize the error/warning message as such and link back to the offending line of source code. +.\" end of SS subsection "Warning Options" +.SS "Preprocessor Options" +.PP +While these preprocessor options theoretically will affect any preprocessor, the only preprocessor currently in Yasm is the +\(lqnasm\(rq +preprocessor. +.TP 3n +\fB\-D \fR\fB\fImacro[=value]\fR\fR: Pre\-define a macro +Pre\-defines a single\-line macro. The value is optional (if no value is given, the macro is still defined, but to an empty value). +.TP 3n +\fB\-e\fR or \fB\-\-preproc\-only\fR: Only preprocess +Stops assembly after the preprocessing stage; preprocessed output is sent to the specified output name or, if no output name is specified, the standard output. No object file is produced. +.TP 3n +\fB\-I \fR\fB\fIpath\fR\fR: Add include file path +Adds directory +\fIpath\fR +to the search path for include files. The search path defaults to only including the directory in which the source file resides. +.TP 3n +\fB\-P \fR\fB\fIfilename\fR\fR: Pre\-include a file +Pre\-includes file +\fIfilename\fR, making it look as though +\fIfilename\fR +was prepended to the input.
Can be useful for prepending multi\-line macros that the +\fB\-D\fR +option can't support. +.TP 3n +\fB\-U \fR\fB\fImacro\fR\fR: Undefine a macro +Undefines a single\-line macro (may be either a built\-in macro or one defined earlier on the command line with +\fB\-D\fR). +.\" end of SS subsection "Preprocessor Options" .SH "EXAMPLES" - -.PP -To assemble NASM syntax, 32\-bit x86 source \fIsource\&.asm\fR into ELF file \fIsource\&.o\fR, warning on orphan labels: - -.IP -yasm \-f elf \-Worphan\-labels source\&.asm - -.PP -To assemble NASM syntax AMD64 source \fIx\&.asm\fR into AMD64 Win32 file \fIobject\&.obj\fR: - -.IP -yasm \-m amd64 \-f win32 \-o object\&.obj x\&.asm - -.PP -To assemble already preprocessed NASM syntax 32\-bit x86 source \fIy\&.asm\fR into flat binary file \fIy\&.com\fR: - -.IP -yasm \-f bin \-r raw \-o y\&.com y\&.asm - +.PP +To assemble NASM syntax, 32\-bit x86 source +\fIsource.asm\fR +into ELF file +\fIsource.o\fR, warning on orphan labels: +.sp +.RS 3n +.nf +yasm \-f elf32 \-Worphan\-labels source.asm +.fi +.RE +.PP +To assemble NASM syntax AMD64 source +\fIx.asm\fR +into Win64 file +\fIobject.obj\fR: +.sp +.RS 3n +.nf +yasm \-f win64 \-o object.obj x.asm +.fi +.RE +.PP +To assemble already preprocessed NASM syntax x86 source +\fIy.asm\fR +into flat binary file +\fIy.com\fR: +.sp +.RS 3n +.nf +yasm \-f bin \-r raw \-o y.com y.asm +.fi +.RE .SH "DIAGNOSTICS" - .PP -The \fByasm\fR command exits 0 on success, and nonzero if an error occurs\&. - +The +\fByasm\fR +command exits 0 on success, and nonzero if an error occurs. .SH "COMPATIBILITY" - .PP -YASM's NASM parser and preprocessor, while they strive to be as compatible as possible with NASM, have a few incompatibilities due to YASM's different internal structure\&. - -.SH "RESTRICTIONS" - +Yasm's NASM parser and preprocessor, while they strive to be as compatible as possible with NASM, have a few incompatibilities due to Yasm's different internal structure.
.PP -As object files are often architecture and machine dependent, not all combinations of object formats, architectures, and machines are legal; trying to use an invalid combination will result in an error\&. - +Yasm's GAS parser and preprocessor are missing a number of features present in GNU AS. +.SH "RESTRICTIONS" .PP -There is no support for list files or symbol maps\&. - +As object files are often architecture and machine dependent, not all combinations of object formats, architectures, and machines are legal; trying to use an invalid combination will result in an error. .PP -Relocatable object formats are limited to static linking applications, as YASM cannot generate relocations for dynamic linking\&. - +There is no support for symbol maps. .SH "SEE ALSO" - .PP -\fBas\fR(1), \fBld\fR(1), \fBnasm\fR(1), \fByasm_arch\fR(7) - +\fByasm_arch\fR(7), +\fByasm_dbgfmt\fR(7), +\fByasm_objfmt\fR(7), +\fByasm_parsers\fR(7) +.PP +Related tools: +\fBas\fR(1), +\fBld\fR(1), +\fBnasm\fR(1) .SH "BUGS" - .PP -When using the ``x86'' architecture, it is overly easy to generate AMD64 code (using the \fBBITS 64\fR directive) and generate a 32\-bit object file (by failing to specify \fB\-m amd64\fR on the command line)\&. Similarly, specifying \fB\-m amd64\fR does not default the BITS setting to 64\&. - -.SH AUTHOR -Peter Johnson . +When using the +\(lqx86\(rq +architecture, it is overly easy to generate AMD64 code (using the +\fBBITS 64\fR +directive) and generate a 32\-bit object file (by failing to specify +\fB\-m amd64\fR +or selecting a 64\-bit object format such as ELF64 on the command line). Similarly, specifying +\fB\-m amd64\fR +does not default the BITS setting to 64. An easy way to avoid this is by directly specifying a 64\-bit object format such as +\fB\-f elf64\fR. +.SH "AUTHOR" +.PP +\fBPeter\fR \fBJohnson\fR +.sp -1n +.IP "" 3n +Author. 
+.SH "COPYRIGHT" +Copyright \(co 2004, 2005, 2006 Peter Johnson diff --git a/yasm_arch.7 b/yasm_arch.7 index 24eae9de..7694dd73 100644 --- a/yasm_arch.7 +++ b/yasm_arch.7 @@ -1,163 +1,372 @@ -.\"Generated by db2man.xsl. Don't modify this, modify the source. -.de Sh \" Subsection -.br -.if t .Sp -.ne 5 -.PP -\fB\\$1\fR -.PP -.. -.de Sp \" Vertical space (when we can't use .PP) -.if t .sp .5v -.if n .sp -.. -.de Ip \" List item -.br -.ie \\n(.$>=3 .ne \\$3 -.el .ne 3 -.IP "\\$1" \\$2 -.. -.TH "YASM_ARCH" 7 "September 2004" "YASM" "YASM Architectures" -.SH NAME -yasm_arch \- YASM Architectures -.SH "SYNOPSIS" +.\" Title: yasm_arch +.\" Author: Peter Johnson +.\" Generator: DocBook XSL Stylesheets v1.70.1 +.\" Date: October 2006 +.\" Manual: Yasm Supported Target Architectures +.\" Source: YASM +.\" +.TH "YASM_ARCH" "7" "October 2006" "YASM" "Yasm Supported Target Architec" +.\" disable hyphenation +.nh +.\" disable justification (adjust text to left margin only) .ad l -.hy 0 +.SH "NAME" +yasm_arch \- Yasm Supported Target Architectures +.SH "SYNOPSIS" .HP 5 -\fByasm\fR \fB\-a\ \fIarch\fR\fR [\fB\-m\ \fImachine\fR\fR] \fB\fI\&.\&.\&.\fR\fR -.ad -.hy - +\fByasm\fR \fB\-a\ \fR\fB\fIarch\fR\fR [\fB\-m\ \fR\fB\fImachine\fR\fR] \fB\fI...\fR\fR .SH "DESCRIPTION" - .PP -The standard YASM distribution includes a number of loadable modules for different target architectures\&. Additional target architectures may be installed as third\-party modules\&. Each target architecture can support one or more machine architectures\&. - +The standard Yasm distribution includes a number of modules for different target architectures. Each target architecture can support one or more machine architectures. +.PP +The architecture and machine are selected on the +\fByasm\fR(1) +command line by use of the +\fB\-a \fR\fB\fIarch\fR\fR +and +\fB\-m \fR\fB\fImachine\fR\fR +command line options, respectively. 
.PP -The architecture and machine are selected on the \fByasm\fR(1) command line by use of the \fB\-a \fIarch\fR\fR and \fB\-m \fImachine\fR\fR command line options, respectively\&. - +The machine architecture may also automatically be selected by certain object formats. For example, the +\(lqelf32\(rq +object format selects the +\(lqx86\(rq +machine architecture by default, while the +\(lqelf64\(rq +object format selects the +\(lqamd64\(rq +machine architecture by default. .SH "X86 ARCHITECTURE" - .PP -The ``x86'' architecture supports the IA\-32 instruction set and derivatives and the AMD64 instruction set\&. It consists of two machines: ``x86'' (for the IA\-32 and derivatives) and ``amd64'' (for the AMD64 and derivatives)\&. The default machine for the ``x86'' architecture is the ``x86'' machine\&. - +The +\(lqx86\(rq +architecture supports the IA\-32 instruction set and derivatives and the AMD64 instruction set. It consists of two machines: +\(lqx86\(rq +(for the IA\-32 and derivatives) and +\(lqamd64\(rq +(for the AMD64 and derivatives). The default machine for the +\(lqx86\(rq +architecture is the +\(lqx86\(rq +machine. .SS "BITS Setting" - .PP -The x86 architecture BITS setting specifies to YASM the processor mode in which the generated code is intended to execute\&. x86 processors can run in three different major execution modes: 16\-bit, 32\-bit, and on AMD64\-supporting processors, 64\-bit\&. As the x86 instruction set contains portions whose function is execution\-mode dependent (such as operand\-size and address\-size override prefixes), YASM cannot assemble x86 instructions correctly unless it is told by the user in what processor mode the code will execute\&. - +The x86 architecture BITS setting specifies to Yasm the processor mode in which the generated code is intended to execute. x86 processors can run in three different major execution modes: 16\-bit, 32\-bit, and on AMD64\-supporting processors, 64\-bit. 
As the x86 instruction set contains portions whose function is execution\-mode dependent (such as operand\-size and address\-size override prefixes), Yasm cannot assemble x86 instructions correctly unless it is told by the user in what processor mode the code will execute. .PP -The BITS setting can be changed in a variety of ways\&. When using the NASM\-compatible parser, the BITS setting can be changed directly via the use of the \fBBITS xx\fR assembler directive\&. The default BITS setting is determined by the object format in use\&. - +The BITS setting can be changed in a variety of ways. When using the NASM\-compatible parser, the BITS setting can be changed directly via the use of the +\fBBITS xx\fR +assembler directive. The default BITS setting is determined by the object format in use. +.\" end of SS subsection "BITS Setting" .SS "BITS 64 Extensions" - .PP -When an AMD64\-supporting processor is executing in 64\-bit mode, a number of additional extensions are available, including extra general purpose registers, extra SSE2 registers, and RIP\-relative addressing\&. - +The AMD64 architecture is a new 64\-bit architecture developed by AMD, based on the 32\-bit x86 architecture. It extends the original x86 architecture by doubling the number of general purpose and SIMD registers, extending the arithmetic operations and address space to 64 bits, as well as other features. .PP -The additional 64\-bit general purpose registers are named r8\-r15\&. There are also 8\-bit (rXb), 16\-bit (rXw), and 32\-bit (rXd) subregisters that map to the least significant 8, 16, or 32 bits of the 64\-bit register\&. The original 8 general purpose registers have also been extended to 64\-bits: eax, edx, ecx, ebx, esi, edi, esp, and ebp have new 64\-bit versions called rax, rdx, rcx, rbx, rsi, rdi, rsp, and rbp respectively\&. The old 32\-bit registers map to the least significant bits of the new 64\-bit registers\&. 
- +Recently, Intel has introduced an essentially identical version of AMD64 called EM64T. .PP -New 8\-bit registers are also available that map to the 8 least significant bits of rsi, rdi, rsp, and rbp\&. These are called sil, dil, spl, and bpl respectively\&. Unfortunately, due to the way instructions are encoded, these new 8\-bit registers are encoded the same as the old 8\-bit registers ah, dh, ch, and bh\&. The processor tells which is being used by the presence of the new REX prefix that is used to specify the other extended registers\&. This means it is illegal to mix the use of ah, dh, ch, and bh with an instruction that requires the REX prefix for other reasons\&. For instance: - -.IP +When an AMD64\-supporting processor is executing in 64\-bit mode, a number of additional extensions are available, including extra general purpose registers, extra SSE2 registers, and RIP\-relative addressing. +.PP +Yasm extends the base NASM syntax to support AMD64 as follows. To enable assembly of instructions for the 64\-bit mode of AMD64 processors, use the directive +\fBBITS 64\fR. As with NASM's BITS directive, this does not change the format of the output object file to 64 bits; it only changes the assembler mode to assume that the instructions being assembled will be run in 64\-bit mode. To specify an AMD64 object file, use +\fB\-m amd64\fR +on the Yasm command line, or explicitly target a 64\-bit object format such as +\fB\-f win64\fR +or +\fB\-f elf64\fR. +.sp +.it 1 an-trap +.nr an-no-space-flag 1 +.nr an-break-flag 1 +.br +\fBRegister Changes\fR +.RS +.PP +The additional 64\-bit general purpose registers are named r8\-r15. There are also 8\-bit (rXb), 16\-bit (rXw), and 32\-bit (rXd) subregisters that map to the least significant 8, 16, or 32 bits of the 64\-bit register. 
The original 8 general purpose registers have also been extended to 64\-bits: eax, edx, ecx, ebx, esi, edi, esp, and ebp have new 64\-bit versions called rax, rdx, rcx, rbx, rsi, rdi, rsp, and rbp respectively. The old 32\-bit registers map to the least significant bits of the new 64\-bit registers. +.PP +New 8\-bit registers are also available that map to the 8 least significant bits of rsi, rdi, rsp, and rbp. These are called sil, dil, spl, and bpl respectively. Unfortunately, due to the way instructions are encoded, these new 8\-bit registers are encoded the same as the old 8\-bit registers ah, dh, ch, and bh. The processor tells which is being used by the presence of the new REX prefix that is used to specify the other extended registers. This means it is illegal to mix the use of ah, dh, ch, and bh with an instruction that requires the REX prefix for other reasons. For instance: +.sp +.RS 3n +.nf add ah, [r10] +.fi +.RE .PP -(NASM syntax) is not a legal instruction because the use of r10 requires a REX prefix, making it impossible to use ah\&. - +(NASM syntax) is not a legal instruction because the use of r10 requires a REX prefix, making it impossible to use ah. .PP -In 64\-bit mode, an additional 8 SSE2 registers are also available\&. These are named xmm8\-xmm15\&. - +In 64\-bit mode, an additional 8 SSE2 registers are also available. These are named xmm8\-xmm15. +.RE +.\" end of subsection "Register Changes" +.sp +.it 1 an-trap +.nr an-no-space-flag 1 +.nr an-break-flag 1 +.br +\fB64 Bit Instructions\fR +.RS .PP -By default, most operations in 64\-bit mode remain 32\-bit; operations that are 64\-bit usually require a REX prefix (one bit in the REX prefix determines whether an operation is 64\-bit or 32\-bit)\&. Thus, essentially all 32\-bit instructions have a 64\-bit version, and the 64\-bit versions of instructions can use extended registers ``for free'' (as the REX prefix is already present)\&. 
Examples in NASM syntax: - -.IP +By default, most operations in 64\-bit mode remain 32\-bit; operations that are 64\-bit usually require a REX prefix (one bit in the REX prefix determines whether an operation is 64\-bit or 32\-bit). Thus, essentially all 32\-bit instructions have a 64\-bit version, and the 64\-bit versions of instructions can use extended registers +\(lqfor free\(rq +(as the REX prefix is already present). Examples in NASM syntax: +.sp +.RS 3n +.nf mov eax, 1 ; 32\-bit instruction -.IP +.fi +.RE +.sp +.RS 3n +.nf mov rcx, 1 ; 64\-bit instruction +.fi +.RE .PP -Instructions that modify the stack (push, pop, call, ret, enter, and leave) are implicitly 64\-bit\&. Their 32\-bit counterparts are not available, but their 16\-bit counterparts are\&. Examples in NASM syntax: - -.IP +Instructions that modify the stack (push, pop, call, ret, enter, and leave) are implicitly 64\-bit. Their 32\-bit counterparts are not available, but their 16\-bit counterparts are. Examples in NASM syntax: +.sp +.RS 3n +.nf push eax ; illegal instruction -.IP +.fi +.RE +.sp +.RS 3n +.nf push rbx ; 1\-byte instruction -.IP +.fi +.RE +.sp +.RS 3n +.nf push r11 ; 2\-byte instruction with REX prefix +.fi +.RE +.RE +.\" end of subsection "64 Bit Instructions" +.sp +.it 1 an-trap +.nr an-no-space-flag 1 +.nr an-break-flag 1 +.br +\fBImplicit Zero Extension\fR +.RS .PP -Results of 32\-bit operations are implicitly zero\-extended to the upper 32 bits of the corresponding 64\-bit register\&. 16 and 8 bit operations, on the other hand, do not affect upper bits of the register (just as in 32\-bit and 16\-bit modes)\&. This can be used to generate smaller code in some instances\&. Examples in NASM syntax: - -.IP +Results of 32\-bit operations are implicitly zero\-extended to the upper 32 bits of the corresponding 64\-bit register. 16 and 8 bit operations, on the other hand, do not affect upper bits of the register (just as in 32\-bit and 16\-bit modes). 
This can be used to generate smaller code in some instances. Examples in NASM syntax: +.sp +.RS 3n +.nf mov ecx, 1 ; 1 byte shorter than mov rcx, 1 -.IP +.fi +.RE +.sp +.RS 3n +.nf and edx, 3 ; equivalent to and rdx, 3 +.fi +.RE +.RE +.\" end of subsection "Implicit Zero Extension" +.sp +.it 1 an-trap +.nr an-no-space-flag 1 +.nr an-break-flag 1 +.br +\fBImmediates\fR +.RS .PP -For most instructions in 64\-bit mode, immediate values remain 32 bits; their value is sign\-extended into the upper 32 bits of the target register prior to being used\&. The exception is the mov instruction, which can take a 64\-bit immediate when the destination is a 64\-bit register\&. Examples in NASM syntax: - -.IP -add rax, 1 ; legal -.IP -add rax, 0xffffffff ; sign\-extended -.IP -add rax, \-1 ; same as above -.IP -add rax, 0xffffffffffffffff ; warning (>32 bit) -.IP -mov eax, 1 ; 5 byte instruction -.IP -mov rax, 1 ; 10 byte instruction -.IP -mov rbx, 0x1234567890abcdef ; 10 byte instruction -.IP -mov rcx, 0xffffffff ; 10 byte instruction -.IP -mov ecx, \-1 ; 5 byte instruction equivalent to above -.PP -Just like immediates, displacements, for the most part, remain 32 bits and are sign extended prior to use\&. Again, the exception is one restricted form of the mov instruction: between the al/ax/eax/rax register and a 64\-bit absolute address (no registers allowed in the effective address)\&. In NASM syntax, use of the 64\-bit absolute form requires \fB[qword]\fR\&. Examples in NASM syntax: - -.IP +For most instructions in 64\-bit mode, immediate values remain 32 bits; their value is sign\-extended into the upper 32 bits of the target register prior to being used. The exception is the mov instruction, which can take a 64\-bit immediate when the destination is a 64\-bit register. 
Examples in NASM syntax: +.sp +.RS 3n +.nf +add rax, 1 ; optimized down to signed 8\-bit +.fi +.RE +.sp +.RS 3n +.nf +add rax, dword 1 ; force size to 32\-bit +.fi +.RE +.sp +.RS 3n +.nf +add rax, 0xffffffff ; sign\-extended 32\-bit +.fi +.RE +.sp +.RS 3n +.nf +add rax, \-1 ; same as above +.fi +.RE +.sp +.RS 3n +.nf +add rax, 0xffffffffffffffff ; truncated to 32\-bit (warning) +.fi +.RE +.sp +.RS 3n +.nf +mov eax, 1 ; 5 byte +.fi +.RE +.sp +.RS 3n +.nf +mov rax, 1 ; 5 byte (optimized to signed 32\-bit) +.fi +.RE +.sp +.RS 3n +.nf +mov rax, qword 1 ; 10 byte (forced 64\-bit) +.fi +.RE +.sp +.RS 3n +.nf +mov rbx, 0x1234567890abcdef ; 10 byte +.fi +.RE +.sp +.RS 3n +.nf +mov rcx, 0xffffffff ; 10 byte (does not fit in signed 32\-bit) +.fi +.RE +.sp +.RS 3n +.nf +mov ecx, \-1 ; 5 byte, equivalent to above +.fi +.RE +.sp +.RS 3n +.nf +mov rcx, sym ; 5 byte, 32\-bit size default for symbols +.fi +.RE +.sp +.RS 3n +.nf +mov rcx, qword sym ; 10 byte, override default size +.fi +.RE +.RE +.\" end of subsection "Immediates" +.sp +.it 1 an-trap +.nr an-no-space-flag 1 +.nr an-break-flag 1 +.br +\fBDisplacements\fR +.RS +.PP +Just like immediates, displacements, for the most part, remain 32 bits and are sign extended prior to use. Again, the exception is one restricted form of the mov instruction: between the al/ax/eax/rax register and a 64\-bit absolute address (no registers allowed in the effective address). In NASM syntax, use of the 64\-bit absolute form requires +\fB[qword]\fR. 
Examples in NASM syntax: +.sp +.RS 3n +.nf mov eax, [1] ; 32 bit, with sign extension -.IP +.fi +.RE +.sp +.RS 3n +.nf mov al, [rax\-1] ; 32 bit, with sign extension -.IP +.fi +.RE +.sp +.RS 3n +.nf mov al, [qword 0x1122334455667788] ; 64\-bit absolute -.IP +.fi +.RE +.sp +.RS 3n +.nf mov al, [0x1122334455667788] ; truncated to 32\-bit (warning) +.fi +.RE +.RE +.\" end of subsection "Displacements" +.sp +.it 1 an-trap +.nr an-no-space-flag 1 +.nr an-break-flag 1 +.br +\fBRIP Relative Addressing\fR +.RS .PP -In 64\-bit mode, a new form of effective addressing is available to make it easier to write position\-independent code\&. Any memory reference may be made RIP relative (RIP is the instruction pointer register, which contains the address of the location immediately following the current instruction)\&. - +In 64\-bit mode, a new form of effective addressing is available to make it easier to write position\-independent code. Any memory reference may be made RIP relative (RIP is the instruction pointer register, which contains the address of the location immediately following the current instruction). .PP In NASM syntax, there are two ways to specify RIP\-relative addressing: - -.IP +.sp +.RS 3n +.nf mov dword [rip+10], 1 +.fi +.RE .PP -stores the value 1 ten bytes after the end of the instruction\&. \fB10\fR can also be a symbolic constant, and will be treated the same way\&. On the other hand, - -.IP +stores the value 1 ten bytes after the end of the instruction. +\fB10\fR +can also be a symbolic constant, and will be treated the same way. On the other hand, +.sp +.RS 3n +.nf mov dword [symb wrt rip], 1 +.fi +.RE .PP -stores the value 1 into the address of symbol \fBsymb\fR\&. This is distinctly different than the behavior of: - -.IP +stores the value 1 into the address of symbol +\fBsymb\fR. 
This is distinctly different than the behavior of: +.sp +.RS 3n +.nf mov dword [symb+rip], 1 +.fi +.RE .PP -which takes the address of the end of the instruction, adds the address of \fBsymb\fR to it, then stores the value 1 there\&. If \fBsymb\fR is a variable, this will NOT store the value 1 into the \fBsymb\fR variable! - +which takes the address of the end of the instruction, adds the address of +\fBsymb\fR +to it, then stores the value 1 there. If +\fBsymb\fR +is a variable, this will +\fInot\fR +store the value 1 into the +\fBsymb\fR +variable! +.RE +.\" end of subsection "RIP Relative Addressing" +.\" end of SS subsection "BITS 64 Extensions" .SH "LC3B ARCHITECTURE" - .PP -The ``lc3b'' architecture supports the LC\-3b ISA as used in the ECE 312 (now ECE 411) course at the University of Illinois, Urbana\-Champaign, as well as other university courses\&. See \fIhttp://courses.ece.uiuc.edu/ece411/\fR for more details and example code\&. The ``lc3b'' architecture consists of only one machine: ``lc3b''\&. - +The +\(lqlc3b\(rq +architecture supports the LC\-3b ISA as used in the ECE 312 (now ECE 411) course at the University of Illinois, Urbana\-Champaign, as well as other university courses. See +\fI\%http://courses.ece.uiuc.edu/ece411/\fR +for more details and example code. The +\(lqlc3b\(rq +architecture consists of only one machine: +\(lqlc3b\(rq. .SH "SEE ALSO" - .PP \fByasm\fR(1) - .SH "BUGS" - .PP -When using the ``x86'' architecture, it is overly easy to generate AMD64 code (using the \fBBITS 64\fR directive) and generate a 32\-bit object file (by failing to specify \fB\-m amd64\fR on the command line)\&. Similarly, specifying \fB\-m amd64\fR does not default the BITS setting to 64\&. - -.SH AUTHOR -Peter Johnson . 
+When using the +\(lqx86\(rq +architecture, it is all too easy to generate AMD64 code (using the +\fBBITS 64\fR +directive) and generate a 32\-bit object file (by failing to specify +\fB\-m amd64\fR +on the command line or to select a 64\-bit object format). Similarly, specifying +\fB\-m amd64\fR +does not default the BITS setting to 64. An easy way to avoid this is by directly specifying a 64\-bit object format such as +\fB\-f elf64\fR. +.SH "AUTHOR" +.PP +\fBPeter\fR \fBJohnson\fR +.sp -1n +.IP "" 3n +Author. +.SH "COPYRIGHT" +Copyright \(co 2004, 2005, 2006 Peter Johnson -- 2.40.0
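The "Implicit Zero Extension" and "Immediates" subsections added by this patch rest on two pieces of arithmetic: 32\-bit immediates are sign\-extended to 64 bits, and 32\-bit results are zero\-extended into the full register. The following is a minimal Python sketch of that arithmetic only. It is not Yasm code; the helper names (sign_extend_imm32, fits_signed_imm32, write_reg32) are hypothetical and chosen for illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend_imm32(imm):
    """Sign-extend the low 32 bits of imm to a 64-bit unsigned value."""
    imm &= MASK32
    if imm & 0x80000000:
        # Bit 31 is set: fill the upper 32 bits with ones.
        imm |= MASK64 ^ MASK32
    return imm


def fits_signed_imm32(value):
    """True if value survives a round trip through a sign-extended imm32.

    This models the choice between the 5-byte and 10-byte forms of
    "mov r64, imm" described in the man page (illustrative only).
    """
    return sign_extend_imm32(value) == (value & MASK64)


def write_reg32(old64, result32):
    """64-bit register contents after a 32-bit write: upper half cleared.

    old64 is deliberately ignored, because a 32-bit operation discards
    the previous upper 32 bits of the register.
    """
    return result32 & MASK32


# "add rax, 0xffffffff" uses the same sign-extended immediate as "add rax, -1".
print(hex(sign_extend_imm32(0xFFFFFFFF)))
# "mov rcx, 0xffffffff" cannot use a sign-extended imm32, so it needs imm64.
print(fits_signed_imm32(0xFFFFFFFF))
# "mov ecx, -1" leaves rcx == 0xffffffff because the upper half is zeroed.
print(hex(write_reg32(0x1234567890ABCDEF, -1)))
```

The round\-trip check in fits_signed_imm32 is one simple way to express "fits in signed 32\-bit"; an assembler may decide this differently, for example when the operand is a symbol whose value is not yet known.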