<listitem>
- <para>Selects the target architecture. The default
- architecture is <quote>x86</quote>, which supports both
- the IA-32 and derivatives and AMD64 instruction sets. To
- print a list of available architectures to standard
- output, use <quote>help</quote> as
- <replaceable>arch</replaceable>. See <citerefentry>
- <refentrytitle>yasm_arch</refentrytitle>
- <manvolnum>7</manvolnum>
- </citerefentry> for more details.</para>
+ <para>Selects the target architecture. The default
+ architecture is <quote>x86</quote>, which supports both
+ the IA-32 and derivatives and AMD64 instruction sets. To
+ print a list of available architectures to standard
+ output, use <quote>help</quote> as
+ <replaceable>arch</replaceable>. See <citerefentry>
+ <refentrytitle>yasm_arch</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry> for more details.</para>
</listitem>
</varlistentry>
<listitem>
- <para>Selects the target machine architecture. Essentially a
- subtype of the selected architecture, the machine type
- selects between major subsets of an architecture. For
- example, for the <quote>x86</quote> architecture, the two
- available machines are <quote>x86</quote>, which is used
- for the IA-32 and derivative 32-bit instruction set, and
- <quote>amd64</quote>, which is used for the 64-bit
- instruction set. This differentiation is required to
- generate the proper object file for relocatable object
- formats such as COFF and ELF. To print a list of
- available machines for a given architecture to standard
- output, use <quote>help</quote> as
- <replaceable>machine</replaceable> and the given
- architecture using <option>-a
- <replaceable>arch</replaceable></option>. See
- <citerefentry>
- <refentrytitle>yasm_arch</refentrytitle>
- <manvolnum>7</manvolnum>
- </citerefentry> for more details.</para>
+ <para>Selects the target machine architecture. Essentially a
+ subtype of the selected architecture, the machine type
+ selects between major subsets of an architecture. For
+ example, for the <quote>x86</quote> architecture, the two
+ available machines are <quote>x86</quote>, which is used
+ for the IA-32 and derivative 32-bit instruction set, and
+ <quote>amd64</quote>, which is used for the 64-bit
+ instruction set. This differentiation is required to
+ generate the proper object file for relocatable object
+ formats such as COFF and ELF. To print a list of
+ available machines for a given architecture to standard
+ output, use <quote>help</quote> as
+ <replaceable>machine</replaceable> and the given
+ architecture using <option>-a
+ <replaceable>arch</replaceable></option>. See
+ <citerefentry>
+ <refentrytitle>yasm_arch</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry> for more details.</para>
</listitem>
</varlistentry>
<refsect1><title>Bugs</title>
<para>When using the <quote>x86</quote> architecture, it is overly easy to
- generate AMD64 code (using the <userinput>BITS 64</userinput>
- directive) and generate a 32-bit object file (by failing to specify
- <option>-m amd64</option> on the command line). Similarly, specifying
- <option>-m amd64</option> does not default the BITS setting to
- 64.</para>
+ generate AMD64 code (using the <userinput>BITS 64</userinput>
+ directive) and generate a 32-bit object file (by failing to specify
+ <option>-m amd64</option> on the command line). Similarly, specifying
+ <option>-m amd64</option> does not default the BITS setting to
+ 64.</para>
</refsect1>
.SH "X86 ARCHITECTURE"
.PP
-The ``x86'' architecture supports the IA\-32 instruction set and derivatives and the AMD64 instruction set\&. It consists of two machines: ``x86'' (for the IA\-32 and derivatives) and``amd64'' (for the AMD64 and derivatives)\&. The default machine for the ``x86'' architecture is the``x86'' machine\&.
+The ``x86'' architecture supports the IA\-32 instruction set and derivatives and the AMD64 instruction set\&. It consists of two machines: ``x86'' (for the IA\-32 and derivatives) and ``amd64'' (for the AMD64 and derivatives)\&. The default machine for the ``x86'' architecture is the ``x86'' machine\&.
+
+.SS "BITS Setting"
+
+.PP
+The x86 architecture BITS setting specifies to YASM the processor mode in which the generated code is intended to execute\&. x86 processors can run in three different major execution modes: 16\-bit, 32\-bit, and on AMD64\-supporting processors, 64\-bit\&. As the x86 instruction set contains portions whose function is execution\-mode dependent (such as operand\-size and address\-size override prefixes), YASM cannot assemble x86 instructions correctly unless it is told by the user in what processor mode the code will execute\&.
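.PP
As a rough illustration of why the BITS setting matters, the following Python sketch models which prefix selects a given operand size in each mode. The name operand_size_prefixes is hypothetical and is not part of YASM. The model is deliberately simplified: it ignores 8-bit operands (which are selected by the opcode) and instructions whose operand size defaults to 64 bits in 64-bit mode.

```python
# Default operand size implied by each BITS setting.
DEFAULT_OPERAND_SIZE = {16: 16, 32: 32, 64: 32}


def operand_size_prefixes(bits, operand_size):
    """Return the prefixes needed to select operand_size under a BITS setting."""
    if bits not in DEFAULT_OPERAND_SIZE:
        raise ValueError("BITS must be 16, 32, or 64")
    if operand_size == 64:
        if bits != 64:
            raise ValueError("64-bit operands require BITS 64")
        return ["REX.W"]  # one bit of the REX prefix selects 64-bit operands
    if operand_size not in (16, 32):
        raise ValueError("this sketch only models 16-, 32-, and 64-bit operands")
    if operand_size == DEFAULT_OPERAND_SIZE[bits]:
        return []  # matches the mode default, no prefix needed
    return ["0x66"]  # operand-size override prefix


if __name__ == "__main__":
    for bits in (16, 32, 64):
        print(bits, operand_size_prefixes(bits, 32))
```

The same 32-bit operand needs the override prefix under BITS 16 but no prefix under BITS 32 or BITS 64, which is why the assembler cannot choose an encoding without knowing the mode.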
+
+.PP
+The BITS setting can be changed in a variety of ways\&. When using the NASM\-compatible parser, the BITS setting can be changed directly via the use of the \fBBITS xx\fR assembler directive\&. The default BITS setting is determined by the object format in use\&.
+
+.SS "BITS 64 Extensions"
+
+.PP
+When an AMD64\-supporting processor is executing in 64\-bit mode, a number of additional extensions are available, including extra general purpose registers, extra SSE2 registers, and RIP\-relative addressing\&.
+
+.PP
+The additional 64\-bit general purpose registers are named r8\-r15\&. There are also 8\-bit (rXb), 16\-bit (rXw), and 32\-bit (rXd) subregisters that map to the least significant 8, 16, or 32 bits of the 64\-bit register\&. The original 8 general purpose registers have also been extended to 64 bits: eax, edx, ecx, ebx, esi, edi, esp, and ebp have new 64\-bit versions called rax, rdx, rcx, rbx, rsi, rdi, rsp, and rbp respectively\&. The old 32\-bit registers map to the least significant bits of the new 64\-bit registers\&.
+
+.PP
+New 8\-bit registers are also available that map to the 8 least significant bits of rsi, rdi, rsp, and rbp\&. These are called sil, dil, spl, and bpl respectively\&. Unfortunately, due to the way instructions are encoded, these new 8\-bit registers are encoded the same as the old 8\-bit registers ah, dh, ch, and bh\&. The processor tells which is being used by the presence of the new REX prefix that is used to specify the other extended registers\&. This means it is illegal to mix the use of ah, dh, ch, and bh with an instruction that requires the REX prefix for other reasons\&. For instance:
+
+.IP
+add ah, [r10]
+.PP
+(NASM syntax) is not a legal instruction because the use of r10 requires a REX prefix, making it impossible to use ah\&.
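.PP
The restriction above can be sketched as a simple name-based check. This is a hypothetical helper written in Python, not YASM's actual implementation, and it is conservative: it only treats the extended registers as forcing a REX prefix and does not model other reasons a REX prefix may be required (such as a 64-bit operand size).

```python
# Legacy high-byte registers; these cannot be encoded when a REX prefix is present.
HIGH_BYTE_REGS = {"ah", "bh", "ch", "dh"}

# Registers that can only be encoded with a REX prefix (assumed set for this sketch).
REX_ONLY_REGS = {"sil", "dil", "spl", "bpl"}
for n in range(8, 16):
    REX_ONLY_REGS.update(
        {f"r{n}", f"r{n}b", f"r{n}w", f"r{n}d", f"xmm{n}"}
    )


def rex_conflict(registers):
    """Return True if the operands mix a high-byte register with a REX-only one."""
    names = {name.lower() for name in registers}
    uses_high_byte = bool(names & HIGH_BYTE_REGS)
    needs_rex = bool(names & REX_ONLY_REGS)
    return uses_high_byte and needs_rex


if __name__ == "__main__":
    print(rex_conflict(["ah", "r10"]))  # registers of: add ah, [r10]
    print(rex_conflict(["al", "r10"]))  # registers of: add al, [r10]
```

Under these assumptions the first call reports a conflict and the second does not, matching the example above.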
+
+.PP
+In 64\-bit mode, an additional 8 SSE2 registers are also available\&. These are named xmm8\-xmm15\&.
+
+.PP
+By default, most operations in 64\-bit mode remain 32\-bit; operations that are 64\-bit usually require a REX prefix (one bit in the REX prefix determines whether an operation is 64\-bit or 32\-bit)\&. Thus, essentially all 32\-bit instructions have a 64\-bit version, and the 64\-bit versions of instructions can use extended registers ``for free'' (as the REX prefix is already present)\&. Examples in NASM syntax:
+
+.IP
+mov eax, 1 ; 32\-bit instruction
+.IP
+mov rcx, 1 ; 64\-bit instruction
+.PP
+Instructions that modify the stack (push, pop, call, ret, enter, and leave) are implicitly 64\-bit\&. Their 32\-bit counterparts are not available, but their 16\-bit counterparts are\&. Examples in NASM syntax:
+
+.IP
+push eax ; illegal instruction
+.IP
+push rbx ; 1\-byte instruction
+.IP
+push r11 ; 2\-byte instruction with REX prefix
+.PP
+Results of 32\-bit operations are implicitly zero\-extended to the upper 32 bits of the corresponding 64\-bit register\&. 16\-bit and 8\-bit operations, on the other hand, do not affect upper bits of the register (just as in 32\-bit and 16\-bit modes)\&. This can be used to generate smaller code in some instances\&. Examples in NASM syntax:
+
+.IP
+mov ecx, 1 ; 1 byte shorter than mov rcx, 1
+.IP
+and edx, 3 ; equivalent to and rdx, 3
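.PP
The zero\-extension rule can be modeled in a few lines of Python. This is only a sketch of the register\-write semantics described above; write_reg is a hypothetical name, and the model covers only writes to the low 8, 16, 32, or 64 bits (it does not model the high\-byte registers ah, bh, ch, and dh).

```python
MASK64 = (1 << 64) - 1


def write_reg(old64, value, width):
    """Return the new 64-bit register contents after a write of `width` bits."""
    if width == 64:
        return value & MASK64
    if width == 32:
        # 32-bit results are zero-extended: the upper 32 bits are cleared.
        return value & 0xFFFFFFFF
    if width in (8, 16):
        # 8- and 16-bit writes leave the remaining upper bits untouched.
        mask = (1 << width) - 1
        return (old64 & ~mask & MASK64) | (value & mask)
    raise ValueError("width must be 8, 16, 32, or 64")


if __name__ == "__main__":
    old = 0x1122334455667788
    print(hex(write_reg(old, 1, 32)))  # like: mov ecx, 1
    print(hex(write_reg(old, 1, 16)))  # like: mov cx, 1
```

Because a 32\-bit write already clears the upper half, the 32\-bit form of an instruction can often stand in for the longer 64\-bit form when the value fits in 32 bits.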
+.PP
+For most instructions in 64\-bit mode, immediate values remain 32 bits; their value is sign\-extended into the upper 32 bits of the target register prior to being used\&. The exception is the mov instruction, which can take a 64\-bit immediate when the destination is a 64\-bit register\&. Examples in NASM syntax:
+
+.IP
+add rax, 1 ; legal
+.IP
+add rax, 0xffffffff ; sign\-extended
+.IP
+add rax, \-1 ; same as above
+.IP
+add rax, 0xffffffffffffffff ; warning (>32 bit)
+.IP
+mov eax, 1 ; 5 byte instruction
+.IP
+mov rax, 1 ; 10 byte instruction
+.IP
+mov rbx, 0x1234567890abcdef ; 10 byte instruction
+.IP
+mov rcx, 0xffffffff ; 10 byte instruction
+.IP
+mov ecx, \-1 ; 5 byte instruction equivalent to above
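.PP
To see why the second and third add examples behave identically, the sign extension of a 32\-bit immediate can be sketched in Python. The helper names sign_extend_imm32 and add64 are hypothetical and exist only for this illustration.

```python
MASK64 = (1 << 64) - 1


def sign_extend_imm32(imm):
    """Sign-extend the low 32 bits of imm to a 64-bit value."""
    imm &= 0xFFFFFFFF
    if imm & 0x80000000:
        return imm | 0xFFFFFFFF00000000
    return imm


def add64(reg, imm):
    """Model a 64-bit add with a 32-bit immediate: the immediate is
    sign-extended to 64 bits before the addition."""
    return (reg + sign_extend_imm32(imm)) & MASK64


if __name__ == "__main__":
    print(hex(sign_extend_imm32(0xFFFFFFFF)))
    print(add64(5, 0xFFFFFFFF), add64(5, -1))
```

In this model the immediate 0xffffffff sign\-extends to all ones, so adding it to a 64\-bit register has the same effect as adding \-1.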
+.PP
+Just like immediates, displacements, for the most part, remain 32 bits and are sign\-extended prior to use\&. Again, the exception is one restricted form of the mov instruction: between the al/ax/eax/rax register and a 64\-bit absolute address (no registers allowed in the effective address)\&. In NASM syntax, use of the 64\-bit absolute form requires \fB[qword]\fR\&. Examples in NASM syntax:
+
+.IP
+mov eax, [1] ; 32 bit, with sign extension
+.IP
+mov al, [rax\-1] ; 32 bit, with sign extension
+.IP
+mov al, [qword 0x1122334455667788] ; 64\-bit absolute
+.IP
+mov al, [0x1122334455667788] ; truncated to 32\-bit (warning)
+.PP
+In 64\-bit mode, a new form of effective addressing is available to make it easier to write position\-independent code\&. Any memory reference may be made RIP relative (RIP is the instruction pointer register, which contains the address of the location immediately following the current instruction)\&.
+
+.PP
+In NASM syntax, there are two ways to specify RIP\-relative addressing:
+
+.IP
+mov dword [rip+10], 1
+.PP
+stores the value 1 ten bytes after the end of the instruction\&. \fB10\fR can also be a symbolic constant, and will be treated the same way\&. On the other hand,
+
+.IP
+mov dword [symb wrt rip], 1
+.PP
+stores the value 1 into the address of symbol \fBsymb\fR\&. This is distinctly different from the behavior of:
+
+.IP
+mov dword [symb+rip], 1
+.PP
+which takes the address of the end of the instruction, adds the address of \fBsymb\fR to it, then stores the value 1 there\&. If \fBsymb\fR is a variable, this will NOT store the value 1 into the \fBsymb\fR variable!
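.PP
The difference between the three forms can be checked with a small address calculation. The Python sketch below is illustrative only: the helper names, the instruction address 0x1000, the instruction length of 10 bytes, and the symbol address 0x2000 are all assumed values chosen for the example.

```python
MASK64 = (1 << 64) - 1


def rip_relative_target(insn_addr, insn_len, disp):
    """Effective address = address of the next instruction + displacement."""
    next_insn = insn_addr + insn_len  # value of RIP during execution
    return (next_insn + disp) & MASK64


def disp_for_symbol(insn_addr, insn_len, symb_addr):
    """Displacement that [symb wrt rip] must encode to reach symb_addr."""
    return symb_addr - (insn_addr + insn_len)


if __name__ == "__main__":
    insn_addr, insn_len, symb = 0x1000, 10, 0x2000  # assumed example values

    # [rip+10]: ten bytes past the end of the instruction.
    print(hex(rip_relative_target(insn_addr, insn_len, 10)))

    # [symb wrt rip]: the displacement is adjusted so the target is symb.
    disp = disp_for_symbol(insn_addr, insn_len, symb)
    print(hex(rip_relative_target(insn_addr, insn_len, disp)))

    # [symb+rip]: the address of symb is used directly as the displacement.
    print(hex(rip_relative_target(insn_addr, insn_len, symb)))
```

With these assumed values, only the wrt rip form lands on the address of symb; the symb+rip form lands well past it.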
.SH "LC3B ARCHITECTURE"
.SH "BUGS"
.PP
-When using the ``x86'' architecture, it is overly easy to generate AMD64 code (using the \fBBITS 64\fR directive) and generate a 32\-bit object file (by failing to specify\fB\-m amd64\fR on the command line)\&. Similarly, specifying\fB\-m amd64\fR does not default the BITS setting to 64\&.
+When using the ``x86'' architecture, it is overly easy to generate AMD64 code (using the \fBBITS 64\fR directive) and generate a 32\-bit object file (by failing to specify \fB\-m amd64\fR on the command line)\&. Similarly, specifying \fB\-m amd64\fR does not default the BITS setting to 64\&.
.SH AUTHOR
Peter Johnson <peter@tortall\&.net>.
<refsect1><title>Description</title>
<para>The standard YASM distribution includes a number of loadable modules
- for different target architectures. Additional target architectures
- may be installed as third-party modules. Each target architecture can
- support one or more machine architectures.</para>
+ for different target architectures. Additional target architectures
+ may be installed as third-party modules. Each target architecture can
+ support one or more machine architectures.</para>
<para>The architecture and machine are selected on the <citerefentry>
- <refentrytitle>yasm</refentrytitle> <manvolnum>1</manvolnum>
- </citerefentry> command line by use of the <option>-a
- <replaceable>arch</replaceable></option> and <option>-m
- <replaceable>machine</replaceable></option> command line options,
- respectively.</para>
+ <refentrytitle>yasm</refentrytitle> <manvolnum>1</manvolnum>
+ </citerefentry> command line by use of the <option>-a
+ <replaceable>arch</replaceable></option> and <option>-m
+ <replaceable>machine</replaceable></option> command line options,
+ respectively.</para>
</refsect1>
<refsect1><title>x86 Architecture</title>
<para>The <quote>x86</quote> architecture supports the IA-32 instruction
- set and derivatives and the AMD64 instruction set. It consists of two
- machines: <quote>x86</quote> (for the IA-32 and derivatives) and
- <quote>amd64</quote> (for the AMD64 and derivatives). The default
- machine for the <quote>x86</quote> architecture is the
- <quote>x86</quote> machine.</para>
+ set and derivatives and the AMD64 instruction set. It consists of two
+ machines: <quote>x86</quote> (for the IA-32 and derivatives) and
+ <quote>amd64</quote> (for the AMD64 and derivatives). The default
+ machine for the <quote>x86</quote> architecture is the
+ <quote>x86</quote> machine.</para>
+ <refsect2><title>BITS Setting</title>
+
+ <para>The x86 architecture BITS setting specifies to YASM the
+ processor mode in which the generated code is intended to execute.
+ x86 processors can run in three different major execution modes:
+ 16-bit, 32-bit, and on AMD64-supporting processors, 64-bit. As
+ the x86 instruction set contains portions whose function is
+ execution-mode dependent (such as operand-size and address-size
+ override prefixes), YASM cannot assemble x86 instructions
+ correctly unless it is told by the user in what processor mode the
+ code will execute.</para>
+
+ <para>The BITS setting can be changed in a variety of ways. When
+ using the NASM-compatible parser, the BITS setting can be changed
+ directly via the use of the <userinput>BITS xx</userinput>
+ assembler directive. The default BITS setting is determined by
+ the object format in use.</para>
+
+ </refsect2>
+
+ <refsect2><title>BITS 64 Extensions</title>
+
+ <para>When an AMD64-supporting processor is executing in 64-bit mode,
+ a number of additional extensions are available, including extra
+ general purpose registers, extra SSE2 registers, and RIP-relative
+ addressing.</para>
+
+ <refsect3><title>Register Changes</title>
+
+ <para>The additional 64-bit general purpose registers are named
+ r8-r15. There are also 8-bit (rXb), 16-bit (rXw), and 32-bit
+ (rXd) subregisters that map to the least significant 8, 16, or
+ 32 bits of the 64-bit register. The original 8 general
+ purpose registers have also been extended to 64 bits: eax,
+ edx, ecx, ebx, esi, edi, esp, and ebp have new 64-bit versions
+ called rax, rdx, rcx, rbx, rsi, rdi, rsp, and rbp
+ respectively. The old 32-bit registers map to the least
+ significant bits of the new 64-bit registers.</para>
+
+ <para>New 8-bit registers are also available that map to the 8
+ least significant bits of rsi, rdi, rsp, and rbp. These are
+ called sil, dil, spl, and bpl respectively. Unfortunately,
+ due to the way instructions are encoded, these new 8-bit
+ registers are encoded the same as the old 8-bit registers ah,
+ dh, ch, and bh. The processor tells which is being used by
+ the presence of the new REX prefix that is used to specify the
+ other extended registers. This means it is illegal to mix the
+ use of ah, dh, ch, and bh with an instruction that requires
+ the REX prefix for other reasons. For instance:</para>
+
+ <screen>add ah, [r10]</screen>
+
+ <para>(NASM syntax) is not a legal instruction because the use of
+ r10 requires a REX prefix, making it impossible to use
+ ah.</para>
+
+ <para>In 64-bit mode, an additional 8 SSE2 registers are also
+ available. These are named xmm8-xmm15.</para>
+
+ </refsect3>
+
+ <refsect3><title>64 Bit Instructions</title>
+
+ <para>By default, most operations in 64-bit mode remain 32-bit;
+ operations that are 64-bit usually require a REX prefix (one
+ bit in the REX prefix determines whether an operation is
+ 64-bit or 32-bit). Thus, essentially all 32-bit instructions
+ have a 64-bit version, and the 64-bit versions of instructions
+ can use extended registers <quote>for free</quote> (as the REX
+ prefix is already present). Examples in NASM syntax:</para>
+
+ <screen>mov eax, 1 ; 32-bit instruction</screen>
+ <screen>mov rcx, 1 ; 64-bit instruction</screen>
+
+ <para>Instructions that modify the stack (push, pop, call, ret,
+ enter, and leave) are implicitly 64-bit. Their 32-bit
+ counterparts are not available, but their 16-bit counterparts
+ are. Examples in NASM syntax:</para>
+
+ <screen>push eax ; illegal instruction</screen>
+ <screen>push rbx ; 1-byte instruction</screen>
+ <screen>push r11 ; 2-byte instruction with REX prefix</screen>
+
+ </refsect3>
+
+ <refsect3><title>Implicit Zero Extension</title>
+
+ <para>Results of 32-bit operations are implicitly zero-extended to
+ the upper 32 bits of the corresponding 64-bit register. 16-bit
+ and 8-bit operations, on the other hand, do not affect upper
+ bits of the register (just as in 32-bit and 16-bit modes).
+ This can be used to generate smaller code in some instances.
+ Examples in NASM syntax:</para>
+
+ <screen>mov ecx, 1 ; 1 byte shorter than mov rcx, 1</screen>
+ <screen>and edx, 3 ; equivalent to and rdx, 3</screen>
+
+ </refsect3>
+
+ <refsect3><title>Immediates</title>
+
+ <para>For most instructions in 64-bit mode, immediate values
+ remain 32 bits; their value is sign-extended into the upper 32
+ bits of the target register prior to being used. The
+ exception is the mov instruction, which can take a 64-bit
+ immediate when the destination is a 64-bit register. Examples
+ in NASM syntax:</para>
+
+ <screen>add rax, 1 ; legal</screen>
+ <screen>add rax, 0xffffffff ; sign-extended</screen>
+ <screen>add rax, -1 ; same as above</screen>
+ <screen>add rax, 0xffffffffffffffff ; warning (>32 bit)</screen>
+ <screen>mov eax, 1 ; 5 byte instruction</screen>
+ <screen>mov rax, 1 ; 10 byte instruction</screen>
+ <screen>mov rbx, 0x1234567890abcdef ; 10 byte instruction</screen>
+ <screen>mov rcx, 0xffffffff ; 10 byte instruction</screen>
+ <screen>mov ecx, -1 ; 5 byte instruction equivalent to above</screen>
+
+ </refsect3>
+
+ <refsect3><title>Displacements</title>
+
+ <para>Just like immediates, displacements, for the most part,
+ remain 32 bits and are sign-extended prior to use. Again, the
+ exception is one restricted form of the mov instruction:
+ between the al/ax/eax/rax register and a 64-bit absolute
+ address (no registers allowed in the effective address). In
+ NASM syntax, use of the 64-bit absolute form requires
+ <userinput>[qword]</userinput>. Examples in NASM
+ syntax:</para>
+
+ <screen>mov eax, [1] ; 32 bit, with sign extension</screen>
+ <screen>mov al, [rax-1] ; 32 bit, with sign extension</screen>
+ <screen>mov al, [qword 0x1122334455667788] ; 64-bit absolute</screen>
+ <screen>mov al, [0x1122334455667788] ; truncated to 32-bit (warning)</screen>
+
+ </refsect3>
+
+ <refsect3><title>RIP Relative Addressing</title>
+
+ <para>In 64-bit mode, a new form of effective addressing is
+ available to make it easier to write position-independent
+ code. Any memory reference may be made RIP relative (RIP is
+ the instruction pointer register, which contains the address
+ of the location immediately following the current
+ instruction).</para>
+
+ <para>In NASM syntax, there are two ways to specify RIP-relative
+ addressing:</para>
+
+ <screen>mov dword [rip+10], 1</screen>
+
+ <para>stores the value 1 ten bytes after the end of the
+ instruction. <userinput>10</userinput> can also be a symbolic
+ constant, and will be treated the same way. On the other
+ hand,</para>
+
+ <screen>mov dword [symb wrt rip], 1</screen>
+
+ <para>stores the value 1 into the address of symbol
+ <userinput>symb</userinput>. This is distinctly different
+ from the behavior of:</para>
+
+ <screen>mov dword [symb+rip], 1</screen>
+
+ <para>which takes the address of the end of the instruction, adds
+ the address of <userinput>symb</userinput> to it, then stores
+ the value 1 there. If <userinput>symb</userinput> is a
+ variable, this will NOT store the value 1 into the
+ <userinput>symb</userinput> variable!</para>
+
+ </refsect3>
+ </refsect2>
</refsect1>
<refsect1><title>lc3b Architecture</title>
<para>The <quote>lc3b</quote> architecture supports the LC-3b ISA as used
- in the ECE 312 (now ECE 411) course at the University of Illinois,
- Urbana-Champaign, as well as other university courses. See <ulink
- url="http://courses.ece.uiuc.edu/ece411/"/> for more details and
- example code. The <quote>lc3b</quote> architecture consists of only
- one machine: <quote>lc3b</quote>.</para>
+ in the ECE 312 (now ECE 411) course at the University of Illinois,
+ Urbana-Champaign, as well as other university courses. See <ulink
+ url="http://courses.ece.uiuc.edu/ece411/"/> for more details and
+ example code. The <quote>lc3b</quote> architecture consists of only
+ one machine: <quote>lc3b</quote>.</para>
</refsect1>
<refsect1><title>Bugs</title>
<para>When using the <quote>x86</quote> architecture, it is overly easy to
- generate AMD64 code (using the <userinput>BITS 64</userinput>
- directive) and generate a 32-bit object file (by failing to specify
- <option>-m amd64</option> on the command line). Similarly, specifying
- <option>-m amd64</option> does not default the BITS setting to
- 64.</para>
+ generate AMD64 code (using the <userinput>BITS 64</userinput>
+ directive) and generate a 32-bit object file (by failing to specify
+ <option>-m amd64</option> on the command line). Similarly, specifying
+ <option>-m amd64</option> does not default the BITS setting to
+ 64.</para>
</refsect1>