From: Ivan Maidanski
Date: Mon, 8 Apr 2013 19:52:44 +0000 (+0400)
Subject: Add comment about double-wide load/store on x86_64 (GCC)
X-Git-Tag: libatomic_ops-7_4_0~27
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=ce4a44ca4a1acfb2f6ee44d52993d894dffe7b6f;p=libatomic_ops

Add comment about double-wide load/store on x86_64 (GCC)

* src/atomic_ops/sysdeps/gcc/x86.h: Add comment about AO_double_load
and AO_double_store implementation in 64-bit mode; remove the
corresponding TODO item.
---

diff --git a/src/atomic_ops/sysdeps/gcc/x86.h b/src/atomic_ops/sysdeps/gcc/x86.h
index 9ce3949..62b50c9 100644
--- a/src/atomic_ops/sysdeps/gcc/x86.h
+++ b/src/atomic_ops/sysdeps/gcc/x86.h
@@ -282,7 +282,18 @@ AO_fetch_compare_and_swap_full(volatile AO_t *addr, AO_t old_val,
 }
 # define AO_HAVE_int_fetch_and_add_full
 
-/* TODO: Implement double_load/store. */
+  /* The Intel and AMD Architecture Programmer Manuals state roughly  */
+  /* the following:                                                   */
+  /* - CMPXCHG16B (with a LOCK prefix) can be used to perform 16-byte */
+  /*   atomic accesses in 64-bit mode (with certain alignment         */
+  /*   restrictions);                                                 */
+  /* - SSE instructions that access data larger than a quadword (like */
+  /*   MOVDQA) may be implemented using multiple memory accesses;     */
+  /* - LOCK prefix causes an invalid-opcode exception when used with  */
+  /*   128-bit media (SSE) instructions.                              */
+  /* Thus, currently, the only way to implement lock-free double_load */
+  /* and double_store on x86_64 is to use CMPXCHG16B (if available).  */
+  /* TODO: Test some gcc macro to detect presence of cmpxchg16b.      */
 # ifdef AO_CMPXCHG16B_AVAILABLE
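
The reasoning in the new comment (CMPXCHG16B is the only lock-free 16-byte access on x86_64) can be illustrated with a minimal sketch. This is not the code from x86.h: `pair128_t`, `pair_cas`, `pair_load` and `pair_store` are hypothetical names chosen for illustration. The load is a compare-and-swap that leaves memory unchanged, and the store is a compare-and-swap retried until it succeeds. The non-x86_64 branch is only a non-atomic placeholder so the sketch builds elsewhere. On the TODO item: GCC defines `__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16` when compiling with `-mcx16`, which is one candidate macro; the patch itself does not say which macro is intended.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-byte pair for this sketch; not the library's type. */
typedef struct {
  uint64_t lo;
  uint64_t hi;
} __attribute__((aligned(16))) pair128_t;

/* 16-byte compare-and-swap.  Returns 1 if *addr matched *expected and */
/* was replaced by desired; otherwise returns 0 and copies the value   */
/* found in memory into *expected.                                     */
static int pair_cas(volatile pair128_t *addr, pair128_t *expected,
                    pair128_t desired)
{
  /* CMPXCHG16B faults if its memory operand is not 16-byte aligned. */
  assert(((uintptr_t)addr & 15) == 0);
#if defined(__x86_64__)
  unsigned char ok;
  /* RDX:RAX holds the expected value, RCX:RBX the desired one. */
  __asm__ __volatile__("lock; cmpxchg16b %1; setz %0"
                       : "=q"(ok), "+m"(*addr),
                         "+a"(expected->lo), "+d"(expected->hi)
                       : "b"(desired.lo), "c"(desired.hi)
                       : "memory", "cc");
  return ok;
#else
  /* Placeholder so the sketch builds on other targets; NOT atomic. */
  if (addr->lo == expected->lo && addr->hi == expected->hi) {
    addr->lo = desired.lo;
    addr->hi = desired.hi;
    return 1;
  }
  expected->lo = addr->lo;
  expected->hi = addr->hi;
  return 0;
#endif
}

/* Double-wide load: swap {0,0} for {0,0}.  If memory holds {0,0} the  */
/* swap rewrites the same value; otherwise it fails and reports what   */
/* it saw.  Either way, seen ends up holding the current contents.     */
static pair128_t pair_load(volatile pair128_t *addr)
{
  pair128_t seen = {0, 0};
  (void)pair_cas(addr, &seen, seen);
  return seen;
}

/* Double-wide store: retry until the swap succeeds.  Each failed      */
/* attempt refreshes seen with the value currently in memory.          */
static void pair_store(volatile pair128_t *addr, pair128_t value)
{
  pair128_t seen = {0, 0};
  while (!pair_cas(addr, &seen, value)) {
    /* seen was updated by pair_cas; try again. */
  }
}
```

A caller would declare a `pair128_t`, pass its address to `pair_store` to publish both words at once, and read them back with `pair_load`. One consequence of this approach is that the load issues a locked write cycle, so it cannot be used on read-only memory.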