Correct ordering of legacy prefix and REX prefix for SSE/SSE2 instructions
in 64-bit (AMD64) mode. Intel says these bytes should not be treated as
prefixes, but AMD64 treats them as legacy prefixes, expecting them to come
before the REX byte.
For now, keep the three-byte max instruction length (although it's not truly
correct), as handling the other "3-byte" cases such as R/M spare with no EA
is probably more painful than it's worth to push down to later in the code
generation path.
Reported by: Henryk Richter <henryk.richter@comlab.uni-rostock.de>