granicus.if.org Git - libx264/commit
avoid memory loads that span the border between two cachelines.
author Loren Merritt <pengvado@videolan.org>
Tue, 20 Nov 2007 06:07:17 +0000 (06:07 +0000)
committer Loren Merritt <pengvado@videolan.org>
Tue, 20 Nov 2007 06:07:17 +0000 (06:07 +0000)
commit d4ebafa5d7db55e0d21c633c25d4835c5b94e3fd
tree c0de08b8784e4f98fd97f0ac6cb30e66dd24e4be
parent 125e0a84e04d04ac2dde69e091a75295f35120bc
avoid memory loads that span the border between two cachelines.
on core2 this makes x264_pixel_sad about 2x faster on average. other intel cpus gain varying amounts. amd cpus are unaffected.
overall speedup: 1-10%, depending on how much time is spent in fullpel motion estimation.
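
for illustration only (not code from this commit): a minimal C sketch of the condition being avoided, assuming 64-byte cachelines. the helper name load_spans_cacheline and the CACHELINE_SIZE constant are hypothetical and do not appear in x264.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cacheline size in bytes. 64 is an illustrative value,
 * not one taken from the commit. */
#define CACHELINE_SIZE 64

/* Hypothetical helper: returns 1 if a load of `width` bytes starting at
 * address `addr` touches two cachelines, 0 if it fits within one.
 * Assumes 0 < width <= CACHELINE_SIZE. */
static int load_spans_cacheline(uintptr_t addr, size_t width)
{
    /* Position of the first loaded byte within its cacheline. */
    size_t offset = (size_t)(addr % CACHELINE_SIZE);

    /* The load crosses the border when its last byte falls past the
     * end of the line that holds its first byte. */
    return offset + width > CACHELINE_SIZE;
}
```

under these assumptions, a 16-byte load at offset 48 within a line fits in one cacheline, while the same load at offset 49 spans two.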

git-svn-id: svn://svn.videolan.org/x264/trunk@696 df754926-b1dd-0310-bc7b-ec298dee348c
common/amd64/amd64inc.asm
common/amd64/pixel-sse2.asm
common/frame.c
common/i386/i386inc.asm
common/i386/pixel-sse2.asm
common/i386/pixel.h
common/pixel.c
tools/checkasm.c