granicus.if.org Git - libx264/log
libx264
Loren Merritt [Thu, 20 Mar 2008 20:00:08 +0000 (14:00 -0600)]
new ssd_8x*_sse2
align ssd_16x*_sse2
unroll ssd_4x*_mmx

Manuel Rommel [Thu, 20 Mar 2008 19:21:16 +0000 (13:21 -0600)]
update altivec zigzags

Loren Merritt [Thu, 20 Mar 2008 16:41:50 +0000 (10:41 -0600)]
r768 borked cavlc

Loren Merritt [Thu, 20 Mar 2008 06:52:11 +0000 (00:52 -0600)]
cosmetics in intra predict

Fiona Glaser [Thu, 20 Mar 2008 06:31:42 +0000 (00:31 -0600)]
faster intra predict 8x8 hu/hd

Loren Merritt [Thu, 20 Mar 2008 05:43:19 +0000 (23:43 -0600)]
reduce zigzag arrays from int to int16_t

Loren Merritt [Thu, 20 Mar 2008 05:42:20 +0000 (23:42 -0600)]
reduce the size of some arrays

Fiona Glaser [Wed, 19 Mar 2008 21:01:05 +0000 (15:01 -0600)]
skip intra pred+dct+quant in cases where it's redundant (analyse vs encode)
large speedup with trellis=2, small speedup with trellis=0 and/or subme>=6

Loren Merritt [Wed, 19 Mar 2008 20:03:34 +0000 (14:03 -0600)]
cosmetics in asm

Fiona Glaser [Wed, 19 Mar 2008 20:00:34 +0000 (14:00 -0600)]
satd_4x4_ssse3

Fiona Glaser [Wed, 19 Mar 2008 19:40:41 +0000 (13:40 -0600)]
get_ref_sse2

Fiona Glaser [Wed, 19 Mar 2008 01:17:22 +0000 (19:17 -0600)]
continue instead of crash when the threading mv constraint is violated.
doesn't fix the underlying bug, but hopefully less annoying until we find it.

Loren Merritt [Wed, 19 Mar 2008 00:24:01 +0000 (18:24 -0600)]
remove remaining reference to clip1.h

Loren Merritt [Tue, 18 Mar 2008 18:34:10 +0000 (12:34 -0600)]
fix name mangling again.
apparently it's not just a convention, dll build fails if you try to export a non-prefixed name.

Gabriel Bouvigne [Mon, 17 Mar 2008 21:44:40 +0000 (15:44 -0600)]
update msvc projectfile

Loren Merritt [Mon, 17 Mar 2008 21:41:59 +0000 (15:41 -0600)]
missing #ifdef HAVE_SSE3

Loren Merritt [Mon, 17 Mar 2008 21:41:30 +0000 (15:41 -0600)]
don't define offsetof since it's standard
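
A minimal sketch of the point made in this commit: `offsetof` comes from `<stddef.h>` in standard C, so a project-local definition is unnecessary. The struct and function names below are hypothetical and not taken from x264.

```c
#include <stddef.h>  /* provides the standard offsetof macro */
#include <stdint.h>

/* Hypothetical struct, for illustration only. */
typedef struct {
    uint8_t  type;
    int16_t  mv[2];
    uint32_t cost;
} example_block_t;

/* Byte offset of the `cost` member, computed with the standard macro. */
static size_t example_cost_offset(void)
{
    return offsetof(example_block_t, cost);
}
```

The exact offset depends on the platform's padding rules, which is why the macro is preferable to a hand-computed constant.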

Loren Merritt [Mon, 17 Mar 2008 07:23:35 +0000 (01:23 -0600)]
shut up gcc warning in offsetof

Håkan Hjort [Mon, 17 Mar 2008 07:20:02 +0000 (01:20 -0600)]
increase alignment of mv arrays

Fiona Glaser [Mon, 17 Mar 2008 05:58:04 +0000 (23:58 -0600)]
memcpy_aligned_sse2

Loren Merritt [Mon, 17 Mar 2008 04:40:43 +0000 (22:40 -0600)]
checkasm check whether callee-saved regs are correctly saved
x86_32 only for now since x86_64 varargs are annoying

Loren Merritt [Mon, 17 Mar 2008 04:28:20 +0000 (22:28 -0600)]
fix x86_32 ads which failed to preserve a register

Loren Merritt [Sun, 16 Mar 2008 22:34:41 +0000 (16:34 -0600)]
fix some name mangling issues introduced by the merge

Loren Merritt [Sun, 16 Mar 2008 21:30:40 +0000 (15:30 -0600)]
remove x264_mc_clip1.
it's wrong for sufficiently perverse inputs, and clip_uint8 is faster anyway.
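
For readers unfamiliar with the helper named here, a clamp to the 8-bit pixel range can be sketched as follows. This is a plain portable version written for illustration; the name `clip_uint8_sketch` is hypothetical, and x264's own `clip_uint8` may be implemented differently.

```c
#include <stdint.h>

/* Clamp an int to the 8-bit pixel range [0, 255].
 * Unlike a lookup table, this is correct for any int input. */
static inline uint8_t clip_uint8_sketch(int x)
{
    if (x < 0)
        return 0;
    if (x > 255)
        return 255;
    return (uint8_t)x;
}
```

A comparison-based clamp has no input range outside which it misbehaves, which is the property the commit message is after.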

Loren Merritt [Sun, 16 Mar 2008 19:54:58 +0000 (13:54 -0600)]
merge x86_32 and x86_64 asm, with macros to abstract calling convention and register names

Loren Merritt [Sun, 9 Mar 2008 11:58:55 +0000 (05:58 -0600)]
git compatible version script

Loren Merritt [Mon, 3 Mar 2008 00:53:01 +0000 (17:53 -0700)]
check for broken versions of yasm

Loren Merritt [Mon, 3 Mar 2008 00:27:38 +0000 (17:27 -0700)]
increase the alignment of the i8x8 edge cache, needed for sse2 intra prediction.
patch by Alexander Strange.

Loren Merritt [Sun, 2 Mar 2008 23:12:57 +0000 (16:12 -0700)]
.gitignore

Loren Merritt [Sun, 2 Mar 2008 03:04:07 +0000 (03:04 +0000)]
pic macros now keep track of which register holds the GOT, so variable access doesn't have to care

git-svn-id: svn://svn.videolan.org/x264/trunk@745 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 2 Mar 2008 02:27:45 +0000 (02:27 +0000)]
remove x86_64 predict_8x8_ddl_mmxext because sse2 is faster even on amd

git-svn-id: svn://svn.videolan.org/x264/trunk@744 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 2 Mar 2008 02:26:00 +0000 (02:26 +0000)]
cosmetics in dsp init

git-svn-id: svn://svn.videolan.org/x264/trunk@743 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 2 Mar 2008 02:11:12 +0000 (02:11 +0000)]
sse2 16x16 intra pred.
port the remaining intra pred functions from x86_64 to x86_32.
patch by Fiona Glaser.

git-svn-id: svn://svn.videolan.org/x264/trunk@742 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sat, 1 Mar 2008 13:47:05 +0000 (13:47 +0000)]
some simplifications to mmx intra pred that should have been done way back when we switched to constant fdec_stride.
and remove pic spills in functions that have a free caller-saved reg.
patch partly by Fiona Glaser.

git-svn-id: svn://svn.videolan.org/x264/trunk@741 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sat, 1 Mar 2008 07:30:34 +0000 (07:30 +0000)]
faster array_non_zero

git-svn-id: svn://svn.videolan.org/x264/trunk@740 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sat, 1 Mar 2008 04:33:24 +0000 (04:33 +0000)]
x86_32 sse2 idct8
ported from ffmpeg by Fiona Glaser

git-svn-id: svn://svn.videolan.org/x264/trunk@739 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sat, 1 Mar 2008 04:13:55 +0000 (04:13 +0000)]
checkasm: relax the threshold for floating-point ssim

git-svn-id: svn://svn.videolan.org/x264/trunk@738 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sat, 1 Mar 2008 04:07:44 +0000 (04:07 +0000)]
checkasm: test idct with the range of coefficients that can really be encountered, as opposed to random numbers which might overflow.

git-svn-id: svn://svn.videolan.org/x264/trunk@737 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Mon, 28 Jan 2008 14:33:42 +0000 (14:33 +0000)]
intra_rd_refine in B-frames

git-svn-id: svn://svn.videolan.org/x264/trunk@736 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 16:29:54 +0000 (16:29 +0000)]
print average of macroblock QPs instead of frame's nominal QP

git-svn-id: svn://svn.videolan.org/x264/trunk@735 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 16:16:37 +0000 (16:16 +0000)]
update date

git-svn-id: svn://svn.videolan.org/x264/trunk@734 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 16:06:31 +0000 (16:06 +0000)]
remove colorspace conversion support, because it has no business in any codec

git-svn-id: svn://svn.videolan.org/x264/trunk@733 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 14:01:40 +0000 (14:01 +0000)]
misc fixes in checkasm

git-svn-id: svn://svn.videolan.org/x264/trunk@732 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 13:39:09 +0000 (13:39 +0000)]
remove a useless bit of me=umh (originally copied from JM, where it was used for something)

git-svn-id: svn://svn.videolan.org/x264/trunk@731 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 11:50:50 +0000 (11:50 +0000)]
fix a memleak in cqm

git-svn-id: svn://svn.videolan.org/x264/trunk@730 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 11:49:16 +0000 (11:49 +0000)]
fix a memleak in mkv muxer
patch by saintdev

git-svn-id: svn://svn.videolan.org/x264/trunk@729 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 11:36:11 +0000 (11:36 +0000)]
satd exhaustive motion search (--me tesa)

git-svn-id: svn://svn.videolan.org/x264/trunk@728 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 11:09:52 +0000 (11:09 +0000)]
fix cabac context for nonzero delta_qp of the 2nd mb of a frame in interlaced mode

git-svn-id: svn://svn.videolan.org/x264/trunk@727 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 10:32:36 +0000 (10:32 +0000)]
fix mapping of mvs to partitions in p4x4_chroma
patch by Noboru Asai

git-svn-id: svn://svn.videolan.org/x264/trunk@726 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 10:12:24 +0000 (10:12 +0000)]
fix mvp for b16x8 and b8x16 L1 search
patch by Wei-Yin Chen

git-svn-id: svn://svn.videolan.org/x264/trunk@725 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 10:05:20 +0000 (10:05 +0000)]
shave a couple cycles off cabac functions

git-svn-id: svn://svn.videolan.org/x264/trunk@724 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 09:12:39 +0000 (09:12 +0000)]
faster and smaller x264_macroblock_cache_mv etc

git-svn-id: svn://svn.videolan.org/x264/trunk@723 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 27 Jan 2008 09:11:01 +0000 (09:11 +0000)]
configure test for endianness

git-svn-id: svn://svn.videolan.org/x264/trunk@722 df754926-b1dd-0310-bc7b-ec298dee348c
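
The commit does not show how the configure script performs its check. As a rough runtime analogue (an assumption, not the script's actual method), byte order can be probed in C by inspecting the first byte of a known 32-bit value. The function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian host, 0 otherwise.
 * Copies the lowest-addressed byte of a known 32-bit pattern and
 * checks whether it is the most significant byte. */
static int example_is_big_endian(void)
{
    const uint32_t probe = 0x01020304u;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 0x01;
}
```

A build-time test is usually preferable because it lets the compiler select the right code path without a runtime branch.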

Loren Merritt [Fri, 18 Jan 2008 00:42:38 +0000 (00:42 +0000)]
change the meaning of --ref: it now selects DPB size (including B-frames), rather than L0 size (which B-frames are added to)

git-svn-id: svn://svn.videolan.org/x264/trunk@721 df754926-b1dd-0310-bc7b-ec298dee348c

Guillaume Poirier [Mon, 14 Jan 2008 09:54:33 +0000 (09:54 +0000)]
add / fix support for FreeBSD, based on a patch by Igor Mozolevsky % igor A hybrid-lab P co P uk %

git-svn-id: svn://svn.videolan.org/x264/trunk@720 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Wed, 9 Jan 2008 11:25:09 +0000 (11:25 +0000)]
shut up some valgrind warnings

git-svn-id: svn://svn.videolan.org/x264/trunk@719 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Tue, 8 Jan 2008 18:10:51 +0000 (18:10 +0000)]
slightly wrong memory allocation in r717, fixes a potential crash with merange>32

git-svn-id: svn://svn.videolan.org/x264/trunk@718 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 6 Jan 2008 08:15:04 +0000 (08:15 +0000)]
convert absolute difference of sums from mmx to sse2
convert mv bits cost and ads threshold from C to sse2
convert bytemask-to-list from C to scalar asm
1.6x faster me=esa (x86_64) or 1.3x faster (x86_32). (times consider only motion estimation. overall encode speedup may vary.)

git-svn-id: svn://svn.videolan.org/x264/trunk@717 df754926-b1dd-0310-bc7b-ec298dee348c
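
A scalar sketch of what the "absolute difference of sums" step computes, for orientation only. It assumes one precomputed pixel sum per candidate position and a per-candidate mv bit cost; the function and parameter names are hypothetical, and the sse2 code in this revision builds a bytemask first and converts it to a list in a separate step rather than fusing the two as done here.

```c
#include <stdint.h>
#include <stdlib.h>

/* For each of `width` candidate positions, compare the source block's
 * pixel sum (enc_sum) with the precomputed sum of the reference block
 * (ref_sums[i]), add the mv bit cost, and keep the candidate only if
 * the total is below `thresh`. Surviving indices are written to `out`;
 * the return value is how many survived. */
static int ads1_sketch(int enc_sum, const uint16_t *ref_sums,
                       const uint16_t *cost_mvx, int16_t *out,
                       int width, int thresh)
{
    int n = 0;
    for (int i = 0; i < width; i++) {
        int ads = abs(enc_sum - ref_sums[i]) + cost_mvx[i];
        if (ads < thresh)
            out[n++] = (int16_t)i;
    }
    return n;
}
```

Because the difference of sums is a lower bound on the SAD, candidates rejected here can never beat the current best, so the full SAD only has to be computed for the survivors.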

Loren Merritt [Sun, 6 Jan 2008 08:06:36 +0000 (08:06 +0000)]
round esa range to a multiple of 4

git-svn-id: svn://svn.videolan.org/x264/trunk@716 df754926-b1dd-0310-bc7b-ec298dee348c

Guillaume Poirier [Thu, 3 Jan 2008 22:24:38 +0000 (22:24 +0000)]
use define _WIN32 instead of __WIN32__ or WIN32 defines.
MSDN reference: http://msdn2.microsoft.com/en-us/library/b0084kay(VS.80).aspx
Patch by BugMaster %BugMaster A narod P ru%
Original thread:
date: Dec 27, 2007 3:18 AM
subject: [x264-devel] VS2008 compilation error (need of replacement __WIN32__ with _WIN32)

git-svn-id: svn://svn.videolan.org/x264/trunk@715 df754926-b1dd-0310-bc7b-ec298dee348c
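
A short sketch of the pattern this change standardizes on. `_WIN32` is predefined by compilers targeting Windows, so no project-specific define is needed; the macro and function names below are hypothetical examples, not code from x264.

```c
/* _WIN32 is predefined by compilers targeting Windows (32- and 64-bit),
 * so it can be tested directly instead of WIN32 or __WIN32__. */
#ifdef _WIN32
#define EXAMPLE_PATH_SEPARATOR '\\'
#else
#define EXAMPLE_PATH_SEPARATOR '/'
#endif

/* Returns the path separator selected at compile time. */
static char example_path_separator(void)
{
    return EXAMPLE_PATH_SEPARATOR;
}
```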

Loren Merritt [Fri, 21 Dec 2007 01:57:14 +0000 (01:57 +0000)]
tweak x264_pixel_sad_x4_16x16_sse2 horizontal sum. 168 -> 166 cycles on core2.

git-svn-id: svn://svn.videolan.org/x264/trunk@714 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Thu, 20 Dec 2007 19:24:17 +0000 (19:24 +0000)]
fix a nondeterminism involving 8x8dct, rdo, and threads.

git-svn-id: svn://svn.videolan.org/x264/trunk@713 df754926-b1dd-0310-bc7b-ec298dee348c

Guillaume Poirier [Thu, 13 Dec 2007 15:43:41 +0000 (15:43 +0000)]
also test arch-specific x264_zigzag_* implementations in checkasm.c
Patch by Noboru Asai % noboru P asai A gmail P com%

git-svn-id: svn://svn.videolan.org/x264/trunk@712 df754926-b1dd-0310-bc7b-ec298dee348c

Guillaume Poirier [Mon, 10 Dec 2007 22:09:13 +0000 (22:09 +0000)]
Add AltiVec implementation of
- x264_zigzag_scan_4x4_frame_altivec()
- x264_zigzag_scan_4x4ac_frame_altivec()
- x264_zigzag_scan_4x4_field_altivec()
- x264_zigzag_scan_4x4ac_field_altivec()
each around 1.3 to 1.8x faster than C version
Patch by Noboru Asai % noboru P asai A gmail P com%

git-svn-id: svn://svn.videolan.org/x264/trunk@711 df754926-b1dd-0310-bc7b-ec298dee348c

Guillaume Poirier [Sun, 9 Dec 2007 15:50:52 +0000 (15:50 +0000)]
adds AltiVec implementation of predict_16x16_p()
over 4x faster than C version

git-svn-id: svn://svn.videolan.org/x264/trunk@710 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Tue, 4 Dec 2007 21:56:18 +0000 (21:56 +0000)]
revert the x86_32 part of r708. elf shared libraries aren't important enough to be worth the extra lines of code to check for nasm.

git-svn-id: svn://svn.videolan.org/x264/trunk@709 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Mon, 3 Dec 2007 01:17:23 +0000 (01:17 +0000)]
mark asm functions as hidden

git-svn-id: svn://svn.videolan.org/x264/trunk@708 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Mon, 3 Dec 2007 01:16:57 +0000 (01:16 +0000)]
check whether ld supports -Bsymbolic before using it

git-svn-id: svn://svn.videolan.org/x264/trunk@707 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 2 Dec 2007 15:57:43 +0000 (15:57 +0000)]
reduce the data type used in some tables. 16KB smaller exe.

git-svn-id: svn://svn.videolan.org/x264/trunk@706 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sat, 1 Dec 2007 18:03:16 +0000 (18:03 +0000)]
faster removal of duplicate mv predictors

git-svn-id: svn://svn.videolan.org/x264/trunk@705 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sat, 1 Dec 2007 15:17:19 +0000 (15:17 +0000)]
avoid a division in x264_mb_predict_mv_ref16x16.
patch by Fiona Glaser.

git-svn-id: svn://svn.videolan.org/x264/trunk@704 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sat, 1 Dec 2007 02:58:34 +0000 (02:58 +0000)]
avoid a division in umh.
patch by Fiona Glaser.

git-svn-id: svn://svn.videolan.org/x264/trunk@703 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Mon, 26 Nov 2007 11:44:37 +0000 (11:44 +0000)]
fix a memleak in h->mb.mvr

git-svn-id: svn://svn.videolan.org/x264/trunk@702 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 25 Nov 2007 12:38:19 +0000 (12:38 +0000)]
fix compilation as a shared library on x86_64 (regression in r696)

git-svn-id: svn://svn.videolan.org/x264/trunk@701 df754926-b1dd-0310-bc7b-ec298dee348c

Guillaume Poirier [Wed, 21 Nov 2007 18:30:49 +0000 (18:30 +0000)]
add support for x86_64 on Darwin9.0 (Mac OS X 10.5, aka Leopard)
Patch by Antoine Gerschenfeld %gerschen A clipper P ens P fr%

git-svn-id: svn://svn.videolan.org/x264/trunk@700 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Wed, 21 Nov 2007 11:52:19 +0000 (11:52 +0000)]
cover some more options in fprofile. (esa, bime, cqm, nr, no-dct-decimate, trellis2)
previously, esa was slower with fprofile than without, since gcc thought it wasn't important. now esa benefits like anything else.

git-svn-id: svn://svn.videolan.org/x264/trunk@699 df754926-b1dd-0310-bc7b-ec298dee348c

Guillaume Poirier [Tue, 20 Nov 2007 18:22:03 +0000 (18:22 +0000)]
Add AltiVec implementation of x264_pixel_ssd_8x8, 3x faster than C version
Overall speed-up: 0.7% with  --bframes 3 --ref 5 -m 7 --b-rdo
Patch by Noboru Asai %noboru P asai A gmail P com%

git-svn-id: svn://svn.videolan.org/x264/trunk@698 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Tue, 20 Nov 2007 08:53:26 +0000 (08:53 +0000)]
limit mvs to [-512,511.75] instead of [-512,512]

git-svn-id: svn://svn.videolan.org/x264/trunk@697 df754926-b1dd-0310-bc7b-ec298dee348c
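
To make the arithmetic concrete: assuming motion vectors are stored in quarter-pixel units, as in H.264, the new range [-512, 511.75] corresponds to integers -2048..2047. The constants and function below are hypothetical names used only for this sketch; the commit does not state where in x264 the limit is applied.

```c
#define EXAMPLE_QPEL_PER_PIXEL 4
#define EXAMPLE_MV_MIN_QPEL (-512 * EXAMPLE_QPEL_PER_PIXEL)     /* -2048 = -512.00 px */
#define EXAMPLE_MV_MAX_QPEL ( 512 * EXAMPLE_QPEL_PER_PIXEL - 1) /*  2047 = +511.75 px */

/* Clamp one motion vector component (in quarter-pixel units)
 * to the inclusive range [-2048, 2047]. */
static int example_clamp_mv_qpel(int mv)
{
    if (mv < EXAMPLE_MV_MIN_QPEL)
        return EXAMPLE_MV_MIN_QPEL;
    if (mv > EXAMPLE_MV_MAX_QPEL)
        return EXAMPLE_MV_MAX_QPEL;
    return mv;
}
```

With the old upper bound of 512 (2048 in quarter-pixel units) the range was not symmetric with a 12-bit signed integer; the new bound is.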

Loren Merritt [Tue, 20 Nov 2007 06:07:17 +0000 (06:07 +0000)]
avoid memory loads that span the border between two cachelines.
on core2 this makes x264_pixel_sad an average of 2x faster. other intel cpus gain various amounts. amd are unaffected.
overall speedup: 1-10%, depending on how much time is spent in fullpel motion estimation.

git-svn-id: svn://svn.videolan.org/x264/trunk@696 df754926-b1dd-0310-bc7b-ec298dee348c
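
As an illustration of the condition being avoided, the check below reports whether a load of a given size starting at a given address touches two cachelines. The function name is hypothetical, and the asm in this revision selects among specialized routines rather than calling a helper like this; the sketch only shows the address arithmetic.

```c
#include <stdint.h>

/* Returns nonzero if a load of `size` bytes starting at `addr` touches
 * two cachelines of `line` bytes each. `line` must be a power of two. */
static int example_load_spans_cachelines(uintptr_t addr, unsigned size,
                                         unsigned line)
{
    return (addr & (line - 1)) + size > line;
}
```

For example, with 64-byte cachelines a 16-byte load at offset 48 within a line still fits, while the same load at offset 49 straddles the border.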

Loren Merritt [Tue, 20 Nov 2007 05:57:29 +0000 (05:57 +0000)]
add cache info to cpu_detect. also print sse3.

git-svn-id: svn://svn.videolan.org/x264/trunk@695 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Mon, 19 Nov 2007 17:10:57 +0000 (17:10 +0000)]
cosmetics: reorder mc_luma/mc_chroma/get_ref arguments for consistency with other functions

git-svn-id: svn://svn.videolan.org/x264/trunk@694 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Mon, 19 Nov 2007 17:08:07 +0000 (17:08 +0000)]
separate pixel_avg into cases for mc and for bipred

git-svn-id: svn://svn.videolan.org/x264/trunk@693 df754926-b1dd-0310-bc7b-ec298dee348c

Guillaume Poirier [Sun, 18 Nov 2007 23:58:18 +0000 (23:58 +0000)]
add AltiVec implementation of ssim_4x4x2_core, about 4x faster than C version.
Overall: 0.1-0.2% faster with default encoding settings
Patch by Noboru Asai %noboru P asai A gmail P com%

git-svn-id: svn://svn.videolan.org/x264/trunk@692 df754926-b1dd-0310-bc7b-ec298dee348c

Guillaume Poirier [Sun, 18 Nov 2007 23:47:41 +0000 (23:47 +0000)]
Add AltiVec implementation of x264_hpel_filter. Provides a 10-11% overall speed-up with default encoding options
Patch by Noboru Asai %noboru P asai A gmail P com%

git-svn-id: svn://svn.videolan.org/x264/trunk@691 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sun, 18 Nov 2007 01:45:44 +0000 (01:45 +0000)]
cosmetics in dsp function selection

git-svn-id: svn://svn.videolan.org/x264/trunk@690 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sat, 17 Nov 2007 10:21:46 +0000 (10:21 +0000)]
remove sad_pde. it's been unused ever since successive elimination replaced it.

git-svn-id: svn://svn.videolan.org/x264/trunk@689 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Fri, 16 Nov 2007 10:27:14 +0000 (10:27 +0000)]
cosmetics: use symbolic constants for frame padding radius

git-svn-id: svn://svn.videolan.org/x264/trunk@688 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Fri, 16 Nov 2007 09:17:58 +0000 (09:17 +0000)]
move hpel_filter cpu detection to a function pointer like everything else

git-svn-id: svn://svn.videolan.org/x264/trunk@687 df754926-b1dd-0310-bc7b-ec298dee348c
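
The "function pointer like everything else" pattern can be sketched as below: a table of function pointers is filled once at init time according to CPU flags, and callers go through the table. All names, the flag value, and the stand-in "optimized" routine are hypothetical; this is not x264's actual dsp init code.

```c
#include <stdint.h>

typedef int (*example_sum_fn)(const uint8_t *src, int n);

/* Reference C version. */
static int example_sum_c(const uint8_t *src, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += src[i];
    return s;
}

/* Stand-in for an arch-specific version (here just unrolled C). */
static int example_sum_unrolled(const uint8_t *src, int n)
{
    int s = 0, i = 0;
    for (; i + 4 <= n; i += 4)
        s += src[i] + src[i + 1] + src[i + 2] + src[i + 3];
    for (; i < n; i++)
        s += src[i];
    return s;
}

#define EXAMPLE_CPU_FAST 0x1 /* hypothetical capability flag */

typedef struct {
    example_sum_fn sum;
} example_dsp_t;

/* Select implementations once; callers then use dsp->sum(...). */
static void example_dsp_init(unsigned cpu_flags, example_dsp_t *dsp)
{
    dsp->sum = example_sum_c;
    if (cpu_flags & EXAMPLE_CPU_FAST)
        dsp->sum = example_sum_unrolled;
}
```

Doing the selection once at init keeps CPU detection out of the hot path and lets a test harness compare each variant against the C reference.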

Loren Merritt [Thu, 15 Nov 2007 10:50:37 +0000 (10:50 +0000)]
cosmetics: use separate variables for frame width and stride

git-svn-id: svn://svn.videolan.org/x264/trunk@686 df754926-b1dd-0310-bc7b-ec298dee348c
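
A small sketch of why width and stride are worth keeping apart: rows of a padded plane are `stride` bytes apart in memory even though only `width` of them are visible pixels. The struct and accessor are hypothetical and are not x264's frame layout.

```c
#include <stdint.h>

/* Hypothetical plane descriptor: `width` is the number of visible
 * pixels per row, `stride` is the distance in bytes between rows. */
typedef struct {
    uint8_t *data;
    int width;
    int height;
    int stride;
} example_plane_t;

/* Row addressing uses stride, never width. */
static uint8_t example_get_pixel(const example_plane_t *p, int x, int y)
{
    return p->data[y * p->stride + x];
}
```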

Guillaume Poirier [Mon, 12 Nov 2007 20:36:33 +0000 (20:36 +0000)]
Add AltiVec implementation of add4x4_idct, add8x8_idct, add16x16_idct, 3.2x faster on average
1.05x faster overall with default encoding options
Patch by Noboru Asai % noboru DD asai AA gmail DD com %

git-svn-id: svn://svn.videolan.org/x264/trunk@685 df754926-b1dd-0310-bc7b-ec298dee348c

Guillaume Poirier [Mon, 12 Nov 2007 20:28:30 +0000 (20:28 +0000)]
add AltiVec implementation of dequant_4x4 and dequant_8x8, 2.8x faster than C,
 1.01x faster than previous revision with default encoding options
Patch by Noboru Asai % noboru DD asai AA gmail DD com %

git-svn-id: svn://svn.videolan.org/x264/trunk@684 df754926-b1dd-0310-bc7b-ec298dee348c

Guillaume Poirier [Mon, 12 Nov 2007 12:47:38 +0000 (12:47 +0000)]
Add AltiVec implementation of quant_2x2_dc,
fix Altivec implementation of quant_(4x4|8x8)(|_dc) wrt current C implementation
Patch by Noboru Asai % noboru DD asai AA gmail DD com %

git-svn-id: svn://svn.videolan.org/x264/trunk@683 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Thu, 1 Nov 2007 12:21:13 +0000 (12:21 +0000)]
fix a possible nondeterminism with me=umh + threads.

git-svn-id: svn://svn.videolan.org/x264/trunk@682 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Mon, 29 Oct 2007 14:48:46 +0000 (14:48 +0000)]
use hex instead of dia for rdo mv refinement. ~0.5% lower bitrate at subme=7.
patch by Fiona Glaser.

git-svn-id: svn://svn.videolan.org/x264/trunk@681 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Mon, 24 Sep 2007 13:37:44 +0000 (13:37 +0000)]
port sad_*_x3_sse2 to x86_64

git-svn-id: svn://svn.videolan.org/x264/trunk@680 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Mon, 24 Sep 2007 11:24:28 +0000 (11:24 +0000)]
don't overwrite pthread* namespace, because system headers might define those functions even if we don't want them

git-svn-id: svn://svn.videolan.org/x264/trunk@679 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Fri, 21 Sep 2007 20:20:22 +0000 (20:20 +0000)]
faster 4x4 sad

git-svn-id: svn://svn.videolan.org/x264/trunk@678 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Thu, 20 Sep 2007 08:10:45 +0000 (08:10 +0000)]
fix an arithmetic overflow in trellis at high qp.

git-svn-id: svn://svn.videolan.org/x264/trunk@677 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Sat, 15 Sep 2007 06:34:05 +0000 (06:34 +0000)]
implement multithreaded me=esa

git-svn-id: svn://svn.videolan.org/x264/trunk@676 df754926-b1dd-0310-bc7b-ec298dee348c

Loren Merritt [Wed, 12 Sep 2007 05:42:23 +0000 (05:42 +0000)]
fix some integer overflows. now vbv size can exceed 2 Gbit.

git-svn-id: svn://svn.videolan.org/x264/trunk@675 df754926-b1dd-0310-bc7b-ec298dee348c
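
The 2 Gbit figure matches the limit of a signed 32-bit integer (2^31 - 1 bits). A sketch of the kind of fix involved, assuming the buffer size is given in kbit and converted to bits; the function name is hypothetical and the commit does not show which expressions were changed.

```c
#include <stdint.h>

/* Convert a size in kbit to bits. Widening to 64 bits before the
 * multiplication keeps results above 2^31 - 1 from overflowing. */
static int64_t example_kbit_to_bits(int kbit)
{
    return (int64_t)kbit * 1000;
}
```

Without the cast, `kbit * 1000` would be evaluated in `int` and overflow for sizes above roughly 2147483 kbit.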