granicus.if.org Git - libx264/log
Loren Merritt [Tue, 25 Mar 2008 01:25:19 +0000 (19:25 -0600)]
reduce the size of some cabac arrays
Fiona Glaser [Tue, 25 Mar 2008 01:21:24 +0000 (19:21 -0600)]
use cabac context transition table from trellis in normal residual coding too
Fiona Glaser [Tue, 25 Mar 2008 01:12:07 +0000 (19:12 -0600)]
rearrange cabac struct to reduce code size
Fiona Glaser [Mon, 24 Mar 2008 09:25:25 +0000 (03:25 -0600)]
higher precision RD lambda
improves quality at QP<=12.
Loren Merritt [Mon, 24 Mar 2008 07:56:31 +0000 (01:56 -0600)]
faster cabac_encode_ue_bypass
Loren Merritt [Mon, 24 Mar 2008 04:14:18 +0000 (22:14 -0600)]
cabac asm.
mostly because gcc refuses to use cmov.
28% faster than c on core2, 11% on k8, 6% on p4.
Loren Merritt [Mon, 24 Mar 2008 04:08:07 +0000 (22:08 -0600)]
cosmetics in cabac
Loren Merritt [Sun, 23 Mar 2008 02:25:06 +0000 (20:25 -0600)]
inline cabac_size_decision
Loren Merritt [Sat, 22 Mar 2008 09:25:03 +0000 (03:25 -0600)]
cosmetics in DECLARE_ALIGNED
Loren Merritt [Sat, 22 Mar 2008 09:06:18 +0000 (03:06 -0600)]
don't distinguish between luma4x4 and luma4x4ac
Loren Merritt [Sat, 22 Mar 2008 08:46:31 +0000 (02:46 -0600)]
faster lossless zigzag
Loren Merritt [Sat, 22 Mar 2008 09:14:33 +0000 (03:14 -0600)]
more alignment
Loren Merritt [Sat, 22 Mar 2008 07:49:52 +0000 (01:49 -0600)]
add tesa and lossless to fprofile
Loren Merritt [Sat, 22 Mar 2008 07:46:43 +0000 (01:46 -0600)]
cosmetics in residual_write
Loren Merritt [Sat, 22 Mar 2008 05:24:33 +0000 (23:24 -0600)]
remove unused bitstream reader
Loren Merritt [Sat, 22 Mar 2008 00:58:46 +0000 (18:58 -0600)]
cosmetics in quant asm
Loren Merritt [Sat, 22 Mar 2008 00:46:29 +0000 (18:46 -0600)]
special case dequant for flat matrix
Loren Merritt [Fri, 21 Mar 2008 06:04:46 +0000 (00:04 -0600)]
faster dequant
Loren Merritt [Fri, 21 Mar 2008 04:08:07 +0000 (22:08 -0600)]
simplify hpel_filter_c
Loren Merritt [Fri, 21 Mar 2008 01:35:54 +0000 (19:35 -0600)]
use x264_mc_copy_w16_sse2 in mc.copy, it was previously only in mc_luma
Loren Merritt [Thu, 20 Mar 2008 20:00:08 +0000 (14:00 -0600)]
new ssd_8x*_sse2
align ssd_16x*_sse2
unroll ssd_4x*_mmx
Manuel Rommel [Thu, 20 Mar 2008 19:21:16 +0000 (13:21 -0600)]
update altivec zigzags
Loren Merritt [Thu, 20 Mar 2008 16:41:50 +0000 (10:41 -0600)]
r768 borked cavlc
Loren Merritt [Thu, 20 Mar 2008 06:52:11 +0000 (00:52 -0600)]
cosmetics in intra predict
Fiona Glaser [Thu, 20 Mar 2008 06:31:42 +0000 (00:31 -0600)]
faster intra predict 8x8 hu/hd
Loren Merritt [Thu, 20 Mar 2008 05:43:19 +0000 (23:43 -0600)]
reduce zigzag arrays from int to int16_t
Loren Merritt [Thu, 20 Mar 2008 05:42:20 +0000 (23:42 -0600)]
reduce the size of some arrays
Fiona Glaser [Wed, 19 Mar 2008 21:01:05 +0000 (15:01 -0600)]
skip intra pred+dct+quant in cases where it's redundant (analyse vs encode)
large speedup with trellis=2, small speedup with trellis=0 and/or subme>=6
Loren Merritt [Wed, 19 Mar 2008 20:03:34 +0000 (14:03 -0600)]
cosmetics in asm
Fiona Glaser [Wed, 19 Mar 2008 20:00:34 +0000 (14:00 -0600)]
satd_4x4_ssse3
Fiona Glaser [Wed, 19 Mar 2008 19:40:41 +0000 (13:40 -0600)]
get_ref_sse2
Fiona Glaser [Wed, 19 Mar 2008 01:17:22 +0000 (19:17 -0600)]
continue instead of crash when the threading mv constraint is violated.
doesn't fix the underlying bug, but hopefully less annoying until we find it.
Loren Merritt [Wed, 19 Mar 2008 00:24:01 +0000 (18:24 -0600)]
remove remaining reference to clip1.h
Loren Merritt [Tue, 18 Mar 2008 18:34:10 +0000 (12:34 -0600)]
fix name mangling again.
apparently it's not just a convention, dll build fails if you try to export a non-prefixed name.
Gabriel Bouvigne [Mon, 17 Mar 2008 21:44:40 +0000 (15:44 -0600)]
update msvc projectfile
Loren Merritt [Mon, 17 Mar 2008 21:41:59 +0000 (15:41 -0600)]
missing #ifdef HAVE_SSE3
Loren Merritt [Mon, 17 Mar 2008 21:41:30 +0000 (15:41 -0600)]
don't define offsetof since it's standard
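As an illustration of the point above, standard C already supplies `offsetof` through `<stddef.h>`, so a project-local definition is unnecessary. The struct and function names below are hypothetical and are not taken from x264; this is only a minimal sketch.

```c
#include <stddef.h>  /* the C standard defines offsetof here */
#include <stdint.h>

/* Hypothetical struct for illustration only; not an x264 type. */
typedef struct
{
    uint8_t  flag;
    int16_t  mv[2];
    uint32_t cost;
} example_t;

/* Byte offset of the cost field, resolved at compile time. */
static size_t example_cost_offset( void )
{
    return offsetof( example_t, cost );
}
```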
Loren Merritt [Mon, 17 Mar 2008 07:23:35 +0000 (01:23 -0600)]
shut up gcc warning in offsetof
Håkan Hjort [Mon, 17 Mar 2008 07:20:02 +0000 (01:20 -0600)]
increase alignment of mv arrays
Fiona Glaser [Mon, 17 Mar 2008 05:58:04 +0000 (23:58 -0600)]
memcpy_aligned_sse2
Loren Merritt [Mon, 17 Mar 2008 04:40:43 +0000 (22:40 -0600)]
checkasm check whether callee-saved regs are correctly saved
x86_32 only for now since x86_64 varargs are annoying
Loren Merritt [Mon, 17 Mar 2008 04:28:20 +0000 (22:28 -0600)]
fix x86_32 ads which failed to preserve a register
Loren Merritt [Sun, 16 Mar 2008 22:34:41 +0000 (16:34 -0600)]
fix some name mangling issues introduced by the merge
Loren Merritt [Sun, 16 Mar 2008 21:30:40 +0000 (15:30 -0600)]
remove x264_mc_clip1.
it's wrong for sufficiently perverse inputs, and clip_uint8 is faster anyway.
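For reference, an arithmetic clip to the 8-bit pixel range can be sketched as below. This is a plain illustrative version that is correct for any `int` input; it is an assumption for illustration and not necessarily how x264 implements its own clip_uint8.

```c
#include <stdint.h>

/* Saturate a signed value to [0,255].
 * Illustrative only: written for clarity, not copied from x264. */
static inline uint8_t clip_uint8( int x )
{
    if( x < 0 )
        return 0;
    if( x > 255 )
        return 255;
    return (uint8_t)x;
}
```

Because the result is computed by comparison rather than by indexing, no input value can fall outside a valid range.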
Loren Merritt [Sun, 16 Mar 2008 19:54:58 +0000 (13:54 -0600)]
merge x86_32 and x86_64 asm, with macros to abstract calling convention and register names
Loren Merritt [Sun, 9 Mar 2008 11:58:55 +0000 (05:58 -0600)]
git compatible version script
Loren Merritt [Mon, 3 Mar 2008 00:53:01 +0000 (17:53 -0700)]
check for broken versions of yasm
Loren Merritt [Mon, 3 Mar 2008 00:27:38 +0000 (17:27 -0700)]
increase the alignment of the i8x8 edge cache, needed for sse2 intra prediction.
patch by Alexander Strange.
Loren Merritt [Sun, 2 Mar 2008 23:12:57 +0000 (16:12 -0700)]
.gitignore
Loren Merritt [Sun, 2 Mar 2008 03:04:07 +0000 (03:04 +0000)]
pic macros now keep track of which register holds the GOT, so variable access doesn't have to care
git-svn-id: svn://svn.videolan.org/x264/trunk@745 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 2 Mar 2008 02:27:45 +0000 (02:27 +0000)]
remove x86_64 predict_8x8_ddl_mmxext because sse2 is faster even on amd
git-svn-id: svn://svn.videolan.org/x264/trunk@744 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 2 Mar 2008 02:26:00 +0000 (02:26 +0000)]
cosmetics in dsp init
git-svn-id: svn://svn.videolan.org/x264/trunk@743 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 2 Mar 2008 02:11:12 +0000 (02:11 +0000)]
sse2 16x16 intra pred.
port the remaining intra pred functions from x86_64 to x86_32.
patch by Fiona Glaser.
git-svn-id: svn://svn.videolan.org/x264/trunk@742 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sat, 1 Mar 2008 13:47:05 +0000 (13:47 +0000)]
some simplifications to mmx intra pred that should have been done way back when we switched to constant fdec_stride.
and remove pic spills in functions that have a free caller-saved reg.
patch partly by Fiona Glaser.
git-svn-id: svn://svn.videolan.org/x264/trunk@741 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sat, 1 Mar 2008 07:30:34 +0000 (07:30 +0000)]
faster array_non_zero
git-svn-id: svn://svn.videolan.org/x264/trunk@740 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sat, 1 Mar 2008 04:33:24 +0000 (04:33 +0000)]
x86_32 sse2 idct8
ported from ffmpeg by Fiona Glaser
git-svn-id: svn://svn.videolan.org/x264/trunk@739 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sat, 1 Mar 2008 04:13:55 +0000 (04:13 +0000)]
checkasm: relax the threshold for floating-point ssim
git-svn-id: svn://svn.videolan.org/x264/trunk@738 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sat, 1 Mar 2008 04:07:44 +0000 (04:07 +0000)]
checkasm: test idct with the range of coefficients that can really be encountered, as opposed to random numbers which might overflow.
git-svn-id: svn://svn.videolan.org/x264/trunk@737 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Mon, 28 Jan 2008 14:33:42 +0000 (14:33 +0000)]
intra_rd_refine in B-frames
git-svn-id: svn://svn.videolan.org/x264/trunk@736 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 16:29:54 +0000 (16:29 +0000)]
print average of macroblock QPs instead of frame's nominal QP
git-svn-id: svn://svn.videolan.org/x264/trunk@735 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 16:16:37 +0000 (16:16 +0000)]
update date
git-svn-id: svn://svn.videolan.org/x264/trunk@734 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 16:06:31 +0000 (16:06 +0000)]
remove colorspace conversion support, because it has no business in any codec
git-svn-id: svn://svn.videolan.org/x264/trunk@733 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 14:01:40 +0000 (14:01 +0000)]
misc fixes in checkasm
git-svn-id: svn://svn.videolan.org/x264/trunk@732 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 13:39:09 +0000 (13:39 +0000)]
remove a useless bit of me=umh (originally copied from JM, where it was used for something)
git-svn-id: svn://svn.videolan.org/x264/trunk@731 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 11:50:50 +0000 (11:50 +0000)]
fix a memleak in cqm
git-svn-id: svn://svn.videolan.org/x264/trunk@730 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 11:49:16 +0000 (11:49 +0000)]
fix a memleak in mkv muxer
patch by saintdev
git-svn-id: svn://svn.videolan.org/x264/trunk@729 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 11:36:11 +0000 (11:36 +0000)]
satd exhaustive motion search (--me tesa)
git-svn-id: svn://svn.videolan.org/x264/trunk@728 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 11:09:52 +0000 (11:09 +0000)]
fix cabac context for nonzero delta_qp of the 2nd mb of a frame in interlaced mode
git-svn-id: svn://svn.videolan.org/x264/trunk@727 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 10:32:36 +0000 (10:32 +0000)]
fix mapping of mvs to partitions in p4x4_chroma
patch by Noboru Asai
git-svn-id: svn://svn.videolan.org/x264/trunk@726 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 10:12:24 +0000 (10:12 +0000)]
fix mvp for b16x8 and b8x16 L1 search
patch by Wei-Yin Chen
git-svn-id: svn://svn.videolan.org/x264/trunk@725 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 10:05:20 +0000 (10:05 +0000)]
shave a couple cycles off cabac functions
git-svn-id: svn://svn.videolan.org/x264/trunk@724 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 09:12:39 +0000 (09:12 +0000)]
faster and smaller x264_macroblock_cache_mv etc
git-svn-id: svn://svn.videolan.org/x264/trunk@723 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 27 Jan 2008 09:11:01 +0000 (09:11 +0000)]
configure test for endianness
git-svn-id: svn://svn.videolan.org/x264/trunk@722 df754926-b1dd-0310-bc7b-ec298dee348c
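For context, byte order can be probed in C by inspecting the first byte of a known multi-byte constant. The sketch below is a generic runtime check with a hypothetical function name; how x264's configure script actually performs its test is not shown in this log.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian host, 0 otherwise.
 * Hypothetical helper: a runtime probe, not x264's configure test. */
static int is_big_endian( void )
{
    const uint32_t probe = 0x01020304;
    uint8_t first;
    /* memcpy reads the lowest-addressed byte without aliasing issues */
    memcpy( &first, &probe, 1 );
    return first == 0x01;
}
```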
Loren Merritt [Fri, 18 Jan 2008 00:42:38 +0000 (00:42 +0000)]
change the meaning of --ref: it now selects DPB size (including B-frames), rather than L0 size (which B-frames are added to)
git-svn-id: svn://svn.videolan.org/x264/trunk@721 df754926-b1dd-0310-bc7b-ec298dee348c
Guillaume Poirier [Mon, 14 Jan 2008 09:54:33 +0000 (09:54 +0000)]
add / fix support for FreeBSD, based on a patch by Igor Mozolevsky % igor A hybrid-lab P co P uk %
git-svn-id: svn://svn.videolan.org/x264/trunk@720 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Wed, 9 Jan 2008 11:25:09 +0000 (11:25 +0000)]
shut up some valgrind warnings
git-svn-id: svn://svn.videolan.org/x264/trunk@719 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Tue, 8 Jan 2008 18:10:51 +0000 (18:10 +0000)]
slightly wrong memory allocation in r717, fixes a potential crash with merange>32
git-svn-id: svn://svn.videolan.org/x264/trunk@718 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 6 Jan 2008 08:15:04 +0000 (08:15 +0000)]
convert absolute difference of sums from mmx to sse2
convert mv bits cost and ads threshold from C to sse2
convert bytemask-to-list from C to scalar asm
1.6x faster me=esa (x86_64) or 1.3x faster (x86_32). (times consider only motion estimation. overall encode speedup may vary.)
git-svn-id: svn://svn.videolan.org/x264/trunk@717 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 6 Jan 2008 08:06:36 +0000 (08:06 +0000)]
round esa range to a multiple of 4
git-svn-id: svn://svn.videolan.org/x264/trunk@716 df754926-b1dd-0310-bc7b-ec298dee348c
Guillaume Poirier [Thu, 3 Jan 2008 22:24:38 +0000 (22:24 +0000)]
use define _WIN32 instead of __WIN32__ or WIN32 defines.
MSDN reference: http://msdn2.microsoft.com/en-us/library/b0084kay(VS.80).aspx
Patch by BugMaster %BugMaster A narod P ru%
Original thread:
date: Dec 27, 2007 3:18 AM
subject: [x264-devel] VS2008 compilation error (need of replacement __WIN32__ with _WIN32)
git-svn-id: svn://svn.videolan.org/x264/trunk@715 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Fri, 21 Dec 2007 01:57:14 +0000 (01:57 +0000)]
tweak x264_pixel_sad_x4_16x16_sse2 horizontal sum. 168 -> 166 cycles on core2.
git-svn-id: svn://svn.videolan.org/x264/trunk@714 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Thu, 20 Dec 2007 19:24:17 +0000 (19:24 +0000)]
fix a nondeterminism involving 8x8dct, rdo, and threads.
git-svn-id: svn://svn.videolan.org/x264/trunk@713 df754926-b1dd-0310-bc7b-ec298dee348c
Guillaume Poirier [Thu, 13 Dec 2007 15:43:41 +0000 (15:43 +0000)]
also test arch-specific x264_zigzag_* implementations in checkasm.c
Patch by Noboru Asai % noboru P asai A gmail P com%
git-svn-id: svn://svn.videolan.org/x264/trunk@712 df754926-b1dd-0310-bc7b-ec298dee348c
Guillaume Poirier [Mon, 10 Dec 2007 22:09:13 +0000 (22:09 +0000)]
Add AltiVec implementation of
- x264_zigzag_scan_4x4_frame_altivec()
- x264_zigzag_scan_4x4ac_frame_altivec()
- x264_zigzag_scan_4x4_field_altivec()
- x264_zigzag_scan_4x4ac_field_altivec()
each around 1.3 to 1.8x faster than C version
Patch by Noboru Asai % noboru P asai A gmail P com%
git-svn-id: svn://svn.videolan.org/x264/trunk@711 df754926-b1dd-0310-bc7b-ec298dee348c
Guillaume Poirier [Sun, 9 Dec 2007 15:50:52 +0000 (15:50 +0000)]
adds AltiVec implementation of predict_16x16_p()
over 4x faster than C version
git-svn-id: svn://svn.videolan.org/x264/trunk@710 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Tue, 4 Dec 2007 21:56:18 +0000 (21:56 +0000)]
revert the x86_32 part of r708. elf shared libraries aren't important enough to be worth the extra lines of code to check for nasm.
git-svn-id: svn://svn.videolan.org/x264/trunk@709 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Mon, 3 Dec 2007 01:17:23 +0000 (01:17 +0000)]
mark asm functions as hidden
git-svn-id: svn://svn.videolan.org/x264/trunk@708 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Mon, 3 Dec 2007 01:16:57 +0000 (01:16 +0000)]
check whether ld supports -Bsymbolic before using it
git-svn-id: svn://svn.videolan.org/x264/trunk@707 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 2 Dec 2007 15:57:43 +0000 (15:57 +0000)]
reduce the data type used in some tables. 16KB smaller exe.
git-svn-id: svn://svn.videolan.org/x264/trunk@706 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sat, 1 Dec 2007 18:03:16 +0000 (18:03 +0000)]
faster removal of duplicate mv predictors
git-svn-id: svn://svn.videolan.org/x264/trunk@705 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sat, 1 Dec 2007 15:17:19 +0000 (15:17 +0000)]
avoid a division in x264_mb_predict_mv_ref16x16.
patch by Fiona Glaser.
git-svn-id: svn://svn.videolan.org/x264/trunk@704 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sat, 1 Dec 2007 02:58:34 +0000 (02:58 +0000)]
avoid a division in umh.
patch by Fiona Glaser.
git-svn-id: svn://svn.videolan.org/x264/trunk@703 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Mon, 26 Nov 2007 11:44:37 +0000 (11:44 +0000)]
fix a memleak in h->mb.mvr
git-svn-id: svn://svn.videolan.org/x264/trunk@702 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Sun, 25 Nov 2007 12:38:19 +0000 (12:38 +0000)]
fix compilation as a shared library on x86_64 (regression in r696)
git-svn-id: svn://svn.videolan.org/x264/trunk@701 df754926-b1dd-0310-bc7b-ec298dee348c
Guillaume Poirier [Wed, 21 Nov 2007 18:30:49 +0000 (18:30 +0000)]
add support for x86_64 on Darwin9.0 (Mac OS X 10.5, aka Leopard)
Patch by Antoine Gerschenfeld %gerschen A clipper P ens P fr%
git-svn-id: svn://svn.videolan.org/x264/trunk@700 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Wed, 21 Nov 2007 11:52:19 +0000 (11:52 +0000)]
cover some more options in fprofile. (esa, bime, cqm, nr, no-dct-decimate, trellis2)
previously, esa was slower with fprofile than without, since gcc thought it wasn't important. now esa benefits like anything else.
git-svn-id: svn://svn.videolan.org/x264/trunk@699 df754926-b1dd-0310-bc7b-ec298dee348c
Guillaume Poirier [Tue, 20 Nov 2007 18:22:03 +0000 (18:22 +0000)]
Add AltiVec implementation of x264_pixel_ssd_8x8, 3x faster than C version
Overall speed-up: 0.7% with --bframes 3 --ref 5 -m 7 --b-rdo
Patch by Noboru Asai %noboru P asai A gmail P com%
git-svn-id: svn://svn.videolan.org/x264/trunk@698 df754926-b1dd-0310-bc7b-ec298dee348c
Loren Merritt [Tue, 20 Nov 2007 08:53:26 +0000 (08:53 +0000)]
limit mvs to [-512,511.75] instead of [-512,512]
git-svn-id: svn://svn.videolan.org/x264/trunk@697 df754926-b1dd-0310-bc7b-ec298dee348c
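To make the new bound concrete: H.264 motion vectors are stored in quarter-pixel units, so [-512, 511.75] pixels corresponds to the integer range [-2048, 2047], which is exactly what a signed 12-bit field can hold. The clamp below is a minimal sketch with hypothetical macro and function names, not x264's own code.

```c
/* Limits in quarter-pel units. Names are hypothetical, for illustration. */
#define MV_MIN_QPEL ( -512 * 4 )     /* -2048 = -512 pixels   */
#define MV_MAX_QPEL ( 512 * 4 - 1 )  /*  2047 = 511.75 pixels */

/* Clamp one motion vector component to the allowed range. */
static int clamp_mv_qpel( int mv )
{
    if( mv < MV_MIN_QPEL )
        return MV_MIN_QPEL;
    if( mv > MV_MAX_QPEL )
        return MV_MAX_QPEL;
    return mv;
}
```

Under the old upper bound of 512 pixels the maximum would have been 2048 quarter-pels, one more than the largest value in that 12-bit range.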
Loren Merritt [Tue, 20 Nov 2007 06:07:17 +0000 (06:07 +0000)]
avoid memory loads that span the border between two cachelines.
on core2 this makes x264_pixel_sad an average of 2x faster. other intel cpus gain various amounts. amd are unaffected.
overall speedup: 1-10%, depending on how much time is spent in fullpel motion estimation.
git-svn-id: svn://svn.videolan.org/x264/trunk@696 df754926-b1dd-0310-bc7b-ec298dee348c
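The condition being avoided can be expressed as a simple address test: a load of a given width crosses a cacheline border when its offset within the line plus its width exceeds the line size. The helper below is a hypothetical sketch of that test only; the asm in this commit and the way it selects an alternative code path are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if a load of `width` bytes starting at `addr`
 * straddles two cachelines. `line_size` must be a power of two
 * (for example 64). Hypothetical helper, for illustration only. */
static int load_spans_cacheline( uintptr_t addr, size_t width, size_t line_size )
{
    size_t offset = (size_t)( addr & ( line_size - 1 ) );
    return offset + width > line_size;
}
```

For example, with 64-byte lines a 16-byte load at offset 56 crosses into the next line, while the same load at offset 48 ends exactly on the border and does not.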
Loren Merritt [Tue, 20 Nov 2007 05:57:29 +0000 (05:57 +0000)]
add cache info to cpu_detect. also print sse3.
git-svn-id: svn://svn.videolan.org/x264/trunk@695 df754926-b1dd-0310-bc7b-ec298dee348c