granicus.if.org Git - libx264/log
libx264
16 years agoavg_weight_ssse3
Loren Merritt [Mon, 22 Sep 2008 10:17:35 +0000 (04:17 -0600)]
avg_weight_ssse3

16 years agofix bitstream writer on bigendian 64bit (regression in r903)
Loren Merritt [Sat, 20 Sep 2008 14:41:17 +0000 (08:41 -0600)]
fix bitstream writer on bigendian 64bit (regression in r903)

16 years agoremove authors whose code no longer exists
Loren Merritt [Sat, 20 Sep 2008 05:52:11 +0000 (23:52 -0600)]
remove authors whose code no longer exists

16 years agomore diagnostics when configure finds an unsuitable assembler
Loren Merritt [Mon, 15 Sep 2008 11:00:26 +0000 (05:00 -0600)]
more diagnostics when configure finds an unsuitable assembler

16 years agoMake x264 progress indicator more concise
Fiona Glaser [Fri, 26 Sep 2008 16:19:56 +0000 (09:19 -0700)]
Make x264 progress indicator more concise
Now the % indicator should be readable on the header of a minimized window on Windows systems.

16 years agoFix deblocking + threads + AQ bug
Fiona Glaser [Mon, 22 Sep 2008 05:17:34 +0000 (22:17 -0700)]
Fix deblocking + threads + AQ bug
At low QPs, with threads and deblocking on, deblocking could be improperly disabled.
Revision in which this bug was introduced is unknown; it may be as old as b_variable_qp in x264 itself.

16 years agoResolve possible crash in bime, improve the fix in r985
Fiona Glaser [Sun, 21 Sep 2008 20:35:00 +0000 (13:35 -0700)]
Resolve possible crash in bime, improve the fix in r985

16 years agoFix rare crash issue in b-adapt
Fiona Glaser [Sun, 21 Sep 2008 02:36:07 +0000 (19:36 -0700)]
Fix rare crash issue in b-adapt
Regression *probably* in r979

16 years agoMerging Holger's GSOC branch part 1: hpel_filter speedups
Holger Lubitz [Sat, 20 Sep 2008 09:36:55 +0000 (02:36 -0700)]
Merging Holger's GSOC branch part 1: hpel_filter speedups

16 years agor980 borked weighted bime
Loren Merritt [Sat, 20 Sep 2008 18:31:10 +0000 (12:31 -0600)]
r980 borked weighted bime

16 years agoDisable I_PCM with psy-RD
Fiona Glaser [Sat, 20 Sep 2008 08:39:16 +0000 (01:39 -0700)]
Disable I_PCM with psy-RD
psy-RD seems to put the PCM threshold a bit lower than it should be, so PCM is now disabled under psy-RD.

16 years agoMerge avg and avg_weight
Fiona Glaser [Fri, 19 Sep 2008 16:21:34 +0000 (09:21 -0700)]
Merge avg and avg_weight
avg_weight no longer has to be special-cased in the code; faster weightb

16 years agoRewrite avg/avg_weight to take two source pointers
Fiona Glaser [Thu, 18 Sep 2008 04:25:05 +0000 (21:25 -0700)]
Rewrite avg/avg_weight to take two source pointers
This allows the use of get_ref instead of mc_luma almost everywhere for bipred

16 years agoUse low-resolution lookahead motion vectors as an extra predictor
Fiona Glaser [Wed, 17 Sep 2008 07:33:37 +0000 (00:33 -0700)]
Use low-resolution lookahead motion vectors as an extra predictor
Improves quality considerably (0-5%) in 1pass/CRF mode, especially with lower --me values and complex motion.
Reverses the order of lowres lookahead search to improve the usefulness of the extra predictors.

16 years agoAdd missing free() for f_qp_offset in frame.c
Fiona Glaser [Wed, 17 Sep 2008 05:44:10 +0000 (22:44 -0700)]
Add missing free() for f_qp_offset in frame.c

16 years agoCorrect misprediction of bitrate in threaded mode
Gabriel Bouvigne [Tue, 16 Sep 2008 08:54:37 +0000 (01:54 -0700)]
Correct misprediction of bitrate in threaded mode
Improves bitrate accuracy in cases with large numbers of threads.
Loosely based on a patch by BugMaster.

16 years agoFix a case in which VBV underflows can occur
Gabriel Bouvigne [Tue, 16 Sep 2008 08:53:02 +0000 (01:53 -0700)]
Fix a case in which VBV underflows can occur
Fix a potential case where a frame might initially be allocated too low a QP, which would then have to be raised a lot during row-based ratecontrol.
In some cases, this could even produce VBV underflows in 2pass mode.

16 years agoUse correct format specifier for uint64_t
Panagiotis Issaris [Mon, 15 Sep 2008 18:47:50 +0000 (20:47 +0200)]
Use correct format specifier for uint64_t
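The commit does not show the offending call, but the usual portable way to print a `uint64_t` in C is the `PRIu64` macro from `<inttypes.h>`. A minimal sketch, where `format_u64` is a hypothetical helper and not an x264 function:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not taken from x264): formats a 64-bit counter
 * portably. PRIu64 expands to the correct conversion specifier for
 * uint64_t on the current platform, so no cast or guess is needed. */
int format_u64(char *buf, size_t len, uint64_t value)
{
    return snprintf(buf, len, "%" PRIu64, value);
}
```

Hard-coding `%lu` or `%llu` instead only works on platforms where `uint64_t` happens to match that type.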

16 years agoCache motion vectors in lowres lookahead
Fiona Glaser [Tue, 16 Sep 2008 07:31:26 +0000 (00:31 -0700)]
Cache motion vectors in lowres lookahead
This vastly speeds up b-adapt 2, especially at large bframes values.
This changes output because now MV prediction in lookahead only uses L0/L1 MVs, not bidir.  This isn't a problem, since the bidir prediction wasn't really correct to begin with, so the change in output is neither positive nor negative.
This also allowed the removal of some unnecessary memsets, which should also give a small speed boost.
Finally, this allows the use of the lowres motion vectors for predictors in some future patch.

16 years agoFix regression in b-adapt patch: encoder_open failed for multipass encodes without...
Fiona Glaser [Mon, 15 Sep 2008 19:22:48 +0000 (12:22 -0700)]
Fix regression in b-adapt patch: encoder_open failed for multipass encodes without bframes.

16 years agoStop SAR in y4m input from overriding --sar on commandline
Fiona Glaser [Mon, 15 Sep 2008 17:53:29 +0000 (10:53 -0700)]
Stop SAR in y4m input from overriding --sar on commandline

16 years agohadamard_ac for psy-rd
Loren Merritt [Mon, 15 Sep 2008 08:24:12 +0000 (02:24 -0600)]
hadamard_ac for psy-rd
c version is 1.7x faster than satd+sa8d+sad
ssse3 version is 2.3x faster than satd+sa8d+sad

16 years agoPsychovisually optimized rate-distortion optimization and trellis
Fiona Glaser [Mon, 15 Sep 2008 04:36:45 +0000 (21:36 -0700)]
Psychovisually optimized rate-distortion optimization and trellis
The latter, psy-trellis, is disabled by default and is reserved as experimental; your mileage may vary.
Default subme is raised to 6 so that psy RD is on by default.

16 years agoAdd optional more optimal B-frame decision method
Fiona Glaser [Mon, 15 Sep 2008 01:18:15 +0000 (18:18 -0700)]
Add optional more optimal B-frame decision method
This method (--b-adapt 2) uses a Viterbi algorithm somewhat similar to that used in trellis quantization.
Note that it is not fully optimized and is very slow with large --bframes values.
It also takes into account weightb, which should improve fade detection.
Additionally, changes were made to cache lowres intra results for each frame to avoid recalculating them.  This should improve performance in both B-frame decision methods.
This can also be done for motion vectors, which will dramatically improve b-adapt 2 performance when it is complete.
This patch also reads b_adapt and scenecut settings from the first pass so that the x264 header information in the output file will have correct information (since frametype decision is only done on the first pass).

16 years agoMove adaptive quantization to before ratecontrol, eliminate qcomp bias
Fiona Glaser [Sat, 13 Sep 2008 21:03:12 +0000 (14:03 -0700)]
Move adaptive quantization to before ratecontrol, eliminate qcomp bias
This change improves VBV accuracy and improves bit distribution in CRF and 2pass.
Instead of being applied after ratecontrol, AQ becomes part of the complexity measure that ratecontrol uses.
This allows for modularity for changes to AQ; a new AQ algorithm can be introduced simply by introducing a new aq_mode and a corresponding if in adaptive_quant_frame.
This also allows quantizer field smoothing, since quantizers are calculated beforehand rather than during encoding.
Since there is no more reason for it, aq_mode 1 is removed.  The new mode 1 is in a sense a merger of the old modes 1 and 2.
WARNING: This change redefines CRF when using AQ, so output bitrate for a given CRF may be significantly different from before this change!

16 years agoFix crash when using b-adapt at resolutions 32x32 or below.
Fiona Glaser [Wed, 10 Sep 2008 06:51:17 +0000 (23:51 -0700)]
Fix crash when using b-adapt at resolutions 32x32 or below.
Original patch by BugMaster, but was mostly rewritten in order to make b-adapt actually *work* at such resolutions, not merely stop crashing.

16 years agoAdd title-bar progress indicator under WIN32
Fiona Glaser [Wed, 10 Sep 2008 06:12:20 +0000 (23:12 -0700)]
Add title-bar progress indicator under WIN32
Also add bitrate-so-far output when piping data to x264 (total frames not known)
Patch mostly by recover from Doom9.

16 years agoRevert part of r963
Fiona Glaser [Sat, 6 Sep 2008 06:14:23 +0000 (23:14 -0700)]
Revert part of r963
In some rare (but significant) cases, the optimized nal_encode algorithm gave incorrect results.

16 years agoPredict 4x4_DC asm
Fiona Glaser [Fri, 5 Sep 2008 03:13:38 +0000 (20:13 -0700)]
Predict 4x4_DC asm
Also remove a 5-year-old unnecessary #define that reduced speed in MSVC-compiled builds

16 years agoFaster NAL unit encoding and remove unused nal_decode
Fiona Glaser [Thu, 4 Sep 2008 07:43:54 +0000 (00:43 -0700)]
Faster NAL unit encoding and remove unused nal_decode
Small speedup at very high bitrates

16 years agoCAVLC cleanup and optimizations
Fiona Glaser [Thu, 4 Sep 2008 05:12:23 +0000 (22:12 -0700)]
CAVLC cleanup and optimizations
Also move some small functions in macroblock.c to a .h file so they can be inlined.

16 years agoFaster avg_weight assembly
Fiona Glaser [Thu, 4 Sep 2008 04:43:06 +0000 (21:43 -0700)]
Faster avg_weight assembly
Unrolling the loop a bit improves performance

16 years agoFaster H asm intra prediction functions
Fiona Glaser [Wed, 3 Sep 2008 22:35:22 +0000 (15:35 -0700)]
Faster H asm intra prediction functions
Take advantage of the H prediction method invented for merged intra SAD and apply it to regular prediction, too.

16 years agoAdd merged SAD for i16x16 analysis
Fiona Glaser [Wed, 3 Sep 2008 22:32:16 +0000 (15:32 -0700)]
Add merged SAD for i16x16 analysis
Roughly 30% faster i16x16 analysis under subme=1

16 years agoAdd sad_aligned for faster subme=1 mbcmp
Fiona Glaser [Wed, 3 Sep 2008 22:15:17 +0000 (15:15 -0700)]
Add sad_aligned for faster subme=1 mbcmp
Distinguish between unaligned and aligned uses of mbcmp
SAD_aligned, for MMX SADs, uses non-cacheline SADs.

16 years agoImprove progress indicator
Fiona Glaser [Tue, 2 Sep 2008 18:49:55 +0000 (11:49 -0700)]
Improve progress indicator
Show average bitrate so far during encoding
Decrease update interval for longer encodes (max of 10 frames encoded between updates)

16 years agoFix speed regression in r951
Fiona Glaser [Mon, 1 Sep 2008 17:35:41 +0000 (10:35 -0700)]
Fix speed regression in r951
Row SATDs are only necessary in VBV mode, so don't need to be checked if VBV is off.

16 years agozigzag asm
Holger Lubitz [Mon, 1 Sep 2008 02:55:50 +0000 (20:55 -0600)]
zigzag asm

16 years agofix SOFLAGS used when building gtk frontend
Guillaume Poirier [Sun, 31 Aug 2008 19:46:31 +0000 (21:46 +0200)]
fix SOFLAGS used when building gtk frontend
patch by Markus Kanet %darkvision A gmx P eu%

16 years agoremove the distinction between itex and ptex
Loren Merritt [Thu, 21 Aug 2008 02:56:56 +0000 (20:56 -0600)]
remove the distinction between itex and ptex
(changes 2pass statsfile format)

16 years agohardcode the ratecontrol equation, and remove the rceq option
Loren Merritt [Thu, 21 Aug 2008 02:51:39 +0000 (20:51 -0600)]
hardcode the ratecontrol equation, and remove the rceq option

16 years agoFix some uses of uninitialized row_satd values in VBV
Fiona Glaser [Wed, 27 Aug 2008 17:14:36 +0000 (13:14 -0400)]
Fix some uses of uninitialized row_satd values in VBV
Resolves some issues with QP51 in I-frames with scenecut

16 years agoActivate trellis in p8x8 qpel RD
Fiona Glaser [Tue, 26 Aug 2008 18:51:29 +0000 (14:51 -0400)]
Activate trellis in p8x8 qpel RD
Also clean up macroblock.c with some refactoring
Note that this change significantly reduces subme7+trellis2 performance, but improves quality.
Issue originally reported by Alex_W.

16 years agoImprove VBV accuracy
Gabriel Bouvigne [Mon, 25 Aug 2008 14:50:45 +0000 (10:50 -0400)]
Improve VBV accuracy
Don't use the previous frame's row SATD as a predictor if it is too different from this frame's row SATD.

16 years agoimprove generation of Darwin libraries
Guillaume Poirier [Fri, 22 Aug 2008 19:05:37 +0000 (21:05 +0200)]
improve generation of Darwin libraries
Patch by vmrsss %vmrsss A gmail P com%

16 years agoFix compilation in gcc 3.4.x (issue in r946)
Fiona Glaser [Fri, 22 Aug 2008 01:23:08 +0000 (21:23 -0400)]
Fix compilation in gcc 3.4.x (issue in r946)
Due to a bug in gcc 3.4.x, in certain cases of inlining, the array_non_zero_int_mmx inline assembly is miscompiled and causes a crash with --subme 7 --8x8dct.
This minor hack fixes this issue.

16 years agoshut up various gcc warnings
Loic Le Loarer [Thu, 21 Aug 2008 10:19:24 +0000 (04:19 -0600)]
shut up various gcc warnings

16 years agofix a crash with invalid args and --thread-input (introduced in r921)
Loren Merritt [Thu, 21 Aug 2008 10:15:49 +0000 (04:15 -0600)]
fix a crash with invalid args and --thread-input (introduced in r921)

16 years agodrop support for x86_32 PIC.
Loren Merritt [Wed, 20 Aug 2008 11:36:32 +0000 (05:36 -0600)]
drop support for x86_32 PIC.

16 years agouse permute macros in satd
Loren Merritt [Tue, 19 Aug 2008 07:55:57 +0000 (01:55 -0600)]
use permute macros in satd
move some more shared macros to x264util.asm

16 years agocosmetics
Loren Merritt [Thu, 21 Aug 2008 02:32:13 +0000 (20:32 -0600)]
cosmetics

16 years agor940 broke threads
Loren Merritt [Thu, 21 Aug 2008 01:00:52 +0000 (19:00 -0600)]
r940 broke threads

16 years agoCleanups in macroblock_cache_save/load
Fiona Glaser [Wed, 20 Aug 2008 17:28:15 +0000 (13:28 -0400)]
Cleanups in macroblock_cache_save/load
A bit more loop unrolling, and moving some constant code to the global init function

16 years agoDeblocking code cleanup and cosmetics
Fiona Glaser [Tue, 19 Aug 2008 20:18:24 +0000 (14:18 -0600)]
Deblocking code cleanup and cosmetics
Convert the style of the deblocking code to the standard x264 style
Eliminate some trailing whitespace

16 years ago4% faster deblock: special-case macroblock edges
Fiona Glaser [Tue, 19 Aug 2008 05:03:37 +0000 (23:03 -0600)]
4% faster deblock: special-case macroblock edges
Along with a bit of related code reorganization and macroification

16 years agoAdd dedicated variance function instead of using SAD+SSD
David Pethes [Sat, 16 Aug 2008 15:43:26 +0000 (09:43 -0600)]
Add dedicated variance function instead of using SAD+SSD
Faster variance calculation
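As a rough illustration of what a dedicated variance function computes, the sketch below accumulates the pixel sum and the sum of squares in a single pass over a 16x16 block. The name `block_var_16x16` and the exact scaling are assumptions made for this example; the commit does not describe the real function's interface.

```c
#include <stdint.h>

/* Hypothetical sketch, not the x264 routine: returns the sum of squared
 * deviations from the mean for a 16x16 block of 8-bit pixels, i.e.
 * 256 * variance (up to truncation in the shift), using
 *   sum(p^2) - sum(p)^2 / 256. */
uint32_t block_var_16x16(const uint8_t *pix, int stride)
{
    uint32_t sum = 0, sqr = 0;
    int x, y;
    for (y = 0; y < 16; y++) {
        for (x = 0; x < 16; x++) {
            uint32_t p = pix[y * stride + x];
            sum += p;       /* at most 255 * 256 */
            sqr += p * p;   /* at most 255 * 255 * 256 */
        }
    }
    /* 64-bit intermediate keeps sum * sum safely in range. */
    return sqr - (uint32_t)(((uint64_t)sum * sum) >> 8);
}
```

Both accumulators come out of one loop over the block, which is presumably where the saving over separate SAD and SSD calls comes from.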

16 years ago6% faster deblock: remove some clips, earlier termination on low qps.
Loren Merritt [Fri, 15 Aug 2008 09:04:28 +0000 (03:04 -0600)]
6% faster deblock: remove some clips, earlier termination on low qps.

16 years agoFaster deblocking
Fiona Glaser [Fri, 15 Aug 2008 01:31:42 +0000 (19:31 -0600)]
Faster deblocking
Early termination for bS=0, alpha=0, beta=0
Refactoring, various other optimizations
About 30% faster deblocking overall.
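Why these three values permit an early exit can be sketched as follows. In the H.264 deblocking filter, bS=0 means the edge is not filtered, and a sample pair is only modified when |p0-q0| < alpha, |p1-p0| < beta and |q1-q0| < beta; with alpha or beta equal to zero those tests can never pass. The function names below are hypothetical and are not the x264 code:

```c
#include <stdlib.h>

/* Hypothetical edge-level check: if any of the three values is zero,
 * no sample on this edge can change, so the edge can be skipped. */
int edge_may_be_filtered(int bs, int alpha, int beta)
{
    return bs > 0 && alpha > 0 && beta > 0;
}

/* Per-sample condition of the H.264 deblocking filter. Absolute
 * differences are never negative, so alpha == 0 or beta == 0 makes
 * this false for every sample. */
int sample_is_filtered(int p1, int p0, int q0, int q1, int alpha, int beta)
{
    return abs(p0 - q0) < alpha &&
           abs(p1 - p0) < beta &&
           abs(q1 - q0) < beta;
}
```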

16 years agoasm cosmetics
Loren Merritt [Sat, 2 Aug 2008 14:19:50 +0000 (08:19 -0600)]
asm cosmetics

16 years agoyet another posix-emulating define on solaris
Daniel Vergien [Wed, 6 Aug 2008 14:10:53 +0000 (08:10 -0600)]
yet another posix-emulating define on solaris

16 years agoupdate msvc projectfile
Gabriel Bouvigne [Wed, 6 Aug 2008 13:45:05 +0000 (07:45 -0600)]
update msvc projectfile

16 years agodrop support for msvc6
Loren Merritt [Wed, 6 Aug 2008 13:34:42 +0000 (07:34 -0600)]
drop support for msvc6

16 years agoPrevent VBV from lowering quantizer too much
Fiona Glaser [Sat, 9 Aug 2008 15:36:04 +0000 (09:36 -0600)]
Prevent VBV from lowering quantizer too much
This code seemed to act up unexpectedly sometimes, creating a situation where in 1-pass VBV mode, a frame's quantizer would drop all the way to qpmin and then shoot back upwards to qpmax, causing serious visual issues.
This change may decrease bitrate in VBV mode, but that is preferable to the artifacting produced by this code.

16 years agoImprove subme7 at low QPs and add subme7 support in lossless mode
Fiona Glaser [Sat, 9 Aug 2008 15:34:37 +0000 (09:34 -0600)]
Improve subme7 at low QPs and add subme7 support in lossless mode

16 years agocosmetics: merge x86inc*.asm
Loren Merritt [Thu, 31 Jul 2008 04:35:20 +0000 (22:35 -0600)]
cosmetics: merge x86inc*.asm

16 years agoAdd missing x264util.asm
Fiona Glaser [Wed, 30 Jul 2008 21:29:46 +0000 (15:29 -0600)]
Add missing x264util.asm

16 years agoBasic sanity checking of qpmax/qpmin options
Fiona Glaser [Wed, 30 Jul 2008 21:28:21 +0000 (15:28 -0600)]
Basic sanity checking of qpmax/qpmin options

16 years agoFix regression in r922
Fiona Glaser [Wed, 30 Jul 2008 20:42:29 +0000 (14:42 -0600)]
Fix regression in r922
set the chroma DC coefficients to zero for residual coding in qpel-rd
fix C99ism

16 years agoRefactor asm macros part 2: DCT
Holger Lubitz [Wed, 30 Jul 2008 03:36:01 +0000 (21:36 -0600)]
Refactor asm macros part 2: DCT

16 years agoRefactor asm macros part 1: DCT
Holger Lubitz [Wed, 30 Jul 2008 03:26:58 +0000 (21:26 -0600)]
Refactor asm macros part 1: DCT

16 years agoImprove intra RD refine, speed up residual_write_cabac
Fiona Glaser [Tue, 29 Jul 2008 23:08:38 +0000 (17:08 -0600)]
Improve intra RD refine, speed up residual_write_cabac
a do/while loop can be used for residual_write, but i8x8 had to be fixed so that it wouldn't call residual_write with zero coeffs
proper nnz handling added to cabac intra rd refine
chroma cbp added to 8x8 chroma rd
cbp was tested, but wasn't useful

16 years agoFix a few more minor memleaks
Fiona Glaser [Tue, 29 Jul 2008 19:42:41 +0000 (13:42 -0600)]
Fix a few more minor memleaks

16 years agostats summary: print distribution of numbers of consecutive B-frames
Loren Merritt [Sat, 26 Jul 2008 00:14:31 +0000 (18:14 -0600)]
stats summary: print distribution of numbers of consecutive B-frames

16 years agoadd interlacing to the list of stuff checked by x264_validate_levels
Loic Le Loarer [Fri, 25 Jul 2008 22:08:32 +0000 (16:08 -0600)]
add interlacing to the list of stuff checked by x264_validate_levels

16 years agoFix C99-ism in r907
Fiona Glaser [Thu, 24 Jul 2008 13:58:50 +0000 (07:58 -0600)]
Fix C99-ism in r907

16 years agoFaster temporal predictor calculation
Fiona Glaser [Fri, 18 Jul 2008 00:17:22 +0000 (18:17 -0600)]
Faster temporal predictor calculation
Split into a separate commit because this changes rounding, and thus changes output slightly.

16 years agoAlign lowres planes for improved cacheline split performance
Fiona Glaser [Thu, 17 Jul 2008 13:55:24 +0000 (07:55 -0600)]
Align lowres planes for improved cacheline split performance

16 years agoautodetect level based on resolution/bitrate/refs/etc, rather than defaulting to...
Loren Merritt [Wed, 16 Jul 2008 02:16:16 +0000 (20:16 -0600)]
autodetect level based on resolution/bitrate/refs/etc, rather than defaulting to L5.1
if vbv is not enabled (and especially in crf/cqp), we have to guess max bitrate, so we might underestimate the required level.

16 years agofix bs_write_ue_big for values >= 0x10000.
Loren Merritt [Fri, 18 Jul 2008 02:25:03 +0000 (20:25 -0600)]
fix bs_write_ue_big for values >= 0x10000.
(no immediate effect, since nothing writes such values yet)
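For context, ue(v) is the unsigned Exp-Golomb code used throughout H.264: the value v+1 is written in binary, preceded by one zero bit for every bit after its leading one. The sketch below is not the x264 writer; `ue_size` and `ue_to_string` are hypothetical helpers that only make the codeword length visible.

```c
#include <stdint.h>

/* Length in bits of the ue(v) codeword: 2 * floor(log2(v + 1)) + 1. */
int ue_size(uint32_t v)
{
    uint64_t code = (uint64_t)v + 1;
    int bits = 0;
    while (code >> bits)
        bits++;                 /* bits = floor(log2(code)) + 1 */
    return 2 * bits - 1;
}

/* Writes the codeword as '0'/'1' characters plus a terminating NUL.
 * out needs room for ue_size(v) + 1 bytes. Returns the bit count. */
int ue_to_string(uint32_t v, char *out)
{
    uint64_t code = (uint64_t)v + 1;
    int bits = 0, pos = 0, i;
    while (code >> bits)
        bits++;
    for (i = 0; i < bits - 1; i++)
        out[pos++] = '0';       /* zero prefix */
    for (i = bits - 1; i >= 0; i--)
        out[pos++] = ((code >> i) & 1) ? '1' : '0';
    out[pos] = '\0';
    return pos;
}
```

Values of 0x10000 and above need codewords of at least 33 bits, which no longer fit in a single 32-bit write. That may be why large values get their own code path, though the commit message does not say so.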

16 years agoFix lossless mode borked in r901
BugMaster [Wed, 16 Jul 2008 17:54:51 +0000 (11:54 -0600)]
Fix lossless mode borked in r901

16 years agoRelax QPfile restrictions
Fiona Glaser [Sat, 12 Jul 2008 20:37:58 +0000 (14:37 -0600)]
Relax QPfile restrictions
Allow a QPfile to contain fewer frames than the total number of frames in the video and have ratecontrol fill in the rest.
Patch by kemuri9.

16 years agoLimit MVrange correctly in interlaced mode
Fiona Glaser [Sat, 12 Jul 2008 20:10:38 +0000 (14:10 -0600)]
Limit MVrange correctly in interlaced mode
Bug report by Sigma Designs, Inc.

16 years agoFix bug with PCM and adaptive quantization
Fiona Glaser [Sat, 12 Jul 2008 04:53:27 +0000 (22:53 -0600)]
Fix bug with PCM and adaptive quantization
In rare cases CABAC desync could occur, causing bitstream corruption

16 years agoFix memory leak upon x264 closing
BugMaster [Fri, 11 Jul 2008 22:00:02 +0000 (16:00 -0600)]
Fix memory leak upon x264 closing
Doesn't affect the CLI, but potentially important for programs which call x264 as a shared library.

16 years agoFix compilation on PPC systems (borked in r903)
Fiona Glaser [Fri, 11 Jul 2008 21:45:54 +0000 (15:45 -0600)]
Fix compilation on PPC systems (borked in r903)
Bigendian systems didn't have endian_fix32 defined
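The commit does not show the definition of endian_fix32. Assuming it converts a host-order 32-bit word to big-endian byte order (a byte swap on little-endian hosts, a no-op on big-endian ones), the two cases can be sketched like this; all names here are hypothetical:

```c
#include <stdint.h>

/* Reverses the byte order of a 32-bit word. */
uint32_t swap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Runtime probe: on a big-endian host the first byte of 1 is 0. */
int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 0;
}

/* Assumed behaviour of an endian_fix32-style helper: the result, when
 * stored to memory, always has its most significant byte first. */
uint32_t example_endian_fix32(uint32_t x)
{
    return host_is_big_endian() ? x : swap32(x);
}
```

Real code would normally pick between the two at compile time. The point is that the big-endian branch still needs a definition, even if it does nothing.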

16 years agoAdd L1 reflist and B macroblock types to x264 info
Fiona Glaser [Fri, 11 Jul 2008 20:16:18 +0000 (14:16 -0600)]
Add L1 reflist and B macroblock types to x264 info
Also remove display of "PCM" if PCM mode is never used in the encode.
L1 reflist information will only show if pyramid coding is used.

16 years agoFix and enable I_PCM macroblock support
Fiona Glaser [Thu, 10 Jul 2008 14:36:45 +0000 (08:36 -0600)]
Fix and enable I_PCM macroblock support
In RD mode, always consider PCM as a macroblock mode possibility
Fix bitstream writing for PCM blocks in CAVLC and CABAC, and a few other minor changes to make PCM work.
PCM macroblocks improve compression at very low QPs (1-5) and in lossless mode.

16 years agode-duplicate vlc tables
Loren Merritt [Sat, 5 Jul 2008 03:03:26 +0000 (21:03 -0600)]
de-duplicate vlc tables

16 years agofaster ue/se/te write
Loren Merritt [Sat, 5 Jul 2008 00:56:30 +0000 (18:56 -0600)]
faster ue/se/te write

16 years agofaster bs_write
Fiona Glaser [Sat, 5 Jul 2008 00:32:32 +0000 (18:32 -0600)]
faster bs_write

16 years agocosmetics in ssd asm
Loren Merritt [Thu, 3 Jul 2008 06:37:16 +0000 (00:37 -0600)]
cosmetics in ssd asm

16 years agoVarious optimizations and cosmetics
Fiona Glaser [Sun, 6 Jul 2008 18:59:15 +0000 (12:59 -0600)]
Various optimizations and cosmetics
Update AUTHORS file with Gabriel and me
update XCHG macro to work correctly in if statements
Add new lookup tables for block_idx and fdec/fenc addresses
Slightly faster array_non_zero_count_mmx (patch by holger)
Eliminate branch in analyse_intra
Unroll loops in and clean up chroma encode
Convert some for loops to do/while loops for speed improvement
Do explicit write-combining on --me tesa mvsad_t struct
Shrink --me esa zero[] array
Speed up bime by reducing size of visited[][][] array
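On the XCHG item: a common way to make a multi-statement macro safe inside an if/else is the do { ... } while (0) idiom. Whether the commit used exactly this form is not stated, so the macro below is a hypothetical stand-in rather than the x264 definition.

```c
/* Hypothetical swap macro. Wrapping the body in do { } while (0) makes
 * it behave as a single statement that still expects a trailing ';'. */
#define XCHG_EXAMPLE(type, a, b) \
    do { type tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Puts a pair in ascending order; returns 1 if a swap happened.
 * With a bare { ... } macro body, the ';' after the macro call would
 * end the if statement and the following 'else' would not compile. */
int order_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        XCHG_EXAMPLE(int, *lo, *hi);
    else
        return 0;
    return 1;
}
```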

16 years agoResolve floating point exception with frame_init_lowres mmx
Fiona Glaser [Sun, 6 Jul 2008 17:15:19 +0000 (11:15 -0600)]
Resolve floating point exception with frame_init_lowres mmx
In some cases, the mmx version of frame_init_lowres could leave the FPU uninitialized for use in ratecontrol, resulting in floating point exceptions.
Since frame_init_lowres is such a time-consuming function, an emms was just put at the end, since it costs almost nothing compared to the total time of frame_init_lowres.

16 years agoUpdate my email address
Eric Petit [Fri, 4 Jul 2008 09:31:32 +0000 (11:31 +0200)]
Update my email address

16 years agoUpdate file headers throughout x264
Fiona Glaser [Fri, 4 Jul 2008 02:05:00 +0000 (20:05 -0600)]
Update file headers throughout x264
Update "Authors" lists based on actual authorship; highest is most important
Update copyright notices and remove old CVS tags from file headers
Add file headers to GTK and other sections missing them
Update FSF address
Other header-related cosmetics

16 years agodenoise_dct asm
Fiona Glaser [Thu, 3 Jul 2008 02:59:24 +0000 (20:59 -0600)]
denoise_dct asm

16 years agocosmetics in permutation macros
Loren Merritt [Thu, 3 Jul 2008 02:55:10 +0000 (20:55 -0600)]
cosmetics in permutation macros
SWAP can now take mmregs directly, rather than just their numbers

16 years agoFix bug in adaptive quantization
Fiona Glaser [Wed, 2 Jul 2008 16:43:57 +0000 (10:43 -0600)]
Fix bug in adaptive quantization
In some cases adaptive quantization did not correctly calculate the variance.
Bug reported by MasterNobody

16 years agolowres_init asm
Loren Merritt [Sun, 29 Jun 2008 06:00:03 +0000 (00:00 -0600)]
lowres_init asm
rounding is changed for asm convenience. this makes the c version slower, but there's no way around that if all the implementations are to have the same results.

16 years agoOptimizations and cosmetics in macroblock.c
Fiona Glaser [Wed, 2 Jul 2008 05:42:39 +0000 (23:42 -0600)]
Optimizations and cosmetics in macroblock.c
If an i4x4 dct block has no coefficients, don't bother with dequant/zigzag/idct.  Not useful for larger sizes because the odds of an empty block are much lower.
Cosmetics in i16x16 to be more consistent with other similar functions.
Add an SSD threshold for chroma in probe_skip to improve speed and minimize time spent on chroma skip analysis.
Rename lambda arrays to lambda_tab for consistency.