Paul Wilkins [Mon, 15 Nov 2010 17:47:12 +0000 (17:47 +0000)]
Bad cost tables used in ARNR filtering.
The use of incorrect mv costing tables in the ARNR sub-pel
filtering code led to corruption of the altref buffer in some cases,
particularly at low data rates.
The average gain from this fix is about 0.3% but there are a few
extreme cases where nasty and visible artifacts manifested and
for these few data points the improvement is > 10%.
Frank Galligan [Thu, 4 Nov 2010 03:33:00 +0000 (23:33 -0400)]
Fixed bug: first cluster timecode of webm file is wrong.
When the first pts equaled 0, ivfenc was incorrectly increasing the
pts by 1. I changed the pts and last pts to be signed. I also set
the default value of last pts to -1.
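A minimal sketch of why the sentinel matters, assuming the timestamps are
kept strictly increasing by a check of this shape. The helper name next_pts
is hypothetical and is not taken from ivfenc. With last pts starting at 0, a
first pts of 0 is bumped to 1. With a signed last pts starting at -1, it
passes through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the actual ivfenc code): return a timestamp
 * that is strictly greater than the previous one, and remember it. */
static int64_t next_pts(int64_t pts, int64_t *last_pts)
{
    assert(last_pts != NULL);
    if (pts <= *last_pts)
        pts = *last_pts + 1;   /* bump duplicates and backward steps */
    *last_pts = pts;
    return pts;
}
```

Starting with last_pts = -1 leaves a first pts of 0 untouched, while starting
with last_pts = 0 turns it into 1.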
Yaowu Xu [Thu, 11 Nov 2010 05:16:17 +0000 (21:16 -0800)]
make rdmult adaptive for intra in quantizer RDO
This is intended to correct VP8's tendency to aggressively favor rate
on intra coded frames. Experiments tested different values in [0, 1]
and found that 9/16 provided gains of about 2-4% overall for all-intra
coded clips based on the vpx-ssim metric. The impact on regularly
encoded clips is much smaller but positive overall. The overall impact
on psnr is also positive, though very small.
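As a rough illustration only, the 9/16 factor could be applied to the
rate-distortion multiplier as below. The function name adapt_rdmult and its
parameters are hypothetical, and the sketch assumes the scaling is a plain
multiply and shift.

```c
#include <assert.h>

/* Hypothetical helper, not the encoder's actual code: scale the
 * rate-distortion multiplier by 9/16 for intra coded blocks and
 * leave it unchanged otherwise. */
static int adapt_rdmult(int rdmult, int is_intra)
{
    assert(rdmult >= 0);   /* keeps the right shift well defined */
    return is_intra ? (rdmult * 9) >> 4 : rdmult;
}
```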
John Koleszar [Thu, 11 Nov 2010 17:41:07 +0000 (12:41 -0500)]
quantizer: fix assertion in fast quantizer path
The fast quantizer assembly code has not been updated to match the new
exact quantizer, which was made the default in commit 6adbe09.
Specifically, it is not aware of the potential for the coefficient
to be scaled, which results in the quantized result exceeding the range
of the DCT. This patch restores the previous behavior of using the
non-shifted coefficients when in the fast quantizer code path, but
unfortunately requires rebuilding the tables when switching between the
two.
Use of temporal context for encoding delta updates.
- Used three probability approach for temporal context as follows:
P0 - probability of no change if both above and left not changed
P1 - probability of no change if one of above and left has changed
P2 - probability of no change if both above and left have changed
In addition, 1 bit/frame is used to decide whether to use temporal
context or to encode the segment ids directly. The cost of both schemes
is calculated ahead of time, and the temporal_update flag is set if the
cost of using temporal context is lower than that of encoding the
segment ids directly.
This approach gives around a 20% reduction in the bits needed to encode
segmentation ids.
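The context selection and the per-frame decision can be sketched as follows.
This is a sketch under assumptions: the function names are hypothetical, and
the costs are taken as already computed bit counts.

```c
#include <assert.h>

/* Hypothetical helper: pick which of P0, P1, P2 applies.
 * Returns 0 if neither neighbour changed, 1 if exactly one did,
 * 2 if both did. Each argument must be 0 or 1. */
static int temporal_context(int above_changed, int left_changed)
{
    assert(above_changed == 0 || above_changed == 1);
    assert(left_changed == 0 || left_changed == 1);
    return above_changed + left_changed;
}

/* Hypothetical helper: value of the 1 bit/frame flag. Temporal
 * context is used only when it is strictly cheaper. */
static int use_temporal_update(unsigned int cost_temporal,
                               unsigned int cost_direct)
{
    return cost_temporal < cost_direct;
}
```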
Paul Wilkins [Wed, 10 Nov 2010 10:09:45 +0000 (10:09 +0000)]
Relax rate control for last few frames
VBR rate control can become very noisy for the last few frames.
If there are a few bits to spare or a small overshoot then the
target rate and hence quantizer may start to fluctuate wildly.
This patch prevents further adjustment of the active Q limits for
the last few frames.
Patch also removes some redundant variables and makes one small bug fix.
Paul Wilkins [Mon, 8 Nov 2010 15:28:54 +0000 (15:28 +0000)]
Tuning for the more exact quantizer.
Small changes to the default zero bin and rounding tables.
Though the tables are currently the same for the Y1 and Y2 cases
I have left them as separate tables in case we want to tune this later.
There is now some adjustment of the zbin based on the prediction mode.
Previously this was restricted to an adjustment for gf/arf 0,0 MV.
The exact quantizer now marginally outperforms and is the default.
John Koleszar [Thu, 4 Nov 2010 20:59:26 +0000 (16:59 -0400)]
fix integer promotion bug in partition size check
The check '(user_data_end - partition < partition_size)' must be
evaluated as a signed comparison, but because partition_size was
unsigned, the LHS was promoted to unsigned, causing an incorrect
result on 32-bit. Instead, check the upper and lower bounds of
the segment separately.
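The effect can be reproduced on any platform by modelling the 32-bit case
with fixed-width types. This is an illustrative sketch, not the decoder's
code: both function names are hypothetical, and offsets stand in for the
pointers used in the real check.

```c
#include <assert.h>
#include <stdint.h>

/* Models the original comparison on a 32-bit target: the signed
 * difference is converted to unsigned before being compared, so a
 * negative value becomes a very large positive one. */
static int too_small_promoted(int32_t bytes_left, uint32_t partition_size)
{
    return (uint32_t)bytes_left < partition_size;
}

/* Hypothetical replacement: test the lower and upper bounds
 * separately, using a type wide enough to hold every value. */
static int partition_in_bounds(int64_t offset, int64_t data_len,
                               uint32_t partition_size)
{
    assert(data_len >= 0);
    if (offset < 0 || offset > data_len)
        return 0;
    return (int64_t)partition_size <= data_len - offset;
}
```

With bytes_left = -1 and partition_size = 10, too_small_promoted() reports
that the partition fits, which is the incorrect result described above.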
John Koleszar [Thu, 4 Nov 2010 19:05:45 +0000 (15:05 -0400)]
improve average framerate calculation
Change Ice204e86 identified a problem with bitrate undershoot due to
low precision in the timestamps passed to the library. This patch
takes a different approach by calculating the duration of this frame
and passing it to the library, rather than using a fixed duration
and letting the library average it out with higher precision
timestamps. This part of the fix only applies to vpxenc.
This patch also attempts to fix the problem for generic applications
that may have made the same mistake vpxenc did. Instead of
calculating this frame's duration by the difference of this frame's
and the last frame's start time, we use the end times instead. This
allows the framerate calculation to scavenge "unclaimed" time from
the last frame. For instance:
Yaowu Xu [Wed, 3 Nov 2010 19:56:31 +0000 (12:56 -0700)]
Increase the resolution of default timebase
The old value 1000 was too low, which caused the effective duration and
frame rate calculation to have a 1% error for typical 30 frame/second
inputs. The symptom was that most 2 pass encodings were undershooting
the target bit rate by 1% or so for 30 fps input.
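The 1% figure follows from integer truncation: 1000 / 30 gives 33 ticks per
frame, and 1000 / 33 is about 30.3 frames/second. The sketch below shows the
arithmetic. The function names are hypothetical, and the finer timebase of
1000000 used in the note after the code is chosen only for illustration, not
taken from the patch.

```c
#include <assert.h>

/* Frame rate implied by an integer per-frame duration measured in
 * ticks of 1/timebase_den seconds. */
static double implied_fps(int timebase_den, int nominal_fps)
{
    int ticks_per_frame;

    assert(nominal_fps > 0 && timebase_den >= nominal_fps);
    ticks_per_frame = timebase_den / nominal_fps;   /* truncates */
    return (double)timebase_den / ticks_per_frame;
}

/* Relative error of the implied rate against the nominal one.
 * Truncation can only shorten the duration, so this is never
 * negative. */
static double fps_relative_error(int timebase_den, int nominal_fps)
{
    return implied_fps(timebase_den, nominal_fps) / nominal_fps - 1.0;
}
```

fps_relative_error(1000, 30) is about 0.0101, while
fps_relative_error(1000000, 30) is about 0.00001.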
John Koleszar [Wed, 3 Nov 2010 17:58:40 +0000 (13:58 -0400)]
vpxenc: require width and height for raw streams
Defaulting to 320x240 for raw streams is arbitrary and error-prone.
Instead, require that the width and height be set manually if they
can't be parsed from the input file.
John Koleszar [Tue, 2 Nov 2010 13:11:57 +0000 (09:11 -0400)]
fix pipe support on windows
STDIO streams are opened in text mode by default on Windows. This patch
changes the stdin/stdout streams to be in binary mode if they are being
used for I/O from the vpxenc or vpxdec tools.
Fixes issue #216. Thanks to mw AT hesotech.de for the fix.
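A minimal sketch of switching a stdio stream to binary mode, assuming the
Microsoft C runtime's _setmode() and _fileno(). The wrapper name
set_binary_mode is hypothetical, and the call is a no-op on other platforms,
where streams do not translate line endings.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

/* Hypothetical wrapper: put a stream such as stdin or stdout into
 * binary mode. Returns 0 on success and -1 on failure. */
static int set_binary_mode(FILE *stream)
{
    assert(stream != NULL);
#ifdef _WIN32
    return _setmode(_fileno(stream), _O_BINARY) == -1 ? -1 : 0;
#else
    (void)stream;   /* nothing to do outside Windows */
    return 0;
#endif
}
```

A tool would call this on stdin or stdout only when that stream is actually
used for input or output, before the first read or write.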
This eliminates a large set of warnings exposed by the Mozilla build
system (Use of C++ comments in ISO C90 source, commas at the end of
enum lists, a couple incomplete initializers, and signed/unsigned
comparisons).
It also eliminates many (but not all) of the warnings exposed by newer
GCC versions and _FORTIFY_SOURCE (e.g., calling fread and fwrite
without checking the return values).
There are a few spurious warnings left on my system:
../vp8/encoder/encodemb.c:274:9: warning: 'sz' may be used
uninitialized in this function
gcc seems to be unable to figure out that the value shortcut doesn't
change between the two if blocks that test it here.
../vp8/encoder/onyx_if.c:5314:5: warning: comparison of unsigned
expression >= 0 is always true
../vp8/encoder/onyx_if.c:5319:5: warning: comparison of unsigned
expression >= 0 is always true
This is true, so far as it goes, but it's comparing against an enum,
and the C standard does not mandate that enums be unsigned, so the
checks can't be removed.
Fritz Koenig [Wed, 27 Oct 2010 19:50:16 +0000 (12:50 -0700)]
postproc: Tweaks to line drawing and blending.
Turned down the blending level to make colored blocks obscure
the video less.
Not blending the entire block to give distinction to macro
block edges.
Added configuration so that macro block blending function can
be optimized.
Change to constrain line as to when dx and dy are computed.
Now draw two lines to form an arrow.
Frank Galligan [Wed, 27 Oct 2010 15:28:56 +0000 (11:28 -0400)]
Output the PSNR for the entire file.
If --psnr option is enabled vpxenc will output PSNR values for the
entire file. Added a \n before final output to make sure the output
is on its own line. Overall and Avg psnr match the values written
to opsnr.stt file.
Yunqing Wang [Wed, 27 Oct 2010 12:45:24 +0000 (08:45 -0400)]
Full search SAD function optimization in SSE4.1
Use mpsadbw, and calculate 8 SADs at once. Function list:
vp8_sad16x16x8_sse4
vp8_sad16x8x8_sse4
vp8_sad8x16x8_sse4
vp8_sad8x8x8_sse4
vp8_sad4x4x8_sse4
(test clip: tulip)
For best quality mode, this gave the encoder a 5% performance boost.
For good quality mode with speed=1, this gave the encoder a 3%
performance boost.
John Koleszar [Wed, 27 Oct 2010 14:05:55 +0000 (10:05 -0400)]
vpxenc: add unique track id
MKV requires a unique(ish) TrackID element in the track info header.
Instead of the current hard-coded ID, take a hash of the video track
and use that. This value is not written in the deterministic output
mode, despite being a deterministic value itself, to give flexibility
to change the hash algorithm and not affect bisecting across the
change.
Johann [Wed, 27 Oct 2010 15:21:02 +0000 (11:21 -0400)]
fix implicit declarations
ARM used to explicitly remove this file from the build. With the RTCD
changes, that's no longer possible. These errors also exist for x86 w/o
RTCD, but that's not the default configuration.
John Koleszar [Tue, 26 Oct 2010 19:34:16 +0000 (15:34 -0400)]
Add half-pixel variance RTCD functions
NEON has optimized 16x16 half-pixel variance functions, but they
were not part of the RTCD framework. Add these functions to RTCD,
so that other platforms can make use of this optimization in the
future and special-case ARM code can be removed.
A number of functions were taking two variance functions as
parameters. These functions were changed to take a single
parameter, a pointer to a struct containing all the variance
functions for that block size. This provides additional flexibility
for calling additional variance functions (the half-pixel special
case, for example) and by initializing the table for all block sizes,
we don't have to construct this function pointer table for each
macroblock.
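The shape of the change can be sketched as below: callers receive one
pointer to a per-block-size struct instead of several function pointers.
All type, field and function names here are hypothetical stand-ins, not the
library's own, and the plain C variance routine only exists to make the
sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int (*variance_fn)(const unsigned char *src, int src_stride,
                                    const unsigned char *ref, int ref_stride,
                                    unsigned int *sse);

/* Hypothetical struct: every variance function for one block size. */
typedef struct {
    variance_fn vf;          /* full-pixel variance */
    variance_fn vf_halfpix;  /* half-pixel special case, NULL if absent */
} variance_fns;

/* Reference variance, adequate for blocks up to 8x8 (sum * sum
 * would overflow an int for much larger blocks). */
static unsigned int variance_wxh(const unsigned char *src, int src_stride,
                                 const unsigned char *ref, int ref_stride,
                                 int w, int h, unsigned int *sse)
{
    int sum = 0, r, c;
    unsigned int sq = 0;

    for (r = 0; r < h; r++) {
        for (c = 0; c < w; c++) {
            int d = src[r * src_stride + c] - ref[r * ref_stride + c];
            sum += d;
            sq += (unsigned int)(d * d);
        }
    }
    *sse = sq;
    return sq - (unsigned int)(sum * sum) / (unsigned int)(w * h);
}

static unsigned int variance4x4(const unsigned char *s, int ss,
                                const unsigned char *r, int rs,
                                unsigned int *sse)
{
    return variance_wxh(s, ss, r, rs, 4, 4, sse);
}

static unsigned int variance8x8(const unsigned char *s, int ss,
                                const unsigned char *r, int rs,
                                unsigned int *sse)
{
    return variance_wxh(s, ss, r, rs, 8, 8, sse);
}

enum { BLOCK_4X4, BLOCK_8X8, BLOCK_COUNT };

/* Built once for all block sizes rather than per macroblock. */
static const variance_fns variance_table[BLOCK_COUNT] = {
    { variance4x4, NULL },
    { variance8x8, NULL }
};

/* A caller takes the whole struct and picks the function it needs. */
static unsigned int block_variance(const variance_fns *fns,
                                   const unsigned char *src, int src_stride,
                                   const unsigned char *ref, int ref_stride,
                                   unsigned int *sse)
{
    assert(fns != NULL && fns->vf != NULL);
    return fns->vf(src, src_stride, ref, ref_stride, sse);
}
```

Adding another entry point, such as a half-pixel variant, then means adding
a field to the struct rather than another parameter to every caller.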
John Koleszar [Tue, 26 Oct 2010 20:22:22 +0000 (16:22 -0400)]
vpxenc: add deterministic output option
By baking the version number into the output file, a hash of the file
will vary from commit to commit, even if the output is otherwise bit
exact. Add a -D option to suppress this behavior, for use when
bisecting or other debugging.
John Koleszar [Tue, 26 Oct 2010 15:37:23 +0000 (11:37 -0400)]
make vp8_recon16x16mb{,y} RTCD functions
ARM NEON has a platform specific version of vp8_recon16x16mb, though
it's just a stub to extract the various parameters from the
MACROBLOCKD struct and pass them to vp8_recon16x16mb_neon(). Using
that function's prototype directly will be a better long term solution,
but it's quite an invasive change.
John Koleszar [Tue, 26 Oct 2010 14:46:31 +0000 (10:46 -0400)]
make arm hex search the generic implementation
The ARM version of vp8_hex_search() is a faster implementation
of the same algorithm. Since it doesn't use any ARM specific
code, it can be made the default implementation. This removes
a linking error.
John Koleszar [Tue, 26 Oct 2010 13:51:35 +0000 (09:51 -0400)]
arm: move unrolled loops back to generic code
Some of the ARM functions differed from their generic counterparts
only by unrolling their loops. Since this change may be useful
on other platforms, or might even supersede the looped version
in the generic case, move it back to the generic file.
This code is left under #if ARCH_ARM for now, but it may be worth
considering a different (possibly new) conditional for these. If
it turns out that this should be runtime selectable, these
functions will have to move to the RTCD infrastructure. Don't want
to take that step at this time without more profile data.
John Koleszar [Tue, 26 Oct 2010 13:37:44 +0000 (09:37 -0400)]
arm: remove duplicate functions
These functions were true duplicates of functions present in the
generic code. This fixes some of the link errors when building
with --enable-shared --enable-pic.