Yunqing Wang [Mon, 10 Jan 2011 21:16:59 +0000 (16:16 -0500)]
Fix bug in motion search
The maximum possible MV in 1/8 pel units is (1<<11), which could
push mvcost beyond its range of 1023. Changing the maximum
possible MV in 1/8 pel units to (1<<11)-8 fixes this problem.
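The arithmetic behind this fix can be sketched as below. This is only an illustration: it assumes the cost table is indexed by the MV magnitude shifted right by one bit, which the commit message does not state, and the names MVCOST_MAX_INDEX, mvcost_index and clamp_mv_component are hypothetical.

```c
#include <stdlib.h>

/* Numbers taken from the commit message; the names are hypothetical. */
#define MVCOST_MAX_INDEX 1023           /* largest valid cost-table index */
#define MV_MAX_OLD (1 << 11)            /* 2048, old limit in 1/8 pel units */
#define MV_MAX_NEW ((1 << 11) - 8)      /* 2040, new limit in 1/8 pel units */

/* ASSUMPTION: the cost table is indexed by |mv| >> 1. */
static int mvcost_index(int mv)
{
    return abs(mv) >> 1;
}

/* Clamp one MV component to the new, smaller limit. */
static int clamp_mv_component(int mv)
{
    if (mv > MV_MAX_NEW) return MV_MAX_NEW;
    if (mv < -MV_MAX_NEW) return -MV_MAX_NEW;
    return mv;
}
```

Under that assumption the old limit maps to index 1024, one past the end of the table, while the new limit maps to index 1020.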
Paul Wilkins [Mon, 10 Jan 2011 16:41:53 +0000 (16:41 +0000)]
Two Pass VBR change
Further experiment with restriction of the Q range.
This uses the average non KF/GF/ARF quantizer, instead
of just relying on the initial value. It is not such a strong constraint
but there may be a reduced risk of rate misses.
Paul Wilkins [Fri, 7 Jan 2011 18:29:37 +0000 (18:29 +0000)]
CQ Mode
The merge includes hooks for CQ mode and other code
changes merged from the test branch.
CQ mode attempts to maintain a more stable quantizer within a clip
whilst also trying to adhere to a guideline maximum bitrate.
The existing target data rate parameter is used to specify the
guideline maximum bitrate.
A new parameter allows the user to specify a target CQ level.
For normal (non kf/gf/arf) frames, the quantizer will not drop BELOW the
user specified value (0-63). However, in some cases the encoder may
choose to impose a target CQ that is above that specified by the user,
if it estimates that consistent use of the target value is not compatible
with the guideline maximum bitrate.
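The floor that the CQ level places on the quantizer for normal frames could look roughly like this. It is a minimal sketch, not the encoder's actual code: apply_cq_floor and its parameters are hypothetical names, and the case where the encoder raises the target CQ itself is not modelled.

```c
/* Hypothetical helper: for normal (non kf/gf/arf) frames the quantizer
 * is not allowed to drop below the CQ level; other frames pass through. */
static int apply_cq_floor(int q, int cq_level, int is_normal_frame)
{
    if (is_normal_frame && q < cq_level)
        return cq_level;
    return q;
}
```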
Paul Wilkins [Fri, 7 Jan 2011 16:33:59 +0000 (16:33 +0000)]
Limit Q variability in two pass.
In two pass encoding each frame is given an active
Q range to work with. This change limits how much this
Q range can be altered over time from the initial estimate
made for the clip as a whole.
There is some danger this could lead to overshoot or undershoot
in some corner cases but it helps considerably in regard to
clips where either there is a glut or famine of bits in some sections,
particularly near the end of a clip.
Johann [Wed, 22 Dec 2010 16:23:51 +0000 (11:23 -0500)]
x86 sse2 temporal_filter_apply
count can be reduced to short because the max number of filtered frames
is set to 15. the max value for any frame is 32 (modifier = 16,
filter_weight = 2). 15*32 = 480 which requires 9 bits
this function goes from about 7000 us / 1000 iterations for the C code
to < 275 us / 1000 iterations for sse2 for block_size = 16 and from
about 1800 us / 1000 iters to < 100 us / 1000 iters for block_size = 8
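The bit-width argument for count can be checked with a few lines of C. The macro and function names here are hypothetical; only the numbers 15 and 32 come from the message above.

```c
#include <stdint.h>

#define MAX_FILTERED_FRAMES 15   /* max number of filtered frames */
#define MAX_PER_FRAME 32         /* modifier = 16, filter_weight = 2 */

/* Number of bits needed to represent v (0 for v == 0). */
static unsigned bits_needed(unsigned v)
{
    unsigned bits = 0;
    while (v) {
        bits++;
        v >>= 1;
    }
    return bits;
}

/* Largest value count can reach across all filtered frames. */
static unsigned max_count(void)
{
    return MAX_FILTERED_FRAMES * MAX_PER_FRAME;
}
```

max_count() is 480 and bits_needed(480) is 9, so the accumulated value fits comfortably in a 16-bit short.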
Yunqing Wang [Wed, 29 Dec 2010 15:28:35 +0000 (10:28 -0500)]
Always update last_frame_type
Scott pointed out that last_frame_type only gets updated while
loopfilter exists. Since last_frame_type is also needed in
motion search now, it needs to be updated every frame.
Scott LaVarnway [Tue, 28 Dec 2010 19:51:46 +0000 (14:51 -0500)]
Use the fast quantizer for inter mode selection
Use the fast quantizer for inter mode selection and the
regular quantizer for the rest of the encode for good quality,
speed 1. Both performance and quality were improved. The
quality gains will make up for the quality loss mentioned in
I9dc089007ca08129fb6c11fe7692777ebb8647b0.
Yunqing Wang [Thu, 23 Dec 2010 16:23:03 +0000 (11:23 -0500)]
Modify motion estimation for SPLITMV mode
1. Search for block8x16/block16x8 uses block8x8's search results.
2. Check block4x4 only if block8x8 is chosen. (This hurts quality,
which will be improved in another check-in.)
3. In block4x4 search, the previous block's result is used as
MV predictor for next block.
Yaowu Xu [Fri, 24 Dec 2010 03:59:12 +0000 (19:59 -0800)]
adjusted sad_per_bit to correlate with quantizer
Re-calibrated the sad_per_bit16 and sad_per_bit4 tables to correlate
linearly with quantizer values. These two variables are used in
motion search for costing motion vectors. This change has a small
positive effect on compression.
Johann [Mon, 29 Nov 2010 19:21:11 +0000 (14:21 -0500)]
abstract apply_temporal_filter
allow for optimized versions of apply_temporal_filter
(now vp8_apply_temporal_filter_c)
the function was previously declared as static and appears to have been
inlined. with this change, that's no longer possible. performance takes
a small hit.
the declaration for vp8_cx_temp_filter_c was moved to onyx_if.c because
of a circular dependency. for rtcd, temporal_filter.h holds the
definition for the rtcd table, so it needs to be included by onyx_int.h.
however, onyx_int.h holds the definition for VP8_COMP which is needed
for the function prototype. blah.
John Koleszar [Fri, 17 Dec 2010 16:34:02 +0000 (11:34 -0500)]
propagate user private data on decode
The pointer passed in the user_priv argument to vpx_codec_decode()
should be propagated through to the corresponding output frame and
made available in the image's user_priv member. Fixes issue #252
John Koleszar [Fri, 17 Dec 2010 14:43:39 +0000 (09:43 -0500)]
Add psnr/ssim tuning option
Add a new encoder control, VP8E_SET_TUNING, to allow the application
to inform the encoder that the material will benefit from certain
tuning. Expose this control as the --tune option to vpxenc. The args
helper is expanded to support enumerated arguments by name or value.
Two tunings are provided by this patch, PSNR (default) and SSIM.
Activity masking is made dependent on setting --tune=ssim, as the
current implementation hurts speed (10%) and PSNR (2.7% avg,
10% peak) too much for it to be a default yet.
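Accepting an enumerated argument by name or by value could be sketched as follows. This is not the vpxenc args helper itself: struct enum_entry, tuning_enum, parse_enum and the numeric values 0 and 1 are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry: a name and the integer it stands for. */
struct enum_entry {
    const char *name;
    int value;
};

/* ASSUMPTION: psnr = 0 and ssim = 1; the real values are not given above. */
static const struct enum_entry tuning_enum[] = {
    { "psnr", 0 },
    { "ssim", 1 },
    { NULL, 0 }
};

/* Returns 0 and stores the value in *out on success, -1 otherwise. */
static int parse_enum(const char *arg, const struct enum_entry *list, int *out)
{
    char *end;
    long num = strtol(arg, &end, 10);
    int is_number = (end != arg && *end == '\0');
    const struct enum_entry *e;

    for (e = list; e->name != NULL; e++) {
        /* Accept either a listed numeric value or a listed name. */
        if ((is_number && num == e->value) || strcmp(arg, e->name) == 0) {
            *out = e->value;
            return 0;
        }
    }
    return -1;
}
```

With this table, both "ssim" and "1" resolve to the same tuning, and anything not listed is rejected.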
Henrik Lundin [Tue, 14 Dec 2010 13:05:06 +0000 (14:05 +0100)]
Inform caller of decoder about updated references
Inform the caller of the decoder if a decoded frame updated the last,
golden, or altref frames, as required for the realtime communication
proposed in the VP8 RTP payload format document.
Added a new vpx_codec_control called VP8D_GET_LAST_REF_UPDATES, to be
called after vpx_codec_decode. The control indicates which of the
reference frames were updated by setting the 3 LSBs in the input
int (pointer).
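Interpreting the returned bits might look like the sketch below. The bit assignment (last = 1, golden = 2, altref = 4) is an assumption, since the message only says that the 3 LSBs are used; REF_LAST, REF_GOLD, REF_ALTREF and ref_updated are hypothetical names defined here so the sketch stands alone.

```c
/* ASSUMED bit layout for the 3 LSBs; not confirmed by the message above. */
enum {
    REF_LAST   = 1 << 0,
    REF_GOLD   = 1 << 1,
    REF_ALTREF = 1 << 2
};

/* Returns nonzero if the given reference was updated by the last decode. */
static int ref_updated(int flags, int which)
{
    return (flags & which) != 0;
}
```

A caller would read the int filled in by the control after vpx_codec_decode and test each bit with ref_updated.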
Scott LaVarnway [Thu, 16 Dec 2010 22:01:27 +0000 (17:01 -0500)]
Changed segmentation check order
In SPLITMV, the 8x8 segment will be checked first. If the 8x8 rd
is better than the best, we check the other segments. Otherwise
bail. Adjustments to the thresh_mult were necessary to make
up for the initial quality loss.
The performance improved by 20% (average) for good quality,
speed 0 and speed 1, while the overall quality remained the same.
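The new check order can be sketched roughly as below. The real encoder computes each segment's rd on demand; this sketch takes precomputed costs as a stand-in, and the enum and pick_segmentation are hypothetical names.

```c
/* Hypothetical labels for the four SPLITMV segmentations. */
enum { SEG_16X8, SEG_8X16, SEG_8X8, SEG_4X4, SEG_COUNT };

/* rd[] holds a stand-in cost per segmentation. Returns the chosen
 * segmentation, or -1 if 8x8 does not beat best_rd (early bail). */
static int pick_segmentation(const int rd[SEG_COUNT], int best_rd)
{
    int best_seg;
    int s;

    /* Check 8x8 first; if it is no better than the best, bail. */
    if (rd[SEG_8X8] >= best_rd)
        return -1;

    best_seg = SEG_8X8;
    best_rd = rd[SEG_8X8];

    /* Only now look at the remaining segmentations. */
    for (s = 0; s < SEG_COUNT; s++) {
        if (s != SEG_8X8 && rd[s] < best_rd) {
            best_rd = rd[s];
            best_seg = s;
        }
    }
    return best_seg;
}
```

Note that the early bail skips the other segmentations even when one of them would have been cheaper.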
Scott LaVarnway [Thu, 16 Dec 2010 14:38:02 +0000 (09:38 -0500)]
Adjusted breakout RD for SPLITMV
vp8_rd_pick_best_mbsegmentation looks at y only. The new
breakout does not include the frame cost, the prob_skip_false
cost, or the uv rate. Performance improved by a few percent
and the quality remained the same.
Yunqing Wang [Tue, 14 Dec 2010 22:39:25 +0000 (17:39 -0500)]
Fix a bug in motion search code(2)
This fix added MV range checks for NEWMV mode as suggested by Jim.
To reduce unnecessary MV range checks, I tried Yaowu's suggestion.
Update UMV borders in NEWMV mode to also cover MV range check.
Also, in this way, every MV that is valid gets checked in diamond
search function.
Yunqing Wang [Tue, 14 Dec 2010 16:00:25 +0000 (11:00 -0500)]
Fix a bug in motion search code
The MV's range is 256. Since the new motion search uses a different
starting MV than the center ref MV, an MV range check needs to
be done to avoid corruption.
James Berry [Mon, 13 Dec 2010 18:10:58 +0000 (13:10 -0500)]
fixed vpxenc bug where ivf files would be read incorrectly
read_frame would incorrectly insert detect->buf into img
for ivf files. detect->position is now set to 4 if the input file is
detected to be ivf in file_is_ivf, to keep this from occurring.
Yaowu Xu [Mon, 6 Dec 2010 21:33:01 +0000 (13:33 -0800)]
adjust RDMULT for UV plane in quantization RDO
This patch adds a weighting factor on RDMULT for UV blocks. The change
has an overall gain of about 0.5% based on ssim, and between 0.1 and
0.2% by psnr numbers.
Scott LaVarnway [Mon, 6 Dec 2010 21:42:52 +0000 (16:42 -0500)]
vp8_rd_pick_best_mbsegmentation code restructure
Moved the code from the segmentation loop into a function
which is now called for each segment. This will allow us
to change the segment order checking more easily.
Paul Wilkins [Sat, 4 Dec 2010 10:04:12 +0000 (10:04 +0000)]
Change to inter_minq table.
The inter_minq table controls the range of quantizers available
for a particular frame in two pass relative to a max Q value.
The change reduces the range somewhat. The effect of this
was a small increase (0.3% average) in psnr for the test set
but it should also help encode speed somewhat for higher
quality modes as it will reduce the number of iterations in the
recode loop.
The change damps the range of quantizers available locally
within a section of a clip and should therefore help keep quality
more uniform. If there is systematic overshoot or undershoot the
range can shift gradually to accommodate. However, there is
some increased risk of overshoot or undershoot against the target
bit rate in VBR mode and this risk will be more pronounced for short
clips.
Yunqing Wang [Fri, 3 Dec 2010 16:26:21 +0000 (11:26 -0500)]
Improve MV prediction accuracy to achieve performance gain
Add vp8_mv_pred() to better predict starting MV for NEWMV
mode in vp8_rd_pick_inter_mode(). Set different search
ranges according to MV prediction accuracy, which improves
encoder performance without hurting the quality. Also,
as Yaowu suggested, using the diamond search result as the full
search starting point, and therefore reducing the
full search range, helps performance.
Fritz Koenig [Thu, 18 Nov 2010 18:40:58 +0000 (10:40 -0800)]
Set refresh_alt_ref_frame on keyframe encode.
On a keyframe alt ref and golden are refreshed. The flag was
not being set and so on the frame after a keyframe, motion
search would occur on the alt ref frame. This is not necessary
because the alt ref frame is identical to the last frame in this
scenario.
Handle corner case where a forward alt-ref frame is put
directly after a keyframe.
Pascal Massimino [Wed, 24 Nov 2010 08:22:59 +0000 (00:22 -0800)]
allow dimensions as low as 1 pixel
remove warning comment in vpxenc.c: in case of 1x1 picture,
detect_bytes will be equal to '3' and we'll fall back to
RAW_TYPE.
fix read_frame() by tracking the pre-read buffer length
in the struct detect
Paul Wilkins [Mon, 22 Nov 2010 13:17:35 +0000 (13:17 +0000)]
Recalibration of bits per MB tables
The baseline bits per MB prediction tables have been
recalibrated based on the assumption that bits per mb
is inversely proportional to the quantizer level.
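The assumed relationship can be written as bits_per_mb = scale / q. The sketch below only illustrates that proportionality; estimate_bits_per_mb and the scale constant are hypothetical and do not reproduce the actual table values.

```c
/* Hypothetical model: bits per MB inversely proportional to quantizer q.
 * 'scale' is an arbitrary calibration constant, not a value from libvpx. */
static int estimate_bits_per_mb(int scale, int q)
{
    if (q <= 0)
        return scale;   /* guard against division by zero */
    return scale / q;
}
```

Doubling q halves the estimate under this model.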
Paul Wilkins [Wed, 17 Nov 2010 15:12:04 +0000 (15:12 +0000)]
Replaced recode loop test with a function call
Replaced existing code to decide if a frame recode is required
with a function call. This is to simplify addition of extra clauses
that may be needed for the planned constrained quality mode.
Also fixed a bug whereby alt ref was not considered in the test.
John Koleszar [Wed, 17 Nov 2010 14:13:54 +0000 (09:13 -0500)]
vp8mt_alloc_temp_buffers: make prototype return void
This function was never called in a context expecting a return value,
the return value was always a constant, and the !CONFIG_MULTITHREAD
path didn't have a return statement, which caused a compiler warning.
This patch changes the function to return void instead.
Paul Wilkins [Mon, 15 Nov 2010 17:47:12 +0000 (17:47 +0000)]
Bad cost tables used in ARNR filtering.
The use of incorrect mv costing tables in the ARNR sub-pel
filtering code led to corruption of the altref buffer in some cases,
particularly at low data rates.
The average gain from this fix is about 0.3% but there are a few
extreme cases where nasty and visible artifacts manifested and
for these few data points the improvement is > 10%.