Attila Nagy [Mon, 10 Jan 2011 09:14:10 +0000 (11:14 +0200)]
Fix encoder real-time only configuration.
Remove allocation/deallocation of stats storage.
Remove full search functions in machine specific encoder inits.
Remove last pass validation in validate_config.
Paul Wilkins [Mon, 17 Jan 2011 17:23:11 +0000 (17:23 +0000)]
Fix CQ range and experimental KF sizing changes.
The CQ level was not using the q_trans[] array to convert
to a 0-127 range as per min and maxq.
Experimental change to try and match the reconstruction
error for forced key frames approximately to that of the
previous frame by means of the recode loop. Though this
may cause extra recodes and the recode behavior has not
been optimized, it can only happen on forced key frames.
Paul Wilkins [Fri, 14 Jan 2011 14:52:15 +0000 (14:52 +0000)]
Testing of modes with Alt Ref frame
Previously when a frame was being overlaid on a previously coded
alt ref frame we only checked the alt ref 0,0 mode. Where there is
a possibility that the alt ref buffer is a filtered frame we should allow
the other prediction modes as normal or at the least allow use of
the last frame buffer.
Adrian Grange [Fri, 14 Jan 2011 15:04:39 +0000 (15:04 +0000)]
ARNR filter pointer update bug fix
In cases where the frame width is not a multiple of 16 the
ARNR filter would go wrong.
In vp8_temporal_filter_iterate_c when updating pointers
at the end of a row of MBs, the image size was
incorrectly used rather than using Num_MBs_In_Row
times 16 (Y) or 8 (U,V).
This worked when width is multiple of 16 but failed
otherwise.
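The pointer arithmetic described above can be sketched as follows. This
is a minimal illustration, not the code from the commit: the helper
names next_mb_row_offset and mb_cols_for_width are hypothetical, and the
stride value used in any call is whatever the plane's row pitch happens
to be.

```c
#include <stddef.h>

/* Hypothetical helper: number of macroblock columns needed to cover a
 * frame of the given pixel width (16 luma pixels per macroblock). */
static int mb_cols_for_width(int width)
{
    return (width + 15) >> 4;
}

/* Hypothetical helper: offset to add to a plane pointer after the last
 * macroblock of a row so that it lands on the first block of the next
 * row. block_size is 16 for Y and 8 for U/V; stride is the row pitch.
 * While walking the row the pointer has already advanced by
 * mb_cols * block_size, so that amount (not the image width) is
 * subtracted from one block-row of stride. */
static ptrdiff_t next_mb_row_offset(int stride, int mb_cols, int block_size)
{
    return (ptrdiff_t)block_size * stride - (ptrdiff_t)mb_cols * block_size;
}
```

For a width of 180 there are 12 macroblock columns, so the pointer has
moved 192 pixels along the row, not 180. Subtracting the image width
would leave it 12 pixels off on every row.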
Paul Wilkins [Fri, 14 Jan 2011 11:34:53 +0000 (11:34 +0000)]
KF/GF Pulsing
This change is designed to try and reduce pulsing effects when moving,
via a complex transition like a fade, into an easy or static section in
an otherwise difficult clip in CQ mode.
The active CQ level is relaxed down to the user entered level for frames that
are generating less than the passed in minimum bandwidth.
Paul Wilkins [Wed, 12 Jan 2011 17:08:42 +0000 (17:08 +0000)]
Limit key frame quantizer for forced key frames.
Where a key frame occurs because of a minimum interval
selected by the user, these forced key frames ideally need
to be more closely matched in quality to the surrounding frames.
Yunqing Wang [Mon, 10 Jan 2011 21:16:59 +0000 (16:16 -0500)]
Fix bug in motion search
The maximum possible MV in 1/8 pel units is (1<<11), which could
put mvcost out of its range, which is 1023. Changing the maximum
possible MV in 1/8 pel units to (1<<11)-8 fixes this problem.
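The arithmetic behind this limit can be checked with a small sketch. It
rests on an assumption that is not stated in the commit message: that
the cost table is indexed by the MV component divided by two, with valid
indices from -1023 to 1023. The names MVCOST_MAX_INDEX and
mvcost_index_in_range are hypothetical.

```c
/* Assumption: the cost table is indexed by (mv component / 2) and
 * accepts indices in [-MVCOST_MAX_INDEX, MVCOST_MAX_INDEX]. */
enum { MVCOST_MAX_INDEX = 1023 };

/* Returns 1 if the component maps to a valid cost table index. */
static int mvcost_index_in_range(int mv_component)
{
    int idx = mv_component / 2;
    return idx >= -MVCOST_MAX_INDEX && idx <= MVCOST_MAX_INDEX;
}
```

Under that assumption (1<<11) maps to index 1024, one past the end,
while (1<<11)-8 maps to 1020, which is inside the table.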
Paul Wilkins [Mon, 10 Jan 2011 16:41:53 +0000 (16:41 +0000)]
Two Pass VBR change
Further experiment with restriction of the Q range.
This uses the average non KF/GF/ARF quantizer, instead
of just relying on the initial value. It is not such a strong constraint
but there may be a reduced risk of rate misses.
Paul Wilkins [Fri, 7 Jan 2011 18:29:37 +0000 (18:29 +0000)]
CQ Mode
The merge includes hooks for CQ mode and other code
changes merged from the test branch.
CQ mode attempts to maintain a more stable quantizer within a clip
whilst also trying to adhere to a guideline maximum bitrate.
The existing target data rate parameter is used to specify the
guideline maximum bitrate.
A new parameter allows the user to specify a target CQ level.
For normal (non kf/gf/arf) frames, the quantizer will not drop BELOW the
user specified value (0-63). However, in some cases the encoder may
choose to impose a target CQ that is above that specified by the user,
if it estimates that consistent use of the target value is not compatible
with guideline maximum bitrate.
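The floor behaviour for normal frames can be sketched as below. This is
an illustration only: clamp_normal_frame_q and its parameters are
hypothetical names, and the real encoder's rate control is considerably
more involved.

```c
/* Hypothetical helper: apply the CQ floor to a candidate quantizer for a
 * normal (non kf/gf/arf) frame. user_cq_level is the value supplied by
 * the user; encoder_cq_level is the level the encoder decided to impose,
 * which may be higher when the user's level does not fit the guideline
 * maximum bitrate. The effective floor is the larger of the two. */
static int clamp_normal_frame_q(int q, int user_cq_level, int encoder_cq_level)
{
    int floor_q = encoder_cq_level > user_cq_level ? encoder_cq_level
                                                   : user_cq_level;
    return q < floor_q ? floor_q : q;
}
```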
Paul Wilkins [Fri, 7 Jan 2011 16:33:59 +0000 (16:33 +0000)]
Limit Q variability in two pass.
In two pass encoding each frame is given an active
Q range to work with. This change limits how much this
Q range can be altered over time from the initial estimate
made for the clip as a whole.
There is some danger this could lead to overshoot or undershoot
in some corner cases but it helps considerably in regard to
clips where either there is a glut or famine of bits in some sections,
particularly near the end of a clip.
Johann [Wed, 22 Dec 2010 16:23:51 +0000 (11:23 -0500)]
x86 sse2 temporal_filter_apply
count can be reduced to short because the max number of filtered frames
is set to 15. the max value for any frame is 32 (modifier = 16,
filter_weight = 2). 15*32 = 480 which requires 9 bits
this function goes from about 7000 us / 1000 iterations for the C code
to < 275 us / 1000 iterations for sse2 for block_size = 16 and from
about 1800 us / 1000 iters to < 100 us / 1000 iters for block_size = 8
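The bit-width argument above is easy to verify. The sketch below uses
only the figures quoted in the message (15 frames, a per-frame maximum
of 32); the identifiers are hypothetical.

```c
#include <stdint.h>

/* Figures taken from the commit message. */
enum { MAX_FILTERED_FRAMES = 15, MAX_PER_FRAME_WEIGHT = 32 };

/* Number of bits needed to represent an unsigned value. */
static unsigned bits_needed(unsigned value)
{
    unsigned bits = 0;
    while (value) {
        bits++;
        value >>= 1;
    }
    return bits;
}

/* Largest value the per-pixel count accumulator can reach. */
static unsigned max_accumulated_count(void)
{
    return MAX_FILTERED_FRAMES * MAX_PER_FRAME_WEIGHT;
}
```

Since 480 needs 9 bits, the accumulator fits comfortably in a 16-bit
lane, which is what lets the sse2 version process more counts at once.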
John Koleszar [Thu, 6 Jan 2011 18:07:39 +0000 (13:07 -0500)]
fix last frame buffer copy logic regression
Commit 0ce3901 introduced a change in the frame buffer copy logic where
the NEW frame could be copied to the ARF or GF buffer through the
copy_buffer_to_{arf,gf}==1 flags, if the LAST frame was not being
refreshed. This is not correct. The intent of the
copy_buffer_to_{arf,gf}==1 flag is to copy the LAST buffer. To copy the
NEW buffer, the refresh_{alt_ref,golden}_frame flag should be used.
The original buffer copy logic is fairly convoluted. For example:
    if (cm->refresh_last_frame)
    {
        vp8_swap_yv12_buffer(&cm->last_frame, &cm->new_frame);
        cm->frame_to_show = &cm->last_frame;
    }
    else
    {
        cm->frame_to_show = &cm->new_frame;
    }
    ...
    if (cm->copy_buffer_to_arf)
    {
        if (cm->copy_buffer_to_arf == 1)
        {
            if (cm->refresh_last_frame)
                vp8_yv12_copy_frame_ptr(&cm->new_frame, &cm->alt_ref_frame);
            else
                vp8_yv12_copy_frame_ptr(&cm->last_frame, &cm->alt_ref_frame);
        }
        else if (cm->copy_buffer_to_arf == 2)
            vp8_yv12_copy_frame_ptr(&cm->golden_frame, &cm->alt_ref_frame);
    }
Effectively, if refresh_last_frame, then new and last are swapped, so
when "new" is copied to ARF, it's equivalent to copying LAST to ARF. If
not refresh_last_frame, then LAST is copied to ARF. So LAST is copied to
ARF in both cases.
Commit 0ce3901 removed the first buffer swap but kept the
refresh_last_frame?new:last behavior, changing the sense since the first
swap wasn't done to the more readable refresh_last_frame?last:new, but
this logic is not correct when !refresh_last_frame.
This commit restores the correct behavior from v0.9.1 and prior. This
case is missing from the test vector set.
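The intended semantics can be modelled in a few lines. This is a
simplified sketch, not the code from the commit: buffers are reduced to
integer tags, the struct frame_state and the function
update_reference_buffers are hypothetical, and only the ARF path is
shown.

```c
/* Simplified model: each buffer is represented by an integer tag.
 * Field names follow the commit message; the struct is hypothetical. */
struct frame_state {
    int last_frame;
    int new_frame;
    int golden_frame;
    int alt_ref_frame;
    int refresh_last_frame;
    int copy_buffer_to_arf; /* 0 = no copy, 1 = copy LAST, 2 = copy GOLDEN */
};

static void update_reference_buffers(struct frame_state *cm)
{
    /* Do the copy first, so that ARF receives LAST as it was before this
     * frame, regardless of whether LAST is about to be refreshed. */
    if (cm->copy_buffer_to_arf == 1)
        cm->alt_ref_frame = cm->last_frame;
    else if (cm->copy_buffer_to_arf == 2)
        cm->alt_ref_frame = cm->golden_frame;

    /* Only then does NEW replace LAST, if requested. */
    if (cm->refresh_last_frame)
        cm->last_frame = cm->new_frame;
}
```

In this model copy_buffer_to_arf == 1 gives ARF the previous LAST buffer
whether or not refresh_last_frame is set, and NEW never reaches ARF
through the copy flag.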
Yunqing Wang [Wed, 29 Dec 2010 15:28:35 +0000 (10:28 -0500)]
Always update last_frame_type
Scott pointed out that last_frame_type only gets updated while
loopfilter exists. Since last_frame_type is also needed in
motion search now, it needs to be updated every frame.
Scott LaVarnway [Tue, 28 Dec 2010 19:51:46 +0000 (14:51 -0500)]
Use the fast quantizer for inter mode selection
Use the fast quantizer for inter mode selection and the
regular quantizer for the rest of the encode for good quality,
speed 1. Both performance and quality were improved. The
quality gains will make up for the quality loss mentioned in
I9dc089007ca08129fb6c11fe7692777ebb8647b0.
Yunqing Wang [Thu, 23 Dec 2010 16:23:03 +0000 (11:23 -0500)]
Modify motion estimation for SPLITMV mode
1. Search for block8x16/block16x8 uses block8x8's search results.
2. Check block4x4 only if block8x8 is chosen. (This hurts quality,
which will be improved in another check-in.)
3. In block4x4 search, the previous block's result is used as
MV predictor for next block.
Yaowu Xu [Fri, 24 Dec 2010 03:59:12 +0000 (19:59 -0800)]
adjusted sad_per_bit to correlate with quantizer
Re-calibrated sad_per_bit16 and sad_per_bit4 tables to correlate
linearly with quantizer values. These two variables are used in
motion search for costing motion vectors. This change has a small
positive effect on compression.
Johann [Mon, 29 Nov 2010 19:21:11 +0000 (14:21 -0500)]
abstract apply_temporal_filter
allow for optimized versions of apply_temporal_filter
(now vp8_apply_temporal_filter_c)
the function was previously declared as static and appears to have been
inlined. with this change, that's no longer possible. performance takes
a small hit.
the declaration for vp8_cx_temp_filter_c was moved to onyx_if.c because
of a circular dependency. for rtcd, temporal_filter.h holds the
definition for the rtcd table, so it needs to be included by onyx_int.h.
however, onyx_int.h holds the definition for VP8_COMP which is needed
for the function prototype. blah.
John Koleszar [Fri, 17 Dec 2010 16:34:02 +0000 (11:34 -0500)]
propagate user private data on decode
The pointer passed in the user_priv argument to vpx_codec_decode()
should be propagated through to the corresponding output frame and
made available in the image's user_priv member. Fixes issue #252
John Koleszar [Fri, 17 Dec 2010 14:43:39 +0000 (09:43 -0500)]
Add psnr/ssim tuning option
Add a new encoder control, VP8E_SET_TUNING, to allow the application
to inform the encoder that the material will benefit from certain
tuning. Expose this control as the --tune option to vpxenc. The args
helper is expanded to support enumerated arguments by name or value.
Two tunings are provided by this patch, PSNR (default) and SSIM.
Activity masking is made dependent on setting --tune=ssim, as the
current implementation hurts speed (10%) and PSNR (2.7% avg,
10% peak) too much for it to be a default yet.
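Accepting an enumerated argument "by name or value" can be sketched as
below. This is not the args helper from the patch: struct enum_entry,
parse_enum_arg and tuning_table are hypothetical, and the numeric values
0 and 1 for the two tunings are assumptions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry: a name and the value it stands for. */
struct enum_entry {
    const char *name;
    int val;
};

/* Assumed values for the two tunings; the table ends with a NULL name. */
static const struct enum_entry tuning_table[] = {
    {"psnr", 0},
    {"ssim", 1},
    {NULL, 0}
};

/* Returns 1 and stores the value in *out if s matches an entry either by
 * name or by numeric value, 0 otherwise. */
static int parse_enum_arg(const char *s, const struct enum_entry *table,
                          int *out)
{
    const struct enum_entry *e;
    char *end;
    long v = strtol(s, &end, 10);

    if (*s != '\0' && *end == '\0') {
        /* Whole string was a number: accept it only if it is listed. */
        for (e = table; e->name; e++) {
            if ((long)e->val == v) {
                *out = e->val;
                return 1;
            }
        }
        return 0;
    }
    /* Otherwise match by name. */
    for (e = table; e->name; e++) {
        if (strcmp(s, e->name) == 0) {
            *out = e->val;
            return 1;
        }
    }
    return 0;
}
```

With this table, "ssim" and "1" both resolve to the same value, while an
unknown name or an unlisted number is rejected.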
Henrik Lundin [Tue, 14 Dec 2010 13:05:06 +0000 (14:05 +0100)]
Inform caller of decoder about updated references
Inform the caller of the decoder if a decoded frame updated last,
golden, or altref frames, required for realtime communication
proposed in document VP8 RTP payload format.
Added a new vpx_codec_control called VP8D_GET_LAST_REF_UPDATES, to be
called after vpx_codec_decode. The control will indicate which of the
reference frames were updated by setting the 3 LSBs in the input
int (pointer).
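A caller might unpack the three low bits as sketched below. The message
does not say which bit stands for which reference, so the assignment
here (bit 0 last, bit 1 golden, bit 2 altref) is an assumption, and the
names REF_UPDATED_*, struct ref_updates and decode_ref_updates are
hypothetical.

```c
/* Assumed bit assignment for the three LSBs; check the codec headers
 * for the actual constants before relying on this. */
enum {
    REF_UPDATED_LAST   = 1 << 0,
    REF_UPDATED_GOLDEN = 1 << 1,
    REF_UPDATED_ALTREF = 1 << 2
};

/* One flag per reference buffer, each 0 or 1. */
struct ref_updates {
    int last;
    int golden;
    int altref;
};

/* Split the int filled in by the control into per-reference flags. */
static struct ref_updates decode_ref_updates(int flags)
{
    struct ref_updates u;
    u.last   = (flags & REF_UPDATED_LAST) != 0;
    u.golden = (flags & REF_UPDATED_GOLDEN) != 0;
    u.altref = (flags & REF_UPDATED_ALTREF) != 0;
    return u;
}
```

The flags value would come from passing the address of an int to the
VP8D_GET_LAST_REF_UPDATES control after each vpx_codec_decode call.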
Scott LaVarnway [Thu, 16 Dec 2010 22:01:27 +0000 (17:01 -0500)]
Changed segmentation check order
In SPLITMV, the 8x8 segment will be checked first. If the 8x8 rd
is better than the best, we check the other segments. Otherwise
bail. Adjustments to the thresh_mult were necessary to make
up for the initial quality loss.
The performance improved by 20% (average) for good quality,
speed 0 and speed 1, while the overall quality remained the same.
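The check order described above can be sketched as follows. This is a
simplified model, not the encoder's code: the rd costs are passed in as
a precomputed array rather than evaluated on demand, and the enum values
and the function pick_segmentation are hypothetical.

```c
/* Hypothetical segmentation identifiers. */
enum { SEG_16X8, SEG_8X16, SEG_8X8, SEG_4X4, SEG_COUNT };

/* Returns the chosen segmentation, or -1 when the 8x8 check fails to
 * beat best_rd and the search bails. *evaluated reports how many
 * segmentations were looked at, which is where the speed-up comes from. */
static int pick_segmentation(const int rd_cost[SEG_COUNT], int best_rd,
                             int *evaluated)
{
    int seg;
    int best_seg = SEG_8X8;
    int best = rd_cost[SEG_8X8];

    /* 8x8 is always checked first. */
    *evaluated = 1;
    if (best >= best_rd)
        return -1; /* not better than the best so far: skip the rest */

    /* 8x8 looked promising, so check the remaining segmentations. */
    for (seg = 0; seg < SEG_COUNT; seg++) {
        if (seg == SEG_8X8)
            continue;
        (*evaluated)++;
        if (rd_cost[seg] < best) {
            best = rd_cost[seg];
            best_seg = seg;
        }
    }
    return best_seg;
}
```

The early return also rejects macroblocks where another segmentation
would have won even though 8x8 did not, which is the initial quality
loss that the thresh_mult adjustments compensate for.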
Scott LaVarnway [Thu, 16 Dec 2010 14:38:02 +0000 (09:38 -0500)]
Adjusted breakout RD for SPLITMV
vp8_rd_pick_best_mbsegmentation looks at y only. The new
breakout does not include the frame cost, the prob_skip_false
cost, or the uv rate. Performance improved by a few percent
and the quality remained the same.
Yunqing Wang [Tue, 14 Dec 2010 22:39:25 +0000 (17:39 -0500)]
Fix a bug in motion search code(2)
This fix added MV range checks for NEWMV mode as suggested by Jim.
To reduce unnecessary MV range checks, I tried Yaowu's suggestion.
Update UMV borders in NEWMV mode to also cover MV range check.
Also, in this way, every MV that is valid gets checked in diamond
search function.
Yunqing Wang [Tue, 14 Dec 2010 16:00:25 +0000 (11:00 -0500)]
Fix a bug in motion search code
The MV's range is 256. Since the new motion search uses a different
starting MV than the center ref MV, an MV range check needs to
be done to avoid corruption.
James Berry [Mon, 13 Dec 2010 18:10:58 +0000 (13:10 -0500)]
fixed vpxenc bug where ivf files would be read incorrectly
read_frame would incorrectly insert detect->buf into img
for ivf files. detect->position now set to 4 if input file is
detected to be ivf in file_is_ivf to keep this from occurring.
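The idea of the fix can be sketched as below. IVF files start with the
four-byte signature "DKIF"; everything else here is an assumption for
illustration: the layout of struct detect_buffer and the function
looks_like_ivf are hypothetical and are not the vpxenc code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the detection buffer: the bytes read while
 * probing the file type, how many were read, and how many of them have
 * already been consumed. */
struct detect_buffer {
    char buf[4];
    size_t buf_read;
    size_t position;
};

/* Returns 1 if the probed bytes carry the IVF signature. In that case
 * all four bytes are marked as consumed, so that a later frame read does
 * not copy them into the image data. */
static int looks_like_ivf(struct detect_buffer *detect)
{
    if (detect->buf_read == 4 && memcmp(detect->buf, "DKIF", 4) == 0) {
        detect->position = 4;
        return 1;
    }
    return 0;
}
```

For non-IVF input the position is left untouched, so the probed bytes
are still available to be treated as the start of the first frame.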
Yaowu Xu [Mon, 6 Dec 2010 21:33:01 +0000 (13:33 -0800)]
adjust RDMULT for UV plane in quantization RDO
This patch adds a weighting factor on RDMULT for UV blocks. The change
has an overall gain of about 0.5% based on ssim, and between 0.1 and
0.2% by psnr numbers.
Scott LaVarnway [Mon, 6 Dec 2010 21:42:52 +0000 (16:42 -0500)]
vp8_rd_pick_best_mbsegmentation code restructure
Moved the code from the segmentation loop into a function
which is now called for each segment. This will allow us
to change the segment order checking more easily.
Paul Wilkins [Sat, 4 Dec 2010 10:04:12 +0000 (10:04 +0000)]
Change to inter_minq table.
The inter_minq table controls the range of quantizers available
for a particular frame in two pass relative to a max Q value.
The change reduces the range somewhat. The effect of this
was a small increase (0.3% average) in psnr for the test set
but it should also help encode speed somewhat for higher
quality modes as it will reduce the number of iterations in the
recode loop.
The change damps the range of quantizers available locally
within a section of a clip and should therefore help keep quality
more uniform. If there is systematic overshoot or undershoot the
range can shift gradually to accommodate. However, there is
some increased risk of overshoot or undershoot against the target
bit rate in VBR mode and this risk will be more pronounced for short
clips.
Yunqing Wang [Fri, 3 Dec 2010 16:26:21 +0000 (11:26 -0500)]
Improve MV prediction accuracy to achieve performance gain
Add vp8_mv_pred() to better predict starting MV for NEWMV
mode in vp8_rd_pick_inter_mode(). Set different search
ranges according to MV prediction accuracy, which improves
encoder performance without hurting the quality. Also,
as Yaowu suggested, using the diamond search result as the full
search starting point, and therefore reducing the
full search range, helps the performance.
Fritz Koenig [Thu, 18 Nov 2010 18:40:58 +0000 (10:40 -0800)]
Set refresh_alt_ref_frame on keyframe encode.
On a keyframe alt ref and golden are refreshed. The flag was
not being set and so on the frame after a keyframe, motion
search would occur on the alt ref frame. This is not necessary
because the alt ref frame is identical to the last frame in this
scenario.
Handle corner case where a forward alt-ref frame is put
directly after a keyframe.
Pascal Massimino [Wed, 24 Nov 2010 08:22:59 +0000 (00:22 -0800)]
allow dimensions as low as 1 pixel
remove warning comment in vpxenc.c: in case of 1x1 picture,
detect_bytes will be equal to '3' and we'll fall back to
RAW_TYPE.
fix read_frame() by tracking the pre-read buffer length
in the struct detect