Scott LaVarnway [Wed, 3 Oct 2012 17:07:13 +0000 (10:07 -0700)]
WIP: Multiple decoder instances support
Started adding support for multiple internal decoder instances. Also added
code to limit the vp8 config options available when using frame-based
multithreading.
When error concealment is enabled, it swaps the mi and prev_mi ptrs after
each frame is decoded. The postproc uses the mi ptr for the mode info context.
Now the postproc will use the correct mode info context.
John Koleszar [Thu, 17 Jan 2013 00:44:33 +0000 (16:44 -0800)]
Don't include x86inc.asm on non-x86 targets
This file is currently unused, as the asm that depended on it has been
disabled for the current roll into Chromium. It's expected that it
will return in some form, so wrap it in an x86 check rather than
deleting it. This extra file isn't really an issue with the libvpx
build system, but affects the gyp builds since on ARM (android) it
tries to do the ADS->GAS conversion on all .asm files reported in
libvpx_srcs.txt.
John Koleszar [Mon, 14 Jan 2013 19:49:30 +0000 (11:49 -0800)]
Use INT64_MAX instead of LLONG_MAX
These variables have the type int64_t, not long long. long long could
be a larger type than 64 bits. Emulate INT64_MAX for older versions of
MSVC, and remove the unreferenced vpx_ports/vpxtypes.h.
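A minimal sketch of the idea, assuming the emulation is a guarded macro
definition (the commit text does not say how it is done). The helper
initial_best_rd() is hypothetical and only shows an int64_t value being
seeded with the matching limit:

```c
#include <stdint.h>

/* Assumed fallback: only define INT64_MAX when the toolchain's headers
 * did not provide it (as on some older MSVC versions). */
#ifndef INT64_MAX
#define INT64_MAX 9223372036854775807LL
#endif

/* Hypothetical helper: seed a "best cost so far" value of type int64_t
 * with the limit that matches its type, rather than LLONG_MAX. */
static int64_t initial_best_rd(void) {
  return INT64_MAX;
}
```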
Yaowu Xu [Mon, 14 Jan 2013 17:28:35 +0000 (09:28 -0800)]
changed UV plane loop filtering for TX_8X8
In commit 9a1d73d, loop filtering was added for UV 4x4 boundaries
when TX_8X8 is used by a MB. This commit further refined the decision
to be based on the actual transform used for the UV planes. When
UV planes use 4x4 transform, i.e. when prediction mode used is either
I8X8_PRED or SPLITMV, UV planes are filtered on 4x4 boundaries, and no
filtering is applied on 4x4 block boundaries when UV planes use 8X8
transform.
Ronald S. Bultje [Mon, 14 Jan 2013 20:43:12 +0000 (12:43 -0800)]
Reset x->skip for each iteration in the RD loop.
This prevents ill-defined behaviour, such as setting x->skip for a mode
that is excluded because of frame-level flags (e.g. filter selection,
compound prediction selection), then not breaking out of the RD loop
because the mode is not allowed, but keeping the flag on. Whatever mode
is iterated through next in the RD loop will then carry this flag, and
all sorts of bad stuff happen, such as x->skip being set on intra pred
modes.
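The failure mode described above can be sketched with a toy loop. The
types and the function below are hypothetical stand-ins, not the libvpx
structures; the only point is that the flag leaks into the next
iteration unless it is cleared at the top of the loop:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the encoder state. */
typedef struct { int skip; } Macroblock;
typedef struct { int allowed; int wants_skip; int is_intra; } ModeCandidate;

/* Walks the candidate modes and returns how many intra candidates were
 * entered with x->skip still set from an earlier iteration. */
static int count_stale_skips(Macroblock *x, const ModeCandidate *modes,
                             size_t n, int reset_each_iteration) {
  int stale = 0;
  size_t i;
  for (i = 0; i < n; ++i) {
    if (reset_each_iteration) x->skip = 0;      /* the fix */
    if (modes[i].is_intra && x->skip) ++stale;  /* flag leaked in */
    if (modes[i].wants_skip) x->skip = 1;       /* mode evaluation sets it */
    if (!modes[i].allowed) continue;            /* excluded: no breakout */
    if (x->skip) break;                         /* normal early exit */
  }
  return stale;
}
```

With a disallowed inter mode that sets the flag followed by an intra
mode, the loop without the reset reports one stale flag and the loop
with the reset reports none.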
Deb Mukherjee [Wed, 9 Jan 2013 14:26:54 +0000 (06:26 -0800)]
Further enhancements/fixes on dct/dwt hybrid txfm
Fixes some scaling issues. Adds an option to only compute the
dct on the low-low subband for 32x32 and 64x64 blocks using
only a single 16x16 dct after 1 and 2 wavelet decomposition
levels respectively. Also adds an option to use an 8x8 dct
as a building block.
Currently, with the 2/6 filter and with a single 16x16 dct on
the low-low band, the results compared to the full 32x32 dct are
as follows:
derf: -0.15%
yt: -0.29%
std-hd: -0.18%
hd: -0.6%
These are my current recommended settings, since the 2/6 filter
is very simple.
Scott LaVarnway [Sat, 12 Jan 2013 01:11:04 +0000 (17:11 -0800)]
WIP: Added sse2 version of vp9_mb_lpf_horizontal_edge_w
and vp9_mb_lpf_vertical_edge_w_sse2. This was quickly done so we can
run some tests over the weekend. Future commits will optimize/refactor these
functions further.
The decoder performance improved by ~17% for the clip used.
Adrian Grange [Tue, 8 Jan 2013 22:14:01 +0000 (14:14 -0800)]
New prediction filter
This patch removes the old pred-filter experiment and replaces it
with one that is implemented using the switchable filter framework.
If the pred-filter experiment is enabled, three interpolation
filters are tested during mode selection: the standard 8-tap
interpolation filter, a sharp 8-tap filter and a (new) 8-tap
smoothing filter.
The 6-tap filter code has been preserved for now and if the
enable-6tap experiment is enabled (in addition to the pred-filter
experiment) the original 6-tap filter replaces the new 8-tap smooth
filter in the switchable mode.
The new experiment applies the prediction filter in cases of a
fractional-pel motion vector. Future patches will apply the filter
where the mv is pel-aligned and also to intra predicted blocks.
Deb Mukherjee [Tue, 8 Jan 2013 20:18:16 +0000 (12:18 -0800)]
Adds 64x64 hybrid dct/dwt transform
This is to add to the 64x64 transform experiment as an alternative to
a 64x64 DCT.
Two levels of wavelet decomposition are used on a 64x64 block, followed
by 16x16 DCT on the four lowest subbands. The highest three subbands
are left untransformed after the first level DWT.
Yaowu Xu [Wed, 19 Dec 2012 19:34:49 +0000 (11:34 -0800)]
minor loop filter refactoring and cleanup
This commit does some minor cleanup/refactoring to prepare for
further loop filter experiments. It merges the y_only version of the loop
filter function into the regular one, which ensures that the same logic is
used both for picking the level and for the actual loop filtering.
Paul Wilkins [Thu, 3 Jan 2013 15:14:36 +0000 (15:14 +0000)]
Further change to mv reference search.
This experimental change reorders the search so
that all possible references that match the target
reference frame are tested first and these in order
of distance from the current block. These will usually
be the highest scoring candidates.
If we do not find enough good candidates this way
we try non matching cases. These will usually be lower
scoring candidates.
The change in order together with breakouts when
we have found enough candidates should reduce
the computational cost and especially reduce the number
of sort operations.
Quality Results:
Std Hd +0.228%, Hd +0.074%, YT +0.046%, derf +0.137%
This effect is probably due to the fact that more distant
weak candidates are now less likely to get "promoted" over
near candidates even if they are repeated.
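The two-stage ordering can be sketched as follows. This is a simplified
illustration, not the actual search code: the Candidate type and
collect_candidates() are hypothetical, the neighbour list is assumed to
be ordered nearest-first already, and scoring/sorting is left out:

```c
#include <stddef.h>

/* Hypothetical candidate record. */
typedef struct { int ref_frame; int mv_row; int mv_col; } Candidate;

/* Fills `out` with at most `max_out` candidates and returns the count.
 * Pass 1 takes neighbours whose reference frame matches the target,
 * in order of distance. Pass 2 runs only if more are still needed and
 * takes the non-matching neighbours. */
static size_t collect_candidates(const Candidate *neighbours, size_t n,
                                 int target_ref, Candidate *out,
                                 size_t max_out) {
  size_t found = 0;
  size_t i;
  for (i = 0; i < n && found < max_out; ++i) {
    if (neighbours[i].ref_frame == target_ref) out[found++] = neighbours[i];
  }
  for (i = 0; i < n && found < max_out; ++i) {
    if (neighbours[i].ref_frame != target_ref) out[found++] = neighbours[i];
  }
  return found;
}
```

Both loops stop as soon as max_out candidates are found, which is the
breakout that saves work when the matching neighbours alone are enough.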
Adrian Grange [Thu, 20 Dec 2012 22:56:19 +0000 (14:56 -0800)]
New interpolation filter selection algorithm
Old Scheme:
When SWITCHABLE filter selection is enabled the encoder
evaluates the use of each interpolation filter type and
selects the best one to use at the MB level. A frame-
level flag can be set to force the use of a particular
filter type for all MBs in a frame if it is more efficient
to encode that way. The logic here involved a Q dependent
threshold that assumed that the second 8-tap filter was
a high-pass filter. However, this requires a trip around
the recode loop. If the frame-level flag indicates use
of a particular filter, the other filters are not
evaluated in the pick_mode loop.
New Scheme:
Each filter type is evaluated at the MB level and a record
of the best filter is kept, irrespective of what filter
is signaled at the frame-level. Once all MBs have been
encoded, a decision is made as to what frame-level mode
to set for the *next* frame. If one filter is used by 80%
or more of the MBs, then this filter is forced since it
is assumed that this will be more efficient if the
next frame has similar characteristics, i.e. there is a
one-frame lag between measuring the filter selection and
setting the frame-level mode to use.
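A minimal sketch of the end-of-frame decision in the new scheme. The
enum values and next_frame_filter_mode() are hypothetical names; only
the 80% rule and the fall-back to per-MB switching come from the
description above:

```c
/* Hypothetical filter identifiers; USE_SWITCHABLE means "signal the
 * filter per MB" rather than forcing one at the frame level. */
enum { FILTER_A = 0, FILTER_B, FILTER_C, FILTER_COUNT,
       USE_SWITCHABLE = FILTER_COUNT };

/* counts[i] is the number of MBs in the frame just encoded for which
 * filter i was best. Returns the frame-level mode for the next frame. */
static int next_frame_filter_mode(const unsigned counts[FILTER_COUNT]) {
  unsigned total = 0;
  int i;
  for (i = 0; i < FILTER_COUNT; ++i) total += counts[i];
  if (total == 0) return USE_SWITCHABLE;
  for (i = 0; i < FILTER_COUNT; ++i) {
    /* counts[i] / total >= 0.8, kept in integer arithmetic */
    if (counts[i] * 5 >= total * 4) return i;
  }
  return USE_SWITCHABLE;
}
```

Because the counts are gathered irrespective of the frame-level mode
in force, the decision needs no trip around the recode loop.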
Yaowu Xu [Thu, 3 Jan 2013 16:00:00 +0000 (08:00 -0800)]
Merge cost_coeffs_2x2() into cost_coeffs()
Remove the special-case function cost_coeffs_2x2() and change
cost_coeffs() to handle the 2nd order Haar block the same way it
already handles all other block types.
Yunqing Wang [Fri, 28 Dec 2012 00:04:44 +0000 (16:04 -0800)]
Skip finding best ref_mvs when the mode is ZEROMV
Read mode before calling vp9_find_best_ref_mvs(). If the mode is
ZEROMV, the best ref_mvs are not needed. Then, we can skip calling
vp9_find_best_ref_mvs().
Yunqing Wang [Thu, 27 Dec 2012 21:48:17 +0000 (13:48 -0800)]
Switch the order of calculating 2-D inverse transform
The 2-D inverse transform X = M1*Z*Transposed_M2 was calculated
in 2 steps from left to right:
1. Vertical transform: Y = M1*Z
2. Horizontal transform: X = Y*Transposed_M2
In SIMD, a transpose is needed in vertical transform.
Here, switched the calculation order to do it from right to left.
In this way, we could eliminate that transpose by writing the
intermediate results out to their transposed positions.
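The reordering can be checked with a scalar sketch. This is an
illustration only, not the SIMD code: all names are hypothetical, the
size is fixed at 4x4, and matrices are flat row-major int arrays. Each
pass transforms rows and stores the result transposed, so two identical
row passes yield X = M1*Z*Transposed_M2 without a separate transpose:

```c
#define TX_N 4

/* out = Transposed(in * Transposed(m)): transform each row of `in`
 * with `m`, writing every result to its transposed position. */
static void row_pass_transposed(const int *in, const int *m, int *out) {
  int r, c, k;
  for (r = 0; r < TX_N; ++r) {
    for (c = 0; c < TX_N; ++c) {
      int sum = 0;
      for (k = 0; k < TX_N; ++k) sum += in[r * TX_N + k] * m[c * TX_N + k];
      out[c * TX_N + r] = sum;
    }
  }
}

/* New order (right to left): t = Transposed(Z*Transposed_M2), then
 * x = Transposed(t*Transposed_M1) = M1*Z*Transposed_M2. */
static void inverse_right_to_left(const int *z, const int *m1,
                                  const int *m2, int *x) {
  int t[TX_N * TX_N];
  row_pass_transposed(z, m2, t);
  row_pass_transposed(t, m1, x);
}

/* Old order (left to right), kept as a reference for comparison. */
static void inverse_left_to_right(const int *z, const int *m1,
                                  const int *m2, int *x) {
  int y[TX_N * TX_N];
  int r, c, k;
  for (r = 0; r < TX_N; ++r) {
    for (c = 0; c < TX_N; ++c) {          /* Y = M1*Z */
      int sum = 0;
      for (k = 0; k < TX_N; ++k) sum += m1[r * TX_N + k] * z[k * TX_N + c];
      y[r * TX_N + c] = sum;
    }
  }
  for (r = 0; r < TX_N; ++r) {
    for (c = 0; c < TX_N; ++c) {          /* X = Y*Transposed_M2 */
      int sum = 0;
      for (k = 0; k < TX_N; ++k) sum += y[r * TX_N + k] * m2[c * TX_N + k];
      x[r * TX_N + c] = sum;
    }
  }
}
```

Both functions give the same X for integer inputs; the real transform
also rounds between passes, which this sketch leaves out.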