Jingning Han [Tue, 12 Nov 2013 20:47:32 +0000 (12:47 -0800)]
Enable 4x4 DCT/ADST transform unit test
This commit enables the unit tests for 4x4 DCT and ADST transforms.
It covers tests of round-trip error check, coefficient match check,
coefficient overflow check, and inverse accuracy check.
Jingning Han [Wed, 13 Nov 2013 22:48:38 +0000 (14:48 -0800)]
Fix an overflow issue in SSE2 forward ADST
The step that sums three input samples could cause the intermediate
result to exceed the 16-bit limit when operating as the second 1-D
transform. This commit fixes the issue.
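A minimal scalar sketch of the failure mode, not the SSE2 code itself: three 16-bit samples summed in a 16-bit lane can wrap around, while widening to 32 bits keeps the exact value. The function names are hypothetical and only illustrate the arithmetic.

```c
#include <stdint.h>

/* Hypothetical helper: sums three samples the way a non-saturating
 * 16-bit lane would, i.e. modulo 2^16. */
static int16_t sum3_wrap16(int16_t a, int16_t b, int16_t c) {
  const uint16_t s = (uint16_t)((uint16_t)a + (uint16_t)b + (uint16_t)c);
  /* Map the unsigned bit pattern back to a signed value explicitly. */
  return (s >= 0x8000) ? (int16_t)((int32_t)s - 65536) : (int16_t)s;
}

/* Hypothetical helper: the same sum carried out in 32 bits, which
 * cannot overflow for any three int16_t inputs. */
static int32_t sum3_wide32(int16_t a, int16_t b, int16_t c) {
  return (int32_t)a + (int32_t)b + (int32_t)c;
}
```

With inputs of 20000 each, the 32-bit sum is 60000, which does not fit in a signed 16-bit value, so the 16-bit version wraps to a negative number.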
Jingning Han [Thu, 7 Nov 2013 22:56:58 +0000 (14:56 -0800)]
Dual buffer encoding for intra modes
Overall change (using dual buffer scheme for superblocks of both inter
and intra modes) reduces speed 2 runtime:
bluesky_1080p at 6000kbps: 263553ms -> 257441ms
riverbed_1080p at 8000kbps: 233230ms -> 225308ms.
Johann [Tue, 12 Nov 2013 20:26:45 +0000 (12:26 -0800)]
Split macro strings on whitespace
Match any whitespace instead of individual spaces. The macro
definitions in vp9/common/arm/neon/vp9_short_idct32x32_1_add_neon.asm
triggered this and treated spaces as arguments leading to lines like:
$8vld1$8.$88$8 {$8q8$8}, [$$89$8], $$8stride$8
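The difference between the two splitting rules can be sketched in C (the actual script is not written in C; both function names below are hypothetical). Splitting on every individual space treats a run of spaces as several separators and so produces empty fields, while splitting on runs of any whitespace does not.

```c
#include <ctype.h>

/* Hypothetical: counts fields when every single ' ' is a separator.
 * A run of k spaces yields k - 1 empty fields. */
static int count_fields_single_space(const char *s) {
  int n = 1;
  for (; *s; ++s) {
    if (*s == ' ') ++n;
  }
  return n;
}

/* Hypothetical: counts fields when any run of whitespace (spaces,
 * tabs, ...) is one separator. No empty fields are produced. */
static int count_fields_whitespace(const char *s) {
  int n = 0, in_field = 0;
  for (; *s; ++s) {
    if (isspace((unsigned char)*s)) {
      in_field = 0;
    } else if (!in_field) {
      in_field = 1;
      ++n;
    }
  }
  return n;
}
```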
Joshua Litt [Mon, 11 Nov 2013 20:31:42 +0000 (12:31 -0800)]
Android.mk file for vpx unittests
These changes support automated regression testing of vpx on Android.
new file: test/android/Android.mk
new file: test/android/README
new file: test/android/get_files.py
Jingning Han [Wed, 6 Nov 2013 05:07:08 +0000 (21:07 -0800)]
Enable dual buffer rd search and encoding scheme
This commit enables the dual buffer rate-distortion optimization
and encoding scheme. It stores the original transform coefficients,
quantized levels, and reconstructed coefficients during the
rate-distortion optimization search, which eliminates the need
to re-run residual generation, forward transform, and quantization
in the encoding stage.
Jingning Han [Fri, 1 Nov 2013 19:53:37 +0000 (12:53 -0700)]
Allocate dual buffer sets for encoding
Allocate memory space of dual buffer sets that store the coeff, qcoeff,
dqcoeff, and eobs. Connect the pointers of macroblock_plane and
macroblockd_plane to the actual buffer in use accordingly.
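A minimal sketch of the idea, assuming two heap-allocated buffer sets and a per-plane view whose pointers are redirected to whichever set is in use. All type names, function names, and the buffer size are hypothetical; the real macroblock_plane and macroblockd_plane structures hold more state than this.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_NUM_COEFFS 256 /* hypothetical size, for illustration only */

/* One set of buffers: coeff, qcoeff, dqcoeff and eobs. */
typedef struct {
  int16_t *coeff, *qcoeff, *dqcoeff;
  uint16_t *eobs;
} buffer_set_sketch;

/* Per-plane view: owns nothing, only points into the active set. */
typedef struct {
  int16_t *coeff, *qcoeff, *dqcoeff;
  uint16_t *eobs;
} plane_view_sketch;

/* Allocates one zero-initialized set; returns 1 on success, 0 on failure. */
static int alloc_set(buffer_set_sketch *s) {
  s->coeff = calloc(SKETCH_NUM_COEFFS, sizeof(*s->coeff));
  s->qcoeff = calloc(SKETCH_NUM_COEFFS, sizeof(*s->qcoeff));
  s->dqcoeff = calloc(SKETCH_NUM_COEFFS, sizeof(*s->dqcoeff));
  s->eobs = calloc(SKETCH_NUM_COEFFS, sizeof(*s->eobs));
  return s->coeff && s->qcoeff && s->dqcoeff && s->eobs;
}

static void free_set(buffer_set_sketch *s) {
  free(s->coeff);
  free(s->qcoeff);
  free(s->dqcoeff);
  free(s->eobs);
}

/* Connects the plane's pointers to buffer set `idx` (0 or 1). */
static void select_set(plane_view_sketch *p, const buffer_set_sketch *sets,
                       int idx) {
  p->coeff = sets[idx].coeff;
  p->qcoeff = sets[idx].qcoeff;
  p->dqcoeff = sets[idx].dqcoeff;
  p->eobs = sets[idx].eobs;
}
```

Switching the active set is then a handful of pointer assignments rather than a copy of the coefficient data.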
Jingning Han [Thu, 31 Oct 2013 19:21:49 +0000 (12:21 -0700)]
Decouple macroblockd_plane buffer usage
Make the macroblockd_plane contain dynamic buffer pointers instead
of static pointers to the memory space allocated therein. The decoder
uses the buffer allocated in pbi, while the encoder will use a dual
buffer approach for rate-distortion optimization search.
Dmitry Kovalev [Mon, 11 Nov 2013 23:18:48 +0000 (15:18 -0800)]
Replacing raster_block with block in the encoder.
We only used "ib" to call the get_scan() function, which in turn calls
the get_tx_type_4x4() function. The latter only needs the block index if
bsize < BLOCK_8X8 -- under that condition raster_block == block.
Yaowu Xu [Fri, 8 Nov 2013 21:04:08 +0000 (13:04 -0800)]
[BITSTREAM]Fix row tile mode_info pointer setup
This commit fixes the assignment of the mode_info pointer per tile. It
recognizes tiles in both row and column formats and properly
arranges the use of mode_info.
The bug was first introduced in
I6226456dd11f275fa991e4a7a930549da6675915
https://gerrit.chromium.org/gerrit/#/c/67492/
Dmitry Kovalev [Fri, 8 Nov 2013 20:44:56 +0000 (12:44 -0800)]
Optimizing set_contexts() function.
Inlining set_contexts_on_border() into set_contexts(). The only difference
is the additional check that "has_eob != 0" in addition to
"xd->mb_to_right_edge < 0" and "xd->mb_to_bottom_edge < 0". If has_eob == 0
then memset does the right thing and works faster.
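A simplified sketch of why the has_eob check helps, assuming a one-dimensional context array in which entries past the frame edge must be zero. The function name and parameters are hypothetical and do not mirror the real set_contexts() signature.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical: fills `len` context entries with the has_eob flag.
 * `visible` is how many of those entries lie inside the frame; entries
 * beyond it are always cleared. */
static void set_ctx_sketch(uint8_t *ctx, int len, int visible, int has_eob) {
  if (visible < 0) visible = 0;
  if (has_eob && visible < len) {
    /* Border case: flag the visible part, zero the rest. */
    memset(ctx, 1, (size_t)visible);
    memset(ctx + visible, 0, (size_t)(len - visible));
  } else {
    /* Either the block is fully inside the frame, or has_eob == 0 and
     * every entry is zero anyway: one memset covers both cases. */
    memset(ctx, has_eob ? 1 : 0, (size_t)len);
  }
}
```

When has_eob is 0 the border split would write zeros on both sides of the edge, so skipping it changes nothing in the output and saves the extra work.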
Yunqing Wang [Fri, 1 Nov 2013 21:08:54 +0000 (14:08 -0700)]
Improve loopfilter function
This patch continues the work done in "Rewrite loop_filter_info_n
struct"(commit:00dbd369c70270428d56da6d15ea5486fc821c52) to further
improve the loopfilter function.
1. Instead of storing pointers to thresholds, store loopfilter
levels within 64x64 SB;
2. Since loopfilter levels are already calculated in setup_mask,
we don't need to call build_lfi to look them up again. Just save
the loopfilter levels in setup_mask.
3. Reorganized and simplified filter_block_plane().
Dmitry Kovalev [Thu, 7 Nov 2013 02:51:01 +0000 (18:51 -0800)]
Replacing (raster_block >> tx_size) with (block >> (tx_size << 1)).
The new expression is much more logical than the previous one. Surprisingly
both expressions give exactly the same set of dependent values
-- have_top, have_left, have_right -- in vp9_predict_intra_block.
Yunqing Wang [Wed, 6 Nov 2013 19:06:21 +0000 (11:06 -0800)]
Remove TEXTREL from 32bit encoder
This patch fixed the issue reported in "Issue 655: remove textrel's
from 32-bit vp9 encoder". The set of vp9_subpel_variance functions
that used the x86inc.asm ABI didn't build correctly for 32bit PIC. The
fix was done carefully because there were not enough registers
available.
After the change, we got
$ eu-findtextrel libvpx.so
eu-findtextrel: no text relocations reported in 'libvpx.so'
Dmitry Kovalev [Thu, 7 Nov 2013 02:15:33 +0000 (18:15 -0800)]
Unifying tile decoding for both direct and inverse tile order.
Now tile decoding consists of two stages:
1. Find tile buffer start and its size, put this info into tile_buffers.
2. Decode each tile based on information from tile_buffers.
It seems that stage 1 can also be reused by multithreaded tile decoder.
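The two stages can be sketched as follows. This is an illustration under stated assumptions, not the decoder's code: it assumes every tile except the last is preceded by a 4-byte big-endian size, and all type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
  const uint8_t *data; /* start of the tile payload */
  size_t size;         /* payload size in bytes */
} tile_buffer_sketch;

/* Stage 1: locate each tile's start and size. Returns 0 on success,
 * -1 if the stream is truncated. */
static int find_tile_buffers(const uint8_t *data, size_t len, int num_tiles,
                             tile_buffer_sketch *out) {
  const uint8_t *const end = data + len;
  for (int i = 0; i < num_tiles; ++i) {
    size_t size;
    if (i == num_tiles - 1) {
      size = (size_t)(end - data); /* last tile takes the remainder */
    } else {
      if ((size_t)(end - data) < 4) return -1;
      size = ((uint32_t)data[0] << 24) | ((uint32_t)data[1] << 16) |
             ((uint32_t)data[2] << 8) | (uint32_t)data[3];
      data += 4;
      if (size > (size_t)(end - data)) return -1;
    }
    out[i].data = data;
    out[i].size = size;
    data += size;
  }
  return 0;
}

typedef void (*tile_fn)(const tile_buffer_sketch *tile, int index, void *ctx);

/* Stage 2: visit tiles in direct or inverse order using stage 1 output. */
static void decode_tiles(const tile_buffer_sketch *tiles, int num_tiles,
                         int inverse, tile_fn fn, void *ctx) {
  for (int i = 0; i < num_tiles; ++i) {
    const int idx = inverse ? num_tiles - 1 - i : i;
    fn(&tiles[idx], idx, ctx);
  }
}

/* Example callback: records the order in which tiles were visited. */
typedef struct {
  int order[8];
  int count;
} visit_log_sketch;

static void log_visit(const tile_buffer_sketch *tile, int index, void *ctx) {
  visit_log_sketch *log = ctx;
  (void)tile;
  if (log->count < 8) log->order[log->count++] = index;
}
```

Because stage 2 only reads the table built by stage 1, the visiting order becomes a loop detail, and the same table could be handed out to worker threads.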
Dmitry Kovalev [Thu, 7 Nov 2013 00:14:45 +0000 (16:14 -0800)]
Using pd->dqcoeff instead of pd->qcoeff in the decoder.
It is more logical to store *dequantized* transform coefficients in the
dqcoeff buffer (inside the inverse_transform_block and decode_coefs
functions). Dequantization happens inside the WRITE_COEF_CONTINUE
macro.
The qcoeff buffer should only be used in the encoder for *quantized*
transform coefficients.
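To make the distinction between the two buffers concrete, here is a minimal sketch that assumes dequantization is a plain multiplication by a step size, with one step for the DC coefficient and another for the AC coefficients. The function name and parameters are hypothetical; the real macro dequantizes each coefficient as it is read from the bitstream.

```c
#include <stdint.h>

/* Hypothetical: converts quantized levels into dequantized coefficients.
 * qcoeff holds quantized levels; dqcoeff receives level * step. */
static void dequantize_sketch(const int16_t *qcoeff, int16_t *dqcoeff, int n,
                              int16_t dc_step, int16_t ac_step) {
  for (int i = 0; i < n; ++i) {
    const int step = (i == 0) ? dc_step : ac_step;
    dqcoeff[i] = (int16_t)(qcoeff[i] * step);
  }
}
```

The inverse transform consumes the dqcoeff values, which is why a decoder has no need for a separate qcoeff buffer.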
Ivan Maltz [Wed, 23 Oct 2013 18:53:37 +0000 (11:53 -0700)]
Move SVC per-frame loop from sample app into libvpx proper
SVC multiple layer per frame encoding is invoked with vpx_svc_init and
vpx_svc_encode. These interfaces are designed to be invoked from ffmpeg.
Additional improvements:
- make dummy frame handling a bit more explicit
- fixed bug with single layer encodes
- track individual frame sizes and psnrs instead of averages
- parameterized quantizer, 16th scalefactors, more logging
- enabled single layer encodes to generate baseline
- include new mode for 3 layer I frame with 5 total layers
Tom Finegan [Wed, 6 Nov 2013 18:02:31 +0000 (10:02 -0800)]
webmenc: Clean up the truly egregious style issues.
I'm sure I could do more, but I don't know how long this code has to
live. I think this at least makes the code a little easier to read and
understand.