James Zern [Fri, 7 Mar 2014 23:03:50 +0000 (15:03 -0800)]
Merge "Disable the neon version vpx_yv12_copy_y. For some dimensions, neon code ends up in a dead loop inside. This will fix the unit test failure in svc_test on ARM."
Jingning Han [Fri, 7 Mar 2014 22:06:20 +0000 (14:06 -0800)]
Skip mode check when mv has been tested
This commit allows the non-RD mode decision to skip the mode RD modelling
check if the motion vector associated with the current mode is the
same as that of NEARESTMV mode. This makes speed -7 about 2% faster.
The previous change, which converted the cost metric from SAD to a
model-based RD value, made the codec 6% slower at speed -7.
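The skip condition described above can be sketched as follows. This is a minimal illustration, not the libvpx code: the MotionVector struct and the can_skip_mode_check() helper are hypothetical names introduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a motion vector; not the libvpx type. */
typedef struct {
  int16_t row;
  int16_t col;
} MotionVector;

/* Returns true when the candidate mode's motion vector equals the one
 * already evaluated for NEARESTMV, so the RD model check for the
 * candidate would only repeat work and can be skipped. */
static bool can_skip_mode_check(MotionVector candidate,
                                MotionVector nearest) {
  return candidate.row == nearest.row && candidate.col == nearest.col;
}
```

A mode loop would call this before running the model for any mode other than NEARESTMV and continue to the next mode when it returns true.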
Minghai Shang [Fri, 7 Mar 2014 22:02:35 +0000 (14:02 -0800)]
Merge "[svc] 1. Add two pass RC options in vp9_spatial_scalable_encoder. 2. Add read/write for RC stats file The two pass RC for svc does not work yet. This is just the first step. We need further development to make it working. Change-Id: I8ef0e177dff0b5ed3c97a916beea5123717cc6f2"
hkuang [Fri, 7 Mar 2014 18:21:25 +0000 (10:21 -0800)]
Disable the neon version vpx_yv12_copy_y.
For some dimensions, neon code ends up in a dead loop inside.
This will fix the unit test failure in svc_test on ARM.
Jingning Han [Fri, 7 Mar 2014 02:56:50 +0000 (18:56 -0800)]
Use modeled rate distortion costs for non-RD mode
This commit replaces SAD cost with modeled rate-distortion cost
for non-RD mode decision. It translates the prediction residual
SSE into estimated rate and reconstruction distortion costs, hence
capturing the quantization setting effect. The compression
performance of speed -7 for rtc set is improved by 14.79%.
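To illustrate how an SSE value can be turned into rate and distortion estimates that depend on the quantizer, here is a rough sketch. It uses the textbook high-rate approximation (quantization noise of q*q/12 per pixel, rate of 0.5*log2(variance/noise) bits per pixel), which is an assumption made for illustration and not the model this commit adds to libvpx. All names (ModelRd, model_rd_from_sse, rd_cost, log2_piecewise) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical result type: estimated bits and estimated distortion. */
typedef struct {
  double rate_bits;
  double distortion;
} ModelRd;

/* Piecewise-linear approximation of log2(x) for x >= 1.
 * Exact at powers of two; avoids depending on libm. */
static double log2_piecewise(double x) {
  double result = 0.0;
  while (x >= 2.0) {
    x *= 0.5;
    result += 1.0;
  }
  return result + (x - 1.0);
}

/* Maps residual SSE to rate/distortion estimates for quantizer step q_step. */
static ModelRd model_rd_from_sse(uint64_t sse, int num_pixels, double q_step) {
  ModelRd out;
  const double variance = (double)sse / num_pixels;
  const double quant_noise = q_step * q_step / 12.0;
  if (variance <= quant_noise) {
    /* Residual is below the quantization noise: assume everything
     * quantizes to zero, so no bits are spent and the SSE remains. */
    out.rate_bits = 0.0;
    out.distortion = (double)sse;
  } else {
    out.rate_bits = num_pixels * 0.5 * log2_piecewise(variance / quant_noise);
    out.distortion = num_pixels * quant_noise;
  }
  return out;
}

/* Lagrangian cost: distortion plus lambda times rate. */
static double rd_cost(ModelRd m, double lambda) {
  return m.distortion + lambda * m.rate_bits;
}
```

Unlike a raw SAD comparison, the cost here changes with q_step, which is the "quantization setting effect" the message refers to.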
Tom Finegan [Thu, 6 Mar 2014 22:54:49 +0000 (14:54 -0800)]
Avoid unknown warning warnings and fix -Werror on macosx.
clang on macosx does not support -Wunused-but-set-variable; adding the flag
causes additional warnings about the flag. As a more generalized fix, use
-Werror when checking compiler flag support in order to avoid using
unsupported warning flags.
Dmitry Kovalev [Tue, 4 Mar 2014 01:48:06 +0000 (17:48 -0800)]
Merging force-keyframe example into simple_encoder.
The only difference between the two examples was the use of the
VPX_EFLAG_FORCE_KF flag for frame encoding. This moves that functionality
into simple_encoder behind an additional command line option.
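A sketch of how per-frame flags might be chosen, assuming the new option is a keyframe interval (the message does not say what the option is). EFLAG_FORCE_KF is a local stand-in for VPX_EFLAG_FORCE_KF from vpx/vpx_encoder.h so the example compiles without libvpx; frame_flags() is a hypothetical helper.

```c
/* Local stand-in for VPX_EFLAG_FORCE_KF (see vpx/vpx_encoder.h). */
enum { EFLAG_FORCE_KF = 1 << 0 };

/* Returns the encode flags for one frame. A keyframe is forced every
 * keyframe_interval frames; an interval of 0 or less disables forcing. */
static int frame_flags(int frame_index, int keyframe_interval) {
  int flags = 0;
  if (keyframe_interval > 0 && frame_index % keyframe_interval == 0) {
    flags |= EFLAG_FORCE_KF;
  }
  return flags;
}
```

In a real encoder loop the returned value would be passed as the flags argument of the per-frame encode call.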
James Zern [Wed, 5 Mar 2014 03:46:29 +0000 (19:46 -0800)]
msvs: filter out include-only asm files
avoid building x86inc.asm, x86_abi_support.asm and vpx_config.asm as
they provide no symbols themselves
fixes:
warning LNK4221: This object file does not define any previously
undefined public symbols, so it will not be used by any link operation
that consumes this library
Alex Converse [Fri, 28 Feb 2014 04:07:43 +0000 (20:07 -0800)]
Prune RT mode decisions for BLOCK_32x32 and up
* Remove all non-DC intra modes for BLOCK_32x32 and up
* Remove all intra modes for blocks bigger than BLOCK_32x32
* Remove ZEROMV for BLOCK_32x32 and up
* Only consider NEARESTMV for blocks bigger than BLOCK_32x32
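The four rules above can be expressed as a single predicate. This is a sketch under assumptions: the enums are reduced, hypothetical stand-ins for the encoder's block-size and prediction-mode types, and the last rule is read as leaving NEARESTMV as the only candidate for blocks larger than 32x32.

```c
#include <stdbool.h>

/* Hypothetical, reduced enums; ordered so larger blocks compare greater. */
typedef enum { BS_16X16, BS_32X32, BS_64X64 } BlockSize;
typedef enum {
  MODE_DC_PRED,
  MODE_V_PRED,
  MODE_NEARESTMV,
  MODE_NEARMV,
  MODE_ZEROMV,
  MODE_NEWMV
} PredMode;

static bool is_intra(PredMode mode) {
  return mode == MODE_DC_PRED || mode == MODE_V_PRED;
}

/* Returns true if the mode should still be evaluated for this block size. */
static bool mode_allowed(BlockSize bs, PredMode mode) {
  if (bs > BS_32X32) {
    /* Bigger than 32x32: no intra modes, only NEARESTMV. */
    return mode == MODE_NEARESTMV;
  }
  if (bs == BS_32X32) {
    /* Exactly 32x32: DC is the only intra mode, and ZEROMV is dropped. */
    if (is_intra(mode)) return mode == MODE_DC_PRED;
    return mode != MODE_ZEROMV;
  }
  return true; /* Smaller blocks are not pruned by these rules. */
}
```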
Tom Finegan [Tue, 4 Mar 2014 02:04:35 +0000 (18:04 -0800)]
vp8_decrypt_test.c: Silence MSVC data loss warning.
- Change type of encrypt_buffer() offset argument to ptrdiff_t, and change the
type of the size argument to size_t.
- Update the size argument of encrypt_buffer() in vp8_boolcoder_test.c
to match.
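For illustration, a helper with the argument types named above might look like the sketch below. Only the ptrdiff_t offset and size_t size types come from the message; the parameter order, the key contents, and the XOR scheme are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte key used only for this sketch. */
static const uint8_t kDemoKey[16] = { 0x01, 0x12, 0x23, 0x34, 0x45, 0x56,
                                      0x67, 0x78, 0x89, 0x9a, 0xab, 0xbc,
                                      0xcd, 0xde, 0xef, 0xf0 };

/* XORs size bytes of src with the key, starting at key position offset.
 * size_t matches what sizeof and buffer lengths produce, and ptrdiff_t
 * matches pointer differences, so callers need no narrowing conversion. */
static void encrypt_buffer(const uint8_t *src, uint8_t *dst, size_t size,
                           ptrdiff_t offset) {
  size_t i;
  for (i = 0; i < size; ++i) {
    dst[i] = src[i] ^ kDemoKey[(offset + (ptrdiff_t)i) & 15];
  }
}
```

Because XOR is its own inverse, applying the function twice with the same offset restores the original bytes.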
Deb Mukherjee [Fri, 28 Feb 2014 22:29:22 +0000 (14:29 -0800)]
Refactoring motion search libs
The core motion estimation functions now consistently return SAD.
The only exception is vp9_full_pixel_diamond(), however the core diamond
and refining search routines called from vp9_full_pixel_diamond() also
return SAD. If variance of pred error + mv cost is desired it must be
calculated explicitly outside these functions. For very fast encoding,
hopefully this will eliminate some redundant computations.
Also suggests reimplementing FAST_HEX with the vp9_pattern_search
framework. It is not exactly the same as the existing FAST_HEX, but
performance is slightly better and speed is very similar. Enables
removing a lot of duplicate code.
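The division of labour described above, where the search returns plain SAD and the caller adds any further cost term explicitly, can be sketched like this. Both functions are hypothetical illustrations: block_sad() is a straightforward reference SAD, and add_mv_cost() uses a made-up linear motion vector cost rather than the encoder's cost tables.

```c
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between a source and a reference block. */
static unsigned int block_sad(const uint8_t *src, int src_stride,
                              const uint8_t *ref, int ref_stride,
                              int width, int height) {
  unsigned int sad = 0;
  int r, c;
  for (r = 0; r < height; ++r) {
    for (c = 0; c < width; ++c) {
      sad += (unsigned int)abs(src[r * src_stride + c] -
                               ref[r * ref_stride + c]);
    }
  }
  return sad;
}

/* Caller-side step: adds a hypothetical linear motion vector cost to an
 * error value returned by the search. */
static unsigned int add_mv_cost(unsigned int error, int mv_row, int mv_col,
                                int cost_per_unit) {
  return error + (unsigned int)(cost_per_unit * (abs(mv_row) + abs(mv_col)));
}
```

A caller that wants variance plus motion vector cost instead would compute the variance at the returned position and then apply the same kind of explicit addition.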
James Zern [Mon, 24 Feb 2014 00:33:14 +0000 (16:33 -0800)]
build: convert rtcd.sh to perl
significantly speeds up file generation.
the goal of this change is to convert rtcd.sh to perl as directly as
possible to allow for simple comparison. future changes can make it more
perl-like.
---
Linux
[CREATE] vpx_scale_rtcd.h
real 0m0.485s -> 0m0.022s
[CREATE] vp8_rtcd.h
real 0m4.619s -> 0m0.060s
[CREATE] vp9_rtcd.h
real 0m10.102s -> 0m0.087s
Windows
[CREATE] vpx_scale_rtcd.h
real 0m8.360s -> 0m0.080s
[CREATE] vp8_rtcd.h
real 1m8.083s -> 0m0.160s
[CREATE] vp9_rtcd.h
real 2m6.489s -> 0m0.233s
Andrew Russell [Mon, 3 Mar 2014 15:38:02 +0000 (07:38 -0800)]
improved speed of 4x4 sse2 fdct.
* speed improvement of 30 percent achieved
* multiplies and adds remain the same
* non-arithmetic instructions minimized by hand, by:
-expanding 2 pass loop
-removing irrelevant "shuffles"
-combining last two rounding steps
* further improvements may be possible