granicus.if.org Git - libvpx/log


Johann [Sat, 10 Dec 2016 00:51:01 +0000 (16:51 -0800)]

Use Buffer class for post proc tests

Add Buffer features for:
Setting the buffer to the output of an ACMRandom function.
Copying a buffer.
Comparing two buffers.
Printing two buffers.

Change-Id: Ib53fb602451a3abdcee279ea2b65b51fbc02d3df
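The fill, copy and compare features listed above can be pictured with a small sketch. This is not the libvpx Buffer class (which the log does not show); TestBuffer, lcg_next, buffer_fill, buffer_copy and buffer_equal are hypothetical names, and the generator merely stands in for an ACMRandom function. Printing is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical plain-C stand-in for a test buffer; not the libvpx class. */
typedef struct {
  uint8_t *data; /* first visible pixel */
  int width;
  int height;
  int stride; /* bytes between the starts of consecutive rows */
} TestBuffer;

/* Example generator standing in for an ACMRandom member function. */
static uint8_t lcg_next(void *state) {
  uint32_t *s = (uint32_t *)state;
  *s = *s * 1664525u + 1013904223u;
  return (uint8_t)(*s >> 24);
}

/* Set every visible pixel to the output of gen(state). */
static void buffer_fill(TestBuffer *b, uint8_t (*gen)(void *), void *state) {
  for (int y = 0; y < b->height; ++y)
    for (int x = 0; x < b->width; ++x)
      b->data[y * b->stride + x] = gen(state);
}

/* Copy the visible area; strides may differ but dimensions must match. */
static void buffer_copy(TestBuffer *dst, const TestBuffer *src) {
  assert(dst->width == src->width && dst->height == src->height);
  for (int y = 0; y < src->height; ++y)
    memcpy(dst->data + y * dst->stride, src->data + y * src->stride,
           (size_t)src->width);
}

/* Return 1 when both visible areas hold the same pixels, 0 otherwise. */
static int buffer_equal(const TestBuffer *a, const TestBuffer *b) {
  if (a->width != b->width || a->height != b->height) return 0;
  for (int y = 0; y < a->height; ++y)
    if (memcmp(a->data + y * a->stride, b->data + y * b->stride,
               (size_t)a->width) != 0)
      return 0;
  return 1;
}
```

Comparing row by row rather than over the whole allocation keeps stride padding out of the result, which is presumably why a dedicated compare helper is useful in such tests.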


Marco Paniconi [Thu, 12 Jan 2017 17:54:41 +0000 (17:54 +0000)]

Merge "vp9: Make the denoiser work with spatial SVC."


Johann Koenig [Thu, 12 Jan 2017 01:02:58 +0000 (01:02 +0000)]

Merge "Create a class for buffers used in tests"


Peter Boström [Wed, 11 Jan 2017 21:05:47 +0000 (21:05 +0000)]

Merge "Add Y,U,V channel metrics and unweighted metrics."


Jerome Jiang [Wed, 11 Jan 2017 20:50:42 +0000 (20:50 +0000)]

Merge "vp9: Turn on the partition copy for speed 8. Tune threshold."


Johann Koenig [Wed, 11 Jan 2017 20:22:27 +0000 (20:22 +0000)]

Merge "arm idct16x16: remove extra config guards"


Peter Boström [Wed, 11 Jan 2017 17:28:03 +0000 (12:28 -0500)]

Add Y,U,V channel metrics and unweighted metrics.

Renames SSIM to VpxSSIM as an upscaled weighted SSIM metric, then prints
Y, U and V channels unweighted as well as a weighted but not scaled SSIM
score that's 8/1/1 parts Y/U/V (same as VpxSSIM).

Change-Id: Iff800cc8f145314eeb1a9b4af1e11a25bec095ca
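As a rough illustration of the 8/1/1 weighting described above, the combined score can be sketched as a weighted mean of the per-plane values. The function name weighted_ssim is hypothetical, and the upscaling that VpxSSIM applies on top is not shown because the message does not specify it.

```c
#include <assert.h>

/* Combine per-plane SSIM scores with 8/1/1 weights for Y/U/V. */
static double weighted_ssim(double y, double u, double v) {
  assert(y <= 1.0 && u <= 1.0 && v <= 1.0); /* SSIM never exceeds 1 */
  return (8.0 * y + u + v) / 10.0;
}
```

For example, planes scoring 0.9, 0.5 and 0.5 combine to 0.82, so the luma plane dominates the result.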


Jingning Han [Wed, 11 Jan 2017 19:28:39 +0000 (19:28 +0000)]

Merge "Rework forward 8x8 2D-DCT ssse3 implementation"


Jerome Jiang [Tue, 10 Jan 2017 20:43:22 +0000 (12:43 -0800)]

vp9: Turn on the partition copy for speed 8. Tune threshold.

For speed 8, it speeds up the encoding on android by 6% for QVGA and
7.4% for VGA with the new threshold. Overall PSNR is improved by 0.667
for rtc.

Change-Id: I4a644560b32c0b5b4e9f49ffb953d000413a3732


Johann [Wed, 11 Jan 2017 18:17:14 +0000 (10:17 -0800)]

arm idct16x16: remove extra config guards

This file is guarded by HAVE_NEON_ASM in the .mk file now.

Change-Id: I513a621c234aa90ad52e426c8ed494d8a7d4b74a


Johann [Mon, 24 Oct 2016 19:17:51 +0000 (12:17 -0700)]

Create a class for buffers used in tests

Demonstrate its use with the IDCT test.

Change-Id: Idf87fe048847c180f13818fd4df916ba4500134b


hui su [Wed, 11 Jan 2017 00:37:59 +0000 (16:37 -0800)]

Add "Large" label to VP9 target level tests

Also reduce the number of test frames.

Change-Id: Iea6fa93ca6b924535aef7bf8b388db4d0ec84c08


Marco [Wed, 21 Dec 2016 22:33:21 +0000 (14:33 -0800)]

vp9: Make the denoiser work with spatial SVC.

If enabled, the denoiser will only denoise the top spatial layer for now.

Added unittest for SVC with denoising.

Change-Id: Ifa373771c4ecfa208615eb163cc38f1c22c6664b


Jingning Han [Mon, 9 Jan 2017 22:00:29 +0000 (14:00 -0800)]

Rework forward 8x8 2D-DCT ssse3 implementation

This commit reworks the SSSE3 implementation of the forward 8x8
2D-DCT. It uses a cyclic rotation approach to the temporary xmm
registers. It reduces the average cycles from 158 to 154. The SSE2
version uses 169 cycles.

Change-Id: I1b79b9642aae0ed3fb3cefb5b70246e6de5d5caa


Marco [Tue, 10 Jan 2017 00:38:49 +0000 (16:38 -0800)]

vp9: 1 pass cbr: Adjustments to usage of gf_cbr_boost and aq=3 mode.

When aq=3 mode is on and the gf_cbr_boost is set: make sure golden frame
is always refreshed, and don't incorporate segment cost in qp setting
on the boosted golden frame.

Better performance on RTC set with gf_cbr_boost on;
for example, with gf_cbr_boost=50, gains range from ~0.5% to 3%.

Change-Id: Ie811f5e4d444ff3320bd6e2c1745b2c4c09a8460


Jerome Jiang [Tue, 10 Jan 2017 00:51:09 +0000 (00:51 +0000)]

Merge "vp9: Set less aggressive short_circuit_low_temp_var for HD at speed 8."


Jerome Jiang [Mon, 9 Jan 2017 23:04:13 +0000 (15:04 -0800)]

vp9: Set less aggressive short_circuit_low_temp_var for HD at speed 8.

Quality improved by 1.866 and 0.386 for two noisy clips (dark720p and
marcooffice720p), respectively.

Change-Id: Ib33a7672ae9ca53da156208f7cd13f07b5543e44


Jerome Jiang [Mon, 9 Jan 2017 23:53:41 +0000 (23:53 +0000)]

Merge "Fix compile warnings for target=armv7-android-gcc"


James Zern [Mon, 9 Jan 2017 23:52:29 +0000 (23:52 +0000)]

Merge "Refine 8-bit 16x16 idct NEON intrinsics"


Marco Paniconi [Mon, 9 Jan 2017 23:30:32 +0000 (23:30 +0000)]

Merge "vp9: 1 pass cbr: Fix to qp clamping when gf_cbr_boost_pct is used."


Marco [Mon, 9 Jan 2017 21:03:50 +0000 (13:03 -0800)]

vp9: Fix comment in speed features.

Change-Id: I65d79c06b152922d725bf559adaa508f91cd5766


Marco [Mon, 9 Jan 2017 20:46:01 +0000 (12:46 -0800)]

vp9: 1 pass cbr: Fix to qp clamping when gf_cbr_boost_pct is used.

Avoid the qp-clamping on gf/alt frame if gf_cbr_boost_pct is set.

This change only affects CBR mode when gf_cbr_boost_pct is set.

Change-Id: I0655ed4f2b047c8ed1ed33a070c17960ad776704


Johann Koenig [Mon, 9 Jan 2017 19:53:15 +0000 (19:53 +0000)]

Merge "postproc: vpx_mbpost_proc_down_neon"


Johann Koenig [Mon, 9 Jan 2017 19:49:02 +0000 (19:49 +0000)]

Merge "Add mips dspr2 partial idct tests"


Johann Koenig [Mon, 9 Jan 2017 19:47:47 +0000 (19:47 +0000)]

Merge "Fix mips dspr2 idct32x32 functions for large coefficient input"


Johann Koenig [Mon, 9 Jan 2017 19:47:00 +0000 (19:47 +0000)]

Merge "Fix mips dspr2 idct16x16 functions for large coefficient input"


Johann Koenig [Mon, 9 Jan 2017 19:46:18 +0000 (19:46 +0000)]

Merge "Fix mips dspr2 idct8x8 functions for large coefficient input"


Johann Koenig [Mon, 9 Jan 2017 19:45:53 +0000 (19:45 +0000)]

Merge "Fix mips dspr2 idct4x4 functions for large coefficient input"


Johann Koenig [Mon, 9 Jan 2017 19:39:13 +0000 (19:39 +0000)]

Merge "Add mips dspr2 vp9 intrapred tests"


Johann [Thu, 22 Dec 2016 18:04:42 +0000 (10:04 -0800)]

postproc: vpx_mbpost_proc_down_neon

This was much more amenable to optimization than the across filter.
Speedup of almost 2.5x

BUG=webm:1320

Change-Id: I49acc0f9cb2e7642303df90132cbc938acade4c4


Johann Koenig [Mon, 9 Jan 2017 18:17:26 +0000 (18:17 +0000)]

Merge "postproc: vpx_mbpost_proc_across_ip_neon"


Marco Paniconi [Mon, 9 Jan 2017 17:23:12 +0000 (17:23 +0000)]

Merge "vp9: 1 pass cbr mode: increase threshold for gf_cbr_boost_pct usage."


Kaustubh Raste [Mon, 9 Jan 2017 12:00:16 +0000 (17:30 +0530)]

Add mips dspr2 partial idct tests

Change-Id: Idf4003ea6f9a2a42a9f26e156bee73697acb7a37


Kaustubh Raste [Mon, 9 Jan 2017 11:51:09 +0000 (17:21 +0530)]

Fix mips dspr2 idct32x32 functions for large coefficient input

Change-Id: If9da7099f226a27a09cc9e2899eb66a1158909d2


Kaustubh Raste [Mon, 9 Jan 2017 11:05:28 +0000 (16:35 +0530)]

Fix mips dspr2 idct16x16 functions for large coefficient input

Change-Id: I9be3d3d040837f658c6314606e28db8c31092a1a


Kaustubh Raste [Mon, 9 Jan 2017 10:52:19 +0000 (16:22 +0530)]

Fix mips dspr2 idct8x8 functions for large coefficient input

Change-Id: If011dd923bbe976589735d5aa1c3167dda1a3b61


Kaustubh Raste [Mon, 9 Jan 2017 09:58:30 +0000 (15:28 +0530)]

Fix mips dspr2 idct4x4 functions for large coefficient input

Change-Id: I06730eec80ca81e0b7436d26232465b79f447e89


Kaustubh Raste [Mon, 9 Jan 2017 08:41:57 +0000 (14:11 +0530)]

Add mips dspr2 vp9 intrapred tests

Change-Id: I6be8c59ee220af0597bc2d7213f2779ac2e88db9


Linfeng Zhang [Sat, 7 Jan 2017 01:52:07 +0000 (17:52 -0800)]

Refine 8-bit 16x16 idct NEON intrinsics

Speed tests show a 25% gain for vpx_idct16x16_256_add_neon(),
and the speed of vpx_idct16x16_10_add_neon() tripled.

Change-Id: If8518d9b6a3efab74031297b8d40cd83c4a49541


Hui Su [Sat, 7 Jan 2017 00:55:41 +0000 (00:55 +0000)]

Merge "Add support for VP9 level targeting"


Johann [Wed, 21 Dec 2016 22:19:25 +0000 (14:19 -0800)]

postproc: vpx_mbpost_proc_across_ip_neon

The speedup is pretty poor. I would be concerned except the SSE2 is
worse:
Existing SSE2 improvement: 22%
New neon improvement: 35%

BUG=webm:1320

Change-Id: Ied598a261134aa6cbe69f96f58589d2bae17bf62


Marco [Fri, 6 Jan 2017 23:28:21 +0000 (15:28 -0800)]

vp9: 1 pass cbr mode: increase threshold for gf_cbr_boost_pct usage.

Increase the boost threshold below which GOLDEN update will use same
rate correction factor as INTER_NORMAL.

Improves performance when gf_cbr_boost_pct is set (between 0 and 100)
in CBR mode.

Change-Id: I9f54cc18664786a100b13a416b7137ae03bd0cab


Jerome Jiang [Fri, 6 Jan 2017 22:38:39 +0000 (22:38 +0000)]

Merge "vp9: Enable more aggressive short circuit for speed 8."


Marco Paniconi [Fri, 6 Jan 2017 22:34:49 +0000 (22:34 +0000)]

Merge "vp9: Add some controls to sample encoder: vpx_temporal_svc_encoder"


Jerome Jiang [Fri, 6 Jan 2017 21:57:27 +0000 (21:57 +0000)]

Merge "vp9: Compute source sad for every superblock when partition copy is on."


Marco [Fri, 6 Jan 2017 19:28:31 +0000 (11:28 -0800)]

vp9: Add some controls to sample encoder: vpx_temporal_svc_encoder

Add the gf boost and frame_parallel controls.
Both default to off.

Change-Id: Id85fcb16a4fae97f51c09e9ebadb5cdcd510c2f5


Jerome Jiang [Fri, 6 Jan 2017 18:06:37 +0000 (10:06 -0800)]

vp9: Enable more aggressive short circuit for speed 8.

Set short_circuit_low_temp_var to 3 for speed 8 for all resolutions.
No strong visual difference on all clips.

Change-Id: Ia6d9a314291ab1c14d5421bbdd769974083aeb2a


hui su [Fri, 2 Dec 2016 18:11:33 +0000 (10:11 -0800)]

Add support for VP9 level targeting

Constraints on encoder config:
- target_bandwidth is no larger than 80% of level bitrate limit
- target_bandwidth * (1 + max_over_shoot_pct) is no larger than
  88% of level bitrate limit
- min_gf_interval is no smaller than level limit
- tile_columns is no larger than level limit

Constraints on rate control:
- current frame size plus previous three frames' size is no larger
  than the CPB level limit
- current frame size is no larger than 50%/40%/20% of the CPB
  level limit if it's a key/alt-ref/other frame.

Change-Id: I84d1a2d6d6e3c82bfd533b3309ce999cfaba2c8b
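The constraints listed above can be sketched as two checks. This is only an illustration under stated assumptions: LevelLimits, FrameKind, config_within_level and frame_within_cpb are hypothetical names, the overshoot is taken as a fraction (0.2 for 20%) because the message writes 1 + max_over_shoot_pct, and the real encoder may enforce these limits by adjusting values rather than rejecting them.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { FRAME_KEY, FRAME_ALT_REF, FRAME_OTHER } FrameKind;

/* Hypothetical per-level limits; units only need to be consistent. */
typedef struct {
  int64_t max_bitrate;  /* level bitrate limit */
  int64_t max_cpb_size; /* CPB level limit */
  int min_gf_interval;  /* smallest allowed golden frame interval */
  int max_tile_columns; /* largest allowed tile column count */
} LevelLimits;

/* Return 1 when the encoder config satisfies the four config constraints. */
static int config_within_level(const LevelLimits *lim, int64_t target_bandwidth,
                               double max_over_shoot, int min_gf_interval,
                               int tile_columns) {
  assert(max_over_shoot >= 0.0); /* assumed to be a fraction, not a percent */
  if (target_bandwidth * 100 > lim->max_bitrate * 80) return 0;
  if ((double)target_bandwidth * (1.0 + max_over_shoot) >
      0.88 * (double)lim->max_bitrate)
    return 0;
  if (min_gf_interval < lim->min_gf_interval) return 0;
  if (tile_columns > lim->max_tile_columns) return 0;
  return 1;
}

/* Return 1 when the current frame satisfies both rate control constraints. */
static int frame_within_cpb(const LevelLimits *lim, FrameKind kind,
                            int64_t cur_size, const int64_t prev_sizes[3]) {
  const int pct = kind == FRAME_KEY ? 50 : kind == FRAME_ALT_REF ? 40 : 20;
  const int64_t window =
      cur_size + prev_sizes[0] + prev_sizes[1] + prev_sizes[2];
  if (window > lim->max_cpb_size) return 0;
  return cur_size * 100 <= lim->max_cpb_size * pct;
}
```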


Jerome Jiang [Thu, 5 Jan 2017 00:19:42 +0000 (16:19 -0800)]

vp9: Compute source sad for every superblock when partition copy is on.

The source sad could be used to copy the partition without going into
the choose_partitioning function, to speed up vp9 encoding. Computing
source sad takes little time: speed tests on Android and Linux show only
a small increase in encoding time (less than 1.4%).

Turned off for now since partition copy is turned off.

Change-Id: I61c9d5b8f22329760cb29a4ee30a7f9c232ce8d3


Linfeng Zhang [Fri, 6 Jan 2017 16:47:22 +0000 (16:47 +0000)]

Merge "Add high bitdepth 8x8 idct NEON intrinsics"


Linfeng Zhang [Fri, 6 Jan 2017 01:16:18 +0000 (01:16 +0000)]

Merge "Clean DC only idct NEON intrinsics"


Jerome Jiang [Wed, 4 Jan 2017 19:22:51 +0000 (11:22 -0800)]

vp9: Set short circuit to level 3 for VGA for speed 8.

Set short circuit to level 3 for VGA for speed 8. Also change
threshold_32x32 to 5/8*thresholds[1] to reduce the quality regression
this causes on VGA clips.

Change-Id: Ia1590e91e7cb22be78d5b85013387bb1be4272e3


Marco Paniconi [Wed, 4 Jan 2017 17:24:08 +0000 (17:24 +0000)]

Merge "vp9: 1 pass cbr: allow noise estimation down to 360p."


Marco [Wed, 4 Jan 2017 00:01:05 +0000 (16:01 -0800)]

vp9: 1 pass cbr: allow noise estimation down to 360p.

Also adjust some thresholds for noise level setting.

Change-Id: I7e03d7057ef2061c9447728deb9c6aff5d3da4b7


Marco [Wed, 21 Dec 2016 20:53:51 +0000 (12:53 -0800)]

vp9: SVC unittests: fix to use y4m source.

Comment out check on buffer underrun, as it currently fails
on some of the svc tests.

Also cast the update of bits_in_buffer_model_, as this can
go negative now due to the buffer underrun.
This fixes the issue in #1352.

BUG=webm:1350
BUG=webm:1352

Change-Id: Ibd4ef23921daf09e5c15b000aca904aa4573599c


Yunqing Wang [Tue, 3 Jan 2017 17:46:15 +0000 (17:46 +0000)]

Merge "Fix for out of range motion vector bug in joint motion search"


Ranjit Kumar Tulabandu [Wed, 21 Dec 2016 09:42:17 +0000 (15:12 +0530)]

Fix for out of range motion vector bug in joint motion search

Clamped the initial mv in vp9_refining_search_8p_c.

BUG=webm:1354

Change-Id: I47d302b350937e3e6e52e95c983b5fb0b4c64fba


Yunqing Wang [Thu, 29 Dec 2016 19:16:00 +0000 (19:16 +0000)]

Merge "Make sub-pixel mv search's return value consistent with the return type"


Yunqing Wang [Thu, 29 Dec 2016 17:24:24 +0000 (17:24 +0000)]

Merge "Bug fix to avoid random crashes during ARNR filtering"


Gabriel Marin [Thu, 29 Dec 2016 06:03:43 +0000 (06:03 +0000)]

Merge "Remove superfluous conditional on 'shortcut'"


Linfeng Zhang [Wed, 28 Dec 2016 21:51:44 +0000 (13:51 -0800)]

Clean DC only idct NEON intrinsics

BUG=webm:1301

Change-Id: Iffc83854218460b3f687f3774e71d45b552382a5


Linfeng Zhang [Wed, 28 Dec 2016 00:28:53 +0000 (16:28 -0800)]

Add high bitdepth 8x8 idct NEON intrinsics

BUG=webm:1301

Change-Id: I56e3bc3aab9214e2debac93796389a7194991084


Yunqing Wang [Tue, 27 Dec 2016 19:52:39 +0000 (11:52 -0800)]

Make sub-pixel mv search's return value consistent with the return type

For out-of-range cases, return UINT_MAX instead of INT_MAX from the
sub-pixel mv search to be consistent with the "uint32_t" return type.

Change-Id: I8e206d771228c13d89bafbbe9f14722c8ecc6a7a
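A minimal sketch of why the sentinel should match the return type: a caller that tests an unsigned result against UINT_MAX will not recognise INT_MAX as "invalid". The names search_error and is_invalid are hypothetical, and the sketch assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical search result: an error score, or UINT_MAX when the
 * motion vector is out of range. */
static uint32_t search_error(int mv_in_range, uint32_t error) {
  assert(UINT_MAX == UINT32_MAX); /* sketch assumes 32-bit unsigned int */
  return mv_in_range ? error : UINT_MAX;
}

/* Caller-side check that matches the unsigned sentinel. */
static int is_invalid(uint32_t result) { return result == UINT_MAX; }
```

With this check, a function that returned INT_MAX for the out-of-range case would be treated as having produced a valid (if very large) error score.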


Ranjit Kumar Tulabandu [Wed, 23 Nov 2016 13:16:44 +0000 (18:46 +0530)]

Bug fix to avoid random crashes during ARNR filtering

The function 'vp9_find_best_sub_pixel_tree_pruned_more' is modified
to return INT_MAX instead of UINT32_MAX for invalid MV cases.

yunqingwang:
patch 3: rebased on top of the tree.
patch 4: The return type of vp9_find_best_sub_pixel_tree* was changed
to uint32_t to fix ubsan warnings. Changing UINT_MAX back to INT_MAX
was not quite right. Patch 4 modified vp9_temporal_filter.c to accept
uint32_t.
(Note: Inconsistency exists in vp9_find_best_sub_pixel_tree*, which
will be fixed in a separate CL.)

Change-Id: Ib1a79dc2aa41ea6335c21669c76883cdbb7e0535


Linfeng Zhang [Tue, 27 Dec 2016 17:59:27 +0000 (17:59 +0000)]

Merge "Clean idct 8x8 neon functions"


James Zern [Fri, 23 Dec 2016 22:10:13 +0000 (14:10 -0800)]

Revert "vp9: SVC unittests: fix to use y4m source."

This reverts commit f0b491a52405abb1b3dbb6b2c74dd6a4c7a7ddb1.

This change results in unsigned integer overflows (as reported by
-fsanitize=integer) in datarate_test.cc,
for many of --gtest_filter=VP9/DatarateOnePassCbrSvc.OnePassCbrSvc*:
unsigned integer overflow: 167198 - 185560 cannot be represented in type
'unsigned long'

As the encoder didn't change, only the input (which now correctly uses
Y4mVideoSource), this revert merely masks the issue.

BUG=webm:1352

Change-Id: Iecd9a6c83b3fca67c566732a5c92d36193cc2060
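The reported overflow can be reproduced in isolation with the two values from the sanitizer message. The sketch below is illustrative only: wrapped_diff and signed_diff are hypothetical helpers, not code from datarate_test.cc, and 64-bit integers are assumed for the unsigned type.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned subtraction wraps modulo 2^64 when b > a; this is the pattern
 * that -fsanitize=integer reports. */
static uint64_t wrapped_diff(uint64_t a, uint64_t b) { return a - b; }

/* Signed difference; valid while both values fit in int64_t. */
static int64_t signed_diff(uint64_t a, uint64_t b) {
  assert(a <= (uint64_t)INT64_MAX && b <= (uint64_t)INT64_MAX);
  return (int64_t)a - (int64_t)b;
}
```

With the reported operands, signed_diff(167198, 185560) yields -18362, while the unsigned form wraps to a very large positive value.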


Jerome Jiang [Wed, 21 Dec 2016 00:49:42 +0000 (16:49 -0800)]

Fix compile warnings for target=armv7-android-gcc

Fix compile warnings about implicit type conversion for
target=armv7-android-gcc in vpxenc.c.

BUG=webm:1348

Change-Id: I9fbabd843512f2a1a09f4bb934cd091e834eed9c


Marco Paniconi [Thu, 22 Dec 2016 17:26:41 +0000 (17:26 +0000)]

Merge "vp9: SVC unittests: fix to use y4m source."


James Zern [Thu, 22 Dec 2016 13:20:55 +0000 (08:20 -0500)]

libs.mk/stress.sh,curl: set --retry to 1

provide some resilience for transient errors

Change-Id: I8db3d4eb5ef3cccc235a8c4c0052199c0ce23a27


Marco [Wed, 21 Dec 2016 20:53:51 +0000 (12:53 -0800)]

vp9: SVC unittests: fix to use y4m source.

Comment out check on buffer underrun, as it currently fails
on some of the svc tests.

BUG=webm:1350

Change-Id: I73c88b800cdcc06bd2f900f7b7e2a5fd08248065


Linfeng Zhang [Wed, 21 Dec 2016 22:24:17 +0000 (14:24 -0800)]

Clean idct 8x8 neon functions

BUG=webm:1301

Change-Id: I05f47dca1fddc155c8396e627cfccf6449677307


Marco [Fri, 16 Dec 2016 00:10:30 +0000 (16:10 -0800)]

vp9: 1 pass vbr: Skip find_predictors in pickmode when source is altref.

When source frame is altref, we only do zero-mv mode, so we can skip
the find_predictors(). No change in compression.
Small speed gain, ~1%.

Only affects 1 pass vbr with lookahead altref, for ytlive with
the macro flag USE_ALTREF_FOR_ONE_PASS on.

Change-Id: I9318c5da8521f017bf54919cd652438b3a6313d1


Marco Paniconi [Wed, 21 Dec 2016 19:38:00 +0000 (19:38 +0000)]

Merge "vp9: Fix to unittest for high noise."


Marco [Wed, 21 Dec 2016 18:19:44 +0000 (10:19 -0800)]

vp9: Fix to unittest for high noise.

Source is y4m, and fix comment.

Change-Id: I1eb84977d42dd0f9009c276b56b3fdb03949bfc2


Marco Paniconi [Wed, 21 Dec 2016 03:56:10 +0000 (03:56 +0000)]

Merge "vp9: Add datarate test for denoiser, for high noise case."


Marco [Mon, 19 Dec 2016 22:07:49 +0000 (14:07 -0800)]

vp9: Add datarate test for denoiser, for high noise case.

Also break out the denoiser tests, as the denoiser only
runs for real-time speed >=5.

Change-Id: I921b785860c35e9d1ebfad0833673a98490186c2


Jerome Jiang [Tue, 20 Dec 2016 21:46:43 +0000 (21:46 +0000)]

Merge "vp9: Add feature to copy partition from the last frame."


Gabriel Marin [Wed, 14 Dec 2016 19:07:50 +0000 (11:07 -0800)]

Remove superfluous conditional on 'shortcut'

Remove superfluous test. Produces a small improvement in instruction scheduling.
Measured a 1% to 1.5% reduction in execution time for routine vp9_optimize_b
with different compilers.

No change in behavior.

TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225

Change-Id: I2bf248d4c25fc0256147d7a8766ff9108ae9cba3


Kaustubh Raste [Tue, 20 Dec 2016 02:27:07 +0000 (02:27 +0000)]

Merge "Add mips msa vp9 intrapred tests"


Jerome Jiang [Mon, 19 Dec 2016 18:39:04 +0000 (10:39 -0800)]

vp9: Add feature to copy partition from the last frame.

Add feature to copy partition from the last frame.
The copy is done only when the SAD is below a threshold.
The feature is currently disabled until the threshold is tuned.
It will initially be used for Speed 8 (ARM).

Under extreme case of always copying partition for speed 8:
Encode time is reduced by 5.4% on rtc_derf and 7.8% on rtc.
Overall PSNR reduced by 2.1 on rtc_derf and 0.968 on rtc.

Change-Id: I1bcab515af3088e4d60675758f72613c2d3dc7a5
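The SAD test that gates the copy can be sketched as follows. This is an assumption-laden illustration: block_sad and should_copy_partition are hypothetical names, the block size and threshold are left to the caller, and the actual encoder logic has further conditions that the message does not spell out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between two w x h blocks. */
static unsigned int block_sad(const uint8_t *src, int src_stride,
                              const uint8_t *ref, int ref_stride, int w,
                              int h) {
  unsigned int sad = 0;
  assert(w > 0 && h > 0);
  for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x)
      sad += (unsigned int)abs(src[y * src_stride + x] -
                               ref[y * ref_stride + x]);
  return sad;
}

/* Reuse the previous frame's partition only when the SAD is below the
 * (tunable) threshold. */
static int should_copy_partition(unsigned int sad, unsigned int threshold) {
  return sad < threshold;
}
```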


Gabriel Marin [Mon, 19 Dec 2016 23:25:38 +0000 (23:25 +0000)]

Merge "Simplify address arithmetic in vp9_optimize_b"


James Zern [Mon, 19 Dec 2016 22:39:01 +0000 (22:39 +0000)]

Merge "vpx_idct32x32_1024_add_neon: quiet uninitialized warning"


Marco Paniconi [Mon, 19 Dec 2016 21:15:36 +0000 (21:15 +0000)]

Merge "vp9 denoiser: Fix the logic for re-evaluating zeromv after denoising."


Gabriel Marin [Wed, 14 Dec 2016 00:22:48 +0000 (16:22 -0800)]

Simplify address arithmetic in vp9_optimize_b

Simplify address arithmetic on token_costs to reduce the number of generated
instructions that are used for address arithmetic inside routine
vp9_optimize_b. It also helps improve instruction scheduling depending on
compiler and optimization level.

Measured a 9.3% reduction in retired instructions and 5.3% reduction in
execution time for this routine with GCC v4.8.4 and optimization flags -O3,
and a reduction of up to 11.6% in execution time with other compilers.

No change in behavior.

TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225

Change-Id: I6098650fb5cd2aa04e014fe6e68ca20761f3a21f


James Zern [Mon, 19 Dec 2016 18:51:59 +0000 (10:51 -0800)]

vpx_idct32x32_1024_add_neon: quiet uninitialized warning

relocate the assignment to 'in' outside of the for loop. this quiets a
spurious warning in visual studio builds since:
86e340c enable vpx_idct32x32_1024_add_neon in hbd builds

+ give the variable a more descriptive name

BUG=webm:1294

Change-Id: I5c3da5c7939621477e0fc0ad3a1b2a3045c5bffd


Marco [Sat, 17 Dec 2016 00:01:59 +0000 (16:01 -0800)]

vp9: With denoising on, only estimate noise level for higher resolutions.

Allow it for resolutions above 640x360 for now.

Change-Id: I087d0d8173f96b316164fdd4a499110ce2e7a233


Marco [Mon, 19 Dec 2016 17:22:44 +0000 (09:22 -0800)]

vp9 denoiser: Fix the logic for re-evaluating zeromv after denoising.

Correctly set interp_filter to SWITCHABLE for INTRA mode.
Also reduce threshold on noise level for re-evaluating zeromv.

Change-Id: Id32c01e193209fb380aa07204f0be3babf29f70a


Linfeng Zhang [Mon, 19 Dec 2016 17:09:26 +0000 (17:09 +0000)]

Merge "Clean hbd idct 4x4 neon functions and other"


Kaustubh Raste [Mon, 19 Dec 2016 11:56:17 +0000 (17:26 +0530)]

Add mips msa vp9 intrapred tests

Change-Id: I49b91464a87cad8692f4b1477e45e5f567b4fe87


Johann Koenig [Sat, 17 Dec 2016 01:12:34 +0000 (01:12 +0000)]

Merge "post proc test: add padding for sse2 tests"


Marco Paniconi [Fri, 16 Dec 2016 23:53:32 +0000 (23:53 +0000)]

Merge "vp9: Change condition to enable recheck_zeromv_after_denoising."


Marco [Fri, 16 Dec 2016 19:15:57 +0000 (11:15 -0800)]

vp9: Change condition to enable recheck_zeromv_after_denoising.

When denoising is enabled, change the condition so that
recheck_zeromv_after_denoising is enabled only for very high noise
levels. The recheck is causing an issue, so restricting it to very high
noise effectively shuts it off.

Change-Id: Ic40d6025f3f398338cedd270d17c0ccd9a3daa84


Johann [Fri, 16 Dec 2016 22:03:53 +0000 (14:03 -0800)]

post proc test: add padding for sse2 tests

Avoid valgrind warnings for reading out of bounds when the width is not
divisible by 16.

Change-Id: I5670d7cfbbce00874b98cfb7472f99c7936c2c47
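The padding amounts to rounding the row width up to the next multiple of 16 so that 16-byte reads stay inside the allocation. A minimal sketch, assuming the SIMD code reads 16 bytes at a time; padded_width is a hypothetical helper, not the test's actual code.

```c
#include <assert.h>

/* Round width up to the next multiple of 16. */
static int padded_width(int width) {
  assert(width >= 0);
  return (width + 15) & ~15;
}
```

A width of 17 is padded to 32, while a width that is already a multiple of 16 is left unchanged.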


Johann [Fri, 16 Dec 2016 20:19:00 +0000 (12:19 -0800)]

postproc test: disable new down and across test

The new test is causing valgrind failures:
[ RUN ] SSE2/VpxPostProcDownAndAcrossMbRowTest.CheckCvsAssembly/0
==28923== Invalid read of size 16
==28923==    at 0x724016: ??? (deblock_sse2.asm:146)

Disable during investigation. The test is new but the code is not.

Change-Id: I5521e5fd48a595e3798b833bf7e3cc97b81c1975


Jim Bankoski [Fri, 16 Dec 2016 16:50:55 +0000 (08:50 -0800)]

vp8: use threading mutexes for tsan only.

This avoids a decode performance hit of 2% when running on hyperthreaded
cores.

This patch only uses the mutexes when we are running tsan.

This is safe because 32-bit operations like read and store are atomic
on all the platforms we care about. Tsan warns about race situations,
but in this case, in either situation (read occurs before write, or write
before read), the worst case is that we go around the loop one extra
time. So the ordering doesn't really matter.

That said, a few other things have been tried, for instance as per here:
webrtc/base/atomicops.h#52

That code uses:
__atomic_load_n(i, __ATOMIC_ACQUIRE);
__atomic_store_n(i, value, __ATOMIC_RELEASE);

This code works on gcc and clang (replacing the protected write and read)
and avoids tsan errors, incurring no penalty in performance. In C11 it's
replaced by straight atomic operations.

However, there is no equivalent in the Visual Studio versions we support,
as int32 on all Windows platforms is already atomic. To avoid tsan-like
warnings on Windows we'd need to use interlocked exchange, and the
end result doesn't gain us anything.

Change-Id: I2066e3c7f42641ebb23d53feb1f16f23f85bcf59
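For reference, the acquire/release alternative mentioned in the message can be sketched with the GCC/Clang __atomic builtins. The wrapper names protected_read and protected_write are hypothetical, and this is the approach the message describes as tried, not what the patch ends up using.

```c
#include <assert.h>

/* Acquire load: later memory accesses in this thread cannot be
 * reordered before it. */
static int protected_read(int *p) {
  assert(p != 0);
  return __atomic_load_n(p, __ATOMIC_ACQUIRE);
}

/* Release store: earlier memory accesses in this thread cannot be
 * reordered after it. */
static void protected_write(int *p, int value) {
  assert(p != 0);
  __atomic_store_n(p, value, __ATOMIC_RELEASE);
}
```

These builtins are specific to gcc and clang, which is why the message notes the lack of a direct Visual Studio equivalent.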


Marco Paniconi [Thu, 15 Dec 2016 19:48:16 +0000 (19:48 +0000)]

Merge "vp9: Fix to usage of flag USE_ALTREF_FOR_ONE_PASS"


Johann [Tue, 13 Dec 2016 00:47:05 +0000 (16:47 -0800)]

postproc: neon down and across macroblock filter

Implement vpx_post_proc_down_and_across_mb_row in NEON.
Runs about 6-7x faster than C.

BUG=webm:1320

Change-Id: Ic5c7d3552a88cfcf999ec5bf2bd46fee460642c2


Marco [Wed, 14 Dec 2016 22:08:09 +0000 (14:08 -0800)]

vp9: Fix to usage of flag USE_ALTREF_FOR_ONE_PASS

The flag USE_ALTREF_FOR_ONE_PASS allows for alt-ref lookahead
in 1 pass vbr (from https://chromium-review.googlesource.com/#/c/365498).
This change is to make sure this macro flag only has effect if
the config flag cpi->oxcf.enable_auto_altef is also on.

No change in ytlive encoding, as USE_ALTREF_FOR_ONE_PASS is not
yet enabled.

Change-Id: I1a69681e4a15c5244581a3dab4587fca08f02e0f


Linfeng Zhang [Wed, 14 Dec 2016 18:42:01 +0000 (10:42 -0800)]

Clean hbd idct 4x4 neon functions and other

BUG=webm:1301

Change-Id: I387b7eae716a7df15c691dc6f368b07602df7342


Yaowu Xu [Wed, 14 Dec 2016 17:37:14 +0000 (09:37 -0800)]

Change order of operation to avoid ubsan warnings

This commit changes the order of operations to avoid left shifts of
negative numbers.

Change-Id: I607c7eb91658c7a5ef397fc1504721d1b10e3dd6
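The commit does not show the expression involved, so the following is only a guess at the general pattern: left-shifting a negative value is undefined behaviour in C, whereas shifting the non-negative magnitude and negating afterwards is well defined. The helper negate_scaled and its range limits are hypothetical.

```c
#include <assert.h>

/* Shift the non-negative magnitude first, then negate the result,
 * instead of writing (-magnitude) << shift. */
static int negate_scaled(int magnitude, int shift) {
  assert(magnitude >= 0 && magnitude <= 0x7fff);
  assert(shift >= 0 && shift < 16); /* keeps the shifted value in range */
  return -(magnitude << shift);
}
```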
