libvpx
Jingning Han [Thu, 25 Oct 2018 00:01:35 +0000 (00:01 +0000)]
Merge "Fix frame offset computation for GOP extension"

Jingning Han [Thu, 25 Oct 2018 00:01:29 +0000 (00:01 +0000)]
Merge "Refactor gop_length use case in tpl model"

Chi Yo Tsai [Wed, 24 Oct 2018 16:36:20 +0000 (16:36 +0000)]
Merge "Clean up vpx_dsp/x86/convolve_sse2.h"

chiyotsai [Thu, 18 Oct 2018 16:51:56 +0000 (09:51 -0700)]
Clean up vpx_dsp/x86/convolve_sse2.h

Removes unnecessary includes and rewords some functions/comments.

Change-Id: Ied557d7faa9d845d38255e6e3e0e3fe1395276e1

Yunqing Wang [Tue, 23 Oct 2018 22:29:46 +0000 (22:29 +0000)]
Merge "Use 8-tap interp filter in temporal filtering"

Yunqing Wang [Tue, 23 Oct 2018 19:30:13 +0000 (12:30 -0700)]
Use 8-tap interp filter in temporal filtering

Used the 8-tap interp filter in temporal filtering to achieve more accurate
motion search results. Using 8-tap sharp gave slightly better results than
using 8-tap regular.

Speed 0 borg test showed that
        avg_psnr:  ovr_psnr:    ssim:
hdres:  -0.160      -0.157     -0.173
midres: -0.083      -0.061     -0.183
lowres: -0.077      -0.099     -0.204

Speed tests showed no noticeable change in encoder time.

Change-Id: I97dc3c4864b5a5675a6c1e3952799b81eedd7d93

Jingning Han [Tue, 23 Oct 2018 19:16:41 +0000 (19:16 +0000)]
Merge "Remove empty else branch in mode_estimation"

Jingning Han [Mon, 22 Oct 2018 21:21:48 +0000 (14:21 -0700)]
Fix frame offset computation for GOP extension

Properly compute the extended GOP frames' buffer offsets.

Change-Id: I9aed14f4b8d623f1832e782828dce07aa546507d

Jingning Han [Mon, 22 Oct 2018 17:37:09 +0000 (10:37 -0700)]
Refactor gop_length use case in tpl model

Make it support both single- and multi-layer ARF GOP structures.

Change-Id: I760a95804d1b583b057120f6d6be65195a0e6c19

Jingning Han [Tue, 23 Oct 2018 05:51:48 +0000 (22:51 -0700)]
Remove empty else branch in mode_estimation

Change-Id: Iefa184aae80b920b054e3e922a77244c2b0d4b61

Jingning Han [Tue, 23 Oct 2018 02:28:22 +0000 (02:28 +0000)]
Merge "Use the proper gfu_boost factor to compute rd_mult"

Jingning Han [Mon, 22 Oct 2018 16:28:04 +0000 (09:28 -0700)]
Use the proper gfu_boost factor to compute rd_mult

Update the Lagrangian multiplier according to the gfu_boost factor
assigned per frame. It improves the multi-layer ARF compression
performance (results below shown for speed 0):

         avg PSNR      overall PSNR      SSIM
lowres    -0.08%          0.02%         -0.28%
midres    -0.08%          0.03%         -0.22%
hdres     -0.19%         -0.10%         -0.39%
nflx2k    -0.29%         -0.18%         -0.85%

Change-Id: Ifeb4b14918f880ba011ea41c1454ab00504f8855

Hui Su [Fri, 19 Oct 2018 16:48:40 +0000 (16:48 +0000)]
Merge "ML_VAR_PARTITION: enable at speed 5"

Hui Su [Tue, 16 Oct 2018 03:45:07 +0000 (20:45 -0700)]
ML_VAR_PARTITION: enable at speed 5

When the ML_VAR_PARTITION experiment is turned on, replace
REFERENCE_PARTITION with ML_BASED_PARTITION at speed 5.

Coding gains(avg_psnr) compared to baseline:
ytlivehr  1.63%
ytlivelr  0.07%

Tested encoding speed with several clips from ytlivehr and ytlivelr
on linux desktop(rt, vbr, 4 threads). Encoder speed is on average
faster than baseline:
360p:   14% faster
720p:    7% faster
1080p: 1.5% faster

Change-Id: I39b00078176ff516f7306818f33ba2b1ea53dfa1

chiyotsai [Thu, 18 Oct 2018 16:34:20 +0000 (09:34 -0700)]
Changes 4-tap SSSE3 filter to 8-tap AVX2 filter.

AVX2's 8-tap filter is slightly faster than 4-tap SSSE3 filter.

Change-Id: I5fc37c431670780108706b206b32c791828555c9

Chi Yo Tsai [Thu, 18 Oct 2018 18:19:41 +0000 (18:19 +0000)]
Merge "Add SSSE3 support for 4-tap interpolation filter"

Hui Su [Thu, 18 Oct 2018 16:46:15 +0000 (16:46 +0000)]
Merge "Enable rect partition search for HBD at speed 1"

chiyotsai [Wed, 17 Oct 2018 21:52:26 +0000 (14:52 -0700)]
Add SSSE3 support for 4-tap interpolation filter

Performance:
     | 4X4 | 8X8 |16X16|64X64|
2 DIM|1.526|1.827|1.844|1.906|
 HORZ|1.336|1.795|1.886|1.654|
 VERT|1.443|1.539|2.139|2.190|

The ratio is SSSE3 8-tap time / SSSE3 4-tap time.

Change-Id: I01ed2ab494428256e918875774a459afecc5ec6a

Jingning Han [Thu, 18 Oct 2018 16:25:37 +0000 (16:25 +0000)]
Merge "Replace MAX_LAG_BUFFERS with MAX_ARF_GOP_SIZE for gop size"

Yunqing Wang [Wed, 17 Oct 2018 23:11:46 +0000 (23:11 +0000)]
Merge "Optimize vp9_highbd_temporal_filter_apply_c"

Jingning Han [Wed, 17 Oct 2018 23:04:21 +0000 (16:04 -0700)]
Replace MAX_LAG_BUFFERS with MAX_ARF_GOP_SIZE for gop size

MAX_ARF_GOP_SIZE accurately reflects the maximum number of frames operated
on per group of pictures. Use it to replace MAX_LAG_BUFFERS in
such use cases.

Change-Id: Id26f9b1b2b0c38f255dee19795356c387d06d033

Angie Chiang [Wed, 17 Oct 2018 23:10:50 +0000 (23:10 +0000)]
Merge changes I6d5c77af,I6bf504b4,Ie5dc5ea7,Ie6024b1a,If45fba8a, ...

* changes:
  Add do_motion_search
  Preserve code of doing mv search in raster order
  Variant implementation of changing mv search order
  Add feature_score_loc_sort
  Init mv_[dist/cost]_sum in init_tpl_stats
  Change mv search order according to feature_score

Angie Chiang [Wed, 17 Oct 2018 21:45:08 +0000 (14:45 -0700)]
Add do_motion_search

This will make the code cleaner.

Change-Id: I6d5c77af7261c39656b35ec40ac1451bbdbfb7a7

Chi Yo Tsai [Wed, 17 Oct 2018 21:35:14 +0000 (21:35 +0000)]
Merge "Adds SSE2 support for interpolation filter for width 4 and 8"

Angie Chiang [Wed, 17 Oct 2018 20:56:42 +0000 (13:56 -0700)]
Preserve code of doing mv search in raster order

With this change, there will be three versions of the mv search scheme
in the codebase simultaneously.
We will run further experiments to evaluate which version is better
in terms of visual quality and coding performance.

Change-Id: I6bf504b4551316ef10b8a341ab3ba14d0ec977ce

Hui Su [Wed, 17 Oct 2018 15:40:26 +0000 (08:40 -0700)]
Enable rect partition search for HBD at speed 1

This patch enables rectangular partition search on speed 1 for high
bit depth encoding. The encoding speed loss is reduced thanks to
recently added speed features.

This only affects speed 1 high bit-depth encoding.

Coding gains:
                      avg_psnr     ovr_psnr
lowres_bd10(480p)      1.34%        1.40%
midres_bd10(720p)      1.28%        1.33%

Average speed loss:
        QP=30    QP=40    QP=50    average
480p     2.5%     2.3%     2.6%     2.5%
720p     4.0%     3.9%     3.2%     3.7%

Change-Id: Id9cac4eea0769d94e093c9d170194659b3342d89

chiyotsai [Tue, 16 Oct 2018 22:45:05 +0000 (15:45 -0700)]
Adds SSE2 support for interpolation filter for width 4 and 8

Performance:
The chart below shows the speed relative to baseline
(baseline_time/new_time)
_____| 4X4 | 8X8 |16X16|64X64|
2 DIM|1.889|1.780|1.811|1.963|
 HORZ|2.266|1.834|1.617|1.595|
 VERT|2.043|2.190|2.373|2.485|

Change-Id: Ic4262222db78f013b94a8c61b46efb8520722927

Urvang Joshi [Wed, 17 Oct 2018 20:25:02 +0000 (20:25 +0000)]
Merge "For keyframe-only coding do not boost in q mode"

Urvang Joshi [Wed, 17 Oct 2018 18:48:10 +0000 (11:48 -0700)]
For keyframe-only coding do not boost in q mode

If we are using keyframe only coding - either coding a
single frame, or a sequence of keyframes - in the end-usage=q
mode, use the cq_level directly as the quality of each
coded frame, rather than boost them.

Ported from AV1: 563a0d1eb92bdc1e987df071a568d8406c4ffa92

Change-Id: I6dc929b8b4f0aa18e279139077f3a87958c92245

chiyotsai [Tue, 16 Oct 2018 19:26:34 +0000 (12:26 -0700)]
Refactor SSE2 Code for 4-tap interpolation filter on width 16.

Some repeated code is refactored into inline functions. No performance
degradation is observed. These inline functions can be used for width 8
and width 4.

Change-Id: Ibf08cc9ebd2dd47bd2a6c2bcc1616f9d4c252d4d

Yunqing Wang [Sat, 13 Oct 2018 00:21:23 +0000 (17:21 -0700)]
Optimize vp9_highbd_temporal_filter_apply_c

Following the previous patch:
(https://chromium-review.googlesource.com/c/webm/libvpx/+/1277913),
this patch modifies the highbd version of applying the temporal filter
in a similar way.

Change-Id: I2bb6f1fff6e32bca86f7139a497181d34aa9f3ec

chiyotsai [Wed, 17 Oct 2018 00:50:37 +0000 (17:50 -0700)]
Add SSE2 support for 4-tap interpolation filter for width 16.

Horizontal filter on 64x64 block: 1.59 times as fast as baseline.
Vertical filter on 64x64 block: 2.5 times as fast as baseline.
2D filter on 64x64 block: 1.96 times as fast as baseline.

Change-Id: I12e46679f3108616d5b3475319dd38b514c6cb3c

Angie Chiang [Tue, 16 Oct 2018 19:31:13 +0000 (12:31 -0700)]
Variant implementation of changing mv search order

We start mv search from the block with the highest feature score, then
move on to the block's neighbors in a search order determined by
their feature scores.

We use a max heap to achieve this.

This feature is under flag USE_PQSORT
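
A minimal sketch of the max-heap idea described above, not the code from
this change. The HeapEntry and MaxHeap types and the heap_push()/heap_pop()
helpers are hypothetical names for illustration. Entries are keyed on a
feature score, and popping always yields the highest-scoring block still
pending:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry: a block index keyed by its feature score. */
typedef struct {
  int score;
  int block;
} HeapEntry;

/* Hypothetical fixed-capacity binary max heap over caller-owned storage. */
typedef struct {
  HeapEntry *data;
  size_t size;
  size_t capacity;
} MaxHeap;

static void heap_swap(HeapEntry *a, HeapEntry *b) {
  HeapEntry t = *a;
  *a = *b;
  *b = t;
}

/* Append the entry, then sift it up while it beats its parent. */
static void heap_push(MaxHeap *h, HeapEntry e) {
  assert(h->size < h->capacity);
  size_t i = h->size++;
  h->data[i] = e;
  while (i > 0) {
    size_t parent = (i - 1) / 2;
    if (h->data[parent].score >= h->data[i].score) break;
    heap_swap(&h->data[parent], &h->data[i]);
    i = parent;
  }
}

/* Remove and return the highest-scoring entry, then sift the root down. */
static HeapEntry heap_pop(MaxHeap *h) {
  assert(h->size > 0);
  HeapEntry top = h->data[0];
  h->data[0] = h->data[--h->size];
  size_t i = 0;
  for (;;) {
    size_t left = 2 * i + 1, right = left + 1, largest = i;
    if (left < h->size && h->data[left].score > h->data[largest].score)
      largest = left;
    if (right < h->size && h->data[right].score > h->data[largest].score)
      largest = right;
    if (largest == i) break;
    heap_swap(&h->data[i], &h->data[largest]);
    i = largest;
  }
  return top;
}
```

In a search loop of this shape, one would push the starting block, pop the
best candidate, run the search on it, and push its unvisited neighbors.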

Change-Id: Ie5dc5ea715b0f9a7a594e5080a7cb4f5309f5597

Angie Chiang [Mon, 15 Oct 2018 19:25:22 +0000 (12:25 -0700)]
Add feature_score_loc_sort

This CL facilitates the upcoming change,
a variant implementation of changing the mv search order according to
feature score.

Change-Id: Ie6024b1a5ec02343aea6aa81fc14f94e2e515d06

Angie Chiang [Fri, 12 Oct 2018 22:37:26 +0000 (15:37 -0700)]
Init mv_[dist/cost]_sum in init_tpl_stats

Change-Id: If45fba8a74186803eec09da7dbaf2e1fe4e9e156

Angie Chiang [Thu, 11 Oct 2018 00:43:22 +0000 (17:43 -0700)]
Change mv search order according to feature_score

Sort the feature_score in descending order.
Do mv search from the block with the higher score to the block with
the lower score.
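
As a rough illustration of the ordering step (a sketch only; the ScoreLoc
struct and the function names are hypothetical, not taken from the
change), a descending sort with the standard qsort() could look like:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record: a block position and its feature score. */
typedef struct {
  int feature_score;
  int mi_row;
  int mi_col;
} ScoreLoc;

/* qsort comparator: higher feature_score sorts first. */
static int compare_score_desc(const void *a, const void *b) {
  const ScoreLoc *x = (const ScoreLoc *)a;
  const ScoreLoc *y = (const ScoreLoc *)b;
  return (x->feature_score < y->feature_score) -
         (x->feature_score > y->feature_score);
}

/* Sort in place so that mv search can visit blocks from index 0 upward. */
static void sort_by_score_desc(ScoreLoc *locs, size_t n) {
  assert(locs != NULL);
  qsort(locs, n, sizeof(*locs), compare_score_desc);
}
```

The comparator uses comparisons rather than subtraction so it cannot
overflow for large score values.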

Change-Id: I47a87cd66ea3e40d8c8fc55a7517ab8aa10fdb94

Wan-Teh Chang [Wed, 17 Oct 2018 14:43:22 +0000 (14:43 +0000)]
Merge "Reduce the cpi->scaled_ref_idx array size by 1."

Jingning Han [Tue, 16 Oct 2018 21:24:17 +0000 (21:24 +0000)]
Merge "Refactor tpl dependency model to support multi-layer ARF updates"

Jingning Han [Tue, 16 Oct 2018 21:23:52 +0000 (21:23 +0000)]
Merge "Refactor GOP reference frame ordering for tpl model"

Jingning Han [Tue, 16 Oct 2018 21:07:55 +0000 (21:07 +0000)]
Merge "Record gop size"

Hui Su [Tue, 16 Oct 2018 20:58:45 +0000 (20:58 +0000)]
Merge "Fix a bug in ml_prune_rect_partition()"

Yunqing Wang [Tue, 16 Oct 2018 17:55:37 +0000 (17:55 +0000)]
Merge "Fix the filter tap calculation in mips optimizations"

Hui Su [Tue, 16 Oct 2018 16:50:13 +0000 (09:50 -0700)]
Fix a bug in ml_prune_rect_partition()

The quantization step size should be scaled properly for high bit depth
settings.
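
One plausible reading of "scaled properly", shown as a sketch and not as
the actual fix: if the quantizer step at bit depth bd is assumed to be
2^(bd - 8) times larger than its 8-bit counterpart, it can be brought back
to the 8-bit scale with a right shift before being used as a model input.
The function name below is hypothetical:

```c
#include <assert.h>

/* Map a quantization step at the given bit depth back to the 8-bit scale.
 * Assumes the step grows by a factor of 2^(bit_depth - 8), which is an
 * assumption of this sketch. */
static int normalize_q_step(int q_step, int bit_depth) {
  assert(bit_depth == 8 || bit_depth == 10 || bit_depth == 12);
  return q_step >> (bit_depth - 8);
}
```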

This only affects speed 0.
Encoder speed change is almost neutral.
There is a small coding gain of 0.09%.

Change-Id: I96b2bae03a53ce8ccd6428e3a050cfe18e06a024

Jingning Han [Tue, 16 Oct 2018 17:09:52 +0000 (10:09 -0700)]
Refactor tpl dependency model to support multi-layer ARF updates

Refactor to form a systematic reference frame update system for
the temporal dependency model. This prepares to support the multi-
layer ARF system.

Change-Id: Idb90fbe3966695b487c1a0a52f4626b0b6807434

Hui Su [Tue, 16 Oct 2018 17:14:27 +0000 (17:14 +0000)]
Merge "Enable ML based partition search breakout for HBD"

Yunqing Wang [Tue, 16 Oct 2018 16:24:18 +0000 (09:24 -0700)]
Fix the filter tap calculation in mips optimizations

The interp filter tap calculation could not accurately tell the
difference between 2 taps and 4 taps. This patch fixes the bug and
resolves Jenkins test failures in mips sub-pel filter optimizations.

BUG=webm:1568

Change-Id: I51eb8adb7ed194ef2ea7dd4aa57aa9870ee38cfc

Jerome Jiang [Tue, 16 Oct 2018 06:09:36 +0000 (06:09 +0000)]
Merge "fix output file check in vpxenc tests script."

Jingning Han [Tue, 16 Oct 2018 03:49:52 +0000 (03:49 +0000)]
Merge "Add frame_gop_index to GF_GROUP"

Jingning Han [Tue, 16 Oct 2018 03:49:25 +0000 (03:49 +0000)]
Merge "Add encoder side frame buffer for tpl model"

Jerome Jiang [Tue, 16 Oct 2018 00:10:22 +0000 (17:10 -0700)]
fix output file check in vpxenc tests script.

BUG=webm:1556

Change-Id: I4be40e9bf667cd9896017f38d866a47d3e19dcaf

James Zern [Tue, 16 Oct 2018 03:04:37 +0000 (03:04 +0000)]
Merge "CHANGELOG: fix v1.7.0 release date"

Angie Chiang [Tue, 16 Oct 2018 00:43:25 +0000 (00:43 +0000)]
Merge "Add indep loop for motion_compensated_prediction"

Hui Su [Mon, 15 Oct 2018 17:41:21 +0000 (10:41 -0700)]
Enable ML based partition search breakout for HBD

For speed 0:
coding loss 0.045%; encoder speedup 6%.

For speed 1(only affects videos smaller than 720p):
coding loss 0.11%; encoder speedup 6.5%.

Change-Id: Ie441c9bad2021503e86fefd2f1fa3e1a42070bec

Yunqing Wang [Mon, 15 Oct 2018 22:27:49 +0000 (15:27 -0700)]
A temporary fix to mips sub-pel filters

There are Jenkins test failures in mips sub-pel filter optimizations.
[ RUN      ] MSA/ConvolveTest.MatchesReferenceSubpixelFilter/5
../libvpx/test/convolve_test.cc:889: Failure
Expected equality of these values:
  lookup(ref, y * kOutputStride + x)
    Which is: 255
  lookup(out, y * kOutputStride + x)
    Which is: 11
mismatch at (1,0), filters (4,0,1)

This relates to the 4-tap kernel added recently. This CL is a temporary
fix, while we investigate the issue.

BUG=webm:1568

Change-Id: If64c552b794425687cca4fbed893d8ccb73c89a5

Jingning Han [Mon, 15 Oct 2018 22:27:02 +0000 (15:27 -0700)]
Refactor GOP reference frame ordering for tpl model

Process the frames in the order of GOP structure definition.
Decouple the dependency on rc->baseline_gf_interval.

Change-Id: I0d42c542aca552975cc8f08b0eb8b22ccf6a9537

Jingning Han [Mon, 15 Oct 2018 22:21:23 +0000 (15:21 -0700)]
Record gop size

Keep the frame operations needed within a group of pictures.

Change-Id: Iece2e855f21860c930b34a3c586f084f7c61db00

Jingning Han [Mon, 15 Oct 2018 18:48:39 +0000 (11:48 -0700)]
Add frame_gop_index to GF_GROUP

Add frame_gop_index to track the frame offset within a group of
pictures. This reworks the GOP frame offset calculation and its use
cases. The coding stats remain identical.

Change-Id: I94d0957bcc327f6bbeac6e84157635663c36b953

Jingning Han [Mon, 15 Oct 2018 17:11:57 +0000 (10:11 -0700)]
Add encoder side frame buffer for tpl model

Add an encoder side reference frame buffer pool to store the
reference frames for tpl model. This serves as an intermediate
step to support the multi-layer ARF system. The buffer memory size will
be optimized afterwards.

Change-Id: If2d2f095d4911a4996f6c2a0b0a8e3d235ceadb2

James Zern [Mon, 15 Oct 2018 21:12:38 +0000 (14:12 -0700)]
CHANGELOG: fix v1.7.0 release date

BUG=webm:1567

Change-Id: Ia6091445504c8c94334bc062c945238782553d44

Jingning Han [Thu, 11 Oct 2018 19:16:01 +0000 (12:16 -0700)]
Refactor tpl model setup to support multi-layer ARF setup

Generalize the tpl model framework to support the newly designed
GOP structure system. The existing tpl model assumes single layer
ARF.

This design separates the tpl model operation for GOPs with
and without ARF. When a GOP has an ARF, the maximum lookahead
offset sets an upper limit on the frame buffers needed to build the
tpl model for the entire GOP. When a GOP does not have an ARF, we
would use the temporal model with a different approach.

The first step will focus on GOPs with ARF. All tpl model related
operations will only be triggered by ARF frame generation.

Change-Id: I13ab03a7bc68f5a4f6b03f2cb01c10befe955e73

Hui Su [Sat, 13 Oct 2018 15:04:24 +0000 (15:04 +0000)]
Merge "Turn on ml_var_partition_pruning for HBD"

Yunqing Wang [Sat, 13 Oct 2018 03:12:43 +0000 (03:12 +0000)]
Merge "Optimize apply_temporal_filter function"

Jingning Han [Sat, 13 Oct 2018 02:10:26 +0000 (02:10 +0000)]
Merge "Remove unused variable from VP9_COMP"

Yunqing Wang [Fri, 12 Oct 2018 19:25:36 +0000 (12:25 -0700)]
Optimize apply_temporal_filter function

This patch optimized apply_temporal_filter function. The diff^2 for each
pixel in the 16x16 block is calculated once beforehand, so that we don't
calculate it multiple times while evaluating a pixel's neighbors. This
would speed up the function.
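
The precompute-once idea can be sketched as follows. This is an
illustration under assumptions, not the libvpx implementation: the 3x3
neighborhood, the BLK constant, and both function names are hypothetical.
Each pixel's squared difference is computed once into a table, and the
per-pixel neighborhood sums then only read cached values:

```c
#include <assert.h>
#include <stdint.h>

#define BLK 16 /* block width and height assumed by this sketch */

/* Step 1: compute each pixel's squared difference exactly once. */
static void compute_diff_sq(const uint8_t *a, const uint8_t *b, int stride,
                            uint16_t diff_sq[BLK * BLK]) {
  assert(stride >= BLK);
  for (int r = 0; r < BLK; ++r) {
    for (int c = 0; c < BLK; ++c) {
      const int d = a[r * stride + c] - b[r * stride + c];
      /* 255 * 255 = 65025 fits in uint16_t. */
      diff_sq[r * BLK + c] = (uint16_t)(d * d);
    }
  }
}

/* Step 2: sum cached values over the 3x3 neighborhood of (r, c), clipped
 * to the block. Returns the sum and stores the number of pixels used. */
static int neighborhood_sum(const uint16_t diff_sq[BLK * BLK], int r, int c,
                            int *count) {
  int sum = 0;
  *count = 0;
  for (int dr = -1; dr <= 1; ++dr) {
    for (int dc = -1; dc <= 1; ++dc) {
      const int rr = r + dr, cc = c + dc;
      if (rr < 0 || rr >= BLK || cc < 0 || cc >= BLK) continue;
      sum += diff_sq[rr * BLK + cc];
      ++*count;
    }
  }
  return sum;
}
```

Without the table, each squared difference would be recomputed for every
neighborhood that contains the pixel, up to nine times in this sketch.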

Change-Id: Ibdb8b041f317fd6df198950e2acf9cfcde26860d

Hui Su [Fri, 12 Oct 2018 17:59:48 +0000 (10:59 -0700)]
Turn on ml_var_partition_pruning for HBD

This affects speed 0 and 1 only.

Tested on lowres_bd10(480p) and midres_bd10(720p),
                   speed 0       speed 1
coding loss:        0.07%         0.10%
encoder speedup:     14%          6.5%

Change-Id: I5812400d8c7393321b7284d3fca06026842390b5

Hui Su [Fri, 12 Oct 2018 19:59:53 +0000 (19:59 +0000)]
Merge "Enable ML based rect partition pruning for HBD"

Hui Su [Thu, 11 Oct 2018 21:33:40 +0000 (14:33 -0700)]
Enable ML based rect partition pruning for HBD

Tested on lowres_bd10(480p) and midres_bd10(720p), average coding
loss is 0.09%; average encoding speedup is 9%.

Only speed 0 is affected.

Change-Id: Ia8d48c1c6d1669745f0e956b172572a37e42f0c7

Yunqing Wang [Fri, 12 Oct 2018 16:14:00 +0000 (16:14 +0000)]
Merge "Make 4-tap interp filter coefficients even numbers"

Jingning Han [Thu, 11 Oct 2018 22:23:19 +0000 (15:23 -0700)]
Remove unused variable from VP9_COMP

Change-Id: I61447b7a21ac5b03f2a6accd6e433d8f9369e508

Yunqing Wang [Thu, 11 Oct 2018 22:13:47 +0000 (15:13 -0700)]
Make 4-tap interp filter coefficients even numbers

This CL modified 4-tap interp filter coefficients to be even numbers,
which would help in writing 4-tap filter SIMD optimizations. The coding
performance change was negligible. Speed 1 borg test showed:
        avg_psnr:  ovr_psnr:    ssim:
lowres:  -0.003    -0.012      -0.017
midres:  0.029     0.018        0.043
hdres:   0.024     0.044        0.033

Change-Id: Id7c54bb9a9c1aee19c41bc6f1dc3b9682d158bba

Hui Su [Thu, 11 Oct 2018 23:41:57 +0000 (23:41 +0000)]
Merge "ML_VAR_PARTITION: adjust model threshold"

Jingning Han [Thu, 11 Oct 2018 16:22:01 +0000 (16:22 +0000)]
Merge "Call tpl model build at the beginning of a GOP"

Marco Paniconi [Thu, 11 Oct 2018 10:25:12 +0000 (10:25 +0000)]
Merge "Revert "vp8: Increase rate threshold for overshoot-drop""

Marco Paniconi [Thu, 11 Oct 2018 10:24:16 +0000 (10:24 +0000)]
Revert "vp8: Increase rate threshold for overshoot-drop"

This reverts commit bc066684ca4deff24d02ee56071d731b431bf438.

Reason for revert: Regression in webrtc perf test

Original change's description:
> vp8: Increase rate threshold for overshoot-drop
>
> Increase the rate threshold for dropping when
> overshoot is detected during encoding. This helps
> to prevent some unnecessary drops for hard content.
>
> Change-Id: I258bf33883d46347efd44e1e192cb25c444d05fe

TBR=sprang@chromium.org,marpan@google.com,builds@webmproject.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Change-Id: Ib0e84747430ba6d04e479f9efd86d628b80a1e67

Angie Chiang [Thu, 11 Oct 2018 01:18:45 +0000 (01:18 +0000)]
Merge changes Ia5978d91,I3e3754f3

* changes:
  Simplify mode_estimation / tpl_model_store
  Move [inter/intra]_cost change to mode_estimation

Angie Chiang [Wed, 10 Oct 2018 22:10:07 +0000 (15:10 -0700)]
Add indep loop for motion_compensated_prediction

This is for the non_greedy_mv experiment only.
This is part of changing the mv search order according to
feature_score.

Change-Id: I432efccd83d448a4a275dffd37921c76c3d84588

Harish Mahendrakar [Thu, 11 Oct 2018 00:03:26 +0000 (00:03 +0000)]
Merge "Loopfilter Multi-Thread Optimization"

James Zern [Wed, 10 Oct 2018 22:11:15 +0000 (22:11 +0000)]
Merge "subpel asm: fix whitespace"

Jingning Han [Wed, 10 Oct 2018 21:52:30 +0000 (14:52 -0700)]
Call tpl model build at the beginning of a GOP

GOP index 0 is by default the kf / gf. The effective first coding
frame controlled by the current GOP rate allocation has index 1.
Call the tpl model build for the current GOP once, at the index 1
position. This unifies the calling system for single- and multi-layer
ARF GOP structures.

Change-Id: I4ce69337e04646098d5513c0aa56b4e0b4483337

Yunqing Wang [Wed, 10 Oct 2018 21:05:44 +0000 (21:05 +0000)]
Merge "Use 4-tap interp filter in speed 1 sub-pel motion search"

Johann [Wed, 10 Oct 2018 14:44:49 +0000 (07:44 -0700)]
subpel asm: fix whitespace

Change-Id: I7a3314a268cf6049a7260361043e76d4561085c6

Angie Chiang [Tue, 9 Oct 2018 20:40:47 +0000 (13:40 -0700)]
Simplify mode_estimation / tpl_model_store

1) Let mode_estimation() save the results into tpl_frame directly
2) In tpl_model_store(), replace copies of tpl_stats parameters by
   memset()

Change-Id: Ia5978d91cb60cf896bd53d3f27701ef9ae3ba09a

Angie Chiang [Tue, 9 Oct 2018 19:43:36 +0000 (12:43 -0700)]
Move [inter/intra]_cost change to mode_estimation

Change-Id: I3e3754f349d31d17554f02bd14cd34620057ddcb

Angie Chiang [Tue, 9 Oct 2018 18:33:32 +0000 (18:33 +0000)]
Merge changes I67700eba,I9e8f8ed3,Id93565cc

* changes:
  Move feature_score into an independent for loop
  Add set_mv_limits()
  Move lambda into TplDepFrame

Yunqing Wang [Mon, 8 Oct 2018 23:21:54 +0000 (16:21 -0700)]
Use 4-tap interp filter in speed 1 sub-pel motion search

Added the 4-tap interp filter, and used it for speed 1 sub-pel motion
search. Speed 2 motion search still used bilinear filter as before.

Speed 1 borg test showed good bit savings.
        avg_psnr:  ovr_psnr:    ssim:
lowres:  -1.125    -1.179      -1.021
midres:  -0.717    -0.710      -0.543
hdres:   -0.357    -0.370      -0.342
Speed test at speed 1 showed ~10% encoder time increase, which was
partly due to the lack of a SIMD version of the 4-tap filter.

Change-Id: Ic9b48cdc6a964538c20144108526682d64348301

Yunqing Wang [Mon, 8 Oct 2018 22:10:12 +0000 (15:10 -0700)]
Add accurate sub-pel motion search

Added the accurate sub-pel motion search. This patch uses the 8-tap
filter in sub-pel motion search, enabled at speed 0.

Speed 0 borg test showed that
        avg_psnr:  ovr_psnr:    ssim:
lowres:  -1.363     -1.403     -1.282
midres:  -0.842     -0.849     -0.720
hdres:   -0.480     -0.488     -0.503
Speed test at speed 0 showed ~8% encoder time increase.

Change-Id: I194ca709681ea588f3f6381093940e22d03c4d7b

Yunqing Wang [Tue, 9 Oct 2018 16:08:06 +0000 (16:08 +0000)]
Merge "Set up the unit scaling factor for motion search"

Angie Chiang [Mon, 8 Oct 2018 23:37:15 +0000 (16:37 -0700)]
Move feature_score into an independent for loop

We aim to change the mv search order according to feature_score.
This is part of the change.

Change-Id: I67700eba014df92190eabc78060cf29adf0fc38b

Angie Chiang [Mon, 8 Oct 2018 22:51:05 +0000 (15:51 -0700)]
Add set_mv_limits()

Change-Id: I9e8f8ed3eb150b3af1f465f595000bd05d43f3f6

Yunqing Wang [Mon, 8 Oct 2018 18:43:02 +0000 (11:43 -0700)]
Set up the unit scaling factor for motion search

Set up the unit scaling factor used during motion search.

Change-Id: I6fda018d593b7ad4b7658d44c39be950a502d192

Supradeep T R [Tue, 12 Jun 2018 08:27:39 +0000 (13:57 +0530)]
Loopfilter Multi-Thread Optimization

Take the original loopfilter multi-thread optimization
(dafe064289a917977439ab6f4f002b9946496084) along with the fixes for bugs
1558 and 1562.

BUG=webm:1558
BUG=webm:1562

Change-Id: Ibbf6bd13f4ffff0e79184ccfd6b85a49e067a6d8

Angie Chiang [Mon, 8 Oct 2018 21:35:26 +0000 (14:35 -0700)]
Move lambda into TplDepFrame

Change-Id: Id93565cca41e00d4ab5de4c6de30accabf2adc52

Wan-Teh Chang [Fri, 5 Oct 2018 18:27:01 +0000 (11:27 -0700)]
Reduce the cpi->scaled_ref_idx array size by 1.

The last element of the cpi->scaled_ref_idx array was not used, so
reduce the array size by 1.

The corresponding libaom CL is
https://aomedia-review.googlesource.com/c/aom/+/72445.

Change-Id: I9166f0fbe1a7898c8b611b1535fcc74b4f766997

Wan-Teh Chang [Mon, 8 Oct 2018 20:51:28 +0000 (20:51 +0000)]
Merge "Avoid null checks related to pool->frame_bufs."

Wan-Teh Chang [Mon, 8 Oct 2018 19:07:02 +0000 (19:07 +0000)]
Merge "Correct a for loop in init_ref_frame_bufs."

Hui Su [Mon, 8 Oct 2018 17:25:23 +0000 (17:25 +0000)]
Merge "Turn on ml_var_partition_pruning for speed 1"

Wan-Teh Chang [Mon, 8 Oct 2018 17:03:06 +0000 (10:03 -0700)]
Correct a for loop in init_ref_frame_bufs.

The cm->ref_frame_map and pool->frame_bufs arrays are of different sizes
(REF_FRAMES and FRAME_BUFFERS, respectively), so init_ref_frame_bufs()
cannot iterate over these two arrays using the same for loop.
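
The pitfall can be illustrated with a small sketch. The struct, the
function name, and the two size constants below are hypothetical stand-ins
with illustrative values, not the libvpx definitions. Each array gets its
own loop, bounded by its own size:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sizes only; the two arrays deliberately differ in length. */
#define SKETCH_REF_FRAMES 8
#define SKETCH_FRAME_BUFFERS 12

typedef struct {
  int ref_frame_map[SKETCH_REF_FRAMES];
  int ref_count[SKETCH_FRAME_BUFFERS];
} SketchState;

/* One loop per array: a single shared loop would either overrun the
 * shorter array or leave the tail of the longer one uninitialized. */
static void init_ref_frame_bufs_sketch(SketchState *s) {
  assert(s != NULL);
  for (int i = 0; i < SKETCH_REF_FRAMES; ++i) s->ref_frame_map[i] = -1;
  for (int i = 0; i < SKETCH_FRAME_BUFFERS; ++i) s->ref_count[i] = 0;
}
```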

Change-Id: Ica5bbd9d0c30ea3d089ad2d4bcf6cd8ae2daea64

Wan-Teh Chang [Mon, 8 Oct 2018 16:41:55 +0000 (09:41 -0700)]
Avoid null checks related to pool->frame_bufs.

It seems that null pointer checks such as the following may make clang
scan-build think pool->frame_bufs may be a null pointer:

    buf = (buf_idx != INVALID_IDX) ? &pool->frame_bufs[buf_idx] : NULL;
    if (buf != NULL) {

This "misinformation" may make scan-build warn about the ref_cnt_fb()
function's use of its 'bufs' argument (Dereference of null pointer) when
we pass pool->frame_bufs to ref_cnt_fb().

Rewriting the above code as:

    if (buf_idx != INVALID_IDX) {
      buf = &pool->frame_bufs[buf_idx];

not only is clearer but also avoids confusing scan-build.

Change-Id: Ia64858dbd7ff89f74ba1a4fc9239b0c4413592c8

Yunqing Wang [Mon, 8 Oct 2018 15:41:09 +0000 (15:41 +0000)]
Merge "Changes to facilitate accurate sub-pel motion search"

Angie Chiang [Sat, 6 Oct 2018 00:42:47 +0000 (00:42 +0000)]
Merge "Fix bug in prepare_nb_full_mvs"