libvpx
6 years agoExtend 16 wide AVX2 convolve8 code to support averaging.
Kyle Siefring [Sun, 8 Oct 2017 03:25:03 +0000 (23:25 -0400)]
Extend 16 wide AVX2 convolve8 code to support averaging.

Also adds vpx_convolve8_avg_horiz_avx2.

Change-Id: I38783d972ac26bec77610e9e15a0a058ed498cbf

6 years agoAdd AVX2 version of vpx_convolve8_avg.
Kyle Siefring [Sat, 7 Oct 2017 20:02:02 +0000 (16:02 -0400)]
Add AVX2 version of vpx_convolve8_avg.

vpx_convolve8_avg works by first running a normal horizontal filter, then a
vertical filter that averages at the end.

The added vpx_convolve8_avg_avx2 calls pre-existing AVX2 code for the
horizontal step.

vpx_convolve8_avg_vert_avx2 is also added, but only uses ssse3 code.

Change-Id: If5160c0c8e778e10de61ee9bf42ee4be5975c983
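
The two-step structure described above can be sketched in plain C. This is
only an illustration under assumptions, not the libvpx implementation: the
names filter8 and convolve8_avg_sketch are hypothetical, the kernels are
assumed to be 8-tap with 7 fractional bits, and src is assumed to point at
the first tap of the first row (3 rows up and 3 columns left of the block).

```c
#include <assert.h>
#include <stdint.h>

#define TAPS 8
#define FILTER_BITS 7 /* assumed: kernel taps sum to 128 */
#define MAX_SIZE 64   /* assumed maximum block width/height */

static uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Applies an 8-tap kernel to 8 samples spaced `step` apart, starting at p. */
static uint8_t filter8(const uint8_t *p, int step, const int16_t *kernel) {
  int k, sum = 0;
  for (k = 0; k < TAPS; ++k) sum += p[k * step] * kernel[k];
  return clip_pixel((sum + (1 << (FILTER_BITS - 1))) >> FILTER_BITS);
}

/* Hypothetical sketch: horizontal filter into a temporary buffer, then a
 * vertical filter whose result is averaged with the existing dst pixel. */
static void convolve8_avg_sketch(const uint8_t *src, int src_stride,
                                 uint8_t *dst, int dst_stride,
                                 const int16_t *kx, const int16_t *ky,
                                 int w, int h) {
  uint8_t temp[(MAX_SIZE + TAPS - 1) * MAX_SIZE];
  int x, y;
  assert(w <= MAX_SIZE && h <= MAX_SIZE);
  /* Horizontal pass: h + 7 rows so every vertical tap has input. */
  for (y = 0; y < h + TAPS - 1; ++y)
    for (x = 0; x < w; ++x)
      temp[y * w + x] = filter8(src + y * src_stride + x, 1, kx);
  /* Vertical pass: filter, then round-average into dst. */
  for (y = 0; y < h; ++y)
    for (x = 0; x < w; ++x) {
      const int v = filter8(temp + y * w + x, w, ky);
      dst[y * dst_stride + x] = (uint8_t)((dst[y * dst_stride + x] + v + 1) >> 1);
    }
}
```

Only the second pass touches dst, which is why the horizontal step can reuse
a non-averaging filter unchanged.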

6 years agoMerge "ppc: Add vpx_idct32x32_1024_add_vsx"
James Zern [Sat, 7 Oct 2017 19:08:26 +0000 (19:08 +0000)]
Merge "ppc: Add vpx_idct32x32_1024_add_vsx"

6 years agoMerge "Revert "Speed >=5 real-time: add TM intra mode for high_source_sad.""
Marco Paniconi [Fri, 6 Oct 2017 22:41:34 +0000 (22:41 +0000)]
Merge "Revert "Speed >=5 real-time: add TM intra mode for high_source_sad.""

6 years agoRevert "Speed >=5 real-time: add TM intra mode for high_source_sad."
Marco Paniconi [Fri, 6 Oct 2017 22:14:56 +0000 (22:14 +0000)]
Revert "Speed >=5 real-time: add TM intra mode for high_source_sad."

This reverts commit 9311ef18b4b4eff0da3adf9d702a34f489a270ff.

Reason for revert:
Notice small regression in some clips.
Will revisit in another change.

Original change's description:
> Speed >=5 real-time: add TM intra mode for high_source_sad.
>
> Small/neutral change in metrics or speed for ytlive.
> Some improvement in quality on frames with big content change.
>
> Change-Id: Ib3b0703a5f28ea6710e90324436e27598ab7384d

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: I9d8ec5195bb05ddf329d325699355185affb9b13
No-Presubmit: true
No-Tree-Checks: true
No-Try: true

6 years agoAdjust threshold in scene detection
Marco [Fri, 6 Oct 2017 17:53:40 +0000 (10:53 -0700)]
Adjust threshold in scene detection

For 1 pass vbr: increase min_thresh slightly, and also add
condition on golden/arf update for using full nonrd_pick_partition.

Reduces possible false detection for scene cut detection.

Neutral/small change in metrics or speed for speed 5.

Change-Id: I388f4d9a56e3cc763e0148338c1bc0381e58ad76

6 years agoMerge "Speed >=5 real-time: add TM intra mode for high_source_sad."
Marco Paniconi [Fri, 6 Oct 2017 06:29:46 +0000 (06:29 +0000)]
Merge "Speed >=5 real-time: add TM intra mode for high_source_sad."

6 years agoSpeed >=5 real-time: add TM intra mode for high_source_sad.
Marco [Thu, 5 Oct 2017 19:58:51 +0000 (12:58 -0700)]
Speed >=5 real-time: add TM intra mode for high_source_sad.

Small/neutral change in metrics or speed for ytlive.
Some improvement in quality on frames with big content change.

Change-Id: Ib3b0703a5f28ea6710e90324436e27598ab7384d

6 years agoMerge "vpx_codec.h: namespace local defines"
James Zern [Fri, 6 Oct 2017 05:30:16 +0000 (05:30 +0000)]
Merge "vpx_codec.h: namespace local defines"

6 years agovpx_codec.h: namespace local defines
James Zern [Thu, 5 Oct 2017 22:09:33 +0000 (15:09 -0700)]
vpx_codec.h: namespace local defines

add VPX_ to UNUSED/*DEPRECATED to avoid conflicts with other headers.

Change-Id: Ie16bdac3575bc1af57a05d37e65b994370585377
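
A minimal sketch of why the prefix helps, assuming GCC-style attributes. The
unprefixed UNUSED below is a hypothetical stand-in for a macro from some
other header; the exact definitions in vpx_codec.h may differ from the ones
shown here.

```c
/* Hypothetical: another header's own UNUSED, with a different meaning. */
#define UNUSED(x) (void)(x)

/* Prefixed macros in the style the commit describes (assumed definitions). */
#if defined(__GNUC__)
#define VPX_UNUSED __attribute__((unused))
#define VPX_DEPRECATED __attribute__((deprecated))
#else
#define VPX_UNUSED
#define VPX_DEPRECATED
#endif

/* Uses the prefixed attribute macro: no warning if never called. */
static VPX_UNUSED int add_one(int x) { return x + 1; }

/* Uses the other header's macro; both coexist without redefinition. */
static int ignore_second(int x, int y) {
  UNUSED(y);
  return x;
}
```

Had both headers defined a bare UNUSED, the second definition would trigger
a redefinition warning or silently change the meaning at the use sites.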

6 years agovp9_ethread_test: abort early/add more detailed output
James Zern [Thu, 5 Oct 2017 22:02:51 +0000 (15:02 -0700)]
vp9_ethread_test: abort early/add more detailed output

in case compare_fp_stats fails, report the 2 values and their index

Change-Id: I927a832b7a1e24c392961093b7caee1134223def

6 years agoMerge "Adjust threshold for adapt_partition for speed 6."
Marco Paniconi [Thu, 5 Oct 2017 03:28:06 +0000 (03:28 +0000)]
Merge "Adjust threshold for adapt_partition for speed 6."

6 years agoAdjust threshold for adapt_partition for speed 6.
Marco [Thu, 5 Oct 2017 01:01:37 +0000 (18:01 -0700)]
Adjust threshold for adapt_partition for speed 6.

Lower SAD threshold to select non_rd pickmode partition
at superblock level more often.
Small gain in metrics, small/negligible decrease in speed.

Change-Id: I0f728236b91a604e4ca7e02039adc54d5985c4dc

6 years agoMerge "Avoid nonrd_pick_partition for speed >= 6."
Marco Paniconi [Wed, 4 Oct 2017 23:36:27 +0000 (23:36 +0000)]
Merge "Avoid nonrd_pick_partition for speed >= 6."

6 years agoAvoid nonrd_pick_partition for speed >= 6.
Marco [Wed, 4 Oct 2017 22:27:45 +0000 (15:27 -0700)]
Avoid nonrd_pick_partition for speed >= 6.

For 1 pass vbr speed >= 6: when REFERENCE_PARTITION is selected,
avoid doing the full nonrd_pickmode based partition.
No change in overall metrics or speed.
Reduces encode times on scene cuts by 10-20%.

Change-Id: I0310b1610cc1c83793a509e0a9059840e8f18308

6 years agoMerge "Modify early exit for alt_ref in nonrd_pickmode."
Marco Paniconi [Wed, 4 Oct 2017 19:38:49 +0000 (19:38 +0000)]
Merge "Modify early exit for alt_ref in nonrd_pickmode."

6 years agoModify early exit for alt_ref in nonrd_pickmode.
Marco [Wed, 4 Oct 2017 18:41:52 +0000 (11:41 -0700)]
Modify early exit for alt_ref in nonrd_pickmode.

For 1 pass vbr mode:
On no-show_frame/ARF: instead of skipping alt_ref_frame
completely in mode testing, allow for checking (0, 0) on alt_ref.

Small gain in metrics, ~0.18%, no change in speed.

Change-Id: I32a3c24faca64ab70dd5091071a0dc301db7dd1e

6 years agoMerge changes Id6a8c549,Ib1e0650b,Ic369dd86
Linfeng Zhang [Wed, 4 Oct 2017 16:15:14 +0000 (16:15 +0000)]
Merge changes Id6a8c549,Ib1e0650b,Ic369dd86

* changes:
  Refactor x86/vpx_subpixel_8t_intrin_ssse3.c
  Add vpx_dsp/x86/mem_sse2.h
  Add transpose_8bit_{4x4,8x8}() x86 optimization

6 years agoMerge "Fix image width alignment. Enable ImageSizeSetting test."
Jerome Jiang [Wed, 4 Oct 2017 14:48:03 +0000 (14:48 +0000)]
Merge "Fix image width alignment. Enable ImageSizeSetting test."

6 years agoEnable arf usage for speed >= 6, 1 pass vbr.
Marco [Wed, 4 Oct 2017 00:14:24 +0000 (17:14 -0700)]
Enable arf usage for speed >= 6, 1 pass vbr.

For speed 6 on ytlive set:
On average, speed slowdown ~5%, quality gain ~2%.

Change-Id: Ia18237cc1d52c54d7e2cb3c71f571cf37ef61b44

6 years agovp9: 1 pass vbr: Limit qpdelta on high_source_sad.
Marco [Fri, 29 Sep 2017 18:34:00 +0000 (11:34 -0700)]
vp9: 1 pass vbr: Limit qpdelta on high_source_sad.

For 1 pass vbr: when significant content/scene change is detected
(high_source_sad = 1) reduce/turn off the additional qdelta on the
active_worst_quality. This helps somewhat to reduce the occurrence
of large frame sizes and large encode times.
Allow it only when use_altref_onepass is enabled.

Neutral/no change on metrics.

Change-Id: I1dd97dd2ab892d65f707b841b27a5de300b714ea

6 years agoMerge "vpx: fix nasm build errors"
James Zern [Tue, 3 Oct 2017 21:47:49 +0000 (21:47 +0000)]
Merge "vpx: fix nasm build errors"

6 years agovpx: fix nasm build errors
Scott LaVarnway [Sat, 30 Sep 2017 12:51:24 +0000 (05:51 -0700)]
vpx: fix nasm build errors

BUG=webm:1462,766721

Change-Id: Icfa536a8e38623636b96c396e3c94889bfde7a98

6 years agoRefactor x86/vpx_subpixel_8t_intrin_ssse3.c
Linfeng Zhang [Mon, 2 Oct 2017 21:29:06 +0000 (14:29 -0700)]
Refactor x86/vpx_subpixel_8t_intrin_ssse3.c

Change-Id: Id6a8c549709a3c516ed5d7b719b05117c5ef8bac

6 years agoAdd vpx_dsp/x86/mem_sse2.h
Linfeng Zhang [Mon, 2 Oct 2017 20:46:15 +0000 (13:46 -0700)]
Add vpx_dsp/x86/mem_sse2.h

Add some load and store sse2 inline functions.

Change-Id: Ib1e0650b5a3d8e2b3736ab7c7642d6e384354222

6 years agoUse adapt_partition for ARF in 1 pass.
Marco [Tue, 3 Oct 2017 17:55:55 +0000 (10:55 -0700)]
Use adapt_partition for ARF in 1 pass.

For speed 6 real-time mode: use adapt_partition
on ARF frame instead of REFERENCE_PARTITION (which is slower).
This requires enabling compute_source_sad_onepass for no-show_frames.

Speedup of ~3-5% on some clips that heavily use ARF,
small loss (~0.2%) in quality on ytlive set.

Change-Id: Ib50acc97df06458244a6ac55d2bd882c30012536

6 years agoAdd transpose_8bit_{4x4,8x8}() x86 optimization
Linfeng Zhang [Mon, 2 Oct 2017 20:01:56 +0000 (13:01 -0700)]
Add transpose_8bit_{4x4,8x8}() x86 optimization

Change-Id: Ic369dd86b3b81686f68fbc13ad34ab8ea8846878

6 years agoMerge "ARF in 1 pass vbr: modify skip ref_frame in nonrd_pickmode."
Marco Paniconi [Tue, 3 Oct 2017 03:01:14 +0000 (03:01 +0000)]
Merge "ARF in 1 pass vbr: modify skip ref_frame in nonrd_pickmode."

6 years agoARF in 1 pass vbr: modify skip ref_frame in nonrd_pickmode.
Marco [Mon, 2 Oct 2017 21:00:18 +0000 (14:00 -0700)]
ARF in 1 pass vbr: modify skip ref_frame in nonrd_pickmode.

Speedup of ~2-3% on 1080p clips speed 6.
Neutral/negligible loss in metrics on ytlive.

Change-Id: I7ac47a4d8b58c566920bae29a94a0e8d59c36dee

6 years agoAdd 4 to 3 scaling NEON optimization
Linfeng Zhang [Tue, 12 Sep 2017 18:49:58 +0000 (11:49 -0700)]
Add 4 to 3 scaling NEON optimization

Speedup compared with the version calling vpx_scaled_2d_neon():
  ~1.7x in general
  ~2.8x for BILINEAR filter

BUG=webm:1419

Change-Id: I8f0a54c2013e61ea086033010f97c19ecf47c7c6

6 years agoSpecialize 4 to 3 frame scaling in C
Linfeng Zhang [Wed, 20 Sep 2017 17:58:39 +0000 (10:58 -0700)]
Specialize 4 to 3 frame scaling in C

Scale 3x3 block instead of 16x16 block in each loop. Disabled by
default.

Benefits:
1. Reduced number of different phase_scaler from 16 to 3.
   Optimization code will be smaller and faster.
2. Maximum phase_scaler drifting will be reduced from 5/16 to 1/24.
   (The drifting is 1/(3*16) in each step.)

BUG=webm:1419

Change-Id: I59a1f7496d89a1b090498c935d30cfcf1d0c282b
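
The phase and drift figures above can be checked with a short calculation.
The sketch below assumes the step between output pixels is 21 in 1/16-pel
units (16 * 4 / 3, truncated) and that the position is re-anchored exactly
at the start of every block; STEP_Q4, count_phases and max_drift_48 are
hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Assumed step between output pixels for 4:3 scaling, in 1/16-pel units. */
#define STEP_Q4 21

/* Counts the distinct 1/16-pel phases hit by n consecutive output pixels. */
static int count_phases(int n) {
  int seen[16] = { 0 };
  int i, count = 0;
  assert(n >= 1);
  for (i = 0; i < n; ++i) {
    const int phase = (i * STEP_Q4) & 15;
    if (!seen[phase]) {
      seen[phase] = 1;
      ++count;
    }
  }
  return count;
}

/* Maximum drift within a block of n outputs, in 1/48-pel units.
 * The exact step is 4/3 pel = 64/48; the truncated step is 21/16 = 63/48,
 * so each step loses 1/48 pel. */
static int max_drift_48(int n) {
  assert(n >= 1);
  return (n - 1) * (64 - 3 * STEP_Q4);
}
```

Under these assumptions a 16-wide block visits all 16 phases and drifts by
15/48 = 5/16 pel, while a 3-wide block visits 3 phases and drifts by
2/48 = 1/24 pel.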

6 years agoMerge "vpxdsp: [x86] add highbd_d135_predictor functions"
Scott LaVarnway [Mon, 2 Oct 2017 15:00:19 +0000 (15:00 +0000)]
Merge "vpxdsp: [x86] add highbd_d135_predictor functions"

6 years agoppc: Add vpx_idct32x32_1024_add_vsx
Alexandra Hájková [Mon, 31 Jul 2017 19:07:22 +0000 (19:07 +0000)]
ppc: Add vpx_idct32x32_1024_add_vsx

Change-Id: I55cd0a1569ccc47a53d0ecf751aac259d510e10d

6 years agoFix partition selection in speed features for arf overlay frame.
Marco [Fri, 29 Sep 2017 21:54:56 +0000 (14:54 -0700)]
Fix partition selection in speed features for arf overlay frame.

For real-time mode. Move the switch to fixed partition
for is_src_frame_alt_ref so all speeds may use it
if use_altref_onepass is set.

Improves metrics by ~2% for ytlive set at speed 4
(where use_altref_onepass is currently used).

Change-Id: I033240386598c9dbd0364da89ccbcca64bc663ee

6 years agoEnable use_altref_onepass for speed 4 real-time mode.
Marco [Fri, 29 Sep 2017 17:53:59 +0000 (10:53 -0700)]
Enable use_altref_onepass for speed 4 real-time mode.

Used for VBR mode with lag-in-frames > 0.
On ytlive set at speed 4: ~3% average gain.

Change-Id: I45dad1700bf8be9d8f177815dc062774f6f2f0de

6 years agovpxdsp: [x86] add highbd_d135_predictor functions
Scott LaVarnway [Fri, 29 Sep 2017 13:34:16 +0000 (06:34 -0700)]
vpxdsp: [x86] add highbd_d135_predictor functions

C vs SSE2 speed gains:
_4x4 : ~1.81x

C vs SSSE3 speed gains:
_8x8 : ~1.96x
_16x16 : ~1.88x
_32x32 : ~2.02x

BUG=webm:1411

Change-Id: Iefaf8b39afbbfe34c1ad1d21e3a003b20f1f61e0

6 years agovpxdsp: [x86] add highbd_d117_predictor functions
Scott LaVarnway [Wed, 20 Sep 2017 12:21:23 +0000 (05:21 -0700)]
vpxdsp: [x86] add highbd_d117_predictor functions

C vs SSE2 speed gains:
_4x4 : ~2.04x

C vs SSSE3 speed gains:
_8x8 : ~2.82x
_16x16 : ~5.93x
_32x32 : ~2.79x

BUG=webm:1411

Change-Id: I31d949695991c067dac89d91e0bed3e666c94993

6 years agoFix image width alignment. Enable ImageSizeSetting test.
Jerome Jiang [Wed, 27 Sep 2017 18:08:37 +0000 (11:08 -0700)]
Fix image width alignment. Enable ImageSizeSetting test.

BUG=b/64710201

Change-Id: I5465f6c6481d3c9a5e00fcab024cf4ae562b6b01

6 years agoSet rc->high_source_sad = 0 before scene detection.
Marco [Thu, 28 Sep 2017 17:47:34 +0000 (10:47 -0700)]
Set rc->high_source_sad = 0 before scene detection.

Only has effect when sf->use_altref_onepass is enabled,
as in that case scene detection is skipped for non-show frame
and so high_source_sad does not get reset to 0.

No change in metrics or speed.

Change-Id: I421f066d239341449c18826089e1810b9fc5967f

6 years agoMerge "vp9: Modification to adapt the ARF usage for 1 pass vbr"
Marco Paniconi [Thu, 28 Sep 2017 16:52:28 +0000 (16:52 +0000)]
Merge "vp9: Modification to adapt the ARF usage for 1 pass vbr"

6 years agovp9: Modification to adapt the ARF usage for 1 pass vbr
Marco [Thu, 21 Sep 2017 17:59:33 +0000 (10:59 -0700)]
vp9: Modification to adapt the ARF usage for 1 pass vbr

Add stats for past ARF usage, and use it to disable
ARF usage based on some conditions.

Overall improvement on ytlive set, reduces the regression
on the problem clips for this feature.

Only affects when sf->use_altref_onepass is enabled
(currently off by default).

Change-Id: I66267f227ea132dc86acb730e9882f85bead2cdb

6 years agoAdd use_svc condition to the scene detection in 1 pass.
Marco [Wed, 27 Sep 2017 21:49:58 +0000 (14:49 -0700)]
Add use_svc condition to the scene detection in 1 pass.

Scene detection is not currently used in SVC 1 pass code.
Speedup of ~0.4%.

Change-Id: I0ab769300919de710cd2da1402014fa3f22a1f86

6 years agoMerge "Revert "Remove the speed condition on scene detection in 1 pass code.""
Marco Paniconi [Wed, 27 Sep 2017 20:42:48 +0000 (20:42 +0000)]
Merge "Revert "Remove the speed condition on scene detection in 1 pass code.""

6 years agoMerge "vpxdsp: [x86] add highbd_d153_predictor functions"
Scott LaVarnway [Wed, 27 Sep 2017 20:40:21 +0000 (20:40 +0000)]
Merge "vpxdsp: [x86] add highbd_d153_predictor functions"

6 years agoRevert "Remove the speed condition on scene detection in 1 pass code."
Marco Paniconi [Wed, 27 Sep 2017 19:42:48 +0000 (19:42 +0000)]
Revert "Remove the speed condition on scene detection in 1 pass code."

This reverts commit 535b7b915ae5574db2f95632243cc5bee865f02e.

This is actually used in CBR to reset the rate control if high source sad is detected.

Original change's description:
> Remove the speed condition on scene detection in 1 pass code.
>
> Scene detection is used for VBR mode and for screen_content mode.
>
> It was also enabled for CBR mode via the speed condition,
> but currently the analysis in the scene detection is not used
> in CBR mode (similar computations are done locally at superblock level
> when the source_sad feature is enabled).
>
> For 1 pass code.
> No change in behavior. Small speed gain, ~0.5%.
>
> Change-Id: I59991d7ef2af320bea7af4b907596e057affa42f

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: Ib4e6b02047f75632503e7b0fc870af97fa9291c3
No-Presubmit: true
No-Tree-Checks: true
No-Try: true

6 years agoMerge "fix signed integer overflow of idct"
James Zern [Wed, 27 Sep 2017 19:39:11 +0000 (19:39 +0000)]
Merge "fix signed integer overflow of idct"

6 years agoMerge "vp9_dx_iface: Stop using iter parameter incorrectly"
James Zern [Wed, 27 Sep 2017 18:37:20 +0000 (18:37 +0000)]
Merge "vp9_dx_iface: Stop using iter parameter incorrectly"

6 years agofix signed integer overflow of idct
Linfeng Zhang [Tue, 26 Sep 2017 19:33:40 +0000 (12:33 -0700)]
fix signed integer overflow of idct

Exposed by fuzz test in high bitdepth.
The bug was introduced in commit 64653fa.

BUG=webm:1466

Change-Id: Idd77d5c6a60efb9241471611ce1aba0646cb6ff5

6 years agovpxdsp: [x86] add highbd_d153_predictor functions
Scott LaVarnway [Wed, 27 Sep 2017 17:06:14 +0000 (10:06 -0700)]
vpxdsp: [x86] add highbd_d153_predictor functions

C vs SSE2 speed gains:
_4x4 : ~1.95x

C vs SSSE3 speed gains:
_8x8 : ~3.30x
_16x16 : ~5.67x
_32x32 : ~3.87x

BUG=webm:1411

Change-Id: Ib483989b25614aa89b635e8c087d0879a5d71904

6 years agoRemove the speed condition on scene detection in 1 pass code.
Marco [Wed, 27 Sep 2017 17:11:24 +0000 (10:11 -0700)]
Remove the speed condition on scene detection in 1 pass code.

Scene detection is used for VBR mode and for screen_content mode.

It was also enabled for CBR mode via the speed condition,
but currently the analysis in the scene detection is not used
in CBR mode (similar computations are done locally at superblock level
when the source_sad feature is enabled).

For 1 pass code.
No change in behavior. Small speed gain, ~0.5%.

Change-Id: I59991d7ef2af320bea7af4b907596e057affa42f

6 years agovp9_dx_iface: Stop using iter parameter incorrectly
Vignesh Venkatasubramanian [Mon, 25 Sep 2017 23:56:04 +0000 (16:56 -0700)]
vp9_dx_iface: Stop using iter parameter incorrectly

'iter' parameter is being checked for NULL in every call to
decoder_get_frame which is quite pointless because it is always
going to be NULL unless the application changed it. The code works
as described only because vp9_get_raw_frame returns -1 on all
subsequent calls after the first.

Change-Id: Ic736b9e8fe36fc1430fc11d6a9b292be02497248
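
The calling pattern in question can be sketched with a mock decoder. All
names below (mock_iter_t, mock_decoder_t, mock_get_frame, drain_frames) are
hypothetical and stand in for the real API; the point is only that the loop
ends because the decoder reports no further frame, not because of anything
stored in iter.

```c
#include <stddef.h>

typedef const void *mock_iter_t; /* stands in for the iterator type */

typedef struct {
  int frame_ready; /* set when a decoded frame is pending */
  int image;       /* placeholder for the decoded image */
} mock_decoder_t;

/* Returns the pending image once, then NULL until new data is decoded.
 * The iterator is accepted but deliberately ignored. */
static const int *mock_get_frame(mock_decoder_t *dec, mock_iter_t *iter) {
  (void)iter;
  if (!dec->frame_ready) return NULL;
  dec->frame_ready = 0;
  return &dec->image;
}

/* Typical application loop: iter starts as NULL and is never modified. */
static int drain_frames(mock_decoder_t *dec) {
  mock_iter_t iter = NULL;
  int count = 0;
  while (mock_get_frame(dec, &iter) != NULL) ++count;
  return count;
}
```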

6 years agoMerge "Add vpx_scaled_2d_neon()"
Linfeng Zhang [Wed, 27 Sep 2017 16:12:48 +0000 (16:12 +0000)]
Merge "Add vpx_scaled_2d_neon()"

6 years agoMerge "Add unit test to expose vp8 bug when width is set odd."
Jerome Jiang [Wed, 27 Sep 2017 01:26:59 +0000 (01:26 +0000)]
Merge "Add unit test to expose vp8 bug when width is set odd."

6 years agoMerge "vp8: [loongson] optimize copymen with mmi"
Shiyou Yin [Wed, 27 Sep 2017 00:49:28 +0000 (00:49 +0000)]
Merge "vp8: [loongson] optimize copymen with mmi"

6 years agoAdd unit test to expose vp8 bug when width is set odd.
Jerome Jiang [Thu, 21 Sep 2017 17:52:20 +0000 (10:52 -0700)]
Add unit test to expose vp8 bug when width is set odd.

BUG=b/64710201

Change-Id: Ia518af5494a42e80949cf1165244fbed59606cf7

6 years agoRemove the speed condition in setting compute_source_sad.
Marco [Tue, 26 Sep 2017 22:47:14 +0000 (15:47 -0700)]
Remove the speed condition in setting compute_source_sad.

The speed condition is not needed; the feature can be used at any
speed in 1 pass code.

Change-Id: I878ef3f63a075302eda48c0343fa243c80aab9ba

6 years agoReplace flag USE_ALTREF_FOR_ONE_PASS with speed feature.
Marco [Tue, 26 Sep 2017 17:18:43 +0000 (10:18 -0700)]
Replace flag USE_ALTREF_FOR_ONE_PASS with speed feature.

To be used for 1 pass VBR.
Off by default in speed features.

Change-Id: I5d6110d6d191990db526fe68ec9715379a4d1754

6 years agoMerge "SVC: Add setting for max_intra_rate_pct in sample encoder."
Marco Paniconi [Tue, 26 Sep 2017 16:28:30 +0000 (16:28 +0000)]
Merge "SVC: Add setting for max_intra_rate_pct in sample encoder."

6 years agoAdd vpx_scaled_2d_neon()
Linfeng Zhang [Tue, 19 Sep 2017 23:55:35 +0000 (16:55 -0700)]
Add vpx_scaled_2d_neon()

BUG=webm:1419

Change-Id: I39c8033734562efc0ac0e28e7f06fa05130f9b96

6 years agoMerge changes Ib9105462,Idfac00ed,If8d8a0e2
Linfeng Zhang [Tue, 26 Sep 2017 16:10:46 +0000 (16:10 +0000)]
Merge changes Ib9105462,Idfac00ed,If8d8a0e2

* changes:
  cosmetics: NEON scaling code
  Refactor convolve NEON code
  Refactor convolve code

6 years agovp8: [loongson] optimize copymen with mmi
Shiyou Yin [Wed, 6 Sep 2017 09:57:16 +0000 (17:57 +0800)]
vp8: [loongson] optimize copymen with mmi

1. vp8_copy_mem16x16_mmi
2. vp8_copy_mem8x8_mmi
3. vp8_copy_mem8x4_mmi

Change-Id: I3de29a11fa7402df0e48bbb944440b1e66498a65

6 years agoSVC: Add setting for max_intra_rate_pct in sample encoder.
Marco [Mon, 25 Sep 2017 20:36:25 +0000 (13:36 -0700)]
SVC: Add setting for max_intra_rate_pct in sample encoder.

Set it as default to 900.

Change-Id: Id2d990925dccff1f6762411c66ea95973440c92f

6 years agoMerge "vpxdsp: [x86] add highbd_d45_predictor functions"
Scott LaVarnway [Mon, 25 Sep 2017 11:34:14 +0000 (11:34 +0000)]
Merge "vpxdsp: [x86] add highbd_d45_predictor functions"

6 years agovpxdsp: [x86] add highbd_d45_predictor functions
Scott LaVarnway [Wed, 30 Aug 2017 16:27:44 +0000 (09:27 -0700)]
vpxdsp: [x86] add highbd_d45_predictor functions

C vs SSSE3 speed gains:
_4x4 : ~2.45x
_8x8 : ~10.61x
_16x16 : ~11.34x
_32x32 : ~6.36x

BUG=webm:1411

Change-Id: Ic91389a4f1a8ad093f498afe53765b897fb9be09

6 years agoMerge changes If59743aa,Ib046fe28,Ia2345752
James Zern [Fri, 22 Sep 2017 07:35:55 +0000 (07:35 +0000)]
Merge changes If59743aa,Ib046fe28,Ia2345752

* changes:
  Remove the unnecessary cast of (int16_t)cospi_{1...31}_64
  Remove the unnecessary upcasts of (int)cospi_{1...31}_64
  Change cospi_{1...31}_64 from tran_high_t to tran_coef_t

6 years agoMerge "Comma-separate VP9 encoder tmp.stt output"
Andrew Lewis [Thu, 21 Sep 2017 08:50:53 +0000 (08:50 +0000)]
Merge "Comma-separate VP9 encoder tmp.stt output"

6 years agoMerge "vp9: Modify pickmode early exit for ARF in 1pass."
Marco Paniconi [Thu, 21 Sep 2017 01:33:12 +0000 (01:33 +0000)]
Merge "vp9: Modify pickmode early exit for ARF in 1pass."

6 years agovp9: Modify pickmode early exit for ARF in 1pass.
Marco [Wed, 20 Sep 2017 21:55:31 +0000 (14:55 -0700)]
vp9: Modify pickmode early exit for ARF in 1pass.

Add the condition frames_since_golden > 0 to the
early exit check for ARF usage in nonrd_pickmode.
This improves quality of first frame following ARF, where
frames_since_golden = 0.

Small/neutral gain in metrics for speed 6, neutral change in speed.

Only affects when USE_ALTREF_FOR_ONE_PASS is enabled.

Change-Id: I82e73e6ff6fc849e5ca5448563cb8a0515fe0cdc

6 years agoRemove the unnecessary cast of (int16_t)cospi_{1...31}_64
Linfeng Zhang [Mon, 18 Sep 2017 16:33:31 +0000 (09:33 -0700)]
Remove the unnecessary cast of (int16_t)cospi_{1...31}_64

BUG=webm:1450

Change-Id: If59743aafe99226e0ec67ab5d20678ce25f53ab8

6 years agoRemove the unnecessary upcasts of (int)cospi_{1...31}_64
Linfeng Zhang [Wed, 13 Sep 2017 20:05:47 +0000 (13:05 -0700)]
Remove the unnecessary upcasts of (int)cospi_{1...31}_64

BUG=webm:1450

Change-Id: Ib046fe28caec5b9ebdc9d0152df7c54ff4266858

6 years agoChange cospi_{1...31}_64 from tran_high_t to tran_coef_t
Linfeng Zhang [Wed, 13 Sep 2017 00:13:17 +0000 (17:13 -0700)]
Change cospi_{1...31}_64 from tran_high_t to tran_coef_t

The unnecessary upcast to (int) will be cleaned later.

BUG=webm:1450

Change-Id: Ia234575206d5a74540526924b06ed3939322d063

6 years agoMerge "Bug fix: fadst4() in vp9/encoder/vp9_dct.c"
James Zern [Wed, 20 Sep 2017 21:12:45 +0000 (21:12 +0000)]
Merge "Bug fix: fadst4() in vp9/encoder/vp9_dct.c"

6 years agoBug fix: fadst4() in vp9/encoder/vp9_dct.c
Linfeng Zhang [Wed, 20 Sep 2017 16:18:04 +0000 (09:18 -0700)]
Bug fix: fadst4() in vp9/encoder/vp9_dct.c

A new bug was introduced in a80bdfd "Change sinpi_{1,2,3,4}_9 from
tran_high_t to int16_t". Reverted the change in this file.

BUG=webm:1450

Failed test C/TransHT.AccuracyCheck/26.

Change-Id: Id001f57aad811803ef7d367d2b2bc008d8499991

6 years agoMerge "vp9: Modify simple_block_yrd condition for SVC"
Marco Paniconi [Wed, 20 Sep 2017 16:42:31 +0000 (16:42 +0000)]
Merge "vp9: Modify simple_block_yrd condition for SVC"

6 years agoMerge "vpxdsp: [x86] add highbd_d63_predictor functions"
Scott LaVarnway [Wed, 20 Sep 2017 11:39:28 +0000 (11:39 +0000)]
Merge "vpxdsp: [x86] add highbd_d63_predictor functions"

6 years agotemporal_filter_apply_sse2.asm: add ':' to label
James Zern [Wed, 20 Sep 2017 01:59:11 +0000 (18:59 -0700)]
temporal_filter_apply_sse2.asm: add ':' to label

quiets nasm warning:
label alone on a line without a colon might be in error

BUG=webm:1462

Change-Id: I660407ca60e8c9a810dba9d76afb65852029a29c

6 years agocosmetics: NEON scaling code
Linfeng Zhang [Tue, 19 Sep 2017 23:39:17 +0000 (16:39 -0700)]
cosmetics: NEON scaling code

Change-Id: Ib91054622c1f09c4ca523bc6837d7d8ab9f03618

6 years agoRefactor convolve NEON code
Linfeng Zhang [Tue, 19 Sep 2017 23:14:56 +0000 (16:14 -0700)]
Refactor convolve NEON code

Rename a couple of hbd static functions.
Move the position of NEON function convolve8_4().

Change-Id: Idfac00edf2e99cdd8e0a73b9f895402f60be6349

6 years agoRefactor convolve code
Linfeng Zhang [Tue, 19 Sep 2017 23:23:14 +0000 (16:23 -0700)]
Refactor convolve code

Extract a couple of static functions into their caller functions.

Change-Id: If8d8a0e217fba6b402d2a79ede13b5b444ff08a0

6 years agovpxdsp: [x86] add highbd_d63_predictor functions
Scott LaVarnway [Wed, 13 Sep 2017 01:01:31 +0000 (18:01 -0700)]
vpxdsp: [x86] add highbd_d63_predictor functions

C vs SSE2 speed gains:
_4x4 : ~2.94x

C vs SSSE3 speed gains:
_8x8 : ~8.69x
_16x16 : ~6.32x
_32x32 : ~5.33x

BUG=webm:1411

Change-Id: I2c35b527eac2229f17aaa9d118fb601e7195efe4

6 years agovp9: Modify simple_block_yrd condition for SVC
Marco [Tue, 19 Sep 2017 22:19:41 +0000 (15:19 -0700)]
vp9: Modify simple_block_yrd condition for SVC

Modify simple_block_yrd condition in nonrd_pickmode for SVC:
allow it to be used also on base temporal_layer, only when
spatial_layer > 1 and block size < 32x32.

Speedup of ~2% for 3 layer SVC, with little/negligible
loss in quality.

Change-Id: I7734bdae51cf51f22b96f6b2b27da20ea1d84344

6 years agoMerge "Add datarate test for frame_parallel_decoding mode off."
Marco Paniconi [Tue, 19 Sep 2017 22:31:08 +0000 (22:31 +0000)]
Merge "Add datarate test for frame_parallel_decoding mode off."

6 years agovp9: Fix condition for limiting ARF 1 pass vbr.
Marco [Tue, 19 Sep 2017 18:00:40 +0000 (11:00 -0700)]
vp9: Fix condition for limiting ARF 1 pass vbr.

Fix the setting to frames_till_gf_update_due, and
adjust the limit value.
Only affects when USE_ALTREF_FOR_ONE_PASS is enabled.

Neutral change to metrics and speed for ytlive.

Change-Id: I266d9a00b36221bc8602fa2746d4e8a8f7d4dfae

6 years agoMerge "vp9: Adjustments for ARF usage in 1 pass vbr."
Marco Paniconi [Tue, 19 Sep 2017 16:29:19 +0000 (16:29 +0000)]
Merge "vp9: Adjustments for ARF usage in 1 pass vbr."

6 years agovp9: Adjustments for ARF usage in 1 pass vbr.
Marco [Tue, 19 Sep 2017 00:30:49 +0000 (17:30 -0700)]
vp9: Adjustments for ARF usage in 1 pass vbr.

Only when USE_ALTREF_FOR_ONE_PASS is enabled (off by default).
Force fixed partition to 64x64 when is_src_frame_alt_ref is true,
and don't force early exit for some modes in nonrd_pickmode
for ARF noshow frames.

Small gain ~0.2% on ytlive metrics for speed 6.
Neutral speed difference.

Change-Id: I27eb6622d0453c09a06ccdc3b16368762474d11d

6 years agoChange sinpi_{1,2,3,4}_9 from tran_high_t to int16_t
Linfeng Zhang [Tue, 12 Sep 2017 22:24:54 +0000 (15:24 -0700)]
Change sinpi_{1,2,3,4}_9 from tran_high_t to int16_t

Add "typedef int16_t tran_coef_t;"

BUG=webm:1450

Change-Id: I67866f104898d1dda8989e1abdaf6983fe324154

6 years agoMerge "cosmetics: vp9_rtcd_defs.pl"
Linfeng Zhang [Mon, 18 Sep 2017 16:23:33 +0000 (16:23 +0000)]
Merge "cosmetics: vp9_rtcd_defs.pl"

6 years agoMerge "vp8: [loongson] optimize dequantize with mmi"
Shiyou Yin [Fri, 15 Sep 2017 23:53:40 +0000 (23:53 +0000)]
Merge "vp8: [loongson] optimize dequantize with mmi"

6 years agoAdd datarate test for frame_parallel_decoding mode off.
Marco [Fri, 15 Sep 2017 18:35:53 +0000 (11:35 -0700)]
Add datarate test for frame_parallel_decoding mode off.

Add datarate test, for both VBR and CBR mode, with the
frame_parallel_decoding mode disabled (and error_resilience off).

Change-Id: I54feec3248a68ecff4bef8d9a31bb1616fab77df

6 years agoMerge "Fix bug in intra mode rd penalty."
Paul Wilkins [Fri, 15 Sep 2017 15:43:29 +0000 (15:43 +0000)]
Merge "Fix bug in intra mode rd penalty."

6 years agoMerge "mips msa clean-up msa macros"
Kaustubh Raste [Fri, 15 Sep 2017 01:27:02 +0000 (01:27 +0000)]
Merge "mips msa clean-up msa macros"

6 years agoMerge "vp9_scale_test: add C config"
James Zern [Fri, 15 Sep 2017 00:27:58 +0000 (00:27 +0000)]
Merge "vp9_scale_test: add C config"

6 years agoMerge "Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()""
James Zern [Fri, 15 Sep 2017 00:27:41 +0000 (00:27 +0000)]
Merge "Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()""

6 years agoMerge "VP9 level targeting: add a new AUTO mode"
Hui Su [Thu, 14 Sep 2017 21:02:38 +0000 (21:02 +0000)]
Merge "VP9 level targeting: add a new AUTO mode"

6 years agovp9_scale_test: add C config
James Zern [Thu, 14 Sep 2017 20:08:04 +0000 (13:08 -0700)]
vp9_scale_test: add C config

Change-Id: I9dfe8255d1c096d246bf9719729f57dbae779ffc

6 years agoRevert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"
James Zern [Thu, 14 Sep 2017 20:06:40 +0000 (13:06 -0700)]
Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"

This reverts commit afee58f2c4159172f5340f2c7d3e8041cfa0eb91.

This causes ~8x slowdown in 4:3 in the C-code

Change-Id: I60a7ead12dc4ec1548b1b12cfe4b0be42ef04e0e

6 years agoVP9 level targeting: add a new AUTO mode
Hui Su [Thu, 10 Aug 2017 22:05:20 +0000 (15:05 -0700)]
VP9 level targeting: add a new AUTO mode

In the new AUTO mode, restrict the minimum alt-ref interval and max column
tiles adaptively based on picture size, while not applying any rate control
constraints.

This mode aims to produce encodings that fit into levels corresponding to
the source picture size, with minimal loss in compression quality. However, the
bitstream is not guaranteed to be level compatible, e.g., the average bitrate
may exceed the level limit.

BUG=b/64451920

Change-Id: I02080b169cbbef4ab2e08c0df4697ce894aad83c

6 years agovp8: [loongson] optimize dequantize with mmi
Shiyou Yin [Wed, 6 Sep 2017 03:30:25 +0000 (11:30 +0800)]
vp8: [loongson] optimize dequantize with mmi

1. vp8_dequantize_b_mmi
2. vp8_dequant_idct_add_mmi

Change-Id: I505f8afb7a444173392b325906e6a4f420f00709

6 years agovp8: [loongson] optimize idctllm with mmi
Shiyou Yin [Wed, 6 Sep 2017 00:51:21 +0000 (08:51 +0800)]
vp8: [loongson] optimize idctllm with mmi

1. vp8_short_idct4x4llm_mmi
2. vp8_short_inv_walsh4x4_mmi
3. vp8_dc_only_idct_add_mmi

Change-Id: I616923681e79d78607a4988608fc39df77b093f4

6 years agomips msa clean-up msa macros
Kaustubh Raste [Thu, 14 Sep 2017 06:59:19 +0000 (12:29 +0530)]
mips msa clean-up msa macros

Removed inline for GP load-store in case of (__mips_isa_rev >= 6)
Created one define LD_V for vector load and ST_V for vector store

Change-Id: Ifec3570fa18346e39791b0dd622892e5c18bd448