granicus.if.org Git - libvpx/log
Vignesh Venkatasubramanian [Mon, 13 Feb 2017 19:36:02 +0000 (11:36 -0800)]
vp9,realtime: Enable row multithreading for non-rd
Enable row level multithreading for realtime encodes where non-rd
path is used (speed >= 5).
Change-Id: I5439cb49a02171166d8e1de06c7d5e6f8e819a41
Yi Luo [Wed, 1 Mar 2017 00:38:41 +0000 (16:38 -0800)]
Improve idct32x32_34_add SSSE3 intrinsics performance
- Split the transform into first half and second half.
- Reschedule the instructions to avoid stack spillover.
- Function level speed improves ~16%.
Change-Id: I166889840d23aa8a273eca00f6fbdae8b4566f35
Chris Cunningham [Wed, 1 Mar 2017 18:01:13 +0000 (18:01 +0000)]
Merge "VPX_CODEC_CAP_HIGHBITDEPTH for decoder interface"
Chris Cunningham [Thu, 16 Feb 2017 23:02:30 +0000 (15:02 -0800)]
VPX_CODEC_CAP_HIGHBITDEPTH for decoder interface
Moves the def from vpx_encoder.h -> vpx_codec.h. The defined value
is changed as part of this move.
Adds the value to decoder capabilities when CONFIG_VP9_HIGHBITDEPTH.
Change-Id: I7d61fc821cda29f1e32bb9b2b9ffd3d83966e419
James Zern [Wed, 1 Mar 2017 00:17:49 +0000 (16:17 -0800)]
Revert "Fix for max qindex calculation of a gf interval"
This reverts commit d3db846cc50b1b0a9f6efcbe2b36c9c1943bc528.
This change causes a large drop in PSNR (4-5 dB) on low framerate
difficult content (tested at 360/480p).
BUG=b/35804225
Change-Id: I8e90012d3b9c8a0cddb062ba93b01b36c0e0c0a0
James Zern [Tue, 28 Feb 2017 23:13:11 +0000 (15:13 -0800)]
vp9_ethread_test,cosmetics: s/new-mt/row-mt/
Change-Id: I8c145337adf49d30b88a17ff31501b8751ed1fa0
James Zern [Fri, 24 Feb 2017 08:55:01 +0000 (00:55 -0800)]
stress.sh: add vp9_stress_test_row_mt
vp9_stress_test now forces --row-mt=0 to cover both versions
Change-Id: I8d134879435bf1d8e76ab3fd89e698efba0e86b2
James Zern [Fri, 24 Feb 2017 08:54:02 +0000 (00:54 -0800)]
stress.sh: parameterize thread count
Change-Id: Iae45266cea86585f0935af4012335198cf93719f
James Zern [Fri, 24 Feb 2017 08:30:08 +0000 (00:30 -0800)]
stress.sh: add one pass encodes
Change-Id: I38e6c988f17c56fbfacd95378b27ef8d77c75f90
Yunqing Wang [Tue, 28 Feb 2017 19:13:09 +0000 (11:13 -0800)]
Add a comment in encoder thread test
Added a comment.
Change-Id: I82f71c72598ad6f1eaa0b57b0b8ec56ab9658e81
Yunqing Wang [Tue, 28 Feb 2017 19:00:56 +0000 (11:00 -0800)]
Set row_mt to 0 by default
Set row_mt to 0 for now.
Change-Id: I922536a6d71a765e435daeaf4d932ef14363d19a
Marco [Mon, 27 Feb 2017 20:03:12 +0000 (12:03 -0800)]
vp9: Fix an issue with setting variance thresholds.
From commit:
https://chromium-review.googlesource.com/c/441393/
On non-segment the set_vbp_thresholds() should be called
again to adjust thresholds based on content_state of superblock.
This was the intended behavior from 441393.
Small change in RTC metrics and speed.
Change-Id: I45e5fbdc4af74db76b3cb4f13074fcae0eb2219e
Vignesh Venkatasubramanian [Mon, 27 Feb 2017 18:50:02 +0000 (10:50 -0800)]
vp9_ethread_test: Rename new_mt to row_mt
Rename leftover occurrences of new_mt.
Change-Id: Ib884e84c801fcd366ca4b57ec912ac5972023375
Vignesh Venkatasubramanian [Fri, 24 Feb 2017 19:40:22 +0000 (11:40 -0800)]
vp9: Rename new_mt to row_mt
new_mt is a very generic name that will become obsolete soon enough.
Since this is exposed as a codec control, renaming it to row_mt to
signify row-level parallelism. Also renaming the ETHREAD_BIT_MATCH
codec control to ROW_MT_BIT_EXACT.
Change-Id: Ic7872d78bb3b12fb4cf92ba028ec8e08eb3a9558
Yunqing Wang [Sat, 25 Feb 2017 02:31:21 +0000 (18:31 -0800)]
Remove an old leftover comment
Removed an old comment that wasn't true anymore.
Change-Id: I286ad8d7cb2843070a55e45a599d26bc226d6bd7
James Zern [Fri, 24 Feb 2017 23:36:52 +0000 (15:36 -0800)]
get_prob(): rationalize int types
promote the unsigned int calculation to uint64_t rather than int64_t for
type consistency
Change-Id: Ic34dee1dc707d9faf6a3ae250bfe39b60bef3438
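The type change described above can be illustrated with a minimal sketch. It assumes get_prob() computes a rounded 8-bit probability from two unsigned counts. The names prob_t and get_prob_sketch are hypothetical stand-ins, not the library's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the codec's 8-bit probability type. */
typedef uint8_t prob_t;

/* Returns round(256 * num / den), clamped to [1, 255]. */
static prob_t get_prob_sketch(unsigned int num, unsigned int den) {
  assert(den != 0);
  /* Widen to uint64_t before multiplying: num * 256 can exceed 32 bits,
   * and keeping every operand unsigned avoids mixing signed and unsigned
   * types in one expression. */
  const uint64_t p = ((uint64_t)num * 256 + (den >> 1)) / den;
  if (p > 255) return 255;
  if (p < 1) return 1;
  return (prob_t)p;
}
```

With a 32-bit intermediate, a count such as 0x80000000 would wrap to zero when multiplied by 256. The 64-bit intermediate keeps the full product.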
Yunqing Wang [Fri, 24 Feb 2017 23:26:22 +0000 (23:26 +0000)]
Merge "Improve VP9 encoder threading test for better coverage"
Yunqing Wang [Wed, 22 Feb 2017 20:24:16 +0000 (12:24 -0800)]
Improve VP9 encoder threading test for better coverage
Re-organized the encoder threading tests and grouped tests into
4 parts. Added PSNR checking test to make sure the PSNR variation
is within a small range.
BUG=webm:1376
Change-Id: I09edb990236a87a4d2b2b0e1ceaf6c6435a35eff
Jerome Jiang [Fri, 24 Feb 2017 16:56:33 +0000 (16:56 +0000)]
Merge "Make vp9_scale_and_extend_frame_ssse3 work for hbd when bitdepth = 8."
Johann [Fri, 17 Feb 2017 01:57:44 +0000 (17:57 -0800)]
consolidate block_error functions
vp9_highbd_block_error_8bit_c was a very simple wrapper around
vp9_block_error_c. The SSE2 implementation was practically identical to
the non-HBD one. It was missing some minor improvements which only
went into the original version.
In quick speed tests, the AVX implementation showed minimal
improvement over SSE2 when it does not detect overflow. However, when
overflow is detected the function is run a second time. The
OperationCheck test seems to trigger this case and reverses any
speed benefits by running ~60% slower. AVX2 on the other hand is
always 30-40% faster.
Change-Id: I9fcb9afbcb560f234c7ae1b13ddb69eca3988ba1
Johann Koenig [Fri, 24 Feb 2017 05:24:34 +0000 (05:24 +0000)]
Merge "block error sse2: use tran_low_t"
Jerome Jiang [Wed, 22 Feb 2017 22:24:02 +0000 (14:24 -0800)]
Make vp9_scale_and_extend_frame_ssse3 work for hbd when bitdepth = 8.
Only works for bitdepth = 8 when compiled with high bitdepth flag.
4x speed ups for handling 1:2 down/upsampling.
Validated manually for:
1) Dynamic resize for a single layer encoding
2) SVC encoding with 3 spatial layers
Results are bitexact with the patch and the speed gain (~4x) in the
scaling was verified.
BUG=webm:1371
Change-Id: I1bdb5f4d4bd0df67763fc271b6aa355e60f34712
Johann [Thu, 16 Feb 2017 20:44:49 +0000 (12:44 -0800)]
block error sse2: use tran_low_t
Change-Id: Ib04990e4a7bda9fbf501f294da2057a2b2595deb
Johann Koenig [Thu, 23 Feb 2017 07:41:20 +0000 (07:41 +0000)]
Merge "vp8_fdct4x4 test: fix segfault again"
Marco Paniconi [Thu, 23 Feb 2017 03:24:26 +0000 (03:24 +0000)]
Merge "vp9: 1pass CBR: modify condition for reducing loop filter."
Jerome Jiang [Wed, 22 Feb 2017 23:19:29 +0000 (23:19 +0000)]
Merge "vp9: Non-rd pickmode: use simple block_yrd under some conditions."
Marco [Wed, 22 Feb 2017 23:06:28 +0000 (15:06 -0800)]
vp9: 1pass CBR: modify condition for reducing loop filter.
The reduction showed improvement on RTC when aq-mode=3 is on.
Add that (cyclic refresh enabled) to the condition.
Only affects 1 pass CBR.
Change-Id: I5d0843002d8e31d7c165098a62e7a71146b08664
Marco [Fri, 17 Feb 2017 16:44:50 +0000 (08:44 -0800)]
vp9: Non-rd pickmode: use simple block_yrd under some conditions.
For speed 8 only.
3% speed up for QVGA and 6.3% for VGA on Nexus 6.
~3% avgPSNR decrease on rtc_derf and 2.9% on rtc.
Disabled for now.
Change-Id: I70133f1f6c804d663d594df437bfe7fdb0030d6a
Marco Paniconi [Wed, 22 Feb 2017 19:52:24 +0000 (19:52 +0000)]
Merge "vp9: aq-mode=3: On key frame reset cr->reduce_refresh to 0."
Marco [Wed, 22 Feb 2017 18:45:21 +0000 (10:45 -0800)]
vp9: aq-mode=3: On key frame reset cr->reduce_refresh to 0.
This prevents a possible reduction of cyclic refresh after a key frame.
Change-Id: Idd4e49b69cd95476e7eccfa31b2bd8669569e9e8
Johann [Tue, 21 Feb 2017 19:12:45 +0000 (11:12 -0800)]
vp8_fdct4x4 test: fix segfault again
The output needs to be aligned. Input is read with 'movq', not 'movdqa',
so it is not expected to be aligned.
Change-Id: Ibd48a84c1785917a6a97c3689a05322abba486b4
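A minimal sketch of the alignment requirement above, assuming a 16-byte boundary (what 'movdqa' requires) and C11 alignas. The buffer name and the helper is_aligned16 are hypothetical and are not taken from the test.

```c
#include <stdalign.h>
#include <stdint.h>

/* Output buffer for a 4x4 transform, forced onto a 16-byte boundary so that
 * aligned 128-bit stores into it are legal. */
static alignas(16) int16_t fdct_output[16];

/* Returns nonzero when p sits on a 16-byte boundary. */
static int is_aligned16(const void *p) {
  return ((uintptr_t)p & 15) == 0;
}
```

An input buffer read with 8-byte 'movq' loads needs no such guarantee, which is why only the output is constrained.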
Jerome Jiang [Wed, 22 Feb 2017 17:49:17 +0000 (09:49 -0800)]
vp9: Only compute y_sad for golden in variance partition for speed < 8.
Only affects speed 8. No obvious quality regression. Consistent speedup
of ~1% on Nexus 6.
Change-Id: Ia904ca28ea041c3281c532911ec38fb7d7f46a17
Yunqing Wang [Wed, 22 Feb 2017 16:55:03 +0000 (16:55 +0000)]
Merge "Refactored the row based multi-threading code"
Jerome Jiang [Wed, 22 Feb 2017 04:44:55 +0000 (04:44 +0000)]
Merge "Fix segmentation fault caused by denoiser working with spatial SVC."
Marco [Mon, 13 Feb 2017 18:16:42 +0000 (10:16 -0800)]
vp9: Incorporate source sum_diff into non-rd partition thresholds.
Increase the variance partition thresholds for superblocks that
have low sum-diff (from source analysis prior to encoding frame).
Use it for now only for speed >= 7 or for denoising on.
Small change on metrics for rtc set: less than ~0.1 avgPSNR decrease
on RTC set, for both speed 7 and 8.
Change-Id: I38325046ebd5f371f51d6e91233d68ff73561af1
Yi Luo [Tue, 21 Feb 2017 20:07:47 +0000 (12:07 -0800)]
Following SSSE3 intrinsics functions also work for HBD
- vpx_idct8x8_12_add_ssse3
  vpx_idct8x8_64_add_ssse3
  vpx_idct32x32_34_add_ssse3
  vpx_idct32x32_135_add_ssse3
  vpx_idct32x32_1024_add_ssse3
- turn on unit tests.
Change-Id: I788b2b3b2074a6f3ab6a0e6f469c1327a123eff7
Johann Koenig [Tue, 21 Feb 2017 18:16:38 +0000 (18:16 +0000)]
Merge "Drop zbin_ptr and quant_shift_ptr"
Jerome Jiang [Sat, 18 Feb 2017 01:56:08 +0000 (17:56 -0800)]
Fix segmentation fault caused by denoiser working with spatial SVC.
Re-enable the affected test.
BUG=webm:1374
Change-Id: I98cd49403927123546d1d0056660b98c9cb8babb
Yi Luo [Tue, 21 Feb 2017 16:36:05 +0000 (16:36 +0000)]
Merge "Fix idct8x8 SSSE3 SingleExtremeCoeff unit tests"
Paul Wilkins [Tue, 21 Feb 2017 09:42:37 +0000 (09:42 +0000)]
Merge "Change to prediction decay calculation."
Marco Paniconi [Tue, 21 Feb 2017 05:37:22 +0000 (05:37 +0000)]
Merge "vp9: Fix for non-rd pickmode for high-bitdepth build."
Marco [Tue, 21 Feb 2017 04:15:40 +0000 (20:15 -0800)]
vp9: Fix for non-rd pickmode for high-bitdepth build.
Use the simple block_yrd under certain conditions.
The optimization code is completed but the speed is still slower
(~6% on 720p) than the low-bitdepth build.
For now, use the more complex block_yrd under certain conditions
(always use it for speed <= 5, otherwise use it on key frames and for
bsize >= 32x32).
This gives a roughly 2-3% gain in quality for speed 7 on RTC set
(over high bitdepth build), with about the same encoder fps as the
low bitdepth build.
Change-Id: Ibe92a1945d0bd635f880befb4c815727df62d754
Ranjit Kumar Tulabandu [Thu, 16 Feb 2017 13:37:41 +0000 (19:07 +0530)]
Refactored the row based multi-threading code
Modified the code to facilitate bit-match tests in first pass
Added unit-tests to test the row based multi-threading behavior for bit-exactness
Change-Id: Ieaf6a8f935bb1075597e0a3b52d9989c8546d7df
James Zern [Sat, 18 Feb 2017 21:24:32 +0000 (13:24 -0800)]
vp8_fdct4x4_test: align input and output buffers
fixes segfault in 32-bit builds
Change-Id: I5b3cc5a335cb236a6ec4cb11fa8feb54ae0182c7
James Zern [Sat, 18 Feb 2017 00:23:22 +0000 (16:23 -0800)]
datarate_test: disable OnePassCbrSvc2SpatialLayersDenoiserOn
segfaults
BUG=webm:1374
Change-Id: I3790c6cb8a539d13dee6a8225ef09b1575dea26c
Johann Koenig [Fri, 17 Feb 2017 22:11:08 +0000 (22:11 +0000)]
Merge "vp8_short_fdct4x4: verify optimized functions"
Yi Luo [Fri, 17 Feb 2017 18:59:46 +0000 (10:59 -0800)]
Fix idct8x8 SSSE3 SingleExtremeCoeff unit tests
- In SSSE3 optimization, 16-bit addition and subtraction would
overflow when input coefficient is 16-bit signed extreme values.
- Function-level speed becomes slower (unit ms):
idct8x8_64: 284 -> 294
idct8x8_12: 145 -> 158.
BUG=webm:1332
Change-Id: I1e4bf9d30a6d4112b8cac5823729565bf145e40b
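The failure mode described above can be shown in scalar C. This is only an illustration of 16-bit wraparound on extreme inputs, not the fix applied in the commit. Both function names are hypothetical.

```c
#include <stdint.h>

/* Models a plain (non-saturating) 16-bit lane add: the result wraps
 * modulo 2^16, as a packed 16-bit SIMD add would. */
static int16_t add_wrap16(int16_t a, int16_t b) {
  int32_t s = (int32_t)a + (int32_t)b;
  if (s > INT16_MAX) s -= 65536;
  if (s < INT16_MIN) s += 65536;
  return (int16_t)s;
}

/* Same add carried out in 32 bits: the true sum is preserved. */
static int32_t add_wide32(int16_t a, int16_t b) {
  return (int32_t)a + (int32_t)b;
}
```

For two coefficients at the signed maximum, 32767 + 32767, the 16-bit add yields -2 while the widened add yields 65534. Doing the work in wider lanes costs extra instructions, which is consistent with the small slowdown reported above.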
James Zern [Fri, 17 Feb 2017 20:29:36 +0000 (20:29 +0000)]
Merge "Add vpx_highbd_idct16x16_10_add_neon()"
paulwilkins [Wed, 15 Feb 2017 16:41:38 +0000 (16:41 +0000)]
Change to prediction decay calculation.
This change subtracts out low complexity intra regions that are also low
error in the inter domain, in the calculation of the frame prediction decay.
The rationale here is that low complexity regions (such as sky) do not imply
high prediction decay in the same way as high error intra or neutral blocks.
The effect of this is small in most clips but in a few clips it can be > 10%.
(E.g. In to tree)
Change-Id: If67ac23d17fca14285cad2defa464c61c9ea861c
Johann [Fri, 23 Sep 2016 23:45:03 +0000 (16:45 -0700)]
vp8_short_fdct4x4: verify optimized functions
Change-Id: I7c7f5dfabde65c09f111fb0ced0e3ad231ee716e
Johann [Tue, 31 Jan 2017 23:58:43 +0000 (15:58 -0800)]
tiny_ssim: clean up on failure
Clears up clang static analysis warnings about memory leaks.
Change-Id: I60d4d0f3794735a8b81d9da4a30d19e7a9cba9cf
Yi Luo [Thu, 16 Feb 2017 21:15:22 +0000 (13:15 -0800)]
Replace idct32x32_1024_add_ssse3 assembly with intrinsics
- Encoding/decoding test, BQTerrace_1920x1080_60.y4m, on
i7-6700: no obvious user-level performance degradation.
- Passed unit tests.
Change-Id: I20688e0dd3731021ec8fb4404734336f1a426bfc
James Zern [Fri, 17 Feb 2017 00:04:42 +0000 (00:04 +0000)]
Merge "cosmetics: Fix spelling mistake in compile flag name."
Johann Koenig [Thu, 16 Feb 2017 23:51:14 +0000 (23:51 +0000)]
Merge "block error avx2: use tran_low_t"
Linfeng Zhang [Tue, 14 Feb 2017 18:24:51 +0000 (10:24 -0800)]
Add vpx_highbd_idct16x16_10_add_neon()
BUG=webm:1301
Change-Id: If686c8144764c4162458f0bc4bb1bbf6555c48ab
James Zern [Thu, 16 Feb 2017 23:02:10 +0000 (23:02 +0000)]
Merge "Fix mips vpx_post_proc_down_and_across_mb_row_msa function"
James Zern [Thu, 16 Feb 2017 22:56:02 +0000 (22:56 +0000)]
Merge "disable VP9MultiThreadedFrameParallel tests"
paulwilkins [Thu, 16 Feb 2017 12:36:56 +0000 (12:36 +0000)]
cosmetics: Fix spelling mistake in compile flag name.
agressive -> aggressive
after:
ce7b38459 Aggressive VBR method.
Change-Id: Ie0f30b1bbc77ed9f32bec047b4a9b3d0cf4853f5
Johann Koenig [Thu, 16 Feb 2017 21:41:27 +0000 (21:41 +0000)]
Merge "correct bitdepth_conversion_sse2.h header guard"
Johann [Tue, 14 Feb 2017 00:29:49 +0000 (16:29 -0800)]
Drop zbin_ptr and quant_shift_ptr
vp9[_highbd]_quantize_fp[_32x32] and vp9_fdct8x8_quant do not make use
of these parameters.
scan is used for C code and iscan is used for SIMD implementations.
Change-Id: I908a0ff7d3febac33da97e0596e040ec7bc18ca5
James Zern [Thu, 16 Feb 2017 20:56:04 +0000 (12:56 -0800)]
disable VP9MultiThreadedFrameParallel tests
these are flaky and cause TSan warnings with clang-3.9.1
BUG=webm:1372
Change-Id: I8a7047552ba2ccd2d8c45f8795818c74562e5990
Johann [Thu, 16 Feb 2017 20:43:33 +0000 (12:43 -0800)]
correct bitdepth_conversion_sse2.h header guard
Change-Id: Ic4ffd861608e67fe59bcb3a86010ce3ef11a5519
Yi Luo [Thu, 16 Feb 2017 20:43:28 +0000 (20:43 +0000)]
Merge "Add idct32x32_135_add SSSE3 intrinsics"
Johann [Thu, 16 Feb 2017 19:12:31 +0000 (11:12 -0800)]
block error avx2: use tran_low_t
Change-Id: Ic5f3a1f569d6f82afeaf4fcd7235374bb460db3c
Johann Koenig [Thu, 16 Feb 2017 20:34:48 +0000 (20:34 +0000)]
Merge changes I267050a5,Iebade0ef,Id96a8df3
* changes:
quantize_fp_32x32 highbd ssse3: enable existing function
quantize_fp highbd ssse3: use tran_low_t for coeff
quantize_fp highbd sse2: use tran_low_t for coeff
Yi Luo [Wed, 15 Feb 2017 01:09:59 +0000 (17:09 -0800)]
Add idct32x32_135_add SSSE3 intrinsics
- Replace the corresponding assembly code.
- No user-level performance degradation.
- Unit tests passed.
Change-Id: Idd0c5a4bad4976f1617c34100cb46e75e3b961e5
Yunqing Wang [Thu, 16 Feb 2017 16:22:54 +0000 (16:22 +0000)]
Merge "Structured the mode ordering code to avoid redundant memcpy"
Johann [Thu, 16 Feb 2017 15:29:32 +0000 (07:29 -0800)]
quantize_fp_32x32 highbd ssse3: enable existing function
This was created as part of the quantize_fp_ssse3 change. Both
functions use the same source file with different macro parameters.
Change-Id: I267050a559426a85955d215aa0aaca270439c5ab
Johann [Thu, 16 Feb 2017 03:01:38 +0000 (19:01 -0800)]
quantize_fp highbd ssse3: use tran_low_t for coeff
Change-Id: Iebade0efc0efbb0a80a0f3adbef4962e3a2f25e8
Johann [Fri, 3 Feb 2017 23:57:28 +0000 (15:57 -0800)]
quantize_fp highbd sse2: use tran_low_t for coeff
Change-Id: Id96a8df33354a7987ce890a3d6798c7375ffa4aa
Johann [Thu, 16 Feb 2017 01:17:45 +0000 (17:17 -0800)]
bitdepth conversion: really use num elements
The previous implementation confused bit/bytes/elements. It was using
'32' as the multiplier but that was mistakenly adopted because a 32x32
transform embedded the stride.
Change-Id: Ieeb867a332416b9a40580b5e7c9b20088e9e691a
Ranjit Kumar Tulabandu [Thu, 16 Feb 2017 14:07:39 +0000 (19:37 +0530)]
Structured the mode ordering code to avoid redundant memcpy
Change-Id: I4f5d6b54018bd1928cd9e5e42619e6f55b334803
Paul Wilkins [Thu, 16 Feb 2017 10:02:09 +0000 (10:02 +0000)]
Merge "Disconnect ARF breakout from frame boost."
Paul Wilkins [Thu, 16 Feb 2017 10:01:57 +0000 (10:01 +0000)]
Merge "Remove unnecessary factor."
Paul Wilkins [Thu, 16 Feb 2017 10:01:45 +0000 (10:01 +0000)]
Merge "Bug in scale_sse_threshold()"
Paul Wilkins [Thu, 16 Feb 2017 09:39:29 +0000 (09:39 +0000)]
Merge "Additional first pass stats."
Kaustubh Raste [Thu, 16 Feb 2017 06:42:24 +0000 (12:12 +0530)]
Fix mips vpx_post_proc_down_and_across_mb_row_msa function
Added fix to handle non-multiple of 16 cols case for size 16
Change-Id: If3a6d772d112077c5e0a9be9e612e1148f04338c
Johann Koenig [Thu, 16 Feb 2017 02:40:59 +0000 (02:40 +0000)]
Merge "Use 'packssdw' for loading tran_low_t values"
Johann Koenig [Thu, 16 Feb 2017 01:00:44 +0000 (01:00 +0000)]
Merge "vp8_dx_iface: remove unused 'else' condition"
James Zern [Thu, 16 Feb 2017 00:21:19 +0000 (00:21 +0000)]
Merge "vpx_temporal_svc_encoder.sh: remove FUNCNAME bashism"
Marco Paniconi [Wed, 15 Feb 2017 23:03:27 +0000 (23:03 +0000)]
Merge "vp9: Some code cleanup for aq-mode = 3."
Marco [Wed, 15 Feb 2017 21:51:14 +0000 (13:51 -0800)]
vp9: Some code cleanup for aq-mode = 3.
The weight segment needs to only be computed once per frame,
so remove it from the function vp9_cyclic_refresh_rc_bits_per_mb(),
which is called within a loop inside vp9_rc_regulate_q.
Change-Id: Ia0e18b89abb97e42c466d4dbc47700d7f76555db
Jerome Jiang [Wed, 15 Feb 2017 03:09:15 +0000 (19:09 -0800)]
vpx_temporal_svc_encoder: Expose error resilient control to cmd line.
Change-Id: Ic74a8690b136ffbc370080f70b2d5a6b1572bf63
Linfeng Zhang [Wed, 15 Feb 2017 20:18:23 +0000 (20:18 +0000)]
Merge "cosmetics,dsp/inv_txfm.c: reorder functions"
Marco Paniconi [Wed, 15 Feb 2017 20:07:19 +0000 (20:07 +0000)]
Merge "vp9. Use same source_sad threshold for all speeds."
Linfeng Zhang [Wed, 15 Feb 2017 00:27:30 +0000 (16:27 -0800)]
cosmetics,dsp/inv_txfm.c: reorder functions
Change-Id: Ie0f7689ebe230c68eadb22a32b14838c1a7543a6
Linfeng Zhang [Wed, 15 Feb 2017 19:34:18 +0000 (19:34 +0000)]
Merge "Add vpx_highbd_idct16x16_38_add_neon()"
Marco [Wed, 15 Feb 2017 19:26:29 +0000 (11:26 -0800)]
vp9. Use same source_sad threshold for all speeds.
Only affects real-time mode.
Change-Id: Iba836f110c4da936f5173cc0f54424d5b6121bff
Marco [Wed, 15 Feb 2017 17:18:34 +0000 (09:18 -0800)]
Vp9: Speed 8 aq-mode=3: Reduce computation in estimating bits per mb.
vp9_compute_qdelta_by_rate has almost 2% overhead in profiling on Nexus 6.
Reduce the calling of that function in speed 8 by estimating the delta-q.
Both rtc and rtc_derf show little/no change in avg psnr/ssim.
Encoding speed is 2~3% faster on Nexus 6.
Change-Id: If25933715783f31104a18a5092ea347b1221b5f5
Linfeng Zhang [Wed, 8 Feb 2017 00:58:12 +0000 (16:58 -0800)]
Add vpx_highbd_idct16x16_38_add_neon()
BUG=webm:1301
Change-Id: Ic6cd8c1e63e1b7a997cbed221e20fff4c599e0fe
Linfeng Zhang [Wed, 15 Feb 2017 17:06:16 +0000 (17:06 +0000)]
Merge "Add vpx_highbd_idct16x16_38_add_c()"
paulwilkins [Wed, 15 Feb 2017 10:33:10 +0000 (10:33 +0000)]
Disconnect ARF breakout from frame boost.
This small change replaces the frame boost check in the arf group
length break out clause with a test against a prediction decay value.
The boost value is in fact partly dependent on the decay value but
this change means that the per frame boost calculation can be adjusted
without influencing the group length calculation.
The value chosen gives a close match on all the test sets with the previous
code (on average) but it was noted that a lower threshold was slightly better
for 1080P and up and a slightly higher value for small image sizes.
Change-Id: I4d5b9f67d5b17b0d99ea3f796d3d6202fd61ee0c
paulwilkins [Tue, 14 Feb 2017 10:35:12 +0000 (10:35 +0000)]
Remove unnecessary factor.
Removed unnecessary scaling factor to simplify.
Change-Id: I3fc9c5975a2597e72f1324e09dd586dea1facfa7
paulwilkins [Thu, 9 Feb 2017 16:30:38 +0000 (16:30 +0000)]
Bug in scale_sse_threshold()
The function scale_sse_threshold() returns a threshold scaled
if necessary for use with 10 and 12 bit from an 8 bit baseline.
SSE error values would be expected to rise for the 10 and 12
bit cases where there are more bits of precision.
Hence the threshold used for the test should also be scaled up.
Change-Id: I4009c98b6eecd1bf64c3c38aaa56598e0136b03d
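A minimal sketch of scaling an 8-bit SSE threshold up for deeper bit depths. The factor used here is an assumption, not taken from the commit: each extra bit doubles the sample range, so a squared error grows by 4x per bit. The function name and signature are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Scales a threshold tuned for 8-bit content to 10- or 12-bit content. */
static int64_t scale_sse_threshold_sketch(int bit_depth,
                                          int64_t threshold_8bit) {
  assert(bit_depth == 8 || bit_depth == 10 || bit_depth == 12);
  /* 2 * (bit_depth - 8) extra bits of squared error: x1, x16, x256. */
  return threshold_8bit * ((int64_t)1 << (2 * (bit_depth - 8)));
}
```

Under this assumption an 8-bit threshold of 100 becomes 1600 at 10 bits and 25600 at 12 bits. Scaling in the opposite direction would make the test stricter as precision increases.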
paulwilkins [Mon, 12 Dec 2016 14:05:19 +0000 (14:05 +0000)]
Additional first pass stats.
Added counts that split the intra coded blocks into low and high variance.
Change-Id: Ic540144b34d5141659081bb22f7ee16fd6861f14
Paul Wilkins [Wed, 15 Feb 2017 10:37:02 +0000 (10:37 +0000)]
Merge "Aggressive VBR method."
James Zern [Wed, 15 Feb 2017 07:44:00 +0000 (23:44 -0800)]
vpx_temporal_svc_encoder.sh: remove FUNCNAME bashism
replace with an explicit output file prefix that matches the function
name
Change-Id: I7f6a4105adb34327b1099a5fbf132aa8d1ad5b90
Johann Koenig [Wed, 15 Feb 2017 01:33:00 +0000 (01:33 +0000)]
Merge "vp9 fdct highbd neon: connect existing highbd calls"
Linfeng Zhang [Tue, 14 Feb 2017 23:39:37 +0000 (15:39 -0800)]
Add vpx_highbd_idct16x16_38_add_c()
When eob is less than or equal to 38 for high-bitdepth 16x16 idct,
call this function.
BUG=webm:1301
Change-Id: I09167f89d29c401f9c36710b0fd2d02644052060
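The eob-based selection above can be sketched as a small dispatcher. The enum and function names are hypothetical. The tier at eob <= 10 is an assumption inferred from the _10 variant named elsewhere in this log. Only the eob <= 38 rule comes from this commit message.

```c
#include <assert.h>

/* Hypothetical labels for the 16x16 inverse transform variants. */
typedef enum {
  IDCT16X16_PARTIAL_10, /* assumed: few low-frequency coefficients */
  IDCT16X16_PARTIAL_38, /* eob <= 38, per the commit message */
  IDCT16X16_FULL_256    /* all 256 coefficients may be nonzero */
} idct16x16_variant;

/* Picks the cheapest variant that still covers every nonzero coefficient.
 * eob is the end-of-block position: coefficients at or past it are zero. */
static idct16x16_variant pick_idct16x16(int eob) {
  assert(eob >= 1 && eob <= 256);
  if (eob <= 10) return IDCT16X16_PARTIAL_10;
  if (eob <= 38) return IDCT16X16_PARTIAL_38;
  return IDCT16X16_FULL_256;
}
```

The partial variants can skip work on coefficients known to be zero, so choosing by eob trades a cheap comparison for a shorter transform.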
Yunqing Wang [Wed, 15 Feb 2017 00:54:10 +0000 (00:54 +0000)]
Merge "Row based multi-threading of encoding stage"