granicus.if.org Git - libvpx/log
Parag Salasakar [Mon, 3 Aug 2015 07:30:55 +0000 (13:00 +0530)]
mips msa vpx subpel variance optimization
Removed redundant clip/saturate code from 2tap filter functions
average improvement 20%-40%
Change-Id: I362540b0c7d5d3d69932c39d61b7d2a44da533d2
Jingning Han [Fri, 31 Jul 2015 19:57:52 +0000 (12:57 -0700)]
Add _dspr2 to local function names
It avoids symbol conflicts between function names of various
implementation versions.
Change-Id: Iad79ebcb8e289457801812a7745c8380b5b06a46
Jingning Han [Mon, 3 Aug 2015 03:18:39 +0000 (03:18 +0000)]
Merge "Factor out mips/msa inverse transform implementations"
Jingning Han [Sun, 2 Aug 2015 21:56:09 +0000 (21:56 +0000)]
Merge "Add x86inc flag guard to inv_txfm_sse2.asm"
Jingning Han [Sun, 2 Aug 2015 15:43:13 +0000 (08:43 -0700)]
Add x86inc flag guard to inv_txfm_sse2.asm
Fix the VS build failure.
Change-Id: I4fb9d1c83980c4b52d5a848a9cb02ec72493dccb
James Zern [Sat, 1 Aug 2015 18:45:49 +0000 (11:45 -0700)]
vpx_convolve_copy_sse2: fix win64
xmm6-7 need to be stored
Change-Id: I6c51559598d335946ec91be6246b49589c63b724
Jingning Han [Fri, 31 Jul 2015 18:15:55 +0000 (11:15 -0700)]
Factor out mips/msa inverse transform implementations
Move mips/msa inverse transform implementations from vp9 folder to
vpx_dsp.
Change-Id: Ic4cf3f05247c3c63db7b532a0e5000017a962391
Jingning Han [Sat, 1 Aug 2015 16:20:43 +0000 (16:20 +0000)]
Merge "Use precise header files in inverse transform msa implementations"
Jingning Han [Sat, 1 Aug 2015 16:20:24 +0000 (16:20 +0000)]
Merge "Factor inverse transform functions into vpx_dsp"
Parag Salasakar [Sat, 1 Aug 2015 02:12:20 +0000 (02:12 +0000)]
Merge "mips msa vp8 temporal filter optimization"
Jingning Han [Sat, 1 Aug 2015 01:01:37 +0000 (01:01 +0000)]
Merge "Add dynamic range notes to vp9_vector_var_c"
Aℓex Converse [Fri, 31 Jul 2015 23:51:01 +0000 (23:51 +0000)]
Merge "Turn off simple_model_rd_from_var at speed 4."
Jingning Han [Fri, 31 Jul 2015 23:41:51 +0000 (16:41 -0700)]
Add dynamic range notes to vp9_vector_var_c
Change-Id: If536ad31046ecd9e2ecd9c21f52f8192c8153ad7
Jingning Han [Fri, 31 Jul 2015 17:53:25 +0000 (10:53 -0700)]
Use precise header files in inverse transform msa implementations
Change-Id: Ie8a79d9e2837842c3f60776b661cd42782b108d5
James Zern [Fri, 31 Jul 2015 23:22:34 +0000 (23:22 +0000)]
Merge "VP9_COPY_CONVOLVE_SSE2 optimization"
Jingning Han [Fri, 31 Jul 2015 01:53:18 +0000 (18:53 -0700)]
Factor inverse transform functions into vpx_dsp
This commit moves the module inverse transform functions from vp9
to vpx_dsp folder. The hybrid transform wrapper functions stay in
the vp9 folder, since it involves codec-specific data structures.
Change-Id: Ib066367c953d3d024c73ba65157bbd70a95c9ef8
Alex Converse [Fri, 31 Jul 2015 22:50:17 +0000 (15:50 -0700)]
Turn off simple_model_rd_from_var at speed 4.
This got erroneously changed during the refactor. This fixes
SvcTest.TwoPassEncode2TemporalLayersWithMultipleFrameContextsAndTiles.
Change-Id: Ifa5ab0e098396c5e2d10478db87df256eadfa4c7
James Zern [Fri, 31 Jul 2015 22:22:48 +0000 (22:22 +0000)]
Merge changes Iecdbbc34,I8b4db93f
* changes:
Android.mk: fix *_rtcd.h deps for armeabi-v7a
Android.mk: add a dep on vpx_config.asm for x86_64
Scott LaVarnway [Thu, 30 Jul 2015 12:02:04 +0000 (05:02 -0700)]
VP9_COPY_CONVOLVE_SSE2 optimization
This function suffers from a couple of problems on small cores (tablets):
- The load of the next iteration is blocked by the store of the previous iteration
- 4k aliasing (between future stores and older loads)
- Current small-core machines are in-order, so the store will spin the rehabQ until the load is finished
Fixed by:
- prefetching 2 lines ahead
- unrolling the copy to 2 rows of the block
- pre-loading all xmm registers before the loop, with the final stores after the loop
The function is sped up by:
copy_convolve_sse2 64x64 - 16%
copy_convolve_sse2 32x32 - 52%
copy_convolve_sse2 16x16 - 6%
copy_convolve_sse2 8x8 - 2.5%
copy_convolve_sse2 4x4 - 2.7%
Credit goes to Tom Craver (tom.r.craver@intel.com) and Ilya Albrekht (ilya.albrekht@intel.com)
Change-Id: I63d3428799c50b2bf7b5677c8268bacb9fc29671
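The fixes listed in the commit above (prefetch two lines ahead, copy two rows per iteration, issue loads before stores) can be sketched in C intrinsics. This is a minimal illustration, not the assembly from the commit: the function name copy_block_2rows and its parameters are hypothetical, and it falls back to memcpy when SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper, not the libvpx routine: copies a w x h block of
 * bytes two rows at a time. w must be a multiple of 16, h must be even. */
void copy_block_2rows(const uint8_t *src, ptrdiff_t src_stride,
                      uint8_t *dst, ptrdiff_t dst_stride, int w, int h) {
  assert(w > 0 && w % 16 == 0);
  assert(h > 0 && h % 2 == 0);
  for (int y = 0; y < h; y += 2) {
#if defined(__SSE2__)
    if (y + 2 < h) {
      /* Prefetch the two source rows that the next iteration will read. */
      _mm_prefetch((const char *)(src + 2 * src_stride), _MM_HINT_T0);
      _mm_prefetch((const char *)(src + 3 * src_stride), _MM_HINT_T0);
    }
    for (int x = 0; x < w; x += 16) {
      /* Issue both loads before either store so that a load is not
       * queued behind the store of the previous row. */
      const __m128i r0 = _mm_loadu_si128((const __m128i *)(src + x));
      const __m128i r1 =
          _mm_loadu_si128((const __m128i *)(src + src_stride + x));
      _mm_storeu_si128((__m128i *)(dst + x), r0);
      _mm_storeu_si128((__m128i *)(dst + dst_stride + x), r1);
    }
#else
    /* Portable fallback: same two-rows-per-iteration shape. */
    memcpy(dst, src, (size_t)w);
    memcpy(dst + dst_stride, src + src_stride, (size_t)w);
#endif
    src += 2 * src_stride;
    dst += 2 * dst_stride;
  }
}
```

Whether this shape reproduces the percentages quoted in the commit depends on the target core; the sketch only shows the ordering of loads, stores and prefetches.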
Jingning Han [Fri, 31 Jul 2015 21:29:50 +0000 (21:29 +0000)]
Merge "Fix compiler warning in mips/dspr2"
Aℓex Converse [Fri, 31 Jul 2015 21:19:11 +0000 (21:19 +0000)]
Merge "Compute skippable inside the block_rd_txfm loop."
Jingning Han [Fri, 31 Jul 2015 19:33:35 +0000 (12:33 -0700)]
Fix compiler warning in mips/dspr2
This commit fixes the mixed declaration and definition warning when
mips/dspr2 is turned on.
Change-Id: I633d6fe42368b9ac35b106786ebac6969ad53552
Aℓex Converse [Fri, 31 Jul 2015 19:05:54 +0000 (19:05 +0000)]
Merge changes Ic1ce346a,Ic0b4e92c
* changes:
Simplify model_rd_for_sb HBD ifdefs
Simplify dist_block HBD ifdefs
Alex Converse [Fri, 31 Jul 2015 00:39:23 +0000 (17:39 -0700)]
Compute skippable inside the block_rd_txfm loop.
Change-Id: Iaa43aeeb7a2074495e00cdb83bb551c3f13d3ed2
Zoe Liu [Fri, 31 Jul 2015 18:23:19 +0000 (18:23 +0000)]
Merge "Refactor mips/dspr2 on convolution."
Zoe Liu [Fri, 31 Jul 2015 18:20:14 +0000 (18:20 +0000)]
Merge "Code refactor on InterpKernel"
Alex Converse [Fri, 31 Jul 2015 17:56:11 +0000 (10:56 -0700)]
Simplify model_rd_for_sb HBD ifdefs
Change-Id: Ic1ce346a053800ae3b2d77178f46e6a388357f6d
Alex Converse [Fri, 31 Jul 2015 00:52:55 +0000 (17:52 -0700)]
Simplify dist_block HBD ifdefs
Change-Id: Ic0b4e92cbaf813bcca8a8e9052c936c2e025e114
Aℓex Converse [Fri, 31 Jul 2015 17:59:22 +0000 (17:59 +0000)]
Merge "Short circuit rate_block in block_rd_txfm."
Zoe Liu [Tue, 28 Jul 2015 17:52:24 +0000 (10:52 -0700)]
Refactor mips/dspr2 on convolution.
Change-Id: If59a39d5a92c261537342726f94bb7f7f26dfff3
Zoe Liu [Wed, 22 Jul 2015 17:40:42 +0000 (10:40 -0700)]
Code refactor on InterpKernel
In essence, this refactors the code for both the interpolation
filtering and the convolution. The change moves all the files
and renames the code from the vp9_ prefix to the vpx_ prefix
accordingly, for the following architectures:
(1) x86;
(2) arm/neon; and
(3) mips/msa.
The work on mips/dspr2 will be done in a separate change list.
Change-Id: Ic3ce7fb7f81210db7628b373c73553db68793c46
Alex Converse [Thu, 30 Jul 2015 18:52:28 +0000 (11:52 -0700)]
Give skip_txfm constants names.
This is using a define instead of an enum to keep byte packing.
Change-Id: I3abb07c8bfe377e19be4531b624af7b7b4207792
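One reading of "keep byte packing" in the commit above: the flags live in one-byte array slots, and plain #define constants let them stay there without an int-sized enum type entering the picture. A minimal sketch follows; the constant names, their values, MAX_TX_BLOCKS, and the struct are all assumptions made for illustration and are not taken from the log.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed names and values; the log does not list them. */
#define SKIP_TXFM_NONE 0
#define SKIP_TXFM_AC_DC 1
#define SKIP_TXFM_AC_ONLY 2

#define MAX_TX_BLOCKS 4 /* hypothetical array length */

/* One byte per transform block; an array of an enum type would
 * typically take sizeof(int) bytes per entry instead. */
struct plane_skip_flags {
  uint8_t skip_txfm[MAX_TX_BLOCKS];
};

void set_skip_txfm(struct plane_skip_flags *p, int block, int flag) {
  assert(block >= 0 && block < MAX_TX_BLOCKS);
  assert(flag >= SKIP_TXFM_NONE && flag <= SKIP_TXFM_AC_ONLY);
  p->skip_txfm[block] = (uint8_t)flag;
}

/* Returns nonzero when both AC and DC coefficients are skipped. */
int skips_all_coeffs(const struct plane_skip_flags *p, int block) {
  return p->skip_txfm[block] == SKIP_TXFM_AC_DC;
}
```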
Alex Converse [Thu, 30 Jul 2015 22:33:47 +0000 (15:33 -0700)]
Short circuit rate_block in block_rd_txfm.
Don't run rate_block (cost_coeffs) if distortion alone is enough to
surpass best_rd.
This decreases 2nd pass runtime on HD at speed 2 by about 2%. There is
zero effect on output if tx_cache is removed.
Change-Id: Ia3b1cc77bfbe6ee988c395fde06c0eb92940b784
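The short circuit described in the commit above can be sketched as follows, assuming a simplified cost model rd = dist + lambda * rate with non-negative rate and lambda (the encoder's real RD cost formula is not shown in the log). The names block_rd_short_circuit, rate_fn and counting_rate are hypothetical; counting_rate stands in for the expensive rate_block (cost_coeffs) call.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in type for an expensive rate estimate such as cost_coeffs. */
typedef int (*rate_fn)(void *ctx);

/* Demo callback: counts how often it is invoked, returns a fixed rate. */
int counting_rate(void *ctx) {
  ++*(int *)ctx;
  return 10;
}

/* Simplified model (an assumption): rd = dist + lambda * rate.
 * Since rate and lambda are non-negative, rd >= dist, so when dist
 * already exceeds best_rd the rate estimate cannot change the outcome
 * and is skipped. */
int64_t block_rd_short_circuit(int64_t dist, int64_t lambda, int64_t best_rd,
                               rate_fn rate_block, void *ctx) {
  assert(dist >= 0 && lambda >= 0);
  if (dist > best_rd) return INT64_MAX; /* cannot beat best_rd */
  return dist + lambda * (int64_t)rate_block(ctx);
}
```

The early return is only a pure speedup when nothing else consumes the skipped rate value, which matches the commit's note that the output is unchanged once tx_cache is removed.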
Parag Salasakar [Fri, 31 Jul 2015 06:33:19 +0000 (12:03 +0530)]
mips msa vp8 temporal filter optimization
average improvement ~2x-3x
Change-Id: I05593bed583234dc7809aaec6cab82773a29505d
Parag Salasakar [Fri, 31 Jul 2015 03:59:10 +0000 (09:29 +0530)]
mips msa vp8 block subtract optimization
average improvement ~2x-3x
Change-Id: I30abf4c92cddcc9e87b7a40d4106076e1ec701c2
Parag Salasakar [Fri, 31 Jul 2015 03:44:03 +0000 (03:44 +0000)]
Merge "mips msa vp8 quantize optimization"
Yunqing Wang [Wed, 29 Jul 2015 20:37:41 +0000 (13:37 -0700)]
Remove tx cache and speed up tx size selection
1. The RD scores obtained during the tx size selection were stored in the
tx cache and used to help make the tx decision for the following frames.
This is no longer used in the VP9 encoder. The related decision-making
code from 1.5+ years ago was recovered, and borg tests didn't show any
quality gain. This patch removes it to lower the complexity.
2. An optimization was done after the above refactoring. If the tx_mode
is not TX_MODE_SELECT, we only need to test the chosen tx size instead
of all possible tx sizes. This gave a 1.5% average speed gain at speed 2,
and a 1% average speed gain at speed 3.
Change-Id: Id8cd650e066a8cef33829d8c15388a8138adc78c
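Point 2 of the commit above can be sketched as a search loop whose range collapses to a single size when the mode is not TX_MODE_SELECT. The function pick_tx_size, the tx_rd_fn callback, the demo_rd cost and the integer size indices are hypothetical and only illustrate the control flow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-size RD evaluation callback. */
typedef int64_t (*tx_rd_fn)(int tx_size, void *ctx);

/* Demo cost with its minimum at size index 1. */
int64_t demo_rd(int tx_size, void *ctx) {
  (void)ctx;
  return (int64_t)(tx_size - 1) * (tx_size - 1);
}

/* select_mode != 0 models TX_MODE_SELECT: every size from 0 to
 * max_tx_size is evaluated. Otherwise only fixed_tx_size is tested. */
int pick_tx_size(int select_mode, int fixed_tx_size, int max_tx_size,
                 tx_rd_fn rd, void *ctx, int *evaluations) {
  const int start = select_mode ? 0 : fixed_tx_size;
  const int end = select_mode ? max_tx_size : fixed_tx_size;
  int best = start;
  int64_t best_rd = INT64_MAX;
  assert(start >= 0 && start <= end);
  *evaluations = 0;
  for (int n = start; n <= end; ++n) {
    const int64_t this_rd = rd(n, ctx);
    ++*evaluations;
    if (this_rd < best_rd) {
      best_rd = this_rd;
      best = n;
    }
  }
  return best;
}
```

With four candidate sizes, the non-select path performs one evaluation instead of four, which is where a speed gain of the kind reported would come from.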
Aℓex Converse [Thu, 30 Jul 2015 23:04:28 +0000 (23:04 +0000)]
Merge "Convert simple_model_rd_from_var from a speed check to a speed feature."
Hui Su [Thu, 30 Jul 2015 22:29:35 +0000 (22:29 +0000)]
Merge "Exclude vpx intra prediction functions in vp8-only build"
Alex Converse [Thu, 30 Jul 2015 20:52:02 +0000 (13:52 -0700)]
Convert simple_model_rd_from_var from a speed check to a speed feature.
Change-Id: I8877025e172fff29bc4e270790211463b676b4d7
hui su [Thu, 30 Jul 2015 02:43:29 +0000 (19:43 -0700)]
Exclude vpx intra prediction functions in vp8-only build
Currently vp8 is not using the intra prediction functions in vpx_dsp.
Change-Id: I1522b5f5cb12a81999fb126cf7c62c70259e7a52
James Zern [Wed, 29 Jul 2015 23:07:05 +0000 (16:07 -0700)]
Android.mk: fix *_rtcd.h deps for armeabi-v7a
strip '.neon' so *_rtcd.h depends on the correct file
Change-Id: Iecdbbc34c9ce5c6d0a4b466332d52f4e6a0cb128
Parag Salasakar [Thu, 30 Jul 2015 05:26:40 +0000 (10:56 +0530)]
mips msa vp8 quantize optimization
average improvement ~2x-3x
Change-Id: I6fc37191bf9cb5a67e1af9787d0d27659c17bdba
Alex Converse [Thu, 30 Jul 2015 19:36:57 +0000 (12:36 -0700)]
Cleanup rdcost_block_args
Change-Id: I9d613cbe9e76b5dd15e935878ef9fd04521690ba
Aℓex Converse [Thu, 30 Jul 2015 19:37:28 +0000 (19:37 +0000)]
Merge "Clean up some casts."
Jingning Han [Thu, 30 Jul 2015 05:37:53 +0000 (05:37 +0000)]
Merge "Cosmetics - Fix header file order in unit tests"
Jingning Han [Wed, 29 Jul 2015 21:51:36 +0000 (14:51 -0700)]
Cosmetics - Fix header file order in unit tests
Change-Id: I9582a8d74990125b71e8fe620f7f3f2585a30798
Parag Salasakar [Thu, 30 Jul 2015 02:44:42 +0000 (08:14 +0530)]
mips msa vp8 fdct optimization
average improvement ~2x-4x
Change-Id: Id0bc600440f7ef53348f585ebadb1ac6869e9a00
Parag Salasakar [Thu, 30 Jul 2015 02:34:06 +0000 (02:34 +0000)]
Merge "mips msa vp8 post proc optimization"
Aℓex Converse [Thu, 30 Jul 2015 01:06:08 +0000 (01:06 +0000)]
Merge "Comment zcoeff_blk."
Alex Converse [Wed, 29 Jul 2015 23:53:33 +0000 (16:53 -0700)]
Comment zcoeff_blk.
Change-Id: Iefc2eb78e71472ecf51802ec59ff32caef4bd0f4
Yaowu Xu [Wed, 29 Jul 2015 23:27:34 +0000 (16:27 -0700)]
Add const to a variable declaration
Change-Id: Idf572c22a87098665f5179dc3212a06d9a85a342
Yaowu Xu [Wed, 29 Jul 2015 23:23:14 +0000 (16:23 -0700)]
Fix a typo
Change-Id: Ief8eea8fe6bef139d1e94f8d6dfac5a44efe785d
James Zern [Wed, 29 Jul 2015 22:38:43 +0000 (15:38 -0700)]
Android.mk: add a dep on vpx_config.asm for x86_64
Change-Id: I8b4db93f754607aab64351745bd102ab238d9501
Alex Converse [Fri, 24 Jul 2015 21:59:03 +0000 (14:59 -0700)]
Clean up some casts.
Change-Id: I264ca534cd7d4755906e20aea47e7a2523bca611
Parag Salasakar [Wed, 29 Jul 2015 04:10:26 +0000 (09:40 +0530)]
mips msa vp8 post proc optimization
average improvement ~2x-4x
Change-Id: I93abc15389649c169bb8b69127c0b95407d34692
Parag Salasakar [Wed, 29 Jul 2015 04:00:41 +0000 (04:00 +0000)]
Merge "mips msa vp8 filter by weight optimization"
James Zern [Wed, 29 Jul 2015 00:47:09 +0000 (00:47 +0000)]
Merge "add vp9_block_error_fp_neon"
Hui Su [Wed, 29 Jul 2015 00:38:48 +0000 (00:38 +0000)]
Merge "Replace prefix vp9_ with vpx_ for intra prediction functions"
Jingning Han [Wed, 29 Jul 2015 00:07:31 +0000 (00:07 +0000)]
Merge "Replace vp9_ prefix in 2D-DCT functions with vpx_"
Jingning Han [Wed, 29 Jul 2015 00:06:56 +0000 (00:06 +0000)]
Merge "Remove vp9_dct.h file"
Jingning Han [Wed, 29 Jul 2015 00:06:37 +0000 (00:06 +0000)]
Merge "Move DC only forward 2D-DCT functions to vpx_dsp"
Jingning Han [Tue, 28 Jul 2015 22:57:40 +0000 (15:57 -0700)]
Replace vp9_ prefix in 2D-DCT functions with vpx_
Clean up the forward 2D-DCT function names in vpx_dsp.
Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25
Jingning Han [Tue, 28 Jul 2015 22:25:05 +0000 (15:25 -0700)]
Remove vp9_dct.h file
The forward 32x32 2D-DCT functions are aligned in vpx_dsp folder.
The vp9_dct.h file is not effectively used now.
Change-Id: Ie7946b6fdd784b8e91496242337bc9002c75c281
Aℓex Converse [Tue, 28 Jul 2015 21:59:33 +0000 (21:59 +0000)]
Merge "Remove branch in inner loop of foreach_transformed_block_in_plane()"
Aℓex Converse [Tue, 28 Jul 2015 21:59:02 +0000 (21:59 +0000)]
Merge changes If196d9e5,Ib669d572
* changes:
Simplify is_skippable to point straight to eobs.
Don't initialize extra context tree buffers for 4x8 and 8x4.
Jingning Han [Tue, 28 Jul 2015 21:42:25 +0000 (14:42 -0700)]
Move DC only forward 2D-DCT functions to vpx_dsp
This completes the forward transform functions layout refactoring.
Change-Id: I996fb0fb795f41e2040f7b21db985774098aedbd
James Zern [Tue, 28 Jul 2015 21:50:35 +0000 (21:50 +0000)]
Merge "build/make/Android.mk: support TARGET_ARCH_ABI=x86_64"
Johann [Tue, 28 Jul 2015 21:00:32 +0000 (14:00 -0700)]
Don't use 'h' for functions using x86inc.asm
In newer versions of x86inc.asm, 'h' is used as a modifier for register
names.
Change-Id: Ie5b9dd2f91ecdc8f6f18b2701b6dc23042b604e4
Hui Su [Tue, 28 Jul 2015 20:41:01 +0000 (20:41 +0000)]
Merge "Move intra prediction functions from vp9/common/ to vpx_dsp/"
Jingning Han [Tue, 28 Jul 2015 20:36:59 +0000 (20:36 +0000)]
Merge "Factor 32x32 fwd DCT to vpx_dsp folder"
Jingning Han [Mon, 27 Jul 2015 23:05:15 +0000 (16:05 -0700)]
Factor 32x32 fwd DCT to vpx_dsp folder
Move the 32x32 2D-DCT implementations from vp9/ to vpx_dsp/.
Change-Id: Id3980696f8b69906ff7a59ff9fb2b9013d60047d
Frank Galligan [Tue, 28 Jul 2015 16:05:41 +0000 (09:05 -0700)]
Fix dspr2 build.
Change-Id: I18895c29d6db872d033b3874de9dcd9501d0c10e
James Zern [Sat, 25 Jul 2015 19:27:56 +0000 (12:27 -0700)]
add vp9_block_error_fp_neon
~60-70% faster depending on the block size
Change-Id: Icdbaa9977a91a63cbcc6ead0cf19d5a2af7f27e1
Parag Salasakar [Tue, 28 Jul 2015 02:46:34 +0000 (08:16 +0530)]
mips msa vp8 filter by weight optimization
average improvement ~3x-5x
Change-Id: Ia808ae56b118e0e1b293901447aa5a0f597b405b
Parag Salasakar [Tue, 28 Jul 2015 02:27:31 +0000 (02:27 +0000)]
Merge "mips msa vp8 recon intra optimization"
Yunqing Wang [Tue, 28 Jul 2015 01:25:14 +0000 (01:25 +0000)]
Merge "Remove tx_select_threshes"
Jingning Han [Mon, 27 Jul 2015 21:56:43 +0000 (14:56 -0700)]
Move forward dct sse2 header file to vpx_dsp
Change-Id: Iba03852ce778c956200818e3473cfb2b48cf8d8e
hui su [Tue, 21 Jul 2015 16:39:46 +0000 (09:39 -0700)]
Replace prefix vp9_ with vpx_ for intra prediction functions
Change-Id: I8ae6fb586f8d5d018ace228df11714f82b085076
hui su [Sun, 19 Jul 2015 22:02:56 +0000 (15:02 -0700)]
Move intra prediction functions from vp9/common/ to vpx_dsp/
Change-Id: I64edc26cf4aab050c83f2d393df6250628ad43b8
Jingning Han [Mon, 27 Jul 2015 19:05:33 +0000 (12:05 -0700)]
Use common coefficient definition in neon idct implementations
Replace the duplicate coefficient definition in neon implementations
of inverse transform with those from vpx_dsp/txfm_common.h
Change-Id: I4cd9bd9569ab1793dfdbb6f16d80bcb581599f0d
Yunqing Wang [Mon, 27 Jul 2015 18:58:39 +0000 (11:58 -0700)]
Remove tx_select_threshes
Removed unused tx_select_threshes and tx_select_diff.
Change-Id: I5e9e7ad170056efe14b5f071e94d0c5a36e4a34c
Jingning Han [Fri, 24 Jul 2015 17:27:23 +0000 (10:27 -0700)]
Replace vp9_idct.h for precise dependency
This commit replaces vp9_idct.h with txfm_common.h in many SIMD
implementation files for precise file dependency.
Change-Id: If73dd726bb16537e7494f28538b0a169810f9756
Jingning Han [Thu, 23 Jul 2015 23:35:44 +0000 (16:35 -0700)]
Refactor vp9_idct.h file
Separate the common coefficient constant into vpx_dsp/txfm_common.h.
Move the SSE2 macro definitions to vpx_dsp/x86/txfm_common_sse2.h.
This clears the use case of vp9_idct.h in vpx_dsp folder.
Change-Id: I319735a2abf42888e5080ac14cfbcde34be7b121
Parag Salasakar [Sat, 25 Jul 2015 07:02:26 +0000 (12:32 +0530)]
mips msa vp8 recon intra optimization
average improvement ~3x-5x
Change-Id: I73306863e9bf172d5adc06b8dd54e43985d1e063
James Zern [Fri, 24 Jul 2015 21:24:20 +0000 (14:24 -0700)]
build/make/Android.mk: support TARGET_ARCH_ABI=x86_64
requires r10e or newer:
Android NDK, Revision 10e (May 2015)
...
Other bug fixes:
...
- Fixed .asm support for ABI x86_64.
Change-Id: I51ec9a5f77c982b7412d922e896348a83ae2d7d6
Marco Paniconi [Fri, 24 Jul 2015 22:23:10 +0000 (22:23 +0000)]
Merge "Dynamic resize for real-time: reference scaling."
Jingning Han [Thu, 23 Jul 2015 21:55:05 +0000 (14:55 -0700)]
Remove redundant function definitions in vp9_dct_sse2.h
Change-Id: I283d364a4e65ca9bf6ff581da1d0b498433c5402
Jingning Han [Thu, 23 Jul 2015 21:41:23 +0000 (14:41 -0700)]
Remove vp9_dct.h from fwd_txfm_impl_sse2 header file
Change-Id: Ib3a4814fdb9d69cf6cc23bdd208f9bc9e7972edc
Jingning Han [Fri, 24 Jul 2015 21:11:33 +0000 (21:11 +0000)]
Merge "Move msa implementations of 2D-DCT to vpx_dsp"
Jingning Han [Wed, 22 Jul 2015 18:53:21 +0000 (11:53 -0700)]
Move msa implementations of 2D-DCT to vpx_dsp
Refactor and clean up the msa transform related code layout.
Change-Id: Ic5048bd3d62a6046589817da745370ea89448e44
Parag Salasakar [Fri, 24 Jul 2015 18:16:23 +0000 (18:16 +0000)]
Merge "mips msa vp8 bilinear filter optimization"
Alex Converse [Fri, 24 Jul 2015 17:32:09 +0000 (10:32 -0700)]
Remove branch in inner loop of foreach_transformed_block_in_plane()
Change-Id: Ib14d09376a9ce4fa5f541264e5c335aceb71380a
Alex Converse [Fri, 24 Jul 2015 17:35:44 +0000 (10:35 -0700)]
Simplify is_skippable to point straight to eobs.
Change-Id: If196d9e5c7a15ee7d988ee2ecbf155a54d59b480
Alex Converse [Fri, 24 Jul 2015 17:35:10 +0000 (10:35 -0700)]
Don't initialize extra context tree buffers for 4x8 and 8x4.
Change-Id: Ib669d572654f24fd43410a9399a8b609e87f846a
Hui Su [Fri, 24 Jul 2015 17:40:37 +0000 (17:40 +0000)]
Merge "Code cleanup in vp9_encode_block_intra"
Aℓex Converse [Fri, 24 Jul 2015 17:38:45 +0000 (17:38 +0000)]
Merge "Allocate four |zcoeff_blk| for sub8x8 contexts."
Aℓex Converse [Fri, 24 Jul 2015 17:38:32 +0000 (17:38 +0000)]
Merge "Allocate eobs array per txblock and not per pixel."
Parag Salasakar [Fri, 24 Jul 2015 03:51:35 +0000 (09:21 +0530)]
mips msa vp8 bilinear filter optimization
average improvement ~3x-4x
Change-Id: I8c0b3d5c86c9eb4f802b87c971864d2cfceeb7cc
Parag Salasakar [Fri, 24 Jul 2015 03:43:37 +0000 (03:43 +0000)]
Merge "mips msa vp8 copy mem optimization"