granicus.if.org Git - libvpx/log
Yaowu Xu [Thu, 13 Oct 2016 15:20:52 +0000 (15:20 +0000)]
Merge "Sync 2x2 intra predictors" into nextgenv2
Alex Converse [Fri, 9 Sep 2016 17:30:36 +0000 (10:30 -0700)]
AnsTest: Replace the dummy distribution
Use constrained token table row 65/256 instead.
Change-Id: I8b442d4c82af8fa9d36ac2de0d73179ed040478d
(cherry picked from commit 47eb9a2ca46821b468903514cd34eaaca2533d45)
Alex Converse [Thu, 13 Oct 2016 14:03:15 +0000 (14:03 +0000)]
Merge changes Ic74d9d88,Ie93b474e,I544989ea,Ic273f7d9,Idfd2d2b3, ... into nextgenv2
* changes:
Remove custom rans types
Remove add_token_no_extra.
Remove unused aom_rans_build_cdf_from_pdf
Add the tool used to generate the constrained tokenset.
Remove the starting zero from ANS CDFs.
Import the aom_read/write_symbol abstractions from aom/master
Debargha Mukherjee [Thu, 13 Oct 2016 08:16:47 +0000 (08:16 +0000)]
Merge "Fix a bug in inverse halfright 32x32 transform" into nextgenv2
Alex Converse [Sun, 4 Sep 2016 11:30:43 +0000 (13:30 +0200)]
Remove custom rans types
(cherry picked from aom/master commit 11206c60d930be9d29100567aa67f2a65463852a)
Includes renames in a bunch of places not handled by the original
due to differing tree states.
Change-Id: Ic74d9d8850b8c80a51e55e425bbf472a67e2653f
Jingning Han [Thu, 13 Oct 2016 04:03:18 +0000 (21:03 -0700)]
Sync 2x2 intra predictors
Add 2x2 DC, V, H, TM intra predictors.
Change-Id: I2a614adde553f821c45bc5a9bf09800a9f0aaa26
Alex Converse [Sun, 4 Sep 2016 09:16:34 +0000 (11:16 +0200)]
Remove add_token_no_extra.
It was a fairly small production optimization for VP9.
Change-Id: Ie93b474ea5b7e63384a7c0b3a56b135462d1471b
(cherry picked from aom/master commit df9bb76b1330de42fe13827df4c72010adb51429)
Alex Converse [Sun, 4 Sep 2016 09:07:53 +0000 (11:07 +0200)]
Remove unused aom_rans_build_cdf_from_pdf
Change-Id: I544989eae45b7dda04250365c3de99f50110a76b
(cherry picked from aom/master commit 06cce842caa5212826d51c2a317de0bdfae74349)
Alex Converse [Wed, 6 Jul 2016 17:47:27 +0000 (10:47 -0700)]
Add the tool used to generate the constrained tokenset.
The code that generates the raw distribution is based on a MATLAB
program by Debargha Mukherjee, and the algorithm used to quantize the
distribution comes from the ANS Toolkit by Jarek Duda.
Change-Id: Ic273f7d9e43e3ecd999e9e7e04cde57e8559375a
(cherry picked from aom/master commit ef446026aeafa318f9bee182b8c80eb4f1ef5a0a)
Alex Converse [Mon, 22 Aug 2016 19:44:16 +0000 (12:44 -0700)]
Remove the starting zero from ANS CDFs.
This brings it in line with the Daala CDFs and will make it easier to
share code.
Change-Id: Idfd2d2b33c3b9b2c4e72ce72fb3d8039013448b9
(cherry picked from aom/master commit af98507ca928afe33e9f88fdd2ca168379528d6a)
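Note: as a rough illustration of a CDF table stored without a leading zero, the sketch below builds one from a PDF and looks up a symbol. The helper names (build_cdf_no_zero, cdf_lookup) and the 16-bit element type are assumptions made for this example; they are not taken from the libvpx or aom sources.

```c
#include <stdint.h>

/* Hypothetical helper: build a CDF with no leading zero from a PDF.
 * cdf[i] holds pdf[0] + ... + pdf[i], so cdf[n - 1] is the total mass. */
static void build_cdf_no_zero(const uint16_t *pdf, int n, uint16_t *cdf) {
  uint16_t acc = 0;
  for (int i = 0; i < n; ++i) {
    acc = (uint16_t)(acc + pdf[i]);
    cdf[i] = acc;
  }
}

/* Hypothetical helper: return the smallest symbol i with value < cdf[i].
 * Symbol 0 implicitly starts at 0, so no stored zero entry is needed. */
static int cdf_lookup(const uint16_t *cdf, int n, uint16_t value) {
  int i = 0;
  while (i < n - 1 && value >= cdf[i]) ++i;
  return i;
}
```

With a PDF of {64, 128, 64}, the table is {64, 192, 256}: the lower bound of symbol 0 is implied, which is the layout this commit appears to move towards.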
Alex Converse [Wed, 12 Oct 2016 22:59:58 +0000 (15:59 -0700)]
Import the aom_read/write_symbol abstractions from aom/master
Change-Id: I0b255c05108c3b97e74df1b59c34111c9e9a5770
Yi Luo [Thu, 13 Oct 2016 00:08:48 +0000 (00:08 +0000)]
Merge "Hybrid forward transform 32x32 AVX2 optimization" into nextgenv2
Alex Converse [Wed, 12 Oct 2016 22:25:27 +0000 (22:25 +0000)]
Merge changes I3ca2b674,I78afc587,I3ae62181,I5ed91556 into nextgenv2
* changes:
Unfork ANS decode_coefs
Remove ZERO_TOKEN from the ANS tokenset
Drop costing ANS tokens from derived probabilities
Unfork ANS pack_mb_tokens
Debargha Mukherjee [Wed, 12 Oct 2016 17:49:29 +0000 (10:49 -0700)]
Fix a bug in inverse halfright 32x32 transform
Fix a bug in the C implementation of the ihalfright32
transform, in the case that its input and output buffers are the same.
This occurs when it is called by av1_iht32x16_512_add_c.
Change-Id: I61c652e2662178520c0639a2879ae128a9c7ec3f
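Note: the kind of aliasing hazard described here can be sketched as follows. This is not the ihalfright32 code; swap_halves_scaled and its scale factors are invented for illustration. The point is only that any half of the input that is read after the output has been written must be saved first when the two buffers may be the same.

```c
#include <string.h>

#define HALF 16

/* Illustrative only: writes the scaled second half of `in` to the first
 * half of `out`, and the scaled first half of `in` to the second half of
 * `out`. Safe when in == out because the first half is snapshotted before
 * any output is written. */
static void swap_halves_scaled(const int *in, int *out) {
  int saved[HALF];
  memcpy(saved, in, sizeof(saved)); /* copy before out[0..HALF) is touched */
  for (int i = 0; i < HALF; ++i) out[i] = in[HALF + i] * 4;
  for (int i = 0; i < HALF; ++i) out[HALF + i] = saved[i] * 2;
}
```

Without the snapshot, the second loop would read values already overwritten by the first loop whenever the caller passes the same buffer for both arguments.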
Yi Luo [Fri, 7 Oct 2016 16:46:05 +0000 (09:46 -0700)]
Hybrid forward transform 32x32 AVX2 optimization
- av1_fht32x32 AVX2 function level time reduction ~89% compared to C.
- av1_fht32x32_avx2() on DCT_DCT improves 42.62% over aom_fdct32x32_avx2(),
  but function replacement must go with the corresponding inverse txfm.
- No obvious user level time reduction due to 32x32 TX_TYPE selection.
- Zero high 128b YMM to avoid AVX-SSE transition penalties
  (fix 16x16 case).
- Added 32x32 AVX2 unit tests to verify bitexact.
- AVX2 optimization summary:
  On CPU i7-6700, based on 16x16/32x32 fwd txfm optimization results:
    C to AVX2: function level time reduction, ~86-89%.
    SSE2 to AVX2: function level time reduction, ~51%.
Change-Id: Idd0cd8bf066a61c7117140ef15ab6c1f8eb4b036
Hui Su [Wed, 12 Oct 2016 21:13:24 +0000 (21:13 +0000)]
Merge "Send allow_screen_content flag for both key and intra only frames" into nextgenv2
Debargha Mukherjee [Wed, 12 Oct 2016 21:06:41 +0000 (21:06 +0000)]
Merge "Refactor expand dry_run types to return coef rate" into nextgenv2
Alex Converse [Wed, 12 Oct 2016 20:23:33 +0000 (13:23 -0700)]
Unfork ANS decode_coefs
This is less code and more like what we have in aom/master.
Change-Id: I3ca2b674e4ad9e2e211d08bb51d78549e8b63a54
Alex Converse [Wed, 12 Oct 2016 19:53:40 +0000 (12:53 -0700)]
Remove ZERO_TOKEN from the ANS tokenset
This can be re-added after aligning AOM's ANS with nextgenv2's ANS.
This partially reverts commit 3829cd2f2f9904572019aa047d068baeee843767.
Change-Id: I78afc587f1abfe33ffcd53b3262910cfae135534
Alex Converse [Wed, 12 Oct 2016 20:03:55 +0000 (13:03 -0700)]
Drop costing ANS tokens from derived probabilities
This mimics what's currently done in aom/master. This can be re-added
after aligning AOM's ANS with nextgenv2's ANS.
Change-Id: I3ae62181dd4803694204a234c717a86a15ca8a40
Alex Converse [Tue, 11 Oct 2016 23:50:56 +0000 (16:50 -0700)]
Unfork ANS pack_mb_tokens
This is less code and more like what we have in aom/master.
Change-Id: I5ed915563cbfbc6281113c1eb31455f50710ba9f
Jim Bankoski [Tue, 29 Mar 2016 21:21:56 +0000 (14:21 -0700)]
AUTHORS regenerated
script changed to remove extra entities and clang-format bot.
Change-Id: I102cd80fdf4b240e6e4d5172943e49146a601a72
Yaowu Xu [Wed, 12 Oct 2016 19:25:47 +0000 (19:25 +0000)]
Merge "minor updates" into nextgenv2
hui su [Wed, 12 Oct 2016 18:36:24 +0000 (11:36 -0700)]
Send allow_screen_content flag for both key and intra only frames
BUG=webm:1311
Change-Id: I03c1043d17ed4e4ea22002473779a9612884c6c6
Yaowu Xu [Wed, 12 Oct 2016 18:26:30 +0000 (18:26 +0000)]
Merge "Include fix: use aom_integer.h" into nextgenv2
Yaowu Xu [Wed, 12 Oct 2016 18:26:21 +0000 (18:26 +0000)]
Merge "Add compiler flag -Wsign-compare" into nextgenv2
Yaowu Xu [Wed, 12 Oct 2016 17:56:26 +0000 (17:56 +0000)]
Merge "LIBVPX_TEST_DATA_PATH -> LIBAOM_TEST_DATA_PATH" into nextgenv2
Yaowu Xu [Wed, 12 Oct 2016 17:50:08 +0000 (10:50 -0700)]
minor updates
1. vp8->aom
2. removed no-effect statements and spaces
Change-Id: I367d05ff9bf1b9f3c71c517c45d8049d9d4236ec
Sarah Parker [Wed, 12 Oct 2016 17:32:21 +0000 (17:32 +0000)]
Merge "Fix inconsistency in gm parameter write to bitstream" into nextgenv2
Urvang Joshi [Mon, 11 Jul 2016 22:51:21 +0000 (15:51 -0700)]
Include fix: use aom_integer.h
Change-Id: I98919a04bead417379e555461f67978501f922e7
Urvang Joshi [Fri, 8 Jul 2016 23:09:36 +0000 (16:09 -0700)]
Add compiler flag -Wsign-compare
Also, fix the warnings generated by this flag.
Conflicts:
examples/aom_cx_set_ref.c
Change-Id: I0451e119c52000aa7c1c55027d53f1da5a02a11f
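Note: for readers unfamiliar with -Wsign-compare, the sketch below shows the two usual ways such warnings are resolved: giving a loop index the same signedness as its bound, and handling the negative case explicitly before a mixed comparison. Both functions are hypothetical examples, not code from this commit.

```c
#include <stddef.h>

/* The index has the same type as n, so `i < n` compares like with like.
 * Declaring `int i` here would trigger -Wsign-compare. */
static size_t count_below(const int *v, size_t n, int limit) {
  size_t count = 0;
  for (size_t i = 0; i < n; ++i) {
    if (v[i] < limit) ++count;
  }
  return count;
}

/* A plain `a < b` would convert a negative `a` to a large unsigned value.
 * Checking the sign first keeps the mathematical meaning. */
static int signed_less_than_unsigned(int a, unsigned b) {
  return a < 0 || (unsigned)a < b;
}
```

The second function matters because, under the usual arithmetic conversions, -1 compared directly against 1u is treated as a very large unsigned number.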
Yaowu Xu [Wed, 12 Oct 2016 15:25:39 +0000 (08:25 -0700)]
LIBVPX_TEST_DATA_PATH -> LIBAOM_TEST_DATA_PATH
This commit renames LIBVPX_TEST_DATA_PATH to LIBAOM_TEST_DATA_PATH,
with a workaround for Jenkins environment variables.
Change-Id: If664ce57e25ad2af8121d1b578bf64043f0baa2a
Yaowu Xu [Wed, 12 Oct 2016 04:26:51 +0000 (04:26 +0000)]
Merge "y4m_test: fix segfault if test files are missing" into nextgenv2
Yaowu Xu [Wed, 12 Oct 2016 04:26:36 +0000 (04:26 +0000)]
Merge "Remove two files not in use" into nextgenv2
Sarah Parker [Tue, 11 Oct 2016 19:06:33 +0000 (12:06 -0700)]
Fix inconsistency in gm parameter write to bitstream
Before this change, gm parameters were being written to the
bitstream for all frames, but only read for inter only frames,
causing a bitstream error.
Change-Id: I63b8e2fdf6358e07cc00718de04cc399809bde37
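Note: one common way to prevent this class of writer/reader mismatch is to route both sides through a single predicate. The sketch below is a toy byte stream, not the AV1 bitstream code; FrameKind, ByteStream, has_gm_params, write_frame and read_frame are all hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { FRAME_INTRA = 0, FRAME_INTER = 1 } FrameKind; /* hypothetical */

typedef struct {
  uint8_t data[16];
  size_t pos;
} ByteStream; /* hypothetical toy stream */

/* Single source of truth shared by the writer and the reader. */
static int has_gm_params(FrameKind kind) { return kind == FRAME_INTER; }

/* Returns 0 on success, -1 if the toy buffer would overflow. */
static int write_frame(ByteStream *s, FrameKind kind, uint8_t gm,
                       uint8_t payload) {
  if (s->pos + 3 > sizeof(s->data)) return -1;
  s->data[s->pos++] = (uint8_t)kind;
  if (has_gm_params(kind)) s->data[s->pos++] = gm;
  s->data[s->pos++] = payload;
  return 0;
}

/* Reads one frame; *gm is 0 when the frame carries no gm parameter. */
static uint8_t read_frame(ByteStream *s, uint8_t *gm) {
  const FrameKind kind = (FrameKind)s->data[s->pos++];
  *gm = 0;
  if (has_gm_params(kind)) *gm = s->data[s->pos++];
  return s->data[s->pos++];
}
```

If the writer emitted the gm byte unconditionally while the reader kept the check, the reader would take the gm byte of an intra frame for its payload and every later field would be misaligned.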
Tristan Matthews [Fri, 22 Jan 2016 23:05:48 +0000 (18:05 -0500)]
y4m_test: fix segfault if test files are missing
Change-Id: I7a04beb83095e5c0821048909f81f45be8b5eee3
Alex Converse [Tue, 11 Oct 2016 23:24:39 +0000 (23:24 +0000)]
Merge "Remove -fno-strict-aliasing flag" into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 22:04:47 +0000 (15:04 -0700)]
Remove two files not in use
test/cx_set_ref.sh: replaced by test/aomcx_set_ref.sh
test/vpxdec.sh: replaced by aomdec.sh
Change-Id: I74136d311eee7666e08ed8f573a17f810992fc52
Yaowu Xu [Tue, 11 Oct 2016 22:11:09 +0000 (22:11 +0000)]
Merge "change to use aomedia copyright notice" into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 22:10:08 +0000 (22:10 +0000)]
Merge "Fix missing parentheses in v64_align()" into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 22:09:53 +0000 (22:09 +0000)]
Merge "Improve v128 and v64 8 bit shifts for x86" into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 22:09:30 +0000 (22:09 +0000)]
Merge "Clean up and speed up CLPF clipping" into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 22:06:59 +0000 (22:06 +0000)]
Merge "Fix typos in CLPF unit test" into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 22:06:43 +0000 (22:06 +0000)]
Merge "Make generic SIMD code compile if no native support" into nextgenv2
Debargha Mukherjee [Tue, 11 Oct 2016 12:26:50 +0000 (05:26 -0700)]
Refactor expand dry_run types to return coef rate
Adds the functionality to return the rate cost due to
coefficients without doing full search of all modes.
This will be subsequently used in various experiments,
including in new_quant experiment to search quantization
profiles at the superblock level without repeating the
full mode/partition search.
Change-Id: I4aad3f3f0c8b8dfdea38f8f4f094a98283f47f08
Yaowu Xu [Tue, 11 Oct 2016 21:54:12 +0000 (21:54 +0000)]
Merge "Bugfix in CLPF RDO. Prevented selection of enable_fb_flag=0." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 21:53:56 +0000 (21:53 +0000)]
Merge "Bugfix in the CLPF RDO." into nextgenv2
Sarah Parker [Tue, 11 Oct 2016 21:48:01 +0000 (21:48 +0000)]
Merge "Read mode to mi->bmi for sub 8x8 blocks" into nextgenv2
Yaowu Xu [Mon, 10 Oct 2016 23:21:45 +0000 (16:21 -0700)]
change to use aomedia copyright notice
Change-Id: Idb2cf2555bcbe04a6650c492a3a714d7d5836b67
Steinar Midtskogen [Wed, 28 Sep 2016 18:30:24 +0000 (20:30 +0200)]
Fix missing parentheses in v64_align()
Change-Id: I16469062853c101965f56002be30ebc5823975b1
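Note: the general problem with unparenthesized macro arguments can be sketched as below. These macros are hypothetical and are not the v64_align() definition; they only show how an expression argument can bind differently than intended.

```c
#include <stdint.h>

/* Hypothetical macros, not the actual v64_align() definition.
 * With c = 1 + 1 the unsafe form expands to x >> 8 * 1 + 1, which the
 * compiler parses as x >> 9 instead of the intended x >> 16. */
#define SHIFT_BYTES_UNSAFE(x, c) ((uint32_t)(x) >> 8 * c)
#define SHIFT_BYTES(x, c) ((uint32_t)(x) >> (8 * (c)))

/* Drops the two low bytes; the argument is deliberately an expression. */
static uint32_t drop_two_bytes(uint32_t x) { return SHIFT_BYTES(x, 1 + 1); }
```

Wrapping every macro parameter, and the whole expansion, in parentheses makes the result independent of how the caller writes the argument.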
Steinar Midtskogen [Wed, 28 Sep 2016 15:38:46 +0000 (17:38 +0200)]
Improve v128 and v64 8 bit shifts for x86
Change-Id: I25dc61bab46895d425ce49f89fceb164bee36906
Steinar Midtskogen [Mon, 26 Sep 2016 10:51:25 +0000 (12:51 +0200)]
Clean up and speed up CLPF clipping
* Move clipping tests from inside to outside loops
* Let sizex and sizey to clpf_block() be the clipped block size rather
  than both just bs
* Make fallback tests to C more accurate
Change-Id: Icdc57540ce21b41a95403fdcc37988a4ebf546c7
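Note: the idea of hoisting clipping tests out of the loops can be sketched as below. sum_block_clipped is a hypothetical stand-in for a per-block kernel and does not reproduce clpf_block(); the names sizex, sizey and bs follow the commit message, everything else is assumed.

```c
static int min_int(int a, int b) { return a < b ? a : b; }

/* Sums a bs x bs block at (x0, y0), clipped to the frame. The clipped
 * extents are computed once, so the inner loops carry no bounds tests.
 * Assumes 0 <= x0 < width and 0 <= y0 < height. */
static long sum_block_clipped(const unsigned char *frame, int stride,
                              int width, int height, int x0, int y0, int bs) {
  const int sizex = min_int(bs, width - x0);
  const int sizey = min_int(bs, height - y0);
  long sum = 0;
  for (int y = 0; y < sizey; ++y) {
    for (int x = 0; x < sizex; ++x) {
      sum += frame[(y0 + y) * stride + x0 + x];
    }
  }
  return sum;
}
```

For a 10x10 frame, a block of size 8 at (8, 8) is clipped to 2x2, and the loops visit only those four pixels.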
Steinar Midtskogen [Mon, 26 Sep 2016 19:48:09 +0000 (21:48 +0200)]
Fix typos in CLPF unit test
Change-Id: Ia69bad44e47509208e3b9d306165d0872d4e92f3
Steinar Midtskogen [Mon, 26 Sep 2016 19:05:51 +0000 (21:05 +0200)]
Make generic SIMD code compile if no native support
Change-Id: I7f691a0ae27f06ef3d727764829a60a8ffc509eb
Steinar Midtskogen [Fri, 23 Sep 2016 10:30:50 +0000 (12:30 +0200)]
Bugfix in CLPF RDO. Prevented selection of enable_fb_flag=0.
PSNR YCbCr: -0.01% -0.06% -0.17%
PSNRHVS: 0.01%
SSIM: 0.03%
MSSSIM: 0.00%
CIEDE2000: -0.05%
Change-Id: I1205c021bfc5cee6f80344fec92aabb529af9bd1
Steinar Midtskogen [Wed, 21 Sep 2016 10:39:13 +0000 (12:39 +0200)]
Bugfix in the CLPF RDO.
When CLPF was extended to chroma, the chroma RDO accidentally
discarded the optimal block size found in the luma RDO.
PSNR YCbCr: -0.25% 0.05% 0.06%
PSNRHVS: -0.19%
SSIM: -0.36%
MSSSIM: -0.23%
Conflicts:
av1/common/clpf.c
Change-Id: Ie49cd30f9276a311ada88cb2f13d14757617f030
Yaowu Xu [Tue, 11 Oct 2016 19:16:25 +0000 (19:16 +0000)]
Merge "Move tree writing code into bitwriter.h." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 19:16:07 +0000 (19:16 +0000)]
Merge "Remove unused color_sensitivity member from MACROBLOCK." into nextgenv2
Sarah Parker [Tue, 11 Oct 2016 18:51:59 +0000 (11:51 -0700)]
Read mode to mi->bmi for sub 8x8 blocks
Previously, only the motion vectors were being stored. This caused
a mismatch in the global motion experiment, which needs this
mode information to decide whether or not to use the gm parameters
in reconstruction.
Change-Id: I58cde750ec06587dbfb8d65b07c15a67b7d6b1f6
Yaowu Xu [Tue, 11 Oct 2016 18:44:56 +0000 (18:44 +0000)]
Merge "CLPF: Remove redundant function argument." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 18:44:30 +0000 (18:44 +0000)]
Merge "Extend CLPF to chroma." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 18:43:26 +0000 (18:43 +0000)]
Merge "Remove some dead code in CLPF." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 18:42:52 +0000 (18:42 +0000)]
Merge "Print correct info if CLPF unit tests fail." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 18:42:34 +0000 (18:42 +0000)]
Merge "Reduce memory footprint for CLPF encoding." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 18:42:15 +0000 (18:42 +0000)]
Merge "Make generic SIMD work with clang." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 18:41:50 +0000 (18:41 +0000)]
Merge "Fix clang-format warnings in aom_dsp/simd/v64_intrinsics_arm.h" into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 18:41:40 +0000 (18:41 +0000)]
Merge "Non-normative quality improvements to CLPF." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 18:41:14 +0000 (18:41 +0000)]
Merge "Added high bit-depth support in CLPF." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 18:41:02 +0000 (18:41 +0000)]
Merge "Fix a memleak in CLPF." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 18:40:47 +0000 (18:40 +0000)]
Merge "Reduce memory footprint for CLPF decoding." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 18:40:04 +0000 (18:40 +0000)]
Merge "Make CLPF handle frame widths and heights not divisible by 8." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 17:44:12 +0000 (17:44 +0000)]
Merge "CLPF: Don't assume sb size=64 and w&h multiple of 8 + valgrind fix." into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 17:43:23 +0000 (17:43 +0000)]
Merge "Silence some harmless compiler warnings in CLPF." into nextgenv2
Zoe Liu [Tue, 11 Oct 2016 16:58:16 +0000 (16:58 +0000)]
Merge "Add a small code clean for show_existing_frame" into nextgenv2
Nathan E. Egge [Sun, 19 Jun 2016 16:02:33 +0000 (12:02 -0400)]
Move tree writing code into bitwriter.h.
Rename av1_write_tree() to aom_write_tree() and move it into bitwriter.h
to match aom_read_tree() in bitreader.h.
Manually cherry-picked from aom/master:
33a143fa7ac42d62080bfc20468cb76ad26045db
Change-Id: I6c686cdd3e0f179d7e95c5bc6984558b62d46d67
Thomas Daede [Tue, 21 Jun 2016 00:56:24 +0000 (17:56 -0700)]
Remove unused color_sensitivity member from MACROBLOCK.
Conflicts:
av1/encoder/block.h
av1/encoder/encodeframe.c
Change-Id: I941e7b9e76380f262b173928d3c5132c5613b3ce
Yaowu Xu [Tue, 11 Oct 2016 16:15:43 +0000 (16:15 +0000)]
Merge "Use derived variable size for memcpy" into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 16:05:18 +0000 (16:05 +0000)]
Merge "Added generic SIMD support for CLPF." into nextgenv2
Debargha Mukherjee [Tue, 11 Oct 2016 15:50:31 +0000 (15:50 +0000)]
Merge "Add sse2 forward / inverse 4x8 and 8x4 transforms" into nextgenv2
Yaowu Xu [Tue, 11 Oct 2016 00:39:29 +0000 (17:39 -0700)]
Use derived variable size for memcpy
Manually cherry-picked from aom/master:
bf2ad75a1723d223c376b93295aa06dd23226937
Change-Id: I99f05e79ec8ad35a49bc124e6dd829ccc7d9cc36
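Note: "derived variable size" is read here as taking the memcpy length from the destination object rather than from a type name written out by hand. That reading, and the BlockInfo type below, are assumptions for illustration only.

```c
#include <string.h>

typedef struct {
  int mv[2];
  int mode;
} BlockInfo; /* hypothetical type */

/* sizeof(*dst) follows the declared type of dst, so the length stays
 * correct even if the pointer's type is changed later. */
static void copy_block_info(BlockInfo *dst, const BlockInfo *src) {
  memcpy(dst, src, sizeof(*dst));
}
```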
Zoe Liu [Tue, 11 Oct 2016 00:18:57 +0000 (17:18 -0700)]
Add a small code clean for show_existing_frame
Change-Id: I42dc9f0fdecd3cf3398ab82d6e01dde06bdf7b24
Steinar Midtskogen [Mon, 19 Sep 2016 11:33:52 +0000 (13:33 +0200)]
CLPF: Remove redundant function argument.
Change-Id: I31bea3b1f76493060edd7e1bd616a223841d5f77
Steinar Midtskogen [Tue, 13 Sep 2016 14:37:13 +0000 (16:37 +0200)]
Extend CLPF to chroma.
Objective quality impact (low latency):
PSNR YCbCr: 0.13% -1.37% -1.79%
PSNRHVS: 0.03%
SSIM: 0.24%
MSSSIM: 0.10%
CIEDE2000: -0.83%
Change-Id: I8ddf0def569286775f0f9d4d4005932766a7fc27
Steinar Midtskogen [Tue, 13 Sep 2016 06:55:56 +0000 (08:55 +0200)]
Remove some dead code in CLPF.
av1_clpf_frame() was always called with the same src and dst,
so we only need one argument and the code supporting different
src and dst was removed.
Change-Id: I70919f50e5cfb19c22eb4dff9ee7c0fa2697fad3
Steinar Midtskogen [Fri, 9 Sep 2016 15:30:21 +0000 (17:30 +0200)]
Print correct info if CLPF unit tests fail.
Change-Id: Ieac27194f342d8ef9ef98c96ebea9d0c444658cf
Steinar Midtskogen [Thu, 8 Sep 2016 07:48:31 +0000 (09:48 +0200)]
Reduce memory footprint for CLPF encoding.
Use in-place filtering, like in the decoder
(see eb5794da1659f87597291d84c2fbdfd89280065d).
Change-Id: If037ead45f5cb3461347a63e0e415954d5dcba8b
Steinar Midtskogen [Thu, 1 Sep 2016 17:45:29 +0000 (19:45 +0200)]
Make generic SIMD work with clang.
Change-Id: I2c504a078a7137bea6ba50c5768c1295878e9ea1
Jingning Han [Thu, 1 Sep 2016 19:36:25 +0000 (12:36 -0700)]
Fix clang-format warnings in aom_dsp/simd/v64_intrinsics_arm.h
Change-Id: I221bf4520d7030133e3b2fea883a995b3d6f6282
Steinar Midtskogen [Fri, 2 Sep 2016 08:56:54 +0000 (10:56 +0200)]
Non-normative quality improvements to CLPF.
BDR improvements:
      PSNR    PSNRHVS  SSIM    MSSSIM  CIEDE2000  PSNR Cb  PSNR Cr
LL:   -0.17%  -0.13%   -0.11%  -0.12%  -0.18%     -0.19%   -0.21%
HL:   -0.21%  -0.14%   -0.15%  -0.11%  -0.37%     -0.39%   -0.52%
Change-Id: I58c00a1cc0ddfc3376644f66345e99472482a613
Steinar Midtskogen [Fri, 9 Sep 2016 13:23:35 +0000 (15:23 +0200)]
Added high bit-depth support in CLPF.
Change-Id: Ic5eadb323227a820ad876c32d4dc296e05db6ece
Steinar Midtskogen [Fri, 9 Sep 2016 15:36:22 +0000 (17:36 +0200)]
Fix a memleak in CLPF.
The memleak appeared in eb5794da1659f87597291d84c2fbdfd89280065d.
Change-Id: Ifdd6d64aafa0d0ce4dfaf1844f594d5f843bf2e0
Steinar Midtskogen [Wed, 24 Aug 2016 11:00:04 +0000 (13:00 +0200)]
Reduce memory footprint for CLPF decoding.
Instead of having CLPF write to an entire new frame and
copy the result back into the original frame, make the
filter able to work in-place by keeping a buffer of size
frame_width*filter_block_size and delay the write-back
by one filter_block_size row.
This reduces the cycles spent in the filter to ~75%.
Change-Id: I78ca74380c45492daa8935d08d766851edb5fbc1
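Note: the delayed write-back idea can be sketched with a much simpler filter. The code below applies a vertical 3-tap average in place, with a block height of one row and two line buffers; the real CLPF kernel, its block size and its buffer layout differ, and filter_in_place is a hypothetical name.

```c
#include <stdlib.h>
#include <string.h>

/* Vertical 3-tap average applied in place; edge rows are replicated.
 * The filtered row y - 1 is held back in `pending` until row y has been
 * computed, so every row is filtered from unmodified neighbours.
 * Returns 0 on success, -1 on allocation failure. */
static int filter_in_place(unsigned char *frame, int width, int height,
                           int stride) {
  if (width <= 0 || height <= 0) return 0;
  unsigned char *pending = malloc((size_t)width);
  unsigned char *cur = malloc((size_t)width);
  if (!pending || !cur) {
    free(pending);
    free(cur);
    return -1;
  }
  for (int y = 0; y < height; ++y) {
    const unsigned char *above = frame + (y > 0 ? y - 1 : y) * stride;
    const unsigned char *here = frame + y * stride;
    const unsigned char *below = frame + (y < height - 1 ? y + 1 : y) * stride;
    for (int x = 0; x < width; ++x) {
      cur[x] = (unsigned char)((above[x] + here[x] + below[x] + 1) / 3);
    }
    /* Row y - 1 is no longer needed as input, so flush its result now. */
    if (y > 0) memcpy(frame + (y - 1) * stride, pending, (size_t)width);
    unsigned char *tmp = pending;
    pending = cur;
    cur = tmp;
  }
  /* Flush the last row, which is still pending. */
  memcpy(frame + (height - 1) * stride, pending, (size_t)width);
  free(pending);
  free(cur);
  return 0;
}
```

The extra memory is a small multiple of one row rather than a whole frame, which is the trade-off the commit describes.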
Steinar Midtskogen [Wed, 7 Sep 2016 06:15:11 +0000 (08:15 +0200)]
Make CLPF handle frame widths and heights not divisible by 8.
Change-Id: If5eb33b6b090f43ba64c82468576b89eddd872c3
Steinar Midtskogen [Thu, 25 Aug 2016 10:22:24 +0000 (12:22 +0200)]
CLPF: Don't assume sb size=64 and w&h multiple of 8 + valgrind fix.
Change-Id: I518ad9c58973910eb0bdcb377f2d90138208c570
Steinar Midtskogen [Fri, 2 Sep 2016 08:02:30 +0000 (10:02 +0200)]
Silence some harmless compiler warnings in CLPF.
Change-Id: I4a6d84007bc17b89cfd8d8f2440bf2968505bd6a
Steinar Midtskogen [Fri, 5 Aug 2016 10:12:38 +0000 (12:12 +0200)]
Added generic SIMD support for CLPF.
Change-Id: Ie03f9a5b0a4c708a586532198d755a1e7509f149
Yaowu Xu [Mon, 10 Oct 2016 18:17:50 +0000 (18:17 +0000)]
Merge "Added generic SIMD library supporting x86 SSE2+ and ARM NEON." into nextgenv2
Yaowu Xu [Mon, 10 Oct 2016 18:17:41 +0000 (18:17 +0000)]
Merge "New CLPF: New kernel and RDO for strength and block size" into nextgenv2
David Barker [Mon, 3 Oct 2016 15:27:27 +0000 (16:27 +0100)]
Add sse2 forward / inverse 4x8 and 8x4 transforms
Change-Id: I89ed93fb20cf975c2b463cff58879521ceaa4163
Yi Luo [Fri, 7 Oct 2016 01:52:10 +0000 (01:52 +0000)]
Merge "Hybrid forward transforms 16x16 AVX2 optimization" into nextgenv2