granicus.if.org Git - libvpx/log
Jingning Han [Mon, 15 Dec 2014 20:48:07 +0000 (12:48 -0800)]
Use right shift to replace division in vp9_pick_inter_mode
Make the variable reduction_fac log2 based and explicitly use
right shift when computing intra_cost_penalty.
Change-Id: I208f1fb879a02debb3b3fc64f9fd06260dcf1c86
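The idea in the commit above can be sketched as follows. This is a minimal illustration, not the libvpx code: the helper name scale_penalty and its parameters are hypothetical, and it assumes the reduction factor is a power of two stored as its base-2 logarithm.

```c
#include <assert.h>

/* Hypothetical helper (not from libvpx): scale a non-negative penalty
 * down by a power-of-two factor given as its base-2 logarithm. */
static int scale_penalty(int penalty, int reduction_fac_log2) {
  assert(penalty >= 0);
  assert(reduction_fac_log2 >= 0 && reduction_fac_log2 < 31);
  /* For non-negative x, x >> k gives the same result as x / (1 << k). */
  return penalty >> reduction_fac_log2;
}
```

The equivalence with division only holds for non-negative operands, which is why the sketch asserts on the sign of the input.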
Jingning Han [Fri, 12 Dec 2014 22:33:52 +0000 (14:33 -0800)]
Simplify rate-distortion modeling function
Use left shift to replace one multiplication. The computation
outcome remains identical.
Change-Id: I1e1737af0a245de0d2a2bde10f0c171477199fc1
Paul Wilkins [Mon, 15 Dec 2014 11:52:55 +0000 (11:52 +0000)]
Revert "Add support for setting byte alignment."
Fails to compile. Bad calls to vp9_alloc_frame_buffer
and vp9_realloc_frame_buffer in postproc.c
This reverts commit 399823b6f50fb7465f62822d1395e2192e7b07fc.
Change-Id: I29f0e173f8e185d3a303cfdb17813e1eccb51e3a
hkuang [Sat, 13 Dec 2014 17:56:55 +0000 (09:56 -0800)]
Merge "Fix a bug that break the vp8 fragment decoder."
James Zern [Sat, 13 Dec 2014 03:45:15 +0000 (19:45 -0800)]
iosbuild: add success/failure output
Change-Id: I84492f68752321f0266141666e2672ed2da5f509
hkuang [Thu, 20 Nov 2014 23:39:56 +0000 (15:39 -0800)]
Fix a bug that break the vp8 fragment decoder.
(issue #882).
Change-Id: I2ca7f96d390c4eaec0473c50cb01b903d0bd3ee6
James Zern [Sat, 13 Dec 2014 00:29:42 +0000 (16:29 -0800)]
Merge "Optimize bit_read_buffer."
Tom Finegan [Sat, 13 Dec 2014 00:14:57 +0000 (16:14 -0800)]
Merge "vpxdec: Rename the libyuv scale wrapper."
Tom Finegan [Sat, 13 Dec 2014 00:14:12 +0000 (16:14 -0800)]
Merge "iosbuild.sh: Add targets argument."
Frank Galligan [Fri, 12 Dec 2014 23:47:11 +0000 (15:47 -0800)]
Merge "Add support for setting byte alignment."
hkuang [Fri, 12 Dec 2014 21:54:40 +0000 (13:54 -0800)]
Optimize bit_read_buffer.
Change-Id: Iee43c34909deec9787b29c1c33672213b9f049df
James Zern [Fri, 12 Dec 2014 22:32:52 +0000 (14:32 -0800)]
Merge "Remove redundant loads on 1d16_v8 filter."
James Zern [Fri, 12 Dec 2014 22:32:26 +0000 (14:32 -0800)]
Merge "Remove redundant loads on 1d8_v8 filter."
James Zern [Fri, 12 Dec 2014 22:28:55 +0000 (14:28 -0800)]
Merge "vp9: move encoder-only member from common"
James Zern [Fri, 12 Dec 2014 22:28:20 +0000 (14:28 -0800)]
Merge "don't set INLINE to 'always_inline'"
James Zern [Fri, 12 Dec 2014 22:27:56 +0000 (14:27 -0800)]
Merge changes Id6421838,I37499329
* changes:
vp9: make postproc members depend on CONFIG_VP9_POSTPROC
vp9_postproc: remove redundant CONFIG_* checks
Marco [Fri, 12 Dec 2014 22:27:31 +0000 (14:27 -0800)]
Merge "Allow for 4x4 prediction blocks for key frame, speed 6."
James Zern [Fri, 12 Dec 2014 22:25:53 +0000 (14:25 -0800)]
Merge "vp9_loopfilter_mmx: remove some unused tables"
James Zern [Fri, 12 Dec 2014 22:25:30 +0000 (14:25 -0800)]
Merge "x86_abi_support: set LIBVPX_RAND w/vp9-postproc"
Tom Finegan [Fri, 12 Dec 2014 21:53:58 +0000 (13:53 -0800)]
iosbuild.sh: Add targets argument.
Allows override of default target list. Also added missing usage info
for --extra-configure-args, and removed last vestiges of armv6 support.
Change-Id: Ic0f14fffa0cbaea1bed371d38ff65e035bbe3273
Frank Galligan [Thu, 13 Nov 2014 20:28:34 +0000 (12:28 -0800)]
Add support for setting byte alignment.
Add support for setting byte alignment on the Y, U, and V plane of the
reference buffers. The byte alignment must be a power of 2, from 32 to
1024. A value of 0 sets legacy alignment.
Change-Id: I7c1399622f7aa68e123646369216b32047dda73d
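The constraint stated in the message above (0 for legacy alignment, otherwise a power of 2 from 32 to 1024) could be checked roughly as below. Both function names are hypothetical and are not taken from libvpx; the align_up helper is only an assumed example of how such a value might be applied to a buffer offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check: returns 1 when byte_alignment is 0 (legacy) or a
 * power of two in the range [32, 1024], and 0 otherwise. */
static int is_valid_byte_alignment(int byte_alignment) {
  if (byte_alignment == 0) return 1;
  if (byte_alignment < 32 || byte_alignment > 1024) return 0;
  /* A power of two has exactly one bit set. */
  return (byte_alignment & (byte_alignment - 1)) == 0;
}

/* Hypothetical use: round an offset up to the next multiple of a
 * non-zero, valid alignment. */
static size_t align_up(size_t offset, int byte_alignment) {
  size_t a = (size_t)byte_alignment;
  assert(byte_alignment != 0 && is_valid_byte_alignment(byte_alignment));
  return (offset + a - 1) & ~(a - 1);
}
```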
James Zern [Fri, 12 Dec 2014 20:16:32 +0000 (12:16 -0800)]
Merge "Remove unnecessary dqcoeff memset."
Tom Finegan [Fri, 12 Dec 2014 19:57:04 +0000 (11:57 -0800)]
vpxdec: Rename the libyuv scale wrapper.
The other name was misleading: We do not export scaling support from
libvpx via vpx_im{g,age}*.
Change-Id: I8acb4ea0301f08c9bab557a4063ea35d147b4631
Frank Galligan [Fri, 12 Dec 2014 19:48:47 +0000 (11:48 -0800)]
Remove redundant loads on 1d16_v8 filter.
This CL showed about a 3% gain in performance on some systems.
Change-Id: Id27e7e0b8e69068aa364e67859436da852669250
Frank Galligan [Fri, 12 Dec 2014 19:34:24 +0000 (11:34 -0800)]
Remove redundant loads on 1d8_v8 filter.
This CL showed a modest gain in performance on some systems.
Change-Id: Iad636a89a1a9804ab7a0dea302bf2c6a4d1653a4
James Zern [Thu, 11 Dec 2014 02:42:07 +0000 (18:42 -0800)]
don't set INLINE to 'always_inline'
INLINE is used quite widely in vp9; this change improves performance
by 1-2% on most modern platforms.
Change-Id: I8a9974aab89fa588ea4923cc7eaf6199e344a528
James Zern [Thu, 11 Dec 2014 02:11:17 +0000 (18:11 -0800)]
vp9: move encoder-only member from common
allow_comp_inter_inter VP9_COMMON -> VP9_COMP
Change-Id: I6d9dc25d1cdd7e2ab62f5be69cd9fa883d21dbb6
James Zern [Thu, 11 Dec 2014 02:12:29 +0000 (18:12 -0800)]
vp9: make postproc members depend on CONFIG_VP9_POSTPROC
Change-Id: Id64218386968cee3132269e4a0572650f20fd980
James Zern [Wed, 10 Dec 2014 01:27:52 +0000 (17:27 -0800)]
vp9_postproc: remove redundant CONFIG_* checks
the entire module is wrapped in CONFIG_VP9_POSTPROC which is forcibly
enabled with CONFIG_INTERNAL_STATS
+ a similar change in vp9_alloccommon.c
Change-Id: I374993297a9fba5bef2f0b71f984eba42f0995a3
James Zern [Wed, 10 Dec 2014 01:57:00 +0000 (17:57 -0800)]
vp9_loopfilter_mmx: remove some unused tables
Change-Id: I964d25cc91c8e4864d73b142d9c7a1b39cb6cfbb
Jim Bankoski [Fri, 12 Dec 2014 19:10:17 +0000 (11:10 -0800)]
Merge "vp9_dx_iface.c uses CONFIG_VP9_POSTPROC but config.h not included"
Jim Bankoski [Fri, 12 Dec 2014 19:10:08 +0000 (11:10 -0800)]
Merge "Adds a test to make sure encoder parms get to decoder."
James Zern [Wed, 10 Dec 2014 01:24:49 +0000 (17:24 -0800)]
x86_abi_support: set LIBVPX_RAND w/vp9-postproc
set LIBVPX_RAND with --enable-vp9-postproc, previously only the vp8
config was checked. this fixes the build with --disable-postproc.
Change-Id: Ia61baded6aa0e44d6443ae4a3c85915f1054f053
Jingning Han [Fri, 12 Dec 2014 17:16:01 +0000 (09:16 -0800)]
Merge "Fix PICK_MODE_CONTEXT index in non-RD coding mode"
Jim Bankoski [Fri, 12 Dec 2014 16:42:36 +0000 (08:42 -0800)]
vp9_dx_iface.c uses CONFIG_VP9_POSTPROC but config.h not included
Change-Id: Id316b3786214bf1028992968955da917e3f2d4a3
Jim Bankoski [Fri, 12 Dec 2014 14:18:56 +0000 (06:18 -0800)]
Fix test to call clear system state in convolve_test.
Assembly tests should clear system state, as we have no
expectation of proper system state in between test runs.
Change-Id: I0f591996c1f17ef2a5a8572a6b445f757223a144
Jim Bankoski [Fri, 12 Dec 2014 01:34:32 +0000 (17:34 -0800)]
Adds a test to make sure encoder parms get to decoder.
This is meant as a framework for testing that encode parms make it
through to the decoder.
Change-Id: Idb86ee3668b45b4e73c23c6e4daef94b0650b786
Jingning Han [Fri, 12 Dec 2014 01:17:53 +0000 (17:17 -0800)]
Fix PICK_MODE_CONTEXT index in non-RD coding mode
This commit fixes a bug in the PICK_MODE_CONTEXT index for the
horizontal partition case. The compression performance change
is below the 0.01% level, since most blocks are selected to
use square block sizes in RTC coding mode.
Change-Id: I67effc18ae8795fccdd82a55f4efc609fa5cb3e1
JackyChen [Fri, 12 Dec 2014 00:24:08 +0000 (16:24 -0800)]
Merge "Multiframe Quality Enhancement(MFQE) in VP9."
Marco [Thu, 11 Dec 2014 23:21:17 +0000 (15:21 -0800)]
Allow for 4x4 prediction blocks for key frame, speed 6.
For key frames under variance source partition: 4x4 prediction blocks
may be selected when the variance of an 8x8 block is very high (the
threshold is set fairly high for now).
Testing on some RTC clips shows this helps to reduce some ringing
artifacts on key frames.
Encoded key frame size increases by about 10%. Key frame PSNR increases
by about 0.1-0.2 dB.
Change-Id: I56e203fac32ea6ef69897fb3ea269c59cb50d174
Jingning Han [Thu, 11 Dec 2014 21:30:03 +0000 (13:30 -0800)]
Merge "Replace division with bit shift in choose_partitioning"
Deb Mukherjee [Thu, 11 Dec 2014 20:29:06 +0000 (12:29 -0800)]
Merge "Re-enable 8x8 fdct/fht tests by changing tolerance"
Debargha Mukherjee [Thu, 11 Dec 2014 20:28:45 +0000 (12:28 -0800)]
Merge "Corrected optimization of 8x8 DCT code"
hkuang [Thu, 11 Dec 2014 20:27:25 +0000 (12:27 -0800)]
Remove unnecessary dqcoeff memset.
dqcoeff is set to 0 on initialization and is reset to 0 every time
after it is used.
Change-Id: I32b8e149bba40a8d707849f737a8e49a691f319c
Jingning Han [Thu, 11 Dec 2014 19:14:07 +0000 (11:14 -0800)]
Merge "Refactor choose_partitioning computing scheme"
Jingning Han [Thu, 11 Dec 2014 19:04:49 +0000 (11:04 -0800)]
Replace division with bit shift in choose_partitioning
This commit explicitly uses the bit shift operation instead of
division for computing block variance.
Change-Id: Id19c0ff27dd1d1ae4aceee6657e1aad0d406bd74
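A minimal sketch of the kind of computation this describes, assuming the block holds a power-of-two number of samples so that the sample count can be stored as its base-2 logarithm. The function name, parameters, and exact formula are illustrative assumptions and are not copied from choose_partitioning.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: variance of N = 1 << log2_count samples from their
 * sum and sum of squares, i.e. (sum_sq - sum^2 / N) / N, with both
 * divisions by N written as right shifts. */
static int64_t block_variance(int64_t sum, int64_t sum_sq, int log2_count) {
  assert(log2_count >= 0 && log2_count < 32);
  /* sum * sum is non-negative, and sum_sq is never smaller than
   * sum^2 / N, so both shifted values are non-negative and the shifts
   * match integer division. */
  return (sum_sq - ((sum * sum) >> log2_count)) >> log2_count;
}
```

For the four samples 1, 2, 3, 4 the sum is 10 and the sum of squares is 30, so the call block_variance(10, 30, 2) follows the same arithmetic as (30 - 100 / 4) / 4.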
Peter de Rivaz [Thu, 11 Dec 2014 15:54:23 +0000 (15:54 +0000)]
Corrected optimization of 8x8 DCT code
The 8x8 DCT uses a fast version whenever possible.
There was a mistake in the checking code which
meant sometimes the fast version was used when it
was not safe to do so.
Change-Id: I154c84c9e2d836764768a11082947ca30f4b5ab7
(cherry picked from commit fd05fb0c21e253b4d6f92d7e0b752850ff8ab188)
Jingning Han [Thu, 11 Dec 2014 17:29:36 +0000 (09:29 -0800)]
Refactor choose_partitioning computing scheme
This commit refactors the choose_partitioning function. It removes
redundant memset calls and makes the encoder calculate the
variance value per block only when it is needed. It reduces the
average runtime cost of choose_partitioning by 60%. Overall it
reduces speed -6 runtime by 2-5%.
Change-Id: I951922c50d901d0fff77a3bafc45992179bacef9
JackyChen [Tue, 2 Dec 2014 20:14:47 +0000 (12:14 -0800)]
Multiframe Quality Enhancement(MFQE) in VP9.
This is the first version of MFQE in VP9. There are a few TODOs included
in this version.
Usage: Add the flag --enable-vp9-postproc when configuring the project.
In the decoder, use the flag --mfqe on the command line to enable
MFQE in postproc.
Note: A key frame with low quality is needed to see the effect of this
new patch. In my experiment, I fixed the qindex to 200 in the key frame.
Change-Id: I021f9ce4616ed3574c81e48d968662994b56a396
Johann [Wed, 10 Dec 2014 23:20:22 +0000 (15:20 -0800)]
Enable neon idct tests for intrinsics
Change-Id: I45d4a22f3ecb9af172e37c95f168805e492c5493
James Yu [Tue, 18 Feb 2014 13:56:46 +0000 (21:56 +0800)]
VP9 common for ARMv8 by using NEON intrinsics 18
Add vp9_idct32x32_add_neon.c
- vp9_idct32x32_1024_add_neon
Change-Id: Ic598b772c28bd3487a8ead7a4598a66b25f9b00f
Signed-off-by: James Yu <james.yu@linaro.org>
James Yu [Fri, 7 Feb 2014 17:52:15 +0000 (01:52 +0800)]
VP9 common for ARMv8 by using NEON intrinsics 14
Add vp9_idct16x16_add_neon.c
- vp9_idct16x16_256_add_neon_pass1
- vp9_idct16x16_256_add_neon_pass2
- vp9_idct16x16_10_add_neon_pass1
- vp9_idct16x16_10_add_neon_pass2
Change-Id: I54d25b54a36f4371760f54e4036693aaea40a5de
Signed-off-by: James Yu <james.yu@linaro.org>
James Yu [Sat, 1 Feb 2014 06:56:06 +0000 (14:56 +0800)]
VP9 common for ARMv8 by using NEON intrinsics 13
Add vp9_idct8x8_add_neon.c
- vp9_idct8x8_64_add_neon
- vp9_idct8x8_10_add_neon
Change-Id: I6ee7b4496765aa36ed52990f2ef73e9f24459610
Signed-off-by: James Yu <james.yu@linaro.org>
James Yu [Sat, 1 Feb 2014 06:01:05 +0000 (14:01 +0800)]
VP9 common for ARMv8 by using NEON intrinsics 12
Add vp9_idct4x4_add_neon.c
- vp9_idct4x4_16_add_neon
Change-Id: I011a96b10f1992dbd52246019ce05bae7ca8ea4f
Signed-off-by: James Yu <james.yu@linaro.org>
James Yu [Fri, 31 Jan 2014 05:18:15 +0000 (13:18 +0800)]
VP9 common for ARMv8 by using NEON intrinsics 11
Add vp9_idct16x16_1_add_neon.c
- vp9_idct16x16_1_add_neon
Change-Id: I7c6524024ad4cb4e66aa38f1c887e733503c39df
Signed-off-by: James Yu <james.yu@linaro.org>
James Yu [Thu, 30 Jan 2014 07:26:31 +0000 (15:26 +0800)]
VP9 common for ARMv8 by using NEON intrinsics 10
Add vp9_idct32x32_1_add_neon.c
- vp9_idct32x32_1_add_neon
Change-Id: If9ffe9a857228f5c67f61dc2b428b40965816eda
Signed-off-by: James Yu <james.yu@linaro.org>
James Yu [Thu, 30 Jan 2014 04:26:44 +0000 (12:26 +0800)]
VP9 common for ARMv8 by using NEON intrinsics 09
Add vp9_idct8x8_1_add_neon.c
- vp9_idct8x8_1_add_neon
Change-Id: I9d23e01fa96013febbf64db6c76c6c955f14e3ff
Signed-off-by: James Yu <james.yu@linaro.org>
James Yu [Thu, 30 Jan 2014 03:54:35 +0000 (11:54 +0800)]
VP9 common for ARMv8 by using NEON intrinsics 08
Add vp9_idct4x4_1_add_neon.c
- vp9_idct4x4_1_add_neon
Change-Id: Ieab9af107dbd07a4f9503bc945890c90faccb8ac
Signed-off-by: James Yu <james.yu@linaro.org>
Johann [Wed, 10 Dec 2014 19:40:46 +0000 (11:40 -0800)]
Merge "VP9 common for ARMv8 by using NEON intrinsics 07"
Johann [Wed, 10 Dec 2014 19:40:29 +0000 (11:40 -0800)]
Merge "VP9 common for ARMv8 by using NEON intrinsics 04"
Paul Wilkins [Wed, 10 Dec 2014 18:44:27 +0000 (10:44 -0800)]
Merge "Substantial restructuring of AQ mode 2."
Jingning Han [Wed, 10 Dec 2014 17:25:12 +0000 (09:25 -0800)]
Merge "Use use_prev_frame_mvs flag for ref mv search branch"
Jingning Han [Wed, 10 Dec 2014 17:25:05 +0000 (09:25 -0800)]
Merge "Refactor update_state_rt"
Jingning Han [Wed, 10 Dec 2014 17:24:56 +0000 (09:24 -0800)]
Merge "Make RTC coding flow support sub8x8 in key frame coding"
Jingning Han [Wed, 10 Dec 2014 17:05:34 +0000 (09:05 -0800)]
Merge "Cosmetic naming change"
Jingning Han [Wed, 10 Dec 2014 17:05:26 +0000 (09:05 -0800)]
Merge "Take out redundant setting of mode_info from set_block_size"
Jingning Han [Wed, 10 Dec 2014 17:05:18 +0000 (09:05 -0800)]
Merge "Remove unused rd cost calculation from nonrd_use_partition"
Jim Bankoski [Wed, 10 Dec 2014 14:42:08 +0000 (06:42 -0800)]
Merge changes I92251a8b,I5d23a685
* changes:
Adds a decode perf test that builds a new file.
Make the decoder Cfg available to encoder tests..
James Yu [Wed, 29 Jan 2014 15:12:41 +0000 (23:12 +0800)]
VP9 common for ARMv8 by using NEON intrinsics 07
Add vp9_convolve8_neon.c
- vp9_convolve8_horiz_neon
- vp9_convolve8_vert_neon
Change-Id: I0bdd99ff72d275223fe211ac7243c25a5a60cf87
Signed-off-by: James Yu <james.yu@linaro.org>
James Yu [Sat, 25 Jan 2014 12:51:49 +0000 (20:51 +0800)]
VP9 common for ARMv8 by using NEON intrinsics 04
Add vp9_convolve8_avg_neon.c
- vp9_convolve8_avg_horiz_neon
- vp9_convolve8_avg_vert_neon
Change-Id: I617971e37b02186fec5aca181f4f9622050ea2df
Signed-off-by: James Yu <james.yu@linaro.org>
James Yu [Tue, 21 Jan 2014 09:23:27 +0000 (17:23 +0800)]
VP9 common for ARMv8 by using NEON intrinsics 03
Add vp9_copy_neon.c
- vp9_convolve_copy_neon
Change-Id: I291fc5423d06240876411bbceab03eae5ef585be
Signed-off-by: James Yu <james.yu@linaro.org>
Scott LaVarnway [Wed, 10 Sep 2014 16:49:34 +0000 (09:49 -0700)]
VP9 common for ARMv8 by using NEON intrinsics 02
Add vp9_avg_neon.c
- vp9_convolve_avg_neon
Change-Id: Id2c9d5bcfa37cff1a16417aba1656ff07bdf10fd
Signed-off-by: James Yu <james.yu@linaro.org>
James Zern [Wed, 10 Dec 2014 02:31:46 +0000 (18:31 -0800)]
Merge "Fix clang ioc warning due to NULL src_mi pointer."
Jingning Han [Wed, 10 Dec 2014 02:01:17 +0000 (18:01 -0800)]
Use use_prev_frame_mvs flag for ref mv search branch
Replace error_resilient flag with use_prev_frame_mvs in
vp9_pick_inter_mode reference motion vector search selection.
This effectively turns off the simplified ref mv search in the
settings of frame resizing, even if error-resilient mode is off.
Change-Id: I7fed814ee7bc0cb419a03b846e0fc2de46ba7686
Johann [Wed, 10 Dec 2014 00:51:35 +0000 (16:51 -0800)]
Merge "Add convolve_copy and convolve_avg to the test"
Jingning Han [Tue, 9 Dec 2014 20:09:36 +0000 (12:09 -0800)]
Refactor update_state_rt
Update the frame motion vector only if previous frame motion vector
is needed for next frame reference motion vector.
Change-Id: Ica50f9d7b46ad4f815bba0d9e30f5546df29546f
hkuang [Tue, 9 Dec 2014 22:32:48 +0000 (14:32 -0800)]
Fix clang ioc warning due to NULL src_mi pointer.
The warning only happens in the VP9 encoder's first pass because src_mi
is not set up yet. It does not cause the encoder to fail, since left_mi
and above_mi are not used in the first pass and are set up again
in the second pass.
Change-Id: I12dffcd5fb1002b2b2dabb083c8726650e4b5f08
Johann [Tue, 9 Dec 2014 21:41:49 +0000 (13:41 -0800)]
Merge "VP9 common for ARMv8 by using NEON intrinsics 01"
Johann [Tue, 9 Dec 2014 20:05:15 +0000 (12:05 -0800)]
Add convolve_copy and convolve_avg to the test
Change-Id: Ic9438031282e63e627550f7e4cdeda36e43e647b
Johann [Tue, 9 Dec 2014 20:47:12 +0000 (12:47 -0800)]
Merge "Disable neon assembly when neon is disabled"
Jim Bankoski [Tue, 9 Dec 2014 20:44:45 +0000 (12:44 -0800)]
Adds a decode perf test that builds a new file.
This allows us to track decode speed for new encodes so that we catch
problems like an encode change that makes decode really slow.
Change-Id: I92251a8b1f710b241f66e1042413df1b71b76038
James Yu [Tue, 21 Jan 2014 01:43:29 +0000 (09:43 +0800)]
VP9 common for ARMv8 by using NEON intrinsics 01
Add vp9_loopfilter_neon.c
- vp9_lpf_horizontal_4_neon
- vp9_lpf_vertical_4_neon
- vp9_lpf_horizontal_8_neon
- vp9_lpf_vertical_8_neon
Change-Id: I97a0d7b399a431c21ee77396be3d5f5a1f7ebccb
Signed-off-by: James Yu <james.yu@linaro.org>
Jingning Han [Tue, 9 Dec 2014 19:31:45 +0000 (11:31 -0800)]
Make RTC coding flow support sub8x8 in key frame coding
This commit enables the use of sub8x8 blocks in RTC key frame
encoding. It requires the block size to be preset and will decide
the coding mode and encode the bit-stream.
Change-Id: I35aaf8ee2d4d6085432410c7963f339f85a2c19b
Johann [Mon, 8 Dec 2014 23:13:37 +0000 (15:13 -0800)]
Disable neon assembly when neon is disabled
Change-Id: Idde266cd7287bb6bee016c90efeafa67550f94c6
Jingning Han [Tue, 9 Dec 2014 18:30:39 +0000 (10:30 -0800)]
Cosmetic naming change
Rename set_modeinfo_offsets as set_mode_info_offsets, to be more
consistent with naming convention.
Change-Id: I68ca1f36c4a78127d9439a50c1506a2afd07927d
Jingning Han [Tue, 9 Dec 2014 18:24:37 +0000 (10:24 -0800)]
Take out redundant setting of mode_info from set_block_size
The later encoding process will take the top-left block's
mode_info for pre-determined block size.
Change-Id: I76a90f9ce7f3b2dbc2975b52442114e461c465b5
hkuang [Tue, 9 Dec 2014 18:23:18 +0000 (10:23 -0800)]
Merge "Clean up the logic of handling corrupted frame."
Paul Wilkins [Thu, 27 Nov 2014 10:50:56 +0000 (10:50 +0000)]
Substantial restructuring of AQ mode 2.
The restructure moves the decision into the rd pick
modes loop and makes a decision based at the 16x16
block level instead of only the 64x64 level.
This gives finer granularity and better visual results
on the clips I have tested. Metrics results are worse
than the old AQ2, especially for PSNR, and this mode
now falls between AQ0 and AQ1 in terms of visual
impact and metrics results.
Further tuning of this to follow.
It should be noted that if there are multiple iterations
of the recode loop the segment for a MB could change
in each loop if the previous loop causes a change in the
complexity / variance bin of the block. Also where a block
gets a delta Q this will alter the rd multiplier for this block
in subsequent recode iterations and frames where the
segmentation is applied.
Change-Id: I20256c125daa14734c16f7cc9aefab656ab808f7
Jingning Han [Tue, 9 Dec 2014 02:43:36 +0000 (18:43 -0800)]
Remove unused rd cost calculation from nonrd_use_partition
The per block rd cost calculation is not needed when partition
size is preset.
Change-Id: Ie5575248bbffb584e908aa13097f697ace6ec747
Johann [Mon, 8 Dec 2014 22:52:31 +0000 (14:52 -0800)]
Merge "Extend x32 check by also checking for __x86_64__."
Yunqing Wang [Mon, 8 Dec 2014 21:34:53 +0000 (13:34 -0800)]
Merge "SSSE3 Optimization for Atom processors using new instruction selection and ordering"
James Zern [Mon, 8 Dec 2014 20:55:06 +0000 (12:55 -0800)]
Merge "Changes to assembler for NASM on mac."
levytamar82 [Fri, 5 Dec 2014 18:14:33 +0000 (11:14 -0700)]
SSSE3 Optimization for Atom processors using new instruction selection and ordering
The function vp9_filter_block1d16_h8_ssse3 uses the PSHUFB instruction, which has a 3 cycle latency and slows execution when done in blocks of 5 or more on Atom processors.
By replacing the PSHUFB instructions with other more efficient single cycle instructions (PUNPCKLBW + PUNPCKHBW + PALIGNR), performance can be improved.
In the original code, the PSHUFB uses every byte and is consecutively copied.
This is done more efficiently by PUNPCKLBW and PUNPCKHBW, using PALIGNR to concatenate the intermediate result and then shift right the next consecutive 16 bytes for the final result.
For example:
filter = 0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8
Reg = 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
REG1 = PUNPCKLBW Reg, Reg = 0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7
REG2 = PUNPCKHBW Reg, Reg = 8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,15
PALIGNR REG2, REG1, 1 = 0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8
This optimization improved the function performance by 23% and produced a 3% user level gain on 1080p content on Atom processors.
There was no observed performance impact on Core processors (expected).
Change-Id: I3cec701158993d95ed23ff04516942b5a4a461c0
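The byte movement in the example above can be checked with a scalar model. The sketch below is not SIMD code and is not taken from libvpx: every function is a hypothetical plain-C emulation of the named instruction on 16-byte arrays, written only to show that the unpack plus PALIGNR sequence produces the same bytes as a PSHUFB with the filter mask 0,1,1,2,...,7,8.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Emulates PUNPCKLBW reg, reg: interleave the low 8 bytes with themselves. */
static void punpcklbw_self(const uint8_t reg[16], uint8_t out[16]) {
  int i;
  for (i = 0; i < 8; ++i) out[2 * i] = out[2 * i + 1] = reg[i];
}

/* Emulates PUNPCKHBW reg, reg: interleave the high 8 bytes with themselves. */
static void punpckhbw_self(const uint8_t reg[16], uint8_t out[16]) {
  int i;
  for (i = 0; i < 8; ++i) out[2 * i] = out[2 * i + 1] = reg[8 + i];
}

/* Emulates PALIGNR hi, lo, shift: concatenate hi:lo (lo in the low bytes),
 * shift right by `shift` bytes, and keep the low 16 bytes. */
static void palignr(const uint8_t hi[16], const uint8_t lo[16], int shift,
                    uint8_t out[16]) {
  uint8_t cat[32];
  assert(shift >= 0 && shift <= 16);
  memcpy(cat, lo, 16);
  memcpy(cat + 16, hi, 16);
  memcpy(out, cat + shift, 16);
}

/* Emulates PSHUFB: each output byte selects reg[mask & 15], or 0 when the
 * high bit of the mask byte is set. */
static void pshufb(const uint8_t reg[16], const uint8_t mask[16],
                   uint8_t out[16]) {
  int i;
  for (i = 0; i < 16; ++i)
    out[i] = (mask[i] & 0x80) ? 0 : reg[mask[i] & 0x0f];
}

/* Returns 1 when both paths give the same 16 bytes for this input. */
static int shuffle_paths_agree(const uint8_t reg[16]) {
  static const uint8_t mask[16] = { 0, 1, 1, 2, 2, 3, 3, 4,
                                    4, 5, 5, 6, 6, 7, 7, 8 };
  uint8_t lo[16], hi[16], via_unpack[16], via_shuffle[16];
  punpcklbw_self(reg, lo);
  punpckhbw_self(reg, hi);
  palignr(hi, lo, 1, via_unpack);
  pshufb(reg, mask, via_shuffle);
  return memcmp(via_unpack, via_shuffle, 16) == 0;
}
```

The model says nothing about latency or throughput; it only confirms that the two instruction sequences are interchangeable for this particular mask.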
hkuang [Mon, 8 Dec 2014 18:24:17 +0000 (10:24 -0800)]
Merge "Improve the performance by caching the left_mi and right_mi in macroblockd."
Paul Wilkins [Mon, 8 Dec 2014 17:01:39 +0000 (09:01 -0800)]
Merge "Use average mb energy from first pass in AQ2 test."
Frank Galligan [Mon, 8 Dec 2014 05:37:39 +0000 (21:37 -0800)]
Merge "Fix potential integer overflow."
Jim Bankoski [Sun, 7 Dec 2014 19:28:51 +0000 (11:28 -0800)]
Make the decoder Cfg available to encoder tests..
Adds decoder config as a changeable parameter to unit tests, and
changes end to end test to use commonly used parameters to enable
base test of tiles encoding and frame parallel decoding.
Change-Id: I5d23a6857303b4d68b92b15c3f2f04a1bcb4c2bb
James Zern [Sat, 6 Dec 2014 05:09:42 +0000 (21:09 -0800)]
Merge "vp9 asserts: fix compile warning"
James Zern [Sat, 6 Dec 2014 04:36:44 +0000 (20:36 -0800)]
Merge "fix building with --disable-spatial-resampling"
James Zern [Sat, 6 Dec 2014 00:02:42 +0000 (16:02 -0800)]
fix building with --disable-spatial-resampling
vpx_scale.c is only used by the vp8 encoder when spatial resampling is
enabled.
Change-Id: If3d3ad81e9ee6e0b59f8c040b9624ef52598fe03