James Zern [Thu, 6 Aug 2015 01:31:50 +0000 (18:31 -0700)]
endian_inl.h: fix mips32 android build
when configuring with mips32-android-gcc HAVE_MIPS32 would be set, but the
ndk does not set -mips32r2 for APP_ABI=mips which results in BSwap32 failing
to build; refine the check in endian_inl.h
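
A minimal sketch of the kind of refined check described above, assuming the compiler-provided __mips_isa_rev macro is what separates a MIPS32r2 target from a plain MIPS32 one. The names SKETCH_USE_MIPS32_R2 and bswap32_sketch are hypothetical and are not taken from endian_inl.h; the actual condition used in the commit may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical guard: take the MIPS32r2 path only when the compiler itself
 * reports ISA revision 2 or later, not merely because the build system set
 * HAVE_MIPS32 (assumed here to be defined as 0 or 1). */
#if HAVE_MIPS32 && defined(__mips__) && !defined(__mips64) && \
    defined(__mips_isa_rev) && (__mips_isa_rev >= 2)
#define SKETCH_USE_MIPS32_R2 1
#endif

static uint32_t bswap32_sketch(uint32_t x) {
#if defined(SKETCH_USE_MIPS32_R2)
  uint32_t ret;
  /* wsbh swaps the bytes within each halfword; rotr by 16 swaps halfwords. */
  __asm__ volatile(
      "wsbh %[ret], %[x]       \n\t"
      "rotr %[ret], %[ret], 16 \n\t"
      : [ret] "=r"(ret)
      : [x] "r"(x));
  return ret;
#else
  /* Portable fallback used on every other target. */
  return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
         ((x << 8) & 0x00ff0000u) | (x << 24);
#endif
}

/* Usage example: swapping twice must give back the original value. */
static void bswap32_sketch_selfcheck(void) {
  assert(bswap32_sketch(bswap32_sketch(0xdeadbeefu)) == 0xdeadbeefu);
}
```

With a guard of this shape, a toolchain that does not pass -mips32r2 falls through to the portable branch instead of failing to assemble the r2-only instructions.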
Jingning Han [Thu, 6 Aug 2015 19:02:05 +0000 (12:02 -0700)]
Fix compiler error in vp8/9 decoder test
Compiling the test file fails if one uses --disable-vp8-decoder
--enable-vp9-decoder. This effectively turns on CONFIG_VP8 and
CONFIG_DECODERS, but turns off CONFIG_VP8_DECODER, which causes a
compiler error in test_vector_test.cc.
This commit fixes this issue by adding vp8/9 decoder flags to
the decoder behavior test, respectively.
Marco [Wed, 5 Aug 2015 22:09:16 +0000 (15:09 -0700)]
Rate control adjustment for temporal-svc 1pass.
-For ambient qp in active_worst setting: increase the initial
averaging time (from very first frame) to account for avg_qp of key_frame.
-In postencode on key frame: update the last_q/avg_q[key_frame] for
all temporal layers.
Marco [Wed, 5 Aug 2015 20:53:26 +0000 (13:53 -0700)]
Bugfix for svc.
Restrict usage of rc.frames_since_golden to non-svc mode.
rc.frames_since_golden, which is used in non-svc mode to add a second reference,
was causing, under certain conditions, the golden reference to be turned off
in the svc case.
This change performs poorly on various x86_64 devices affecting
performance by 1-3% at 1080P. Performance on chromebook like devices was
mixed neutral to slightly negative, so there should be minimal change
there.
Jingning Han [Mon, 3 Aug 2015 21:51:10 +0000 (14:51 -0700)]
Replace vp9_ prefix with vpx_ prefix in vpx_dsp function names
This commit cleans up the function naming convention in vpx_dsp. It
replaces vp9_ prefix of global functions with vpx_ prefix. It also
removes the vp9_ prefix from static functions.
Yunqing Wang [Tue, 4 Aug 2015 19:16:47 +0000 (12:16 -0700)]
Minor adjustment in diagonal sub-pixel point checking
Choose a different diagonal point to check when the two costs are
the same, making it consistent with the way we choose the best mv.
This slightly changes the encoding result, and the derflr set borg
test at speed 0 shows 0.027% Overall PSNR gain, 0.024% Avg PSNR
gain, and 0.043% SSIM gain.
Yunqing Wang [Tue, 4 Aug 2015 19:06:21 +0000 (12:06 -0700)]
Small improvement in sub-pixel motion search
If the current best mv (namely, the search center) is still the best mv
after the first level search, the second level checks are skipped. This
patch doesn't change the bitstream. At speed 0, it speeds up the encoder
by 1% - 2%.
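
The early exit described above can be sketched roughly as follows. This is a simplified illustration, not the encoder's code: MvSketch, SkCostFn, the four-neighbor pattern, and the step sizes are all hypothetical, and the real search checks a different set of points.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int row, col; } MvSketch;
typedef unsigned int (*SkCostFn)(MvSketch mv, void *ctx);

/* Example cost: squared distance to a target mv passed through ctx. */
static unsigned int sk_distance_cost(MvSketch mv, void *ctx) {
  const MvSketch *target = (const MvSketch *)ctx;
  const int dr = mv.row - target->row;
  const int dc = mv.col - target->col;
  return (unsigned int)(dr * dr + dc * dc);
}

/* Checks the four neighbors of `center` at distance `step` and returns the
 * cheapest point seen so far, updating *best_cost. */
static MvSketch sk_check_neighbors(MvSketch center, int step, SkCostFn cost,
                                   void *ctx, unsigned int *best_cost) {
  static const int offs[4][2] = { { 0, -1 }, { 0, 1 }, { -1, 0 }, { 1, 0 } };
  MvSketch best = center;
  int i;
  for (i = 0; i < 4; ++i) {
    MvSketch cand;
    unsigned int c;
    cand.row = center.row + offs[i][0] * step;
    cand.col = center.col + offs[i][1] * step;
    c = cost(cand, ctx);
    if (c < *best_cost) {
      *best_cost = c;
      best = cand;
    }
  }
  return best;
}

/* Two-level search: the second level runs only if the first level moved
 * the best mv away from the search center. */
static MvSketch sk_subpel_search(MvSketch center, SkCostFn cost, void *ctx,
                                 int *second_level_ran) {
  unsigned int best_cost = cost(center, ctx);
  MvSketch best = sk_check_neighbors(center, 2, cost, ctx, &best_cost);
  assert(second_level_ran != NULL);
  *second_level_ran = 0;
  if (best.row == center.row && best.col == center.col) return best;
  *second_level_ran = 1;
  return sk_check_neighbors(best, 1, cost, ctx, &best_cost);
}
```

When the center is already the cheapest point, only the first-level cost evaluations are spent, which is where a speedup of this kind would come from.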
James Zern [Tue, 4 Aug 2015 03:24:44 +0000 (20:24 -0700)]
third_party/libwebm: pull from upstream
Changes:
b6de61a Adds support for simple tags
75a6d2d sample_muxer: Don't write huge files.
cec1f85 mkvmuxer: remove unused timecode_scale variable
8a61b40 Merge "mkvparser: Tiny whitespace fix."
7affc5c clang-format re-run
d6d04ac mkvmuxer: use generic Cluster::AddFrame
4928b0b Merge "mkvmuxer: Write Block key frames correctly."
c2e4a46 Merge "sample_muxer: Use AddGenericFrame to add frames."
e97f296 mkvparser: Tiny whitespace fix.
d66ba44 Merge "Add support to parse DisplayUnit."
deb41c2 Add support to parse DisplayUnit.
42e5660 Fix issues on EBML lacing block parsing
fe1e9bb Fix block parsing to not allow frame_size = 0
2cb6a28 Change assertions to checks when parsing TrackPositions
d04580f Fixes issues on Block Group parsing
c3550fd mkvmuxer: Write Block key frames correctly.
5dd0e40 Merge "mkvmuxer: Set is_key to true for metadata blocks."
8e96863 mkvmuxer: Set is_key to true for metadata blocks.
a9e4819 sample_muxer: Use AddGenericFrame to add frames.
5a3be73 Change assertions to checks when load CuePoints
f99f3b2 mkvmuxerutil::EbmlDateElementSize: remove value param
ff572b5 Frame::IsValid: fix track_number check
b6311dc mkvmuxer: Refactor to remove a lot of duplicate code
256cd02 Merge "mkvmuxer: DiscardPadding should be signed integer."
16c8e78 mkvmuxer: s/frame/data in all AddFrame* functions.
c5e511c mkvmuxer: DiscardPadding should be signed integer.
4baaa2c Add framework build script: iosbuild.sh
3d06eb1 PATENTS: fix a typo: constitutes -> constitute
d3849c2 mkvparser: Dead code removal.
f439e52 Change assertions to checks when preloading Cues
d3a44cd Fix track transversal when listing Cues on sample
c6255af Tweak .gitignore so git status is clean after checkout and build: - added missing underscore to sample_muxer - added cmake and make related files
b5229c7 Makefile.unix: s/samplemuxer/sample_muxer/
e3616a6 Add support to parse stereo mode, display width and display height in mkvparser
a4b68f8 parser: Fix bug in Chapters::Atom::Parse()
bab0a00 cmake: Set library and project name the proper way on Windows.
feeb9b1 Set library name to match Windows expectations.
b9a549b Fix CMakefile to generate libwebm.a
b386aa5 Add CMakeLists.txt and msvc_runtime.cmake.
b0f8a81 parser: Fix memory leak in Chapter parsing
f06e152 mkvmuxer: Fix MoveCuesBeforeClustersHelper recursive call.
27bb747 allow subtitle tracks with ContentEncodings
623d182 DoLoadCluster: tolerate empty clusters
1156da8 Update PATENTS to reflect s/VP8/WebM/g
0d4cb40 mkvmuxerutil: Use rand() in MSVC builds.
e12fff0 mkvmuxer: Overload WriteEbmlHeader for backward compatibility
a321704 mkvmuxer: write correct DocTypeVersion
574045e mkvmuxer: fix DiscardPadding
8be6397 Include crop elements when calculating size of Video element
8f2d1b3 mkvparser: fix DiscardPadding extraction
1c36c24 mkvmuxer: fix style guide violations
568504e Merge "UUIDs can have their high bit set"
acf788b Add support for CropLeft, CropRight, CropTop and CropBottom elements.
418188b Merge "muxer: codec_id is a mandatory element"
07688c9 mkvmuxer: Reject frames if invalid track number is passed.
2a63e47 muxer: codec_id is a mandatory element
d13c017 UUIDs can have their high bit set
Yaowu Xu [Mon, 3 Aug 2015 17:46:12 +0000 (10:46 -0700)]
Correct the allocation size for ssim_vars
ssim_vars is used to accumulate stats based on 4x4 pixel blocks. This
commit changes the allocation size to be based on mi_rows and mi_cols
to avoid out-of-bounds memory access for larger videos. The hard-coded
720x480 only works for image sizes up to 2880x1920.
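
The arithmetic behind that limit can be sketched as below, assuming one mode-info (mi) unit covers 8x8 pixels and therefore four 4x4 blocks. The helper names are hypothetical and the actual allocation call in the encoder is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: one mi unit covers 8x8 pixels, rounded up at the frame edge. */
static int sk_mi_units(int pixels) { return (pixels + 7) >> 3; }

/* One ssim_vars entry per 4x4 block; each 8x8 mi unit holds four of them. */
static size_t sk_ssim_vars_count(int mi_rows, int mi_cols) {
  assert(mi_rows > 0 && mi_cols > 0);
  return (size_t)mi_rows * (size_t)mi_cols * 4;
}
```

Under these assumptions a 2880x1920 frame needs exactly 720 * 480 entries, so the old fixed-size buffer is exhausted by anything larger, while a count derived from mi_rows and mi_cols grows with the frame.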
Jingning Han [Fri, 31 Jul 2015 01:53:18 +0000 (18:53 -0700)]
Factor inverse transform functions into vpx_dsp
This commit moves the module inverse transform functions from vp9
to vpx_dsp folder. The hybrid transform wrapper functions stay in
the vp9 folder, since they involve codec-specific data structures.
Scott LaVarnway [Thu, 30 Jul 2015 12:02:04 +0000 (05:02 -0700)]
VP9_COPY_CONVOLVE_SSE2 optimization
This function suffers from a couple of problems on small cores (tablets):
- The load of the next iteration is blocked by the store of the previous iteration
- 4k aliasing (between future stores and older loads)
- Current small-core machines are in-order, so the store will spin the rehabQ until the load is finished
Fixed by:
- prefetching 2 lines ahead
- unrolling the copy of 2 rows of the block
- pre-loading all xmm registers before the loop, final stores after the loop
The function is optimized by:
copy_convolve_sse2 64x64 - 16%
copy_convolve_sse2 32x32 - 52%
copy_convolve_sse2 16x16 - 6%
copy_convolve_sse2 8x8 - 2.5%
copy_convolve_sse2 4x4 - 2.7%
Credit goes to Tom Craver (tom.r.craver@intel.com) and Ilya Albrekht (ilya.albrekht@intel.com)
Zoe Liu [Wed, 22 Jul 2015 17:40:42 +0000 (10:40 -0700)]
Code refactor on InterpKernel
In essence, this refactors the code for both the interpolation
filtering and the convolution. This change includes moving
all the files as well as changing the code from the vp9_
prefix to the vpx_ prefix accordingly, for the following architectures:
(1) x86;
(2) arm/neon; and
(3) mips/msa.
The work on mips/dspr2 will be done in a separate change list.
Yunqing Wang [Wed, 29 Jul 2015 20:37:41 +0000 (13:37 -0700)]
Remove tx cache and speed up tx size selection
1. The RD scores obtained during the tx size selection were stored in the
tx cache, and used to help make the tx decision for the following frames.
This is no longer used in the VP9 encoder. Restoring the related
decision-making code from 1.5+ years ago showed no quality gain in borg
tests. This patch removes the cache to lower the complexity.
2. An optimization was done after the above refactoring. If the tx_mode
is not TX_MODE_SELECT, we only need to test the chosen tx size instead
of all possible tx sizes. This gave a 1.5% average speed gain at speed 2,
and a 1% average speed gain at speed 3.
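
The optimization in item 2 can be sketched roughly as follows. This is an illustration under assumptions, not the encoder's code: the SkTxSize enum, the SkRdFn callback, and the table-driven cost are hypothetical stand-ins, and how the chosen size is derived from tx_mode is outside this sketch.

```c
#include <assert.h>

typedef enum { SK_TX_4X4, SK_TX_8X8, SK_TX_16X16, SK_TX_32X32 } SkTxSize;
typedef long (*SkRdFn)(SkTxSize tx_size, void *ctx);

/* Example RD callback: looks up a cost per size and counts evaluations. */
typedef struct {
  const long *rd_by_size;
  int calls;
} SkRdTable;

static long sk_table_rd(SkTxSize tx_size, void *ctx) {
  SkRdTable *t = (SkRdTable *)ctx;
  ++t->calls;
  return t->rd_by_size[tx_size];
}

/* In select mode every size up to max_tx is evaluated; otherwise only the
 * already chosen size is evaluated. */
static SkTxSize sk_pick_tx_size(int tx_mode_is_select, SkTxSize chosen,
                                SkTxSize max_tx, SkRdFn rd, void *ctx) {
  const int start = tx_mode_is_select ? (int)SK_TX_4X4 : (int)chosen;
  const int end = tx_mode_is_select ? (int)max_tx : (int)chosen;
  SkTxSize best;
  long best_rd;
  int n;
  assert(chosen <= max_tx);
  best = (SkTxSize)start;
  best_rd = rd(best, ctx);
  for (n = start + 1; n <= end; ++n) {
    const long r = rd((SkTxSize)n, ctx);
    if (r < best_rd) {
      best_rd = r;
      best = (SkTxSize)n;
    }
  }
  return best;
}
```

With four candidate sizes, the non-select path in this sketch performs one RD evaluation instead of four, which matches the shape of the saving the commit describes.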