granicus.if.org Git - libvpx/commit
Implemented DST 16x16 SSE2 intrinsics optimization
author Yi Luo <luoyi@google.com>
Tue, 8 Mar 2016 22:10:24 +0000 (14:10 -0800)
committer Yi Luo <luoyi@google.com>
Tue, 8 Mar 2016 22:56:38 +0000 (14:56 -0800)
commit 50a164a1f6eb7b32d34b0e9fc40f5f6067fdfb57
tree 6dc4c7687d4fccc2b88a7af79bee5bf931fbb44e
parent cf9c95c32c5cda206a018bfad1321c7e9ac7f623

- Implemented fdst16_sse2() and fdst16_8col() against the C version,
  fdst16().
- Turned on 7 DST-related hybrid txfm types in vp10_fht16x16_sse2().
- Replaced vp10_fht16x16_c() with vp10_fht16x16_sse2() in
  fwd_txfm_16x16().
- Added vp10_fht16x16_sse2() unit test against C version:
  vp10_fht16x16_c() (--gtest_filter=*VP10Trans16x16*).
- Unit test passed.
- Speed improvement: 2.4%, 3.2%, and 3.2% for city_cif.y4m,
  garden_sif.y4m, and mobile_cif.y4m, respectively.
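
A minimal sketch of the kind of check the unit test above performs,
assuming a bit-exact comparison of an optimized routine against its C
reference over random inputs. All names here (toy_stage_c, toy_stage_opt,
count_mismatches, check_bit_exact) are hypothetical and not part of
libvpx; a single 16-point butterfly stage stands in for fdst16(), and the
input range of [-255, 255] is an assumption chosen to avoid 16-bit
overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#define N 16

/* Hypothetical C reference: one add/sub butterfly stage over 16 values. */
static void toy_stage_c(const int16_t *in, int16_t *out) {
  for (int i = 0; i < N / 2; ++i) {
    out[i] = (int16_t)(in[i] + in[i + N / 2]);
    out[i + N / 2] = (int16_t)(in[i] - in[i + N / 2]);
  }
}

/* Hypothetical optimized variant: SSE2 when available, else the C path. */
static void toy_stage_opt(const int16_t *in, int16_t *out) {
#if defined(__SSE2__)
  const __m128i lo = _mm_loadu_si128((const __m128i *)in);
  const __m128i hi = _mm_loadu_si128((const __m128i *)(in + N / 2));
  _mm_storeu_si128((__m128i *)out, _mm_add_epi16(lo, hi));
  _mm_storeu_si128((__m128i *)(out + N / 2), _mm_sub_epi16(lo, hi));
#else
  toy_stage_c(in, out);
#endif
}

/* Runs both versions on `runs` random inputs; returns the mismatch count. */
static int count_mismatches(unsigned seed, int runs) {
  int16_t in[N], ref[N], opt[N];
  int bad = 0;
  srand(seed);
  for (int r = 0; r < runs; ++r) {
    /* Residual-like inputs in [-255, 255], so sums cannot overflow. */
    for (int i = 0; i < N; ++i) in[i] = (int16_t)(rand() % 511 - 255);
    toy_stage_c(in, ref);
    toy_stage_opt(in, opt);
    if (memcmp(ref, opt, sizeof(ref)) != 0) ++bad;
  }
  return bad;
}

/* The optimized path must match the reference on every input. */
static void check_bit_exact(void) {
  assert(count_mismatches(1, 1000) == 0);
}
```

Comparing whole output buffers with memcmp() keeps the check strict: any
rounding or lane-ordering difference in the optimized path shows up as a
mismatch rather than being hidden by a tolerance.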

Change-Id: Ib30a67ce5d5964bef143d588d0f8fa438be8901f
test/test.mk
test/vp10_fht16x16_test.cc [new file with mode: 0644]
vp10/common/vp10_rtcd_defs.pl
vp10/encoder/hybrid_fwd_txfm.c
vp10/encoder/x86/dct_sse2.c