From: Peter de Rivaz
Date: Thu, 11 Dec 2014 15:54:23 +0000 (+0000)
Subject: Corrected optimization of 8x8 DCT code
X-Git-Tag: v1.4.0~382^2
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=5c22224e9e43befc414cce2cf163616c9d56b0d3;p=libvpx

Corrected optimization of 8x8 DCT code

The 8x8 DCT uses a fast version whenever possible. There was a mistake in the checking code which meant sometimes the fast version was used when it was not safe to do so.

Change-Id: I154c84c9e2d836764768a11082947ca30f4b5ab7
(cherry picked from commit fd05fb0c21e253b4d6f92d7e0b752850ff8ab188)
---

diff --git a/vp9/common/x86/vp9_idct_intrin_sse2.c b/vp9/common/x86/vp9_idct_intrin_sse2.c
index 3610c7165..42e0baa05 100644
--- a/vp9/common/x86/vp9_idct_intrin_sse2.c
+++ b/vp9/common/x86/vp9_idct_intrin_sse2.c
@@ -4260,7 +4260,7 @@ void vp9_highbd_idct8x8_10_add_sse2(const tran_low_t *input, uint8_t *dest8,
     // N.B. Only first 4 cols contain non-zero coeffs
     max_input = _mm_max_epi16(inptr[0], inptr[1]);
     min_input = _mm_min_epi16(inptr[0], inptr[1]);
-    for (i = 2; i < 4; i++) {
+    for (i = 2; i < 8; i++) {
       max_input = _mm_max_epi16(max_input, inptr[i]);
       min_input = _mm_min_epi16(min_input, inptr[i]);
     }
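
The effect of the loop bound can be illustrated with a scalar sketch. This is not the libvpx code: `range_is_safe`, its `rows_checked` and `limit` parameters, and the flat row-major layout are hypothetical stand-ins for the SSE2 min/max scan over `inptr[]`. It only shows how a scan that stops after 4 rows can miss an out-of-range coefficient in a later row and so wrongly allow the fast path.

```c
#include <assert.h>
#include <stdint.h>

enum { kRows = 8, kCols = 8 };

/* Hypothetical scalar stand-in for the SSE2 min/max scan.
 * `in` holds kRows * kCols coefficients in row-major order.
 * Returns 1 if every coefficient in the first `rows_checked` rows
 * lies in [-limit, limit], 0 otherwise. */
static int range_is_safe(const int16_t *in, int rows_checked, int16_t limit) {
  int16_t max_v, min_v;
  int i;

  assert(rows_checked >= 1 && rows_checked <= kRows);
  assert(limit >= 0);

  /* Seed both extremes with the first coefficient, then widen them. */
  max_v = min_v = in[0];
  for (i = 1; i < rows_checked * kCols; i++) {
    if (in[i] > max_v) max_v = in[i];
    if (in[i] < min_v) min_v = in[i];
  }

  /* The fast path is only considered safe when both extremes are in range. */
  return max_v <= limit && min_v >= -limit;
}
```

With an all-zero block except for a value of 30000 at row 5, `range_is_safe(block, 4, 6000)` reports the block as safe, while `range_is_safe(block, 8, 6000)` rejects it. The limit of 6000 is an arbitrary value chosen for the illustration.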