granicus.if.org Git - libvpx/log
Yaowu Xu [Fri, 19 Jul 2013 00:50:07 +0000 (17:50 -0700)]
Merge "Do in-place UV intra mode selection."
Yaowu Xu [Fri, 19 Jul 2013 00:49:58 +0000 (17:49 -0700)]
Merge "Change break statement in a 2d loop to a return statement."
Dmitry Kovalev [Fri, 19 Jul 2013 00:29:05 +0000 (17:29 -0700)]
Merge "Using VP9_REF_NO_SCALE instead of (1 << VP9_REF_SCALE_SHIFT)."
Dmitry Kovalev [Thu, 18 Jul 2013 22:12:46 +0000 (15:12 -0700)]
Using VP9_REF_NO_SCALE instead of (1 << VP9_REF_SCALE_SHIFT).
Change-Id: Ide58a74d31ff948319445a6337d2c05e98720e34
Ronald S. Bultje [Thu, 18 Jul 2013 22:10:41 +0000 (15:10 -0700)]
Merge "Remove motion vectors from PARTITION_INFO."
Dmitry Kovalev [Thu, 18 Jul 2013 21:41:52 +0000 (14:41 -0700)]
Merge "Removing unused mv_bias and check_mv_bounds functions."
Dmitry Kovalev [Thu, 18 Jul 2013 21:41:43 +0000 (14:41 -0700)]
Merge "Removing unused members of VP9Decompressor: mbc, prob_skip_false."
Ronald S. Bultje [Thu, 18 Jul 2013 20:09:38 +0000 (13:09 -0700)]
Do in-place UV intra mode selection.
This means we only do UV intra mode selection if we find any intra
mode to actually be useful at all; in addition, we only do UV intra
mode selection for the transform sizes that were selected, rather
than all sizes available in this partition.
First 50 frames of bus @ 1500kbps (speed 0) gains about 5% with this
change.
Change-Id: I7b461eb8b803247f57896c5a9505f745b55502b3
Ronald S. Bultje [Thu, 18 Jul 2013 00:24:33 +0000 (17:24 -0700)]
Change break statement in a 2d loop to a return statement.
The break statement only breaks out of the nested loop, not the
top-level loop, so it doesn't always work as intended. Changing it
to a return statement does what's intended.
Change-Id: I585419823b39a04ec8826b1c8a216099b1728ba7
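The effect described in this commit message can be sketched as follows. This is a minimal illustration, not the libvpx code that was changed: the function names, the flat row-major grid, and the visit counter are all assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical example, not libvpx code: scan a rows x cols grid stored
 * row-major and report how many cells were visited. */

/* With break: a match only ends the inner loop, so the outer loop keeps
 * scanning the remaining rows. */
int visits_with_break(const int *grid, int rows, int cols, int target) {
  int visited = 0;
  assert(grid && rows > 0 && cols > 0);
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < cols; ++c) {
      ++visited;
      if (grid[r * cols + c] == target) break; /* leaves inner loop only */
    }
  }
  return visited;
}

/* With return: a match ends the whole search immediately. */
int visits_with_return(const int *grid, int rows, int cols, int target) {
  int visited = 0;
  assert(grid && rows > 0 && cols > 0);
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < cols; ++c) {
      ++visited;
      if (grid[r * cols + c] == target) return visited; /* leaves both loops */
    }
  }
  return visited;
}
```

On a 3x3 grid holding 1..9 with target 2, the break version still visits 8 cells, while the return version stops after 2.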
Ronald S. Bultje [Wed, 17 Jul 2013 23:46:53 +0000 (16:46 -0700)]
Remove motion vectors from PARTITION_INFO.
The same information already exists in union b_mode_info.
Change-Id: Iac5086b99a3c3cc270380138062bb693e58f9e6d
Ronald S. Bultje [Thu, 18 Jul 2013 19:47:50 +0000 (12:47 -0700)]
Merge "Fix bug where we don't choose any mode in RD selection."
Ronald S. Bultje [Thu, 18 Jul 2013 19:12:48 +0000 (12:12 -0700)]
Fix bug where we don't choose any mode in RD selection.
This could happen during golden overlay frame coding from a previous
alt-ref frame if the special overlay code was triggered.
Change-Id: I3056d0c547cd26903b260ef93c94026e96bd9868
Dmitry Kovalev [Thu, 18 Jul 2013 18:29:34 +0000 (11:29 -0700)]
Removing unused members of VP9Decompressor: mbc, prob_skip_false.
Change-Id: Id5480a4fd56c184ad046c2192b30d190debb3de0
Dmitry Kovalev [Thu, 18 Jul 2013 18:20:48 +0000 (11:20 -0700)]
Removing unused mv_bias and check_mv_bounds functions.
Change-Id: I1558fd969d9ad112bf6480bdd16ef87edd396ab5
Johann [Thu, 18 Jul 2013 17:33:51 +0000 (10:33 -0700)]
Merge "libvpx: enable building for iOS devices (armv7)"
Ami Fischman [Thu, 18 Jul 2013 17:11:01 +0000 (10:11 -0700)]
libvpx: enable building for iOS devices (armv7)
Allow output of gas syntax assembly directly from obj_int_extract
Change-Id: I33a747e87ef1c590a8766dea17f8cb2497e54591
Frank Galligan [Thu, 18 Jul 2013 17:21:21 +0000 (10:21 -0700)]
Merge "Fix horz loopfilter loops"
Ronald S. Bultje [Thu, 18 Jul 2013 17:06:51 +0000 (10:06 -0700)]
Merge "Fix bug which skips zeromv even if near/nearest is not 0,0."
Frank Galligan [Thu, 18 Jul 2013 16:44:15 +0000 (09:44 -0700)]
Fix horz loopfilter loops
If count was greater than 1, the src pointer would be off on
the second loop.
Change-Id: I8e09037e68dc4ae92076a8067f7b6dacbbef8263
Ronald S. Bultje [Thu, 18 Jul 2013 16:34:59 +0000 (09:34 -0700)]
Fix bug which skips zeromv even if near/nearest is not 0,0.
Change-Id: Id4f454831f3f11099f39c30246adeaa52857d08d
Jingning Han [Thu, 18 Jul 2013 00:12:02 +0000 (17:12 -0700)]
Use mv_check_bounds in sub8x8 rd loop
Make the use of mv_check_bounds consistent for mvs of both ref_frame[0]
and ref_frame[1].
Change-Id: I1ca24865cc7232ca9cbe5db566c53abad1592211
Ronald S. Bultje [Wed, 17 Jul 2013 23:11:13 +0000 (16:11 -0700)]
Merge "Add a best_yrd shortcut in splitmv mode search."
Ronald S. Bultje [Wed, 17 Jul 2013 23:10:51 +0000 (16:10 -0700)]
Merge "Skip redundant nearest/near/zero encodes in splitmv."
Ronald S. Bultje [Wed, 17 Jul 2013 23:10:41 +0000 (16:10 -0700)]
Merge "Skip nearest/near/zero redundant encodes."
Ronald S. Bultje [Wed, 17 Jul 2013 23:10:22 +0000 (16:10 -0700)]
Merge "Best_rd breakout in rd partition search."
Yunqing Wang [Wed, 17 Jul 2013 22:47:17 +0000 (15:47 -0700)]
Merge "Remove unnecessary calling of vp9_init_quantizer()"
Yunqing Wang [Wed, 17 Jul 2013 21:59:00 +0000 (14:59 -0700)]
Remove unnecessary calling of vp9_init_quantizer()
vp9_init_quantizer() is called in vp9_create_compressor(), and
should not be called in vp9_set_speed_features().
Change-Id: Ic2f1f4b0531b9d46bb841d7e1d8da9812207dad6
hkuang [Wed, 17 Jul 2013 21:51:53 +0000 (14:51 -0700)]
Merge "Remove unnecessary buffer copy in idct4x4."
Yaowu Xu [Wed, 17 Jul 2013 21:44:40 +0000 (14:44 -0700)]
Merge "changed mode checking order"
Dmitry Kovalev [Wed, 17 Jul 2013 21:44:20 +0000 (14:44 -0700)]
Merge changes Ieffea49e,Idf610746
* changes:
Removing two unused arguments from vp9_inc_mv signature.
Changing signature of vp9_get_pred_probs_tx_size.
Dmitry Kovalev [Wed, 17 Jul 2013 21:43:45 +0000 (14:43 -0700)]
Merge "Removing experimental code from vp9_entropymv.c."
Dmitry Kovalev [Wed, 17 Jul 2013 21:26:34 +0000 (14:26 -0700)]
Merge "Adding read_comp_pred function."
Ronald S. Bultje [Wed, 17 Jul 2013 21:21:44 +0000 (14:21 -0700)]
Add a best_yrd shortcut in splitmv mode search.
Encoding of first 50 frames of bus (speed 0) @ 1500kbps goes from
1min6.2 to 1min5.9, i.e. 0.5% faster overall.
Change-Id: I59d8a3b2f0a75010fa041d5e2646c8caac5bd683
hkuang [Wed, 17 Jul 2013 21:18:59 +0000 (14:18 -0700)]
Remove unnecessary buffer copy in idct4x4.
Change-Id: I386066b9bcfb4bffb582e6827af36ca0181f6a83
Ronald S. Bultje [Wed, 17 Jul 2013 20:53:35 +0000 (13:53 -0700)]
Skip redundant nearest/near/zero encodes in splitmv.
Encode of first 50 frames of bus @ 1500kbps (speed 0) goes from
1min7.3 to 1min6.2, i.e. 1.7% faster overall.
Change-Id: I19d2deacfbffadd61d32551cee9586757ab4a987
Yaowu Xu [Wed, 17 Jul 2013 19:07:48 +0000 (12:07 -0700)]
changed mode checking order
Change-Id: Ic4c4b363ed840935e42f495f13ea5e601a56f1b2
Ronald S. Bultje [Wed, 17 Jul 2013 18:33:15 +0000 (11:33 -0700)]
Skip nearest/near/zero redundant encodes.
Encode of first 50 frames of bus @ 1500kbps (speed 0) goes from 1min12.8
to 1min7.3, i.e. 8% faster.
Change-Id: Ia22d1c7b687316c553cc60eacae988b24e175b62
Yunqing Wang [Wed, 17 Jul 2013 16:37:14 +0000 (09:37 -0700)]
Enable disable_splitmv feature for other speeds
Added disable_splitmv feature at other speed levels. For speed 3 or
above, always turn it on.
Change-Id: Ibb36f0a7ef12a34b4f8d0f9cb6193eab43b34360
Dmitry Kovalev [Wed, 17 Jul 2013 17:25:09 +0000 (10:25 -0700)]
Removing experimental code from vp9_entropymv.c.
Change-Id: I340d06e3bc32c78358654496503cccd4196cbe2e
Johann [Wed, 17 Jul 2013 17:09:00 +0000 (10:09 -0700)]
Merge "vp9_convolve8_neon placeholder"
Ronald S. Bultje [Wed, 17 Jul 2013 16:56:46 +0000 (09:56 -0700)]
Best_rd breakout in rd partition search.
About 15% faster for bus (speed 0) first 50 frames @ 1500kbps, which
goes from 1min36 to 1min24. Results become slightly better (+0.2% on
derf/yt, +0.4% on hd), probably because of a bugfix for skipmode in
super_block_yrd(). Overall speed change (on derfraw300) is roughly
-13%. This can probably be improved further by caching best_yrd
between partition searches. Also, we might be able to get more
speedups by always doing PARTITION_NONE before PARTITIONS_SPLIT, not
just at the sb8x8 level.
Change-Id: I83736949ebd5b4a3b400ee688d7661913fefc98b
Ronald S. Bultje [Wed, 10 Jul 2013 22:18:52 +0000 (15:18 -0700)]
Do a skip-block check for sub8x8 partitions also.
+0.2% SSIM and glbPSNR on derfraw300.
Change-Id: I9cba0bca55e606a22f557c7732b064f738efe84d
Yunqing Wang [Wed, 3 Jul 2013 21:43:23 +0000 (14:43 -0700)]
Speed up motion estimation using small partitions' result(experiment)
Current partition checking starts from small sizes, and then goes up
to large sizes. This experiment uses the small partitions' motion
estimation result, which is already available, to speed up the
large partition's motion estimation. We can decide to skip some
partition checks if they are unlikely choices. We could use the
motion vector (MV) result as current partition's prediction MV, limit
the search range and reference frame.
Current result at speed 1:
psnr loss: 1.19% for stdhd, 0.287% for derf.
speed gain: 14% for sunflower(hd), 11% for akiyo.
Further improvement will be done later.
Change-Id: I5abfd070e9cace2e91e2a0247d1325df313887ab
Johann [Tue, 16 Jul 2013 17:13:06 +0000 (10:13 -0700)]
vp9_convolve8_neon placeholder
Call the individually optimized horizontal and vertical functions. This
implementation abuses the temp buffer.
This will be replaced with a custom optimized function.
Over 2x speedup.
Change-Id: I5b908d2a73d264e9810d6022bbff73207a3055dd
Yaowu Xu [Wed, 17 Jul 2013 14:46:04 +0000 (07:46 -0700)]
Merge "added missed replacement"
Paul Wilkins [Wed, 17 Jul 2013 12:19:26 +0000 (05:19 -0700)]
Merge "Move uv intra mode selection in rd loop."
Paul Wilkins [Wed, 17 Jul 2013 10:40:11 +0000 (03:40 -0700)]
Merge "Limit transform sizes searched for uv intra."
Paul Wilkins [Tue, 16 Jul 2013 17:12:34 +0000 (18:12 +0100)]
Move uv intra mode selection in rd loop.
Use an estimate based on DC_PRED for intra uv cost
within the rd loop then only do a full uv mode analysis
if an intra mode is chosen.
Significant speed gains in some cases. Currently only
enabled for speed 2 pending speed/quality tests.
Change-Id: Ie851a12400d5483bce47ec0e3ccb8516041e91c0
Paul Wilkins [Tue, 16 Jul 2013 14:56:42 +0000 (15:56 +0100)]
Limit transform sizes searched for uv intra.
Apply limit if search_method == USE_LARGESTALL
to the range of UV tx sizes searched.
Change-Id: I6db29f0dd237285ffc50d75a37e8b68151ad821c
Paul Wilkins [Wed, 17 Jul 2013 09:50:09 +0000 (02:50 -0700)]
Merge "Minor cleanup in code to fine uv tx_size."
Dmitry Kovalev [Wed, 17 Jul 2013 04:09:00 +0000 (21:09 -0700)]
Merge "Removing MV_GROUP_UPDATE define and corresponding code."
Jingning Han [Wed, 17 Jul 2013 03:54:25 +0000 (20:54 -0700)]
Merge "Skip redundant motion search in 4x4 level rd loop"
Dmitry Kovalev [Wed, 17 Jul 2013 03:20:25 +0000 (20:20 -0700)]
Adding read_comp_pred function.
Removing old debug code from vp9_decodemv.c.
Change-Id: I51a6d5fe6a2f6583a1555e692bb1ee5a5b315d6c
Jingning Han [Tue, 16 Jul 2013 19:04:07 +0000 (12:04 -0700)]
Skip redundant motion search in 4x4 level rd loop
This commit makes the encoder perform motion search only once
per reference frame type for each 4x4/4x8/8x4 block. For bus_cif
at 2000 kbps, the runtime goes from 253812ms -> 217817ms
(14% speed-up) for speed 0.
Change-Id: I5f17599ccc8cfaf93ccb4f98fcb6008af6d79e92
Yaowu Xu [Wed, 17 Jul 2013 00:12:45 +0000 (17:12 -0700)]
added missed replacement
Change-Id: I2bce6f381fef0729b4dd5eb09ccb609f2eddd7ef
Dmitry Kovalev [Wed, 17 Jul 2013 00:01:08 +0000 (17:01 -0700)]
Removing two unused arguments from vp9_inc_mv signature.
Change-Id: Ieffea49eb7a5e5092f21f8694c546aff69b07c6d
Dmitry Kovalev [Tue, 16 Jul 2013 23:34:54 +0000 (16:34 -0700)]
Changing signature of vp9_get_pred_probs_tx_size.
Removing VP9_COMMON* argument and adding struct tx_probs* instead of
MACROBLOCKD*.
Change-Id: Idf61074631a90ec51eac22c8dcd977f44ac0757c
Dmitry Kovalev [Tue, 16 Jul 2013 22:55:17 +0000 (15:55 -0700)]
Merge "Loop filter code cleanup."
Dmitry Kovalev [Tue, 16 Jul 2013 22:03:00 +0000 (15:03 -0700)]
Removing MV_GROUP_UPDATE define and corresponding code.
Change-Id: I4884cdc2557d25d50c7c4f7e19b1ad8bdb93cd63
Dmitry Kovalev [Tue, 16 Jul 2013 21:47:15 +0000 (14:47 -0700)]
Cleaning up tile code.
Removing tile_rows and tile_columns from VP9Common, removing redundant
constants MIN_TILE_WIDTH and MAX_TILE_WIDTH, changing signature of
vp9_get_tile_n_bits.
Change-Id: I8ff3104a38179b2c6900df965c144c1d6f602267
Dmitry Kovalev [Tue, 16 Jul 2013 21:39:31 +0000 (14:39 -0700)]
Loop filter code cleanup.
Cosmetic code changes, renaming 'flat' local var to 'mask', removing
unused field 'blim' from loopfilter_info_n and loop_filter_info structs.
Change-Id: I51e6ccf727fe361ad9a08e29e1201aa7abd4987f
James Zern [Tue, 16 Jul 2013 21:25:32 +0000 (14:25 -0700)]
Merge changes I40454d26,I892e76d5,I865ab3f9,I4a4bec17,I61c4351e,I37eb3559,I1031c556,I8c8f1f42
* changes:
delete vp9_loopfilter_sse2.asm
vp9_loopfilter_intrin_sse2: cosmetics: fix indent
delete x86/vp9_loopfilter_x86.h
vp9_loopfilter_intrin_sse2: make some funcs static
vp9_loopfilter_intrin_sse2: remove unused uv funcs
vp9_loopfilter: remove uv function typedef
filter_block_plane: reuse some constants
vp9_loopfilter.c: make some functions static
James Zern [Tue, 16 Jul 2013 21:22:52 +0000 (14:22 -0700)]
Merge "use consistent framerate naming"
James Zern [Sat, 13 Jul 2013 00:12:46 +0000 (17:12 -0700)]
use consistent framerate naming
s/frame_rate/framerate/g
Change-Id: I6fc3e088e419c5f46e3a9390dd8a2cad2677a2fc
Jingning Han [Tue, 16 Jul 2013 21:04:04 +0000 (14:04 -0700)]
Merge "SSE2 16x16 inverse ADST/DCT hybrid transform"
Dmitry Kovalev [Tue, 16 Jul 2013 20:34:42 +0000 (13:34 -0700)]
Merge "Rewriting vp9_set_pred_flag_{seg_id, mbskip}."
Dmitry Kovalev [Tue, 16 Jul 2013 20:26:53 +0000 (13:26 -0700)]
Merge "Moving vp9_kf_default_bmode_probs to vp9_entropymode.c."
James Zern [Sun, 14 Jul 2013 02:08:13 +0000 (19:08 -0700)]
delete vp9_loopfilter_sse2.asm
sse2 functions are provided by vp9_loopfilter_intrin_sse2.c
Change-Id: I40454d26034e3ef915eeaf889937fe7d1b519b9b
James Zern [Sun, 14 Jul 2013 02:07:20 +0000 (19:07 -0700)]
vp9_loopfilter_intrin_sse2: cosmetics: fix indent
Change-Id: I892e76d5ad1443b2ea0d1a7839fe26afe9c68ffb
James Zern [Sun, 14 Jul 2013 01:50:55 +0000 (18:50 -0700)]
delete x86/vp9_loopfilter_x86.h
also remove prototype_loopfilter{,_block} defines from vp9_loopfilter.h
Change-Id: I865ab3f9436c7b1ca166f76630328abf01389405
James Zern [Tue, 16 Jul 2013 20:00:14 +0000 (13:00 -0700)]
Merge "vp9: remove frames_{since,till}.. from MACROBLOCKD"
James Zern [Tue, 16 Jul 2013 19:55:42 +0000 (12:55 -0700)]
Merge "Cosmetic changes in 4x4 and 8x8 fdct unit tests"
Jingning Han [Mon, 15 Jul 2013 18:05:31 +0000 (11:05 -0700)]
SSE2 16x16 inverse ADST/DCT hybrid transform
This commit enables SSE2 implementation of 16x16 inverse ADST/DCT
hybrid transform. The runtime goes from 5742 cycles -> 1821 cycles.
This provides about 1% encoding speed-up at speed 0.
Change-Id: I1678d0988bf30b9efd524877705bbb3645edb17b
James Zern [Tue, 16 Jul 2013 19:17:04 +0000 (12:17 -0700)]
Merge "VP[89]_COMMON: remove unused near_boffset"
James Zern [Tue, 16 Jul 2013 19:16:37 +0000 (12:16 -0700)]
Merge "VP9_COMMON: remove unused framerate/bitrate"
James Zern [Tue, 16 Jul 2013 19:16:04 +0000 (12:16 -0700)]
Merge "yv12config: remove YUV_TYPE"
Ronald S. Bultje [Tue, 16 Jul 2013 19:07:17 +0000 (12:07 -0700)]
Merge "Replace generated quant tables with static lookup tables."
Ronald S. Bultje [Tue, 16 Jul 2013 18:01:18 +0000 (11:01 -0700)]
Replace generated quant tables with static lookup tables.
This prevents possible float rounding issues between architectures.
Change-Id: I6ed260aebd49feb4cfb5596a5370c44be5f72167
John Koleszar [Tue, 16 Jul 2013 18:23:38 +0000 (11:23 -0700)]
Merge "Fix above context pointers"
Jingning Han [Tue, 16 Jul 2013 18:00:11 +0000 (11:00 -0700)]
Merge "SSE2 8x8 inverse ADST/DCT transform"
Dmitry Kovalev [Tue, 16 Jul 2013 17:54:34 +0000 (10:54 -0700)]
Moving vp9_kf_default_bmode_probs to vp9_entropymode.c.
Removing vp9_modelcontext.c.
Change-Id: If2316c58dead2708d9f95b52d9494ba4c1dd7427
Dmitry Kovalev [Tue, 16 Jul 2013 17:44:48 +0000 (10:44 -0700)]
Rewriting vp9_set_pred_flag_{seg_id, mbskip}.
Making implementation of vp9_set_pred_flag_{seg_id, mbskip} consistent
with vp9_get_segment_id without using confusing sub(a, b) macro. Passing
mi_row and mi_col to functions explicitly instead of relying on
mb_to_right_edge and mb_to_bottom_edge.
Change-Id: I54c1087dd2ba9036f8ba7eb165b073e807d00435
Paul Wilkins [Tue, 16 Jul 2013 15:58:37 +0000 (16:58 +0100)]
Minor cleanup in code to fine uv tx_size.
Change-Id: I94b97a966b5efbc9a243048f1f5ddbbdc4b1846e
John Koleszar [Tue, 16 Jul 2013 17:20:56 +0000 (10:20 -0700)]
Fix above context pointers
In the prior code, the above context pointers used for entropy
decoding were initialized on the first frame, and not updated when
the frame size changed. The per-frame code which initializes the
contexts assumes that the contexts are contiguous, leading to an
incomplete initialization when the frame is smaller. This commit
updates the pointers so that the context is contiguous whenever
the frame size changes.
Change-Id: I08b53e3a30c8289491212311682ff1b8028cff6c
Johann [Tue, 16 Jul 2013 16:42:52 +0000 (09:42 -0700)]
Merge "vp9_convolve8_[horiz|vert]_avg"
Jingning Han [Tue, 16 Jul 2013 16:03:38 +0000 (09:03 -0700)]
Merge "Skip inter-coded block reconstruction in rd loop"
Dmitry Kovalev [Tue, 16 Jul 2013 07:52:53 +0000 (00:52 -0700)]
Merge "Removing and moving around constant definitions."
Yaowu Xu [Tue, 16 Jul 2013 04:35:32 +0000 (21:35 -0700)]
Merge "Change to extend full border only when needed"
Yaowu Xu [Mon, 15 Jul 2013 21:59:59 +0000 (14:59 -0700)]
Change to extend full border only when needed
This is a short term optimization till we work out a decoder
implementation requiring no frame border extension.
Change-Id: I02d15bfde4d926b50a4e58b393d8c4062d1be70f
Dmitry Kovalev [Mon, 15 Jul 2013 19:26:58 +0000 (12:26 -0700)]
Removing and moving around constant definitions.
Removing unused and duplicated constants, moving them from *.h to *.c
if possible.
Change-Id: Ief4d6b984a3ca2e9b38504f0d855ed072cf7133f
Dmitry Kovalev [Tue, 16 Jul 2013 02:21:32 +0000 (19:21 -0700)]
Merge "Consistent naming for loop-filter filters."
Johann [Tue, 16 Jul 2013 01:43:41 +0000 (18:43 -0700)]
Merge "Remove print_nmvcounts"
Ronald S. Bultje [Fri, 12 Jul 2013 19:59:19 +0000 (12:59 -0700)]
Increase border size from 96 to 160.
This is required because upon downscaling, if a motion vector points
partially into the UMV (e.g. all minus 1 of 64+7 pixels, i.e. 70),
then we can point up to 140 pixels into the larger-resolution (2x)
reference buffer UMV, which means the UMV for reference buffers in
downscaling needs to be 140 rounded up to the nearest multiple of 32,
i.e. 160.
Longer-term, we should probably handle the UMV differently by detecting
edge coverage on-the-fly and using a temporary buffer for edge extensions
instead of adding 160 pixels on all sides of the image (which means a
CIF image uses 3x its own area size for borders).
Change-Id: I5184443e6731cd6721fc6a5d430a53e7d91b4f7e
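The arithmetic in this commit message (64+7 minus 1 gives 70, doubled gives 140, rounded up to a multiple of 32 gives 160) can be checked with a short sketch. The function and parameter names are hypothetical labels for the numbers in the message, not libvpx identifiers.

```c
#include <assert.h>

/* Round value up to the next multiple of align.
 * Assumption for this sketch: align is a power of two. */
int round_up_pow2(int value, int align) {
  assert(align > 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}

/* Hypothetical helper mirroring the commit message:
 *   block_pixels + extra_pixels - 1  -> furthest reach into the UMV (70)
 *   times scale                      -> reach in the larger reference (140)
 *   rounded up to a multiple of align -> required border (160) */
int border_for_scaled_reference(int block_pixels, int extra_pixels,
                                int scale, int align) {
  int max_reach = block_pixels + extra_pixels - 1;
  return round_up_pow2(max_reach * scale, align);
}
```

Calling border_for_scaled_reference(64, 7, 2, 32) yields 160, matching the new border size; the same call with scale 1 yields 96, the previous value.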
Ronald S. Bultje [Thu, 11 Jul 2013 20:01:44 +0000 (13:01 -0700)]
Inline vp9_quantize() in xform_quant().
Cycle times:
4x4: 151 to 131 cycles (15% faster)
8x8: 334 to 306 cycles (9% faster)
16x16: 1401 to 1368 cycles (2.5% faster)
32x32: 7403 to 7367 cycles (0.5% faster)
Total encode time of first 50 frames of bus @ 1500kbps (speed 0)
goes from 1min39.2 to 1min38.6, i.e. a 0.67% overall speedup.
Change-Id: I799a49460e5e3fcab01725564dd49c629bfe935f
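The per-size percentages in this message are consistent with reading "N% faster" as old/new - 1 (for example, 151/131 is about 1.15) rather than as the fraction of cycles removed. A minimal sketch of that reading, with a hypothetical function name:

```c
#include <assert.h>

/* Speedup in percent, computed as old/new - 1.
 * This is an interpretation that matches the reported figures,
 * not a formula stated in the commit message. */
double speedup_percent(double old_cycles, double new_cycles) {
  assert(old_cycles > 0.0 && new_cycles > 0.0);
  return (old_cycles / new_cycles - 1.0) * 100.0;
}
```

For the 4x4 case, speedup_percent(151, 131) comes out slightly above 15, and for 8x8, speedup_percent(334, 306) comes out slightly above 9.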
Ronald S. Bultje [Tue, 16 Jul 2013 00:29:39 +0000 (17:29 -0700)]
Merge "Inline xform_quant() in encode_block_intra()."
Frank Galligan [Tue, 16 Jul 2013 00:11:55 +0000 (17:11 -0700)]
Merge "Neon: Update mbfilter if all vectors follow one branch."
Dmitry Kovalev [Mon, 15 Jul 2013 23:01:31 +0000 (16:01 -0700)]
Consistent naming for loop-filter filters.
Renaming flatmask4 to flat_mask4, flatmask5 to flat_mask5, hevmask to
hev_mask, filter to filter4, mbfilter to filter8, wide_mbfilter to
filter16.
Change-Id: Ic61c73e59c2eee505257584867aafac99833cea1
Ronald S. Bultje [Thu, 11 Jul 2013 18:35:13 +0000 (11:35 -0700)]
Inline xform_quant() in encode_block_intra().
Also inline some of the block calculations to assist the compiler to
not do silly things like calculating the same offset (or converting
between raster/transform block offset or block, mi and pixel unit)
many, many, many times.
Cycle times:
4x4: 584 -> 505 cycles (16% faster)
8x8: 1651 -> 1560 cycles (6% faster)
16x16: 7897 -> 7704 cycles (2.5% faster)
32x32: 16096 -> 15852 cycles (1.5% faster)
Overall, this saves about 0.5 seconds (1min49.8 -> 1min49.3) on the
first 50 frames of bus (speed 0) @ 1500kbps, i.e. 0.5% overall.
Change-Id: If3dd62453f8e2ab9d4ee616bc4ea956fb8874b80
Dmitry Kovalev [Mon, 15 Jul 2013 21:47:25 +0000 (14:47 -0700)]
Code cleanup inside vp9_decodeframe.c.
Removing unused DEC_DEBUG define and dec_debug variable. Changing function
signatures to eliminate code duplication, renaming function
mb_init_dequantizer to init_dequantizer. Also removing redundant curly
braces, and comments.
Change-Id: Ia56ee1b0be5f24abb0e878581845be8a4773c298
Frank Galligan [Fri, 12 Jul 2013 00:13:03 +0000 (17:13 -0700)]
Neon: Update mbfilter if all vectors follow one branch.
Change the mbfilter Neon code so that it no longer executes both
branches when all vectors follow only one branch.
The code is about 5% faster when executing only one branch and about
1% slower when executing both branches.
-PS5: Remove local stack space from mbfilter.
Change-Id: I6a23f9b318a9f4568a2718b4c9348db988fe2182