This commit re-designs the per transformed block rate-distortion
costs tracking buffers. It removes redundant buffer usage, makes
the needed context memory allocation per VP9_COMP instance and
reuses the same buffer sets inside the rate-distortion optimization
search loop, thereby avoiding repeatedly requiring memory space.
It reduces speed 0 runtime:
bus at 2000 kbps from 166763ms to 158967ms,
football at 600 kbps from 246614ms to 234257ms.
Both about 5% speed-up. Local tests suggest about 2% to 5% speed-up
for speed 1 and 2 settings. This does not change compression
performance.