Michael Bebenita [Thu, 25 Aug 2016 21:40:54 +0000 (14:40 -0700)]
Bit accounting.
This patch adds bit accounting infrastructure to the bit reader API.
When configured with --enable-accounting, every bit reader API
function records the number of bits necessary to decode a symbol.
Accounting symbol entries are collected in a global accounting data
structure that can be used to understand exactly where bits are
spent (http://aomanalyzer.org). The data structure is cleared and
reused each frame to reduce memory usage. When configured without
--enable-accounting, bit accounting does not incur any runtime
overhead.
All aom_read_xxx functions now have an additional string parameter
that specifies the symbol name. By default, the ACCT_STR macro is
used (which expands to __func__). For more precise accounting,
these should be replaced with more descriptive names.
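As a rough illustration of the idea, the sketch below shows a bit reader function that takes an extra symbol-name string and records how many bits each call consumed. Every name here (SketchReader, sketch_read_literal, SKETCH_ACCT_STR, SKETCH_ACCOUNTING) is hypothetical and only stands in for the real aom_read_xxx functions, ACCT_STR and --enable-accounting; the actual data layout in the patch may differ.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; none of this is the libaom API. */
#ifndef SKETCH_ACCOUNTING
#define SKETCH_ACCOUNTING 1 /* stands in for --enable-accounting */
#endif
#define SKETCH_ACCT_STR __func__ /* default symbol name: the calling function */
#define SKETCH_MAX_SYMBOLS 64

typedef struct {
  const char *name; /* symbol name supplied by the caller */
  unsigned bits;    /* bits consumed while reading the symbol */
} SketchSymbol;

typedef struct {
  SketchSymbol syms[SKETCH_MAX_SYMBOLS];
  size_t count;
} SketchAccounting;

typedef struct {
  const unsigned char *buf;
  size_t bit_pos;
  SketchAccounting *acct;
} SketchReader;

/* Cleared and reused each frame instead of being reallocated. */
static void sketch_accounting_reset(SketchAccounting *a) { a->count = 0; }

/* Reads nbits (MSB first) and records the cost under acct_str. */
static unsigned sketch_read_literal(SketchReader *r, unsigned nbits,
                                    const char *acct_str) {
  const size_t start = r->bit_pos;
  unsigned value = 0;
  assert(nbits <= 16);
  for (unsigned i = 0; i < nbits; ++i) {
    const unsigned byte = r->buf[r->bit_pos >> 3];
    value = (value << 1) | ((byte >> (7 - (r->bit_pos & 7))) & 1u);
    r->bit_pos++;
  }
#if SKETCH_ACCOUNTING
  if (r->acct != NULL && r->acct->count < SKETCH_MAX_SYMBOLS) {
    SketchSymbol *s = &r->acct->syms[r->acct->count++];
    s->name = acct_str;
    s->bits = (unsigned)(r->bit_pos - start);
  }
#else
  /* Accounting compiled out: no bookkeeping is performed. */
  (void)acct_str;
  (void)start;
#endif
  return value;
}

/* A caller that relies on the default name, i.e. its own function name. */
static unsigned sketch_read_mode(SketchReader *r) {
  return sketch_read_literal(r, 3, SKETCH_ACCT_STR);
}
```

Passing a literal such as "tx_size" instead of SKETCH_ACCT_STR would give the more descriptive per-symbol names the message recommends.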
Sarah Parker [Tue, 11 Oct 2016 19:29:07 +0000 (12:29 -0700)]
Fix ransac random generator seeding
Ransac's get_rand_indices originally used rand_r seeded with the
same value every time, producing the same random sequence at every
iteration. This causes the global motion parameters to be slightly
less accurate because ransac cannot improve the model fit after
the first attempt.
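The effect of the seeding bug can be sketched as follows. This is not the ransac code itself: sketch_rand_r is a hypothetical stand-in for rand_r (a small linear congruential generator, so the example is portable), and sketch_get_rand_indices is a simplified index picker that allows repeated indices. The point is only that the seed state must be owned by the caller and carried across iterations rather than reset each time.

```c
#include <assert.h>

/* Hypothetical stand-in for rand_r(): advances *seed and returns 15 bits. */
static unsigned sketch_rand_r(unsigned *seed) {
  *seed = *seed * 1103515245u + 12345u;
  return (*seed >> 16) & 0x7fffu;
}

/* Draws n indices in [0, max) using caller-owned seed state.
 * Simplification: unlike a real sampler, repeats are not filtered out. */
static void sketch_get_rand_indices(int max, int *indices, int n,
                                    unsigned *seed) {
  assert(max > 0);
  for (int i = 0; i < n; ++i) {
    indices[i] = (int)(sketch_rand_r(seed) % (unsigned)max);
  }
}
```

If each iteration starts from a fresh copy of the same seed, every iteration draws the identical index set; if the same seed variable is passed on every call, the state keeps advancing and later iterations can sample different points.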
This function is called after `super_block_yrd` and assumes that the dst
buffer is correct, but that is no longer always the case after
daf841b4a10ece1b6831300d79f271d00f9d027b, since we don't call
`txfm_rd_in_plane` after the RDO loop in `choose_tx_size_from_rd`.
We could fix this by always saving and restoring the dst buffer but
removing `rd_variance_adjustment` is a better solution:
- Getting the dst buffer always right is tricky, as demonstrated by the
fact that it is wrong now; even if we fix it now, we could break it later
and not notice.
- Perceptual weighting is a good idea but `rd_variance_adjustment` is the
wrong approach as it weights both the rate and the distortion.
To get meaningful units you should only weight the distortion;
weighting the rate means that we pretend some bits cost less than other
bits, which is not the case. The distortion weighting approach is
implemented by Daala in `od_compute_dist` and we plan to experiment
with this in AV1 too.
- Removing `rd_variance_adjustment` improves coding efficiency on all
metrics, here are the results for objective-1-fast using the Low
Latency settings:
This array was allocated and used to save and restore segmentation map,
however the original segmentation map was never modified between the
calls to save and restore.
This commit ports the following from aom/master:
4c46278 Add aom_reader_tell() support.
b9c9935 Remove an erroneous declaration.
56c9c3b Fix ANS build.
Peter de Rivaz [Tue, 18 Oct 2016 10:47:56 +0000 (11:47 +0100)]
Correction to costing rect_tx
When built with var_tx and ext_tx, select_tx_size_fix_type is used
to compute the cost for using a particular tx_type.
The code indexes the array inter_tx_type_costs at the wrong location,
resulting in a zero cost for signalling tx_type for rect_tx blocks.
Nathan E. Egge [Thu, 18 Aug 2016 06:34:53 +0000 (02:34 -0400)]
Fix warning when discarding const qualifier.
Cherry-pick Daala 211c2a41: Clean up EC tell() and tell_frac() functions.
Add a const qualifier to the od_ec_enc and od_ec_dec parameters of
the od_ec_enc_tell(), od_ec_enc_tell_frac(), od_ec_dec_tell(), and
od_ec_dec_tell_frac() functions.
Add an OD_WARN_UNUSED_RESULT to od_ec_enc_tell_frac().
Nathan E. Egge [Tue, 21 Jun 2016 03:01:29 +0000 (23:01 -0400)]
Revert code formatting of OD_UNIFORM_CDFS_Q15.
The formatting of OD_UNIFORM_CDFS_Q15[] in entcode.c is helpful for
understanding what is contained in the array (e.g., the uniform
probability distributions of small sizes 2 through 16).
This patch reverts the change made in f4b2926d and adds linter hints to
ignore the formatting.
Yushin Cho [Tue, 21 Jun 2016 21:51:23 +0000 (14:51 -0700)]
Bug fix in super_block_uvrd().
In super_block_uvrd(), if is_cost_valid == 0, all return parameters,
i.e. rate, distortion, skippable, and sse, are reset.
So txfm_rd_in_plane() should not be called if is_cost_valid == 0.
Also, the bug causes av1_xform_quant() to see an invalid diff signal
since av1_subtract_plane() is not called in super_block_uvrd().
Yaowu Xu [Fri, 14 Oct 2016 23:42:22 +0000 (23:42 +0000)]
Merge changes I339d0389,I2fa1e87a,If79fa5ae,Icb1a8cb8,Ic76de4a4, ... into nextgenv2
* changes:
Add missing CONFIG_DAALA_EC declaration.
Add API for writing trees using a CDF.
Add macro to build a simple cdf table.
Use Daala entropy coder to code trees.
Silence clang-format code review warning.
Use Daala entropy coder to code bits.
Clear existing format issue in the codebase
Add Daala entropy coder.
Nathan E. Egge [Sun, 19 Jun 2016 18:38:04 +0000 (14:38 -0400)]
Add code to compute in-order mappings for tokens.
Add av1_indices_from_tree() function that computes a forward and inverse
mapping of the tree leaf-node symbols to their in-order traversal.
This is necessary because many of the aom_tree binary trees have their
leaf nodes out of order (e.g., an in-order traversal of a tree with n
nodes does not start at symbol 0 and go to symbol n - 1), but the CDFs
created by tree_to_cdf() are indexed in-order.
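A minimal sketch of such a mapping is shown below. It assumes the usual tree encoding in which the array holds pairs of entries, a positive entry is the array index of a child pair, and a non-positive entry is the negated leaf symbol. The names sketch_tree_index, sketch_walk and sketch_indices_from_tree are hypothetical, and the real av1_indices_from_tree() may use a different signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int8_t sketch_tree_index; /* assumed to mirror aom_tree_index */

/* Visits the node pair at tree[i], left branch first, and returns the next
 * free in-order position. ind[] maps position -> symbol, inv[] the reverse. */
static int sketch_walk(const sketch_tree_index *tree, int i, int pos,
                       int *ind, int *inv) {
  for (int branch = 0; branch < 2; ++branch) {
    const int t = tree[i + branch];
    if (t > 0) {
      pos = sketch_walk(tree, t, pos, ind, inv); /* interior node */
    } else {
      ind[pos] = -t; /* leaf: entry is the negated symbol */
      inv[-t] = pos;
      ++pos;
    }
  }
  return pos;
}

/* Fills both mappings and returns the number of leaf symbols found. */
static int sketch_indices_from_tree(int *ind, int *inv,
                                    const sketch_tree_index *tree) {
  assert(tree != NULL && ind != NULL && inv != NULL);
  return sketch_walk(tree, 0, 0, ind, inv);
}
```

For the tree { -2, 2, 0, -1 } the leaves are visited in the order 2, 0, 1, so symbol 2 lands at in-order position 0 even though it is not symbol 0.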
Nathan E. Egge [Mon, 20 Jun 2016 17:44:22 +0000 (13:44 -0400)]
Add API for writing trees using a CDF.
Added aom_write_tree_cdf() and aom_read_tree_cdf() function calls to
bitwriter.h and bitreader.h respectively.
These calls take a multisymbol CDF and an index and directly encode the
symbol using the enabled entropy coder.
Currently only the daala entropy encoder supports this (enabled with
--enable-daala_ec) and a compile error is thrown otherwise.
Nathan E. Egge [Thu, 19 May 2016 16:11:56 +0000 (12:11 -0400)]
Add macro to build a simple cdf table.
Add the av1_tree_to_cdf() macro which takes an aom_tree_index tree and
associated aom_prob probabilities and constructs a Daala uint16_t CDF.
The av1_tree_to_cdf_1D() and av1_tree_to_cdf_2D() apply av1_tree_to_cdf()
across 1D and 2D arrays respectively.
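The conversion can be sketched roughly as below, written as a function rather than a macro for readability. The assumptions are: each 8-bit probability is the chance (out of 256) of taking the first branch of its node, the CDF is cumulative and scaled so the final entry is 32768, and entries are emitted in in-order leaf order. All sketch_ names are hypothetical, and the real macro may round or clamp differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int8_t sketch_tree_index; /* assumed to mirror aom_tree_index */
typedef uint8_t sketch_prob;      /* P(first branch), out of 256 */

/* Splits `mass` between the two branches of the node pair at tree[i] and
 * appends cumulative totals for each leaf reached, in in-order sequence. */
static int sketch_cdf_walk(const sketch_tree_index *tree,
                           const sketch_prob *probs, int i, uint32_t mass,
                           uint32_t *acc, uint16_t *cdf, int pos) {
  const uint32_t first = (mass * probs[i >> 1]) >> 8;
  const uint32_t share[2] = { first, mass - first };
  for (int branch = 0; branch < 2; ++branch) {
    const int t = tree[i + branch];
    if (t > 0) {
      pos = sketch_cdf_walk(tree, probs, t, share[branch], acc, cdf, pos);
    } else {
      *acc += share[branch];
      cdf[pos++] = (uint16_t)*acc;
    }
  }
  return pos;
}

/* Builds the CDF and returns the number of symbols written. */
static int sketch_tree_to_cdf(const sketch_tree_index *tree,
                              const sketch_prob *probs, uint16_t *cdf) {
  uint32_t acc = 0;
  assert(tree != NULL && probs != NULL && cdf != NULL);
  return sketch_cdf_walk(tree, probs, 0, 32768u, &acc, cdf, 0);
}
```

Because the second branch always receives the remainder of the mass, the last CDF entry is exactly 32768 regardless of rounding in the first branch.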
Nathan E. Egge [Sun, 6 Mar 2016 18:41:53 +0000 (13:41 -0500)]
Use Daala entropy coder to code trees.
When building with --enable-daala_ec, calls to aom_write_tree() and
aom_read_tree() will convert an aom_tree_index structure with associated
aom_prob probabilities into a CDF on the fly for use with
od_ec_encode_cdf_q15().
The number of symbols in the CDF is capped at 16, and trees that contain
more than 16 leaf nodes are handled by splitting the most likely, i.e.,
highest probability, symbols first and coding multiple symbols if
necessary.
ntt-short-1:
           MEDIUM (%)   HIGH (%)
PSNR       0.000227     0.000213
PSNRHVS    0.000215     0.000205
SSIM       0.000229     0.000209
FASTSSIM   0.000229     0.000214
Nathan E. Egge [Sun, 6 Mar 2016 17:42:47 +0000 (12:42 -0500)]
Use Daala entropy coder to code bits.
When building with --enable-daala_ec, calls to aom_write() and aom_read()
use the daala entropy coder to write and read bits.
When the probability is exactly 0.5 (128), then raw bits are used.
ntt-short-1:
           MEDIUM (%)   HIGH (%)
PSNR       -0.027556    -0.020114
PSNRHVS    -0.027401    -0.020169
SSIM       -0.027587    -0.020151
FASTSSIM   -0.027592    -0.020102
Yaowu Xu [Fri, 14 Oct 2016 15:47:03 +0000 (08:47 -0700)]
Revert "Revert "Move CLPF block signals from frame to SB level.""
This reverts commit 9b25f3067485b32442e13964df098903736c3fd8 to
reinstate the reverted commit with fixes that solved the build issues
when --enable-clpf is used in configure.
These signals were in the uncompressed frame header (as a temporary
hack), which caused two problems:
* We don't want that header to be duplicated in the slice header
* It was necessary to signal the number of bits to transmit up front
However, the filter size can be 128x128 which is greater than the SB
size, and a decoder wouldn't be able to know whether to read a bit or
not until the final SB of that 128x128 block has been decoded
(depending on whether the 128x128 is all skip or not). Therefore the
signalling was changed for 128x128 blocks so that every top left SB of
a 128x128 filter block contains a signal regardless of whether the
block is all skip or not. Also, all the MB's of 128x128 block are
filtered even if they are skip MB's. This gives the signal a purpose
even when the 128x128 block is all skip, and it also gives a slight
coding gain as it leaves a way to filter skip blocks, which was
previously forbidden.