1. ASCII text. This is the easiest one to generate. The file is divided into
sections, which correspond to each of the functions with profile
- information. The format is described below.
+ information. The format is described below. It can also be generated from
+ the binary or gcov formats using the ``llvm-profdata`` tool.
2. Binary encoding. This uses a more efficient encoding that yields smaller
- profile files, which may be useful when generating large profiles. It can be
- generated from the text format using the ``llvm-profdata`` tool.
+ profile files. This is the format generated by the ``create_llvm_prof`` tool
+ in http://github.com/google/autofdo.
3. GCC encoding. This is based on the gcov format, which is accepted by GCC. It
- is only interesting in environments where GCC and Clang co-exist. Similarly
- to the binary encoding, it can be generated using the ``llvm-profdata`` tool.
+ is only interesting in environments where GCC and Clang co-exist. This
+ encoding is only generated by the ``create_gcov`` tool in
+ http://github.com/google/autofdo. It can be read by LLVM and
+ ``llvm-profdata``, but it cannot be generated by either.
If you are using Linux Perf to generate sampling profiles, you can use the
conversion tool ``create_llvm_prof`` described in the previous section.
.. code-block:: console
function1:total_samples:total_head_samples
- offset1[.discriminator]: number_of_samples [fn1:num fn2:num ... ]
- offset2[.discriminator]: number_of_samples [fn3:num fn4:num ... ]
- ...
- offsetN[.discriminator]: number_of_samples [fn5:num fn6:num ... ]
+ offset1[.discriminator]: number_of_samples [fn1:num fn2:num ... ]
+ offset2[.discriminator]: number_of_samples [fn3:num fn4:num ... ]
+ ...
+ offsetN[.discriminator]: number_of_samples [fn5:num fn6:num ... ]
+ offsetA[.discriminator]: fnA:num_of_total_samples
+  offsetA1[.discriminator]: number_of_samples [fn7:num fn8:num ... ]
+  offsetA2[.discriminator]: number_of_samples [fn9:num fn10:num ... ]
+ offsetB[.discriminator]: fnB:num_of_total_samples
+  offsetB1[.discriminator]: number_of_samples [fn11:num fn12:num ... ]
+
+This is a nested tree in which the indentation represents the nesting level
+of the inline stack. There are no blank lines in the file, and the spacing
+within a single line is fixed. Additional spaces will result in an error
+while reading the file.
+
+Any line starting with the '#' character is completely ignored.
-The file may contain blank lines between sections and within a
-section. However, the spacing within a single line is fixed. Additional
-spaces will result in an error while reading the file.
+Inlined calls are represented with indentation. The inline stack is a
+stack of source locations in which the top of the stack represents the
+leaf function, and the bottom of the stack represents the actual
+symbol to which the instruction belongs.
Function names must be mangled in order for the profile loader to
match them in the current translation unit. The two numbers in the
in the prologue of the function (second number). This head sample
count provides an indicator of how frequently the function is invoked.
+There are two types of lines in the function body.
+
+- A sampled line represents the profile information of a source location.
+ ``offsetN[.discriminator]: number_of_samples [fn5:num fn6:num ... ]``
+
+- A callsite line represents the profile information of an inlined callsite.
+ ``offsetA[.discriminator]: fnA:num_of_total_samples``
+
Each sampled line may contain several items. Some are optional (marked
below):
instruction that calls one of ``foo()``, ``bar()`` and ``baz()``,
with ``baz()`` being the relatively more frequently called target.
+As an example, consider a program with the call chain ``main -> foo -> bar``.
+When built with optimizations enabled, the compiler may inline the
+calls to ``bar`` and ``foo`` inside ``main``. The generated profile
+could then be something like this:
+
+.. code-block:: console
+
+ main:35504:0
+  1: _Z3foov:35504
+   2: _Z3bari:31977
+    1.1: 31977
+  2: 0
+
+This profile indicates that there were a total of 35,504 samples
+collected in main. All of those were at line 1 (the call to ``foo``).
+Of those, 31,977 were spent inside the body of ``bar``. The last line
+of the profile (``2: 0``) corresponds to line 2 inside ``main``. No
+samples were collected there.
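
To make the nesting rules concrete, the following is a minimal sketch of a
reader for the text format described above. It is not the LLVM profile reader:
``ProfileNode`` and ``parse_profile`` are hypothetical names chosen for
illustration, a missing discriminator is assumed to mean ``0``, and the nesting
level is inferred only from the number of leading spaces on each line.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ProfileNode:
    """One function, or one inlined callsite, in the profile tree (hypothetical type)."""
    name: str
    total_samples: int
    head_samples: int = 0
    samples: dict = field(default_factory=dict)    # (offset, discriminator) -> count
    targets: dict = field(default_factory=dict)    # (offset, discriminator) -> {callee: count}
    callsites: dict = field(default_factory=dict)  # (offset, discriminator) -> ProfileNode


# Body line: indentation, offset, optional ".discriminator", then the payload.
BODY_LINE = re.compile(r"^( +)(\d+)(?:\.(\d+))?: (.+)$")


def parse_profile(text):
    """Parse a text sample profile into {function name: ProfileNode}."""
    functions = {}
    stack = []  # (indentation of the line that opened the node, node)
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # lines starting with '#' are ignored
        if not line.startswith(" "):
            # Function header: name:total_samples:total_head_samples
            name, total, head = line.rsplit(":", 2)
            node = ProfileNode(name, int(total), int(head))
            functions[name] = node
            stack = [(0, node)]
            continue
        match = BODY_LINE.match(line)
        if match is None or not stack:
            raise ValueError(f"malformed profile line: {line!r}")
        indent = len(match.group(1))
        # Assumption: an absent discriminator is treated as 0.
        key = (int(match.group(2)), int(match.group(3) or 0))
        payload = match.group(4).split(" ")
        # Close every node that is nested at least as deeply as this line.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if payload[0].isdigit():
            # Sampled line: number_of_samples [fn:num ...]
            parent.samples[key] = int(payload[0])
            if len(payload) > 1:
                pairs = (item.rsplit(":", 1) for item in payload[1:])
                parent.targets[key] = {fn: int(num) for fn, num in pairs}
        else:
            # Callsite line: fn:num_of_total_samples opens a nested body.
            name, total = payload[0].rsplit(":", 1)
            child = ProfileNode(name, int(total))
            parent.callsites[key] = child
            stack.append((indent, child))
    return functions


# A profile shaped like the main -> foo -> bar example above.
EXAMPLE = "\n".join([
    "main:35504:0",
    " 1: _Z3foov:35504",
    "  2: _Z3bari:31977",
    "   1.1: 31977",
    " 2: 0",
])
```

Calling ``parse_profile(EXAMPLE)`` yields a single top-level entry for ``main``
whose callsite at offset 1 is ``foo``, which in turn holds the callsite for
``bar``. The sketch rejects blank lines and extra spaces by raising
``ValueError``, in keeping with the fixed-spacing rule stated earlier.
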
Profiling with Instrumentation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^