[llvm-dwarfdump] Print type names in DW_AT_type DIEs
This patch adds printing for DW_AT_type DIEs like it is already the case
for DW_AT_specification DIEs. This is a rather naive approach and only a
start. We should have pretty printers for different languages.
Oliver Stannard [Tue, 10 Oct 2017 11:00:40 +0000 (11:00 +0000)]
[AsmParser] Add DiagnosticString to register classes in tablegen
This allows a DiagnosticType and/or DiagnosticString to be associated
with a RegisterClass in tablegen, so that we can emit diagnostics in the
assembler when a register operand is incorrect.
DiagnosticType creates a predictable enum value, which gets returned as
the error code when an operand does not match, and can be used by the
assembly parser to map to a user-facing diagnostic. DiagnosticString
creates an anonymous enum value (currently based on the tablegen class
name), and a function to map from enum values to strings will be
generated. Both of these work the same was as they do for AsmOperand.
This isn't used by any targets yet, but has one (positive) side-effect.
It improves the diagnostic codes returned by validateOperandClass - we
always want to emit the diagnostic that relates to the expected operand
class, but this wasn't always being done when the expected and actual
classes were completely different (token/register/custom). This causes a
few AArch64 diagnostics to be improved, as Match_InvalidOperand was
being returned instead of a specific diagnostic type.
Florian Hahn [Tue, 10 Oct 2017 09:32:38 +0000 (09:32 +0000)]
[SCCP] Propagate integer range info for parameters in IPSCCP.
Summary:
This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.
For the following function, f() can be optimized to `ret i32 2` with
this change
Nemanja Ivanovic [Tue, 10 Oct 2017 08:46:10 +0000 (08:46 +0000)]
Fix for PR34888.
The issue is that we assume operand zero of the input to the add instruction
is a register. In this case, the input comes from inline assembly and
operand zero is not a register thereby causing a crash.
The code will bail anyway if the input instruction doesn't have the right
opcode. So do that check first and let short-circuiting prevent the crash.
Bjorn Steinbrink [Tue, 10 Oct 2017 07:46:17 +0000 (07:46 +0000)]
Ignore all duplicate frame index expression
Some passes might duplicate calls to llvm.dbg.declare creating
duplicate frame index expression which currently trigger an assertion
which is meant to catch erroneous, overlapping fragment declarations.
But identical frame index expressions are just redundant and don't
actually conflict with each other, so we can be more lenient and just
ignore the duplicates.
This version fixed a bug in finding the matching
phi -- the order of the incoming blocks may be
different (triggered in self build on Windows).
A new test case is added.
Lang Hames [Tue, 10 Oct 2017 01:15:10 +0000 (01:15 +0000)]
[MC] Plumb unique_ptr<MCWasmObjectTargetWriter> through createWasmObjectWriter
to WasmObjectWriter's constructor.
Fixes the same ownership issue for COFF that r315245 did for MachO:
WasmObjectWriter takes ownership of its MCWasmObjectTargetWriter, so we want to
pass this through to the constructor via a unique_ptr, rather than a raw ptr.
Reid Kleckner [Tue, 10 Oct 2017 00:57:36 +0000 (00:57 +0000)]
[MC] Suppress .Lcfi labels when emitting textual assembly
Summary:
This suppresses the generation of .Lcfi labels in our textual assembler.
It was annoying that this generated cascading .Lcfi labels:
llc foo.ll -o - | llvm-mc | llvm-mc
After three trips through MCAsmStreamer, we'd have three labels in the
output when none are necessary. We should only bother creating the
labels and frame data when making a real object file.
This supercedes D38605, which moved the entire .seh_ implementation into
MCObjectStreamer.
This has the advantage that we do more checking when emitting textual
assembly, as a minor efficiency cost. Outputting textual assembly is not
performance critical, so this shouldn't matter.
Lang Hames [Tue, 10 Oct 2017 00:50:29 +0000 (00:50 +0000)]
[MC] Plumb unique_ptr<MCWinCOFFObjectTargetWriter> through
createWinCOFFObjectWriter to WinCOFFObjectWriter's constructor.
Fixes the same ownership issue for COFF that r315245 did for MachO:
WinCOFFObjectWriter takes ownership of its MCWinCOFFObjectTargetWriter, so we
want to pass this through to the constructor via a unique_ptr, rather than a
raw ptr.
Lang Hames [Mon, 9 Oct 2017 23:53:15 +0000 (23:53 +0000)]
[MC] Plumb unique_ptr<MCELFObjectTargetWriter> through createELFObjectWriter to
ELFObjectWriter's constructor.
Fixes the same ownership issue for ELF that r315245 did for MachO:
ELFObjectWriter takes ownership of its MCELFObjectTargetWriter, so we want to
pass this through to the constructor via a unique_ptr, rather than a raw ptr.
Lang Hames [Mon, 9 Oct 2017 22:38:13 +0000 (22:38 +0000)]
[MC] Plumb unique_ptr<MCMachObjectTargetWriter> through createMachObjectWriter
to MCObjectWriter's constructor.
MCObjectWriter takes ownership of its MCMachObjectTargetWriter argument -- this
patch plumbs that ownership relationship through the constructor (which
previously took raw MCMachObjectTargetWriter*) and the createMachObjectWriter
function.
The patch transforms canonical version of unsigned saturation, which is sub(max(a,b),a) or sub(a,min(a,b)) to special psubus insturuction on targets, which support it(8bit and 16bit uints).
umax(a,b) - b -> subus(a,b)
a - umin(a,b) -> subus(a,b)
There is also extra case handled, when right part of sub is 32 bit and can be truncated, using UMIN(this transformation was discussed in https://reviews.llvm.org/D25987).
The example of special case code:
```
void foo(unsigned short *p, int max, int n) {
int i;
unsigned m;
for (i = 0; i < n; i++) {
m = *--p;
*p = (unsigned short)(m >= max ? m-max : 0);
}
}
```
Max in this example is truncated to max_short value, if it is greater than m, or just truncated to 16 bit, if it is not. It is vaid transformation, because if max > max_short, result of the expression will be zero.
Here is the table of types, I try to support, special case items are bold:
Zachary Turner [Mon, 9 Oct 2017 18:50:29 +0000 (18:50 +0000)]
Fix some C++ value / reference semantics issues.
Some functions were taking Twine's not by const&, these are all
fixed to take by const&. We also had a case where some functions
were overloaded to accept by const& and &&. Now there is only
one version which accepts by value and move's the value.
Daniel Sanders [Mon, 9 Oct 2017 18:14:53 +0000 (18:14 +0000)]
[globalisel] Add support for ValueType operands in patterns.
It's rare but there are a small number of patterns like this:
(set i64:$dst, (add i64:$src1, i64:$src2))
These should be equivalent to register classes except they shouldn't check for
a specific register bank.
This doesn't occur in AArch64/ARM/X86 but does occasionally come up in other
in-tree targets such as BPF.
Adrian McCarthy [Mon, 9 Oct 2017 17:50:01 +0000 (17:50 +0000)]
Fix after r315079
Microsoft's debug implementation of std::copy checks if the destination is an
array and then does some bounds checking. This was causing an assertion
failure in fs::rename_internal which copies to a buffer of the appropriate
size but that's type-punned to an array of length 1 for API compatibility
reasons.
Fix is to make make the destination a pointer rather than an array.
Francis Ricci [Mon, 9 Oct 2017 17:27:47 +0000 (17:27 +0000)]
[dsymutil] Emit valid debug locations when no symbol flags are set
Summary:
swiftc emits symbols without flags set, which led dsymutil to ignore
them when searching for global symbols, causing dwarf location data
to be omitted. Xcode's dsymutil handles this case correctly, and emits
valid location data. Add this functionality to llvm-dsymutil by
allowing parsing of symbols with no flags set.
Zachary Turner [Mon, 9 Oct 2017 15:46:13 +0000 (15:46 +0000)]
[llvm-rc] Have the tokenizer discard single & block comments.
This allows rc files to have comments. Eventually we should
just use clang's c preprocessor, but that's a bit larger
effort for minimal gain, and this is straightforward.
Amara Emerson [Mon, 9 Oct 2017 15:15:09 +0000 (15:15 +0000)]
[AArch64] Improve codegen for inverted overflow checking intrinsics
E.g. if we have a (xor(overflow-bit), 1) where overflow-bit comes from an
intrinsic like llvm.sadd.with.overflow then we can kill the xor and use the
inverted condition code for the CSEL.
Craig Topper [Sun, 8 Oct 2017 16:57:23 +0000 (16:57 +0000)]
[X86] Prefer MOVSS/SD over BLENDI during legalization. Remove BLENDI versions of scalar arithmetic patterns
Summary:
We currently disable some converting of shuffles to MOVSS/MOVSD during legalization if SSE41 is enabled. But later during shuffle combining we go back to prefering MOVSS/MOVSD.
Additionally we have patterns that look for BLENDIs to detect scalar arithmetic operations. I believe due to the combining using MOVSS/MOVSD these are unnecessary.
Interestingly, we still codegen blend instructions even though lowering/isel emit movss/movsd instructions. Turns out machine CSE commutes them to blend, and then commuting those blends back into blends that are equivalent to the original movss/movsd.
This patch fixes the inconsistency in legalization to prefer MOVSS/MOVSD. The one test change was caused by this change. The problem is that we have integer types and are mostly selecting integer instructions except for the shufps. This shufps forced the execution domain, but the vpblendw couldn't have its domain changed with a naive instruction swap. We could fix this by special casing VPBLENDW based on the immediate to widen the element type.
The rest of the patch is removing all the excess scalar patterns.
Long term we should probably add isel patterns to make MOVSS/MOVSD emit blends directly instead of relying on the double commute. We may also want to consider emitting movss/movsd for optsize. I also wonder if we should still use the VEX encoded blendi instructions even with AVX512. Blends have better throughput, and that may outweigh the register constraint.
Gadi Haber [Sun, 8 Oct 2017 12:52:54 +0000 (12:52 +0000)]
[X86][SKX] Adding the scheduling information for the SKX target.
Adding the scheduling information for the SkylakeServer (SKX) target.
This patch adds the instruction scheduling information for the SkylakeServer (SKX) architecture target by adding the file X86SchedSkylakeServer.td located under the X86 Target.
We used the scheduling information retrieved from the Skylake architects in order to create the file.
The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction.
The patch continues the scheduling replacement and insertion effort started with the SNB target in r310792, the HSW target in r311879 and the SkylakeClient (SKL) target in rL313613.
Please expect some performance fluctuations due to code alignment effects.
Ayman Musa [Sun, 8 Oct 2017 09:20:32 +0000 (09:20 +0000)]
[X86][TableGen] Recommitting the X86 memory folding tables TableGen backend while disabling it by default.
After the original commit ([[ https://reviews.llvm.org/rL304088 | rL304088 ]]) was reverted, a discussion in llvm-dev was opened on 'how to accomplish this task'.
In the discussion we concluded that the best way to achieve our goal (which is to automate the folding tables and remove the manually maintained tables) is:
# Commit the tablegen backend disabled by default.
# Proceed with an incremental updating of the manual tables - while checking the validity of each added entry.
# Repeat previous step until we reach a state where the generated and the manual tables are identical. Then we can safely remove the manual tables and include the generated tables instead.
# Schedule periodical (1 week/2 weeks/1 month) runs of the pass:
- if changes appear (new entries):
- make sure the entries are legal
- If they are not, mark them as illegal to folding
- Commit the changes (if there are any).
CMake flag added for this purpose is "X86_GEN_FOLD_TABLES". Building with this flags will run the pass and emit the X86GenFoldTables.inc file under build/lib/Target/X86/ directory which is a good reference for any developer who wants to take part in the effort of completing the current folding tables.
Ayman Musa [Sun, 8 Oct 2017 08:32:56 +0000 (08:32 +0000)]
[X86] Add new attribute to X86 instructions to enable marking them as "not memory foldable"
This attribute will be used in a tablegen backend that generated the X86 memory folding tables which will be added in a future pass.
Instructions with this attribute unset will be excluded from the full set of X86 instructions available for the pass.
[MachineOutliner] Disable outlining from LinkOnceODRs by default
Say you have two identical linkonceodr functions, one in M1 and one in M2.
Say that the outliner outlines A,B,C from one function, and D,E,F from another
function (where letters are instructions). Now those functions are not
identical, and cannot be deduped. Locally to M1 and M2, these outlining
choices would be good-- to the whole program, however, this might not be true!
To mitigate this, this commit makes it so that the outliner sees linkonceodr
functions as unsafe to outline from. It also adds a flag,
-enable-linkonceodr-outlining, which allows the user to specify that they
want to outline from such functions when they know what they're doing.
Changing this handles most code size regressions in the test suite caused by
competing with linker dedupe. It also doesn't have a huge impact on the code
size improvements from the outliner. There are 6 tests that regress > 5% from
outlining WITH linkonceodrs to outlining WITHOUT linkonceodrs. Overall, most
tests either improve or are not impacted.
Not outlined vs outlined without linkonceodrs:
https://hastebin.com/raw/qeguxavuda
Not outlined vs outlined with linkonceodrs:
https://hastebin.com/raw/edepoqoqic
Outlined with linkonceodrs vs outlined without linkonceodrs:
https://hastebin.com/raw/awiqifiheb
Numbers generated using compare.py with -m size.__text. Tests run for AArch64
with -Oz -mllvm -enable-machine-outliner -mno-red-zone.
Sanjay Patel [Fri, 6 Oct 2017 23:20:16 +0000 (23:20 +0000)]
[InstCombine] rename SimplifyDivRemOfSelect to be clearer, add comments, simplify code; NFCI
There's at least one bug here - this code can fail with vector types (PR34856).
It's also being called for FREM; I'm still trying to understand how that is valid.
[dwarfdump] Verify that unit type matches root DIE
This patch adds two new verifiers:
- It checks that the root DIE of a CU is actually a valid unit DIE.
(based on its tag)
- For DWARF5 which contains a unit type int he CU header, it checks that
this matches the type of the unit DIE.
Zachary Turner [Fri, 6 Oct 2017 22:05:15 +0000 (22:05 +0000)]
[llvm-rc] Implement escape sequences in .rc files.
This allows the escape sequences (\a, \n, \r, \t, \\, \x[0-9a-f]*,
\[0-7]*, "") to appear in .rc scripts. These are parsed and output in
the same way as it's done in original MS implementation.
The way these sequences are processed depends on the type of the
resource it resides in, and on whether the user declared the string to
be "wide" or "narrow".
I tried to maintain the maximum compatibility with the original tool
(and fail in some erroneous situations that are accepted by .rc).
However, there are some (extremely rare) cases where Microsoft tool
outputs nonsense. I found it infeasible to detect such casses.
Zachary Turner [Fri, 6 Oct 2017 21:30:55 +0000 (21:30 +0000)]
[llvm-rc] Serialize STRINGTABLE statements to .res file.
This allows llvm-rc to serialize STRINGTABLE resources.
These are output in an unusual way: we locate them at the end of the
file, and strings are merged into bundles of max 16 strings, depending
on their IDs, language, and characteristics.
Zachary Turner [Fri, 6 Oct 2017 21:25:44 +0000 (21:25 +0000)]
[llvm-rc] Serialize CURSOR and ICON resources to .res
This is part 6 of llvm-rc serialization.
This adds ability to output cursors and icons as resources.
Unfortunately, we can't just copy .cur or .ico files to output - as each
file might contain multiple images, each of them needs to be unpacked
and stored as a separate resource. This forces us to parse cursor and
icon contents. (Fortunately, these formats are pretty similar and can be
processed by mostly common code).
As test files are binary, here is a short explanation of .cur and .ico
files stored:
cursor.cur, cursor-8.cur, cursor-32.cur are sample correct cursor files,
differing in their bit depth.
icon-old.ico, icon-new.ico are sample correct icon files;
icon-png.ico is a sample correct icon file in PNG format (instead of
usual BMP);
cursor-eof.cur is an incorrect cursor file - this is cursor.cur with
some of its final bytes removed.
cursor-bad-offset.cur is an incorrect cursor file - image header states
that image data begins at offset 0xFFFFFFFF.
Sample correct cursors and icons were created by Nico Weber.
Patch by Marek Sokolowski
Differential Revision: https://reviews.llvm.org/D37878
Reid Kleckner [Fri, 6 Oct 2017 21:17:51 +0000 (21:17 +0000)]
Revert "Roll forward r314928"
This appears to be miscompiling Clang, as shown on two Windows bootstrap
bots:
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7611
http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/6870
Nothing else is in the blame list. Both emit errors on this valid code
in the Windows ucrt headers:
C:\...\ucrt\malloc.h:95:32: error: invalid operands to binary expression ('char *' and 'int')
_Ptr = (char*)_Ptr + _ALLOCA_S_MARKER_SIZE;
~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~
Zachary Turner [Fri, 6 Oct 2017 20:51:20 +0000 (20:51 +0000)]
[llvm-rc] Add optional serialization support for DIALOG(EX) resources.
This is part 5 of llvm-rc serialization support.
This allows DIALOG and DIALOGEX to serialize if dialog-specific optional
statements are provided. These are (as of now): CAPTION, FONT, and
STYLE.
Notably, FONT statement can take more than two arguments when describing
DIALOGEX resources (as in
msdn.microsoft.com/en-us/library/windows/desktop/aa381013.aspx). I made
some changes to the parser to reflect this fact.
Patch by Marek Sokolowski
Differential Revision: https://reviews.llvm.org/D37864
Adrian Prantl [Fri, 6 Oct 2017 20:24:34 +0000 (20:24 +0000)]
llvm-dwarfdump: Add an option to collect debug info quality metrics.
At the last LLVM dev meeting we had a debug info for optimized code
BoF session. In that session I presented some graphs that showed how
the quality of the debug info produced by LLVM changed over the last
couple of years. This is a cleaned up version of the patch I used to
collect the this data. It is implemented as an extension of
llvm-dwarfdump, adding a new --statistics option. The intended
use-case is to automatically run this on the debug info produced by,
e.g., our bots, to identify eyebrow-raising changes or regressions
introduced by new transformations that we could act on.
In the current form, two kinds of data are being collected:
- The number of variables that have a debug location versus the number
of variables in total (this takes into account inlined instances of
the same function, so if a variable is completely missing form only
one instance it will be found).
- The PC range covered by variable location descriptions versus the PC
range of all variables' containing lexical scopes.
The output format is versioned and extensible, so I'm looking forward
to both bug fixes and ideas for other data that would be interesting
to track.
Amara Emerson [Fri, 6 Oct 2017 19:24:15 +0000 (19:24 +0000)]
[GlobalISel] Fix legalizer trying to process a deleted instruction.
In some cases an instruction is deleted from the block during combining,
however it can still exist in the legalizer worklist.
This change modifies the combiner routines to add the given MI to the
dead instruction list rather than trying to remove it from the block
themselves. The responsibility is then on the caller to delete the instruction
from the block and any worklists.
Reid Kleckner [Fri, 6 Oct 2017 18:21:19 +0000 (18:21 +0000)]
[PEI] Remove required properties and use 'if' instead of std::function
Summary:
After r303360, we initialize UsesCalleeSaves in runOnMachineFunction,
which runs after getRequiredProperties. UsesCalleeSaves was initialized
to 'false', so getRequiredProperties would always return an empty set.
We don't have a TargetMachine available early anymore after r303360.
Just removing the requirement of NoVRegs seems to make things work, so
let's do that.
The bitcode reader looks specifically for `__DATA, __objc_catlist` as a
section name. However, SVN r304661 removed the spaces (the two names
are functionally equivalent but do not compare equally
lexicographically). This causes compatibility issues. Add an
auto-upgrade path for removing the spaces as well as use the new name in
the LTO plugin.
Zachary Turner [Fri, 6 Oct 2017 17:54:46 +0000 (17:54 +0000)]
[lit] Improve tool substitution in lit.
This addresses two sources of inconsistency in test configuration
files.
1. Substitution boundaries. Previously you would specify a
substitution, such as 'lli', and then additionally a set
of characters that should fail to match before and after
the tool. This was used, for example, so that matches that
are parts of full paths would not be replaced. But not all
tools did this, and those that did would often re-invent
the set of characters themselves, leading to inconsistency.
Now, every tool substitution defaults to using a sane set
of reasonable defaults and you have to explicitly opt out
of it. This actually fixed a few latent bugs that were
never being surfaced, but only on accident.
2. There was no standard way for the system to decide how to
locate a tool. Sometimes you have an explicit path, sometimes
we would search for it and build up a path ourselves, and
sometimes we would build up a full command line. Furthermore,
there was no standardized way to handle missing tools. Do we
warn, fail, ignore, etc? All of this is now encapsulated in
the ToolSubst class. You either specify an exact command to
run, or an instance of FindTool('<tool-name>') and everything
else just works. Furthermore, you can specify an action to
take if the tool cannot be resolved.