DebugInfo: Reorder definitions of MDLocation and MDFile, NFC
Move definition of `MDLocation` after `MDLocalScope` so that the latter
is available for casts in the former. Similarly, move the definition of
`MDFile` as early as possible so that other classes can cast to it in
their definitions. (Follow-up commits will take advantage of this.)
DebugInfo: Add MDLocalScope, a legal scope for locals
Add a subclass of `MDScope` to explicitly categorize the legal scopes
for locals -- in particular, scopes that are legal for `MDLocation`,
`MDLexicalBlockBase`, and `MDLocalVariable`. This provides a convenient
`isa<>` target for the verifier, and eventually I'll be changing the
above classes' `getScope()` to specifically return it. Currently, its
subclasses are `MDSubprogram`, `MDLexicalBlock`, and
`MDLexicalBlockFile`.
I've gone with `MDLocalScope` for now -- a little ambiguous since it's a
scope *for* locals, not a scope that's local -- but I'm open to more
descriptive names if someone can think of something better. Regardless,
the code docs should make it clear enough.
It still causes buildbot failures (gcc running out of memory on several platforms, and a self-host failure on arm), although less than the previous time.
Daniel Sanders [Tue, 24 Mar 2015 11:26:34 +0000 (11:26 +0000)]
[mips] Distinguish 'R', 'ZC', and 'm' inline assembly memory constraint.
Summary:
Previous behaviour of 'R' and 'm' has been preserved for now. They will be
improved in subsequent commits.
The offset permitted by ZC varies according to the subtarget since it is
intended to match the restrictions of the pref, ll, and sc instructions.
The restrictions on these instructions are:
* For microMIPS: 12-bit signed offset.
* For Mips32r6/Mips64r6: 9-bit signed offset.
* Otherwise: 16-bit signed offset.
James Molloy [Tue, 24 Mar 2015 11:15:23 +0000 (11:15 +0000)]
"float2int": Add a new pass to demote from float to int where possible.
It is possible to have code that converts from integer to float, performs operations then converts back, and the result is provably the same as if integers were used.
This can come from different sources, but the most obvious is a helper function that uses floats but the arguments given at an inlined callsites are integers.
This pass considers all integers requiring a bitwidth less than or equal to the bitwidth of the mantissa of a floating point type (23 for floats, 52 for doubles) as exactly representable in floating point.
To reduce the risk of harming efficient code, the pass only attempts to perform complete removal of inttofp/fptoint operations, not just move them around.
Previously, subtarget features were a bitfield with the underlying type being uint64_t.
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.
The first time this was committed (r229831), it caused several buildbot failures.
At least some of the ARM ones were due to gcc/binutils issues, and should now be fixed.
Lang Hames [Tue, 24 Mar 2015 04:07:01 +0000 (04:07 +0000)]
[Orc] Use std::string to capture name by value.
This just updates the code to reflect the comment, but this bug actually hit the
out-of-tree lazy demo. I'm working on a patch to add the lazy-demo's
functionality to lli so that we can test this in-tree soon.
DebugInfo: Overload get() in DIDescriptor subclasses
Continue to simplify the `DIDescriptor` subclasses, so that they behave
more like raw pointers. Remove `getRaw()`, replace it with an
overloaded `get()`, and overload the arrow and cast operators. Two
testcases started to crash on the arrow operators with this change
because of `scope:` references that weren't real scopes. I fixed them.
Soon I'll add verifier checks for them too.
This also adds explicit dereference operators. Previously, the builtin
dereference against `operator MDNode *()` would have worked, but now the
builtins are ambiguous.
David Blaikie [Mon, 23 Mar 2015 21:17:43 +0000 (21:17 +0000)]
Cleanup else-after-return and add an early-return to llvm-nm
The loop and error handling in checkMachOAndArchFlags didn't make sense
to me (a loop that only ever executes once? An error path that uses the
element the loop stopped at (which must always be a buffer overrun if
I'm reading that right?)... I'm confused) but I've made a guess at what
was intended.
Based on a patch by Richard Thomson to simplify boolean expressions.
Ahmed Bougacha [Mon, 23 Mar 2015 21:17:36 +0000 (21:17 +0000)]
[AArch64, ARM] Enable GlobalMerge with -O3 rather than -O1.
The pass used to be enabled by default with CodeGenOpt::Less (-O1).
This is too aggressive, considering the pass indiscriminately merges
all globals together.
Currently, performance doesn't always improve, and, on code that uses
few globals (e.g., the odd file- or function- static), more often than
not is degraded by the optimization. Lengthy discussion can be found
on llvmdev (AArch64-focused; ARM has similar problems):
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-February/082800.html
Also, it makes tooling and debuggers less useful when dealing with
globals and data sections.
GlobalMerge needs to better identify those cases that benefit, and this
will be done separately. In the meantime, move the pass to run with
-O3 rather than -O1, on both ARM and AArch64.
Chris Bieneman [Mon, 23 Mar 2015 20:04:00 +0000 (20:04 +0000)]
Re-land: Generate targets for each lit suite.
Summary:
This change makes CMake scan for lit suites and generate a target for each lit test suite. The targets follow the format check-<project>-<suite path>.
For example:
check-llvm-unit - Runs the LLVM unit tests
check-llvm-codegen-arm - Runs the ARM codeine tests
Note: These targets are not generated during multi-configuration generators (i.e. Xcode and Visual Studio) because target clutter impacts UI usability.
* Also fixed a minor issue that Duncan pointed out to me I was passing the suite to lit twice
David Blaikie [Mon, 23 Mar 2015 19:45:40 +0000 (19:45 +0000)]
Refactor: Simplify boolean expressions in llvm Support
Simplify boolean expressions using `true` and `false` with `clang-tidy`
Patch by Richard Thomson - I dropped the parens and != 0 test, for
consistency with other patches/tests like this, but I'm open to the
notion that we should add the explicit non-zero test in all these sort
of cases (non-bool assigned to a bool).
Matt Arsenault [Mon, 23 Mar 2015 18:45:30 +0000 (18:45 +0000)]
R600/SI: Allow commuting compares
This enables very common cases to switch to the
smaller encoding.
All of the standard LLVM canonicalizations of comparisons
are the opposite of what we want. Compares with constants
are moved to the RHS, but the first operand can be an inline
immediate, literal constant, or SGPR using the 32-bit VOPC
encoding.
There are additional bad canonicalizations that should
also be fixed, such as canonicalizing ge x, k to gt x, (k + 1)
if this makes k no longer an inline immediate value.
David Blaikie [Mon, 23 Mar 2015 18:39:02 +0000 (18:39 +0000)]
Refactor: simplify boolean expressions in llvm-objdump
Simplify boolean expressions involving `true` and `false` with `clang-tidy`.
Actually upon inspection a bunch of these boolean variables could be
factored away entirely anyway - using find_if and then testing the
result before using it. This also helps reduce indentation in the code
anyway - and a bunch of other related simplification fell out nearby so
I just committed all of that.
Bradley Smith [Mon, 23 Mar 2015 16:52:52 +0000 (16:52 +0000)]
Revert "[ARM] Add more pattern matching for f16 <-> f64 conversions"
This change is incorrect since it converts double rounding into single rounding,
which can produce different results. Instead this optimization will be done by
modifying Clang's codegen to not produce double rounding in the first place.
James Molloy [Mon, 23 Mar 2015 16:15:16 +0000 (16:15 +0000)]
[ARM] Remove target-specific ITOFP/FPTOI nodes
Anton tried this 5 years ago but it was reverted due to extra VMOVs
being emitted. This can be easily fixed with a liberal application
of patterns - matching loads/stores and extractelts.
Petar Jovanovic [Mon, 23 Mar 2015 12:28:13 +0000 (12:28 +0000)]
Fix sign extension for MIPS64 in makeLibCall function
Fixing sign extension in makeLibCall for MIPS64. In MIPS64 architecture all
32 bit arguments (int, unsigned int, float 32 (soft float)) must be sign
extended. This fixes test "MultiSource/Applications/oggenc/".
Daniel Sanders [Mon, 23 Mar 2015 11:33:15 +0000 (11:33 +0000)]
[aarch64] Distinguish the 'Q' and 'm' inline assembly memory constraints.
Summary:
But still handle them the same way since I don't know how they differ on
this target.
Clang also has code for 'Ump', 'Utf', 'Usa', and 'Ush' but calls
llvm_unreachable() on this code path so they are not converted to a
constraint id at the moment.
Hal Finkel [Mon, 23 Mar 2015 08:22:43 +0000 (08:22 +0000)]
[SDAG] Don't widen VSETCC during type legalization for split operands
Because the operands of a vector SETCC node can be of a different type from the
result (and often are), it can happen that even if we'd prefer to widen the
result type of the SETCC, the operands have been split instead. In this case,
the SETCC result also must be split. This mirrors what is done in
WidenVecRes_SELECT, and should be NFC elsewhere because if the operands are not
widened the following calls to GetWidenedVector will assert (which is what was
happening in the test case).
Benjamin Kramer [Sat, 21 Mar 2015 21:09:33 +0000 (21:09 +0000)]
[SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check.
strchr("123!", C) != nullptr is a common pattern to check if C is one
of 1, 2, 3 or !. If the largest element of the string is smaller than
the target's register size we can easily create a bitfield and just
do a simple test for set membership.
int foo(char C) { return strchr("123!", C) != nullptr; } now becomes
cmpl $64, %edi ## range check
sbbb %al, %al
movabsq $0xE000200000001, %rcx
btq %rdi, %rcx ## bit test
sbbb %cl, %cl
andb %al, %cl ## and the two conditions
andb $1, %cl
movzbl %cl, %eax ## returning an int
ret
(imho the backend should expand this into a series of branches, but
that's a different story)
The code is currently limited to bit fields that fit in a register, so
usually 64 or 32 bits. Sadly, this misses anything using alpha chars
or {}. This could be fixed by just emitting a i128 bit field, but that
can generate really ugly code so we have to find a better way. To some
degree this is also recreating switch lowering logic, but we can't
simply emit a switch instruction and thus change the CFG within
instcombine.
Benjamin Kramer [Sat, 21 Mar 2015 16:42:35 +0000 (16:42 +0000)]
StringRef: Just forward StringRef::find to libc's memchr.
Modern libc's have an SSE version of memchr which is a lot faster than our
hand-rolled version. In the past I was reluctant to use it because Darwin's
memchr used a naive ridiculously slow implementation, but that has been fixed
some versions ago.
Benjamin Kramer [Sat, 21 Mar 2015 15:36:06 +0000 (15:36 +0000)]
ValueTracking: Forward getConstantStringInfo's TrimAtNul param into recursive invocation
Currently this is only used to tweak the backend's memcpy inlining
heuristics, testing that isn't very helpful. A real test case will
follow in the next commit, where this behavior would cause a real
miscompilation.
r216771 introduced a change to MemoryDependenceAnalysis that allowed it
to reason about acquire/release operations. However, this change does
not ensure that the acquire/release operations pair. Unfortunately,
this leads to miscompiles as we won't see an acquire load as properly
memory effecting. This largely reverts r216771.
Eric Christopher [Sat, 21 Mar 2015 04:22:23 +0000 (04:22 +0000)]
Remove the target independent TargetMachine::getSubtarget and
TargetMachine::getSubtargetImpl routines.
This keeps the target independent code free of bare subtarget
calls while the remainder of the backends are migrated, or not
if they don't wish to support per-function subtargets as would
be needed for function multiversioning or LTO of disparate
cpu subarchitecture types, e.g.
Eric Christopher [Sat, 21 Mar 2015 04:04:50 +0000 (04:04 +0000)]
Remove the bare getSubtargetImpl call from the AArch64 port. As part
of this add a test that shows we can generate code for functions
that specifically enable a subtarget feature.
Eric Christopher [Sat, 21 Mar 2015 03:36:02 +0000 (03:36 +0000)]
Remove the bare getSubtargetImpl call from the PPC port. As part
of this add a test that shows we can generate code with
for functions that differ by subtarget feature.
Eric Christopher [Sat, 21 Mar 2015 03:17:25 +0000 (03:17 +0000)]
Grab a subtarget off of an AMDGPUTargetMachine rather than a
bare target machine in preparation for the TargetMachine bare
getSubtarget/getSubtargetImpl calls going away.
Eric Christopher [Sat, 21 Mar 2015 03:13:10 +0000 (03:13 +0000)]
Cache the Function dependent subtarget on the MachineFunction.
As preparation for removing the getSubtargetImpl() call from
TargetMachine go ahead and flip the switch on caching the function
dependent subtarget and remove the bare getSubtargetImpl call
from the X86 port. As part of this add a few tests that show we
can generate code and assemble on X86 based on features/cpu on
the Function.
Eric Christopher [Sat, 21 Mar 2015 03:13:05 +0000 (03:13 +0000)]
Grab a subtarget off of a MipsTargetMachine rather than a
bare target machine in preparation for the TargetMachine bare
getSubtarget/getSubtargetImpl calls going away.
Eric Christopher [Sat, 21 Mar 2015 03:13:01 +0000 (03:13 +0000)]
Change getISAEncoding to use the target triple to determine
thumb-ness similar to the rest of the Module level asm printing
infrastructure as debug info finalization happens after the function
may be missing.
Ahmed Bougacha [Sat, 21 Mar 2015 01:23:15 +0000 (01:23 +0000)]
[CodeGen][IfCvt] Don't re-ifcvt blocks with unanalyzable terminators.
If we couldn't analyze its terminator (i.e., it's an indirectbr, or some
other weirdness), we can't safely re-if-convert a predicated block,
because we can't tell whether the predicated terminator can
fallthrough (it does).
Currently, we would completely ignore the fallthrough successor. In
the added testcase, this means we used to generate: