[dwarfdump] Add verbose output for .debug-line section
This patch adds dumping of line table instructions as well as the final
state at each specified pc value in verbose mode. This is essentially
the same as the default in Darwin's dwarfdump. Dumping the actual line
table opcodes can be particularly useful for something like debugging a
bad `.debug_line` section.
[DAGCombiner] Slightly simplify some code by using APInt::isMask() and countTrailingOnes instead of getting active bits and checking if all the bits below that make a mask.
At least for the 64-bit and less case, we should be able to determine if we even have a mask without counting any bits. This also removes the need to explicitly check for 0 active bits, isMask will return false for 0.
Re-land r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare"
The fix is to avoid invalidating our insertion point in
replaceDbgDeclare:
Builder.insertDeclare(NewAddress, DIVar, DIExpr, Loc, InsertBefore);
+ if (DII == InsertBefore)
+ InsertBefore = &*std::next(InsertBefore->getIterator());
DII->eraseFromParent();
I had to write a unit tests for this instead of a lit test because the
use list order matters in order to trigger the bug.
The reduced C test case for this was:
void useit(int*);
static inline void inlineme() {
int x[2];
useit(x);
}
void f() {
inlineme();
inlineme();
}
[SelectionDAG] Pick correct frame index in LowerArguments
Summary:
SelectionDAGISel::LowerArguments is associating arguments
with frame indices (FuncInfo->setArgumentFrameIndex). That
information is later on used by EmitFuncArgumentDbgValue to
create DBG_VALUE instructions that denotes that a variable
can be found on the stack.
I discovered that for our (big endian) out-of-tree target
the association created by SelectionDAGISel::LowerArguments
sometimes is wrong. I've seen this happen when a 64-bit value
is passed on the stack. The argument will occupy two stack
slots (frame index X, and frame index X+1). The fault is
that a call to setArgumentFrameIndex is associating the
64-bit argument with frame index X+1. The effect is that the
debug information (DBG_VALUE) will point at the least significant
part of the arguement on the stack. When printing the
argument in a debugger I will get the wrong value.
I managed to create a test case for PowerPC that seems to
show the same kind of problem.
The bugfix will look at the datalayout, taking endianness into
account when examining a BUILD_PAIR node, assuming that the
least significant part is in the first operand of the BUILD_PAIR.
For big endian targets we should use the frame index from
the second operand, as the most significant part will be stored
at the lower address (using the highest frame index).
[lit] Don't norm case when inserting into the config map.
This makes all paths lowercase on Windows, which seemed like a
good idea at the time, but it means that tests can't properly
use FileCheck to match expected path names.
Config map is not exposed through the command line, so testing this
is somewhat tricky. But basically we need a test that if a custom
driver builds a config map and passes it to main, it gets respected.
A config map allows config files in the source tree to be mapped
to alternate config files in the build tree. This particular test
works by having two config files in separate directories, and
setting up a config map to have that redirects A/lit.site.cfg
to B/altconfig. Then, we print a message in A/lit.site.cfg
and B/altconfig and check that we do see the output from B
but don't see the output from A. Additionally we test that
the test suite specified by A's config map is properly discovered.
[Power9] Spill gprs to vector registers rather than stack
This patch updates register allocation to enable spilling gprs to
volatile vector registers rather than the stack. It can be enabled
for Power9 with option -ppc-enable-gpr-to-vsr-spills.
Simon Atanasyan [Thu, 21 Sep 2017 14:04:53 +0000 (14:04 +0000)]
[mips] Implement generation of relocations "chains" used by N32 ABI
In case of using a "nested" relocation expressions like this
`%hi(%neg(%gp_rel()))`, N32 ABI requires generation of three consecutive
relocations. That differs from the N64 ABI case where all relocations
are packed into the single relocation record.
Simon Atanasyan [Thu, 21 Sep 2017 14:04:47 +0000 (14:04 +0000)]
[mips] Do not pass redundant IsN64 flag to MCELFObjectTargetWriter. NFC
Now we pass the 'Is64_' flag to the MCELFObjectTargetWriter ctor iif
when we make deal with N64 ABI. So it is redundant to pass additional
'IsN64' flag.
Mikael Holmen [Thu, 21 Sep 2017 11:14:27 +0000 (11:14 +0000)]
[SROA] Really remove associated dbg.declare when removing dead alloca
Summary:
There already was code that tried to remove the dbg.declare, but that code
was placed after we had called
I->replaceAllUsesWith(UndefValue::get(I->getType()));
on the alloca, so when we searched for the relevant dbg.declare, we
couldn't find it.
Now we do the search before we call RAUW so there is a chance to find it.
An existing testcase needed update due to this. Two dbg.declare with undef
were removed and then suddenly one of the two CHECKS failed.
However, the CHECKs in the testcase checked things in a silly order, so
they only passed since they found things in the first dbg.declare. Now
we changed the order of the checks and the test passes.
Simon Atanasyan [Thu, 21 Sep 2017 10:44:26 +0000 (10:44 +0000)]
[mips] Fix relocation record format and ELF header for N32 ABI
The N32 ABI uses RELA relocation format, do not use 3-in-1 relocation's
encoding, and uses ELFCLASS32. This change passes the `IsN32` flag
to the `MCAsmBackend` to distinguish usage of N32 ABI.
We still do not handle some cases like providing the `-target-abi=o32`
command line option with the `mips64` target triple. That's why
elf_header.s contains some "FIXME" strings. This case will be fixed in
a separate patch.
[dsymutil] Don't resolve DIE reference to NULL DIE.
This patch prevents dsymutil from resolving a reference to a NULL DIE
when a bogus reference happens to be coincidentally referencing a NULL
DIE. Now this is detected as an invalid reference and a warning is
printed.
Revert "Re-enable "[IRCE] Identify loops with latch comparison against current IV value""
Revert the patch causing the functional failures.
The patch owner is notified with test cases which fail.
Test case has been provided to Maxim offline.
[llvm-cov] Improve error messaging for function mismatches
Passing "-dump" to llvm-cov will now print more detailed information
about function hash and counter mismatches. This should make it easier
to debug *.profdata files which contain incorrect records, and to debug
other scenarios where coverage goes missing due to mismatch issues.
[lit] Make lit support config files with .py extension.
Many editors and Python-related diagnostics tools such as
debuggers break or fail in mysterious ways when python files
don't end in .py. This is especially true on Windows, but
still exists on other platforms. I don't want to be too heavy
handed in changing everything across the board, but I do want
to at least *allow* lit configs to have .py extensions. This
patch makes the discovery process first look for a config file
with a .py extension, and if one is not found, then looks for
a config file using the old method. So for existing users, there
should be no functional change.
Dave Lee [Wed, 20 Sep 2017 22:41:34 +0000 (22:41 +0000)]
Remove references to response file argument in CommandLine.rst
Summary:
The documentation refers to a boolean that controls whether response files are
handled, but this is incorrect. Since r165535, response files are always
enabled.
I noticed this inefficiency while investigating PR34603:
https://bugs.llvm.org/show_bug.cgi?id=34603
This fix will likely push another bug (we don't maintain state of 'LateSimplifyCFG')
into hiding, but I'll try to clean that up with a follow-up patch anyway.
[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare
Summary:
This implements the design discussed on llvm-dev for better tracking of
variables that live in memory through optimizations:
http://lists.llvm.org/pipermail/llvm-dev/2017-September/117222.html
This is tracked as PR34136
llvm.dbg.addr is intended to be produced and used in almost precisely
the same way as llvm.dbg.declare is today, with the exception that it is
control-dependent. That means that dbg.addr should always have a
position in the instruction stream, and it will allow passes that
optimize memory operations on local variables to insert llvm.dbg.value
calls to reflect deleted stores. See SourceLevelDebugging.rst for more
details.
The main drawback to generating DBG_VALUE machine instrs is that they
usually cause LLVM to emit a location list for DW_AT_location. The next
step will be to teach DwarfDebug.cpp how to recognize more DBG_VALUE
ranges as not needing a location list, and possibly start setting
DW_AT_start_offset for variables whose lifetimes begin mid-scope.
This reverts commit SVN r313668. The original test case attempted to
write a pointer value into 16-bits, although the value may exceed the
range representable in 16-bits. Ensure that the symbol is located in
the address space such that its absolute address is representable in
16-bits. This should fix the assertion failure that was seen on the
Windows hosts.
We already did (X & C2) > C1 --> (X & C2) != 0, if any bit set in (X & C2) will produce a result greater than C1. But there is an equivalent inverse condition with <= C1 (which will be canonicalized to < C1+1)
The Swift CC is identical to Win64 CC with the exception of swift error
being passed in r12 which is a CSR. However, since this calling
convention is only used in swift -> swift code, it does not impact
interoperability and can be treated entirely as Win64 CC. We would
previously incorrectly lower the frame setup as we did not treat the
frame as conforming to Win64 specifications.
Matt Arsenault [Wed, 20 Sep 2017 20:53:49 +0000 (20:53 +0000)]
AMDGPU: Add tied operands to v_mad_mix{lo|hi}_f16
These write to the low and high half of the destination
register and leave the other 16-bits unchanged. This is true
for most 16-bit instructions on gfx9, but we don't use that
now.
Introduce the llvm-cfi-verify tool (resubmission of D37937).
Summary: Resubmission of D37937. Fixed i386 target building (conversion from std::size_t& to uint64_t& failed). Fixed documentation warning failure about docs/CFIVerify.rst not being in the tree.
Matt Arsenault [Wed, 20 Sep 2017 20:28:39 +0000 (20:28 +0000)]
AMDGPU: Start selecting v_mad_mixlo_f16
Also add some tests that should be able to use v_mad_mixhi_f16,
but do not yet. This is trickier because we don't really model
the partial update of the register done by 16-bit instructions.
Introduce the llvm-cfi-verify tool (resubmission of D37937).
Summary: Resubmission of D37937. Fixed i386 target building (conversion from std::size_t& to uint64_t& failed). Fixed documentation warning failure about docs/CFIVerify.rst not being in the tree.
CodeGen: support SwiftError SwiftCC on Windows x64
Add support for passing SwiftError through a register on the Windows x64
calling convention. This allows the use of swifterror attributes on
parameters which is used by the swift front end for the `Error`
parameter. This partially enables building the swift standard library
for Windows x86_64.
Marek Sokolowski [Wed, 20 Sep 2017 18:33:35 +0000 (18:33 +0000)]
[llvm-readobj] Teach readobj to dump .res files (WindowsResource).
This enables readobj to output Windows resource files (.res). This way,
we'll be able to test .res outputs without comparing them byte-by-byte
with "magic binary files" generated by MS toolchain.
Re-land "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs"
After r313775, it's easier to maintain a parallel BitVector of spilled
locations indexed by location number.
I wasn't able to build a good reduced test case for this iteration of
the bug, but I added a more direct assertion that spilled values must
use frame index locations. If this bug reappears, it won't only fire on
the NEON vector code that we detected it on, but on medium-sized
integer-only programs as well.
This broke the buildbots, e.g.
http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/391
> Summary:
> This patch tries to vectorize loads of consecutive memory accesses, accessed
> in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
> which was reverted back due to some basic issue with representing the 'use mask'
> jumbled accesses.
>
> This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
>
> Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df
>
> Subscribers: mzolotukhin
>
> Reviewed By: ayal
>
> Differential Revision: https://reviews.llvm.org/D36130
>
> Review comments updated accordingly
>
> Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0
>
> Added a TODO for sortLoadAccesses API
>
> Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58
>
> Modified the TODO for sortLoadAccesses API
>
> Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565
>
> Review comment update for using OpdNum to insert the mask in respective location
>
> Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce
>
> Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase
>
> Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b
[DebugInfo] Use a MapVector to coalesce MachineOperand locations
Summary:
The new code should be linear in the number of DBG_VALUEs, while the old
code was quadratic. NFC intended.
This is also hopefully a more direct expression of the problem, which is
to:
1. Rewrite all virtual register operands to stack slots or physical
registers
2. Uniquely number those machine operands, assigning them location
numbers
3. Rewrite all uses of the old location numbers in the interval map to
use the new location numbers
In r313400, I attempted to track which locations were spilled in a
parallel bitvector indexed by location number. My code was broken
because these location numbers are not stable during rewriting.
In these cases, two selects have constant selectable operands for
both the true and false components and have the same conditional
expression.
We then create two arithmetic operations of the same type and feed a
final select operation using the result of the true arithmetic for the true
operand and the result of the false arithmetic for the false operand and reuse
the original conditionl expression.
The arithmetic operations are naturally folded as a consequence, leaving
only the newly formed select to replace the old arithmetic operation.
Patch by: Michael Berg <michael_c_berg@apple.com>
Differential Revision: https://reviews.llvm.org/D37019
Mohammad Shahid [Wed, 20 Sep 2017 17:19:57 +0000 (17:19 +0000)]
[SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
which was reverted back due to some basic issue with representing the 'use mask'
jumbled accesses.
This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
Teresa Johnson [Wed, 20 Sep 2017 17:09:47 +0000 (17:09 +0000)]
[ThinLTO] Fix dead stripping analysis for SamplePGO
Summary:
The fix for dead stripping analysis in the case of SamplePGO indirect
calls to local functions (r313151) introduced the possibility of an
infinite loop.
Make sure we check for the value being already live after we update it
for SamplePGO indirect call handling.
Reviewers: danielcdh
Subscribers: mehdi_amini, inglorion, llvm-commits, eraman
[lit] Reverse path list when updating environment vars.
Bug pointed out by EricWF. This would construct a path where
items would be added in the wrong order, potentially leading
to using the wrong tools for testing.
Make libcxx tests work when llvm sources are not present.
Despite a strong CMake warning that this is an unsupported
libcxx build configuration, some bots still rely on being
able to check out lit and libcxx independently with no
LLVM sources, and then run lit against libcxx.
A previous patch broke that workflow, so this is making it work
again. Unfortunately, it breaks generation of the llvm-lit
script for libcxx, but we will just have to live with that until
a solution is found that allows libcxx to make more use of
llvm build pieces. libcxx can still run tests by using the
ninja check target, or by running lit.py directly against the
build tree or source tree.