Craig Topper [Wed, 11 Jan 2017 04:02:23 +0000 (04:02 +0000)]
[DAGCombiner] Teach DAG combiner to fold (vselect (N0 xor AllOnes), N1, N2) -> (vselect N0, N2, N1). Only do this if the target indicates its vector boolean type is ZeroOrNegativeOneBooleanContent.
This was reverted because it would miscompile code where the cmp had
multiple uses. That was due to a deficiency in the existing code, which
was fixed in r291630 (see the PR for details).
This re-commit includes an extra test for the kind of code that got
miscompiled: @test_sub_1_setcc_jcc.
Zachary Turner [Wed, 11 Jan 2017 00:35:43 +0000 (00:35 +0000)]
[CodeView/PDB] Rename a bunch of files.
We were starting to get some name clashes between llvm-pdbdump
and the common CodeView framework, so I took this opportunity
to rename a bunch of files to more accurately describe their
usage. This also helps in llvm-pdbdump to distinguish
between different files and whether they are used for pretty
dump mode or raw dump mode.
Zachary Turner [Wed, 11 Jan 2017 00:35:08 +0000 (00:35 +0000)]
[CodeView] Add TypeDatabase class.
This creates a centralized class in which to store type records.
It stores types as an array of entries, which matches the
notion of a type stream being a topologically sorted DAG.
Logic to build up such a database was already being used in
CVTypeDumper, so CVTypeDumper is now updated to to read from
a TypeDatabase which is filled out by an earlier visitor in
the pipeline.
Chandler Carruth [Wed, 11 Jan 2017 00:16:03 +0000 (00:16 +0000)]
[gmock] Teach gmock ElementsAre and BeginEndDistanceIs matchers to
handle generic ranges by using std::begin and std::end rather than
requiring things to look exactly like an STL container.
Much of the credit for this goes to Dave Blaikie who helped me figure
out the right incantations.
This will probably be re-designed when I send this to the maintainers of
gmock, so I've instead structured it to change is little as possible
while it is a local patch. That makes it somewhat ugly, but I think a focused
change is better for getting this to work for LLVM today and letting the
upstream maintainers figure out the correct long-term pattern.
Florian Hahn [Tue, 10 Jan 2017 23:43:35 +0000 (23:43 +0000)]
[loop-unroll] Properly populate LoopInfo for loops cloned in LoopUnrollRuntime.
Summary:
This fixes Transforms/LoopUnroll/runtime-loop3.ll which failed with
EXTENSIVE_DEBUG, because the cloned basic blocks were not added to the
correct sub-loops in LoopUnrollRuntime.cpp.
Justin Lebar [Tue, 10 Jan 2017 23:43:04 +0000 (23:43 +0000)]
[TM] Restore default TargetOptions in TargetMachine::resetTargetOptions.
Summary:
Previously if you had
* a function with the fast-math-enabled attr, followed by
* a function without the fast-math attr,
the second function would inherit the first function's fast-math-ness.
This means that mixing fast-math and non-fast-math functions in a module
was completely broken unless you explicitly annotated every
non-fast-math function with "unsafe-fp-math"="false". This appears to
have been broken since r176986 (March 2013), when the resetTargetOptions
function was introduced.
This patch tests the correct behavior as best we can. I don't think I
can test FPDenormalMode and NoTrappingFPMath, because they aren't used
in any backends during function lowering. Surprisingly, I also can't
find any uses at all of LessPreciseFPMAD affecting generated code.
The NVPTX/fast-math.ll test changes are an expected result of fixing
this bug. When FMA is disabled, we emit add as "add.rn.f32", which
prevents fma combining. Before this patch, fast-math was enabled in all
functions following the one which explicitly enabled it on itself, so we
were emitting plain "add.f32" where we should have generated
"add.rn.f32".
Reid Kleckner [Tue, 10 Jan 2017 23:23:58 +0000 (23:23 +0000)]
Move the section name from GlobalObject to the LLVMContext
Summary:
Convention wisdom says that bytes in Function are precious, and the
vast, vast majority of globals do not live in special sections. Even
when they do, they tend to live in the same section. Store the section
name on the LLVMContext in a StringSet, and maintain a map from
GlobalObject* to section name like we do for metadata, prefix data, etc.
The fact that we've survived this long wasting at least three pointers
of space in Function suggests that Function bytes are perhaps not as
precious as we once thought. Given that most functions have metadata
attachments when debug info is enabled, we might consider adding a
pointer here to make that access more efficient.
Kyle Butt [Tue, 10 Jan 2017 23:04:30 +0000 (23:04 +0000)]
CodeGen: Allow small copyable blocks to "break" the CFG.
When choosing the best successor for a block, ordinarily we would have preferred
a block that preserves the CFG unless there is a strong probability the other
direction. For small blocks that can be duplicated we now skip that requirement
as well.
Chandler Carruth [Tue, 10 Jan 2017 22:32:26 +0000 (22:32 +0000)]
Add the 'googlemock' component of Google Test to LLVM's unittest libraries.
I have two immediate motivations for adding this:
1) It makes writing expectations in tests *dramatically* easier. A
quick example that is a taste of what is possible:
std::vector<int> v = ...;
EXPECT_THAT(v, UnorderedElementsAre(1, 2, 3));
This checks that v contains '1', '2', and '3' in some order. There
are a wealth of other helpful matchers like this. They tend to be
highly generic and STL-friendly so they will in almost all cases work
out of the box even on custom LLVM data structures.
I actually find the matcher syntax substantially easier to read even
for simple assertions:
EXPECT_THAT(a, Eq(b));
EXPECT_THAT(b, Ne(c));
Both of these make it clear what is being *tested* and what is being
*expected*. With `EXPECT_EQ` this is implicit (the LHS is expected,
the RHS is tested) and often confusing. With `EXPECT_NE` it is just
not clear. Even the failure error messages are superior with the
matcher based expectations.
2) When testing any kind of generic code, you are continually defining
dummy types with interfaces and then trying to check that the
interfaces are manipulated in a particular way. This is actually what
mocks are *good* for -- testing *interface interactions*. With
generic code, there is often no "fake" or other object that can be
used.
For a concrete example of where this is currently causing significant
pain, look at the pass manager unittests which are riddled with
counters incremented when methods are called. All of these could be
replaced with mocks. The result would be more effective at testing
the code by having tighter constraints. It would be substantially
more readable and maintainable when updating the code. And the error
messages on failure would have substantially more information as
mocks automatically record stack traces and other information *when
the API is misused* instead of trying to diagnose it after the fact.
I expect that #1 will be the overwhelming majority of the uses of gmock,
but I think that is sufficient to justify having it. I would actually
like to update the coding standards to encourage the use of matchers
rather than any other form of `EXPECT_...` macros as they are IMO
a strict superset in terms of functionality and readability.
I think that #2 is relatively rarely useful, but there *are* cases where
it is useful. Historically, I think misuse of actual mocking as
described in #2 has led to resistance towards this framework. I am
actually sympathetic to this -- mocking can easily be overused. However
I think this is not a significant concern in LLVM. First and foremost,
LLVM has very careful and rare exposure of abstract interfaces or
dependency injection, which are the most prone to abuse with mocks. So
there are few opportunities to abuse them. Second, a large fraction of
LLVM's unittests are testing *generic code* where mocks actually make
tremendous sense. And gmock is well suited to building interfaces that
exercise generic libraries. Finally, I still think we should be willing
to have testing utilities in tree even if they should be used rarely. We
can use code review to help guide the usage here.
For a longer and more complete discussion of this, see the llvm-dev
thread here:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108672.html
The general consensus seems that this is a reasonable direction to start
down, but that doesn't mean we should race ahead and use this
everywhere. I have one test that is blocked on this to land and that was
specifically used as an example. Before widespread adoption, I'm going
to work up some (brief) guidelines as some of these facilities should be
used sparingly and carefully.
Matt Arsenault [Tue, 10 Jan 2017 22:02:30 +0000 (22:02 +0000)]
DAG: Avoid OOB when legalizing vector indexing
If a vector index is out of bounds, the result is supposed to be
undefined but is not undefined behavior. Change the legalization
for indexing the vector on the stack so that an out of bounds
index does not create an out of bounds memory access.
Derek Schuff [Tue, 10 Jan 2017 21:59:53 +0000 (21:59 +0000)]
[WebAssembly] Only RAUW a constant once in FixFunctionBitcasts
When we collect 2 uses of a function in FindUses and then RAUW when we
visit the first, we end up visiting the wrapper (because the second was
RAUW'd). We still want to use RAUW instead of just Use->set() because
it has special handling for Constants, so this patch just ensures that
only one use of each constant is added to the work list.
Victor Leschuk [Tue, 10 Jan 2017 21:18:26 +0000 (21:18 +0000)]
DebugInfo: support for DW_FORM_implicit_const
Support for DW_FORM_implicit_const DWARFv5 feature.
When this form is used attribute value goes to .debug_abbrev section (as SLEB).
As this form would break any debug tool which doesn't support DWARFv5
it is guarded by dwarf version check. Attempt to use this form with
dwarf version <= 4 is considered a fatal error.
Michal Gorny [Tue, 10 Jan 2017 19:55:51 +0000 (19:55 +0000)]
[llvm-config] Canonicalize CMake booleans to 0/1
Following the similar change to lit configuration, ensure that all CMake
booleans are canonicalized to 0/1 when being passed to llvm-config. This
fixes the incorrect interpretation of values when user passes another
value than the ON/OFF, and simplifies the code by removing unnecessary
string matching.
Furthermore, the code for --has-rtti and --has-global-isel has been
modified to print consistent values indepdently of the boolean used by
passed by the user to CMake. Sadly, the code already implicitly used
different values for the two (YES/NO for --has-rtti, ON/OFF for
--has-global-isel).
Include tests for all booleans and multi-value options in llvm-config.
Petr Hosek [Tue, 10 Jan 2017 19:47:05 +0000 (19:47 +0000)]
[CMake] Handle common options for runtimes build
All the existing runtimes relies on flags which are set by AddLLVM
and HandleLLVMOptions. In the standalone case, they would include
these themselves, but when being built using LLVM runtimes we should
include these in the top-level runtimes CMake files.
[LV] Don't panic when encountering the IV of an outer loop.
Bail out instead of asserting when we encounter this situation,
which can actually happen.
The reason the test uses the new PM is that the "bad" phi, incidentally, gets
cleaned up by LoopSimplify. But LICM can create this kind of phi and preserve
loop simplify form, so the cleanup has no chance to run.
This fixes PR31190.
We may want to solve this in a less conservative manner, since this phi is
actually uniform within the inner loop (or we may want LICM to output a cleaner
promotion to begin with).
Rong Xu [Tue, 10 Jan 2017 19:30:20 +0000 (19:30 +0000)]
[PGO] Turn off comdat renaming in IR PGO by default
Summary:
In IR PGO we append the function hash to comdat functions to avoid the
potential hash mismatch. This turns out not legal in some cases: if the comdat
function is address-taken and used in comparison. Renaming changes the semantic.
This patch turns off comdat renaming by default.
To alleviate the hash mismatch issue, we now rename the profile variable
for comdat functions. Profile allows co-existing multiple versions of profiles
with different hash value. The inlined copy will always has the correct profile
counter. The out-of-line copy might not have the correct count. But we will
not have the bogus mismatch warning.
[X86][AVX512]Improving shuffle lowering by using AVX-512 EXPAND* instructions
This patch fix PR31351: https://llvm.org/bugs/show_bug.cgi?id=31351
1. This patch adds new type of shuffle lowering
2. We can use the expand instruction, When the shuffle pattern is as following:
{ 0*a[0]0*a[1]...0*a[n] , n >=0 where a[] elements in a ascending order}.
Simon Dardis [Tue, 10 Jan 2017 16:40:57 +0000 (16:40 +0000)]
[mips] Fix Mips MSA instrinsics
The usage of some MIPS MSA instrinsics that took immediates could crash LLVM
during lowering. This patch addresses that behaviour. Crucially this patch
also makes the use of intrinsics with out of range immediates as producing an
internal error.
The ld,st instrinsics would trigger an assertion failure for MIPS64 as their
lowering would attempt to add an i32 offset to a i64 pointer.
Simon Dardis [Tue, 10 Jan 2017 15:53:10 +0000 (15:53 +0000)]
[mips] Honour -mno-odd-spreg for vector splat (again)
Previous the lowering of FILL_FW would use the MSA128W register class when
performing a vector splat. Instead it should be honouring -mno-odd-spreg and
only use the even registers when performing a splat from word to vector
register.
Logical follow-on from r230235.
This fixes PR/31369.
A previous commit was missing the test case and had another differential
in it.
Simon Dardis [Tue, 10 Jan 2017 10:28:37 +0000 (10:28 +0000)]
[mips] Honour -mno-odd-spreg for vector splat
Previous the lowering of FILL_FW would use the MSA128W register class when
performing a vector splat. Instead it should be honouring -mno-odd-spreg and
only use the even registers when performing a splat from word to vector
register.
Craig Topper [Tue, 10 Jan 2017 07:42:57 +0000 (07:42 +0000)]
[DAGCombiner] Merge together duplicate checks for folding fold (select C, 1, X) -> (or C, X) and folding (select C, X, 0) -> (and C, X). Also be consistent about checking that both the condition and the result type are i1. NFC
I guess previously we just assumed if the result type was i1, then the condition type must also be i1?
Chris Bieneman [Tue, 10 Jan 2017 06:22:49 +0000 (06:22 +0000)]
[ObjectYAML] Support for DWARF line tables
One more try... relanding r291541 with a fix to properly gate MaxOpsPerInst on DWARF version.
Description from r291541:
This patch re-lands r291470, which failed on Linux bots. The issue (I believe) was undefined behavior because the size of llvm::dwarf::LineNumberOps was not explcitly specified or consistently respected. The updated patch adds an explcit underlying type to the enum and preserves the size more correctly.
Original description:
This patch adds support for the DWARF debug_lines section. The line table state machine opcodes are preserved, so this can be used to test the state machine evaluation directly.
Craig Topper [Tue, 10 Jan 2017 06:01:16 +0000 (06:01 +0000)]
AMD family 17h (znver1) enablement
Summary:
This patch enables the following
1. AMD family 17h architecture using "znver1" tune flag (-march, -mcpu).
2. ISAs that are enabled for "znver1" architecture.
3. Checks ADX isa from cpuid to identify "znver1" flag when -march=native is used.
4. ISAs FMA4, XOP are disabled as they are dropped from amdfam17.
5. For the time being, it uses the btver2 scheduler model.
6. Test file is updated to check this flag.
This item is linked to clang review item https://reviews.llvm.org/D28018
Chris Bieneman [Tue, 10 Jan 2017 05:25:24 +0000 (05:25 +0000)]
[ObjectYAML] Support for DWARF line tables
This patch re-lands r291470, which failed on Linux bots. The issue (I believe) was undefined behavior because the size of llvm::dwarf::LineNumberOps was not explcitly specified or consistently respected. The updated patch adds an explcit underlying type to the enum and preserves the size more correctly.
Original description:
This patch adds support for the DWARF debug_lines section. The line table state machine opcodes are preserved, so this can be used to test the state machine evaluation directly.
Craig Topper [Tue, 10 Jan 2017 04:12:24 +0000 (04:12 +0000)]
[X86] When lowering uniform shifts, use X86ISD::VZEXT instead of using a ZERO_EXTEND_VECTOR_INREG. If we emit the ZERO_EXTEND_VECTOR_INREG too late it doesn't get lowered properly and makes it through to isel and fails.
Serge Pavlov [Tue, 10 Jan 2017 02:50:47 +0000 (02:50 +0000)]
[StructurizeCfg] Update dominator info.
In some cases StructurizeCfg updates root node, but dominator info
remains unchanges, it causes crash when expensive checks are enabled.
To cope with this problem a new method was added to DominatorTreeBase
that allows adding new root nodes, it is called in StructurizeCfg to
put dominator tree in sync.
This is the second part of a multi-part change to define additional
subcommands to the `llvm-xray` tool.
This change defines a conversion subcommand to take XRay log files, and
turns them from one format to another (binary or YAML). This currently
only supports the first version of the log file format, defined in the
compiler-rt runtime.
Evandro Menezes [Tue, 10 Jan 2017 01:08:01 +0000 (01:08 +0000)]
[CodeGen] Implement the SUnit::print() method
This method seems to have had a troubled life. This patch proposes that it
replaces the recently added helper function dumpSUIdentifier. This way, the
method can be used in other files using the SUnit class.
Reid Kleckner [Tue, 10 Jan 2017 01:05:33 +0000 (01:05 +0000)]
Try once again to fix the MSVC build of AlignedCharArrayUnion
It was complaining about ambiguity between llvm::detail and
llvm::support::detail:
error C2872: 'detail': ambiguous symbol
note: could be 'llvm::detail'
note: or 'llvm::support::detail'
Standardize on llvm::support::detail to hide these symbols further.
Reid Kleckner [Tue, 10 Jan 2017 00:26:56 +0000 (00:26 +0000)]
Fix MSVC build of AlignedCharArrayUnion
Use constexpr recursion for alignof like we do for sizeof. Seems to work
with Clang and MSVC. Also, don't recurse twice to avoid slowdowns in
compilers that don't memoize constexpr results (Clang).
James Y Knight [Mon, 9 Jan 2017 23:11:25 +0000 (23:11 +0000)]
Commit a test for match-full-lines.
I unfortunately neglected to add it in r260540, but it has been
sitting in my working dir ever since. D'oh.
Modified to work with r290069, which made the CHECK patterns
themselves whitespace-sensitive as well, and remove the test added
then, as this tests both strict and non-strict modes.
Rui Ueyama [Mon, 9 Jan 2017 22:55:00 +0000 (22:55 +0000)]
TarWriter: Fix a bug in Ustar header.
If we split a filename into `Name` and `Prefix`, `Prefix` is at most
145 bytes. We had a bug that didn't split a path correctly. This bug
was pointed out by Rafael in the post commit review.
This patch adds a unit test for TarWriter to verify the fix.
Easwaran Raman [Mon, 9 Jan 2017 21:56:26 +0000 (21:56 +0000)]
Refactor inline threshold update code.
Functional change: Previously, if a callee is cold, we used ColdThreshold if it minimizes the existing threshold. This was irrespective of whether we were optimizing for minsize (-Oz) or not. But -Oz uses very low threshold to begin with and the inlining with -Oz is expected to be tuned for lowering code size, so there is no good reason to set an even lower threshold for cold callees. We now lower the threshold for cold callees only when -Oz is not used. For default values of -inlinethreshold and -inlinecold-threshold, this change has no effect and this simplifies the code.
NFC changes: Group all threshold updates that are guarded by !Caller->optForMinSize() and within that group threshold updates that require profile summary info.
When writing to a non regular file we cannot rename to it. Since we
have to write, we may as well create a temporary file to avoid trying
to create an unique file in /dev when trying to write to /dev/null.
Matthias Braun [Mon, 9 Jan 2017 21:38:17 +0000 (21:38 +0000)]
PeepholeOptimizer: Do not replace SubregToReg(bitcast like)
While we can usually replace bitcast like instructions
(MachineInstr::isBitcast()) with a COPY this is not legal if any of the
users uses SUBREG_TO_REG to assert the upper bits of the result are
zero.
Rui Ueyama [Mon, 9 Jan 2017 21:20:42 +0000 (21:20 +0000)]
TarWriter: Set "00" to Ustar version field.
Most (maybe all?) tar commands can handle tar archives with blank
version fields, but POSIX requires "00" to be set to the field, so
doing it is good for compliance.
Chris Bieneman [Mon, 9 Jan 2017 20:01:37 +0000 (20:01 +0000)]
[ObjectYAML] Support for DWARF line tables
This patch adds support for the DWARF debug_lines section. The line table state machine opcodes are preserved, so this can be used to test the state machine evaluation directly.
Matthew Simpson [Mon, 9 Jan 2017 19:05:29 +0000 (19:05 +0000)]
[LV] Fix-up external IV users after updating dominator tree
This patch delays the fix-up step for external induction variable users until
after the dominator tree has been properly updated. This should fix PR30742.
The SCEVExpander in InductionDescriptor::transform can generate code in the
wrong location if the dominator tree is not up-to-date. We should work towards
keeping the dominator tree up-to-date throughout the transformation.
Matt Arsenault [Mon, 9 Jan 2017 18:52:39 +0000 (18:52 +0000)]
AMDGPU: Add Assert[SZ]Ext during argument load creation
For i16 zeroext arguments when i16 was a legal type, the
known bits information from the truncate was lost. Insert
a zeroext so the known bits optimizations work with the 32-bit
loads.
Fixes code quality regressions vs. SI in min.ll test.