Silviu Baranga [Fri, 30 Oct 2015 15:02:28 +0000 (15:02 +0000)]
[SCEV] Generalize the SCEV algorithm for creating expressions for PHI nodes
Summary:
When forming expressions for phi nodes having an incoming value from
outside the loop A and a value coming from the previous iteration B
we were forming an AddRec if:
- B was an AddRec
- the value A was equal to the value for B at iteration -1 (or equal
to the value of B shifted by one iteration, at iteration 0)
In this case, we were computing the expression to be the expression of
B, shifted by one iteration.
This changes generalizes the logic above by removing the restriction that
B needs to be an AddRec. For this we introduce two expression rewriters
that allow us to
- shift an expression by one iteration
- get the value of an expression at iteration 0
This allows us to get SCEV expressions for PHI nodes when these expressions
are not AddRecExprs.
Dehao Chen [Fri, 30 Oct 2015 05:07:15 +0000 (05:07 +0000)]
Recommit r251680 (also need to update clang test)
Update the discriminator assignment algorithm
* If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it.
* If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt.
original code:
; #1 int foo(int i) {
; #2 if (i == 3 || i == 5) return 100; else return 99;
; #3 }
Dehao Chen [Fri, 30 Oct 2015 04:29:05 +0000 (04:29 +0000)]
Revert r251680:
Update the discriminator assignment algorithm
* If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it.
* If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt.
original code:
; #1 int foo(int i) {
; #2 if (i == 3 || i == 5) return 100; else return 99;
; #3 }
Dehao Chen [Fri, 30 Oct 2015 02:38:29 +0000 (02:38 +0000)]
Update the discriminator assignment algorithm
* If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it.
* If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt.
original code:
; #1 int foo(int i) {
; #2 if (i == 3 || i == 5) return 100; else return 99;
; #3 }
Alexey Samsonov [Fri, 30 Oct 2015 00:40:20 +0000 (00:40 +0000)]
Let the users of LLVMSymbolizer decide whether they want to symbolize inlined frames.
Introduce LLVMSymbolizer::symbolizeInlinedCode() instead of switching
on PrintInlining option passed to the constructor. This will be needed
once we retrun structured data (instead of std::string) from
LLVMSymbolizer and move printing logic out.
Alexey Samsonov [Thu, 29 Oct 2015 22:21:37 +0000 (22:21 +0000)]
[LLVMSymbolize] Move ModuleInfo into a separate class (SymbolizableModule).
Summary:
This is mostly NFC. It is a first step in cleaning up LLVMSymbolize
library. It removes "ModuleInfo" class which bundles together ObjectFile
and its debug info context in favor of:
* abstract SymbolizableModule in public headers;
* SymbolizableObjectFile subclass in implementation.
Additionally, SymbolizableObjectFile is now created via factory, so we
can properly detect object parsing error at this stage instead of keeping
the broken half-parsed object. As a next step, we would be able to
propagate the error all the way back to the library user.
Further improvements might include:
* factoring out the logic of finding appropriate file with debug info
for a given object file, and caching all parsed object files into a
separate class [A].
* factoring out DILineInfo rendering [B].
This would make what is now a heavyweight "LLVMSymbolizer" a relatively
straightforward class, that calls into [A] to turn filepath into a
SymbolizableModule, delegates actual symbolization to concrete SymbolizableModule
implementation, and lets [C] render the result.
Simon Pilgrim [Thu, 29 Oct 2015 22:11:28 +0000 (22:11 +0000)]
[X86][SSE] Shuffle blends with zero
This patch generalizes the zeroing of vector elements with the BLEND instructions. Currently a zero vector will only blend if the shuffled elements are correctly inline, this patch recognises when a vector input is zero (or zeroable) and modifies a local copy of the shuffle mask to support a blend. As a zeroable vector input may not be all zeroes, the zeroable vector is regenerated if necessary.
Lang Hames [Thu, 29 Oct 2015 22:04:22 +0000 (22:04 +0000)]
[Orc] Teach IndirectStubsManager to manage an expandable pool of stubs, rather
than a pre-allocated slab of stubs. Also add a convenience method for creating a
single stub, rather than a whole block a time.
Teresa Johnson [Thu, 29 Oct 2015 21:24:38 +0000 (21:24 +0000)]
Fix test check label.
Summary:
I noticed when manually modifying this test that it was passing when I
expected it to fail. Looks like the combination of LABEL and NOT on the
check does not work. This can also be seen when running FileCheck with
only that one -check-prefix (removing the additional -check-prefix=B):
/usr/local/google/home/tejohnson/llvm/llvm_11_build/./bin/llvm-link -S -internalize -only-needed /usr/local/google/home/tejohnson/llvm/llvm_11_build/test/Linker/Output/link-flags.ll.tmp.b.bc /usr/local/google/home/tejohnson/llvm/llvm_11_build/test/Linker/Output/link-flags.ll.tmp.c.bc | /usr/local/google/home/tejohnson/llvm/llvm_11_build/./bin/FileCheck /usr/local/google/home/tejohnson/llvm/llvm_11/test/Linker/link-flags.ll -check-prefix=CN
error: no check strings found with prefix 'CN:'
The CN prefix checks don't in fact need "LABEL" so remove that.
Chandler Carruth [Thu, 29 Oct 2015 18:29:15 +0000 (18:29 +0000)]
[FunctionAttrs] Provide a single SCC node set to all of the
transformations in FunctionAttrs rather than building a new one each
time.
This isn't trivial because there are different heuristics from different
passes for exactly what set they want. The primary difference is whether
an *overridable* function completely disables the synthesis of
attributes. I've modeled this by directly testing for overridable, and
using the common set that excludes external and opt-none functions.
This does cause some changes by disabling more optimizations in the face
of opt-none. Specifically, we were still optimizing *calls* to opt-none
functions based on their attributes, just not the bodies. It seems
better to be conservative on both fronts given the intended semanticas
here (best effort to not assume or disturb anything). I've not tried to
test this change as it seems complex, brittle, and not important to the
implicit contract of opt-none. Instead, it seems more like a choice that
should be dictated by the simplified implementation and the change to be
acceptable differences within the space of opt-none.
A big benefit here is that these transformations no longer rely on the
legacy pass manager's SCC types, they just work on generic sets of
function pointers. This will make it easy to re-use their logic in the
new pass manager.
I've also made the transforms static functions instead of members where
trivial while I was touching the signatures.
Jonas Paulsson [Thu, 29 Oct 2015 16:13:55 +0000 (16:13 +0000)]
[SystemZ] Make the CCRegs regclass non-allocatable.
This was discovered to be necessary while running memchr-01.ll with
-verify-machinstrs, because it is not allowed to have a phys reg live
accross block boundaries while on SSA form, if the register is
allocatable (expect in entry block and landing pads).
In this test case, stringRRE pseudos are expanded after isel by adding
a loop block which produces a live out CC register. To make the test
pass, it was also necessary to not say that StringRRELoop pseudo uses
R0L, this is only true for the StringRRE opcode.
-verify-machineinstrs added to memchr-01.ll test.
New test case int-cmp-51.ll to test that MachineCSE can eliminate
an identical compare (which it couldn't do before).
Zoran Jovanovic [Thu, 29 Oct 2015 14:40:19 +0000 (14:40 +0000)]
[mips] wrong opcode for ll/sc instructions on mipsr6 when -integrated-as is used
Summary:
This commit resolves wrong opcodes for ll and sc instructions for r6 architecutres, which were generated in method MipsTargetLowering::emitAtomicBinary.
This patch unify the 39-bit and 42-bit mapping for aarch64 to use only
one instrumentation algorithm. This removes compiler flag
SANITIZER_AARCH64_VMA requirement for MSAN on aarch64.
[mips] Check the register class before replacing materializations of zero with $zero in microMIPS.
Summary:
The microMIPS register class GPRMM16 does not contain the $zero register.
However, MipsSEDAGToDAGISel::replaceUsesWithZeroReg() would replace uses
of the $dst register:
[d]addiu, $dst, $zero, 0
with the $zero register, without checking for membership in the register
class of the target machine operand.
Jonas Paulsson [Thu, 29 Oct 2015 08:28:35 +0000 (08:28 +0000)]
[MachineVerifier] Analyze MachineMemOperands for mem-to-mem moves.
Since the verifier will give false reports if it incorrectly thinks MI is
loading or storing using an FI, it is necessary to scan memoperands and
find out how the FI is used in the instruction. This should be relatively
rare.
Needed to make CodeGen/SystemZ/spill-01.ll pass, which now runs with this flag.
Matthias Braun [Thu, 29 Oct 2015 03:57:24 +0000 (03:57 +0000)]
ScheduleDAGInstrs: Remove IsPostRA flag
This was a layering violation in ScheduleDAGInstrs (and
MachineSchedulerBase) they both shouldn't know directly whether they are
used by the PostMachineScheduler or the MachineScheduler.
Philip Reames [Thu, 29 Oct 2015 03:57:17 +0000 (03:57 +0000)]
[LVI/CVP] Teach LVI about range metadata
Somewhat shockingly for an analysis pass which is computing constant ranges, LVI did not understand the ranges provided by range metadata.
As part of this change, I included a change to CVP primarily because doing so made it much easier to write small self contained test cases. CVP was previously only handling the non-local operand case, but given that LVI can sometimes figure out information about instructions standalone, I don't see any reason to restrict this. There could possibly be a compile time impact from this, but I suspect it should be minimal. If anyone has an example which substaintially regresses, please let me know. I could restrict the block local handling to ICmps feeding Terminator instructions if needed.
Note that this patch continues a somewhat bad practice in LVI. In many cases, we know facts about values, and separate context sensitive facts about values. LVI makes no effort to distinguish and will frequently cache the same value fact repeatedly for different contexts. I would like to change this, but that's a large enough change that I want it to go in separately with clear documentation of what's changing. Other examples of this include the non-null handling, and arguments.
As a meta comment: the entire motivation of this change was being able to write smaller (aka reasonable sized) test cases for a future patch teaching LVI about select instructions.
Cong Hou [Thu, 29 Oct 2015 01:28:44 +0000 (01:28 +0000)]
Add a flag vectorizer-maximize-bandwidth in loop vectorizer to enable using larger vectorization factor.
To be able to maximize the bandwidth during vectorization, this patch provides a new flag vectorizer-maximize-bandwidth. When it is turned on, the vectorizer will determine the vectorization factor (VF) using the smallest instead of widest type in the loop. To avoid increasing register pressure too much, estimates of the register usage for different VFs are calculated so that we only choose a VF when its register usage doesn't exceed the number of available registers.
Hal Finkel [Wed, 28 Oct 2015 23:43:00 +0000 (23:43 +0000)]
[PowerPC] Recurse through constants when looking for TLS globals
We cannot form ctr-based loops around function calls, including calls to
__tls_get_addr used for PIC TLS variables. References to such TLS variables,
however, might be buried within constant expressions, and so we need to search
the entire constant expression to be sure that no references to such TLS
variables exist.
Fixes PR25256, reported by Eric Schweitz. This is a slightly-modified version
of the patch suggested by Eric in the bug report, and a test case I created.
Hal Finkel [Wed, 28 Oct 2015 23:03:45 +0000 (23:03 +0000)]
[PowerPC] Don't return unsupported register classes for asm constraints
As a follow-up to r251566, do the same for the other optionally-supported
register classes (mostly for vector registers). Don't return an unavailable
register class (which would cause an assert later), but fail cleanly when
provided an unsupported inline asm constraint.
Tim Northover [Wed, 28 Oct 2015 22:36:05 +0000 (22:36 +0000)]
ARM: support .watchos_version_min and .tvos_version_min.
These MachO file directives are used by linkers and other tools to provide
compatibility information, much like the existing .ios_version_min and
.macosx_version_min.
Diego Novillo [Wed, 28 Oct 2015 22:30:25 +0000 (22:30 +0000)]
SamplePGO - Add flag to check sampling coverage.
This adds the flag -mllvm -sample-profile-check-coverage=N to the
SampleProfile pass. N is the percent of input sample records that the
user expects to apply. If the pass does not use N% (or more) of the
sample records in the input, it emits a warning.
This is useful to detect some forms of stale profiles. If the code has
drifted enough from the original profile, there will be records that do
not match the IR anymore.
This will not detect cases where a sample profile record for line L is
referring to some other instructions that also used to be at line L.
Hal Finkel [Wed, 28 Oct 2015 22:13:41 +0000 (22:13 +0000)]
Revert "r251451 - [AliasSetTracker] Use mod/ref information for UnknownInstr"
It looks like this broke the stage 2 builder:
http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto/6989/
Original commit message:
AliasSetTracker does not need to convert the access mode to ModRefAccess if the
new visited UnknownInst has only 'REF' modrefinfo to existing pointers in the
sets.
Sanjoy Das [Wed, 28 Oct 2015 21:27:14 +0000 (21:27 +0000)]
[SCEV] Compute max backedge count for loops with "shift ivs"
This teaches SCEV to compute //max// backedge taken counts for loops
like
for (int i = k; i != 0; i >>>= 1)
whatever();
SCEV yet cannot represent the exact backedge count for these loops, and
this patch does not change that. This is really geared towards teaching
SCEV that loops like the above are *not* infinite.
Sanjoy Das [Wed, 28 Oct 2015 21:27:08 +0000 (21:27 +0000)]
[JumpThreading] Use dominating conditions to prove implications
Summary:
If P branches to Q conditional on C and Q branches to R conditional on
C' and C => C' then the branch conditional on C' can be folded to an
unconditional branch.
Add a couple of helper methods to make the primary
raw profile reader interface's implementation more
readable. It also hides more format details. This
patch has no functional change.
Chris Bieneman [Wed, 28 Oct 2015 18:36:56 +0000 (18:36 +0000)]
[CMake] Disable adding the test suite as a projects subdirectory
This will never work as an add_subdirectory call, so we should just make sure it doesn't happen. To do this properly we'll need to add it under clang similar to the external compiler-rt.
Diego Novillo [Wed, 28 Oct 2015 17:40:22 +0000 (17:40 +0000)]
SamplePGO - Clear per-function data after applying a profile.
The pass was keeping around a lot of per-function data (visited blocks,
edges, dominance, etc) that is just taking up memory for no reason. In
fact, from function to function it could potentially confuse the
propagator since some maps are indexed by line offsets which can be
common between functions.
Igor Laevsky [Wed, 28 Oct 2015 16:42:00 +0000 (16:42 +0000)]
[AliasAnalysis] Take into account readonly attribute for the function arguments
In getArgModRefInfo we consider all arguments as having MRI_ModRef.
However for arguments marked with readonly attribute we can return
more precise answer - MRI_Ref.
James Molloy [Wed, 28 Oct 2015 14:30:53 +0000 (14:30 +0000)]
[GlobalOpt] Add newlines to DEBUG messages
I think these were affected by a change way back when to stop printing newlines in Value::dump() by default. This change simply allows the debug output to be readable.
Artyom Skrobov [Wed, 28 Oct 2015 13:58:36 +0000 (13:58 +0000)]
[ARM] Allow SP in rGPR, starting from ARMv8
Summary:
This patch handles assembly and disassembly, but not codegen, as of yet.
Additionally, it fixes a bug whereby SP and PC as shifted-reg operands
were treated as predictable in ARMv7 Thumb; and it enables the tests
for invalid and unpredictable instructions to run on both ARMv7 and ARMv8.
James Molloy [Wed, 28 Oct 2015 10:41:29 +0000 (10:41 +0000)]
[GlobalsAA] An indirect global that is initialized is not fair game
When checking if an indirect global (a global with pointer type) is only assigned by allocation functions, first check if the global is itself initialized. If it is, it's not only assigned by allocation functions.
This fixes PR25309. Thanks to David Majnemer for reducing the test case!
Chen Li [Wed, 28 Oct 2015 04:45:47 +0000 (04:45 +0000)]
[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader
Summary:
This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with
its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch.
Change InstrProfReaderIndex from typedef into a wrapper
class with helper methods. This makes the index profile
reader code more readable. It also hides the implementation
detail of the index format and make it more flexible to allow
support different (or more than one) format in the future.
Hal Finkel [Wed, 28 Oct 2015 03:26:45 +0000 (03:26 +0000)]
[PowerPC] Replace cntlz[.] with cntlzw[.]
cntlz is the old POWER mnemonic. cntlzw is the PowerPC mnemonic.
This change fixes an issue when -no-integrated-as: The opcode cntlz is
unrecognized by gas
Alias the POWER mnemonic cntlz[.] to the PowerPC mnemonic cntlzw[.]
This is done for because the POWER cntlz mnemonic has be used by LLVM for
a very long time. We need to make sure that assembly programs
that are using the cntlz[.] do not break with this change.
Change PowerPC tests to reflect the insn change from cntlz to cntlzw.
Add assembly test to verify cntlz[.] is encoded correctly.