Marcos Pividori [Sun, 22 Jan 2017 01:58:36 +0000 (01:58 +0000)]
[libFuzzer] Fix ListFilesInDirRecursive() to do the same for Posix and Windows.
Update `ListFilesInDirRecursive` implementation on Windows to have the same
behavior than for Posix, when the directory doesn't exists and when it is empty.
Marcos Pividori [Sun, 22 Jan 2017 01:58:26 +0000 (01:58 +0000)]
[libFuzzer] Portably disassemble and find calls to sanitizer_cov_trace_pc_guard.
Instead of directly using objdump, which is not present on Windows, we consider
different tools depending on the platform.
For Windows, we consider dumpbin and llvm-objdump.
Marcos Pividori [Sun, 22 Jan 2017 01:27:42 +0000 (01:27 +0000)]
[libFuzzer] Remove optimization flags for tests.
We need to build all the tests with -O0, otherwise optimizations may merge some
basic blocks and the tests will fail.
In this diff, I simplify the cmake implementation and I remove the flags for
Windows too (/O[123s]).
Marcos Pividori [Sun, 22 Jan 2017 01:27:34 +0000 (01:27 +0000)]
[libFuzzer] Remove dependencies for tests on Windows.
Remove dependency on FileCheck, sancov and not for tests on Windows.
If LLVM_USE_SANITIZER=Address and LLVM_USE_SANITIZE_COVERAGE=YES, this will
trigger the building of dependencies with sanitizer instrumentation.
This will fail in Windows, since cmake will use link.exe for linking and won't
include compiler-rt libraries.
Sanjay Patel [Sat, 21 Jan 2017 17:51:25 +0000 (17:51 +0000)]
[ValueTracking] tighten up matchMinMax(); NFCI
This is similar to what the caller (matchSelectPattern()) does. In all
cases where we succeed in matching a min/max pattern, the values in
that pattern will be the values of the 'select', so hoist that and
remove a bunch of duplicated code.
Chandler Carruth [Sat, 21 Jan 2017 04:16:53 +0000 (04:16 +0000)]
[PM] Sink an LCSSA preservation assert from the LoopSimplify pass into
the library routine shared with the new PM and other code.
This assert checks that when LCSSA preservation is requested we start in
LCSSA form. Without this early assert, given *very* complex test cases
we can hit an assert or crash much later on when trying to preserve
LCSSA.
The new PM's loop simplify doesn't need to (and indeed can't) preserve
LCSSA as the new PM doesn't deal in transforms in the dependency graph.
But we asked the library to and shockingly, this didn't work very well!
Stop doing that. Now the assert will tell us immediately with existing
test cases. Before this, it took a pretty convoluted input to trigger
this.
However, sinking the assert also found a bug in LoopUnroll where we
asked simplifyLoop to preserve LCSSA *right before we reform it*. That's
kinda silly and unsurprising that it wasn't available. =D Stop doing
that too.
We also would assert that the unrolled loop was in LCSSA even if
preserving LCSSA was never requested! I don't have a test case or
anything here. I spotted it by inspection and it seems quite obvious. No
logic change anyways, that's just avoiding a spurrious assert.
Chandler Carruth [Sat, 21 Jan 2017 03:48:51 +0000 (03:48 +0000)]
[PM] Teach the loop PM to run LoopSimplify prior to the loop pipeline.
This adds the last remaining core feature of the loop pass pipeline in
the new PM and removes the last of the really egregious hacks in the
LICM tests.
Sadly, this requires really substantial changes in the unittests in
order to provide and maintain simplified loops. This is particularly
hard because for example LoopSimplify will try to fold undef branches to
an ideal direction and simplify the loop accordingly.
This is a stub implementation of the `-s` or `--format` option that
allows the user to specify the demangling style. Since we only support
the Itanium (GNU) style demangling, auto is synonymous with `gnu`.
Simply swallow the option to permit some level of commandline
compatibility.
MergeFunctions: Preserve debug info in thunks, under option -mergefunc-preserve-debug-info
Summary:
Under option -mergefunc-preserve-debug-info we:
- Do not create a new function for a thunk.
- Retain the debug info for a thunk's parameters (and associated
instructions for the debug info) from the entry block.
Note: -debug will display the algorithm at work.
- Create debug-info for the call (to the shared implementation) made by
a thunk and its return value.
- Erase the rest of the function, retaining the (minimally sized) entry
block to create a thunk.
- Preserve a thunk's call site to point to the thunk even when both occur
within the same translation unit, to aid debugability. Note that this
behaviour differs from the underlying -mergefunc implementation which
modifies the thunk's call site to point to the shared implementation
when both occur within the same translation unit.
Justin Lebar [Sat, 21 Jan 2017 00:59:57 +0000 (00:59 +0000)]
[ConstantFolding] Constant-fold llvm.sqrt(x) like other intrinsics.
Summary:
Currently we return undef, but we're in the process of changing the
LangRef so that llvm.sqrt behaves like the other math intrinsics,
matching the return value of the standard libcall but not setting errno.
This change is legal even without the LangRef change because currently
calling llvm.sqrt(x) where x is negative is spec'ed to be UB. But in
practice it's also safe because we're simply constant-folding fewer
inputs: Inputs >= -0 get constant-folded as before, but inputs < -0 now
aren't constant-folded, because ConstantFoldFP aborts if the host math
function raises an fp exception.
Guozhi Wei [Fri, 20 Jan 2017 23:35:27 +0000 (23:35 +0000)]
[PPC] Give unaligned memory access lower cost on processor that supports it
Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost.
Davide Italiano [Fri, 20 Jan 2017 23:29:28 +0000 (23:29 +0000)]
[NewGVN] Optimize processing for instructions found trivially dead.
Don't call `isTriviallyDeadInstructions()` once we discover that
an instruction is dead. Instead, set DFS number zero (as suggested
by Danny) and forget about it (this also speeds up things as we
won't try to reprocess that block).
Tim Northover [Fri, 20 Jan 2017 23:25:17 +0000 (23:25 +0000)]
GlobalISel: prevent heap use-after-free when looking up VReg.
Translating the constant can create more VRegs, which can invalidate the
reference into the DenseMap. So we have to look up the value again after all
that's happened.
Marcos Pividori [Fri, 20 Jan 2017 22:49:08 +0000 (22:49 +0000)]
[libFuzzer] Use clang as linker on Windows, to properly include sanitizer libraries.
In order to use sanitizers on Windows, we need to link against many runtime
libraries which will depend on the target being created (executable or dll) and
the c runtime library used (MT/MD).
By default, cmake uses link.exe for linking, which fails because we don't
specify the appropiate dependencies. As we don't want to consider all of that
possible situations which depends on the implementation of the compiler-rt, the
simplest option is to change the rules for linking executables and shared
libraries, using the compiler instead of link.exe.
Clang driver will consider the sanitizer flags, and automatically provide the
required libraries to the linker.
Easwaran Raman [Fri, 20 Jan 2017 22:44:04 +0000 (22:44 +0000)]
Improve PGO support for the new inliner
This adds the following to the new PM based inliner in PGO mode:
* Use block frequency analysis to derive callsite's profile count and use
that to adjust thresholds of hot and cold callsites.
* Incrementally update the BFI of the caller after a callee gets inlined
into it. This incremental update is only within an invocation of the run
method - BFI is not preserved across calls to run.
Update the function entry count of the callee after inlining it into a
caller.
* I've tuned the thresholds for the hot and cold callsites using a hacked
up version of the old inliner that explicitly computes BFI on a set of
internal benchmarks and spec. Once the new PM based pipeline stabilizes
(IIRC Chandler mentioned there are known issues) I'll benchmark this
again and adjust the thresholds if required.
Inliner PGO support.
Zachary Turner [Fri, 20 Jan 2017 22:41:40 +0000 (22:41 +0000)]
[pdb] Merge NamedStreamMapBuilder and NamedStreamMap.
While the builder pattern has proven useful for certain other
larger types, in this case it was hampering the ability to use
the data structure, as for runtime access we need a map that
we can efficiently read from and write to. So the two are merged
into a single data structure that can efficiently be read to,
written from, deserialized from bytes, and serialized to bytes.
Sanjay Patel [Fri, 20 Jan 2017 22:18:47 +0000 (22:18 +0000)]
[ValueTracking] recognize variations of 'clamp' to improve codegen (PR31693)
By enhancing value tracking, we allow an existing min/max canonicalization to
kick in and improve codegen for several targets that have min/max instructions.
Unfortunately, recognizing min/max in value tracking may cause us to hit
a hack in InstCombiner::visitICmpInst() more often:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109340.html
...but I'm hoping we can remove that soon.
Teresa Johnson [Fri, 20 Jan 2017 21:54:58 +0000 (21:54 +0000)]
[ThinLTO] Drop non-prevailing non-ODR weak to declarations
Summary:
Allow non-ODR weak/linkonce non-prevailing copies to be marked
as available_externally in the index. Add support for dropping these to
declarations in the backend.
Sanjay Patel [Fri, 20 Jan 2017 21:49:41 +0000 (21:49 +0000)]
[InstCombine] add tests to show missed canonicalization of min/max; NFC
Unfortunately, recognizing these in value tracking may cause us to hit
a hack in InstCombiner::visitICmpInst() more often:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109340.html
...but besides being the obviously Right Thing To Do, there's a clear
codegen win from identifying these patterns for several targets.
Daniel Berlin [Fri, 20 Jan 2017 21:04:30 +0000 (21:04 +0000)]
NewGVN: Fix PR 31686 and PR 31698 by rewriting store leader handling.
Summary:
This rewrites store expression/leader handling. We no longer use the
value operand as the leader, instead, we store it separately. We also
now store the stored value as part of the expression, and compare it
when comparing stores for equality. This enables us to get rid of a
bunch of our previous hacks and machinations, as the existing
machinery takes care of everything *except* updating the stored value
on classes. The only time we have to update it is if the storecount
goes to 0, and when we do, we destroy it.
Since we no longer use the value operand as the leader, during elimination, we have to use the value operand. Doing this also fixes a bunch of store forwarding cases we were missing.
Any value operand we use is guaranteed to either be updated by previous eliminations, or minimized by future ones.
(IE the fact that we don't use the most dominating value operand when it's not a constant does not affect anything).
Sadly, this change also exposes that we didn't pay attention to the
output of the pr31594.ll test, as it also very clearly exposes the
same store leader bug we are fixing here.
(I added pr31682.ll anyway, but maybe we think that's too large to be useful)
On the plus side, propagate-ir-flags.ll now passes due to the
corrected store forwarding.
This change was 3 stage'd on darwin and linux, with the full test-suite.
Dan Gohman [Fri, 20 Jan 2017 20:50:29 +0000 (20:50 +0000)]
[WebAssembly] Don't create bitcast-wrappers for varargs.
WebAssembly varargs functions use a significantly different ABI than
non-varargs functions, and the current code in
WebAssemblyFixFunctionBitcasts doesn't handle that difference. For now,
just avoid creating wrapper functions in the presence of varargs.
Chris Bieneman [Fri, 20 Jan 2017 19:03:14 +0000 (19:03 +0000)]
[DWARF] [ObjectYAML] Adding APIs for unittesting
Summary: This patch adds some new APIs to enable using the YAML DWARF representation in unit tests. The most basic new API is DWARFYAML::EmitDebugSections which converts a YAML string into a series of owned MemoryBuffer objects stored in a StringMap. The string map can then be used to construct a DWARFContext for parsing in place of an ObjectFile.
Haicheng Wu [Fri, 20 Jan 2017 18:51:22 +0000 (18:51 +0000)]
Recommit "[InlineCost] Use TTI to check if GEP is free." #3
This is the third attemp to recommit r292526.
The original summary:
Currently, a GEP is considered free only if its indices are all constant.
TTI::getGEPCost() can give target-specific more accurate analysis. TTI is
already used for the cost of many other instructions.
Matthias Braun [Fri, 20 Jan 2017 18:04:27 +0000 (18:04 +0000)]
AArch64LoadStoreOptimizer: Update kill flags when merging stores
Kill flags need to be updated correctly when moving stores up/down to
form store pair instructions.
Those invalid flags have been ignored before but as of r290014 they are
recognized when using -mllvm -verify-machineinstrs.
Also simplifies test/CodeGen/AArch64/ldst-opt-dbg-limit.mir, renames it
to ldst-opt.mir test and adds a new tests for this change.
Wei Mi [Fri, 20 Jan 2017 17:38:54 +0000 (17:38 +0000)]
[RegisterCoalescing] Recommit the patch "Remove partial redundent copy".
The recommit fixes a bug related with live interval update after the partial
redundent copy is moved.
The original patch is to solve the performance problem described in PR27827.
Register coalescing sometimes cannot remove a copy because of interference.
But if we can find a reverse copy in one of the predecessor block of the copy,
the copy is partially redundent and we may remove the copy partially by moving
it to the predecessor block without the reverse copy.
Haicheng Wu [Fri, 20 Jan 2017 16:36:34 +0000 (16:36 +0000)]
Recommit "[InlineCost] Use TTI to check if GEP is free." #2
This is the second attemp to recommit r292526.
The original summary:
Currently, a GEP is considered free only if its indices are all constant.
TTI::getGEPCost() can give target-specific more accurate analysis. TTI is
already used for the cost of many other instructions.
Sjoerd Meijer [Fri, 20 Jan 2017 13:10:12 +0000 (13:10 +0000)]
[Thumb] Add support for tMUL in the compare instruction peephole optimizer.
We also want to optimise tests like this: return a*b == 0. The MULS
instruction is flag setting, so we don't need the CMP instruction but can
instead branch on the result of the MULS. The generated instructions sequence
for this example was: MULS, MOVS, MOVS, CMP. The MOVS instruction load the
boolean values resulting from the select instruction, but these MOVS
instructions are flag setting and were thus preventing this optimisation. Now
we first reorder and move the MULS to before the CMP and generate sequence
MOVS, MOVS, MULS, CMP so that the optimisation could trigger. Reordering of the
MULS and MOVS is safe to do because the subsequent MOVS instructions just set
the CPSR register and don't use it, i.e. the CPSR is dead.
Chandler Carruth [Fri, 20 Jan 2017 08:42:19 +0000 (08:42 +0000)]
[PM] Port LoopSink to the new pass manager.
Like several other loop passes (the vectorizer, etc) this pass doesn't
really fit the model of a loop pass. The critical distinction is that it
isn't intended to be pipelined together with other loop passes. I plan
to add some documentation to the loop pass manager to make this more
clear on that side.
LoopSink is also different because it doesn't really need a lot of the
infrastructure of our loop passes. For example, if there aren't loop
invariant instructions causing a preheader to exist, there is no need to
form a preheader. It also doesn't need LCSSA because this pass is
only involved in sinking invariant instructions from a preheader into
the loop, not reasoning about live-outs.
This allows some nice simplifications to the pass in the new PM where we
can directly walk the loops once without restructuring them.
Diana Picus [Fri, 20 Jan 2017 08:15:24 +0000 (08:15 +0000)]
[ARM] Use helpers for adding pred / CC operands. NFC
Hunt down some of the places where we use bare addReg(0) or addImm(AL).addReg(0)
and replace with add(condCodeOp()) and add(predOps()). This should make it
easier to understand what those operands represent (without having to look at
the definition of the instruction that we're adding to).