Add the `--strip-underscore` option to llvm-cxxfilt to strip the leading
underscore. This is useful when dealing with targets that add a
leading underscore.
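As a rough illustration of what stripping means here, the sketch below drops exactly one leading underscore before a name would be handed to a demangler. It is a Python model, not llvm-cxxfilt's code, and `strip_leading_underscore` is a hypothetical helper name.

```python
def strip_leading_underscore(symbol: str) -> str:
    """Drop a single leading underscore, if present (hypothetical helper)."""
    # Some targets (Mach-O, for example) prefix every symbol with '_', so the
    # Itanium mangled name '_Z3foov' shows up in the object file as '__Z3foov'.
    return symbol[1:] if symbol.startswith("_") else symbol
```

With this model, `strip_leading_underscore("__Z3foov")` yields `_Z3foov`, which is the form an Itanium demangler expects.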
Chandler Carruth [Sun, 22 Jan 2017 10:34:01 +0000 (10:34 +0000)]
[PM] Fix a really nasty bug introduced when adding PGO support to the
new PM's inliner.
The bug happens when we refine an SCC after having computed a proxy for
the FunctionAnalysisManager, and then proceed to compute fresh analyses
for functions in the *new* SCC using the manager provided by the old
SCC's proxy. *And* when we manage to mutate a function in this new SCC
in a way that invalidates those analyses. This can be... challenging to
reproduce.
I've managed to contrive a set of functions that trigger this and added
a test case, but it is a bit brittle. I've directly checked that the
passes run in the expected ways to help avoid the test just becoming
silently irrelevant.
This gets the new PM back to passing the LLVM test suite after the PGO
improvements landed.
Craig Topper [Sun, 22 Jan 2017 06:53:07 +0000 (06:53 +0000)]
[IR] Add LLVM_READONLY to BasicBlock::getTerminator.
I noticed that this function got called twice in compiled code to create succ_begin and succ_end iterators. Adding this directive helps the compiler share the call.
Ideally we'd just make this method available for inlining since it's quite simple, but the current header file arrangements don't allow that.
Marcos Pividori [Sun, 22 Jan 2017 02:28:08 +0000 (02:28 +0000)]
[libFuzzer] Fix test with shared libraries on Windows.
We need to set BINARY_DIR to ${CMAKE_BINARY_DIR}/lib/Fuzzer/test, so that the dll
is placed in the same directory as the test LLVMFuzzer-DSOTest and is found
when executing that test.
As we are using CMAKE_CXX_CREATE_SHARED_LIBRARY to link the dll, we can't modify
the output directory for the import library. It will be created in the same
directory as the dll (in BINARY_DIR), no matter which value we assign to
LIBRARY_DIR. So, if we set LIBRARY_DIR to a different directory than BINARY_DIR,
when linking LLVMFuzzer-DSOTest, cmake will look for the import library
LLVMFuzzer-DSO1.lib in LIBRARY_DIR, and won't find it, since it was created in
BINARY_DIR. So, for Windows, LIBRARY_DIR and BINARY_DIR need to be the
same directory.
Marcos Pividori [Sun, 22 Jan 2017 01:58:59 +0000 (01:58 +0000)]
[libFuzzer] AlarmHandler is executed in a different thread for Windows.
Don't check for InFuzzingThread() on Windows, since the AlarmHandler() is
always executed by a different thread from a thread pool.
If we don't add these changes, the alarm handler will never execute.
Note that we decided to ignore possible synchronization problems.
Marcos Pividori [Sun, 22 Jan 2017 01:58:50 +0000 (01:58 +0000)]
[libFuzzer] Fix OutOfMemory tests to work on 32 bits.
I make 2 changes so that the tests work on both 32 bits and 64 bits.
I change the allocated size to 0x20000000 and add the flag: -rss_limit_mb=300.
Otherwise the output for 32 bits and 64 bits is different.
For 64 bits the value 0xff000000 doesn't exceed kMaxAllowedMallocSize.
For 32 bits, kMaxAllowedMallocSize is set to 0xc0000000, so the call to
Allocate() will fail earlier, printing "WARNING: AddressSanitizer failed to
allocate ...", and won't call malloc hooks.
So, we need to consider a size smaller than 2GB (so malloc doesn't fail on
32 bits) and greater than the value provided by -rss_limit_mb.
Because of that I use: 0x20000000.
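The size constraint can be checked with a few lines of arithmetic. This is only a sketch: it assumes that the "mb" in -rss_limit_mb means 2^20 bytes, and `size_is_suitable` is a hypothetical name that does not appear in the tests.

```python
MIB = 1 << 20

alloc_size = 0x20000000      # size used by the updated tests
old_size = 0xFF000000        # size mentioned for the 64-bit case
max_malloc_32 = 0xC0000000   # kMaxAllowedMallocSize on 32 bits, per the message
rss_limit_mb = 300           # value passed via -rss_limit_mb


def size_is_suitable(size: int) -> bool:
    """True if size is above the RSS limit and below 2GB (assumed MiB units)."""
    return rss_limit_mb * MIB < size < 2 * 1024 * MIB
```

Under these assumptions 0x20000000 is 512 MiB, which lies between the 300 MB limit and 2GB, while 0xff000000 is above the 32-bit kMaxAllowedMallocSize.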
Marcos Pividori [Sun, 22 Jan 2017 01:58:45 +0000 (01:58 +0000)]
[libFuzzer] Avoid undefined behavior, properly discard output to stdout/stderr.
Fix libFuzzer when setting -close_fd_mask to a non-zero value.
In the previous implementation, libFuzzer closed the file descriptors for
stdout/stderr. This has some disadvantages:
For `fuzzer-fdmask.test`, we write directly to stdout and stderr using the
file streams stdout and stderr after the file descriptors are closed, which is
undefined behavior. On Windows, in particular, this was making the test fail.
Also, if we close stdout and then open a new file in libFuzzer, the new file gets
file descriptor 1. This could cause problems if some code assumes that file
descriptor 1 refers to stdout and writes to it directly (for example using
std::cout): it will be writing to the newly opened file instead.
Instead of closing the file descriptors, I redirect the output to /dev/null on
Linux and nul on Windows.
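The redirect-instead-of-close idea can be sketched in Python as follows. This models the technique only and is not libFuzzer's code; `discard_output` is a hypothetical name.

```python
import os


def discard_output(fd: int) -> None:
    """Point fd at the null device instead of closing it (hypothetical helper)."""
    # os.devnull is '/dev/null' on POSIX systems and 'nul' on Windows.
    null_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        # After dup2, fd is still open and valid; writes to it are dropped.
        os.dup2(null_fd, fd)
    finally:
        if null_fd != fd:
            os.close(null_fd)
```

Because the descriptor number stays occupied, a file opened later cannot be handed that number, and code that keeps writing to it does not touch a closed descriptor. Calling `discard_output(1)` would silence stdout in this model.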
Marcos Pividori [Sun, 22 Jan 2017 01:58:36 +0000 (01:58 +0000)]
[libFuzzer] Fix ListFilesInDirRecursive() to do the same for Posix and Windows.
Update the `ListFilesInDirRecursive` implementation on Windows to have the same
behavior as on Posix when the directory doesn't exist and when it is empty.
Marcos Pividori [Sun, 22 Jan 2017 01:58:26 +0000 (01:58 +0000)]
[libFuzzer] Portably disassemble and find calls to sanitizer_cov_trace_pc_guard.
Instead of directly using objdump, which is not present on Windows, we consider
different tools depending on the platform.
For Windows, we consider dumpbin and llvm-objdump.
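A minimal sketch of per-platform tool selection is shown below. The Windows candidates come from the message above; using objdump everywhere else, the order of the candidates, and the function names are assumptions made for illustration.

```python
import shutil
import sys


def candidate_disassemblers(platform: str = sys.platform) -> list:
    """Tools to try, in order, for a platform string (order is an assumption)."""
    if platform.startswith("win"):
        return ["dumpbin", "llvm-objdump"]
    # Assumption: other platforms keep using objdump.
    return ["objdump"]


def find_disassembler(platform: str = sys.platform, which=shutil.which):
    """Return the first candidate found on PATH, or None if none is available."""
    for tool in candidate_disassemblers(platform):
        if which(tool) is not None:
            return tool
    return None
```

The `which` parameter exists only so the lookup can be replaced in a test; by default it is `shutil.which`.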
Marcos Pividori [Sun, 22 Jan 2017 01:27:42 +0000 (01:27 +0000)]
[libFuzzer] Remove optimization flags for tests.
We need to build all the tests with -O0, otherwise optimizations may merge some
basic blocks and the tests will fail.
In this diff, I simplify the cmake implementation and I remove the flags for
Windows too (/O[123s]).
Marcos Pividori [Sun, 22 Jan 2017 01:27:34 +0000 (01:27 +0000)]
[libFuzzer] Remove dependencies for tests on Windows.
Remove dependency on FileCheck, sancov and not for tests on Windows.
If LLVM_USE_SANITIZER=Address and LLVM_USE_SANITIZE_COVERAGE=YES, this will
trigger the building of dependencies with sanitizer instrumentation.
This will fail on Windows, since cmake will use link.exe for linking and won't
include compiler-rt libraries.
Sanjay Patel [Sat, 21 Jan 2017 17:51:25 +0000 (17:51 +0000)]
[ValueTracking] tighten up matchMinMax(); NFCI
This is similar to what the caller (matchSelectPattern()) does. In all
cases where we succeed in matching a min/max pattern, the values in
that pattern will be the values of the 'select', so hoist that and
remove a bunch of duplicated code.
Chandler Carruth [Sat, 21 Jan 2017 04:16:53 +0000 (04:16 +0000)]
[PM] Sink an LCSSA preservation assert from the LoopSimplify pass into
the library routine shared with the new PM and other code.
This assert checks that when LCSSA preservation is requested we start in
LCSSA form. Without this early assert, given *very* complex test cases
we can hit an assert or crash much later on when trying to preserve
LCSSA.
The new PM's loop simplify doesn't need to (and indeed can't) preserve
LCSSA as the new PM doesn't deal in transforms in the dependency graph.
But we asked the library to and shockingly, this didn't work very well!
Stop doing that. Now the assert will tell us immediately with existing
test cases. Before this, it took a pretty convoluted input to trigger
this.
However, sinking the assert also found a bug in LoopUnroll where we
asked simplifyLoop to preserve LCSSA *right before we reform it*. That's
kinda silly and unsurprising that it wasn't available. =D Stop doing
that too.
We also would assert that the unrolled loop was in LCSSA even if
preserving LCSSA was never requested! I don't have a test case or
anything here. I spotted it by inspection and it seems quite obvious. No
logic change anyways, that's just avoiding a spurious assert.
Chandler Carruth [Sat, 21 Jan 2017 03:48:51 +0000 (03:48 +0000)]
[PM] Teach the loop PM to run LoopSimplify prior to the loop pipeline.
This adds the last remaining core feature of the loop pass pipeline in
the new PM and removes the last of the really egregious hacks in the
LICM tests.
Sadly, this requires really substantial changes in the unittests in
order to provide and maintain simplified loops. This is particularly
hard because for example LoopSimplify will try to fold undef branches to
an ideal direction and simplify the loop accordingly.
This is a stub implementation of the `-s` or `--format` option that
allows the user to specify the demangling style. Since we only support
the Itanium (GNU) style demangling, auto is synonymous with `gnu`.
Simply swallow the option to permit some level of commandline
compatibility.
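What "swallowing" an option looks like can be modelled with argparse. This is a sketch of the idea, not the tool's option handling: restricting the values to `auto` and `gnu`, the `symbols` positional, and the program name are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts -s/--format but never acts on it."""
    parser = argparse.ArgumentParser(prog="cxxfilt-sketch")
    # Accepted for command-line compatibility only; since a single demangling
    # style is supported, the parsed value is never consulted afterwards.
    parser.add_argument("-s", "--format", choices=["auto", "gnu"], default="auto")
    parser.add_argument("symbols", nargs="*")
    return parser
```

For example, `build_parser().parse_args(["-s", "gnu", "_Z3foov"])` parses without error and the rest of the program can ignore `format`.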
MergeFunctions: Preserve debug info in thunks, under option -mergefunc-preserve-debug-info
Summary:
Under option -mergefunc-preserve-debug-info we:
- Do not create a new function for a thunk.
- Retain the debug info for a thunk's parameters (and associated
instructions for the debug info) from the entry block.
Note: -debug will display the algorithm at work.
- Create debug-info for the call (to the shared implementation) made by
a thunk and its return value.
- Erase the rest of the function, retaining the (minimally sized) entry
block to create a thunk.
- Preserve a thunk's call site to point to the thunk even when both occur
within the same translation unit, to aid debuggability. Note that this
behaviour differs from the underlying -mergefunc implementation which
modifies the thunk's call site to point to the shared implementation
when both occur within the same translation unit.
Justin Lebar [Sat, 21 Jan 2017 00:59:57 +0000 (00:59 +0000)]
[ConstantFolding] Constant-fold llvm.sqrt(x) like other intrinsics.
Summary:
Currently we return undef, but we're in the process of changing the
LangRef so that llvm.sqrt behaves like the other math intrinsics,
matching the return value of the standard libcall but not setting errno.
This change is legal even without the LangRef change because currently
calling llvm.sqrt(x) where x is negative is spec'ed to be UB. But in
practice it's also safe because we're simply constant-folding fewer
inputs: Inputs >= -0 get constant-folded as before, but inputs < -0 now
aren't constant-folded, because ConstantFoldFP aborts if the host math
function raises an fp exception.
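The "fold unless the host math function complains" behaviour can be modelled in Python, where `math.sqrt` reports a domain error by raising `ValueError`. This stands in for the fp-exception check and is not ConstantFoldFP itself; `try_fold_sqrt` is a hypothetical name.

```python
import math


def try_fold_sqrt(x: float):
    """Return the folded constant, or None to leave the call unfolded."""
    try:
        return math.sqrt(x)
    except ValueError:
        # Python's domain error plays the role of the host math function
        # raising an fp exception: the input is simply not folded.
        return None
```

In this model non-negative inputs and -0.0 fold to a value, while an input such as -1.0 is left alone.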
Guozhi Wei [Fri, 20 Jan 2017 23:35:27 +0000 (23:35 +0000)]
[PPC] Give unaligned memory access lower cost on processors that support it
Newer PPC processors support unaligned memory access, which significantly reduces its cost. This patch handles this case in PPCTTIImpl::getMemoryOpCost.
Davide Italiano [Fri, 20 Jan 2017 23:29:28 +0000 (23:29 +0000)]
[NewGVN] Optimize processing for instructions found trivially dead.
Don't call `isTriviallyDeadInstructions()` once we discover that
an instruction is dead. Instead, set DFS number zero (as suggested
by Danny) and forget about it (this also speeds up things as we
won't try to reprocess that block).
Tim Northover [Fri, 20 Jan 2017 23:25:17 +0000 (23:25 +0000)]
GlobalISel: prevent heap use-after-free when looking up VReg.
Translating the constant can create more VRegs, which can invalidate the
reference into the DenseMap. So we have to look up the value again after all
that's happened.
Marcos Pividori [Fri, 20 Jan 2017 22:49:08 +0000 (22:49 +0000)]
[libFuzzer] Use clang as linker on Windows, to properly include sanitizer libraries.
In order to use sanitizers on Windows, we need to link against many runtime
libraries which will depend on the target being created (executable or dll) and
the c runtime library used (MT/MD).
By default, cmake uses link.exe for linking, which fails because we don't
specify the appropriate dependencies. As we don't want to consider all of the
possible situations, which depend on the implementation of compiler-rt, the
simplest option is to change the rules for linking executables and shared
libraries, using the compiler instead of link.exe.
The Clang driver will consider the sanitizer flags and automatically provide the
required libraries to the linker.
Easwaran Raman [Fri, 20 Jan 2017 22:44:04 +0000 (22:44 +0000)]
Improve PGO support for the new inliner
This adds the following to the new PM based inliner in PGO mode:
* Use block frequency analysis to derive callsite's profile count and use
that to adjust thresholds of hot and cold callsites.
* Incrementally update the BFI of the caller after a callee gets inlined
into it. This incremental update is only within an invocation of the run
method - BFI is not preserved across calls to run.
* Update the function entry count of the callee after inlining it into a
caller.
* I've tuned the thresholds for the hot and cold callsites using a hacked
up version of the old inliner that explicitly computes BFI on a set of
internal benchmarks and spec. Once the new PM based pipeline stabilizes
(IIRC Chandler mentioned there are known issues) I'll benchmark this
again and adjust the thresholds if required.
Inliner PGO support.
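One plausible reading of the count bookkeeping above is sketched below. Both formulas are assumptions made for illustration (scaling the entry count by relative block frequency, and subtracting the inlined callsite's count from the callee); the function names are hypothetical and nothing here is taken from the patch.

```python
def callsite_count(entry_count: int, entry_freq: int, block_freq: int) -> int:
    """Scale the caller's entry count by the callsite block's relative frequency."""
    if entry_freq == 0:
        return 0
    return entry_count * block_freq // entry_freq


def callee_entry_count_after_inline(callee_entry: int, inlined_count: int) -> int:
    """Remove the inlined calls from the callee's entry count, clamped at zero."""
    return max(callee_entry - inlined_count, 0)
```

For instance, a caller entered 1000 times whose callsite block runs half as often as its entry block gives the callsite a count of 500 under this model.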
Zachary Turner [Fri, 20 Jan 2017 22:41:40 +0000 (22:41 +0000)]
[pdb] Merge NamedStreamMapBuilder and NamedStreamMap.
While the builder pattern has proven useful for certain other
larger types, in this case it was hampering the ability to use
the data structure, as for runtime access we need a map that
we can efficiently read from and write to. So the two are merged
into a single data structure that can efficiently be read from,
written to, deserialized from bytes, and serialized to bytes.
Sanjay Patel [Fri, 20 Jan 2017 22:18:47 +0000 (22:18 +0000)]
[ValueTracking] recognize variations of 'clamp' to improve codegen (PR31693)
By enhancing value tracking, we allow an existing min/max canonicalization to
kick in and improve codegen for several targets that have min/max instructions.
Unfortunately, recognizing min/max in value tracking may cause us to hit
a hack in InstCombiner::visitICmpInst() more often:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109340.html
...but I'm hoping we can remove that soon.
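To illustrate why a clamp is a min/max in disguise, the sketch below writes it once as nested compare-and-select and once as max followed by min. The exact variations the patch recognizes are not listed in this message, so both forms and their names are illustrative assumptions.

```python
def clamp_select(x: int, lo: int, hi: int) -> int:
    """Clamp written as nested compare-and-select."""
    if x < lo:
        return lo
    return hi if x > hi else x


def clamp_minmax(x: int, lo: int, hi: int) -> int:
    """The same clamp written as max followed by min."""
    return min(max(x, lo), hi)
```

The two forms agree whenever `lo <= hi`, which is what lets a min/max canonicalization take over once the select pattern is recognized.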
Teresa Johnson [Fri, 20 Jan 2017 21:54:58 +0000 (21:54 +0000)]
[ThinLTO] Drop non-prevailing non-ODR weak to declarations
Summary:
Allow non-ODR weak/linkonce non-prevailing copies to be marked
as available_externally in the index. Add support for dropping these to
declarations in the backend.
Sanjay Patel [Fri, 20 Jan 2017 21:49:41 +0000 (21:49 +0000)]
[InstCombine] add tests to show missed canonicalization of min/max; NFC
Unfortunately, recognizing these in value tracking may cause us to hit
a hack in InstCombiner::visitICmpInst() more often:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109340.html
...but besides being the obviously Right Thing To Do, there's a clear
codegen win from identifying these patterns for several targets.
Daniel Berlin [Fri, 20 Jan 2017 21:04:30 +0000 (21:04 +0000)]
NewGVN: Fix PR 31686 and PR 31698 by rewriting store leader handling.
Summary:
This rewrites store expression/leader handling. We no longer use the
value operand as the leader, instead, we store it separately. We also
now store the stored value as part of the expression, and compare it
when comparing stores for equality. This enables us to get rid of a
bunch of our previous hacks and machinations, as the existing
machinery takes care of everything *except* updating the stored value
on classes. The only time we have to update it is if the storecount
goes to 0, and when we do, we destroy it.
Since we no longer use the value operand as the leader, during elimination, we have to use the value operand. Doing this also fixes a bunch of store forwarding cases we were missing.
Any value operand we use is guaranteed to either be updated by previous eliminations, or minimized by future ones.
(I.e., the fact that we don't use the most dominating value operand when it's not a constant does not affect anything).
Sadly, this change also exposes that we didn't pay attention to the
output of the pr31594.ll test, as it also very clearly exposes the
same store leader bug we are fixing here.
(I added pr31682.ll anyway, but maybe we think that's too large to be useful)
On the plus side, propagate-ir-flags.ll now passes due to the
corrected store forwarding.
This change was 3 stage'd on darwin and linux, with the full test-suite.
Dan Gohman [Fri, 20 Jan 2017 20:50:29 +0000 (20:50 +0000)]
[WebAssembly] Don't create bitcast-wrappers for varargs.
WebAssembly varargs functions use a significantly different ABI than
non-varargs functions, and the current code in
WebAssemblyFixFunctionBitcasts doesn't handle that difference. For now,
just avoid creating wrapper functions in the presence of varargs.
Chris Bieneman [Fri, 20 Jan 2017 19:03:14 +0000 (19:03 +0000)]
[DWARF] [ObjectYAML] Adding APIs for unittesting
Summary: This patch adds some new APIs to enable using the YAML DWARF representation in unit tests. The most basic new API is DWARFYAML::EmitDebugSections which converts a YAML string into a series of owned MemoryBuffer objects stored in a StringMap. The string map can then be used to construct a DWARFContext for parsing in place of an ObjectFile.
Haicheng Wu [Fri, 20 Jan 2017 18:51:22 +0000 (18:51 +0000)]
Recommit "[InlineCost] Use TTI to check if GEP is free." #3
This is the third attempt to recommit r292526.
The original summary:
Currently, a GEP is considered free only if its indices are all constant.
TTI::getGEPCost() can give a more accurate, target-specific analysis. TTI is
already used for the cost of many other instructions.
Matthias Braun [Fri, 20 Jan 2017 18:04:27 +0000 (18:04 +0000)]
AArch64LoadStoreOptimizer: Update kill flags when merging stores
Kill flags need to be updated correctly when moving stores up/down to
form store pair instructions.
Those invalid flags have been ignored before but as of r290014 they are
recognized when using -mllvm -verify-machineinstrs.
Also simplifies test/CodeGen/AArch64/ldst-opt-dbg-limit.mir, renames it
to ldst-opt.mir and adds a new test for this change.
Wei Mi [Fri, 20 Jan 2017 17:38:54 +0000 (17:38 +0000)]
[RegisterCoalescing] Recommit the patch "Remove partial redundant copy".
The recommit fixes a bug related to the live interval update after the partially
redundant copy is moved.
The original patch solves the performance problem described in PR27827.
Register coalescing sometimes cannot remove a copy because of interference.
But if we can find a reverse copy in one of the predecessor blocks of the copy,
the copy is partially redundant and we may remove the copy partially by moving
it to the predecessor block without the reverse copy.