Matthias Braun [Mon, 9 Jan 2017 21:38:17 +0000 (21:38 +0000)]
PeepholeOptimizer: Do not replace SubregToReg(bitcast like)
While we can usually replace bitcast like instructions
(MachineInstr::isBitcast()) with a COPY this is not legal if any of the
users uses SUBREG_TO_REG to assert the upper bits of the result are
zero.
Rui Ueyama [Mon, 9 Jan 2017 21:20:42 +0000 (21:20 +0000)]
TarWriter: Set "00" to Ustar version field.
Most (maybe all?) tar commands can handle tar archives with blank
version fields, but POSIX requires "00" to be set to the field, so
doing it is good for compliance.
Chris Bieneman [Mon, 9 Jan 2017 20:01:37 +0000 (20:01 +0000)]
[ObjectYAML] Support for DWARF line tables
This patch adds support for the DWARF debug_lines section. The line table state machine opcodes are preserved, so this can be used to test the state machine evaluation directly.
Matthew Simpson [Mon, 9 Jan 2017 19:05:29 +0000 (19:05 +0000)]
[LV] Fix-up external IV users after updating dominator tree
This patch delays the fix-up step for external induction variable users until
after the dominator tree has been properly updated. This should fix PR30742.
The SCEVExpander in InductionDescriptor::transform can generate code in the
wrong location if the dominator tree is not up-to-date. We should work towards
keeping the dominator tree up-to-date throughout the transformation.
Matt Arsenault [Mon, 9 Jan 2017 18:52:39 +0000 (18:52 +0000)]
AMDGPU: Add Assert[SZ]Ext during argument load creation
For i16 zeroext arguments when i16 was a legal type, the
known bits information from the truncate was lost. Insert
a zeroext so the known bits optimizations work with the 32-bit
loads.
Fixes code quality regressions vs. SI in min.ll test.
Simon Pilgrim [Mon, 9 Jan 2017 17:20:03 +0000 (17:20 +0000)]
[X86][AVX512] Enable v16i8/v32i8 vector shifts to use an extend+shift+truncate pattern.
Use the existing AVX2 v8i16 vector shift lowering for v16i8 (extending to v16i32) on AVX512 targets and v32i8 (extending to v32i16) on AVX512BW targets.
Eugene Leviant [Mon, 9 Jan 2017 09:56:31 +0000 (09:56 +0000)]
RuntimeDyldELF: don't create thunk if not needed
This patch doesn't create thunk for branch operation when following conditions are met:
- Architecture is AArch64
- Relocation target is in the same object file
- Relocation target is close enough to be encoded in immediate offset
In such case we branch directly to the target instead of branching to thunk
[PM] Teach SCEV to invalidate itself when its dependencies become
invalid.
This fixes use-after-free bugs that will arise with any interesting use
of SCEV.
I've added a dedicated test that works diligently to trigger these kinds
of bugs in the new pass manager and also checks for them explicitly as
well as triggering ASan failures when things go squirly.
Craig Topper [Mon, 9 Jan 2017 04:19:34 +0000 (04:19 +0000)]
[AVX-512] Change another pattern that was using BLENDM to use masked moves. A future patch will conver it back to BLENDM if its beneficial to register allocation.
Lang Hames [Sun, 8 Jan 2017 20:09:35 +0000 (20:09 +0000)]
[Orc][RPC] Lock the pending results data structure when installing new result
handlers, make abandonPendingResults public API.
This should make installing asynchronous result handlers thread safe.
The abandonPendingResults method is made public so that clients can disconnect
from a remote even if they have asynchronous handlers awaing results from that
remote. The asynchronous handlers will all receive "abandoned result" errors as
their argument.
Lang Hames [Sun, 8 Jan 2017 01:13:47 +0000 (01:13 +0000)]
[Orc][RPC] Add an APICalls utility for grouping RPC funtions for registration.
APICalls allows groups of functions to be composed into an API that can be
registered as a unit with an RPC endpoint. Doing registration on a-whole API
basis (rather than per-function) allows missing API functions to be detected
early.
APICalls also allows Function membership to be tested at compile-time. This
allows clients to write static assertions that functions to be called are
members of registered APIs.
Mehdi Amini [Sun, 8 Jan 2017 00:44:45 +0000 (00:44 +0000)]
[ThinLTO] Fix lazy-loading of Metadata attachment, which left some Fwd ref behind
The change in r291362 was too agressive. We still need to flush at the
end of the block because function local metadata can introduce fwd
ref as well.
(Bootstrap with ThinLTO was broken)
Mehdi Amini [Sat, 7 Jan 2017 20:24:23 +0000 (20:24 +0000)]
[ThinLTO] Fix assertions on lazy-loading of Metadata TBAA attachments
Summary:
The issue happens with:
%0 = ....., !tbaa !0
%1 = ....., !tbaa !1
With !0 that references !1.
In this case when loading !0 we generates a temporary for the
operand !1. We now flush it immediately and trigger the load of
!1 before moving on. If we don't we get the temporary when
attaching to %1. This is usually not an issue except that we
eagerly try to update TBAA MDNodes, which is obviously not possible
if we only have a temporary.
Matt Arsenault [Sat, 7 Jan 2017 19:55:12 +0000 (19:55 +0000)]
SimplifyLibCalls: Remove incorrect optimization of fabs
fabs(x * x) is not generally safe to assume x is positive if x is a NaN.
This is also less general than it could be, so this will be replaced
with a transformation on the intrinsic.
Daniel Berlin [Sat, 7 Jan 2017 19:04:59 +0000 (19:04 +0000)]
Update update_test_checks to work properly with phi nodes and other fun things.
Summary:
Prior to this change, phi nodes were never considered defs, and so we ended up with undefined variables for any loop. Now, instead of trying to find just defs, we iterate over each actual IR value in the line, and replace them one by one with either a definition or a use.
We also don't try to match anything in the comment portions of the line.
I've tested it even on things like function pointer calls, etc, and against existing test cases uses update_test_checks
With this change, we are able to use update_tests on the cyclic cases in newgvn.
The only case i'm aware of that will misfire is if you have a string with which contains a valid token.
However, this is the same as it is now, with a slightly larger set of strings that may misfire.
Prior to this change, a test with the string " %a =" would be replaced.
Rui Ueyama [Sat, 7 Jan 2017 08:28:56 +0000 (08:28 +0000)]
TarWriter: Use Ustar header's "prefix" field to store long filenames.
Tar's Ustar header has the "prefix" field to store a directory
part of a filename. It is not as flexible as the PAX-extended
filename because there's still a limitation on the maximum filename
size, but it mitigates the situation.
This patch should unbreak some Windows buildbots that uses very
old tar command.
Craig Topper [Sat, 7 Jan 2017 06:56:54 +0000 (06:56 +0000)]
[X86] Disable load unfolding for 128-bit MOVDDUP instructions since the load size is smaller than the register size so unfolding would increase the load size.
Dan Gohman [Sat, 7 Jan 2017 01:50:01 +0000 (01:50 +0000)]
[WebAssembly] Don't abort on code with UB.
Gracefully leave code that performs function-pointer bitcasts implying
non-trivial pointer conversions alone, rather than aborting, since it's
just undefined behavior.
[MachineBasicBlock] Add a non-assert live-in accessor for debug mode.
With r291169, it is now not possible to access the live-in information
when the liveness is not properly tracked. Although this is want we want
in general, for debugging purpose we may want to still be able to
traverse this information even if it may not be accurate.
Dan Gohman [Sat, 7 Jan 2017 00:34:54 +0000 (00:34 +0000)]
[WebAssembly] Add a pass to create wrappers for function bitcasts.
WebAssembly requires caller and callee signatures to match exactly. In LLVM,
there are a variety of circumstances where signatures may be mismatched in
practice, and one can bitcast a function address to another type to call it
as that type. This patch adds a pass which replaces bitcasted function
addresses with wrappers to replace the bitcasts.
This doesn't catch everything, but it does match many common cases.
Daniel Berlin [Sat, 7 Jan 2017 00:01:42 +0000 (00:01 +0000)]
NewGVN: Fix PR 31501.
Summary: LLVM's non-standard notion of phi nodes means we can't both try to substitute for undef in phi nodes *and* use phi nodes as leaders all the time. This changes NewGVN to use the same semantics as SimplifyPHINode to decide which phi nodes are equivalent.
Teresa Johnson [Fri, 6 Jan 2017 23:38:41 +0000 (23:38 +0000)]
[ThinLTO] Handle conflicting local names gracefully
Summary:
r285871 introduced an assert that was overly aggressive in the case
of a same-named local in different same-named files (in different
directories), where the source name and therefore the GUID ended up
the same because the files were compiled in their own directory without
any leading path. Change the handling in the promotion logic to get
the summary for the version in that module.
This also exposed an issue where we are not always importing the
right copy, which is a performance not correctness issue (because
the renaming is based on the module hash which must be different,
see the bug report for details). I will fix that as a follow-on.
Teresa Johnson [Fri, 6 Jan 2017 23:37:17 +0000 (23:37 +0000)]
[ThinLTO] Optionally ignore empty index file
Summary:
In order to simplify distributed build system integration, where actions
may be scheduled before the Thin Link which determines the list of
objects selected by the linker. The gold plugin currently will emit
0-sized index files for objects not selected by the link, to enable
checking for expected output files by the build system. If the build
system then schedules a backend action for these bitcode files, we want
to be able to fall back to normal compilation instead of failing.
This is the LLVM side support for optionally enabling fallback
instead of issuing an error. Return a null CombinedIndex from
llvm::getModuleSummaryIndexForFile under the option when the file
is empty. Clang can then ignore the index when it is null.
[gtest] Detect warning flags using the positive spelling.
Some GCC versions will accept any warning flag name after a '-Wno-',
which would cause us to try to disable warnings with names GCC didn't
understand. This will silently succeed unless there is some other output
from GCC in which case we get weird cc1plus warnings about the warning
name being bogus.
There is still the issue that gtest sets warning flags for building
gtest-all.cc using weird 'add_definitions' and the fact that there is
a GCC version which warns on the variadic macro usage in gtest under
-pedantic, but has no flag analogous to Clang's
-Wgnu-zero-variadic-macro-argumnets to suppress this warning. I haven't
been able to come up with any good solution here. The closest is to turn
off -pedantic for those versions of GCC, but that seems really nasty.
For now, those versinos of GCC aren't warning clean. If anyone is broken
by this, I'll work on CMake logic to detect and disable -pedantic in
these cases.