Dylan McKay [Fri, 16 Oct 2015 03:10:30 +0000 (03:10 +0000)]
Initial migration of AVR backend
This patch adds the underlying infrastructure for an AVR backend to be included into LLVM. It is the first of a series of patches aimed at moving the out-of-tree AVR backend into the tree.
Sanjoy Das [Fri, 16 Oct 2015 02:41:23 +0000 (02:41 +0000)]
[RS4GC] Dont' propagate call attrs related to patchable statepoints
The `"statepoint-id"` and `"statepoint-num-patch-bytes"` attributes are
used solely to determine properties of the `gc.statepoint` being
created. Once the `gc.statepoint` is in place, these should be removed.
Sanjoy Das [Fri, 16 Oct 2015 02:41:00 +0000 (02:41 +0000)]
[RS4GC] Use "deopt" operand bundles
Summary:
This is a step towards using operand bundles to carry deopt state till
RewriteStatepointsForGC. The change adds a flag to
RewriteStatepointsForGC that teaches it to pick up deopt state from a
`"deopt"` operand bundle attached to the `call` or `invoke` it is
wrapping.
The command line flag added, `-rs4gc-use-deopt-bundles`, will only exist
for a short while. Once we are able to pipe deopt bundle state through
the full optimization pipeline without problems, we will "constant fold"
`-rs4gc-use-deopt-bundles` to `true`.
Sanjoy Das [Fri, 16 Oct 2015 01:00:47 +0000 (01:00 +0000)]
[IndVars] Have `cloneArithmeticIVUser` guess better
Summary:
`cloneArithmeticIVUser` currently trips over expression like `add %iv,
-1` when `%iv` is being zero extended -- it tries to construct the
widened use as `add %iv.zext, zext(-1)` and (correctly) fails to prove
equivalence to `zext(add %iv, -1)` (here the SCEV for `%iv` is
`{1,+,1}`).
This change teaches `IndVars` to try sign extending the non-IV operand
if that makes the newly constructed IV use equivalent to the widened
narrow IV use.
Sanjoy Das [Fri, 16 Oct 2015 01:00:39 +0000 (01:00 +0000)]
[IndVars] Split `WidenIV::cloneIVUser`; NFC
Summary:
This NFC splitting is intended to make a later diff easier to follow.
It just tail duplicates `cloneIVUser` into `cloneArithmeticIVUser` and
`cloneBitwiseIVUser`.
Adrian Prantl [Thu, 15 Oct 2015 20:58:55 +0000 (20:58 +0000)]
Replace a forward declaration with an #include.
When building with modules the forward-declared inner class
DebugLocStream::ListBuilder causes clang to fall over.
Evgeniy Stepanov [Thu, 15 Oct 2015 20:50:16 +0000 (20:50 +0000)]
[safestack] Fast access to the unsafe stack pointer on AArch64/Android.
Android libc provides a fixed TLS slot for the unsafe stack pointer,
and this change implements direct access to that slot on AArch64 via
__builtin_thread_pointer() + offset.
This change also moves more code into TargetLowering and its
target-specific subclasses to get rid of target-specific codegen
in SafeStackPass.
This change does not touch the ARM backend because ARM lowers
builting_thread_pointer as aeabi_read_tp, which is not available
on Android.
Chris Bieneman [Thu, 15 Oct 2015 20:09:01 +0000 (20:09 +0000)]
[CMake] [Darwin] Add support for generating Xcode-compatible toolchains that xcodebuild and xcrun can search
Summary:
Sometimes you want to install a custom compiler and use it like the system compiler without overriding the system compiler. This patch lets you create xctoolchains that the darwin command line tools can use.
To use this patch set LLVM_CREATE_XCODE_TOOLCHAIN=On in your CMake invocation and build the `install-code-toolchain` target.
After installation you can set the envar EXTERNAL_TOOLCHAINS_DIR to your installed Toolchains directory, and the TOOLCHAINS envar to the toolchain identifier (ex org.llvm.3.8.0svn). This will then cause /usr/bin/clang to call your newly installed clang.
JF Bastien [Thu, 15 Oct 2015 18:24:52 +0000 (18:24 +0000)]
x86: preserve flags when folding atomic operations
D4796 taught LLVM to fold some atomic integer operations into a single
instruction. The pattern was unaware that the instructions clobbered
flags. I fixed some of this issue in D13680 but had missed INC/DEC.
Justin Bogner [Thu, 15 Oct 2015 18:17:44 +0000 (18:17 +0000)]
docs: Stop using DEBUG() without DEBUG_TYPE in the ProgrammersManual
The DEBUG() macro has required that a DEBUG_TYPE be set since r206822.
Update the programmers manual to reflect that, and also update the
wording to point out that DEBUG_TYPE should be defined after #includes.
Philip Reames [Thu, 15 Oct 2015 16:51:00 +0000 (16:51 +0000)]
Revert 250343 and 250344
Turns out this approach is buggy. In discussion about follow on work, Sanjoy pointed out that we could be subject to circular logic problems.
Consider:
if (i u< L) leave()
if ((i + 1) u< L) leave()
print(a[i] + a[i+1])
If we know that L is less than UINT_MAX, we could possible prove (in a control dependent way) that i + 1 does not overflow. This gives us:
if (i u< L) leave()
if ((i +nuw 1) u< L) leave()
print(a[i] + a[i+1])
If we now do the transform this patch proposed, we end up with:
if ((i +nuw 1) u< L) leave_appropriately()
print(a[i] + a[i+1])
That would be a miscompile when i==-1. The problem here is that the control dependent nuw bits got used to prove something about the first condition. That's obviously invalid.
This won't happen today, but since I plan to enhance LVI/CVP with exactly that transform at some point in the not too distant future...
JF Bastien [Thu, 15 Oct 2015 16:46:29 +0000 (16:46 +0000)]
x86 FP atomic codegen: don't drop globals, stack
Summary:
x86 codegen is clever about generating good code for relaxed
floating-point operations, but it was being silly when globals and
immediates were involved, forgetting where the global was and
loading/storing from/to the wrong place. The same applied to hard-coded
address immediates.
Don't let it forget about the displacement.
This fixes https://llvm.org/bugs/show_bug.cgi?id=25171
A very similar bug when doing floating-points atomics to the stack is
also fixed by this patch.
This fixes https://llvm.org/bugs/show_bug.cgi?id=25144
This adjusts all integers in the reader/writer to reflect the types
stored on profile files. They should all be unsigned 32-bit or 64-bit
values. Changed all associated internal types to be uint32_t or
uint64_t.
The only place that needed some adjustments is in the sample profile
transformation. Altough the weight read from the profile are 64-bit
values, the internal API for branch weights only accepts 32-bit values.
The pass now saturates weights that overflow uint32_t.
Daniel Sanders [Thu, 15 Oct 2015 14:34:23 +0000 (14:34 +0000)]
[mips][mips16] MIPS16 is not a CPU/Architecture but is an ASE.
Summary:
The -mcpu=mips16 option caused the Integrated Assembler to crash because
it couldn't figure out the architecture revision number to write to the
.MIPS.abiflags section. This CPU definition has been removed because, like
microMIPS, MIPS16 is an ASE to a base architecture.
Lang Hames [Thu, 15 Oct 2015 07:16:40 +0000 (07:16 +0000)]
[RuntimeDyld] Drop the '.s' suffix off the COFF test case - the MIPS bot started
failing when the suffix was added.
I assume the lack of a '.s' suffix means that the test case just wasn't running
before, and it has never worked on MIPS. I'll investigate that tomorrow.
Lang Hames [Thu, 15 Oct 2015 06:41:45 +0000 (06:41 +0000)]
[RuntimeDyld] Don't try to get the contents of sections that don't have any
(e.g. bss sections).
MachO and ELF have been silently letting this pass, but COFFObjectFile contains
an assertion to catch this kind of (ab)use of the getSectionContents, and this
was causing the JIT to crash on COFF objects with BSS sections. This patch
should fix that.
Akira Hatanaka [Thu, 15 Oct 2015 05:28:38 +0000 (05:28 +0000)]
[MachO] Stop generating *coal* sections.
Recommit r250342: move coal-sections-powerpc.s to subdirectory for powerpc.
Some background on why we don't have to use *coal* sections anymore:
Long ago when C++ was new and "weak" had not been standardized, an attempt was
made in cctools to support C++ inlines that can be coalesced by putting them
into their own section (TEXT/textcoal_nt instead of TEXT/text).
The current macho linker supports the weak-def bit on any symbol to allow it to
be coalesced, but the compiler still puts weak-def functions/data into alternate
section names, which the linker must map back to the base section name.
This patch makes changes that are necessary to prevent the compiler from using
the "coal" sections and have it use the non-coal sections instead when the
target architecture is not powerpc:
TEXT/textcoal_nt instead use TEXT/text
TEXT/const_coal instead use TEXT/const
DATA/datacoal_nt instead use DATA/data
If the target is powerpc, we continue to use the *coal* sections since anyone
targeting powerpc is probably using an old linker that doesn't have support for
the weak-def bits.
Also, have the assembler issue a warning if it encounters a *coal* section in
the assembly file and inform the users to use the non-coal sections instead.
Quentin Colombet [Thu, 15 Oct 2015 00:41:26 +0000 (00:41 +0000)]
[ARM] Make sure we do not dereference the end iterator when accessing debug
information.
Although the problem was always here, it would only be exposed when
shrink-wrapping is enable.
Davide Italiano [Thu, 15 Oct 2015 00:05:32 +0000 (00:05 +0000)]
[JIT] TrivialMemoryManager: Fail if we can't allocate memory.
TrivialMemoryManager currently doesn't check the return type of AllocateRWX --
and returns a 'null' MemoryBlock to its caller. As pointed out by Lang,
this exposes some serious issues with the MemoryManager interface. There's,
in fact, no way to report back an error to clients rather than aborting in
case memory can't be allocated. Eventually the interface will grow to support
this, but for now, fail sooner rather than later.
Akira Hatanaka [Wed, 14 Oct 2015 23:48:10 +0000 (23:48 +0000)]
[MachO] Stop generating *coal* sections.
Recommit r250342: add -arch=ppc32 to the RUN lines of powerpc tests.
Some background on why we don't have to use *coal* sections anymore:
Long ago when C++ was new and "weak" had not been standardized, an attempt was
made in cctools to support C++ inlines that can be coalesced by putting them
into their own section (TEXT/textcoal_nt instead of TEXT/text).
The current macho linker supports the weak-def bit on any symbol to allow it to
be coalesced, but the compiler still puts weak-def functions/data into alternate
section names, which the linker must map back to the base section name.
This patch makes changes that are necessary to prevent the compiler from using
the "coal" sections and have it use the non-coal sections instead when the
target architecture is not powerpc:
TEXT/textcoal_nt instead use TEXT/text
TEXT/const_coal instead use TEXT/const
DATA/datacoal_nt instead use DATA/data
If the target is powerpc, we continue to use the *coal* sections since anyone
targeting powerpc is probably using an old linker that doesn't have support for
the weak-def bits.
Also, have the assembler issue a warning if it encounters a *coal* section in
the assembly file and inform the users to use the non-coal sections instead.
Cong Hou [Wed, 14 Oct 2015 23:14:17 +0000 (23:14 +0000)]
Update the branch weight metadata in JumpThreading pass.
Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB).
This is the third attempt to submit this patch, while the first two led to failures in some FDO tests. After investigation, it is the edge weight normalization that caused those failures. In this patch the edge weight normalization is fixed so that there is no zero weight in the output and the sum of all weights can fit in 32-bit integer. Several unit tests are added.
Philip Reames [Wed, 14 Oct 2015 22:46:19 +0000 (22:46 +0000)]
[SimplifyCFG] Speculatively flatten CFG based on profiling metadata
If we have a series of branches which are all unlikely to fail, we can possibly combine them into a single check on the fastpath combined with a bit of dispatch logic on the slowpath. We don't want to do this unconditionally since it requires speculating instructions past a branch, but if the profiling metadata on the branch indicates profitability, this can reduce the number of checks needed along the fast path.
The canonical example this is trying to handle is removing the second bounds check implied by the Java code: a[i] + a[i+1]. Note that it can currently only do so for really simple conditions and the values of a[i] can't be used anywhere except in the addition. (i.e. the load has to have been sunk already and not prevent speculation.) I plan on extending this transform over the next few days to handle alternate sequences.
Akira Hatanaka [Wed, 14 Oct 2015 22:45:36 +0000 (22:45 +0000)]
[MachO] Stop generating *coal* sections.
Some background on why we don't have to use *coal* sections anymore:
Long ago when C++ was new and "weak" had not been standardized, an attempt was
made in cctools to support C++ inlines that can be coalesced by putting them
into their own section (TEXT/textcoal_nt instead of TEXT/text).
The current macho linker supports the weak-def bit on any symbol to allow it to
be coalesced, but the compiler still puts weak-def functions/data into alternate
section names, which the linker must map back to the base section name.
This patch makes changes that are necessary to prevent the compiler from using
the "coal" sections and have it use the non-coal sections instead when the
target architecture is not powerpc:
TEXT/textcoal_nt instead use TEXT/text
TEXT/const_coal instead use TEXT/const
DATA/datacoal_nt instead use DATA/data
If the target is powerpc, we continue to use the *coal* sections since anyone
targeting powerpc is probably using an old linker that doesn't have support for
the weak-def bits.
Also, have the assembler issue a warning if it encounters a *coal* section in
the assembly file and inform the users to use the non-coal sections instead.
Philip Reames [Wed, 14 Oct 2015 22:42:12 +0000 (22:42 +0000)]
Tighten known bits for ctpop based on zero input bits
This is a cleaned up patch from the one written by John Regehr based on the findings of the Souper superoptimizer.
The basic idea here is that input bits that are known zero reduce the maximum count that the intrinsic could return. We know that the number of bits required to represent a particular count is at most log2(N)+1.
Chris Bieneman [Wed, 14 Oct 2015 21:50:09 +0000 (21:50 +0000)]
[CMake] Make LLVM_VERSION_* variables user definable
CMake's set command overwrites existing values. Package maintainers may want or need to set the version variables manually, so we need to only set them if they are not already defined. Note I use the "if(NOT DEFINED ...)" syntax deliberately in the last case because empty string is a valid value for the suffx, but not the other variables.
PR25157 identifies a bug where a load plus a vector shuffle is
incorrectly converted into an LXVDSX instruction. That optimization
is only valid if the load is of a doubleword, and in the noted case,
it was not. This corrects that problem.
Joint patch with Eric Schweitz, who provided the bugpoint-reduced test
case.
Daniel Berlin [Wed, 14 Oct 2015 19:54:24 +0000 (19:54 +0000)]
[IDFCalculator] Use DominatorTreeBase instead of DominatorTree
Summary:
IDFCalculator used a DominatorTree instance for its calculations. Since the PostDominatorTree struct is not a subclass of DominatorTree, it wasn't possible to use PDT in IDFCalculator to compute post-dominance frontiers.
This patch makes IDFCalculator work with a DominatorTreeBase<BasicBlock> instead, which enables PDTs to be utilized.
Andrea Di Biagio [Wed, 14 Oct 2015 10:03:13 +0000 (10:03 +0000)]
[x86][FastISel] Teach how to select nontemporal stores.
This patch teaches x86 fast-isel how to select nontemporal stores.
On x86, we can use MOVNTI for nontemporal stores of doublewords/quadwords.
Instructions (V)MOVNTPS/PD/DQ can be used for SSE2/AVX aligned nontemporal
vector stores.
Before this patch, fast-isel always selected 'movd/movq' instead of 'movnti'
for doubleword/quadword nontemporal stores. In the case of nontemporal stores
of aligned vectors, fast-isel always selected movaps/movapd/movdqa instead of
movntps/movntpd/movntdq.
With this patch, if we use SSE2/AVX intrinsics for nontemporal stores we now
always get the expected (V)MOVNT instructions.
The lack of fast-isel support for nontemporal stores was spotted when analyzing
the -O0 codegen for nontemporal stores.
Chris Bieneman [Wed, 14 Oct 2015 07:37:00 +0000 (07:37 +0000)]
[CMake] Set Policy CMP0048 to NEW
CMake 3.0 introduced the VERSION option for the project() command. If you don't specify the VERSION in the function it will clear out variables matching ${PROJECT_NAME}_VERSION_${MAJOR|MINOR|PATCH|TWEAK}.
This makes overriding LLVM_VERSION_* not work properly with newer versions of CMake. To make this work properly we need to:
(1) Optionally set the policy to NEW
(2) Move default versions and setting PACKAGE_VERSION to before the call to project()
(3) If the policy is set, pass the VERSION and LANGUAGES options in the new format
This change should have no behavioral change for CMake versions before 3.0, and it makes the behavior of later versions match the earlier versions.
Craig Topper [Wed, 14 Oct 2015 05:37:42 +0000 (05:37 +0000)]
[X86] Update CPU detection to only enable XSAVE features if the OS has enabled them and the saving of YMM state. This seems to be consistent with gcc behavior.
Richard Smith [Wed, 14 Oct 2015 00:04:19 +0000 (00:04 +0000)]
Rename one of our two llvm::GCOVOptions classes to llvm::GCOV::Options. We used
to get away with this because llvm/Support/GCOV.h was an implementation detail
of the llvm-gcov tool, but it's now being used by FDO.