Ulrich Weigand [Fri, 20 Jun 2014 16:34:05 +0000 (16:34 +0000)]
[PowerPC] Fix small argument stack slot offset for LE
When small arguments (structures < 8 bytes or "float") are passed in a
stack slot in the ppc64 SVR4 ABI, they must reside in the least
significant part of that slot. On BE, this means that an offset needs
to be added to the stack address of the parameter, but on LE, the least
significant part of the slot has the same address as the slot itself.
This changes the PowerPC back-end ABI code to only add the small
argument stack slot offset for BE. It also adds test cases to verify
the correct behavior on both BE and LE.
Rafael Espindola [Fri, 20 Jun 2014 13:11:28 +0000 (13:11 +0000)]
Allow a target to create a null streamer.
Targets can assume that a target streamer is present, so they have to be able
to construct a null streamer in order to set the target streamer in it to.
Fixes a crash when using the null streamer with arm.
Yaron Keren [Fri, 20 Jun 2014 12:20:56 +0000 (12:20 +0000)]
Reverting size_type for the containers from size_type to unsigned.
Various places in LLVM assume that container size and count are unsigned
and do not use the container size_type. Therefore they break compilation
(or possibly executation) for LP64 systems where size_t is 64 bit while
unsigned is still 32 bit.
If we'll ever that many items in the container size_type could be made
size_t for a specific containers after reviweing its other uses.
Yaron Keren [Fri, 20 Jun 2014 10:26:56 +0000 (10:26 +0000)]
The count() function for STL datatypes returns unsigned, even where it's
only 1/0 result like std::set. Some of the LLVM ADT already return unsigned
count(), while others still return bool count().
In continuation to r197879, this patch modifies DenseMap, DenseSet,
ScopedHashTable, ValueMap:: count() to return size_type instead of bool,
1 instead of true and 0 instead of false.
size_type is typedef-ed locally within each class to size_t.
Karthik Bhat [Fri, 20 Jun 2014 04:32:48 +0000 (04:32 +0000)]
Add Support to Recognize and Vectorize NON SIMD instructions in SLPVectorizer.
This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub.../add,sub,add,sub... and
vectorizes them as vector shuffles if they are profitable.
These patterns of vector shuffle can later be converted to instructions such as addsubpd etc on X86.
Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015
Start extracting helper functions out of -block-freq's `UnsignedFloat`
into `Support/ScaledNumber.h` with the eventual goal of moving and
renaming the class to `ScaledNumber`.
The bike shed about names is still being painted, but I'm going with
this for now.
Chandler Carruth [Fri, 20 Jun 2014 01:05:28 +0000 (01:05 +0000)]
[x86] Make the x86 PACKSSWB, PACKSSDW, PACKUSWB, and PACKUSDW
instructions available as synthetic SDNodes PACKSS and PACKUS that will
select to the correct instruction variants based on the return type.
This allows us to use these rather important instructions when lowering
vector shuffles.
Also moves the relevant instruction definitions to be split out from
the fully generic multiclasses to allow them to match these new SDNodes
in the same way that the UNPCK instructions do.
Hans Wennborg [Fri, 20 Jun 2014 00:38:12 +0000 (00:38 +0000)]
Don't build switch lookup tables for dllimport or TLS variables
We would previously put dllimport variables in switch lookup tables, which
doesn't work because the address cannot be used in a constant initializer.
This is basically the same problem that we have in PR19955.
Putting TLS variables in switch tables also desn't work, because the
address of such a variable is not constant.
Rafael Espindola [Thu, 19 Jun 2014 23:06:53 +0000 (23:06 +0000)]
The gold plugin doesn't need disassemblers.
Back in r128440 tools/LTO started exporting the disassembler interface. It
was never clear why, but whatever the reason I am pretty sure it doesn't hold
for tools/gold.
Kevin Enderby [Thu, 19 Jun 2014 22:49:21 +0000 (22:49 +0000)]
Fix the output of llvm-nm for Mach-O files to use the characters ‘d’ and ‘b’ for
data and bss symbols instead of the generic ’s’ for a symbol in a section.
Kevin Enderby [Thu, 19 Jun 2014 22:03:18 +0000 (22:03 +0000)]
Change the output of llvm-nm and llvm-size for Mach-O universal files (aka
fat files) to print “ (for architecture XYZ)” for fat files with more than
one architecture to be like what the darwin tools do for fat files.
Also clean up the Mach-O printing of archive membernames in llvm-nm to use
the darwin form of "libx.a(foo.o)".
Rafael Espindola [Thu, 19 Jun 2014 21:14:13 +0000 (21:14 +0000)]
Use lib/LTO directly in the gold plugin.
The tools/lto API is not the best choice for implementing a gold plugin. Among
other issues:
* It is an stable ABI. Old errors stay and we have to be really careful
before adding new features.
* It has to support two fairly different linkers: gold and ld64.
* We end up with a plugin that depends on a shared lib, something quiet
unusual in LLVM land.
* It hides LLVM. For some features in the gold plugin it would be really
nice to be able to just get a Module or a GlobalValue.
This change is intended to be a very direct translation from the C API. It
will just enable other fixes and cleanups.
Eric Christopher [Thu, 19 Jun 2014 21:03:04 +0000 (21:03 +0000)]
Add a new subtarget hook for whether or not we'd like to enable
the atomic load linked expander pass to run for a particular
subtarget. This requires a check of the subtarget and so save
the TargetMachine rather than only TargetLoweringInfo and update
all callers.
Zachary Turner [Thu, 19 Jun 2014 20:20:03 +0000 (20:20 +0000)]
Include Threading.h instead of forward declaring a function.
Previously this led to a circular header dependency, but a recent
change has since removed this dependency, so the correct fix is
to simply include the header rather than forward declare.
Eric Christopher [Thu, 19 Jun 2014 20:00:13 +0000 (20:00 +0000)]
Since we're using DW_AT_string rather than DW_AT_strp for debug_info
for assembly files we can't depend on the offset within the section
after a string since it could be different between producers etc.
Relax these tests accordingly.
Rafael Espindola [Thu, 19 Jun 2014 19:45:25 +0000 (19:45 +0000)]
Remove an incorrect fixme.
dynamic-no-pic is just another output type. If gnu ld gets support for MachO,
it should also add something like LDPO_DYN_NO_PIC to the plugin interface.
David Greene [Thu, 19 Jun 2014 19:31:09 +0000 (19:31 +0000)]
Add option to keep flavor out of the install directory
Sometimes we want to install things in "standard" locations and the
flavor directories interfere with that. Add an option to keep them
out of the install path.
Zachary Turner [Thu, 19 Jun 2014 18:18:23 +0000 (18:18 +0000)]
Remove support for LLVM runtime multi-threading.
After a number of previous small iterations, the functions
llvm_start_multithreaded() and llvm_stop_multithreaded() have
been reduced essentially to no-ops. This change removes them
entirely.
Zachary Turner [Thu, 19 Jun 2014 16:17:42 +0000 (16:17 +0000)]
Kill the LLVM global lock.
This patch removes the LLVM global lock, and updates all existing
users of the global lock to use their own mutex. None of the
existing users of the global lock were protecting code that was
mutually exclusive with any of the other users of the global
lock, so its purpose was not being met.
Oliver Stannard [Thu, 19 Jun 2014 15:52:37 +0000 (15:52 +0000)]
Emit DWARF info for all code section in an assembly file
Currently, when using llvm as an assembler, DWARF debug information is only
generated for the .text section. This patch modifies this so that DWARF info
is emitted for all executable sections.
Oliver Stannard [Thu, 19 Jun 2014 15:39:33 +0000 (15:39 +0000)]
Emit DWARF3 call frame information when DWARF3+ debug info is requested
Currently, llvm always emits a DWARF CIE with a version of 1, even when emitting
DWARF 3 or 4, which both support CIE version 3. This patch makes it emit the
newer CIE version when we are emitting DWARF 3 or 4. This will not reduce
compatibility, as we already emit other DWARF3/4 features, and is worth doing as
the DWARF3 spec removed some ambiguities in the interpretation of call frame
information.
It also fixes a minor bug where the "return address" field of the CIE was
encoded as a ULEB128, which is only valid when the CIE version is 3. There are
no test changes for this, because (as far as I can tell) none of the platforms
that we test have a return address register with a DWARF register number >127.
Matheus Almeida [Thu, 19 Jun 2014 15:08:04 +0000 (15:08 +0000)]
[mips] Implementation of dli.
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL
Some small modifications to the original patch: we now error if
it's not possible to expand an instruction (mips-expansions-bad.s has some
examples). Added some comments to the expansions.
Matheus Almeida [Thu, 19 Jun 2014 14:39:14 +0000 (14:39 +0000)]
[mips] Small update to the logic behind the expansion of assembly pseudo instructions.
Summary:
The functions that do the expansion now return false on success and true otherwise. This is so
we can catch some errors during the expansion (e.g.: immediate too large). The next patch adds some test cases.
Dinesh Dwivedi [Thu, 19 Jun 2014 10:36:52 +0000 (10:36 +0000)]
Added instruction combine to transform few more negative values addition to subtraction (Part 1)
This patch enables transforms for following patterns.
(x + (~(y & c) + 1) --> x - (y & c)
(x + (~((y >> z) & c) + 1) --> x - ((y>>z) & c)
Andrea Di Biagio [Thu, 19 Jun 2014 10:29:41 +0000 (10:29 +0000)]
[X86] Teach how to combine horizontal binop even in the presence of undefs.
Before this change, the backend was unable to fold a build_vector dag
node with UNDEF operands into a single horizontal add/sub.
This patch teaches how to combine a build_vector with UNDEF operands into a
horizontal add/sub when possible. The algorithm conservatively avoids to combine
a build_vector with only a single non-UNDEF operand.
Added test haddsub-undef.ll to verify that we correctly fold horizontal binop
even in the presence of UNDEFs.
Dinesh Dwivedi [Thu, 19 Jun 2014 08:29:18 +0000 (08:29 +0000)]
Refactored and updated SimplifyUsingDistributiveLaws() to
* Find factorization opportunities using identity values.
* Find factorization opportunities by treating shl(X, C) as mul (X, shl(C))
* Keep NSW flag while simplifying instruction using factorization.
InstCombineAddSub has:
// W*X + Y*Z --> W * (X+Z) iff W == Y
These two transforms could fight with each other if C1*CI would not fold
away to something simpler than a ConstantExpr mul.
The InstCombineMulDivRem transform only acted on ConstantInts until
r199602 when it was changed to operate on all Constants in order to
let it fire on ConstantVectors.
To fix this, make this transform more careful by checking to see if we
actually folded away C1*CI.
Eric Christopher [Thu, 19 Jun 2014 06:22:08 +0000 (06:22 +0000)]
Move -dwarf-version to an MC level command line option so it's
used by all of the MC level tools and codegen. Fix up all uses
in the compiler to use this and set it on the context accordingly.
Nick Lewycky [Thu, 19 Jun 2014 03:51:46 +0000 (03:51 +0000)]
Move optimization of some cases of (A & C1)|(B & C2) from instcombine to instsimplify. Patch by Rahul Jain, plus some last minute changes by me -- you can blame me for any bugs.
Nick Lewycky [Thu, 19 Jun 2014 03:28:28 +0000 (03:28 +0000)]
Remove redundant code in InstCombineShift, no functionality change because instsimplify already does this and instcombine calls instsimplify a few lines above. Patch by Suyog Sarda!
Kevin Enderby [Wed, 18 Jun 2014 22:04:40 +0000 (22:04 +0000)]
Teach llvm-size to know about Mach-O universal files (aka fat files) and
fat files containing archives.
Also fix a bug in MachOUniversalBinary::ObjectForArch::ObjectForArch()
where it needed a >= when comparing the Index with the number of
objects in a fat file. As the index starts at 0.
Matt Arsenault [Wed, 18 Jun 2014 22:03:45 +0000 (22:03 +0000)]
R600: Handle fnearbyint
The difference from rint isn't really relevant here,
so treat them as equivalent. OpenCL doesn't have nearbyint,
so this is sort of pointless other than for completeness.
Marek Olsak [Wed, 18 Jun 2014 22:00:29 +0000 (22:00 +0000)]
R600/SI: add gather4 and getlod intrinsics (v3)
This contains all the previous patches + getlod support on top of it.
It doesn't use SDNodes anymore, so it's quite small.
It also adds v16i8 to SReg_128, which is used for the sampler descriptor.
Reviewed-by: Tom Stellard
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211228 91177308-0d34-0410-b5e6-96231b3b80d8
Zachary Turner [Wed, 18 Jun 2014 20:17:35 +0000 (20:17 +0000)]
Replace Execution Engine's mutex with std::recursive_mutex.
This change has a bit of a trickle down effect due to the fact that
there are a number of derived implementations of ExecutionEngine,
and that the mutex is not tightly encapsulated so is used by other
classes directly.
Diego Novillo [Wed, 18 Jun 2014 18:46:58 +0000 (18:46 +0000)]
Simply test for available locations in optimization remarks.
When emitting optimization remarks, we test for the presence of
instruction locations by testing for a valid llvm.dbg.cu annotation.
This is slightly inefficient because we can simply ask whether the
debug location we have is known or not.
Additionally, if my current plan works, I will need to remove the
llvm.dbg.cu annotation from the IL (or prevent it from being generated)
when -Rpass is used without -g. In those cases, we'll want to generate
line tables but we will want to prevent code generation from emitting
DWARF code for them.
Ulrich Weigand [Wed, 18 Jun 2014 18:33:36 +0000 (18:33 +0000)]
[PowerPC] Remove unnecessary load of r12 in indirect call
When looking at the 64-bit SVR4 indirect call sequence, I noticed
an unnecessary load of r12. And indeed the code says:
// R12 must contain the address of an indirect callee.
But this is not correct; in the 64-bit SVR4 (ELFv1) ABI, there is
no need to load r12 at this point. It seems this code and comment
is a remnant of code originally shared with the Darwin ABI ...