because the simplifycfg transformation into selects would happen to happen
before the simplifycfg transformation that removes unreachable control flow
(We have 'unreachable control flow' due to the store to null which is undefined
behavior).
The existing transformation that removes unreachable control flow in simplifycfg
is:
/// If BB has an incoming value that will always trigger undefined behavior
/// (eg. null pointer dereference), remove the branch leading here.
static bool removeUndefIntroducingPredecessor(BasicBlock *BB)
Now we generate:
define void @test6(i1 %cond, i8* %ptr) {
store i8 2, i8* %ptr.2, align 8
ret void
}
I did not see any impact on the test-suite + externals.
Sanjay Patel [Thu, 9 Oct 2014 21:26:35 +0000 (21:26 +0000)]
Improve sqrt estimate algorithm (fast-math)
This patch changes the fast-math implementation for calculating sqrt(x) from:
y = 1 / (1 / sqrt(x))
to:
y = x * (1 / sqrt(x))
This has 2 benefits: less code / faster code and one less estimate instruction
that may lose precision.
The only target that will be affected (until http://reviews.llvm.org/D5658 is approved)
is PPC. The difference in codegen for PPC is 2 less flops for a single-precision sqrtf
or vector sqrtf and 4 less flops for a double-precision sqrt.
We also eliminate a constant load and extra register usage.
Samuel Antao [Thu, 9 Oct 2014 20:42:56 +0000 (20:42 +0000)]
Fix bug in GPR to FPR moves in PPC64LE.
The current implementation of GPR->FPR register moves uses a stack slot. This mechanism writes a double word and reads a word. In big-endian the load address must be displaced by 4-bytes in order to get the right value. In little endian this is no longer required. This patch fixes the issue and adds LE regression tests to fast-isel-conversion which currently expose this problem.
Lang Hames [Thu, 9 Oct 2014 18:20:51 +0000 (18:20 +0000)]
[PBQP] Replace PBQPBuilder with composable constraints (PBQPRAConstraint).
This patch removes the PBQPBuilder class and its subclasses and replaces them
with a composable constraints class: PBQPRAConstraint. This allows constraints
that are only required for optimisation (e.g. coalescing, soft pairing) to be
mixed and matched.
This patch also introduces support for target writers to supply custom
constraints for their targets by overriding a TargetSubtargetInfo method:
Tom Stellard [Thu, 9 Oct 2014 18:09:15 +0000 (18:09 +0000)]
R600/SI: Legalize INSERT_SUBREG instructions during PostISelFolding
LLVM assumes INSERT_SUBREG will always have register operands, so
we need to legalize non-register operands, like FrameIndexes, to
avoid random assertion failures.
Bill Schmidt [Thu, 9 Oct 2014 17:51:35 +0000 (17:51 +0000)]
[PPC64] VSX indexed-form loads use wrong instruction format
The VSX instruction definitions for lxsdx, lxvd2x, lxvdsx, and lxvw4x
incorrectly use the XForm_1 instruction format, rather than the
XX1Form instruction format. This is likely a pasto when creating
these instructions, which were based on lvx and so forth. This patch
uses the correct format.
The existing reformatting test (test/MC/PowerPC/vsx.s) missed this
because the two formats differ only in that XX1Form has an extension
to the target register field in bit 31. The tests for these
instructions used a target register of 7, so the default of 0 in bit
31 for XForm_1 didn't expose a problem. For register numbers 32-63
this would be noticeable. I've changed the test to use higher
register numbers to verify my change is effective.
David Blaikie [Thu, 9 Oct 2014 16:50:53 +0000 (16:50 +0000)]
Sink DwarfDebug::constructInlinedScopeDIE into DwarfCompileUnit
This introduces access to the AbstractSPDies map from DwarfDebug so
DwarfCompileUnit can access it. Eventually this'll sink down to
DwarfFile, but it'll still be generically accessible - not much
encapsulation to provide it. (constructInlinedScopeDIE could stay
further up, in DwarfFile to avoid exposing this - but I don't think
that's particularly better)
[InstCombine] Fix wrong folding of constant comparisons involving ashr and negative values.
This patch fixes a bug in method InstCombiner::FoldCmpCstShrCst where we
wrongly computed the distance between the highest bits set of two negative
values.
David Majnemer [Thu, 9 Oct 2014 08:42:31 +0000 (08:42 +0000)]
Object, COFF: Move the VirtualSize/SizeOfRawData logic to getSectionSize
While getSectionContents was updated to do the right thing,
getSectionSize wasn't. Move the logic to getSectionSize and leverage it
from getSectionContents.
David Majnemer [Thu, 9 Oct 2014 07:49:28 +0000 (07:49 +0000)]
Object, COFF: Cap the section contents to min(VirtualSize, SizeOfRawData)
It is not useful to return the data beyond VirtualSize it's less than
SizeOfRawData.
An implementation detail of COFF requires the section size to be rounded
up to a multiple of FileAlignment; this means that SizeOfRawData is not
representative of how large the section is. Instead, we should cap it
to VirtualSize when this occurs as it represents the true size of the
section.
Note that this is only relevant in executable files because this
rounding doesn't occur in object files (and VirtualSize is always zero).
David Blaikie [Thu, 9 Oct 2014 00:11:39 +0000 (00:11 +0000)]
Sink DwarfDebug::addScopeRangeList down into DwarfCompileUnit
(& add a few accessors/make a couple of things public for this - it's a
bit of a toss-up, but I think I prefer it this way, keeping some more of
the meaty code down in DwarfCompileUnit - if only to make for smaller
implementation files, etc)
I think we could simplify range handling a bit if we removed the range
lists from each unit and just put a single range list on DwarfDebug,
similar to address pooling.
Adam Nemet [Wed, 8 Oct 2014 23:25:37 +0000 (23:25 +0000)]
[AVX512] Intrinsics for vextract*x4
This adds the Pat<>'s for the intrinsics. These are necessary because we
don't lower these intrinsics to SDNodes but match them directly. See the
rational in the previous commit.
Adam Nemet [Wed, 8 Oct 2014 23:25:33 +0000 (23:25 +0000)]
[AVX512] Add asm-only support for vextract*x4 masking variants
These derive from the new asm-only masking definitions.
Unfortunately I wasn't able to find a ISel pattern that we could legally
generate for the masking variants. The problem is that since the destination
is v4* we would need VK4 register classes and v4i1 value types to express the
masking. These are however not legal types/classes in AVX512f but only in VL,
so things get complicated pretty quickly. We can revisit this question later
if we have a more pressing need to express something like this.
So the ISel patterns are empty for the masking instructions and the next patch
will add Pat<>s instead to match the intrinsics calls with instructions.
Alexey Samsonov [Wed, 8 Oct 2014 22:57:47 +0000 (22:57 +0000)]
Use llvm-symbolizer to symbolize LLVM/Clang crash dumps.
This change modifies fatal signal handler used in LLVM tools.
Now it attempts to find llvm-symbolizer binary and communicates
with it in order to turn instruction addresses into
function/file/line info entries. This should significantly improve
stack traces readability in Debug builds.
This feature only works on selected platforms (including Darwin
and Linux). If the symbolization fails for some reason, signal
handler will fallback to the original behavior.
David Blaikie [Wed, 8 Oct 2014 22:20:02 +0000 (22:20 +0000)]
Push DwarfDebug::constructScopeDIE down into DwarfCompileUnit
One of many steps to generalize subprogram emission to both the DWO and
non-DWO sections (to emit -gmlt-like data under fission). Once the
functions are pushed down into DwarfCompileUnit some of the data
structures will be pushed at least into DwarfFile so that they can be
unique per-file, allowing emission to both files independently.
Robin Morisset [Wed, 8 Oct 2014 19:38:18 +0000 (19:38 +0000)]
[X86] Avoid generating inc/dec when slow for x.atomic_store(1 + x.atomic_load())
Summary:
I had forgotten to check for NotSlowIncDec in the patterns that can generate
inc/dec for the above pattern (added in D4796).
This currently applies to Atom Silvermont, KNL and SKX.
David Majnemer [Wed, 8 Oct 2014 19:32:32 +0000 (19:32 +0000)]
Inliner: Non-local functions in COMDATs shouldn't be dropped
A function with discardable linkage cannot be discarded if its a member
of a COMDAT group without considering all the other COMDAT members as
well. This sort of thing is already handled by GlobalOpt/GlobalDCE.
Remove bogus std::error_code returns form SectionRef.
There are two methods in SectionRef that can fail:
* getName: The index into the string table can be invalid.
* getContents: The section might point to invalid contents.
Every other method will always succeed and returning and std::error_code just
complicates the code. For example, a section can have an invalid alignment,
but if we are able to get to the section structure at all and create a
SectionRef, we will always be able to read that invalid alignment.
Frederic Riss [Wed, 8 Oct 2014 14:59:44 +0000 (14:59 +0000)]
Update dwarf::ApplePropertyAttributes enum to meaningful values.
Summary:
We currently emit an DW_AT_APPLE_property_attribute with a value that is a
bitfield describing the various attributes applied to an ObjectiveC property.
While trying to add testing to one of my dwarfdump patches that would pretty
print that, I realized this information looks totally broken and has maybe
never been correct.
As with every DWARF info, we have some enum in Dwarf.h that describes this
attribute (enum ApplePropertyAttributes). It seems however that the attribute
value is set from another definition of these flags in Sema/DeclSpec.h (enum
ObjCPropertyAttributeKind). And these 2 enums aren't in sync.
This patch updates the Dwarf.h values to the ones we are (and have been for
a very long time) emitting. We change some publicly (and even documented
in SourceLevelDebugging.rst) values, but I doubt this could be an issue as
the information has been wrong for so long...
Robert Khasanov [Wed, 8 Oct 2014 14:37:45 +0000 (14:37 +0000)]
[AVX512] Refactoring of avx512_binop_rm multiclass through AVX512_masking.
Added new argrument for AVX512_masking: InstrItinClass and bit isCommutable.
No functional change.
Renato Golin [Wed, 8 Oct 2014 09:32:47 +0000 (09:32 +0000)]
Update git-svnrevert to accept git and svn revisions
Interchangeable commit ids can now be used on this git-svnrevert, which
will figure out what kind of commit that is (if you use format rNNNN for SVN
commits) and make sure the right ids are used in the right places.
[InstCombine] re-commit r218721 with fix for pr21199
The icmp-select-icmp optimization targets select-icmp.eq
only. This is now ensured by testing the branch predicate
explictly. This commit also includes the test case for pr21199.
David Majnemer [Wed, 8 Oct 2014 06:38:53 +0000 (06:38 +0000)]
COFF: Don't oversize COMMON symbols when targeting BFD ld
COFF normally doesn't allow us to describe the alignment of COMMON
symbols.
It turns out that most linkers use the symbol size as a hint as to how
aligned the symbol should be.
However the BFD folks have added a .drectve command, which we
now support as of r219229, that allows us to specify the alignment
precisely. With this in mind, stop rounding sizes up.
Cache SelectionDAGISel TargetInstrInfo lookups on the class and
propagate. Also use the TargetSubtargetInfo and the MachineFunction
and move TargetRegisterInfo query closer to uses.
Reset the target options and optimization level as the first
thing we do inside selection dag. This code needs to be
migrated to queries on the function rather than global
data, but this organizes things before we start grabbing
the subtarget.