[unroll] Merge the simplification and DCE estimation methods on the
UnrollAnalyzer.

Now they share a single worklist and have less implicit state between
them. There was no real benefit to separating these two things out.

I'm going to subsequently refactor things to share even more code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229062 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Remove pointless dyn_cast<>s to Instruction - the users of an
instruction must by definition be instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229061 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Don't check whether an instruction is contained in the loop set
each time we try to add it to the worklist; just check this when pulling
it off the worklist. That way we do it at most once per instruction, with
the cost of the worklist set we would need to pay anyways.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229060 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Change the other worklist in the unroll analyzer to be a set
vector.

In addition to dramatically reducing the work required for contrived
example loops, this also has to correct some serious latent bugs in the
cost computation. Previously, we might add an instruction onto the
worklist once for every simplified load which it used. Then we
would visit it many times and accumulate "savings" each time.

I mean, fortunately this couldn't matter for things like calls with 100s
of operands, but even for binary operators this code seems like it must
be double counting the savings.

I just noticed this by inspection and due to the runtime problems it can
introduce, I don't have any test cases for cases where the cost produced
by this routine is unacceptable.
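
A minimal sketch of why a set vector avoids the double counting, assuming plain integer ids in place of instructions. `SimpleSetVector` and `countSavings` are hypothetical stand-ins written with standard containers, not the LLVM ADT class or the code from this patch.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a set vector: keeps insertion order and
// silently rejects duplicates.
template <typename T> class SimpleSetVector {
  std::vector<T> Order;
  std::unordered_set<T> Seen;

public:
  // Returns true only if the value was not already present.
  bool insert(const T &Value) {
    if (!Seen.insert(Value).second)
      return false;
    Order.push_back(Value);
    return true;
  }

  bool empty() const { return Order.empty(); }
  std::size_t size() const { return Order.size(); }

  // Removes and returns the most recently inserted value.
  T pop_back_val() {
    T Value = Order.back();
    Order.pop_back();
    Seen.erase(Value);
    return Value;
  }
};

// Each entry of EnqueuedUsers models one "simplified load enqueues its
// user" event. The savings are counted once per distinct user.
int countSavings(const std::vector<int> &EnqueuedUsers) {
  SimpleSetVector<int> Worklist;
  for (int User : EnqueuedUsers)
    Worklist.insert(User);

  int Savings = 0;
  while (!Worklist.empty()) {
    Worklist.pop_back_val();
    ++Savings;
  }
  return Savings;
}
```

With a plain vector as the worklist, a user enqueued three times would contribute three units of "savings"; here it contributes one.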

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229059 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Replace a boolean, for loop, condition, and break with
std::all_of and a lambda. Much cleaner, no functionality
changed.
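
For illustration only, the shape of this kind of rewrite looks roughly like the sketch below. The predicate and function names are hypothetical and are not taken from the patch.

```cpp
#include <algorithm>
#include <vector>

// Before: a boolean flag, a for loop, a condition, and a break.
bool allPositiveLoop(const std::vector<int> &Values) {
  bool AllPositive = true;
  for (int V : Values) {
    if (V <= 0) {
      AllPositive = false;
      break;
    }
  }
  return AllPositive;
}

// After: the same check expressed with std::all_of and a lambda.
bool allPositiveAllOf(const std::vector<int> &Values) {
  return std::all_of(Values.begin(), Values.end(),
                     [](int V) { return V > 0; });
}
```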

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229058 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Directly query for dead instructions.

In the unroll analyzer, we check each user to see if that user
will become dead. However, we first checked if that user was missing
from the simplified values map, and then if it was also missing from the
dead instructions set. We add everything from the simplified values map
to the dead instructions set, so the first step is completely subsumed
by the second. Moreover, the first step requires *inserting* something
into the simplified value map which isn't what we want at all.

This also replaces a dyn_cast with a cast as an instruction cannot be
used by a non-instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229057 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Replace a linear time check for no uses with a constant time
check.

Also hoist this into the enqueue process as it is faster even than
testing the worklist set. We should just directly filter these out much
like we filter out constants and such.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229056 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Rather than an operand set, use a setvector for the worklist.

We don't just want to handle duplicate operands within an instruction,
but also duplicates across operands of different instructions. I should
have gone straight to this, but I had briefly convinced myself that it
wasn't going to be necessary. I've come to my senses after chatting
more with Nick, and am now happier here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229054 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Extract the code to enqueue operands for the worklist in the
unroll analysis into a lambda and call it. That's much simpler than
duplicating all the code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229053 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Use a small set to de-duplicate operands prior to putting them
into the worklist. This avoids allocating lots of worklist memory for
them when there are large numbers of repeated operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229052 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Make the unroll cost analysis terminate deterministically and
reasonably quickly.

I don't have a reduced test case, but for a version of FFMPEG, this
makes the loop unroller start finishing at all (after over 15 minutes of
running, it hadn't terminated for me, no idea if it was a true infloop
or just exponential work).

The key thing here is to check the DeadInstructions set when pulling
things off the worklist. Without this, we would re-walk the user list of
already dead instructions again and again and again. Consider phi nodes
with many, many operands and other patterns.

The other important aspect of this is that because we would keep
re-visiting instructions that were already known dead, we kept adding
their cost savings to the total! This would cause our cost savings to be
*insanely* inflated.

While I was here, I also rotated the operand walk out of the worklist
loop to make the code easier to read. There is still work to be done to
minimize worklist traffic because we don't de-duplicate operands. This
means we may add the same instruction onto the worklist 1000s of times
if it shows up in 1000s of operands to a PHI node for example.

Still, with this patch, the ffmpeg testcase I have finishes quickly and
I can't measure the runtime impact of the unroll analysis any more. I'll
probably try to do a few more cleanups to this code, but not sure how
much cleanup I can justify right now.
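
The key check can be sketched as follows, under simplifying assumptions: instructions are plain integer ids, `Users[i]` lists the users of instruction `i`, and every user of a dead instruction is treated as dead (the real analysis is more careful about that). `countDead` is a hypothetical name.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Walks the user graph starting at Root and returns how many distinct
// instructions were marked dead, which doubles as the "savings" count.
std::size_t countDead(const std::vector<std::vector<int>> &Users, int Root) {
  std::set<int> DeadInstructions;
  std::vector<int> Worklist;
  Worklist.push_back(Root);

  std::size_t Savings = 0;
  while (!Worklist.empty()) {
    int Inst = Worklist.back();
    Worklist.pop_back();

    // Check the dead set when pulling things off the worklist: anything
    // already known dead is skipped, so its users are not re-walked and
    // its savings are not counted again.
    if (!DeadInstructions.insert(Inst).second)
      continue;

    ++Savings;
    for (int User : Users[Inst])
      Worklist.push_back(User);
  }
  return Savings;
}
```

Because of the check, the loop terminates even when the user graph contains cycles, and duplicate worklist entries cost one set lookup each rather than a full re-walk.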

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229038 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Drop never-used defaults for DIBuilder::createTemplate*(), NFC

No caller specifies anything different; these parameters are dead code
and probably always have been. The new hierarchy doesn't bother with
the fields at all (see r228607 and r228652).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229037 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Remove unnecessary check for fpimm

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229034 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Make range based for loops a bit more explicit and more
readable.

The biggest thing that was causing me problems is recognizing the
references vs. pointers here. I also found that for maps naming the loop
variable as KeyValue helps make it obvious why you don't actually use it
directly. Finally, using 'auto' instead of 'User *' doesn't seem like
a good tradeoff. Much like with the other cases, I like to know it's
a pointer, and 'User' is just as long and tells the reader a lot more.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229033 91177308-0d34-0410-b5e6-96231b3b80d8

Bitcode: Remove confusing '?' from r229004, NFC

The name is always part of the record, it just might be empty. Remove
the `?` for clarity.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229032 91177308-0d34-0410-b5e6-96231b3b80d8

Bitcode: Add trailing comma to MetadataCodes, NFC

Suggested in the review of r229004, this should simplify diffs
in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229031 91177308-0d34-0410-b5e6-96231b3b80d8

[IC] Fix a bug with the instcombine canonicalizing of loads and
propagating of metadata.

We were propagating !nonnull metadata even when the newly formed load is
no longer of a pointer type. This is clearly broken and results in LLVM
failing the verifier and aborting. This patch just restricts the
propagation of !nonnull metadata to when we actually have a pointer
type.

This bug report and the initial version of this patch were provided by
Charles Davis! Many thanks for finding this!

We still need to add logic to round-trip the metadata correctly if we
combine from pointer types to integer types and then back by using range
metadata for the integer type loads. But this is the minimal and safe
version of the patch, which is important so we can backport it into 3.6.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229029 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Avoid the "Insn" abbreviation of Instruction. This is quite
hard to type and read for me, and is inconsistent with the other
abbreviation in the base class "Inst". For most of these (where they are
used widely) I prefer just spelling it out as Instruction. I've changed
two of the short-lived variables to use "Inst" to match the base class.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229028 91177308-0d34-0410-b5e6-96231b3b80d8

Check interleaving without relying on debug output.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229027 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Tidy up the integer we use to accumulate the number of
instructions optimized. NFC, just separating this out from the
functionality changing commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229026 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDImportedEntity

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229025 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDObjCProperty

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229024 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDExpression

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229023 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDLocalVariable

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229022 91177308-0d34-0410-b5e6-96231b3b80d8

Fix the build, I forgot to check that UnitTests still built.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229021 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDGlobalVariable

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229020 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDTemplate{Type,Value}Parameter

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229019 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDNamespace

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229018 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDLexicalBlockFile

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229017 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDLexicalBlock

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229016 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter: MDSubprogram: Recognize DW_VIRTUALITY in 'virtuality'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229015 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDSubprogram

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229014 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDCompileUnit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229013 91177308-0d34-0410-b5e6-96231b3b80d8

Improve llvm-pdbdump output display.

This patch adds a number of improvements to llvm-pdbdump.

1) Dumping of the entire global scope, and not only those
symbols that live in individual compilands.
2) Prepend class name to member functions and data
3) Improved display of bitfields.
4) Support for dumping more kinds of data symbols.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229012 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDSubroutineType

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229011 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter: MDCompositeType: Recognize DW_LANG in 'runtimeLang'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229010 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDDerivedType and MDCompositeType

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229009 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDFile

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229007 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter: MDBasicType: Recognize DW_ATE in 'encoding'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229006 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDBasicType

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229005 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDEnumerator

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229004 91177308-0d34-0410-b5e6-96231b3b80d8

AsmWriter/Bitcode: MDSubrange

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229003 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Add MDExpression::ExprOperand

Port `DIExpression::Operand` over to `MDExpression::ExprOperand`. The
logic is needed directly in `MDExpression` to support printing in
assembly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229002 91177308-0d34-0410-b5e6-96231b3b80d8

Support: Add dwarf::getOperationEncoding()

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229001 91177308-0d34-0410-b5e6-96231b3b80d8

Support: Rewrite LocationAtom and OperationEncodingString(), NFC

Use `Dwarf.def` more.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229000 91177308-0d34-0410-b5e6-96231b3b80d8

[LinkModules] Change the way ModuleLinker merges triples.

This commit makes the following changes:

- Stop issuing a warning when the triples' string representations do not match
exactly if the Triple objects generated from the strings compare equal.

- On Apple platforms, choose the triple that has the larger minimum version
number.

rdar://problem/16743513

Differential Revision: http://reviews.llvm.org/D7591

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228999 91177308-0d34-0410-b5e6-96231b3b80d8

PPCFrameLowering's FramePointerOffset can be computed at initialization
time. Do so.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228998 91177308-0d34-0410-b5e6-96231b3b80d8

The TOC save offset can be computed at compile time, do so and
propagate changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228997 91177308-0d34-0410-b5e6-96231b3b80d8

The return save offset can be computed at initialization time - do
so and save the value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228996 91177308-0d34-0410-b5e6-96231b3b80d8

Testcase for r228988.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228995 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Don't use a map from pointer to bool. Use a set.

This is much more efficient. In particular, the query with the user
instruction has to insert a false for every missing instruction into the
set. This is just a cleanup along the way to fixing the underlying
algorithm problems here.
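
The insertion side effect is easy to reproduce with standard containers. The sketch below uses `std::map` and `std::set` as stand-ins for whatever map and set types the analyzer uses; the function names are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <set>

// Querying a map with operator[] default-inserts `false` for a missing
// key, so a pure lookup grows the container.
std::size_t sizeAfterMapQuery(const int *Key) {
  std::map<const int *, bool> IsDead;
  bool Dead = IsDead[Key]; // inserts {Key, false} as a side effect
  (void)Dead;
  return IsDead.size();
}

// Querying a set with count() never modifies it.
std::size_t sizeAfterSetQuery(const int *Key) {
  std::set<const int *> DeadSet;
  bool Dead = DeadSet.count(Key) != 0;
  (void)Dead;
  return DeadSet.size();
}
```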

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228994 91177308-0d34-0410-b5e6-96231b3b80d8

llvm/test/Transforms/LoopVectorize/PowerPC/small-loop-rdx.ll REQUIRES +Asserts due to -debug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228989 91177308-0d34-0410-b5e6-96231b3b80d8

Prevent division by 0.

When we try to estimate number of potentially removed instructions in
loop unroller, we analyze first N iterations and then scale the
computed number by TripCount/N. We should bail out early if N is 0.
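
A minimal sketch of the guard, assuming the scaling is done as a multiply followed by a divide. The function and parameter names are hypothetical, and returning 0 on bail-out is an assumption about what "no estimate" should look like.

```cpp
#include <cstdint>

// Scales savings measured over the first N analyzed iterations up to the
// full trip count. Returns 0 when no iterations were analyzed.
unsigned scaleSavings(unsigned SavingsInFirstN, unsigned TripCount,
                      unsigned N) {
  if (N == 0)
    return 0; // bail out early instead of dividing by zero

  // Use a 64-bit intermediate so the multiplication cannot overflow.
  uint64_t Scaled = static_cast<uint64_t>(SavingsInFirstN) * TripCount / N;
  return static_cast<unsigned>(Scaled);
}
```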

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228988 91177308-0d34-0410-b5e6-96231b3b80d8

[unroll] Update the new analysis logic from r228265 to use modern coding
conventions for function names consistently. Some were already using
this but not all.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228987 91177308-0d34-0410-b5e6-96231b3b80d8

Add support for having multiple sections with the same name and comdat.

Using this in combination with -ffunction-sections allows LLVM to output a .o
file with multiple sections named .text. This saves space by avoiding long
unique names of the form .text.<C++ mangled name>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228980 91177308-0d34-0410-b5e6-96231b3b80d8

X86: Don't crash if we can't decode the pshufb mask

Constant pool entries are uniqued by their contents regardless of their
type. This means that a pshufb can have a shuffle mask which isn't a
simple array of bytes.

The code path which attempts to decode the mask didn't check for
failure, causing PR22559.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228979 91177308-0d34-0410-b5e6-96231b3b80d8

Learn that __DATA,__objc_classrefs is not atomized via symbols.

This should hopefully fix objc on AArch64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228976 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing override.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228974 91177308-0d34-0410-b5e6-96231b3b80d8

Change max interleave factor to 12 for POWER7 and POWER8.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228973 91177308-0d34-0410-b5e6-96231b3b80d8

Ensure integer domain on general shuffle stack folding tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228972 91177308-0d34-0410-b5e6-96231b3b80d8

Remove typedef of a pointer type used in a gep to simplify migration of geps to a typeless-pointer future.

I'd modify my migration tool to account for this, but this is the only
instance of a typedef'd pointer type to a gep I found in the whole test
suite, so it didn't seem worthwhile.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228970 91177308-0d34-0410-b5e6-96231b3b80d8

[SDAG] Don't try to use FP_EXTEND/FP_ROUND for int<->fp promotions

The PowerPC backend has long promoted some floating-point vector operations
(such as select) to integer vector operations. Unfortunately, this behavior was
broken by r216555. When using FP_EXTEND/FP_ROUND for promotions, we must check
that both the old and new types are floating-point types. Otherwise, we must
use BITCAST as we did prior to r216555 for everything.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228969 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Stop abusing DW_TAG_base_type for compile unit arrays

The sub-arrays for compile units have for a long time been initialized
to distinct temporary nodes with the `DW_TAG_base_type` tag, with no
other operands.  These invalid `DIBasicType`s are later replaced with
appropriate arrays.

This seems like a poor man's assertion that the arrays do eventually get
replaced.  These days, temporaries in the graph will cause assertions
when writing bitcode or assembly, so this isn't necessary.  Use
temporary empty tuples instead.

Note that the whole idea of using temporaries and then replacing them
later is wasteful here.  We never actually want to merge compile units
by uniquing based on content.  Compile units should use `getDistinct()`
instead of `get()`, and then their operands can be freely replaced later
on.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228967 91177308-0d34-0410-b5e6-96231b3b80d8

Attempt to fix the build again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228964 91177308-0d34-0410-b5e6-96231b3b80d8

Attempt to fix Linux builds after r228960.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228962 91177308-0d34-0410-b5e6-96231b3b80d8

Remove mostly unused setters.

Most of the code was setting the TargetOptions directly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228961 91177308-0d34-0410-b5e6-96231b3b80d8

Add concrete type overloads to PDBSymbol::findChildren().

Frequently you only want to iterate over children of a specific
type (e.g. functions). Previously you would get back a generic
interface that allowed iteration over the base symbol type,
which you would have to dyn_cast<> each one of. With this patch,
we allow the user to specify the concrete type as a template
parameter, and it will return an iterator which returns instances
of the concrete type directly.
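
The idea can be sketched with a toy hierarchy. Everything here is hypothetical: `Symbol`, `FunctionSymbol`, `DataSymbol`, and this free-function `findChildren` are illustration-only names, the sketch uses `dynamic_cast` rather than LLVM's casting machinery, and it returns a vector where the patch returns an iterator.

```cpp
#include <memory>
#include <vector>

// Toy symbol hierarchy standing in for the real PDB symbol classes.
struct Symbol {
  virtual ~Symbol() = default;
};

struct FunctionSymbol : Symbol {
  int Id;
  explicit FunctionSymbol(int Id) : Id(Id) {}
};

struct DataSymbol : Symbol {};

// Returns only the children whose dynamic type is T, already typed as T,
// so the caller does not have to cast each element.
template <typename T>
std::vector<const T *>
findChildren(const std::vector<std::unique_ptr<Symbol>> &Children) {
  std::vector<const T *> Result;
  for (const auto &Child : Children)
    if (const T *Typed = dynamic_cast<const T *>(Child.get()))
      Result.push_back(Typed);
  return Result;
}
```

A caller interested only in functions would write `findChildren<FunctionSymbol>(Children)` and use the elements directly.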

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228960 91177308-0d34-0410-b5e6-96231b3b80d8

Add bulk of returning of values to Mips fast-isel

Summary:
Implement the bulk of returning values in Mips fast-isel

Test Plan:
reatabi.ll

Passes test-suite at -O0,-O2 and with mips32r2 and mips32r1.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits, aemerson, rfuhler

Differential Revision: http://reviews.llvm.org/D5920

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228958 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a crash in the assumption cache when inlining indirect function calls

Summary:
Instances of the AssumptionCache are per function, so we can't re-use
the same AssumptionCache instance when recursing in the CallAnalyzer to
analyze a different function. Instead we have to pass the
AssumptionCacheTracker to the CallAnalyzer so it can get the right
AssumptionCache on demand.

Reviewers: hfinkel

Subscribers: llvm-commits, hans

Differential Revision: http://reviews.llvm.org/D7533

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228957 91177308-0d34-0410-b5e6-96231b3b80d8

Update test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228956 91177308-0d34-0410-b5e6-96231b3b80d8

InstCombine: Allow folding of xor into icmp by changing the predicate for vectors

The loop vectorizer can create this pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228954 91177308-0d34-0410-b5e6-96231b3b80d8

Relaxed over-zealous alignment requirement for VEX-encoded AES instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228953 91177308-0d34-0410-b5e6-96231b3b80d8

Add a testcase for r228432.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228951 91177308-0d34-0410-b5e6-96231b3b80d8

Try to fix the MSVC build.

0xFFFFFFFFFFFFFFFFLL doesn't fit in a long long so it should have
type 'unsigned long long'. MSVC thinks it's a (signed) __int64.
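
One unambiguous way to spell such a constant is with an explicit ULL suffix, as in the sketch below. This is an illustration of the type issue, not necessarily the spelling the fix used; `AllOnes` is a hypothetical name.

```cpp
#include <limits>
#include <type_traits>

// With an explicit ULL suffix the literal's type does not depend on how a
// compiler treats an LL-suffixed value that is too large for long long.
constexpr auto AllOnes = 0xFFFFFFFFFFFFFFFFULL;

static_assert(
    std::is_same<decltype(AllOnes), const unsigned long long>::value,
    "the literal should be an unsigned long long");
static_assert(AllOnes == std::numeric_limits<unsigned long long>::max(),
              "all 64 bits should be set");
```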

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228950 91177308-0d34-0410-b5e6-96231b3b80d8

gold-plugin: delete the output file for OT_DISABLE

bfd creates the output file early, so calling exit(0) is not enough; the file needs to be explicitly deleted.

Patch by: H.J. Lu <hjl.tools@gmail.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228946 91177308-0d34-0410-b5e6-96231b3b80d8

On ELF, put PIC jump tables in a non executable section.

Fixes PR22558.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228939 91177308-0d34-0410-b5e6-96231b3b80d8

Put each jump table in an independent section if the function is too.

This allows the linker to GC both, fixing pr22557.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228937 91177308-0d34-0410-b5e6-96231b3b80d8

Fix accidental bit flip.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228936 91177308-0d34-0410-b5e6-96231b3b80d8

CoverageMapping: Bitvectorize code. No functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228934 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopRerolling] Be more forgiving with instruction order.

We can't solve the full subgraph isomorphism problem. But we can
allow obvious cases, where for example two instructions of different
types are out of order. Due to them having different types/opcodes,
there is no ambiguity.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228931 91177308-0d34-0410-b5e6-96231b3b80d8

MathExtras: Bring Count(Trailing|Leading)Ones and CountPopulation in line with countTrailingZeros

Update all callers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228930 91177308-0d34-0410-b5e6-96231b3b80d8

Triple: refactor redundant code.

Should be no functional change, since most of the logic removed was
completely pointless (after some previous refactoring) and the rest
duplicated elsewhere.

Patch by Kamil Rytarowski.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228926 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Call frame optimization - allow stack-relative movs to be folded into a push

Since we track esp precisely, there's no reason not to allow this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228924 91177308-0d34-0410-b5e6-96231b3b80d8

[TTI] Teach the cost heuristic how to query TLI to check if a zext/trunc is 'free' for the target.

Now that SimplifyCFG uses TTI for the cost heuristic, we can teach BasicTTIImpl
how to query TLI in order to get a more accurate cost for truncates and
zero-extends.

Before this patch, the basic cost heuristic in TargetTransformInfoImplCRTPBase
would have conservatively returned a 'default' TCC_Basic for all zero-extends,
and TCC_Free for truncates on native types.

This patch improves the heuristic so that we query TLI (if available) to get
more accurate answers. If TLI is available, then methods 'isZExtFree' and
'isTruncateFree' can be used to check if a zext/trunc is free for the target.

Added more test cases to SimplifyCFG/X86/speculate-cttz-ctlz.ll.
With this change, SimplifyCFG is now able to speculate a 'cheap' cttz/ctlz
immediately followed by a free zext/trunc.

Differential Revision: http://reviews.llvm.org/D7585

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228923 91177308-0d34-0410-b5e6-96231b3b80d8

BitVector: Remove manual bit width dispatch, this is handled by templates

NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228922 91177308-0d34-0410-b5e6-96231b3b80d8

MathExtras: Parametrize count(Trailing|Leading)Zeros on the type size.

Otherwise we will always select the generic version for e.g. unsigned
long if uint64_t is typedef'd to 'unsigned long long'. Also remove
enable_if hacks in favor of static_assert.
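
A rough sketch of dispatching on the type size rather than on the type itself, assuming a helper struct with a size parameter. The names and the bodies are hypothetical and much simpler than the real MathExtras code.

```cpp
#include <cstddef>
#include <limits>
#include <type_traits>

// Primary template: portable bit-by-bit fallback for any width.
template <typename T, std::size_t SizeOfT> struct TrailingZerosCounter {
  static unsigned count(T Val) {
    if (Val == 0)
      return std::numeric_limits<T>::digits;
    unsigned ZeroBits = 0;
    while ((Val & 1) == 0) {
      Val >>= 1;
      ++ZeroBits;
    }
    return ZeroBits;
  }
};

// Selected for every 8-byte type, whether it is spelled 'unsigned long'
// or 'unsigned long long' on the current platform.
template <typename T> struct TrailingZerosCounter<T, 8> {
  static unsigned count(T Val) {
    if (Val == 0)
      return 64;
    // Binary search for the lowest set bit.
    unsigned ZeroBits = 0;
    for (unsigned Shift = 32; Shift != 0; Shift >>= 1) {
      T Mask = (T(1) << Shift) - 1;
      if ((Val & Mask) == 0) {
        Val >>= Shift;
        ZeroBits += Shift;
      }
    }
    return ZeroBits;
  }
};

template <typename T> unsigned countTrailingZeros(T Val) {
  // A static_assert replaces enable_if-style overload filtering.
  static_assert(std::is_unsigned<T>::value,
                "Only unsigned integral types are allowed.");
  return TrailingZerosCounter<T, sizeof(T)>::count(Val);
}
```

Keying the specialization on `sizeof(T)` means two distinct types of the same width pick the same implementation, which a specialization on `uint64_t` alone cannot guarantee.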

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228921 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: Fix another regression introduced in r223113

The changes in r223113 (ARM modified-immediate syntax) have broken
instructions like:
mov r0, #~0xffffff00
The problem is that I've added a spurious range check on the immediate
operand to ensure that it lies between INT32_MIN and UINT32_MAX. While
this range check is correct in theory, it causes problems because the
operand is stored in an int64_t (by MC). So valid 32-bit constants like
#~0xffffff00 become out of range. The solution is to simply remove this
range check. It is not possible to validate the range of the immediate
operand with the current setup because: 1) The operand is stored in an
int64_t by MC, 2) The immediate can be of the forms #imm, #-imm, #~imm
or even #((~imm)) etc. So we just chop the value to 32 bits and use it.

Also noted that the original range check was not tested by any of the
unit tests. I've added a new test to cover #~imm kind of operands.
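
The arithmetic behind the failure can be reproduced in isolation. The sketch below assumes the expression is evaluated in 64 bits, as described above; the helper names are hypothetical and the range check is a reconstruction from this description, not the removed code itself.

```cpp
#include <cstdint>

// Evaluated in 64 bits, ~0xffffff00 has all of its upper 32 bits set.
int64_t evalNotImm() { return ~INT64_C(0xffffff00); }

// Reconstruction of the spurious check: accept only values in
// [INT32_MIN, UINT32_MAX].
bool inOldRange(int64_t Imm) {
  return Imm >= INT32_MIN && Imm <= UINT32_MAX;
}

// What the fix does instead: chop the value to 32 bits and use it.
uint32_t chopTo32(int64_t Imm) { return static_cast<uint32_t>(Imm); }
```

Here `evalNotImm()` is 0xFFFFFFFF000000FF as a bit pattern, which is below INT32_MIN as a signed 64-bit value, yet its low 32 bits are the perfectly valid constant 0x000000FF.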

Change-Id: I411e90d84312a2eff01b732bb238af536c4a7599

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228920 91177308-0d34-0410-b5e6-96231b3b80d8

tsan: do not instrument not captured values

I've built some tests in WebRTC with and without this change. With this change, the number of __tsan_read/write calls is reduced by 20-40%, binary size decreases by 5-10% and execution time drops by ~5%. For example:

$ ls -l old/modules_unittests new/modules_unittests
-rwxr-x--- 1 dvyukov 41708976 Jan 20 18:35 old/modules_unittests
-rwxr-x--- 1 dvyukov 38294008 Jan 20 18:29 new/modules_unittests
$ objdump -d old/modules_unittests | egrep "callq.*__tsan_(read|write|unaligned)" | wc -l
239871
$ objdump -d new/modules_unittests | egrep "callq.*__tsan_(read|write|unaligned)" | wc -l
148365

http://reviews.llvm.org/D7069

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228917 91177308-0d34-0410-b5e6-96231b3b80d8

AVX-512: Fixed the "test" operation for i1 type

Using KORTESTW to compare an i1 value with zero was wrong since the instruction tests 16 bits.
KORTESTW may be used with KSHIFTL+KSHIFTR that clear the 15 upper bits.
I removed the (X86cmp i1, 0) pattern; we now zero-extend i1 to i8 and then use TESTB.

There are some cases where i1 is in the mask register and the upper bits are already zeroed.
Then KORTESTW is the better solution, but that is a subject for future optimization.
Meanwhile, I'm fixing the correctness issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228916 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] A heuristic to estimate the size impact for converting stack-relative parameter movs to pushes

This gives a rough estimate of whether using pushes instead of movs is profitable, in terms of size.
We go over all calls in the MachineFunction and compute:
a) For each callsite that can not use pushes, the penalty of not having a reserved call frame.
b) For each callsite that can use pushes, the gain of actually replacing the movs with pushes (and the potential penalty of having to readjust the stack).

Differential Revision: http://reviews.llvm.org/D7561

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228915 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] Don't blindly combine (fp_round (fp_round x)) to (fp_round x).

We used to do this DAG combine, but it's not always correct:
If the first fp_round isn't a value-preserving truncation, it might
introduce a tie in the second fp_round that wouldn't occur in the
single-step fp_round we want to fold to.
In other words, double rounding isn't the same as rounding.
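
The effect can be shown with an integer model of round-to-nearest, ties-to-even, where dropping low bits plays the role of fp_round. This is a simplified stand-in for floating-point rounding, and `roundNearestEven` is a hypothetical helper.

```cpp
#include <cstdint>

// Drops the low DropBits bits of Value, rounding to nearest with ties
// going to the even result. DropBits must be less than 64.
uint64_t roundNearestEven(uint64_t Value, unsigned DropBits) {
  if (DropBits == 0)
    return Value;
  uint64_t Kept = Value >> DropBits;
  uint64_t Rem = Value & ((UINT64_C(1) << DropBits) - 1);
  uint64_t Half = UINT64_C(1) << (DropBits - 1);
  if (Rem > Half || (Rem == Half && (Kept & 1)))
    ++Kept;
  return Kept;
}
```

Take 0x17F. Dropping 8 bits in one step leaves a remainder of 0x7F, just under half, so the result is 1. Dropping 4 bits first rounds up to 0x18, and dropping 4 more now sees an exact tie that rounds to even, giving 2. The first rounding manufactured a tie that the single-step rounding never sees.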

Differential Revision: http://reviews.llvm.org/D7571

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228911 91177308-0d34-0410-b5e6-96231b3b80d8

Fixed a bug where CFLAA would crash the compiler.

We would crash if we couldn't locate a Function that either Location's
Value belonged to. Now we just print out a debug message and return
conservatively.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228901 91177308-0d34-0410-b5e6-96231b3b80d8

[slp] Fix a nasty bug in the SLP vectorizer that Joerg pointed out.
Apparently some code finally started to tickle this after my
canonicalization changes to instcombine.

The bug stems from trying to form a vector type out of scalars that
aren't compatible at all. In this example, from x86_mmx values. The code
in the vectorizer that checks for reasonable types was checking for
aggregates or vectors, but there are lots of other types that should
just never reach the vectorizer.

Debugging this was made more confusing by the lie in an assert in
VectorType::get() -- it isn't that the types are *primitive*. The types
must be integer, pointer, or floating point types. No other types are
allowed.

I've improved the assert and added a helper to the vectorizer to handle
the element type validity checks. It now re-uses the VectorType static
function and then further excludes weird target-specific types that we
probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of
these are really reachable anyways (neither 80-bit nor 128-bit things
will get vectorized) but it seems better to just eagerly exclude such
nonsense.

I've added a test case, but while it definitely covers two of the paths
through this code there may be more paths that would benefit from test
coverage. I'm not familiar enough with the SLP vectorizer to synthesize
test cases for all of these, but was able to update the code itself by
inspection.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228899 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Mark jumps as expensive (using CR bits)

On PowerPC, which has a full set of logical operations on (its multiple sets
of) condition-register bits, it is not profitable to break up complex
conditions feeding a jump into multiple jumps. We can turn off this feature of
CGP/SDAGBuilder by marking jumps as "expensive".

P7 test-suite speedups (no regressions):
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2
-0.626647% +/- 0.323583%
MultiSource/Benchmarks/Olden/power/power
-18.2821% +/- 8.06481%

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228895 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Change Path::filename_pos() to skip the drive letter."

This reverts commit 228874. For some reason users reported
seeing Clang taking up 25+GB of memory and bringing down
machines with this change. Reverting until we figure it out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228890 91177308-0d34-0410-b5e6-96231b3b80d8

Invert the section relocation map.

It now points from rel section to section. Use it to set sh_info, avoiding
a brittle name lookup.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228889 91177308-0d34-0410-b5e6-96231b3b80d8

Use the existing SymbolTableIndex instead of doing a lookup. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228888 91177308-0d34-0410-b5e6-96231b3b80d8

Create the Section -> Rel Section map when it is first needed. NFC.

Saves a walk over every section.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228886 91177308-0d34-0410-b5e6-96231b3b80d8

DeadArgElim: aggregate Return assessment properly.

I mistakenly thought the liveness of each "RetVal(F, i)" depended only on F. It
actually depends on the index too, which means we need to be careful about how
the results are combined before return. In particular if a single Use returns
Live, that counts for the entire object, at the granularity we're considering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228885 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unused argument. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228884 91177308-0d34-0410-b5e6-96231b3b80d8