granicus.if.org Git

Merging r242869:
------------------------------------------------------------------------
r242869 | chandlerc | 2015-07-21 20:32:42 -0700 (Tue, 21 Jul 2015) | 60 lines

[SROA] Fix a nasty pile of bugs to do with big-endian, different alloca
types and loads, loads or stores widened past the size of an alloca,
etc.

This started off with a bug report about big-endian behavior with
bitfields and loads and stores to a { i32, i24 } struct. An initial
attempt to fix this was sent for review in D10357, but that didn't
really get to the root of the problem.

The core issue was that canConvertValue and convertValue in SROA were
handling different bitwidth integers by doing a zext of the integer. It
wouldn't do a trunc though, only a zext! This would in turn lead SROA to
form an i24 load from an i24 alloca, zext it to i32, and then use it.
This would at least produce the wrong value for big-endian systems.

One of my many false starts here was to correct the computation for
big-endian systems by shifting. But this doesn't actually work because
the original code has a 64-bit store to the entire 8 bytes, and a 32-bit
load of the last 4 bytes, and because the alloc size is 8 bytes, we
can't lose that last (least significant if bigendian) byte! The real
problem here is that we're forming an i24 load in SROA which is actually
not sufficiently wide to load all of the necessary bits here. The source
has an i32 load, and SROA needs to form that as well.

The straightforward way to do this is to disable the zext logic in
canConvertValue and convertValue, forcing us to actually load all
32-bits. This seems like a really good change, but it in turn breaks
several other parts of SROA.

First in the chain of knock-on failures, we had places where we were
doing integer-widening promotion even though some of the integer loads
or stores extended *past the end* of the alloca's memory! There was even
a comment about preventing this, but it only prevented the case where
the type had a different bit size from its store size. So I added checks
to handle the cases where we actually have a widened load or store and
to avoid trying to special integer widening promotion in those cases.

Second, we actually rely on the ability to promote in the face of loads
past the end of an alloca! This is important so that we can (for
example) speculate loads around PHI nodes to do more promotion. The bits
loaded are garbage, but as long as they aren't used and the alignment is
suitable high (which it wasn't in the test case!) this is "fine". And we
can't stop promoting here, lots of things stop working well if we do. So
we need to add specific logic to handle the extension (and truncation)
case, but *only* where that extension or truncation are over bytes that
*are outside the alloca's allocated storage* and thus totally bogus to
load or store.

And of course, once we add back this correct handling of extension or
truncation, we need to correctly handle bigendian systems to avoid
re-introducing the exact bug that started us off on this chain of misery
in the first place, but this time even more subtle as it only happens
along speculated loads atop a PHI node.

I've ported an existing test for PHI speculation to the big-endian test
file and checked that we get that part correct, and I've added several
more interesting big-endian test cases that should help check that we're
getting this correct.

Fun times.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242924 91177308-0d34-0410-b5e6-96231b3b80d8

Merge r242733, r242734, r242735 and r242742

(r242742 is the interesting patch here, but I picked the others too to get a clean
merge since there's been some back-and-forth on this file.)

------------------------------------------------------------------------
r242733 | matze | 2015-07-20 16:17:14 -0700 (Mon, 20 Jul 2015) | 3 lines

Revert "ARM: Use SpecificBumpPtrAllocator to fix leak introduced in r241920"

This reverts commit r241951. It caused http://llvm.org/PR24190
------------------------------------------------------------------------

------------------------------------------------------------------------
r242734 | matze | 2015-07-20 16:17:16 -0700 (Mon, 20 Jul 2015) | 3 lines

Revert "ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code"

This reverts commit r241928. This caused http://llvm.org/PR24190
------------------------------------------------------------------------

------------------------------------------------------------------------
r242735 | matze | 2015-07-20 16:17:20 -0700 (Mon, 20 Jul 2015) | 3 lines

Revert "ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2"

This reverts commit r241926. This caused http://llvm.org/PR24190
------------------------------------------------------------------------

------------------------------------------------------------------------
r242742 | matze | 2015-07-20 17:18:59 -0700 (Mon, 20 Jul 2015) | 7 lines

ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2

Re-apply r241926 with an additional check that r13 and r15 are not used
for LDRD/STRD. See http://llvm.org/PR24190. This also already includes
the fix from r241951.

Differential Revision: http://reviews.llvm.org/D10623
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242907 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242722:
------------------------------------------------------------------------
r242722 | dim | 2015-07-20 15:24:40 -0700 (Mon, 20 Jul 2015) | 6 lines

Avoid early pipefail exits due to grep failures in stage comparisons.

If objects or executables did not contain any RPATH, grep would return
nonzero, and the whole stage comparison loop would unexpectedly exit.
Fix this by checking the grep result explicitly.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242861 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242721:
------------------------------------------------------------------------
r242721 | dim | 2015-07-20 15:07:27 -0700 (Mon, 20 Jul 2015) | 4 lines

Since BSD cmp(1) does not support the --ignore-initial option, use the
more portable 3rd and 4th arguments to skip the first 16 bytes during
the comparison of Phase2 and Phase3 objects.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242860 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242706:
------------------------------------------------------------------------
r242706 | hans | 2015-07-20 13:36:21 -0700 (Mon, 20 Jul 2015) | 1 line

test-release.sh: don't include /usr/local prefix in the tarball
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242859 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242680:
------------------------------------------------------------------------
r242680 | wschmidt | 2015-07-20 08:43:21 -0700 (Mon, 20 Jul 2015) | 1 line

Add missing test for r242296 (vec_sld)
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242686 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242673:
------------------------------------------------------------------------
r242673 | tstellar | 2015-07-20 07:28:41 -0700 (Mon, 20 Jul 2015) | 11 lines

AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory ops

Summary:
The MUBUF addr64 bit has been removed on VI, so we must use FLAT
instructions when the pointer is stored in VGPRs.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11067
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242685 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242434:
------------------------------------------------------------------------
r242434 | tstellar | 2015-07-16 12:40:09 -0700 (Thu, 16 Jul 2015) | 7 lines

AMDPGU/SI: Negative offsets aren't allowed in MUBUF's vaddr operand

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11226
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242684 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242433:
------------------------------------------------------------------------
r242433 | tstellar | 2015-07-16 12:40:07 -0700 (Thu, 16 Jul 2015) | 11 lines

AMDPGU/SI: Use AssertZext node to mask high bit for scratch offsets

Summary:
We can safely assume that the high bit of scratch offsets will never
be set, because this would require at least 128 GB of GPU memory.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11225
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242683 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242281:
------------------------------------------------------------------------
r242281 | chandlerc | 2015-07-15 01:53:29 -0700 (Wed, 15 Jul 2015) | 21 lines

[PM/AA] Fix *numerous* serious bugs in GlobalsModRef found by
inspection.

While we want to handle calls specially in this code because they should
have been modeled by the call graph analysis that precedes it, we should
*not* be re-implementing the predicates for whether an instruction reads
or writes memory. Those are well defined already. Notably, at least the
following issues seem to be clearly missed before:
- Ordered atomic loads can "write" to memory by causing writes from other
threads to become visible. Similarly for ordered atomic stores.
- AtomicRMW instructions quite obviously both read and write to memory.
- AtomicCmpXchg instructions also read and write to memory.
- Fences read and write to memory.
- Invokes of intrinsics or memory allocation functions.

I don't have any test cases, and I suspect this has never really come up
in the real world. But there is no reason why it wouldn't, and it makes
the code simpler to do this the right way.

While here, I've tried to make the loops significantly simpler as well
and added helpful comments as to what is going on.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242570 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242543:
------------------------------------------------------------------------
r242543 | hans | 2015-07-17 09:49:59 -0700 (Fri, 17 Jul 2015) | 1 line

Add libunwind to the release scripts
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242544 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242296:
------------------------------------------------------------------------
r242296 | wschmidt | 2015-07-15 08:45:30 -0700 (Wed, 15 Jul 2015) | 37 lines

[PPC64LE] Fix vec_sld semantics for little endian

The vec_sld interface provides access to the vsldoi instruction.
Unlike most of the vec_* interfaces, we do not attempt to change the
generated code for vec_sld based on the endian mode.  It is too
difficult to correctly infer the desired semantics because of
different element types, and the corrected instruction sequence is
expensive, involving loading a permute control vector and performing a
generalized permute.

For GCC, this was implemented as "Don't touch the vec_sld"
implementation.  When it came time for the LLVM implementation, I did
the same thing.  However, this was hasty and incorrect.  In LLVM's
version of altivec.h, vec_sld was previously defined in terms of the
vec_perm interface.  Because vec_perm semantics are adjusted for
little endian, this means that leaving vec_sld untouched causes it to
generate something different for LE than for BE.  Not good.

This back-end patch accompanies the changes to altivec.h that change
vec_sld's behavior for little endian.  Those changes mean that we see
slightly different code in the back end when trying to recognize a
VSLDOI instruction in isVSLDOIShuffleMask.  In particular, a
ShuffleKind of 1 (where the two inputs are identical) must now be
treated the same way as a ShuffleKind of 2 (little endian with
different inputs) when little endian mode is in force.  This is
because ShuffleKind of 1 is defined using big-endian numbering.

This has a ripple effect on LowerBUILD_VECTOR, where we create our own
internal VSLDOI instructions.  Because these are a ShuffleKind of 1,
they will now have their shift amounts subtracted from 16 when
recognizing the shuffle mask.  To avoid problems we have to subtract
them from 16 again before creating the VSLDOI instructions.

There are a couple of other uses of BuildVSLDOI, but these do not need
to be modified because the shift amount is 8, which is unchanged when
subtracted from 16.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242530 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242236:
------------------------------------------------------------------------
r242236 | rafael | 2015-07-14 15:42:21 -0700 (Tue, 14 Jul 2015) | 1 line

Accept lower case to handle windows error messages.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242527 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242412:
------------------------------------------------------------------------
r242412 | tstellar | 2015-07-16 09:13:34 -0700 (Thu, 16 Jul 2015) | 3 lines

AMDGPU/R600: Remove unused variable

This fixes a warning introduced by r242410.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242455 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242410:
------------------------------------------------------------------------
r242410 | tstellar | 2015-07-16 08:38:29 -0700 (Thu, 16 Jul 2015) | 13 lines

AMDPGU/R600: Replace llvm_unreachable() call with LLVMContext::emitError()

Summary:
This fixes an issue on MIPS where the infinite-loop-evergreen.ll test
was failing to terminate.

Fixes PR24147.

Reviewers: arsenm, dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11260
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242453 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242442:
------------------------------------------------------------------------
r242442 | wschmidt | 2015-07-16 14:14:07 -0700 (Thu, 16 Jul 2015) | 14 lines

[PowerPC] v4i32 is a VSRCRegClass

I was looking at some vector code generation and kept seeing
unnecessary vector copies into the Altivec half of the VSX registers.
I discovered that we overlooked v4i32 when adding the register classes
for VSX; we only added v4f32 and v2f64. This means that anything that
canonicalizes into v4i32 (which is a LOT of stuff) ends up being
forced into VRRC on its way to VSRC.

The fix is one line. The rest of the patch is fixing up some test
cases whose code generation has changed as a result.

This seems like it would be a good candidate for backport to 3.7.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242447 91177308-0d34-0410-b5e6-96231b3b80d8

Add some more PowerPC release notes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242408 91177308-0d34-0410-b5e6-96231b3b80d8

Add PowerPC release notes for 3.7

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242407 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242341:
------------------------------------------------------------------------
r242341 | hans | 2015-07-15 15:18:25 -0700 (Wed, 15 Jul 2015) | 7 lines

test-release.sh: Run both .o files through sed before comparing them

On some systems (e.g. Mac OS X), sed will add a newline to the end of
the output if there wasn't one already. This would cause false
cmp errors since the .o file from Phase 2 was passed through sed and
the one from Phase 3 wasn't. Work around this by passing both through
sed.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242361 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242331:
------------------------------------------------------------------------
r242331 | hans | 2015-07-15 14:06:16 -0700 (Wed, 15 Jul 2015) | 17 lines

Switch the release script to build with CMake by default (PR21561)

It retains the possibility to use the autoconf build with a
command-line option ('-use-autoconf'), and uses that by default on Darwin since
compiler-rt requires it on that platform.

This commit also removes the "Release-64" flavour and related logic. The script
would previously do two builds unless the '-no-64bit' flag was passed, but on
my machine and from those I asked this always ended up producing two 64-bit builds,
causing much confusion.

It also removes the -build-triple option, which caused the --build= flag to
get passed to ./configure. This was presumably intended for cross-compiling,
but none of the release testers use it. If someone does want to pass it,
they can use '-configure-flags --build=foo' instead.

Differential Revision: http://reviews.llvm.org/D10715
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242360 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242288:
------------------------------------------------------------------------
r242288 | d0k | 2015-07-15 05:56:19 -0700 (Wed, 15 Jul 2015) | 3 lines

[PPC] Disassemble little endian ppc instructions in the right byte order

PR24122. The test is simply a byte swapped version of ppc64-encoding.txt.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242327 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r242239:
------------------------------------------------------------------------
r242239 | hfinkel | 2015-07-14 15:53:11 -0700 (Tue, 14 Jul 2015) | 4 lines

[PowerPC] Support symbolic targets in patchpoints

Follow-up r235483, with the corresponding support in PPC. We use a regular call
for symbolic targets (because they're much cheaper than indirect calls).
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242325 91177308-0d34-0410-b5e6-96231b3b80d8

Change version from '3.7.0svn' to '3.7.0'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242251 91177308-0d34-0410-b5e6-96231b3b80d8

Creating release_37 branch off revision 242221

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242224 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] Turn RuntimePointerChecking into a class, start hiding things, NFC

The goal is to start hiding internal APIs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242220 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] Introduce RuntimePointerChecking::PointerInfo, NFC

Turn this structure-of-arrays (i.e. the various pointer attributes) into
array-of-structures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242219 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] Lift RuntimePointerCheck out of LoopAccessInfo, NFC

I am planning to add more nested classes inside RuntimePointerCheck so
all these triple-nesting would be hard to follow.

Also rename it to RuntimePointerChecking (i.e. append 'ing').

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242218 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Use the ABI indirect-call protocol for patchpoints

We used to take the address specified as the direct target of the patchpoint
and did no TOC-pointer handling. This, however, as not all that useful,
because MCJIT tends to create a lot of modules, and they have their own TOC
sections. Thus, to call from the generated code to other generated code, you
really need to switch TOC pointers. Make this work as expected, and under
ELFv1, tread the address as the function descriptor address so that the correct
TOC pointer can be loaded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242217 91177308-0d34-0410-b5e6-96231b3b80d8

Add support for reading members out of thin archives.

For now the Archive owns the buffers of the thin archive members.
This makes for a simple API, but all the buffers are destructed
only when the archive is destructed. This should be fine since we
close the files after mmap so we should not hit an open file
limit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242215 91177308-0d34-0410-b5e6-96231b3b80d8

[ExecutionEngine] Re-apply r241962 with fixes for ARM.

Patch by Pierre-Andre Saulais. Thanks Pierre-Andre!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242213 91177308-0d34-0410-b5e6-96231b3b80d8

Add allnodes() iterator range to SelectionDAG. NFC.

SelectionDAG already had begin/end methods for iterating over all
the nodes, but didn't define an iterator_range for us in foreach
loops.

This adds such a method and uses it in some of the eligible places
throughout the backends.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242212 91177308-0d34-0410-b5e6-96231b3b80d8

Move SDNode::IROrder in to padding to save space. NFC.

There was a 32-bit padding gap between 'unsigned short NumOperands, NumValues;' and 'DebugLoc debugLoc. Move 'unsigned IROrder' in to that gap.

This trims the size of SDNode's from 76 bytes (really 80 due to alignment) to 72 bytes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242211 91177308-0d34-0410-b5e6-96231b3b80d8

Constify parameters in SelectionDAG methods. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242210 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unnecessary .getNode() in SelectionDAG. NFC.

The simplify_type specialisation allows us to cast directly from
SDValue to an SDNode* subclass so we don't need to pass a SDNode*
to cast<>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242209 91177308-0d34-0410-b5e6-96231b3b80d8

Use more foreach loops in SelectionDAG. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242208 91177308-0d34-0410-b5e6-96231b3b80d8

MIR Serialization: Serialize the machine basic block live in registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242204 91177308-0d34-0410-b5e6-96231b3b80d8

MIR Printer: move the function 'printReg'. NFC.

This commit moves the function 'printReg' towards the start of the file so that
it can be used by the conversion methods in MIRPrinter and not just the printing
methods in MIPrinter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242203 91177308-0d34-0410-b5e6-96231b3b80d8

GVN: use a static array instead of regenerating it each time. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242202 91177308-0d34-0410-b5e6-96231b3b80d8

WebAssembly: add basic int/fp instruction codegen.

Summary: This patch has the most basic instruction codegen for 32 and 64 bit int/fp.

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11193

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242201 91177308-0d34-0410-b5e6-96231b3b80d8

Fix NDEBUG build warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242200 91177308-0d34-0410-b5e6-96231b3b80d8

GVN: tolerate an instruction being replaced without existing in the leaderboard

Sometimes an incidentally created instruction can duplicate a Value used
elsewhere. It then often doesn't end up in the leader table. If it's later
removed, we attempt to remove it from the leader table and segfault.

Instead we should just ignore the removal request, which won't cause any
problems. The reverse situation, where the original instruction is replaced by
the new one (which you might think could leave the leader table empty) cannot
occur, because the incidental instruction will never be found in the first
place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242199 91177308-0d34-0410-b5e6-96231b3b80d8

test-release.sh: Remove the InstallDir parameter from configure_llvmCore

After r242187, it's never set.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242194 91177308-0d34-0410-b5e6-96231b3b80d8

Fix Windows build: replace __func__ with LLVM_FUNCTION_NAME

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242192 91177308-0d34-0410-b5e6-96231b3b80d8

[MMX] Use the appropriate instructions for GR64 <-> VR64 copies.

MOVSDto64rr and MOV64toSDrr are defined to convert between FR64 (%xmm)
<-> GR64 registers, not VR64 (%mm) <-> GR64. This is wrong.

I found this by inspection and could not find a suitable testcase for it
since (1) we don't handle MMX bitcasts in Peephole optimizer as to
generate COPYs that (2) could be expanded back to the appropriate x86
instruction in ExpandPostRA.

Switch to use the appropriate instructions: MMX_MOVD64from64rr and
MMX_MOVD64to64rr here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242191 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Fix the PPCInstrInfo::getInstrLatency implementation

PowerPC uses itineraries to describe processor pipelines (and dispatch-group
restrictions for P7/P8 cores). Unfortunately, the target-independent
implementation of TII.getInstrLatency calls ItinData->getStageLatency, and that
looks for the largest cycle count in the pipeline for any given instruction.
This, however, yields the wrong answer for the PPC itineraries, because we
don't encode the full pipeline. Because the functional units are fully
pipelined, we only model the initial stages (there are no relevant hazards in
the later stages to model), and so the technique employed by getStageLatency
does not really work. Instead, we should take the maximum output operand
latency, and that's what PPCInstrInfo::getInstrLatency now does.

This caused some test-case churn, including two unfortunate side effects.
First, the new arrangement of copies we get from function parameters now
sometimes blocks VSX FMA mutation (a FIXME has been added to the code and the
test cases), and we have one significant test-suite regression:

SingleSource/Benchmarks/BenchmarkGame/spectral-norm
56.4185% +/- 18.9398%

In this benchmark we have a loop with a vectorized FP divide, and it with the
new scheduling both divides end up in the same dispatch group (which in this
case seems to cause a problem, although why is not exactly clear). The grouping
structure is hard to predict from the bottom of the loop, and there may not be
much we can do to fix this.

Very few other test-suite performance effects were really significant, but
almost all weakly favor this change. However, in light of the issues
highlighted above, I've left the old behavior available via a
command-line flag.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242188 91177308-0d34-0410-b5e6-96231b3b80d8

Fix several issues with the test-release.sh script

* Use the default install prefix (/usr/local) and use DESTDIR instead to
  set a temporary install location for tarballing. This is the correct
  way to package binary releases (otherwise the temporary install path
  ends up in files in the binary release).
* Remove ``-disable-clang`` option. It did not work correctly
  (tarballing assumed phase 3 was run) and when doing a release
  we should always be doing a three-phased build and test.

Note: Technically we should only be using DESTDIR for the third phase
and use --prefix for the first and second phase because we run the built
clang from phase 1 and 2 (and in general an application's behaviour
may depend on the install prefix). However in the case of clang it
seems to not care what the install prefix was so to simplify the script
we use DESTDIR for all three stages.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242187 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Generate instructions for operations on predicate registers

Convert logical operations on general-purpose registers to the correspon-
ding operations on predicate registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242186 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] Force emission of personality directive if explicitly specified

Summary:
Before this change, personality directives were not emitted
if there was no invoke left in the function (of course until
recently this also meant that we couldn't know what
the personality actually was). This patch forces personality directives
to still be emitted, unless it is known to be a noop in the absence of
invokes, or the user explicitly specified `nounwind` (and not
`uwtable`) on the function.

Reviewers: majnemer, rnk

Subscribers: rnk, llvm-commits

Differential Revision: http://reviews.llvm.org/D10884

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242185 91177308-0d34-0410-b5e6-96231b3b80d8

Add support for on-disk hash table lookup with a known hash, for situations where the same key will be looked up in multiple tables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242179 91177308-0d34-0410-b5e6-96231b3b80d8

Teach config.guess that MSYS exists.

We might not want to upgrade config.guess to the current
version due to the license change from GPL2 to GPL3.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242178 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Avoid using 64-bit shift for i64 (shl x, 32)

This can be done only with moves which theoretically
will optimize better later.

Although this transform increases the instruction count,
it should be code size / cycle count neutral in the worst
VALU case. It also seems to slightly improve a couple
of testcases due to other DAG combines this exposes.

This is probably slightly worse for the SALU case, so
it might be better to handle this during moveToVALU,
although then you lose some simplifications like
the load width reducing in the simple testcase.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242177 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SI: Fix read2 merging into a super register.

If the read2 produced was supposed to be writing into a
super register, it would use the wrong subregister indices.
Fix this by inserting copies, so we only ever write to a vreg_64.
Run the register coalescer again to clean this up, although this
isn't ideal and often does result in an extra move.

Also remove the assert that offset1 > offset0.

There isn't a real reason to not allow this other than a minor
convenience in the compiler, and it doesn't seem worth the effort
of avoiding it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242174 91177308-0d34-0410-b5e6-96231b3b80d8

MachineRegisterInfo: Remove UsedPhysReg infrastructure

We have a detailed def/use lists for every physical register in
MachineRegisterInfo anyway, so there is little use in maintaining an
additional bitset of which ones are used.

Removing it frees us from extra book keeping. This simplifies
VirtRegMap.

Differential Revision: http://reviews.llvm.org/D10911

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242173 91177308-0d34-0410-b5e6-96231b3b80d8

Avoid MSVC-incompatible use of init list.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242170 91177308-0d34-0410-b5e6-96231b3b80d8

RAGreedy: Keep track of allocated PhysRegs internally

Do not use MachineRegisterInfo::setPhysRegUsed()/isPhysRegUsed()
anymore. This bitset changes function-global state and is set by the
VirtRegRewriter anyway.
Simply use a bitvector private to RAGreedy.

Differential Revision: http://reviews.llvm.org/D10910

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242169 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing builtins to the PPC back end for ABI compliance (vol. 4)

This patch corresponds to review:
http://reviews.llvm.org/D11183

Back end portion of the fourth round of additions to altivec.h.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242167 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: add at least one real test for r242123.

The ones committed were orthogonal to the change and would have passed before
that revision. What it *did* do was prevent an assertion failure when
generating object files.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242166 91177308-0d34-0410-b5e6-96231b3b80d8

PrologEpilogInserter: Rewrite API to determine callee save regsiters.

This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan():

- Rename the function to determineCalleeSaves()
- Pass a bitset of callee saved registers by reference, thus avoiding
the function-global PhysRegUsed bitset in MachineRegisterInfo.
- Without PhysRegUsed the implementation is fine tuned to not save
physcial registers which are only read but never modified.

Related to rdar://21539507

Differential Revision: http://reviews.llvm.org/D10909

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242165 91177308-0d34-0410-b5e6-96231b3b80d8

AArch64: add rev64 alias for 64-bit rev instruction.

It could be useful to assembly programmers and makes the permitted variants a
little more uniform.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242164 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Generate "extract" instructions more aggressively

Generate extract instructions (via intrinsics) before the DAG combiner
folds shifts into unrecognizable forms.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242163 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-ar: Don't try to extract from thin archives.

This matches the gnu ar behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242162 91177308-0d34-0410-b5e6-96231b3b80d8

ARMAsmParser: Take MCInst param by const-ref

(Broken out from http://reviews.llvm.org/D11167)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242160 91177308-0d34-0410-b5e6-96231b3b80d8

Add default value for Args parameter of IRBuilder::CreateCall

Convenient for calls to zero-argument functions.

Patch by servuswiegehtz at yahoo.de

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242159 91177308-0d34-0410-b5e6-96231b3b80d8

Sleep for 2.1 seconds to see if that makes the test stable on windows.

Might fix pr24106.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242158 91177308-0d34-0410-b5e6-96231b3b80d8

Allocate the IntervalMap in ELF.h on the heap to work around MSVC alignment bug (PR24113)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242157 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-ar: print an error when the requested member is not found.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242156 91177308-0d34-0410-b5e6-96231b3b80d8

Use a range loop. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242153 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Fix `llvm-config` to emit the linker flag for the combined shared object built by autoconfig/make instead of the individual components."

This reverts commit 01446706b4c0a86bb64768f307079cab5c514aa3.

Causes breakage, seems to be related to 'svn' in the file's name:

  CC=gcc CXX=g++ \
  ../llvm/configure \
    --prefix=/usr \
    --sysconfdir=/etc \
    --enable-shared \
    --enable-libffi \
    --enable-targets=all \
    --disable-assertions \
    --with-python=/usr/bin/python2 \
    --enable-optimized

  make REQUIRES_RTTI=1 ENABLE_PIC=1

results:
llvm[2]: Linking Release unit test Support (without symbols)
llvm[2]: ======= Finished Linking Release Unit test Support (without symbols)
make[3]: Entering directory '/build/llvm-svn/src/build/bindings/ocaml/llvm'
make[3]: *** No rule to make target '/build/llvm-
svn/src/build/Release/lib/ocaml/libLLVM-3.7.0svn.so', needed by 'build-
deplibs'.  Stop.
make[3]: *** Waiting for unfinished jobs....
llvm[3]: Compiling llvm_ocaml.c for Release build
make[3]: Leaving directory '/build/llvm-svn/src/build/bindings/ocaml/llvm'
/build/llvm-svn/src/llvm/Makefile.rules:880: recipe for target 'all' failed
/build/llvm-svn/src/llvm/Makefile.rules:965: recipe for target 'all' failed

Differential Revision: http://reviews.llvm.org/D10716

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242152 91177308-0d34-0410-b5e6-96231b3b80d8

Rename a test. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242151 91177308-0d34-0410-b5e6-96231b3b80d8

Caused regressions: compile Release+Asserts failed on clang-native-arm-cortex-a9
Revert "-Added API for retrieving the default FPU of a CPU from TargetParser."

This reverts commit 01199ab0c6ff2d5c4f6b2c05a95ec011e41c4669.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242147 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SI: Add support for shrinking v_cndmask_b32_e32 instructions

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11061

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242146 91177308-0d34-0410-b5e6-96231b3b80d8

Silencing two MSVC warnings; 'argument' : truncation from 'unsigned int' to 'int16_t' and truncation of constant value. NFC intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242145 91177308-0d34-0410-b5e6-96231b3b80d8

-Added API for retrieving the default FPU of a CPU from TargetParser.
-Implemented as a table lookup.

Change-Id: Ibf7217f6bd2769e9c06835a5aede3d072dee6757
Phabricator: http://reviews.llvm.org/D11100

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242141 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Fix li/la differences between IAS and GAS.

Summary:
- Signed 16-bit should have priority over unsigned.
- For la, unsigned 16-bit must use ori+addu rather than directly use ori.
- Correct tests on 32-bit immediates with 64-bit predicates by
sign-extending the immediate beforehand. For example, isInt<16>(0xffff8000)
should be true and use addiu.

Also split li/la testing into separate files due to their size.

Reviewers: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10967

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242139 91177308-0d34-0410-b5e6-96231b3b80d8

[PM/AA] Reformat GlobalsModRef so that subsequent patches I make here
don't continually introduce formatting deltas. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242129 91177308-0d34-0410-b5e6-96231b3b80d8

Fix comment typo

Test commit access.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242128 91177308-0d34-0410-b5e6-96231b3b80d8

[SROA] Don't de-atomic volatile loads and stores

Volatile loads and stores are made visible in global state regardless of
what memory is involved. It is not correct to disregard the ordering
and synchronization scope because it is possible to synchronize with
memory operations performed by hardware.

This partially addresses PR23737.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242126 91177308-0d34-0410-b5e6-96231b3b80d8

Generate correct asm info for mingw and cygwin ARM targets.

http://reviews.llvm.org/D11075

Patch by Martell Malone
Reviewed by Reid Kleckner

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242123 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Unbreak add_llvm_external_project when external projects are specified.

LLVM_EXTERNAL_*_SOURCE_DIR is reset as PATH with set(CACHE PATH).
Then the CACHE PATH variable, LLVM_EXTERNAL_*_SOURCE_DIR, is normalized as
${CMAKE_SOURCE_DIR}/${path_var} if ${path_var} is relative.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242120 91177308-0d34-0410-b5e6-96231b3b80d8

Prune trailing whitespaces and CRs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242117 91177308-0d34-0410-b5e6-96231b3b80d8

Give an explicit triple to llvm/test/CodeGen/X86/pr13577.ll.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242111 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization"

Accidental commit, needs review first.

This reverts commit r242107.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242108 91177308-0d34-0410-b5e6-96231b3b80d8

LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization

- Factor out code to query and modify the sign bit of a floatingpoint
  value as an integer. This also works if none of the targets integer
  types is big enough to hold all bits of the floatingpoint value.

- Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available,
  otherwise perform bit manipulation on the sign bit. The previous code
  used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also
  takes 34 instructions on ARM Cortex-M4. With this patch we only
  require 5:
    vldr d0, LCPI0_0
    vmov r2, r3, d0
    lsrs r2, r3, #31
    bfi r1, r2, #31, #1
    bx lr
  (This could be further improved if the compiler would recognize that
   r2, r3 is zero).

- Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is
  available otherwise perform bit manipulation on the sign bit.

- Perform the sign(x) test by masking out the sign bit and comparing
  with 0 rather than shifting the sign bit to the highest position and
  testing for "<s 0". For x86 copysignl (on 80bit values) this gets us:
    testl $32768, %eax
  rather than:
    shlq $48, %rax
    sets %al
    testb %al, %al

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242107 91177308-0d34-0410-b5e6-96231b3b80d8

X86: Check output of x86 copysignl testcase.

This makes the changes in an upcoming patch visible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242106 91177308-0d34-0410-b5e6-96231b3b80d8

Add capability to get and set the personalitty function from the C API

Summary:
The capability was lost with D10429 where the personality function was set at function level rather than landing pad level. Now there is no way to get/set the personality function from the C API. That is a problem.

Note that the whole thing could be avoided by improving the C API testing, as started by D10725

Reviewers: chandlerc, bogner, majnemer, andrew.w.kaylor, rafael, rnk, axw

Subscribers: rafael, llvm-commits

Differential Revision: http://reviews.llvm.org/D10946

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242104 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Forgot to quote the first part of STREQUAL.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242103 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] We shouldn't be storing values in the cache unless they actually need CMake cache behavior.

add_llvm_external_project puts LLVM_EXTERNAL_${nameUPPER}_SOURCE_DIR into the cache even if it is just the in-tree default path. This causes all sorts of oddness, and makes it so that I can't change the behavior of this variable.

This patch never puts LLVM_EXTERNAL_${nameUPPER}_SOURCE_DIR into the cache. It will only end up in the cache if it is specified on the command line, which is the correct behavior.

There is also a temporary change to remove non-default values from the cache if they are already present. This should have the impact of cleaning out unncecissary values from the caches on the buildbots and people's local build directories. This part of the change is marked with a TODO and can be removed in a few days.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242102 91177308-0d34-0410-b5e6-96231b3b80d8

Add a herper function. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242100 91177308-0d34-0410-b5e6-96231b3b80d8

MIR Serialization: Serialize the variable sized stack objects.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242095 91177308-0d34-0410-b5e6-96231b3b80d8

Update enforceKnownAlignment after the isWeakForLinker semantic change

Previously we would refrain from attempting to increase the linkage of
available_externally globals because they were considered weak for the
linker. Now they are treated more like a declaration instead of a weak
definition.

This was causing SSE alignment faults in Chromuim, when some code
assumed it could increase the alignment of a dllimported global that it
didn't control. http://crbug.com/509256

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242091 91177308-0d34-0410-b5e6-96231b3b80d8

MIR Serialization: Serialize the sub register indices.

This commit serializes the sub register indices from the register machine
operands.

Reviewers: Duncan P. N. Exon Smith

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242084 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing file.

Sorry about that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242083 91177308-0d34-0410-b5e6-96231b3b80d8

Fix reading archive members with / in the name.

This is important for thin archives.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242082 91177308-0d34-0410-b5e6-96231b3b80d8

[PPC64LE] More improvements to VSX swap optimization

This patch allows VSX swap optimization to succeed more frequently.
Specifically, it is concerned with common code sequences that occur
when copying a scalar floating-point value to a vector register.  This
patch currently handles cases where the floating-point value is
already in a register, but does not yet handle loads (such as via an
LXSDX scalar floating-point VSX load).  That will be dealt with later.

A typical case is when a scalar value comes in as a floating-point
parameter.  The value is copied into a virtual VSFRC register, and
then a sequence of SUBREG_TO_REG and/or COPY operations will convert
it to a full vector register of the class required by the context.  If
this vector register is then used as part of a lane-permuted
computation, the original scalar value will be in the wrong lane.  We
can fix this by adding a swap operation following any widening
SUBREG_TO_REG operation.  Additional COPY operations may be needed
around the swap operation in order to keep register assignment happy,
but these are pro forma operations that will be removed by coalescing.

If a scalar value is otherwise directly referenced in a computation
(such as by one of the many XS* vector-scalar operations), we
currently disable swap optimization.  These operations are
lane-sensitive by definition.  A MentionsPartialVR flag is added for
use in each swap table entry that mentions a scalar floating-point
register without having special handling defined.

A common idiom for PPC64LE is to convert a double-precision scalar to
a vector by performing a splat operation.  This ensures that the value
can be referenced as V[0], as it would be for big endian, whereas just
converting the scalar to a vector with a SUBREG_TO_REG operation
leaves this value only in V[1].  A doubleword splat operation is one
form of an XXPERMDI instruction, which takes one doubleword from a
first operand and another doubleword from a second operand, with a
two-bit selector operand indicating which doublewords are chosen.  In
the general case, an XXPERMDI can be permitted in a lane-swapped
region provided that it is properly transformed to select the
corresponding swapped values.  This transformation is to reverse the
order of the two input operands, and to reverse and complement the
bits of the selector operand (derivation left as an exercise to the
reader ;).

A new test case that exercises the scalar-to-vector and generalized
XXPERMDI transformations is added as CodeGen/PowerPC/swaps-le-5.ll.
The patch also requires a change to CodeGen/PowerPC/swaps-le-3.ll to
use CHECK-DAG instead of CHECK for two independent instructions that
now appear in reverse order.

There are two small unrelated changes that are added with this patch.
First, the XXSLDWI instruction was incorrectly omitted from the list
of lane-sensitive instructions; this is now fixed.  Second, I observed
that the same webs were being rejected over and over again for
different reasons.  Since it's sufficient to reject a web only once, I
added a check for this to speed up the compilation time slightly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242081 91177308-0d34-0410-b5e6-96231b3b80d8

Use std::make_tuple to reduce code duplication.

Thanks to David Blaikie for the suggestion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242074 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unnecessary lines from the test in r242068.

This test case was breaking the hexagon elf bot. The failing lines
were actually unnecessary as checking that the store still reads the
correct value demonstrates that everything is working fine now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242073 91177308-0d34-0410-b5e6-96231b3b80d8

Reduce memory usage of ComputeEditDistance() by (almost) 50%

ComputeEditDistance() currently keeps two rows of the edit distance matrix in
memory. That's unnecessary, one row plus one additional element are sufficient.
With this change, strings up to 64 chars can be processed without going to the
heap, compared to 32 chars previously. (But the main motivation is that the
code gets a bit simpler.)

No intended behavior change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242069 91177308-0d34-0410-b5e6-96231b3b80d8

Loop idiom recognizer was replacing too many uses of popcount.

When spotting that a loop can use ctpop, we were incorrectly replacing all uses of a value with a value derived from ctpop.

The bug here was exposed because we were replacing a use prior to the ctpop with the ctpop value and so we have a use before def, i.e., we changed

%tobool.5 = icmp ne i32 %num, 0
store i1 %tobool.5, i1* %ptr
br i1 %tobool.5, label %for.body.lr.ph, label %for.end

to

store i1 %1, i1* %ptr
%0 = call i32 @llvm.ctpop.i32(i32 %num)
%1 = icmp ne i32 %0, 0
br i1 %1, label %for.body.lr.ph, label %for.end

Even if we inserted the ctpop so that it dominates the store here, that would still be incorrect. The store doesn’t want the result of ctpop.

The fix is very simple, and involves replacing only the branch condition with the ctpop instead of all uses.

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242068 91177308-0d34-0410-b5e6-96231b3b80d8

[WinEH] Emit the LSDA even if no lpads remain but outlining occurred

The outlined funclets call intrinsics which reference labels from the
LSDA. This situation can easily arise in small functions with a single
cleanup at -O0, where Clang marks a definition as nounwind, and then
WinEHPrepare "discovers" that the landingpad is dead by accident and
deletes it.

We now need to ask the LLVM IR Function for it's personality directly,
rather than going through MachineModuleInfo.

Fixes PR23892.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242063 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Move BitTracker into the llvm namespace and remove redundant qualifications

No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242062 91177308-0d34-0410-b5e6-96231b3b80d8