Add a using declaration so that the overrides don't hide some of the
base class methods.

This was caught by GCC's -Woverloaded-virtual, not sure why it wasn't
caught by Clang's. =/

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272460 91177308-0d34-0410-b5e6-96231b3b80d8
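
To illustrate the name hiding that the using declaration in the commit above works around, here is a minimal sketch. The class and method names (Base, Derived, visit) are hypothetical and are not taken from the patch.

```cpp
#include <string>

// Hypothetical classes, named only for this illustration.
struct Base {
  virtual ~Base() = default;
  virtual std::string visit(int) { return "Base::visit(int)"; }
  virtual std::string visit(double) { return "Base::visit(double)"; }
};

struct Derived : Base {
  // Without this using declaration, the override below would hide
  // Base::visit(double), and a call such as visit(1.5) on a Derived object
  // would silently convert 1.5 to int and pick visit(int).
  using Base::visit;
  std::string visit(int) override { return "Derived::visit(int)"; }
};
```

With the using declaration in place, calling visit(1.5) on a Derived object resolves to the base class overload taking a double.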

Compare to an unsigned literal to avoid a -Wsign-compare warning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272459 91177308-0d34-0410-b5e6-96231b3b80d8
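
A small sketch of the kind of comparison that the -Wsign-compare commit above is about. The helper names (isEqual, hasTwoElements) are hypothetical; the assumption is that the comparison happens inside a template, as it does in typical unit-test assertion macros.

```cpp
#include <vector>

// Hypothetical stand-in for a templated comparison helper, such as the ones
// behind unit-test assertion macros. Both operand types are deduced.
template <typename A, typename B>
bool isEqual(const A &Lhs, const B &Rhs) {
  return Lhs == Rhs;
}

// Hypothetical caller: checks that a vector holds exactly two elements.
inline bool hasTwoElements(const std::vector<int> &V) {
  // isEqual(2, V.size()) would compare an int with an unsigned size type
  // inside the template, which compilers may flag under -Wsign-compare.
  // The unsigned literal 2u keeps both operands unsigned.
  return isEqual(2u, V.size());
}
```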

Use const_cast to cast away constness. This silences a warning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272458 91177308-0d34-0410-b5e6-96231b3b80d8

DebugInfoPDBTests:MappedBlockStreamTest.TestWriteThenRead: Avoid assigning temporary object to ArrayRef.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272457 91177308-0d34-0410-b5e6-96231b3b80d8

[MCJIT] Update MCJIT and get the fibonacci example working again.

MCJIT will now set the DataLayout on a module when it is added to the JIT,
rather than waiting until it is codegen'd, and the runFunction method will
finalize the module containing the function to be run before running it.

The fibonacci example has been updated to include and link against MCJIT.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272455 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX512] Add support for lowering v32i16 shuffles with repeated lanes. This allows us to create 512-bit PSHUFLW/PSHUFHW.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272450 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX512] No need to check for BWI being enabled before lowering v32i16 and v64i8 shuffles. If we get this far the types are already legal which means BWI must be enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272449 91177308-0d34-0410-b5e6-96231b3b80d8

LiveIntervalAnalysis: findLastUseBefore() must ignore undef uses.

undef uses are no real uses of a register and must be ignored by
findLastUseBefore() so that handleMove() does not produce invalid live
intervals in some cases.

This fixes http://llvm.org/PR28083

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272446 91177308-0d34-0410-b5e6-96231b3b80d8

[esan|cfrag] Handle complex GEP instr in the cfrag tool

Summary:
Iterates over all operands (except the first and the last) within each GEP
instruction for instrumentation.

Adds test struct_field_gep.ll.

Reviewers: aizatsky

Subscribers: vitalybuka, zhaoqin, kcc, eugenis, bruening, llvm-commits

Differential Revision: http://reviews.llvm.org/D21242

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272442 91177308-0d34-0410-b5e6-96231b3b80d8

Try again to fix this endianness issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272440 91177308-0d34-0410-b5e6-96231b3b80d8

Don't try to rotate a loop more than once - we never do this anyway.

Summary:
I can't find a case where we can rotate a loop more than once, and it looks
like we never do this. To rotate a loop, the following conditions should be met:
1) its header should be exiting
2) its latch shouldn't be exiting

But after the first rotation the header becomes the new latch, so this
condition can never be true any longer.

Tested with an assert on the LNT testsuite and make check.

Reviewers: hfinkel, sanjoy

Subscribers: sebpop, sanjoy, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D20181

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272439 91177308-0d34-0410-b5e6-96231b3b80d8
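
At the source level, the effect of the rotation discussed in the commit above can be sketched as turning a top-tested loop into a guarded bottom-tested one. This is only a C++ analogy with hypothetical function names, not the IR that the pass operates on.

```cpp
// Before rotation: the exit test sits in the loop header, and the latch
// (the increment that jumps back to the header) does not exit.
inline int sumBefore(const int *Data, int N) {
  int Sum = 0;
  for (int I = 0; I < N; ++I)
    Sum += Data[I];
  return Sum;
}

// After rotation: a guard checks the condition once up front, and the exit
// test moves to the bottom of the loop, so the latch is now the exiting block.
inline int sumAfter(const int *Data, int N) {
  int Sum = 0;
  int I = 0;
  if (I < N) {
    do {
      Sum += Data[I];
      ++I;
    } while (I < N);
  }
  return Sum;
}
```

In the second form the latch is exiting, so condition 2 from the commit message no longer holds and a second rotation would not apply.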

[pdb] Fix issues with pdb writing.

This fixes an alignment issue by forcing all cached allocations
to be 8 byte aligned, and also fixes an issue arising on big
endian systems by writing ulittle32_t's instead of uint32_t's
in the test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272437 91177308-0d34-0410-b5e6-96231b3b80d8
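
The two ideas in the pdb commit above (rounding allocation sizes up to 8 bytes, and storing 32-bit values in little-endian order regardless of the host) can be sketched as follows. The helper names alignUp and writeLE32 are hypothetical and do not come from the patch, which uses LLVM's own endian-aware types.

```cpp
#include <cstddef>
#include <cstdint>

// Rounds Size up to the next multiple of Align.
// Assumption: Align is a power of two.
inline std::size_t alignUp(std::size_t Size, std::size_t Align = 8) {
  return (Size + Align - 1) & ~(Align - 1);
}

// Stores Value into Out[0..3] in little-endian byte order. Because the bytes
// are extracted with shifts, the result does not depend on host endianness.
inline void writeLE32(std::uint32_t Value, unsigned char *Out) {
  for (int I = 0; I < 4; ++I)
    Out[I] = static_cast<unsigned char>((Value >> (8 * I)) & 0xFFu);
}
```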

MemorySSA: fix memory access local dominance function for live on entry

A memory access defined on function entry cannot be locally dominated by another memory access.
The patch was split from http://reviews.llvm.org/D19338 which exposes the problem.

Differential Revision: http://reviews.llvm.org/D21039

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272436 91177308-0d34-0410-b5e6-96231b3b80d8

[STLExtras] Introduce and use llvm::count_if; NFC

(This was split out from D21115)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272435 91177308-0d34-0410-b5e6-96231b3b80d8
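
A range-based count_if of the kind introduced in the commit above can be sketched as a thin wrapper over std::count_if. This is an illustration placed in a hypothetical namespace (sketch), not the actual STLExtras implementation.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

namespace sketch {
// Range-based wrapper: forwards to std::count_if over [begin(R), end(R)),
// so callers pass the container itself instead of an iterator pair.
template <typename Range, typename Pred>
auto count_if(Range &&R, Pred P) ->
    typename std::iterator_traits<decltype(std::begin(R))>::difference_type {
  return std::count_if(std::begin(R), std::end(R), P);
}
} // namespace sketch
```

A caller could then write sketch::count_if(V, Pred) instead of std::count_if(V.begin(), V.end(), Pred).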

[IRTranslator] Support the translation of or.

Now or instructions get translated into G_OR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272433 91177308-0d34-0410-b5e6-96231b3b80d8

[IRTranslator] Rework the comments for the methods to translate.

NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272432 91177308-0d34-0410-b5e6-96231b3b80d8

[IRTranslator] Refactor to expose a translateBinaryOp method.

This method will be used for every binary operation.

NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272431 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Move comments closer to relevant check. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272430 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Refactor a check earlier. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272429 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] enable bitcasted fabs/fneg transforms

The vector cases don't change because we already have folds in X86ISelLowering
to look through and remove bitcasts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272427 91177308-0d34-0410-b5e6-96231b3b80d8
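
The equivalence that bitcasted fabs/fneg transforms rely on is that an integer AND or XOR on the bit pattern of a float matches fabs or fneg. A minimal scalar sketch, assuming float is IEEE 754 binary32; the function names are hypothetical and this is not the DAG combine code itself.

```cpp
#include <cstdint>
#include <cstring>

// Reinterprets the bits of a float as a 32-bit integer and back.
// Assumption: float is a 32-bit IEEE 754 value.
inline std::uint32_t bitsOf(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof Bits);
  return Bits;
}
inline float floatOf(std::uint32_t Bits) {
  float F;
  std::memcpy(&F, &Bits, sizeof F);
  return F;
}

// fabs expressed as integer logic: clear the sign bit.
inline float fabsViaInt(float F) { return floatOf(bitsOf(F) & 0x7FFFFFFFu); }

// fneg expressed as integer logic: flip the sign bit.
inline float fnegViaInt(float F) { return floatOf(bitsOf(F) ^ 0x80000000u); }
```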

[CodeGen] Fix PrologEpilogInserter to avoid duplicate allocation of SEH structs

Summary:
When stack-protection is activated and WinEH exceptions are used,
the EHRegNode (exception handling registration) is allocated twice on the stack.

This was not breaking anything, only wasting space on the stack.

```
D:\src\llvm\examples>llc exc2.ll  -debug-only=pei
alloc FI(0) at SP[-24]
alloc FI(1) at SP[-48]   <<-- Allocated
alloc FI(1) at SP[-72]   <<-- Allocated twice!?
alloc FI(2) at SP[-76]
alloc FI(4) at SP[-80]
alloc FI(3) at SP[-84]
```

Reviewers: rnk, majnemer

Subscribers: chrisha, llvm-commits

Differential Revision: http://reviews.llvm.org/D21188

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272426 91177308-0d34-0410-b5e6-96231b3b80d8

Remove a few gendered pronouns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272422 91177308-0d34-0410-b5e6-96231b3b80d8

Disable MSan-hostile loop unswitching.

Loop unswitching may cause an MSan false positive when the unswitch
condition is not guaranteed to execute.

This is very similar to the ASan and TSan special case in
llvm::isSafeToSpeculativelyExecute (they don't like speculative loads
and stores), but for branch instructions.

This is a workaround for PR28054.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272421 91177308-0d34-0410-b5e6-96231b3b80d8
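
A source-level sketch of why unswitching can be hostile to MSan, as described in the commit above. The function names are hypothetical, and the second function is a hand-written picture of what an unswitched loop looks like, not compiler output.

```cpp
// Original loop: Flag is only branched on inside the loop body, so if
// N == 0 the branch on Flag never executes.
inline int sumOriginal(const int *Data, int N, bool Flag) {
  int Sum = 0;
  for (int I = 0; I < N; ++I) {
    if (Flag)
      Sum += Data[I];
    else
      Sum -= Data[I];
  }
  return Sum;
}

// Unswitched form: the branch on Flag is hoisted out of the loop and now
// executes even when N == 0. If Flag happens to be uninitialized in that
// case, a tool that reports branches on uninitialized values would flag
// this version although the original program never used Flag.
inline int sumUnswitched(const int *Data, int N, bool Flag) {
  int Sum = 0;
  if (Flag) {
    for (int I = 0; I < N; ++I)
      Sum += Data[I];
  } else {
    for (int I = 0; I < N; ++I)
      Sum -= Data[I];
  }
  return Sum;
}
```

Both functions compute the same result for initialized inputs; they differ only in whether the branch on Flag is guaranteed to be reached.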

Move isGuaranteedToExecute out of LICM.

Also rename LICMSafetyInfo to LoopSafetyInfo.
Both will be used in LoopUnswitch in a separate change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272420 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Support Compare and Traps

Support and generate Compare and Traps like CRT, CIT, etc.

Support Trap as legal DAG opcodes and generate "j .+2" for them by default.
Add support for Conditional Traps and use the If Converter to convert them into
the corresponding compare and trap opcodes.

Differential Revision: http://reviews.llvm.org/D21155

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272419 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SI: Don't use fixup_si_rodata for scratch rsrc relocations

Summary:
We need to set the fixup type to FK_Data_4 for the
SCRATCH_RSRC_DWORD[01] symbols, since these require absolute
relocations, and fixup_si_rodata is for relative relocations.

Reviewers: arsenm, kzhuravl

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21153

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272417 91177308-0d34-0410-b5e6-96231b3b80d8

Move CodeGen test from Generic to X86 specific directory

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272416 91177308-0d34-0410-b5e6-96231b3b80d8

Interprocedural Register Allocation (IPRA): add a Transformation Pass

Adds a MachineFunctionPass that scans the body to find calls, and
updates the register mask with the one saved by the
RegUsageInfoCollector analysis in PhysicalRegisterUsageInfo.

Patch by Vivek Pandya <vivekvpandya@gmail.com>

Differential Revision: http://reviews.llvm.org/D21180

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272414 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add test for PR28044

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272411 91177308-0d34-0410-b5e6-96231b3b80d8

Add a period. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272410 91177308-0d34-0410-b5e6-96231b3b80d8

Fix whitespace. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272409 91177308-0d34-0410-b5e6-96231b3b80d8

test: split test into two files

Split up the test cases into two inputs as per post-commit review comments from
Renato. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272408 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add costs for SSE zext/sext to v4i64 to TTI

The costs are somewhat hand-wavy, but should be much closer to the truth
than what we get from BasicTTI.

Differential Revision: http://reviews.llvm.org/D21156

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272406 91177308-0d34-0410-b5e6-96231b3b80d8

Interprocedural Register Allocation (IPRA) Analysis

Add an option to enable the analysis of MachineFunction register
usage to extract the list of clobbered registers.

When enabled, the CodeGen order is changed to be bottom up on the Call
Graph.

The analysis is split into two parts. RegUsageInfoCollector is the
MachineFunction pass that runs post-RA and collects the list of
clobbered registers to produce a register mask.

An immutable pass, RegisterUsageInfo, stores the RegMasks produced by
RegUsageInfoCollector and keeps them available. A future transformation
pass will use this information to update every call site after
instruction selection.

Patch by Vivek Pandya <vivekvpandya@gmail.com>

Differential Revision: http://reviews.llvm.org/D20769

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272403 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Add preferred alignments for Exynos M1

Differential Revision: http://reviews.llvm.org/D21203

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272400 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Remove incorrect offset scaling

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272399 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] fix test attributes and autogenerate checks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272398 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add missing tests for fcmp ueq/one

Somehow, the codegen logic for these sequences has gone completely untested
until now (note the 2 compare instructions generated per test).

There's also an *Intel* AVX optimization opportunity exposed in these cases
and the existing tests. Intel's (but not AMD's) AVX spec shows that extra FP
predicates were added, so a single comparison should always be sufficient,
and operand commutation should never be necessary.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272397 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] regenerate checks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272396 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply "[TTI] Refine default cost for interleaved load groups with gaps"

This reapplies commit r272385 with a fix. The build was failing when compiled
with gcc, but not with clang. With the fix, we now get the data layout from the
current TTI implementation, which will hopefully solve the issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272395 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272393 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Added target shuffle combine tests for byte shift/rotates (PSLLDQ/PSRLDQ/PALIGNR)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272392 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[TTI] Refine default cost for interleaved load groups with gaps"

This reverts commit r272385. This commit broke the build. I'm temporarily
reverting to investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272391 91177308-0d34-0410-b5e6-96231b3b80d8

[TTI] Refine default cost for interleaved load groups with gaps

This patch refines the default cost for interleaved load groups having gaps. If
a load group has gaps, the legalized instructions corresponding to the unused
elements will be dead. Thus, we don't need to account for them in the cost
model. Instead, we only need to account for the fraction of legalized loads
that will actually be used.

Differential Revision: http://reviews.llvm.org/D20873

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272385 91177308-0d34-0410-b5e6-96231b3b80d8
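
One way to picture the "fraction of legalized loads that will actually be used" from the commit above is the sketch below. It is an assumption-laden illustration, not the code from the patch: the function name, its parameters, and the integer scaling are all hypothetical.

```cpp
#include <set>
#include <vector>

// Hypothetical cost helper.
//   FullCost      - cost of loading the whole wide vector
//   NumElts       - number of elements in the wide vector
//   NumLegalInsts - number of legal loads the wide load is split into
//   Factor        - interleave factor of the group
//   Indices       - member indices of the group that are actually used
inline unsigned scaledLoadCost(unsigned FullCost, unsigned NumElts,
                               unsigned NumLegalInsts, unsigned Factor,
                               const std::vector<unsigned> &Indices) {
  unsigned NumSubElts = NumElts / Factor;
  // Elements covered by each legalized load, rounded up.
  unsigned EltsPerInst = (NumElts + NumLegalInsts - 1) / NumLegalInsts;

  // Record which legalized loads contain at least one used element.
  std::set<unsigned> UsedInsts;
  for (unsigned Index : Indices)
    for (unsigned Elt = 0; Elt < NumSubElts; ++Elt)
      UsedInsts.insert((Index + Factor * Elt) / EltsPerInst);

  // Scale the full cost by the fraction of legalized loads that are used.
  return FullCost * static_cast<unsigned>(UsedInsts.size()) / NumLegalInsts;
}
```

For example, with 8 elements, factor 4, and 4 legal loads of 2 elements each, a group that only uses member 0 touches elements 0 and 4, which live in 2 of the 4 loads, so the cost is halved.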

[AMDGPU] AsmParser: Support for sext() modifier in SDWA. Some code cleaning in AMDGPUOperand.

Summary:
The sext() modifier is supported in SDWA instructions only for integer operands. The spec is unclear on whether integer operands should support the abs and neg modifiers together with sext; for now they are not supported.
Renamed InputModsWithNoDefault to FloatInputMods. Added SextInputMods for operands that support the sext() modifier.
Added AMDGPUOperand::Modifier struct to handle register and immediate modifiers.
Code cleanup in the AMDGPUOperand class: organize methods into groups (render methods, predicate methods...).

Reviewers: vpykhtin, artem.tamazov, tstellarAMD

Subscribers: arsenm, kzhuravl

Differential Revision: http://reviews.llvm.org/D20968

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272384 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Added VPSLLDQ/VPSRLDQ memory fold tests

Memory operand is new for AVX512 (SSE/AVX2 didn't support it).

Also dropped the 'mask' from the tests (VPSLLDQ/VPSRLDQ don't support masked operations).

Regenerated VPALIGNR test now that the shuffle comments work

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272383 91177308-0d34-0410-b5e6-96231b3b80d8

Fix stale name in comment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272382 91177308-0d34-0410-b5e6-96231b3b80d8

test commit: remove trailing whitespaces in README.txt

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272380 91177308-0d34-0410-b5e6-96231b3b80d8

Bug fix: remove another illegal char from prof symbol name

An end-to-end test without the integrated assembler should be added
at some point (not done now because some bots are not properly configured to
support -no-integrated-as)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272376 91177308-0d34-0410-b5e6-96231b3b80d8

[LibFuzzer] Fix some unit test crashes on OSX.

This fixes the following unit tests:

FuzzerDictionary.ParseOneDictionaryEntry
FuzzerDictionary.ParseDictionaryFile

The issue appears to be mixing non-ASan-ified code (LibFuzzer) and
ASan-ified code (the unittest) as the tests would pass fine if
everything was built with ASan enabled.

I believe the issue is that different implementations of std::vector<>
are being used in LibFuzzer and outside LibFuzzer (in the unittests).
For Libcxx (I've not seen the issue manifest for libstdc++) we can disable
the ASanified std::vector<> by defining the ``_LIBCPP_HAS_NO_ASAN`` macro.
Doing this fixes the tests on OSX.

Differential Revision: http://reviews.llvm.org/D21049

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272374 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing include for r272369

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272373 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX512] Add shuffle comment printing for masked VPERMPD/VPERMQ.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272371 91177308-0d34-0410-b5e6-96231b3b80d8

Make PDBFile take a StreamInterface instead of a MemBuffer.

This is the next step towards being able to write PDBs.
MemoryBuffer is immutable, and StreamInterface is our replacement
which can be any combination of read-only, read-write, or write-only
depending on the particular implementation.

The one place where we were creating a PDBFile (in RawSession) is
updated to subclass ByteStream with a simple adapter that holds
a MemoryBuffer, and initializes the superclass with the buffer's
array, so that all the functionality of ByteStream works
transparently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272370 91177308-0d34-0410-b5e6-96231b3b80d8

Add support for writing through StreamInterface.

This adds methods and tests for writing to a PDB stream. With
this, even a PDB stream which is discontiguous can be treated
as a sequential stream of bytes for the purposes of writing.

Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21157

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272369 91177308-0d34-0410-b5e6-96231b3b80d8
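
The idea in the commit above of treating a discontiguous stream as a sequential run of bytes for writing can be sketched as follows. The type BlockMappedStream and its members are hypothetical and much simpler than the real StreamInterface; the sketch assumes fixed-size blocks and that every block index is valid for the backing storage.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical model: a stream made of fixed-size blocks scattered across a
// backing buffer. Stream block I lives at storage block Blocks[I].
struct BlockMappedStream {
  std::vector<std::uint8_t> &Storage;
  std::size_t BlockSize;
  std::vector<std::size_t> Blocks;

  // Writes Size bytes at stream offset Offset, splitting the write wherever
  // it crosses a block boundary. Returns false if the write does not fit.
  // Assumption: each entry of Blocks is in range for Storage.
  bool write(std::size_t Offset, const std::uint8_t *Data, std::size_t Size) {
    std::size_t Length = Blocks.size() * BlockSize;
    if (Offset > Length || Size > Length - Offset)
      return false;
    while (Size > 0) {
      std::size_t BlockIdx = Offset / BlockSize;
      std::size_t InBlock = Offset % BlockSize;
      std::size_t Chunk = std::min(Size, BlockSize - InBlock);
      std::memcpy(Storage.data() + Blocks[BlockIdx] * BlockSize + InBlock,
                  Data, Chunk);
      Data += Chunk;
      Offset += Chunk;
      Size -= Chunk;
    }
    return true;
  }
};
```

The caller only sees stream offsets; the mapping from stream blocks to storage blocks stays inside write().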

[AVX512] Fix shuffle comment printing to handle the masked versions of some shuffles. Previously we were printing the mask operands as the register names.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272367 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Only gather redirected files for command failures.

- The intended use of this was just in diagnostics, so we shouldn't pay the
   cost of reading these all the time.

- This will avoid including the full output of each command in tests which
   fail, but the most important use case for this was to gather the output of
   the specific command which failed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272365 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix trailing whitespace

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272364 91177308-0d34-0410-b5e6-96231b3b80d8

[esan|cfrag] Add the struct field offset array in StructInfo

Summary:
Adds the struct field offset array in struct StructInfo.

Updates test struct_field_count_basic.ll.

Reviewers: aizatsky

Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka

Differential Revision: http://reviews.llvm.org/D21192

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272362 91177308-0d34-0410-b5e6-96231b3b80d8

[LiveRangeEdit] Add a test case for r272314.

The test case is not great, especially because it is still cumbersome to
run the regalloc pass with run-pass. (We are missing a bunch of initializers
that still need to be properly implemented.)

Related to llvm.org/PR27983

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272360 91177308-0d34-0410-b5e6-96231b3b80d8

Add null checks before using a pointer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272359 91177308-0d34-0410-b5e6-96231b3b80d8

[llc] Do not create the pass config several times for run-pass.

Thanks to Matthias Braun for spotting this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272358 91177308-0d34-0410-b5e6-96231b3b80d8

[llc] Add support for several run-pass options.

Previously we could run only one machine pass with the run-pass option.
With that patch, we can now specify several passes with several run-pass
options (or just one option with a list of comma separated passes) and
llc will build the related pipeline.
This is great to test the interaction of two passes that are not
necessarily next to each other in the pipeline, or play with pass
ordering.
Now, we should be at parity with opt for the flexibility of running
passes.

Note: I also moved the run pass option from CommandFlags.h to llc.cpp
because, really, this is needed only there!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272356 91177308-0d34-0410-b5e6-96231b3b80d8

[esan|cfrag] Disable load/store instrumentation for cfrag

Summary:
Adds ClInstrumentFastpath option to control fastpath instrumentation.

Avoids the load/store instrumentation for the cache fragmentation tool.

Renames cache_frag_basic.ll to working_set_slow.ll for slowpath
instrumentation test.

Adds the __esan_init check in struct_field_count_basic.ll.

Reviewers: aizatsky

Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka

Differential Revision: http://reviews.llvm.org/D21079

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272355 91177308-0d34-0410-b5e6-96231b3b80d8

Update call site attribute documentation

convergent is also accepted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272353 91177308-0d34-0410-b5e6-96231b3b80d8

docs: Add AMDGPU relocation information

Summary:
This documents the various relocation types that are supported by the
Radeon Open Compute (ROC) runtime (which is essentially the dynamic
linker for AMDGPU).

Only R_AMDGPU_32 is not currently supported by the ROC runtime, but
it will usually be resolved at link time by lld.

Patch by: Konstantin Zhuravlyov

Reviewers: kzhuravl, rafael

Subscribers: rafael, arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D20952

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272352 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: v_cndmask_b32 does not def vcc

Fixes verifier errors after SIShrinkInstructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272351 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SI: Make sure to emit TargetConstant nodes when matching ds_*permute

Summary:
This fixes a bug with ds_*permute instructions where if it was passed a
constant address, then the offset operand would get assigned a register
operand instead of an immediate.

Reviewers: scchan, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19994

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272349 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Removing fallback code for CMake versions before 3.1

This code is dead code now. Out with the old, in with the new!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272347 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SI: Use common topological sort algorithm in SIScheduleDAGMI

Reviewers: arsenm, axeldavy

Subscribers: MatzeB, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19823

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272346 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix flat atomics

The flat atomics could already be selected, but only
when using flat instructions for global memory. Add
patterns for flat addresses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272345 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix i64 global cmpxchg

This was using extract_subreg sub0 to extract the low register
of the result instead of sub0_sub1, producing an invalid copy.

There doesn't seem to be a way to use the compound subreg indices
in tablegen since those are generated, so manually select it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272344 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix missing and broken check lines in atomic tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272343 91177308-0d34-0410-b5e6-96231b3b80d8

Make sure that uninteresting allocas are not instrumented.

Summary:
We failed to unpoison uninteresting allocas on return as unpoisoning is part of
main instrumentation which skips such allocas.

Added a check of -asan-instrument-allocas for dynamic allocas. If instrumentation
of dynamic allocas is disabled, they will not be unpoisoned.

PR27453

Reviewers: kcc, eugenis

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21207

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272341 91177308-0d34-0410-b5e6-96231b3b80d8

CodeGen: Allow verifier to run after MachineBlockPlacement

No tests break with this enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272340 91177308-0d34-0410-b5e6-96231b3b80d8

Add aliases for mfvrsave/mtvrsave.

Update a test as we're now going to emit it for easier reading of
generated assembly as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272339 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Run verifier after insert waits pass

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272338 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove incorrect assertion

I'm still not sure under what circumstances the offset here is non-0,
but private memory is not limited to 27-bits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272337 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Properly initialize SIShrinkInstructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272336 91177308-0d34-0410-b5e6-96231b3b80d8

[CFLAA] Handle global/arg attrs more sanely.

Prior to this patch, we used argument/global stratified attributes in
order to note that a value could have come from either dereferencing a
global/arg, or from the assignment from a global/arg.

Now, AttrUnknown is placed on sets when we see a dereference, instead of
the global/arg attributes. This allows us to be more aggressive in the
future when we see global/arg attributes without AttrUnknown.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21110

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272335 91177308-0d34-0410-b5e6-96231b3b80d8

Unpoison stack memory in use-after-return + use-after-scope mode

Summary:
We still want to unpoison the full stack even in use-after-return mode, as it can be disabled at runtime.

PR27453

Reviewers: eugenis, kcc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21202

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272334 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply 272328 and 272329 as a single patch.

[cpu-detection] [amdfam10] Return barcelona, and amdfam10 for all other
subtypes. Address Bug 28067.

Along with the refactoring of Host.cpp, getHostCPUName() was modified to
return more precise types for CPUs in amdfam10.
However, callers of getHostCPUName() do string matching on type, so this
cannot be modified.
Currently there is support in the x86 backend for barcelona.
For all other subtypes the assumed return value is amdfam10.

Fix: getHostCPUName() returns barcelona subtype and amdfam10 for all
others. This can be extended further when support for the other subtypes
is added.

Differential revision: http://reviews.llvm.org/D21193

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272333 91177308-0d34-0410-b5e6-96231b3b80d8

Revert 272328 and 272329 to recommit as a single patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272332 91177308-0d34-0410-b5e6-96231b3b80d8

Keep barcelona subtype for amdfam10

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272329 91177308-0d34-0410-b5e6-96231b3b80d8

[cpu-detection] Return amdfam10 for all subtypes. Address Bug 28067.

Summary: Remove architecture subtype from the string returned by getHostCPUName(). String matching done on type.

Reviewers: llvm-commits, echristo

Subscribers: mehdi_amini

Differential Revision: http://reviews.llvm.org/D21193

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272328 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Cleanup ExternalProject usage of CMake 3.x features

All the ExternalProject features in use here are supported by CMake 3.4.3, so we don't need these version checks anymore.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272327 91177308-0d34-0410-b5e6-96231b3b80d8

Use ProfileSummaryInfo in inline cost analysis.

Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo.

Differential revision: http://reviews.llvm.org/D21045

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272321 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Added avx512 VPSLLDQ/VPSRLDQ instruction comments

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272319 91177308-0d34-0410-b5e6-96231b3b80d8

[LiveRangeEdit] Fix a crash in eliminateDeadDef.

When we delete a live-range, we check if that live-range is the origin of others
to keep it around for rematerialization. For that we check that the instruction
we are about to remove is the same as the definition of the VNI of the original
live-range.
If this is the case, we just shrink the live-range to an empty one.

Now, when we try to delete one of the children of such live-range (product of
splitting), we do the same check.
However, now the original live-range is empty and there is no way we can
access the VNI to check its definition, and we crash.

When we cannot get the VNI for the original live-range, that means we are not in
the presence of the original definition. Thus, this check does not need to happen
in that case and the crash is solved!

This bug was introduced in r266162 | wmi | 2016-04-12 20:08:27. It affects every
target that uses the greedy register allocator.
For this to happen, we need to delete both the original instruction and its split
products, in that order. This is likely to happen when rematerialization comes
into play.

Trying to produce a more robust test case. Will follow in a coming commit.

This fixes llvm.org/PR27983.

rdar://problem/26651519

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272314 91177308-0d34-0410-b5e6-96231b3b80d8

[docs] Fix indentation for a tool option

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272309 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Dropped avx512 VPSLLDQ/VPSRLDQ intrinsics

Auto-upgrade to generic shuffles like sse/avx2 implementations now that we can lower to VPSLLDQ/VPSRLDQ

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272308 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Fixed issue with v16i32 shuffles lowering to VPALIGNR

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272307 91177308-0d34-0410-b5e6-96231b3b80d8

BitcodeReader: Use std::piecewise_construct when upgrading type refs

r267296 used std::piecewise_construct without using
std::forward_as_tuple, and r267298 hacked it out (using an emplace_back
followed by a couple of reset() calls) because of a problem on a bot.
I'm finally circling back to call forward_as_tuple as I should have to
begin with (thanks to David Blaikie for pointing out the missing piece).

Note that this code uses emplace_back() instead of
push_back(make_pair()) because the move constructor for TrackingMDRef is
expensive (cheaper than a copy, but still expensive).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272306 91177308-0d34-0410-b5e6-96231b3b80d8
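
The combination of emplace_back(), std::piecewise_construct, and std::forward_as_tuple mentioned in the commit above can be sketched as follows. Tracked is a hypothetical stand-in for a type with a costly move constructor (it just counts moves); it is not TrackingMDRef, and appendPair is likewise an illustrative name.

```cpp
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical type that counts how often it is move-constructed.
struct Tracked {
  static int Moves;
  int Value;
  explicit Tracked(int V) : Value(V) {}
  Tracked(Tracked &&Other) : Value(Other.Value) { ++Moves; }
};
int Tracked::Moves = 0;

// Constructs both halves of the pair in place inside the vector.
// std::piecewise_construct tells std::pair to build first and second from
// the two argument tuples, so no temporary Tracked objects are created and
// moved, unlike push_back(std::make_pair(Tracked(A), Tracked(B))).
inline void appendPair(std::vector<std::pair<Tracked, Tracked>> &Vec, int A,
                       int B) {
  Vec.emplace_back(std::piecewise_construct, std::forward_as_tuple(A),
                   std::forward_as_tuple(B));
}
```

As long as the vector does not need to reallocate, appending this way performs no moves of Tracked at all.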

[X86][AVX512] Added support for lowering 512-bit vector shuffles to bit/byte shifts

512-bit VPSLLDQ/VPSRLDQ can only be used for avx512bw targets so lowerVectorShuffleAsShift had to be adjusted to include the subtarget

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272300 91177308-0d34-0410-b5e6-96231b3b80d8

[NVPTX] Add intrinsics for shfl instructions.

Summary:
Currently clang emits these instructions via inline (volatile) asm in
the CUDA headers. Switching to intrinsics will let the optimizer reason
across calls to these intrinsics.

Reviewers: tra

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D21160

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272298 91177308-0d34-0410-b5e6-96231b3b80d8

NFC cleanup of InitializePasses.h

- Alphabetically sort the initializeXXX calls (this was brought up in
D21115)
- Remove repeated function names from doxygen comments

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272297 91177308-0d34-0410-b5e6-96231b3b80d8

[NVPTX] Mark bar.sync intrinsic as convergent.

Summary:
__syncthreads, which corresponds to bar.sync 0, is already convergent.
This makes the more general bar.sync n likewise convergent.

Reviewers: tra

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D21161

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272295 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Port LCSSA to the new PM.

Differential Revision: http://reviews.llvm.org/D21090

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272294 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[lit] Use os.devnull instead of named temp files"

This reverts commit r272290. It breaks a test that depends on being able
to seek the /dev/null equivalent on Windows:

http://bb.pgr.jp/builders/ninja-clang-x64-mingw64-RA/builds/11360

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272293 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SI: Fix 32-bit fdiv lowering

We were using the fast fdiv lowering for all division. An implementation of
IEEE754 fdiv is added.

http://reviews.llvm.org/D20557

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272292 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Use os.devnull instead of named temp files

Use os.devnull instead of tempfiles when substituting '/dev/null' on
Windows machines. This should make the bots just a bit speedier.

Thanks to Yunzhong Gao for testing this patch on Windows!

Differential Revision: http://reviews.llvm.org/D20549

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272290 91177308-0d34-0410-b5e6-96231b3b80d8