granicus.if.org Git

GlobalsAA: Functions with the argmemonly attribute won't read arbitrary globals

Summary:
In preparation for changing GlobalsAA to stop assuming that intrinsics
can't read arbitrary globals, we need to make sure GlobalsAA is querying
function attributes rather than relying on this assumption.

This patch was inspired by: http://reviews.llvm.org/D20206

Reviewers: jmolloy, hfinkel

Subscribers: eli.friedman, llvm-commits

Differential Revision: https://reviews.llvm.org/D21318

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275433 91177308-0d34-0410-b5e6-96231b3b80d8

Don't optimize movs to pushes in -O0 builds.

https://reviews.llvm.org/D22362

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275431 91177308-0d34-0410-b5e6-96231b3b80d8

Delete some trailing whitespace.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275429 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Decode MPX BND registers.

We were able to assemble, but not disassemble.

Note that fixupRMValue was truncating EA_REG_BND0-3 because we hit
the uint8_t max. The control registers were already squarely above
it, but I don't think they ever go in .r/m, only in .reg.
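
The truncation described above is ordinary narrowing of an enumerator into an 8-bit field. A minimal sketch follows; the enumerator names and values are hypothetical (the real EA_REG_* values are not given in this message), chosen only so that they exceed the uint8_t maximum of 255:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register enumerators; only the fact that some values are
// larger than 255 matters for this illustration.
enum SketchReg : uint16_t {
  SKETCH_REG_LAST_SMALL = 255,
  SKETCH_REG_BND0 = 256,
  SKETCH_REG_BND1 = 257
};

// Storing the enumerator in an 8-bit field keeps only the low 8 bits.
inline uint8_t storeNarrow(SketchReg R) { return static_cast<uint8_t>(R); }

// A wider field preserves the value.
inline uint16_t storeWide(SketchReg R) { return static_cast<uint16_t>(R); }
```

With these made-up values, the narrow store maps SKETCH_REG_BND0 onto 0, so it can no longer be told apart from whatever enumerator legitimately has that value.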

I also did notice an extra REX.W in our encoding, but I think that's
fine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275427 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Don't mark addressing mode operands as "outs". NFC-ish.

Nothing in-tree can tell the difference, but it's incorrect: the
addressing mode registers aren't what's defined.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275426 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Autobrief-ize Record. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275425 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Cleanup Record comments. NFC.

LLVM doesn't use exceptions anymore.
Also remove the implementation comments. Some of them diverged.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275424 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Fix #include ordering/spacing. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275423 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Assembler: fix row_bcast parsing

Summary: This change fixes bug 28538.

Reviewers: tstellarAMD, vpykhtin

Subscribers: arsenm, kzhuravl

Differential Revision: https://reviews.llvm.org/D22355

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275422 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r275411, it caused PR28552.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275421 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r275401, it caused PR28551.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275420 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Avoid unnecessary IV scalar-to-vector-to-scalar conversions

This patch prevents increases in the number of instructions, pre-instcombine,
due to induction variable scalarization. An increase in instructions can lead
to an increase in the compile-time required to simplify the induction
variables. We now maintain a new map for scalarized induction variables to
prevent us from converting between the scalar and vector forms.

This patch should resolve compile-time regressions seen after r274627.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275419 91177308-0d34-0410-b5e6-96231b3b80d8

Teach fast isel calls and rets about stdcall.

stdcall is callee-pop like thiscall, so the thiscall changes already did most
of the work for this. This change only opts stdcall in and adds tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275414 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Added an additional vperm2f128 memory folding test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275413 91177308-0d34-0410-b5e6-96231b3b80d8

Remove trailing whitespace.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275412 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX2] Allow VPERMPD/VPERMQ shuffles to call combineShuffle

This improves the situation discussed in D19228 where we were forcing VPERMPD/VPERMQ where VPERM2F128/VPERM2I128 would have been better.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275411 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] SelectionDAGISel subclasses now follow the optimization level.

Summary:
It was recently discovered that, for Mips's SelectionDAGISel subclasses,
all optimization levels caused SelectionDAGISel to behave like -O2.

This change adds the necessary plumbing to initialize the optimization level.

Reviewers: andrew.w.kaylor

Subscribers: andrew.w.kaylor, sdardis, dean, llvm-commits, vradosavljevic, petarj, qcolombet, probinson, dsanders

Differential Revision: https://reviews.llvm.org/D14900

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275410 91177308-0d34-0410-b5e6-96231b3b80d8

Upgrade all the .arcconfigs to https.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275409 91177308-0d34-0410-b5e6-96231b3b80d8

Speculatively fix the sphinx build, which does not think the original code was valid nasm (http://lab.llvm.org:8011/builders/llvm-sphinx-docs/builds/11854/steps/docs-llvm-html/logs/stdio).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275408 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Add support for narrowing 128-bit+ shuffle mask elements to 64-bits to allow combining

Primarily this is to allow blend with zero instead of having to use vperm2f128, but we can use this in the future to deal with AVX512 cases where we need to keep the original element size to correctly fold masked operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275406 91177308-0d34-0410-b5e6-96231b3b80d8

This converts a signed remainder instruction to unsigned remainder, which
enables the code size optimisation to fold a rem and div into a single
aeabi_uidivmod call. This was not happening before because sdiv was converted
but srem not, and instructions with different signedness are not combined.

Differential Revision: http://reviews.llvm.org/D22214

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275403 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Add 128-bit wide shuffle tests that should combine to blend-with-zero

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275402 91177308-0d34-0410-b5e6-96231b3b80d8

code hoisting pass based on GVN

This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin in the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275401 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Add VBROADCASTF128/VBROADCASTI128 shuffle comments support

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275400 91177308-0d34-0410-b5e6-96231b3b80d8

Remove extra ';' to appease -Wpedantic

Summary:

Reviewers: dok

Subscribers: llvm-commits

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275399 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Regenerate broadcast upgrade tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275398 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX2] VBROADCASTSSrr/VBROADCASTSSYrr require AVX2 not AVX

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275391 91177308-0d34-0410-b5e6-96231b3b80d8

This implements a more optimal algorithm for selecting a base constant in
constant hoisting. It not only takes into account the number of uses and the
cost of expressions in which constants appear, but now also the resulting
integer range of the offsets. Thus, the algorithm maximizes the number of uses
within an integer range that will enable more efficient code generation. On
ARM, for example, this will enable code size optimisations because fewer
negative offsets will be created. Negative offsets/immediates are not supported
by Thumb1 thus preventing more compact instruction encoding.
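
As a rough illustration of the range idea only: the sketch below picks the candidate base that covers the most uses with an offset in [0, MaxOffset]. The types, names, and the single-range rule are assumptions made for this example; the pass itself also weighs the cost of the expressions, which is ignored here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: a constant and the number of times it is used.
struct SketchConst {
  int64_t Value;
  unsigned NumUses;
};

// Return the candidate whose non-negative offsets up to MaxOffset cover the
// most uses. Ties keep the earlier candidate. Overflow of the subtraction is
// not handled in this sketch.
inline int64_t pickBase(const std::vector<SketchConst> &Cands,
                        int64_t MaxOffset) {
  assert(!Cands.empty() && "need at least one candidate");
  int64_t Best = Cands.front().Value;
  unsigned BestUses = 0;
  for (const SketchConst &Base : Cands) {
    unsigned Covered = 0;
    for (const SketchConst &C : Cands) {
      const int64_t Off = C.Value - Base.Value;
      if (Off >= 0 && Off <= MaxOffset)
        Covered += C.NumUses;
    }
    if (Covered > BestUses) {
      BestUses = Covered;
      Best = Base.Value;
    }
  }
  return Best;
}
```

Choosing a base so that the remaining offsets are non-negative is what avoids the negative immediates mentioned above.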

Differential Revision: http://reviews.llvm.org/D21183

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275382 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Masked loads with undef masks can fold to normal loads

We were able to fold masked loads with an all-ones mask to a normal
load. However, we couldn't turn a masked load with a mask with mixed
ones and undefs into a normal load.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275380 91177308-0d34-0410-b5e6-96231b3b80d8

Simplify llvm.masked.load w/ undef masks

We can always pick the passthru value if the mask is undef: we are
permitted to treat the mask as-if it were filled with zeros.
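
Taken together with the InstCombine entry just above, the two folds can be modelled on a per-lane mask. This is a minimal sketch with hypothetical names (MaskLane, Fold, classifyMaskedLoad); it only classifies the mask and does not model pointers, alignment, or the IR:

```cpp
#include <cassert>
#include <vector>

// Hypothetical lane model for a constant mask.
enum class MaskLane { Zero, One, Undef };
enum class Fold { None, PlainLoad, PassThru };

inline Fold classifyMaskedLoad(const std::vector<MaskLane> &Mask) {
  bool AllZeroOrUndef = true;
  bool AllOneOrUndef = true;
  for (MaskLane L : Mask) {
    if (L == MaskLane::One)
      AllZeroOrUndef = false;
    if (L == MaskLane::Zero)
      AllOneOrUndef = false;
  }
  // Undef lanes may be read as zeros: the result is the passthru value.
  // An all-undef mask satisfies both tests; this sketch prefers passthru.
  if (AllZeroOrUndef)
    return Fold::PassThru;
  // Undef lanes may be read as ones: the result is a normal load.
  if (AllOneOrUndef)
    return Fold::PlainLoad;
  return Fold::None;
}
```

A mask that mixes known ones and known zeros matches neither case and stays a masked load in this model.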

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275379 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX512] Implement EXTLOAD lowering with patterns to select existing VPMOVZX instructions instead of creating CodeGenOnly instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275378 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix stupid typo in isel lowering.

Apparently someone miscounted the number of zeros in the immediate.
Fixes https://llvm.org/bugs/show_bug.cgi?id=28544 .

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275376 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/R600: Delete/rename intrinsics no longer used by mesa

Use the replacement pass to update the tests, and delete old names.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275375 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/R600: Remove intrinsics with no tests and no users

Mesa removed this path, so nothing is using these anymore.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275372 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove unused intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275371 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix test not actually testing anything

It wasn't actually running the pass, and since it is
missing the llvm prefix, the eh intrinsic was not
really an IntrinsicInst.

Also add missing test for lifetime markers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275370 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove dead code

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275369 91177308-0d34-0410-b5e6-96231b3b80d8

XRay: Add entry and exit sleds

Summary:
In this patch we implement the following parts of XRay:

- Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches.
- Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts).
- X86-specific nop sleds as described in the white paper.
- A machine function pass that adds the different instrumentation marker instructions at a very late stage.
- A way of identifying which return opcode is considered "normal" for each architecture.

There are some caveats here:

1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time, so that it is unpacked by default on platforms where XRay is not available yet.

2) The generated section for X86 is different from what is described in the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper in this respect to allow us to get richer information from the runtime library.

Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk

Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D19904

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275367 91177308-0d34-0410-b5e6-96231b3b80d8

[SCCP] Pass a Value * instead of templating this function. NFC.

Thanks to Eli for the suggestion!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275366 91177308-0d34-0410-b5e6-96231b3b80d8

clarify a bit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275364 91177308-0d34-0410-b5e6-96231b3b80d8

[IPSCCP] Constant fold struct argument/instructions when all the lattice values are constant.

This now should also work with the interprocedural variant of the pass.
Slightly easier now that the yak is shaved.

Differential Revision: http://reviews.llvm.org/D22329

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275363 91177308-0d34-0410-b5e6-96231b3b80d8

[Object] Re-apply r275316 now that I have the corresponding LLD patch ready.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275361 91177308-0d34-0410-b5e6-96231b3b80d8

Teach fast isel about thiscall (and callee-pop) calls.

http://reviews.llvm.org/D22315

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275360 91177308-0d34-0410-b5e6-96231b3b80d8

[Scalarizer] PR28108: Skip over nullptr rather than crashing on it.

Summary:
In Scalarizer::gather we see if we already have a scattered form of Op,
and in that case use the new form.

In the particular case of PR28108, the found ValueVector SV has size 2,
where the first Value is nullptr, and the second is indeed a proper Value.
The nullptr then caused an assert to blow when we tried to do
cast<Instruction>(SV[I]).

With this patch we check SV[I] before doing the cast, and if it's nullptr
we just skip over it.
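
The shape of that guard, in a self-contained sketch. SketchValue and collectIds are hypothetical stand-ins and are not the Scalarizer's own types or functions:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a value held in the scattered vector.
struct SketchValue {
  int Id;
};

// Visit every entry, skipping null slots instead of dereferencing them.
inline std::vector<int> collectIds(const std::vector<SketchValue *> &SV) {
  std::vector<int> Ids;
  for (SketchValue *V : SV) {
    if (!V)
      continue; // check the entry before using it
    Ids.push_back(V->Id);
  }
  return Ids;
}
```

A two-element vector whose first slot is null, as in the PR28108 case described above, simply yields the one valid entry.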

I don't know the Scalarizer well enough to know if this is the best fix
or if something should be done elsewhere to prevent the nullptr from
being in the ValueVector at all, but at least this avoids the crash,
and the test case output looks reasonable.

Reviewers: hfinkel, frasercrmck, wala, mehdi_amini

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21518

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275359 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing test for r275347 "[IPRA] Set callee saved registers to none for local function when IPRA is enabled."

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275358 91177308-0d34-0410-b5e6-96231b3b80d8

[SCCP] Generalize tryToReplaceInstWithConstant to work also with arguments.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275357 91177308-0d34-0410-b5e6-96231b3b80d8

MIRParser: Fix MIRParser not reporting nullptr on error.

While some code paths in MIRParserImpl::parse() already returned nullptr
in case of error one of the important ones did not.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275355 91177308-0d34-0410-b5e6-96231b3b80d8

Synchronize LLVM and clang's ObjCDeclSpec::ObjCPropertyAttributeKind.

This adds Clang-specific DWARF constants for nullability and ObjC
class properties that are already generated by clang. This patch adds
dwarfdump support and a more comprehensive testcase.

<rdar://problem/27335745>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275354 91177308-0d34-0410-b5e6-96231b3b80d8

[Object] Revert r275316, Archive::child_iterator changes, while I update lld.

Should fix the bots broken by r275316.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275353 91177308-0d34-0410-b5e6-96231b3b80d8

[ConstantFolding] Fold masked loads

We can constant fold a masked load if the operands are appropriately
constant.

Differential Revision: http://reviews.llvm.org/D22324

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275352 91177308-0d34-0410-b5e6-96231b3b80d8

Force a semicolon at the end of the LLVM_ENABLE_BITMASK_ENUMS_IN_NAMESPACE() macro.

This silences a warning about an extra semicolon on gcc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275349 91177308-0d34-0410-b5e6-96231b3b80d8

Add EnableIPRA to TargetOptions, and move the cl::opt -enable-ipra to TargetMachine.cpp

Avoid exposing a cl::opt in a public header and instead promote this
option in the API.
Alternatively, we could land the cl::opt in CommandFlags.h so that
it is available to every tool, but we would still have to find an
option for clang.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275348 91177308-0d34-0410-b5e6-96231b3b80d8

[IPRA] Set callee saved registers to none for local function when IPRA is enabled.

IPRA tries to optimize caller-saved registers by propagating register
usage information from callee to caller, so it is beneficial to have
caller-saved registers rather than callee-saved registers when IPRA
is enabled. A more detailed explanation is available here:
https://groups.google.com/d/msg/llvm-dev/XRzGhJ9wtZg/tjAJqb0eEgAJ.

This change makes local functions have no callee-preserved
registers when IPRA is enabled. A simple test case is also added to
verify this change.

Patch by Vivek Pandya <vivekvpandya@gmail.com>

Differential Revision: http://reviews.llvm.org/D21561

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275347 91177308-0d34-0410-b5e6-96231b3b80d8

[JumpThreading] Delete commented out debug code; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275346 91177308-0d34-0410-b5e6-96231b3b80d8

[ConstantFolding] Extend FoldReinterpretLoadFromConstPtr to handle negative offsets

Treat loads which clip before the start of a global initializer the same
way we treat clipping beyond the end of the initializer: use zeros.
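
A byte-level sketch of that rule, assuming the initializer is modelled as a plain byte vector. readWithZeroFill is a hypothetical helper written for this illustration, not the function named in the subject line:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read Len bytes starting at a signed Offset into Init. Bytes that fall
// before the start or past the end of Init are read as zero.
inline std::vector<uint8_t> readWithZeroFill(const std::vector<uint8_t> &Init,
                                             std::ptrdiff_t Offset,
                                             std::size_t Len) {
  std::vector<uint8_t> Out(Len, 0);
  const std::ptrdiff_t Size = static_cast<std::ptrdiff_t>(Init.size());
  for (std::size_t I = 0; I < Len; ++I) {
    const std::ptrdiff_t Pos = Offset + static_cast<std::ptrdiff_t>(I);
    if (Pos >= 0 && Pos < Size)
      Out[I] = Init[static_cast<std::size_t>(Pos)];
  }
  return Out;
}
```

With an initializer of {1, 2, 3, 4}, a 4-byte read at offset -2 yields {0, 0, 1, 2}, and the same read at offset 2 yields {3, 4, 0, 0}.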

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275345 91177308-0d34-0410-b5e6-96231b3b80d8

Move a transform from InstCombine to InstSimplify.

This transform doesn't require any new instructions, it can safely live
in InstSimplify.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275344 91177308-0d34-0410-b5e6-96231b3b80d8

Fix copy/paste bug in r275340.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275343 91177308-0d34-0410-b5e6-96231b3b80d8

MIRParser: Move SlotMapping and SourceMgr refs to PFS; NFC

Code cleanup: Move references to SlotMapping and SourceMgr into the
PerFunctionMIParsingState to avoid unnecessary passing around in
parameters.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275342 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG] Correctly chain masked loads

If a masked load is not added to the chain, it should not reset the chain's
root.

This fixes the remaining part of PR28515.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275340 91177308-0d34-0410-b5e6-96231b3b80d8

[SCCP] Have the logic for replacing insts with constant in a single place.

The code was pretty much copy-pasted between SCCP and IPSCCP. The situation
became clearly worse after I introduced the support for folding structs in
SCCP. This commit is NFC as we currently (still) skip the replacement
step in IPSCCP, but I'll change this soon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275339 91177308-0d34-0410-b5e6-96231b3b80d8

[Coverage] Return an ArrayRef to avoid copies (NFC)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275338 91177308-0d34-0410-b5e6-96231b3b80d8

[Coverage] Mark a few methods const (NFC)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275337 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] Don't hold on to DominatorTree in the analysis result

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275335 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] Don't hold on to TargetLibraryInfo in the analysis result

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275334 91177308-0d34-0410-b5e6-96231b3b80d8

[MIR] Fix one GlobalISel test case that I missed in r275314.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275333 91177308-0d34-0410-b5e6-96231b3b80d8

[MI] Clean up some loops over MachineInstr::memoperands(). NFC

Use range-based for loops and llvm::any_of instead of explicit
iterators.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275332 91177308-0d34-0410-b5e6-96231b3b80d8

[MI] Fix MachineInstr::isInvariantLoad.

Summary:
Previously it would say we had an invariant load if any of the memory
operands were invariant.  But the load should be invariant only if *all*
the memory operands are invariant.

No testcase because this has proven to be very difficult to tickle in
practice.  As just one example, ARM's ldrd instruction, which loads 64
bits into two 32-bit regs, is theoretically affected by this.  But when
it's produced, it loses its memoperands' invariance bits!
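
The difference between the old and new behaviour comes down to "any" versus "all". A minimal sketch with a hypothetical SketchMemOperand type; treating an instruction with no memory operands as not invariant is a conservative assumption of this sketch:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a memory operand.
struct SketchMemOperand {
  bool IsInvariant;
};

// Old behaviour as described above: one invariant operand was enough.
inline bool anyInvariant(const std::vector<SketchMemOperand> &Ops) {
  return std::any_of(Ops.begin(), Ops.end(),
                     [](const SketchMemOperand &M) { return M.IsInvariant; });
}

// New behaviour: every operand has to be invariant.
inline bool allInvariant(const std::vector<SketchMemOperand> &Ops) {
  if (Ops.empty())
    return false; // conservative choice made for this sketch
  return std::all_of(Ops.begin(), Ops.end(),
                     [](const SketchMemOperand &M) { return M.IsInvariant; });
}
```

For an instruction with one invariant and one non-invariant operand, the first predicate says yes and the second says no, which is the case this fix is about.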

Reviewers: jfb

Subscribers: llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D22318

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275331 91177308-0d34-0410-b5e6-96231b3b80d8

MIRParser: Move MachineFunction reference into PFS; NFC

Code cleanup: The PerFunctionMIParsingState is per function, so by moving a
reference into PFS we can avoid passing around the MachineFunction in an
extra parameter most of the time.

Also change most signatures to consistently pass PFS reference first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275329 91177308-0d34-0410-b5e6-96231b3b80d8

MIRYamlMapping: Update stale comment

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275328 91177308-0d34-0410-b5e6-96231b3b80d8

Add a triple to fix test on bots after 275320.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275327 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] Don't hold on to DataLayout in the analysis result

In fact, don't even pass this to the ctor since we can get it from the
module.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275326 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] Don't hold on to LoopInfo in the analysis result

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275325 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] Don't hold on to AliasAnalysis in the analysis result

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275322 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-cov] Use a thread pool to speed up report generation (NFC)

It's safe to print out source coverage views using multiple threads when
using the -output-dir mode of the `llvm-cov show` sub-command.

While testing this on my development machine, I observed that the speed
up is roughly linear with the number of available cores. Avg. time for
`llvm-cov show ./llvm-as -show-line-counts-or-regions`:

1 thread: 7.79s user 0.33s system 98% cpu 8.228 total
4 threads: 7.82s user 0.34s system 283% cpu 2.880 total

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275321 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a TODO in X86CallFrameOptimization to not rely on a codegen artifact.

This happens to make X86CallFrameOptimization work in -O0 / FastISel builds as well,
but it's not clear if the pass should run in that setup.

http://reviews.llvm.org/D22314

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275320 91177308-0d34-0410-b5e6-96231b3b80d8

Mark the textual headers in the module map for ProfileData

Follow on to r275312.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275319 91177308-0d34-0410-b5e6-96231b3b80d8

Extended LoadStoreVectorizer to vectorize subchains.

Summary:
LSV used to abort vectorizing a chain for interleaved load/store accesses that alias.
Allow a valid prefix of the chain to be vectorized, mark just the prefix and retry vectorizing the remaining chain.

Reviewers: llvm-commits, jlebar, arsenm

Subscribers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D22119

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275317 91177308-0d34-0410-b5e6-96231b3b80d8

[Object] Change Archive::child_iterator for better interop with Error/Expected.

See http://reviews.llvm.org/D22079

Changes the Archive::child_begin and Archive::children to require a reference
to an Error. If iterator increment fails (because the archive header is
damaged) the iterator will be set to 'end()', and the error stored in the
given Error&. The Error value should be checked by the user immediately after
the loop. E.g.:

  Error Err;
  for (auto &C : A->children(Err)) {
    // Do something with archive child C.
  }
  // Check the error immediately after the loop.
  if (Err)
    return Err;

Failure to check the Error will result in an abort() when the Error goes out of
scope (as guaranteed by the Error class).
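
The iteration pattern can be tried outside LLVM with mock types. Everything in this sketch (SketchError, SketchArchive, sumChildren, the negative-entry convention for a damaged header) is hypothetical; in particular SketchError does not abort when left unchecked, unlike the Error class described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an error value: an empty message means success.
struct SketchError {
  std::string Msg;
  explicit operator bool() const { return !Msg.empty(); }
};

// Hypothetical archive: a negative entry models a damaged header.
class SketchArchive {
  std::vector<int> Entries;

public:
  explicit SketchArchive(std::vector<int> E) : Entries(std::move(E)) {}

  class Iterator {
    const std::vector<int> *Data;
    std::size_t Pos;
    SketchError *Err;

    // On a damaged entry, record the error and jump to end().
    void validate() {
      if (Pos < Data->size() && (*Data)[Pos] < 0) {
        Err->Msg = "damaged header";
        Pos = Data->size();
      }
    }

  public:
    Iterator(const std::vector<int> *D, std::size_t P, SketchError *E)
        : Data(D), Pos(P), Err(E) {
      validate();
    }
    int operator*() const { return (*Data)[Pos]; }
    Iterator &operator++() {
      ++Pos;
      validate();
      return *this;
    }
    bool operator!=(const Iterator &O) const { return Pos != O.Pos; }
  };

  struct Range {
    Iterator B, E;
    Iterator begin() const { return B; }
    Iterator end() const { return E; }
  };

  Range children(SketchError &Err) const {
    return Range{Iterator(&Entries, 0, &Err),
                 Iterator(&Entries, Entries.size(), &Err)};
  }
};

// Sum the children; report failure if iteration stopped on an error.
inline bool sumChildren(const SketchArchive &A, int &Sum, SketchError &Err) {
  Sum = 0;
  for (int C : A.children(Err))
    Sum += C;
  return !Err; // check the error immediately after the loop
}
```

The loop body never sees the failure directly: a damaged entry just ends the loop early, which is why the check after the loop is not optional.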

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275316 91177308-0d34-0410-b5e6-96231b3b80d8

[MIR] Print on the given output instead of stderr.

Currently the MIR framework prints all its outputs (errors and actual
representation) on stderr.

This patch fixes that by printing the regular output in the output
specified with -o.

Differential Revision: http://reviews.llvm.org/D22251

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275314 91177308-0d34-0410-b5e6-96231b3b80d8

Define a module map entry for ProfileData.

As per Richard Smith, this should help avoid a modules bug exposed
by my r275216 commit:
http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17560

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275312 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove last AMDIL intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275309 91177308-0d34-0410-b5e6-96231b3b80d8

[SCCP] Factor out common code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275308 91177308-0d34-0410-b5e6-96231b3b80d8

[SCCP] Use early return. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275307 91177308-0d34-0410-b5e6-96231b3b80d8

Reverting r275284 due to platform-specific test failures

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275304 91177308-0d34-0410-b5e6-96231b3b80d8

add more tests for zexty xor sandwiches

...mmm sandwiches

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275302 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Regenerate truncated shift test

Check SSE2 and AVX2 implementations

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275300 91177308-0d34-0410-b5e6-96231b3b80d8

Regenerate test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275299 91177308-0d34-0410-b5e6-96231b3b80d8

add test for zexty xor sandwich

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275297 91177308-0d34-0410-b5e6-96231b3b80d8

Fix header comment in unittests/CodeGen/DIEHashTest.cpp.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275296 91177308-0d34-0410-b5e6-96231b3b80d8

Move mempcpy_call.ll to X86 subdirectory

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275294 91177308-0d34-0410-b5e6-96231b3b80d8

Fix warning in ObjectTransformLayerTest.

Doing "I++" inside of an EXPECT_* triggers

warning: expression with side effects has no effect in an unevaluated context

because EXPECT_* partially expands to

EqHelper<(sizeof(::testing::internal::IsNullLiteralHelper(MockObjects[I++] + 1)) == 1)>

which is an unevaluated context.
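
One way to avoid this kind of warning is to keep the increment out of the macro argument. The sketch below uses a hypothetical SKETCH_EXPECT_EQ macro that, like the expansion quoted above, mentions its first argument once inside sizeof and once in the real comparison; it is not gtest's macro, and the actual change in this commit may differ.

```cpp
#include <cassert>

// Hypothetical check macro: the first argument appears inside sizeof (never
// evaluated) and again in the comparison (evaluated once).
#define SKETCH_EXPECT_EQ(A, B) (static_cast<void>(sizeof(A)), (A) == (B))

// Count matching elements. The index is advanced in its own statement, so no
// expression with side effects is passed to the macro.
inline int countMatches(const int *Expected, const int *Actual, int N) {
  int Matches = 0;
  int I = 0;
  while (I < N) {
    if (SKETCH_EXPECT_EQ(Actual[I], Expected[I]))
      ++Matches;
    ++I;
  }
  return Matches;
}
```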

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275293 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Add LLVM_MARK_AS_BITMASK_ENUM, used to enable bitwise operations on enums without static_cast.

Summary: Normally when you do a bitwise operation on an enum value, you
get back an instance of the underlying type (e.g. int).  But using this
macro, bitwise ops on your enum will return you back instances of the
enum.  This is particularly useful for enums which represent a
combination of flags.

Suppose you have a function which takes an int and a set of flags.  One
way to do this would be to take two numeric params:

  enum SomeFlags { F1 = 1, F2 = 2, F3 = 4, ... };
  void Fn(int Num, int Flags);

  void foo() {
    Fn(42, F2 | F3);
  }

But now if you get the order of arguments wrong, you won't get an error.

You might try to fix this by changing the signature of Fn so it accepts
a SomeFlags arg:

  enum SomeFlags { F1 = 1, F2 = 2, F3 = 4, ... };
  void Fn(int Num, SomeFlags Flags);

  void foo() {
    Fn(42, static_cast<SomeFlags>(F2 | F3));
  }

But now we need a static cast after doing "F2 | F3" because the result
of that computation is the enum's underlying type.

This patch adds a mechanism which gives us the safety of the second
approach with the brevity of the first.

  enum SomeFlags {
    F1 = 1, F2 = 2, F3 = 4, ..., F_MAX = 128,
    LLVM_MARK_AS_BITMASK_ENUM(F_MAX)
  };

  void Fn(int Num, SomeFlags Flags);

  void foo() {
    Fn(42, F2 | F3);  // No static_cast.
  }

The LLVM_MARK_AS_BITMASK_ENUM macro enables overloads for bitwise
operators on SomeFlags.  Critically, these operators return the enum
type, not its underlying type, so you don't need any static_casts.

An advantage of this solution over the previously-proposed BitMask class
[0, 1] is that we don't need any wrapper classes -- we can operate
directly on the enum itself.

The approach here is somewhat similar to OpenOffice's typed_flags_set
[2].  But we skirt the need for a wrapper class (and a good deal of
complexity) by judicious use of enable_if.  We SFINAE on the presence of
a particular enumerator (added by the LLVM_MARK_AS_BITMASK_ENUM macro)
instead of using a traits class so that it's impossible to use the enum
before the overloads are present.  The solution here also seamlessly
works across multiple namespaces.
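
The SFINAE idea can be shown in a few lines. This is a simplified sketch, not the header added by this patch: the SKETCH_* names, the marker enumerator, and IsSketchBitmaskEnum are hypothetical, and only operator| is provided.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical marker: adds one specially named enumerator to the enum.
#define SKETCH_MARK_AS_BITMASK_ENUM(LargestValue)                              \
  SKETCH_BITMASK_LARGEST = LargestValue

// True only for types that have the marker enumerator.
template <typename E, typename = void>
struct IsSketchBitmaskEnum : std::false_type {};

template <typename E>
struct IsSketchBitmaskEnum<
    E, decltype(static_cast<void>(E::SKETCH_BITMASK_LARGEST))>
    : std::true_type {};

// Participates in overload resolution only for marked enums, and returns the
// enum type rather than its underlying type.
template <typename E>
typename std::enable_if<IsSketchBitmaskEnum<E>::value, E>::type
operator|(E LHS, E RHS) {
  using U = typename std::underlying_type<E>::type;
  return static_cast<E>(static_cast<U>(LHS) | static_cast<U>(RHS));
}

enum SketchFlags {
  F1 = 1, F2 = 2, F3 = 4, F_MAX = 4,
  SKETCH_MARK_AS_BITMASK_ENUM(F_MAX)
};
enum PlainFlags { P1 = 1, P2 = 2 };

static_assert(std::is_same<decltype(F2 | F3), SketchFlags>::value,
              "a marked enum keeps its type");
static_assert(std::is_same<decltype(P1 | P2), int>::value,
              "an unmarked enum still promotes to int");

inline int Fn(int Num, SketchFlags Flags) {
  return Num + static_cast<int>(Flags);
}
```

Because the overload is keyed on an enumerator that lives inside the enum definition, there is no point in the program where the enum exists but the operator does not apply to it. In this sketch two marked unscoped enums in the same scope would clash on the marker name.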

[0] http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150622/283369.html
[1] http://lists.llvm.org/pipermail/llvm-commits/attachments/20150623/073434b6/attachment.obj
[2] https://cgit.freedesktop.org/libreoffice/core/tree/include/o3tl/typed_flags_set.hxx

Reviewers: chandlerc, rsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22279

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275292 91177308-0d34-0410-b5e6-96231b3b80d8

Fix warnings in FunctionTest.cpp.

Because of the goop involved in the EXPECT_EQ macro, we were getting the
following warning

expression with side effects has no effect in an unevaluated context

because the "I++" was being used inside of a template type:

switch (0) case 0: default: if (const ::testing::AssertionResult gtest_ar = (::testing::internal:: EqHelper<(sizeof(::testing::internal::IsNullLiteralHelper(Args[I++])) == 1)>::Compare("Args[I++]", "&A", Args[I++], &A))) ; else ::testing::internal::AssertHelper(::testing::TestPartResult::kNonFatalFailure, "../src/unittests/IR/FunctionTest.cpp", 94, gtest_ar.failure_message()) = ::testing::Message();

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275291 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] extend vector select matching for non-splat constants

In D21740, we discussed trying to make this a more general matcher. However, I didn't see a clean
way to handle the regular m_Not cases and these non-splat vector patterns, so I've opted for the
direct approach here. If there are other potential uses of areInverseVectorBitmasks(), we could
move that helper function to a higher level.

There is an open question as to which of these forms should be considered the canonical IR:
%sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i32> %a, <4 x i32> %b
%shuf = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 5, i32 6, i32 3>
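
The two forms above compute the same result. A small sketch of the correspondence, using hypothetical helper names and plain integer vectors in place of IR values: lane I of the shuffle mask is I when the select condition is true and I + N otherwise, where N is the vector length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Convert a constant select condition into a two-input shuffle mask.
// Indices 0..N-1 pick from the first vector, N..2N-1 from the second.
inline std::vector<int> selectMaskToShuffleMask(const std::vector<bool> &Cond) {
  const std::size_t N = Cond.size();
  std::vector<int> Mask(N);
  for (std::size_t I = 0; I < N; ++I)
    Mask[I] = static_cast<int>(Cond[I] ? I : I + N);
  return Mask;
}

inline std::vector<int> applySelect(const std::vector<bool> &Cond,
                                    const std::vector<int> &A,
                                    const std::vector<int> &B) {
  std::vector<int> Out(Cond.size());
  for (std::size_t I = 0; I < Cond.size(); ++I)
    Out[I] = Cond[I] ? A[I] : B[I];
  return Out;
}

inline std::vector<int> applyShuffle(const std::vector<int> &A,
                                     const std::vector<int> &B,
                                     const std::vector<int> &Mask) {
  const std::size_t N = A.size();
  std::vector<int> Out(Mask.size());
  for (std::size_t I = 0; I < Mask.size(); ++I) {
    const std::size_t Idx = static_cast<std::size_t>(Mask[I]);
    Out[I] = Idx < N ? A[Idx] : B[Idx - N];
  }
  return Out;
}
```

For the condition <true, false, false, true> this produces the mask <0, 5, 6, 3>, matching the shufflevector shown above.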

Differential Revision: http://reviews.llvm.org/D22114

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275289 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SI: Emit the number of SGPR and VGPR spills

Summary:
v2: don't count SGPRs spilled to scratch twice

I think this is sufficient. It doesn't count private memory usage, which
happens often and uses scratch but isn't technically a spill. The private
memory usage can be computed by:
[scratch_per_thread - vgpr_spills - a random multiple of SGPR spills].

The fact that SGPR spills add very high numbers to the scratch size makes that
computation a guessing game, but I don't have a solution to that.

Reviewers: tstellarAMD

Subscribers: arsenm, kzhuravl

Differential Revision: http://reviews.llvm.org/D22197

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275288 91177308-0d34-0410-b5e6-96231b3b80d8

Fix for Bug 26903, adds support to inline __builtin_mempcpy

Patch by Sunita Marathe

Differential Revision: http://reviews.llvm.org/D21920

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275284 91177308-0d34-0410-b5e6-96231b3b80d8

PR28516: Fix LangRef description of call and invoke to match IR changes for typeless pointers

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275283 91177308-0d34-0410-b5e6-96231b3b80d8

PatchableFunction: Skip pseudos that do not create code

This fixes http://llvm.org/PR28524

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275278 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO/gold] Enable symbol resolution in distributed backend case

While testing a follow-on change to enable index-based symbol resolution
and internalization in the distributed backends, I realized that a test
case change I made in r275247 was only required because we were not
analyzing symbols in the claimed files in thinlto-index-only mode.

In the fixed test case there should be no internalization because we are
linking in -shared mode, so f() is in fact exported, which is detected
properly when we analyze symbols in thinlto-index-only mode. Note that
this is not (yet) a correctness issue (because we are not yet performing
the index-based linkage optimizations in the distributed backends -
that's coming in a follow-on patch).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275277 91177308-0d34-0410-b5e6-96231b3b80d8

[x86][SSE/AVX] optimize pcmp results better (PR28484)

We know that pcmp produces all-ones/all-zeros bitmasks, so we can use that behavior to avoid unnecessary constant loading.

One could argue that load+and is actually a better solution for some CPUs (Intel big cores) because shifts don't have the
same throughput potential as load+and on those cores, but that should be handled as a CPU-specific later transformation if
it ever comes up. Removing the load is the more general x86 optimization. Note that the uneven usage of vpbroadcast in the
test cases is filed as PR28505:
https://llvm.org/bugs/show_bug.cgi?id=28505
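
A scalar sketch of why a shift can stand in for load+and on a compare result. It assumes, purely for illustration, that the constant being avoided is a mask of 1 on 32-bit lanes; the function names are hypothetical and model a single lane rather than the vector instructions.

```cpp
#include <cassert>
#include <cstdint>

// One lane of a compare-equal: all ones when equal, all zeros otherwise.
inline uint32_t cmpEqLane(uint32_t A, uint32_t B) {
  return A == B ? 0xFFFFFFFFu : 0u;
}

// Needs the constant 1, which would be loaded from memory in vector code.
inline uint32_t viaAnd(uint32_t Mask) { return Mask & 1u; }

// Needs no constant: a logical shift right brings the top bit down.
inline uint32_t viaShift(uint32_t Mask) { return Mask >> 31; }
```

The two agree only because the input is known to be all ones or all zeros; for an arbitrary value such as 2 they differ.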

Differential Revision: http://reviews.llvm.org/D22225

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275276 91177308-0d34-0410-b5e6-96231b3b80d8