granicus.if.org Git - llvm/log

]> granicus.if.org Git - llvm/log

projects / llvm / log

summary | shortlog | log | commit | commitdiff | tree
first ⋅ prev ⋅ next

commit | commitdiff | tree

Pirama Arumuga Nainar [Mon, 21 Aug 2017 20:49:44 +0000 (20:49 +0000)]

[Support, Windows] Handle long paths with unix separators

Summary:
The function widenPath() for Windows also normalizes long path names by
iterating over the path's components and calling append(). The
assumption during the iteration that separators are not returned by the
iterator doesn't hold because the iterators do return a separator when
the path has a drive name. Handle this case by ignoring separators
during iteration.

Reviewers: rnk

Subscribers: danalbert, srhines

Differential Revision: https://reviews.llvm.org/D36752

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311382 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sanjoy Das [Mon, 21 Aug 2017 20:39:18 +0000 (20:39 +0000)]

Revert "Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators"

Summary: This partially reverts commit r311057 since it breaks ADCE. See PR34258.

Reviewers: kuhar

Subscribers: mcrosier, david2050, llvm-commits

Differential Revision: https://reviews.llvm.org/D36979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311381 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sam Elliott [Mon, 21 Aug 2017 20:30:44 +0000 (20:30 +0000)]

[ORE] Remove Old Optimization Remark API

Summary: https://bugs.llvm.org/show_bug.cgi?id=33789

Reviewers: anemet

Reviewed By: anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36972

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311380 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Zachary Turner [Mon, 21 Aug 2017 20:17:19 +0000 (20:17 +0000)]

[PDB] Serialize records into a stack-allocated buffer.

We were using a std::vector<> and resizing to MaxRecordLength,
which is ~64KB. We would then do this repeatedly often many
times in a tight loop, which was causing measurable performance
impact when linking PDBs.

Patch by Alex Telishev
Differential Revision: https://reviews.llvm.org/D36940

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311375 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

George Karpenkov [Mon, 21 Aug 2017 20:12:58 +0000 (20:12 +0000)]

Always compile libFuzzer with no coverage

Do not compile libFuzzer itself with coverage, regardless of LLVM variables

Differential Revision: https://reviews.llvm.org/D36887

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311374 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Zachary Turner [Mon, 21 Aug 2017 20:08:40 +0000 (20:08 +0000)]

[lld/pdb] Speed up construction of publics & globals addr map.

computeAddrMap function calls std::stable_sort with a comparison
function that computes deserialized symbols every time its called.
In the result deserializeAs<PublicSym32> is called 20-30 times per
symbol. It's much faster to calculate it beforehand and pass a
pointer to it to the comparison function.

Patch by Alex Telishev
Differential Revision: https://reviews.llvm.org/D36941

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311373 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Haicheng Wu [Mon, 21 Aug 2017 20:00:09 +0000 (20:00 +0000)]

[InlineCost] Add cl::opt to allow full inline cost to be computed for debugging purposes.

Currently, the inline cost model will bail once the inline cost exceeds the
inline threshold in order to avoid unnecessary compile-time. However, when
debugging it is useful to compute the full cost, so this command line option
is added to override the default behavior.

I took over this work from Chad Rosier (mcrosier@codeaurora.org).

Differential Revision: https://reviews.llvm.org/D35850

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311371 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Chad Rosier [Mon, 21 Aug 2017 19:56:46 +0000 (19:56 +0000)]

[InlineCost] Add more debug during inline cost computation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311370 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Zachary Turner [Mon, 21 Aug 2017 19:46:46 +0000 (19:46 +0000)]

[BinaryStream] Defaultify copy and move constructors.

The various BinaryStream classes had explicit copy constructors
which resulted in deleted move constructors. This was causing
the internal std::shared_ptr to get copied rather than moved
very frequently, since these classes are often used as return
values.

Patch by Alex Telishev
Differential Revision: https://reviews.llvm.org/D36942

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311368 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sanjay Patel [Mon, 21 Aug 2017 19:13:14 +0000 (19:13 +0000)]

[LibCallSimplifier] try harder to fold memcmp with constant arguments (2nd try)

The 1st try was reverted because it could inf-loop by creating a dead instruction.
Fixed that to not happen and added a test case to verify.

Original commit message:

Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C

Without this change, we're unnecessarily checking the alignment of the constant data,
so we miss the transform in the first 2 tests in the patch.

I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're
already trying to do that.

The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to
multiple memcmp calls, CSE can remove the duplicate instructions.

Differential Revision: https://reviews.llvm.org/D36922

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311366 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Craig Topper [Mon, 21 Aug 2017 19:02:06 +0000 (19:02 +0000)]

[InstCombine] Teach foldSelectICmpAnd to recognize a (icmp slt X, 0) and (icmp sgt X, -1) as equivalent to an and with the sign bit of the truncated type

This is similar to what was already done in foldSelectICmpAndOr. Ultimately I'd like to see if we can call foldSelectICmpAnd from foldSelectIntoOp if we detect a power of 2 constant. This would allow us to remove foldSelectICmpAndOr entirely.

Differential Revision: https://reviews.llvm.org/D36498

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311362 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Justin Bogner [Mon, 21 Aug 2017 17:57:12 +0000 (17:57 +0000)]

Revert "Introduce FuzzMutate library"

Looks like this fails to build with libstdc++.

This reverts r311356

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311358 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Justin Bogner [Mon, 21 Aug 2017 17:44:36 +0000 (17:44 +0000)]

Introduce FuzzMutate library

This introduces the FuzzMutate library, which provides structured
fuzzing for LLVM IR, as described in my [EuroLLVM 2017 talk][1]. Most
of the basic mutators to inject and delete IR are provided, with
support for most basic operations.

I will follow up with the instruction selection fuzzer, which is
implemented in terms of this library.

[1]: http://llvm.org/devmtg/2017-03//2017/02/20/accepted-sessions.html#2

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311356 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sean Fertile [Mon, 21 Aug 2017 17:35:32 +0000 (17:35 +0000)]

[PPC] Refine checks for emiting TOC restore nop and tail-call eligibility.

For the medium and large code models we only need to check if a call crosses
dso-boundaries when considering tail-call elgibility.

Differential Revision: https://reviews.llvm.org/D34245

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311353 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sam Elliott [Mon, 21 Aug 2017 16:57:21 +0000 (16:57 +0000)]

Migrate WholeProgramDevirt to new Optimization Remark API

Summary:
This is an attempt to move WholeProgramDevirt to the new remark API.

https://bugs.llvm.org/show_bug.cgi?id=33793

Reviewers: anemet

Reviewed By: anemet

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D36943

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311352 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Davide Italiano [Mon, 21 Aug 2017 16:51:54 +0000 (16:51 +0000)]

[APFloat] Fix IsInteger() for DoubleAPFloat.

Previously, we would just assert instead.

Differential Revision: https://reviews.llvm.org/D36961

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311351 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sanjay Patel [Mon, 21 Aug 2017 16:47:12 +0000 (16:47 +0000)]

[InstCombine] add tests for memcmp with constant; NFC

This is the baseline (current) version of the tests that would
have been added with the transform in r311333 (reverted at
r311340 due to inf-looping).

Adding these now to aid in testing and minimize the patch if/when
it is reinstated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311350 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sam Elliott [Mon, 21 Aug 2017 16:45:47 +0000 (16:45 +0000)]

Emit only A Single Opt Remark When Inlining

Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33786

Reviewers: anemet, davidxl, chandlerc

Reviewed By: anemet

Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D36054

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311349 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Craig Topper [Mon, 21 Aug 2017 16:04:11 +0000 (16:04 +0000)]

[InstCombine] Fix a weakness in canEvaluateZExtd around 'and' instructions

Summary:
If the bitsToClear from the LHS of an 'and' comes back non-zero, but all of those bits are known zero on the RHS, we can reset bitsToClear.

Without this, the 'or' in the modified test case blocks the transform because it has non-zero bits in its RHS in those bits.

Reviewers: spatel, majnemer, davide

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36944

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311343 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Craig Topper [Mon, 21 Aug 2017 16:04:04 +0000 (16:04 +0000)]

[X86] When selecting sse_load_f32/f64 pattern, make sure there's only one use of every node all the way back to the root of the match

Summary: With masked operations, its possible for the operation node like fadd, fsub, etc. to be used by multiple different vselects. Since the pattern matching will start at the vselect, we need to make sure the operation node itself is only used once before we can fold a load. Otherwise we'll end up folding the same load into multiple instructions.

Reviewers: RKSimon, spatel, zvi, igorb

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36938

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311342 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Xinliang David Li [Mon, 21 Aug 2017 16:00:38 +0000 (16:00 +0000)]

Revert 311208, 311209

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311341 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sanjay Patel [Mon, 21 Aug 2017 15:16:25 +0000 (15:16 +0000)]

revert r311333: [LibCallSimplifier] try harder to fold memcmp with constant arguments

We're getting lots of compile-timeout bot failures like:
http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/7119
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311340 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sanjay Patel [Mon, 21 Aug 2017 15:11:39 +0000 (15:11 +0000)]

[InstCombine] add vector tests; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311339 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Zachary Turner [Mon, 21 Aug 2017 14:53:25 +0000 (14:53 +0000)]

[llvm-pdbutil] Add support for dumping detailed module stats.

This adds support for dumping a summary of module symbols
and CodeView debug chunks. This option prints a table for
each module of all of the symbols that occurred in the module
and the number of times it occurred and total byte size. Then
at the end it prints the totals for the entire file.

Additionally, this patch adds the -jmc (just my code) option,
which suppresses modules which are from external libraries or
linker imports, so that you can focus only on the object files
and libraries that originate from your own source code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311338 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sanjay Patel [Mon, 21 Aug 2017 14:34:06 +0000 (14:34 +0000)]

[InstCombine] regenerate test checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311337 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sanjay Patel [Mon, 21 Aug 2017 13:55:49 +0000 (13:55 +0000)]

[LibCallSimplifier] try harder to fold memcmp with constant arguments

Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C

Without this change, we're unnecessarily checking the alignment of the constant data,
so we miss the transform in the first 2 tests in the patch.

I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're
already trying to do that.

The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to
multiple memcmp calls, CSE can remove the duplicate instructions.

Differential Revision: https://reviews.llvm.org/D36922

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311333 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Stefan Pintilie [Mon, 21 Aug 2017 13:36:18 +0000 (13:36 +0000)]

[PowerPC] Check if the pre-increment PHI Node already exists

Preparations to use the per-increment are sometimes done in the target
independent pass Loop Strength Reduction. We try to detect them in the PowerPC
specific pass so that they are not done twice and so that we do not add PHIs
that are not required.

Differential Revision: https://reviews.llvm.org/D36736

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311332 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Igor Breger [Mon, 21 Aug 2017 10:51:54 +0000 (10:51 +0000)]

[GlobalISel][X86] Support G_BRCOND operation.

Summary: Support G_BRCOND operation. For now don't try to fold cmp/trunc instructions.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D34754

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311327 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Oliver Stannard [Mon, 21 Aug 2017 09:58:37 +0000 (09:58 +0000)]

[AsmParser] Recommit: Hash is not a comment on some targets

Re-committing after r311325 fixed an unintentional use of '#' comments in
clang.

The '#' token is not a comment for all targets (on ARM and AArch64 it marks an
immediate operand), so we shouldn't treat it as such.

Comments are already converted to AsmToken::EndOfStatement by
AsmLexer::LexLineComment, so this check was unnecessary.

Differential Revision: https://reviews.llvm.org/D36405

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311326 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Igor Breger [Mon, 21 Aug 2017 09:17:28 +0000 (09:17 +0000)]

[GlobalISel][X86] InstructionSelector, for now use fallback path for LOAD_STACK_GUARD and PHI nodes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311323 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Igor Breger [Mon, 21 Aug 2017 08:59:59 +0000 (08:59 +0000)]

[GlobalISel][X86] LowerCall, for now don't handel ByValue function arguments.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311321 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Michael Zuckerman [Mon, 21 Aug 2017 08:56:39 +0000 (08:56 +0000)]

[InterLeaved] Adding lit test for future work interleaved load strid 3

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311320 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Chandler Carruth [Mon, 21 Aug 2017 08:45:22 +0000 (08:45 +0000)]

[x86] Teach the "generic" x86 CPU to avoid patterns that are slow on
widely used processors.

This occured to me when I saw that we were generating 'inc' and 'dec'
when for Haswell and newer we shouldn't. However, there were a few "X is
slow" things that we should probably just set.

I've avoided any of the "X is fast" features because most of those would
be pretty serious regressions on processors where X isn't actually fast.
The slow things are likely to be negligible costs on processors where
these aren't slow and a significant win when they are slow.

In retrospect this seems somewhat obvious. Not sure why we didn't do
this a long time ago.

Differential Revision: https://reviews.llvm.org/D36947

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311318 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Chandler Carruth [Mon, 21 Aug 2017 08:45:19 +0000 (08:45 +0000)]

[x86] Handle more cases where we can re-use an atomic operation's flags
rather than doing a separate comparison.

This both saves an explicit comparision and avoids the use of `xadd`
which introduces register constraints and other challenges to the
generated code.

The motivating case is from atomic reference counts where `1` is the
sentinel rather than `0` for whatever reason. This can and should be
lowered efficiently on x86 by just using a different flag, however the
x86 code only handled the `0` case.

There remains some further opportunities here that are currently hidden
due to canonicalization. I've included test cases that show these and
FIXMEs. However, I don't at the moment have any production use cases and
they seem substantially harder to address.

Differential Revision: https://reviews.llvm.org/D36945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311317 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sam Parker [Mon, 21 Aug 2017 08:43:06 +0000 (08:43 +0000)]

[ARM][AArch64] Cortex-A75 and Cortex-A55 support

This patch introduces support for Cortex-A75 and Cortex-A55, Arm's
latest big.LITTLE A-class cores. They implement the ARMv8.2-A
architecture, including the cryptography and RAS extensions, plus
the optional dot product extension. They also implement the RCpc
AArch64 extension from ARMv8.3-A.

Cortex-A75:
https://developer.arm.com/products/processors/cortex-a/cortex-a75

Cortex-A55:
https://developer.arm.com/products/processors/cortex-a/cortex-a55

Differential Revision: https://reviews.llvm.org/D36667

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311316 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

George Rimar [Mon, 21 Aug 2017 08:00:54 +0000 (08:00 +0000)]

[Support/Parallel] - Do not use a task group for a very small task.

parallel_for_each_n splits a given task into small pieces of tasks and then
passes them to background threads managed by a thread pool to process them
in parallel. TaskGroup then waits for all tasks to be done, which is done by
TaskGroup's destructor.

In the previous code, all tasks were passed to background threads, and the
main thread just waited for them to finish their jobs. This patch changes
the logic so that the main thread processes a task just like other
worker threads instead of just waiting for workers.

This patch improves the performance of parallel_for_each_n for a task which
is too small that we do not split it into multiple tasks. Previously, such task
was submitted to another thread and the main thread waited for its completion.
That involves multiple inter-thread synchronization which is not cheap for
small tasks. Now, such task is processed by the main thread, so no inter-thread
communication is necessary.

Differential revision: https://reviews.llvm.org/D36607

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311312 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Coby Tayree [Mon, 21 Aug 2017 07:50:15 +0000 (07:50 +0000)]

[X86] Allow xacquire/xrelease prefixes

Allow those prefixes on assembly code
Differential Revision: https://reviews.llvm.org/D36845

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311309 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Craig Topper [Mon, 21 Aug 2017 05:29:02 +0000 (05:29 +0000)]

[AVX-512] Don't change which instructions we use for unmasked subvector broadcasts when AVX512DQ is enabled.

There's no functional difference between the AVX512DQ instructions if we're not masking.

This change unifies test checks and removes extra isel entries. Similar was done for subvector insert and extracts recently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311308 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Craig Topper [Mon, 21 Aug 2017 05:03:28 +0000 (05:03 +0000)]

[AVX512] Add 128->256 vbroadcastf64x2/vbroadcasti64x2 instructions to the EVEX->VEX table.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311307 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Dean Michael Berris [Mon, 21 Aug 2017 00:14:06 +0000 (00:14 +0000)]

[XRay][tools] Support new kinds of instrumentation map entries

Summary:
When extracting the instrumentation map from a binary, we should be able
to recognize the new kinds of instrumentation sleds we've been emitting
with the compiler using -fxray-instrument. This change adds a test for
all the kinds of sleds we currently support (sans the tail-call sled,
which is a bit harder to force in a simple prebuilt input).

Reviewers: kpw, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36819

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311305 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Chandler Carruth [Sun, 20 Aug 2017 23:17:11 +0000 (23:17 +0000)]

Revert r311077: [LV] Using VPlan ...

This causes LLVM to assert fail on PPC64 and crash / infloop in other
cases. Filed http://llvm.org/PR34248 with reproducer attached.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311304 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Craig Topper [Sun, 20 Aug 2017 21:38:28 +0000 (21:38 +0000)]

[InstCombine] Add a test case for a weakness in canEvaluateZExtd. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311303 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Craig Topper [Sun, 20 Aug 2017 19:47:00 +0000 (19:47 +0000)]

[AVX512] Add a test to check what happens when a load is referenced by two different masked scalar intrinsics with the same op inputs, but different masking node.

We're missing some single use checks in the sse_load_f32/f64 handling that cause us to replicate the load.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311300 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Kuba Mracek [Sun, 20 Aug 2017 18:31:30 +0000 (18:31 +0000)]

Fix archive-update.test after r311296.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311299 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Craig Topper [Sun, 20 Aug 2017 18:30:24 +0000 (18:30 +0000)]

[AVX-512] Use a scalar load pattern for FPCLASSSS/FPCLASSSD patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311297 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Kuba Mracek [Sun, 20 Aug 2017 18:18:44 +0000 (18:18 +0000)]

Remove uses of "%T" from test/Object/archive-* tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311296 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Benjamin Kramer [Sun, 20 Aug 2017 17:30:32 +0000 (17:30 +0000)]

[NVPTX] Reduce copypasta.

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311295 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Kuba Mracek [Sun, 20 Aug 2017 17:05:22 +0000 (17:05 +0000)]

Get rid of even more "%T" expansions, see <https://reviews.llvm.org/D35396>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311294 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Kuba Mracek [Sun, 20 Aug 2017 17:00:08 +0000 (17:00 +0000)]

Get rid of some more "%T" expansions, see <https://reviews.llvm.org/D35396>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311293 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Benjamin Kramer [Sun, 20 Aug 2017 15:13:39 +0000 (15:13 +0000)]

[MachO] Use Twines more efficiently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311291 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Benjamin Kramer [Sun, 20 Aug 2017 14:34:44 +0000 (14:34 +0000)]

[Mem2Reg] Modernize code a bit.

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311290 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Benjamin Kramer [Sun, 20 Aug 2017 13:03:48 +0000 (13:03 +0000)]

Move helper classes into anonymous namespaces.

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311288 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Benjamin Kramer [Sun, 20 Aug 2017 13:03:32 +0000 (13:03 +0000)]

[dlltool] Make memory buffer ownership less weird.

There's no reason to destroy them in a global destructor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311287 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Elena Demikhovsky [Sun, 20 Aug 2017 12:34:29 +0000 (12:34 +0000)]

Changed basic cost of store operation on X86

Store operation takes 2 UOps on X86 processors. The exact cost calculation affects several optimization passes including loop unroling.
This change compensates performance degradation caused by https://reviews.llvm.org/D34458 and shows improvements on some benchmarks.

Differential Revision: https://reviews.llvm.org/D35888

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311285 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Aditya Kumar [Sun, 20 Aug 2017 10:32:41 +0000 (10:32 +0000)]

[Loop Vectorize] Added a separate metadata

Added a separate metadata to indicate when the loop
has already been vectorized instead of setting width and count to 1.

Patch written by Divya Shanmughan and Aditya Kumar

Differential Revision: https://reviews.llvm.org/D36220

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311281 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Igor Breger [Sun, 20 Aug 2017 09:25:22 +0000 (09:25 +0000)]

[GlobalISel][X86] Support call ABI.

Summary: Support call ABI. For now only Linux C and X86_64_SysV calling conventions supported. Variadic function not supported.

Reviewers: zvi, guyblank, oren_ben_simhon

Reviewed By: oren_ben_simhon

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34602

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311279 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Igor Breger [Sun, 20 Aug 2017 07:14:40 +0000 (07:14 +0000)]

[GlobalISel][X86] Support asimetric copy from/to GPR physical register.

Usually this case generated by ABI lowering, it requare to performe trancate/anyext.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311278 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Alex Bradbury [Sun, 20 Aug 2017 06:58:43 +0000 (06:58 +0000)]

[RISCV] Trivial whitespace fix in RISCVInstPrinter

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311277 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Alex Bradbury [Sun, 20 Aug 2017 06:57:27 +0000 (06:57 +0000)]

[RISCV] Fix two abuses of llvm_unreachable

Replace with report_fatal_error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311276 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Alex Bradbury [Sun, 20 Aug 2017 06:55:14 +0000 (06:55 +0000)]

[RISCV] Set HasRelocationAddend for RISCVELFObjectWriter

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311275 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sam Elliott [Sun, 20 Aug 2017 06:55:10 +0000 (06:55 +0000)]

Revert "Emit only A Single Opt Remark When Inlining"

Reverting due to clang build failure

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311274 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sam Elliott [Sun, 20 Aug 2017 06:43:34 +0000 (06:43 +0000)]

Emit only A Single Opt Remark When Inlining

Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33786

Reviewers: anemet, davidxl, chandlerc

Reviewed By: anemet

Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D36054

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311273 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Igor Breger [Sun, 20 Aug 2017 06:26:22 +0000 (06:26 +0000)]

[GlobalIsel] Fix undefined behavior if Action not set (release), it aslo crashing in debug mode.

Differential Revision: https://reviews.llvm.org/D34978

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311272 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Sam Elliott [Sun, 20 Aug 2017 01:30:45 +0000 (01:30 +0000)]

Keep Optimization Remark Yaml in NewPM

Summary:
The New Pass Manager infrastructure was forgetting to keep around the optimization remark yaml file that the compiler might have been producing. This meant setting the option to '-' for stdout worked, but setting it to a filename didn't give file output (presumably it was deleted because compilation didn't explicitly keep it). This change just ensures that the file is kept if compilation succeeds.

So far I have updated one of the optimization remark output tests to add a version with the new pass manager. It is my intention for this patch to also include changes to all tests that use `-opt-remark-output=` but I wanted to get the code patch ready for review while I was making all those changes.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33951

Reviewers: anemet, chandlerc

Reviewed By: anemet, chandlerc

Subscribers: javed.absar, chandlerc, fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D36906

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311271 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Chandler Carruth [Sat, 19 Aug 2017 23:35:50 +0000 (23:35 +0000)]

[x86] Fix an even stranger corner case where we have multiple levels of
cmov self-refrencing.

Pointed out by Amjad Aboud in code review, test case minorly simplified
from the one he posted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311267 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Craig Topper [Sat, 19 Aug 2017 23:21:22 +0000 (23:21 +0000)]

[X86] Merge all of the vecload and alignedload predicates into single predicates.

We can load the memory VT and check for natural alignment. This also adds a new preferNonTemporalLoad helper that checks the correct subtarget feature based on the load size.

This shrinks the isel table by at least 5000 bytes by allowing more reordering and combining to occur.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311266 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Craig Topper [Sat, 19 Aug 2017 23:21:21 +0000 (23:21 +0000)]

[X86] Converge alignedstore/alignedstore256/alignedstore512 to a single predicate.

We can read the memoryVT and get its store size directly from the SDNode to check its alignment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311265 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Craig Topper [Sat, 19 Aug 2017 22:02:02 +0000 (22:02 +0000)]

[AVX512] Use alignedstore256 in a pattern that's emitting a 256-bit movaps from an extract subvector operation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311263 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Victor Leschuk [Sat, 19 Aug 2017 21:05:08 +0000 (21:05 +0000)]

Set init value for ScalarEvolution::BackedgeTakenInfo::MaxOrZero

Otherwise it can be used uninitialized in move ctor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311262 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Martin Storsjo [Sat, 19 Aug 2017 20:26:51 +0000 (20:26 +0000)]

[ARM] Factorize the calculation of WhichResult in isV*Mask. NFC.

Differential Revision: https://reviews.llvm.org/D36930

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311260 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Martin Storsjo [Sat, 19 Aug 2017 19:47:48 +0000 (19:47 +0000)]

[ARM] Check the right order for halves of VZIP/VUZP if both parts are used

This is the exact same fix as in SVN r247254. In that commit, the fix was
applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue
is present for VZIP/VUZP as well.

This fixes PR33921.

Differential Revision: https://reviews.llvm.org/D36899

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311258 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Teresa Johnson [Sat, 19 Aug 2017 19:15:04 +0000 (19:15 +0000)]

Fix bot failures by requiring x86 target

The tests added in r311254 require a target triple since they are
running through code generation. Fix bot failures by requiring
an x86 target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311257 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Konstantin Zhuravlyov [Sat, 19 Aug 2017 18:44:27 +0000 (18:44 +0000)]

AMDGPU/NFC: Reorder functions in SIMemoryLegalizer:

- Move *load* functions before *atomic* functions
- Move *store* functions before *atomic* functions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311256 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Jatin Bhateja [Sat, 19 Aug 2017 18:08:59 +0000 (18:08 +0000)]

[DAGCombiner] Extending pattern detection for vector shuffle.

    Summary:
    If all the operands of a BUILD_VECTOR extract elements from same vector then split the
    vector efficiently based on the maximum vector access index.

    Reviewers: zvi, delena, RKSimon, thakis

    Reviewed By: RKSimon

    Subscribers: chandlerc, eladcohen, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35788

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311255 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Teresa Johnson [Sat, 19 Aug 2017 18:04:25 +0000 (18:04 +0000)]

[ThinLTO] Fix ThinLTO crash

Summary:
Follow up to fix in r311023, which fixed the case where the combined
index is written to disk. The same samplePGO logic exists for the
in-memory index when computing imports, so we need to filter out
GlobalVariable summaries there too.

Reviewers: davidxl

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D36919

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311254 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Craig Topper [Sat, 19 Aug 2017 18:02:28 +0000 (18:02 +0000)]

[X86] Remove an unnecessary alignment restriction from MOVDDUP pattern.

The SSE MOVDDUP instruction only loads 64-bits with no alignment restriction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311253 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Jatin Bhateja [Sat, 19 Aug 2017 17:59:58 +0000 (17:59 +0000)]

Revert rL311247 : To rectify commit message.

Summary: This reverts commit rL311247.

Differential Revision: https://reviews.llvm.org/D36927

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311252 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Jatin Bhateja [Sat, 19 Aug 2017 17:00:04 +0000 (17:00 +0000)]

Merge branch 'arcpatch-D35788'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311247 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Jatin Bhateja [Sat, 19 Aug 2017 16:40:06 +0000 (16:40 +0000)]

Revert rL311242 "Extension of shuffle vector pattern detection, updating post rebase."

Summary:

This reverts commit rL311242.

Differential Revision: https://reviews.llvm.org/D36924

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311246 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Jatin Bhateja [Sat, 19 Aug 2017 15:58:36 +0000 (15:58 +0000)]

Extension of shuffle vector pattern detection, updating post rebase.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311242 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Victor Leschuk [Sat, 19 Aug 2017 12:24:41 +0000 (12:24 +0000)]

revert failing test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311238 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Victor Leschuk [Sat, 19 Aug 2017 12:02:39 +0000 (12:02 +0000)]

Add temporary test to verify that win10 builder hangs on error

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311236 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Victor Leschuk [Sat, 19 Aug 2017 07:58:07 +0000 (07:58 +0000)]

Temporary mark lit :: shtest-format as unsupported on windows

When run manually it fails, but when run under buildbot it causes hang.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311230 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Chandler Carruth [Sat, 19 Aug 2017 06:56:11 +0000 (06:56 +0000)]

[Inliner] Fix a nasty bug when inlining a non-recursive trace of
a function into itself.

We tried to fix this before in r306495 but that got reverted as the
assert was actually hit.

This fixes the original bug (which we seem to have lost track of with
the revert) by blocking a second remapping when the function being
inlined is also the caller and the remapping could succeed but
erroneously.

The included test case would actually load from an inlined copy of the
alloca before this change, failing to load the stored value and
miscompiling.

Many thanks to Richard Smith for diagnosing a user miscompile to this
bug, and to Kyle for the first attempt and initial analysis and David Li
for remembering the issue and how to fix it and suggesting the patch.
I'm just stitching it together and landing it. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311229 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Chandler Carruth [Sat, 19 Aug 2017 06:06:44 +0000 (06:06 +0000)]

[Inliner] Clean up a test case a bit to make it more clear what is being
tested and why.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311228 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Chandler Carruth [Sat, 19 Aug 2017 05:06:23 +0000 (05:06 +0000)]

[SLP] Fix an unused variable warning in non-asserts builds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311227 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Chandler Carruth [Sat, 19 Aug 2017 05:01:19 +0000 (05:01 +0000)]

[x86] Teach the cmov converter to aggressively convert cmovs with memory
operands into control flow.

We have seen periodically performance problems with cmov where one
operand comes from memory. On modern x86 processors with strong branch
predictors and speculative execution, this tends to be much better done
with a branch than cmov. We routinely see cmov stalling while the load
is completed rather than continuing, and if there are subsequent
branches, they cannot be speculated in turn.

Also, in many (even simple) cases, macro fusion causes the control flow
version to be fewer uops.

Consider the IACA output for the initial sequence of code in a very hot
function in one of our internal benchmarks that motivates this, and notice the
micro-op reduction provided.
Before, SNB:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.20 Cycles       Throughput Bottleneck: Port1

| Num Of |              Ports pressure in cycles               |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |    |
---------------------------------------------------------------------
|   1    |           | 1.0 |           |           |     |     | CP | mov rcx, rdi
|   0*   |           |     |           |           |     |     |    | xor edi, edi
|   2^   | 0.1       | 0.6 | 0.5   0.5 | 0.5   0.5 |     | 0.4 | CP | cmp byte ptr [rsi+0xf], 0xf
|   1    |           |     | 0.5   0.5 | 0.5   0.5 |     |     |    | mov rax, qword ptr [rsi]
|   3    | 1.8       | 0.6 |           |           |     | 0.6 | CP | cmovbe rax, rdi
|   2^   |           |     | 0.5   0.5 | 0.5   0.5 |     | 1.0 |    | cmp byte ptr [rcx+0xf], 0x10
|   0F   |           |     |           |           |     |     |    | jb 0xf
Total Num Of Uops: 9
```
After, SNB:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.00 Cycles       Throughput Bottleneck: Port5

| Num Of |              Ports pressure in cycles               |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |    |
---------------------------------------------------------------------
|   1    | 0.5       | 0.5 |           |           |     |     |    | mov rax, rdi
|   0*   |           |     |           |           |     |     |    | xor edi, edi
|   2^   | 0.5       | 0.5 | 1.0   1.0 |           |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    | 0.5       | 0.5 |           |           |     |     |    | mov ecx, 0x0
|   1    |           |     |           |           |     | 1.0 | CP | jnbe 0x39
|   2^   |           |     |           | 1.0   1.0 |     | 1.0 | CP | cmp byte ptr [rax+0xf], 0x10
|   0F   |           |     |           |           |     |     |    | jnb 0x3c
Total Num Of Uops: 7
```
The difference even manifests in a throughput cycle rate difference on Haswell.
Before, HSW:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.00 Cycles       Throughput Bottleneck: FrontEnd

| Num Of |                    Ports pressure in cycles                     |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |  6  |  7  |    |
---------------------------------------------------------------------------------
|   0*   |           |     |           |           |     |     |     |     |    | mov rcx, rdi
|   0*   |           |     |           |           |     |     |     |     |    | xor edi, edi
|   2^   |           |     | 0.5   0.5 | 0.5   0.5 |     | 1.0 |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    |           |     | 0.5   0.5 | 0.5   0.5 |     |     |     |     |    | mov rax, qword ptr [rsi]
|   3    | 1.0       | 1.0 |           |           |     |     | 1.0 |     |    | cmovbe rax, rdi
|   2^   | 0.5       |     | 0.5   0.5 | 0.5   0.5 |     |     | 0.5 |     |    | cmp byte ptr [rcx+0xf], 0x10
|   0F   |           |     |           |           |     |     |     |     |    | jb 0xf
Total Num Of Uops: 8
```
After, HSW:
```
Throughput Analysis Report
--------------------------
Block Throughput: 1.50 Cycles       Throughput Bottleneck: FrontEnd

| Num Of |                    Ports pressure in cycles                     |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |  6  |  7  |    |
---------------------------------------------------------------------------------
|   0*   |           |     |           |           |     |     |     |     |    | mov rax, rdi
|   0*   |           |     |           |           |     |     |     |     |    | xor edi, edi
|   2^   |           |     | 1.0   1.0 |           |     | 1.0 |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    |           | 1.0 |           |           |     |     |     |     |    | mov ecx, 0x0
|   1    |           |     |           |           |     |     | 1.0 |     |    | jnbe 0x39
|   2^   | 1.0       |     |           | 1.0   1.0 |     |     |     |     |    | cmp byte ptr [rax+0xf], 0x10
|   0F   |           |     |           |           |     |     |     |     |    | jnb 0x3c
Total Num Of Uops: 6
```

Note that this cannot be usefully restricted to inner loops. Much of the
hot code we see hitting this is not in an inner loop or not in a loop at
all. The optimization still remains effective and indeed critical for
some of our code.

I have run a suite of internal benchmarks with this change. I saw a few
very significant improvements and a very few minor regressions,
but overall this change rarely has a significant effect. However, the
improvements were very significant, and in quite important routines
responsible for a great deal of our C++ CPU cycles. The gains pretty
clealy outweigh the regressions for us.

I also ran the test-suite and SPEC2006. Only 11 binaries changed at all
and none of them showed any regressions.

Amjad Aboud at Intel also ran this over their benchmarks and saw no
regressions.

Differential Revision: https://reviews.llvm.org/D36858

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311226 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Chandler Carruth [Sat, 19 Aug 2017 04:28:20 +0000 (04:28 +0000)]

[x86] Refactor the CMOV conversion pass to be more flexible.

The primary thing that this accomplishes is to allow future re-use of
these routines in more contexts and clarify the behavior w.r.t. loops.
For example, if handling outer loops is desirable, doing so in
a inside-out order becomes straight forward because it walks the loop
nest itself (rather than walking the function's basic blocks) and
de-couples the CMOV rewriting from the loop structure as there isn't
actually anything loop-specific about this transformation.

This patch should be essentially a no-op. It potentially changes the
order in which we visit the inner loops, but otherwise should merely set
the stage for subsequent changes.

Differential Revision: https://reviews.llvm.org/D36783

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311225 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Dinar Temirbulatov [Sat, 19 Aug 2017 03:15:07 +0000 (03:15 +0000)]

[SLPVectorizer] Tighten up VLeft, VRight declaration, remove unnecessary testcase test/Transforms/SLPVectorizer/X86/reorder.ll, NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311223 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Dinar Temirbulatov [Sat, 19 Aug 2017 02:54:20 +0000 (02:54 +0000)]

[SLPVectorizer] Add opcode parameter to reorderAltShuffleOperands, reorderInputsAccordingToOpcode functions.

Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D36766

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311221 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Matthias Braun [Sat, 19 Aug 2017 01:21:11 +0000 (01:21 +0000)]

ARMRegsiterInfo: Define more ssub indexes; NFC

This doesn't really change anything as Tablegen would have inferred
those indices anyway; defining them gives us shorter names that are
easier to read while debugging (i.e. "ssub_4" rather than
"dsub2_then_ssub_0")

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311218 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Adrian Prantl [Sat, 19 Aug 2017 01:15:06 +0000 (01:15 +0000)]

Filter out non-constant DIGlobalVariableExpressions reachable via the CU

They won't affect the DWARF output, but they will mess with the
sorting of the fragments. This fixes the crash reported in PR34159.

https://bugs.llvm.org/show_bug.cgi?id=34159

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311217 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Eric Beckmann [Sat, 19 Aug 2017 00:37:41 +0000 (00:37 +0000)]

llvm-mt: Merge manifest namespaces.

mt.exe performs a tree merge where certain element nodes are combined
into one.  This introduces the possibility of xml namespaces conflicting
with each other.  The original mt.exe has a hierarchy whereby certain
namespace names can override others, and nodes that would then end up in
ambigious namespaces have their namespaces explicitly defined.  This
namespace handles this merging process.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311215 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Eugene Zelenko [Fri, 18 Aug 2017 23:51:26 +0000 (23:51 +0000)]

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311212 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Xinliang David Li [Fri, 18 Aug 2017 23:08:50 +0000 (23:08 +0000)]

Fix comment /NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311209 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Xinliang David Li [Fri, 18 Aug 2017 23:00:05 +0000 (23:00 +0000)]

[Profile] backward propagate profile info in JumpThreading

Differential Revsion: http://reviews.llvm.org/D36864

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311208 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Amjad Aboud [Fri, 18 Aug 2017 22:56:55 +0000 (22:56 +0000)]

[InstCombine] Teach ComputeNumSignBitsImpl to handle integer multiply instruction.

Differential Revision: https://reviews.llvm.org/D36679

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311206 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Max Kazantsev [Fri, 18 Aug 2017 22:50:29 +0000 (22:50 +0000)]

[IRCE] Fix buggy behavior in Clamp

Clamp function was too optimistic when choosing signed or unsigned min/max function for calculations.
In fact, `!IsSignedPredicate` guarantees us that `Smallest` and `Greatest` can be compared safely using unsigned
predicates, but we did not check this for `S` which can in theory be negative.

This patch makes Clamp use signed min/max for cases when it fails to prove `S` being non-negative,
and it adds a test where such situation may lead to incorrect conditions calculation.

Differential Revision: https://reviews.llvm.org/D36873

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311205 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Justin Bogner [Fri, 18 Aug 2017 21:38:03 +0000 (21:38 +0000)]

IR: Make stripDebugInfo robust against (invalid) empty basic blocks

Since stripDebugInfo runs before the verifier when reading IR, we can
end up in a situation where we read some invalid IR but don't know its
invalid yet. Before this patch we would crash in stripDebugInfo when
given IR with a completely empty basic block, and after we get a nice
error from the verifier instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311202 91177308-0d34-0410-b5e6-96231b3b80d8

commit | commitdiff | tree

Jonas Devlieghere [Fri, 18 Aug 2017 21:35:44 +0000 (21:35 +0000)]

[llvm-dwarfdump] Hide .debug_str and DIE reference offsets in brief mode

This patch hides the .debug_str offset and DIE reference offsets into
the CU when llvm-dwarfdump is invoked with -brief.

Differential Revision: https://reviews.llvm.org/D36835

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311201 91177308-0d34-0410-b5e6-96231b3b80d8

Unnamed repository; edit this file 'description' to name the repository.