Taewook Oh [Thu, 29 Jun 2017 23:11:24 +0000 (23:11 +0000)]
Remove redundant copy in recurrences
Summary:
If there is a chain of instructions formulating a recurrence, commuting operands can help remove a redundant copy. In the following example code,
This redundant copy can be eliminated by making the instructions in the recurrence chain compute the value "into" the register that actually holds the feedback value. In this example, this can be achieved by commuting %vreg0 and %vreg1 to compute %vreg10. With that change, code after two-address generation becomes
Previously it didn't actually invoke the designated new PM builder
functions.
This patch moves NameAnonGlobalPass out from PassBuilder, as Chandler
points out that PassBuilder is used for non-O0 builds, and for
optimizations only.
Keno Fischer [Thu, 29 Jun 2017 20:28:59 +0000 (20:28 +0000)]
[CodeGenPrepare] Don't create inttoptr for ni ptrs
Summary:
Arguably non-integral pointers probably shouldn't show up here at all,
but since the backend doesn't complain and this takes valid (according
to the Verifier) IR and makes it invalid, make sure not to introduce
any inttoptr instructions if we're dealing with non-integral pointers.
Reid Kleckner [Thu, 29 Jun 2017 20:15:08 +0000 (20:15 +0000)]
Attempt to fix Orc JIT test timeouts
I think there are some destruction ordering issues here. The
ShouldDelete map seems to be getting destroyed before the shared_ptr
deleter lambda accesses it. In any case, this avoids inserting elements
into the map during shutdown.
[DWARF] Added verification checks for the .apple_names section.
This patch verifies the number of atoms, the validity of the form for each atom, as well as the validity of the
hashdata. For hashdata, we're verifying that the hashdata offset is correct and that the offset in the .debug_info for
each DIE in the hashdata is also valid.
Sam Clegg [Thu, 29 Jun 2017 19:35:17 +0000 (19:35 +0000)]
Remove `inline` keyword from inline `classof` methods
The style guide states that the explicit `inline`
should not be used with inline methods. classof is
a very common inline method with a fair amount of
inconsistency:
I chose to target this method rather than the larger change
since this method is easily cargo-culted (I did it at
least once). I considered doing the larger change and
removing all occurrences but that would be a much larger
change.
Keno Fischer [Thu, 29 Jun 2017 19:13:11 +0000 (19:13 +0000)]
[AliasSetTracker] Don't drop AA MD so eagerly
Summary:
When we have patterns like
loop:
%la = load %ptr, !tbaa
%lba = load %ptr, !tbaa !noalias
AliasSetTracker would previously think that the two types of annotation for
the pointer conflict, dropping both for the purpose of determining alias sets.
That is clearly way too conservative, as the tbaa is still valid whether or
not one of the memory accesses has additional AA metadata. We could go
one step further and attempt to properly merge the AA metadata,
but it's not clear that that would be worth it since that may introduce
additional MD nodes, which may be undesirable since this is merely an
Analysis.
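The less conservative behaviour described above can be pictured as a field-by-field intersection: an annotation survives only when both accesses carry the same node for it. The sketch below is illustrative only; the `AAInfo` tuple and the `intersect` helper are hypothetical stand-ins, not LLVM's data structures.

```python
from collections import namedtuple

# Hypothetical stand-in for the AA metadata attached to one memory access.
# None means "no annotation of this kind".
AAInfo = namedtuple("AAInfo", ["tbaa", "scope", "noalias"])


def intersect(a, b):
    """Keep each annotation only when both accesses agree on it."""
    return AAInfo(*(x if x == y else None for x, y in zip(a, b)))


la = AAInfo(tbaa="int", scope=None, noalias=None)
lba = AAInfo(tbaa="int", scope=None, noalias="scope0")

# The shared tbaa tag is kept; the one-sided noalias tag is dropped.
merged = intersect(la, lba)
```

Under this model the two loads from the pattern above still share their tbaa tag, instead of losing all AA metadata.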
Brian Gesiak [Thu, 29 Jun 2017 18:56:25 +0000 (18:56 +0000)]
[opt-viewer] Add progress indicators (PR33522)
Summary:
Provide feedback to users of opt-diff.py, opt-stats.py, and opt-viewer.py,
on how many YAML files have finished being processed, and how many HTML
files have been generated. This feedback is particularly helpful for
opt-viewer.py, which may take a long time to complete when given many
large YAML files as input.
The progress indicators use simple output such as the following:
```
Reading YAML files...
9 of 1197
```
Test plan:
Run `utils/opt-viewer/opt-*.py` on a CentOS and macOS machine, using
Python 3.4 and Python 2.7 respectively, and ensure the output is
formatted well on both.
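A minimal sketch of a progress indicator producing the "9 of 1197" style of output shown above. The function names are hypothetical, and rewriting the line in place with a carriage return is an assumption; the scripts may format their output differently.

```python
import sys


def report_progress(done, total, stream=sys.stdout):
    """Rewrite a single status line such as '9 of 1197'."""
    stream.write("\r{} of {}".format(done, total))
    stream.flush()


def process_all(items, work, stream=sys.stdout):
    """Apply `work` to every item, reporting progress after each one."""
    results = []
    total = len(items)
    for index, item in enumerate(items, start=1):
        results.append(work(item))
        report_progress(index, total, stream)
    stream.write("\n")
    return results
```

The code uses only `str.format` and `enumerate`, so it runs unchanged under both Python 2.7 and Python 3.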
Brian Gesiak [Thu, 29 Jun 2017 18:47:31 +0000 (18:47 +0000)]
[opt-viewer] Python 3 support in opt-viewer.py
Summary:
Minor changes that allow opt-stats.py to support both Python 2 and 3.
In addition to the same dictionary iterator changes that were necessary
in https://reviews.llvm.org/D34564, this diff also:
* Explicitly converts strings to bytes when reading from and writing to stdin
and stdout.
* No longer uses dictionaries as a sort key for optimization remarks.
Dictionary sort order in Python 2 is pretty esoteric anyway, so it's
not clear that the additional sorting had a benefit for end users
(for details, https://stackoverflow.com/a/3484456/679254 is a good
resource on Python 2 dictionary sort order).
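Both points can be illustrated in a few lines. The sketch below assumes remarks are plain dictionaries with hypothetical `file`, `line`, and `name` fields; it sorts them on an explicit tuple rather than on the dictionaries themselves, and encodes text to bytes before writing it to a binary stream. It is not the code from the patch.

```python
import io


def remark_sort_key(remark):
    """Build an explicit, well-defined sort key from a remark."""
    return (remark["file"], remark["line"], remark["name"])


def write_text(stream, text, encoding="utf-8"):
    """Explicitly convert text to bytes before writing to a binary stream."""
    stream.write(text.encode(encoding))


remarks = [
    {"file": "b.c", "line": 3, "name": "inline"},
    {"file": "a.c", "line": 9, "name": "licm"},
    {"file": "a.c", "line": 2, "name": "gvn"},
]
ordered = sorted(remarks, key=remark_sort_key)

out = io.BytesIO()  # stands in for a binary stdin/stdout pipe
for remark in ordered:
    write_text(out, "{file}:{line} {name}\n".format(**remark))
```

Sorting on a tuple behaves identically in Python 2 and 3, whereas comparing dictionaries directly raises a TypeError in Python 3.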
Jakub Kuderski [Thu, 29 Jun 2017 17:53:35 +0000 (17:53 +0000)]
[Dominators] Rearrange access specifiers in DominatorTreeBase
Summary:
This patch makes DominatorTreeBase more readable by putting the most important members at the top of the class.
Before, the class looked like this: private -> protected (including data members) -> public -> protected.
The patch changes it to: protected (data members only) -> public -> protected -> public.
Jakub Kuderski [Thu, 29 Jun 2017 17:50:19 +0000 (17:50 +0000)]
[Dominators] Remove DominatorBase class
Summary:
DominatorBase class was only used by DominatorTreeBase. It didn't provide any useful abstractions, nor simplified anything, so I see no point keeping it.
This commit removes the DominatorBase class and moves its content into DominatorTreeBase.
This is the first patch in a series that tries to make all DomTrees have a single virtual root, which will allow the code to be simplified further (especially when it comes to incremental updates).
Jakub Kuderski [Thu, 29 Jun 2017 17:45:51 +0000 (17:45 +0000)]
[Dominators] Add parent and sibling property verification (non-hacky)
Summary:
This patch adds an additional level of verification - it checks parent and sibling properties of a tree. By definition, every tree with these two properties is a dominator tree.
It is possible to run those checks by running llvm with `-verify-dom-info=1`.
Bootstrapping clang and building the llvm test suite with this option enabled doesn't yield any errors.
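A brute-force sketch of the two checks, assuming the usual statement of the properties: the parent property says a node becomes unreachable from the root once its tree parent is removed, and the sibling property says removing a node never makes one of its tree siblings unreachable. The graph encoding and all function names are hypothetical; this is not the verifier added by the patch.

```python
def reachable(cfg, root, removed=None):
    """Return the set of nodes reachable from root, skipping `removed`."""
    if root == removed:
        return set()
    seen = {root}
    stack = [root]
    while stack:
        node = stack.pop()
        for succ in cfg.get(node, ()):
            if succ != removed and succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


def has_parent_property(cfg, root, idom):
    """Removing a node's tree parent must cut the node off from the root."""
    return all(
        node not in reachable(cfg, root, removed=parent)
        for node, parent in idom.items()
    )


def has_sibling_property(cfg, root, idom):
    """Removing a node must not cut any of its tree siblings off from the root."""
    for a, parent_a in idom.items():
        still_reachable = reachable(cfg, root, removed=a)
        for b, parent_b in idom.items():
            if a != b and parent_a == parent_b and b not in still_reachable:
                return False
    return True


# Diamond-shaped CFG: entry -> {left, right} -> exit.
cfg = {"entry": ["left", "right"], "left": ["exit"], "right": ["exit"]}
good = {"left": "entry", "right": "entry", "exit": "entry"}
bad = {"left": "entry", "right": "entry", "exit": "left"}
```

Here `idom` maps every non-root node to its parent in the candidate tree. The `good` tree satisfies both properties, while `bad` fails the parent property because `exit` stays reachable through `right` after `left` is removed.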
Leo Li [Thu, 29 Jun 2017 17:03:34 +0000 (17:03 +0000)]
[ConstantHoisting] Avoid hoisting constants in GEPs that index into a struct type.
Summary:
Indices for GEPs that index into a struct type should always be
constants. This added more checks in `collectConstantCandidates:` which make
sure constants for GEP pointer type are not hoisted.
This fixed Bug https://bugs.llvm.org/show_bug.cgi?id=33538
Paul Robinson [Thu, 29 Jun 2017 16:52:08 +0000 (16:52 +0000)]
[DWARF] NFC: DWARFDataExtractor combines relocs with DataExtractor.
Requires callers to directly associate relocations with a DataExtractor
used to read data from a DWARF section, which helps a callee not make
assumptions about which section it is reading.
This is the next step in reducing DWARFFormValue's dependence on DWARFUnit.
Brian Gesiak [Thu, 29 Jun 2017 16:20:31 +0000 (16:20 +0000)]
[opt-viewer] opt-viewer.py takes -o argument
Summary:
Change how the output directory is specified when invoking
opt-viewer.py, from `opt-viewer.py yaml_file_one yaml_file_two output_dir` to
`opt-viewer.py -o output_dir yaml_file_one yaml_file_two`.
This makes it easier to pipe the results of another command into
opt-viewer.py. For example:
Nirav Dave [Thu, 29 Jun 2017 15:48:11 +0000 (15:48 +0000)]
[DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI.
Relanding after restricting equalBaseIndex so that it does not erroneously
treat FrameIndices stemming from alloca as comparable, since their
offsets are set post-SelectionDAG.
Pull FrameIndex comparison reasoning from DAGCombiner::isAlias to
general BaseIndexOffset.
Yonghong Song [Thu, 29 Jun 2017 15:18:54 +0000 (15:18 +0000)]
bpf: remove unnecessary truncate operation
Networking-type bpf programs often need to access
packet data. A context data structure is provided to the bpf
programs with two fields:
u32 data;
u32 data_end;
Users can access these two fields with ctx->data and ctx->data_end.
During the program verification process, the kernel verifier rewrites
the bpf program to load the actual pointer values from a kernel
data structure.
r = ctx->data ===> r = actual data start ptr
r = ctx->data_end ===> r = actual data end ptr
A typical program accessing ctx->data like
char *data_ptr = (char *)(long)ctx->data
will result in a 32-bit load followed by a zero extension.
Such an operation is combined into a single LDW in DAG combiner
as bpf LDW does zero extension automatically.
In cases like the below (which can be a result of global value numbering
and partial redundancy elimination before insn selection):
B1:
u32 a = load-32-bit &ctx->data
u64 pa = zext a
...
B2:
u32 b = load-32-bit &ctx->data
u64 pb = zext b
...
B3:
u32 m = PHI(a, b)
u64 pm = zext m
In B3, "pm = zext m" is not removed. Although that is legal
from the compiler's perspective, it results in incorrect code after
kernel verification.
This patch recognizes this pattern and traces through PHI node
to see whether the operand of "zext m" is defined with LDWs or not.
If it is, the "zext m" itself can be removed.
The patch also recognizes the pattern where the load and the use of
the loaded value are not in the same basic block, in which case the
truncate operation may be removed as well.
The patch handles 1-byte, 2-byte and 4-byte truncation.
Two test cases are added to verify the transformation happens properly
for the above code pattern.
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306685 91177308-0d34-0410-b5e6-96231b3b80d8
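The trace through PHI nodes described in this commit can be modelled in a few lines. The sketch below works on a toy table of definitions rather than on real machine instructions; the `defs` encoding and the function name are hypothetical.

```python
def defined_by_zext_load(value, defs, seen=None):
    """Return True if every definition reaching `value` is a zero-extending load."""
    seen = set() if seen is None else seen
    if value in seen:
        # Already examined on this walk (e.g. a PHI cycle); it adds no new definition.
        return True
    seen.add(value)
    kind, operands = defs[value]
    if kind == "load":
        return True
    if kind == "phi":
        return all(defined_by_zext_load(op, defs, seen) for op in operands)
    return False


# Toy model of blocks B1..B3 above, plus a case that must be rejected.
defs = {
    "a": ("load", []),
    "b": ("load", []),
    "m": ("phi", ["a", "b"]),
    "c": ("add", ["a", "b"]),
    "n": ("phi", ["a", "c"]),
}
```

In this model `m` qualifies, so a zero extension of `m` would be redundant, while `n` does not because one of its incoming values comes from an `add`.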
Daniel Neilson [Thu, 29 Jun 2017 14:21:28 +0000 (14:21 +0000)]
Restore original intent of memset instcombine test
Summary:
The original intent of test/Transforms/InstCombine/memset.ll was to test for lowering of llvm.memset into stores when the size of the memset is 1, 2, 4, or 8. Sometime between then and now the test has stopped testing for that, but remained passing due to testing for the absence of llvm.memset calls rather than the presence of store instructions. Right now this test ends up with an empty function body because the alloca is eliminated as safe-to-remove, which results in the llvm.memset calls being eliminated due to their pointer args being undef; so it is not testing for conversion of llvm.memset into store instructions at all.
This change alters the test to verify that store instructions are created, and moves the target of the memset to an arg of the proc to avoid it being eliminated as unused.
Daniel Neilson [Thu, 29 Jun 2017 14:17:50 +0000 (14:17 +0000)]
Explicitly check for presence of correct results in instcombine memmove test
Summary:
Rather than testing for expected results, test/Transforms/InstCombine/memmove.ll is testing for the absence of calls to llvm.memmove.
In the case of test3, the test has stopped testing for materialization of loads/stores, but remained passing due to testing for the absence of llvm.memmove calls rather than the presence of load/store instructions. Right now this test ends up with an empty function body because the alloca is eliminated as safe-to-remove, which results in the llvm.memmove calls being eliminated due to a pointer arg being undef; so it is not testing for conversion of llvm.memmove into load/store instructions at all.
Hiroshi Inoue [Thu, 29 Jun 2017 14:13:38 +0000 (14:13 +0000)]
[PowerPC] fix potential verification error on __tls_get_addr
This patch fixes a verification error with -verify-machineinstrs while expanding __tls_get_addr by not creating ADJCALLSTACKUP and ADJCALLSTACKDOWN if there is another ADJCALLSTACKUP in this basic block since nesting ADJCALLSTACKUP/ADJCALLSTACKDOWN is not allowed.
Here, ADJCALLSTACKUP and ADJCALLSTACKDOWN are created as a fence for instruction scheduling to prevent __tls_get_addr from being scheduled before mflr in the prologue (https://bugs.llvm.org//show_bug.cgi?id=25839). So if another ADJCALLSTACKUP exists before __tls_get_addr, we do not need to create a new ADJCALLSTACKUP.
George Rimar [Thu, 29 Jun 2017 14:05:18 +0000 (14:05 +0000)]
[DWARF] - Fix message reporting about broken relocation.
Because of a mistake introduced in r306517, the
wrong variable ("name" instead of "Name") was used
in error message.
As a result it reported section name instead of
relocation name.
This file still needs cleanup to match LLVM coding style
and more tests I think.
Daniel Jasper [Thu, 29 Jun 2017 13:58:24 +0000 (13:58 +0000)]
Revert "r306529 - [X86] Correct dwarf unwind information in function epilogue"
I am 99% sure that this breaks the PPC ASAN build bot:
http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/3112/steps/64-bit%20check-asan/logs/stdio
If it doesn't go back to green, we can recommit (and fix the original
commit message at the same time :) ).
[TargetTransformInfo, API] Add a list of operands to TTI::getUserCost
The changes are a result of discussion of https://reviews.llvm.org/D33685.
It solves the following problem:
1. We can inform getGEPCost about simplified indices to help it with
calculating the cost. But getGEPCost does not take into account the
context which GEPs are used in.
2. We have getUserCost which can take the context into account but we cannot
inform about simplified indices.
With the changes getUserCost will have access to additional information
as getGEPCost has.
The difference from the previous version is the use of decltype, as the
implementation of std::result_of in libc++ did not work correctly for
variadic functions like open(2).
Original summary:
This function retries an operation if it was interrupted by a signal
(failed with EINTR). It's inspired by the TEMP_FAILURE_RETRY macro in
glibc, but I've turned that into a template function. I've also added a
fail-value argument, to enable the function to be used with e.g.
fopen(3), which is documented to fail for any reason that open(2) can
fail (which includes EINTR).
The main user of this function will be lldb, but there were also a
couple of uses within llvm that I could simplify using this function.
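The retry idea can be sketched in Python as follows. This is an analogue under stated assumptions, not the C++ template from the patch: Python reports EINTR by raising InterruptedError, so the fail-value argument has no counterpart here, and the function name is hypothetical.

```python
def retry_after_signal(func, *args):
    """Call func(*args) again whenever it is interrupted by a signal (EINTR)."""
    while True:
        try:
            return func(*args)
        except InterruptedError:
            # OSError with errno == EINTR: the call did not complete, so retry.
            continue
```

Any other exception propagates to the caller unchanged, which matches the intent of retrying only on EINTR.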
Igor Breger [Thu, 29 Jun 2017 12:08:28 +0000 (12:08 +0000)]
[GlobalISel][X86] Support vector type G_MERGE_VALUES selection.
Summary:
Support vector type G_MERGE_VALUES selection. For now G_MERGE_VALUES is marked as legal for any type, so there is nothing to do in the legalizer.
Split from https://reviews.llvm.org/D33665
Florian Hahn [Thu, 29 Jun 2017 08:45:31 +0000 (08:45 +0000)]
[ARM] Add tGPRwithpc register class and use it for TBB/TBH
Summary:
TBB and TBH allow using a Thumb GPR or the PC as destination operand.
A few machine verifier failures were due to those instructions not
expecting PC as destination operand.
Add -verify-machineinstrs to test/CodeGen/ARM/jump-table-tbh.ll to add
test coverage even if expensive checks are disabled.
David L. Jones [Thu, 29 Jun 2017 04:37:35 +0000 (04:37 +0000)]
[lit] Re-apply: Fix some convoluted logic around Unicode encoding, and de-duplicate across modules that used it.
(Take 2: this patch re-applies r306625, which was reverted in r306629. This
patch includes only trivial fixes.)
In Python2 and Python3, the various (non-)?Unicode string types are sort of
spaghetti. Python2 has unicode support tacked on via the 'unicode' type, which
is distinct from 'str' (which are bytes). Python3 takes the "unicode-everywhere"
approach, with 'str' representing a Unicode string.
Both have a 'bytes' type. In Python3, it is the only way to represent raw bytes.
However, in Python2, 'bytes' is an alias for 'str'. This leads to interesting
problems when an interface requires a precise type, but has to run under both
Python2 and Python3.
The previous logic appeared to be correct in all cases, but went through more
layers of indirection than necessary. This change does the necessary conversions
in one shot, with documentation about which paths might be taken in Python2 or
Python3.
Changes from r306625: some tests just print binary outputs, so in those cases,
fall back to str() in Python3. For googletests, add one missing call to
to_string().
(Tested by verifying the visible breakage with Python3. Verified that everything
works in py2 and py3.)
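A sketch of what one-shot conversion helpers of this kind might look like. The name to_string() appears in the message above; to_bytes() and both function bodies are assumptions made for illustration, not the code from the patch.

```python
def to_bytes(value, encoding="utf-8"):
    """Return `value` as raw bytes, encoding it only if it is text."""
    if isinstance(value, bytes):
        # Python 2 'str' and Python 3 'bytes' both land here.
        return value
    return value.encode(encoding)


def to_string(value, encoding="utf-8"):
    """Return `value` as the native 'str' type of the running interpreter."""
    if isinstance(value, str):
        return value
    if isinstance(value, bytes):
        # Python 3 only: 'bytes' is distinct from 'str', so decode.
        # (In Python 2 'bytes' is 'str' and the branch above already returned.)
        return value.decode(encoding)
    # Python 2 only: a 'unicode' object must be encoded to obtain 'str'.
    return value.encode(encoding)
```

Each helper checks the type once and converts at most once, which is the "one shot" structure the message describes.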
David Blaikie [Thu, 29 Jun 2017 02:51:58 +0000 (02:51 +0000)]
llvm-profdata: Indirect infrequently used fields to reduce memory usage
Examining a large profile example, it seems relatively few records have
non-empty IndirectCall and MemOP data, so indirecting these through a
unique_ptr (non-null only when they are non-empty) reduces memory usage
on this particular example from 14GB to 10GB according to valgrind's
massif.
I suspect it'd still be worth moving InstrProfWriter to its own data
structure that had Counts and the indirected IndirectCall+MemOP, and did
not include the Name, Hash, or Error fields. This would reduce the size
of this dominant data structure by half of this new, lower amount.
(Name(2), Hash(1), Error(1) ~= Counts(vector, 3), ValueProfData
(unique_ptr, 1))
-> From code review feedback, might actually refactor InstrProfRecord
itself to have a sub-struct with all the counts, and use that from
InstrProfWriter, rather than InstrProfWriter owning its own data
structure for this.
David L. Jones [Thu, 29 Jun 2017 01:03:55 +0000 (01:03 +0000)]
[lit] Fix some convoluted logic around Unicode encoding, and de-duplicate across modules that used it.
Summary:
In Python2 and Python3, the various (non-)?Unicode string types are sort of
spaghetti. Python2 has unicode support tacked on via the 'unicode' type, which
is distinct from 'str' (which are bytes). Python3 takes the "unicode-everywhere"
approach, with 'str' representing a Unicode string.
Both have a 'bytes' type. In Python3, it is the only way to represent raw bytes.
However, in Python2, 'bytes' is an alias for 'str'. This leads to interesting
problems when an interface requires a precise type, but has to run under both
Python2 and Python3.
The previous logic appeared to be correct in all cases, but went through more
layers of indirection than necessary. This change does the necessary conversions
in one shot, with documentation about which paths might be taken in Python2 or
Python3.
David L. Jones [Thu, 29 Jun 2017 01:01:03 +0000 (01:01 +0000)]
[lit] Remove dead code not referenced in the LLVM SVN repo.
Summary:
This change removes the intermediate 'FileBasedTest' format from lit. This
format is only ever used by the ShTest format, so the logic can be moved into
ShTest directly.
In order to better clarify what the TestFormat subclasses do, I fleshed out the
TestFormat base class with Python's notion of abstract methods, using
@abc.abstractmethod. This gives a convenient way to document the expected
interface, without the risk of instantiating an abstract class (that's what
ABCMeta does -- it raises an exception if you try to instantiate a class which
has abstract methods, but not if you instantiate a subclass that implements
them).
This is done in order to address the failure of CrWinClangLLD etc. bots.
These throw an error of "side-by-side configuration is incorrect" during
compilation, which sounds suspiciously related to these manifest
changes.
Revert "Switch external cvtres.exe for llvm's own resource library."
Craig Topper [Thu, 29 Jun 2017 00:07:08 +0000 (00:07 +0000)]
[InstCombine] In visitXor, use m_Not on the instruction itself instead of looking for all ones in Op1. This is consistent with 3 other not checks before this one. NFCI
Keno Fischer [Wed, 28 Jun 2017 23:36:40 +0000 (23:36 +0000)]
[InstCombine] Retain TBAA when narrowing memory accesses
Summary:
As discussed on the mailing list it is legal to propagate TBAA to loads/stores
from/to smaller regions of a larger load tagged with TBAA. Do so for
(load->extractvalue)=>(gep->load) and similar foldings.
Adrian McCarthy [Wed, 28 Jun 2017 22:47:40 +0000 (22:47 +0000)]
Introduce symbol cache to PDB NativeSession
Instead of creating symbols directly in the findChildren methods of the native
symbol implementations, they will rely on the NativeSession to act as a factory
for these types. This lets NativeSession cache the NativeRawSymbols in its
new symbol cache and makes that cache the source of unique IDs for the symbols.
Right now, this affects only NativeCompilandSymbols. There's no external
change yet, so I think the existing tests are still sufficient. Coming soon
are patches to extend this to built-in types and enums.
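The factory-plus-cache arrangement can be sketched as below. This is a simplified Python model with hypothetical class and method names, not the NativeSession API: the cache creates each symbol and uses the symbol's position in the cache as its unique ID.

```python
class SymbolCache:
    """Creates symbols on request and is the source of their unique IDs."""

    def __init__(self):
        self._symbols = []

    def create(self, factory, *args):
        """Build a symbol whose ID is its position in the cache."""
        symbol_id = len(self._symbols)
        self._symbols.append(factory(symbol_id, *args))
        return symbol_id

    def get(self, symbol_id):
        """Look up a previously created symbol by its ID."""
        return self._symbols[symbol_id]


class CompilandSymbol:
    """Hypothetical symbol type; it receives its ID from the cache."""

    def __init__(self, symbol_id, name):
        self.symbol_id = symbol_id
        self.name = name


cache = SymbolCache()
first = cache.create(CompilandSymbol, "a.obj")
second = cache.create(CompilandSymbol, "b.obj")
```

Because symbols are only ever created through the cache, two symbols can never end up with the same ID.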
David L. Jones [Wed, 28 Jun 2017 21:14:13 +0000 (21:14 +0000)]
[lit] Remove dead code (not referenced anywhere), and clarify some function names.
Summary:
The dead code seems to be unreferenced, according to textual search across the
LLVM SVN repo.
The clarification part of this change alters the name of a module-level function
so that it is different from the name of the class-methods that call it.
Currently, there are no erroneous references, but stylistically (cf. PEP-8),
internal "helper" functions should generally be named accordingly by prepending
an underscore. (I also chose to add '_impl', which isn't necessary, but helps me
at least to mentally disambiguate the interface and implementation functions.)
Jakub Kuderski [Wed, 28 Jun 2017 18:15:45 +0000 (18:15 +0000)]
[Dominators] Move helper functions into SemiNCAInfo
Summary: Helper functions (DFSPass, ReverseDFSPass, Eval) need SemiNCAInfo anyway, so it's simpler to have them there as member functions. This also makes them simpler by removing template boilerplate.
Craig Topper [Wed, 28 Jun 2017 18:07:29 +0000 (18:07 +0000)]
[InstCombine] Remove 64-bit bit width restriction from m_ConstantInt(uint64_t*&)
I think we only need to make sure the value fits in 64 bits, not that the bit width is 64 bits.
This helps places that use this for shift amounts since the shift amount needs to be the same bitwidth as the LHS, but can't be larger than the bit width.
Jakub Kuderski [Wed, 28 Jun 2017 18:00:36 +0000 (18:00 +0000)]
[Dominators] Move SemiNCAInfo and helper functions out of DominatorTreeBase
Summary:
This moves SemiNCAInfo from DominatorTreeBase to GenericDomTreeConstruction. It also puts helper functions used during tree construction in the same file.
The point of this change is to further clean up DominatorTreeBase and make it easier to construct and verify (in future patches).
Ayal Zaks [Wed, 28 Jun 2017 17:59:33 +0000 (17:59 +0000)]
[LV] Fix PR33613 - retain order of insertelement per part
r306381 caused PR33613, by reversing the order in which insertelements were
generated per unroll part. This patch fixes PR33613 by retaining this order,
placing each set of insertelements per part immediately after the last scalar
being packed for this part. Includes a test case derived from PR33613.
Jakub Kuderski [Wed, 28 Jun 2017 17:56:09 +0000 (17:56 +0000)]
[Dominators] Move IDoms out of DominatorTreeBase and put them in SNCAInfo
Summary: The temporary IDoms map was used only during DomTree calculation. We can move it to SNCAInfo so that it's no longer a DominatorTreeBase member.
Jakub Kuderski [Wed, 28 Jun 2017 16:54:34 +0000 (16:54 +0000)]
[Dominators] Move number to node mapping out of DominatorTreeBase
Summary: Number to node mapping in DominatorTreeBase is used only during calculation, so there is no point keeping it as a member variable. This patch moves this mapping to the Calculate function and passes it to helper functions. It also makes the name more descriptive.