Zachary Turner [Mon, 1 May 2017 16:46:39 +0000 (16:46 +0000)]
[PDB/CodeView] Rename some classes.
In preparation for introducing writing capabilities for each of
these classes, I would like to adopt a Foo / FooRef naming
convention, where Foo indicates that the class can manipulate and
serialize Foos, and FooRef indicates that it is an immutable view of
an existing Foo. In other words, Foo is a writer and FooRef is a
reader. This patch names some existing readers to conform to the
FooRef convention, while offering no functional change.
Sanjoy Das [Mon, 1 May 2017 16:28:58 +0000 (16:28 +0000)]
Emulate TrackingVH using WeakVH
Summary:
This frees up one slot in the HandleBaseKind enum, which I will use
later to add a new kind of value handle. The size of the
HandleBaseKind enum is important because we store a HandleBaseKind in
the low two bits of a (in the worst case) 4 byte aligned pointer.
Craig Topper [Mon, 1 May 2017 16:08:06 +0000 (16:08 +0000)]
[SelectionDAG] Use known ones to provide a better bound for the known zeros for CTTZ/CTLZ operations.
This is the SelectionDAG version of D32521. If know where at least one 1 is located in the input to these intrinsics we can place an upper bound on the number of bits needed to represent the count and thus increase the number of known zeros in the output.
I think we can also refine this further for CTTZ_UNDEF/CTLZ_UNDEF by assuming that the answer will never be BitWidth. I've left this out for now because it caused other test failures across multiple targets. Usually because of turning ADD into OR based on this new information.
In this patch, I introduce a new alt macro feature.
This feature adds meaning for the % when using it as a prefix to the calling macro arguments.
In the altmacro mode, the percent sign '%' before an absolute expression convert the expression first to a string.
As described in the https://sourceware.org/binutils/docs-2.27/as/Altmacro.html
"Expression results as strings
You can write `%expr' to evaluate the expression expr and use the result as a string."
expression assumptions:
1. '%' can only evaluate an absolute expression.
2. Altmacro '%' must be the first character of the evaluated expression.
3. If no '%' is located before the expression, a regular module operation is expected.
4. The result of Absolute Expressions can be only integer.
[DAGCombiner] shrink/widen a vselect to match its condition operand size (PR14657)
We discussed shrinking/widening of selects in IR in D26556, and I'll try to get back to that
patch eventually. But I'm hoping that this transform is less iffy in the DAG where we can check
legality of the select that we want to produce.
A few things to note:
1. We can't wait until after legalization and do this generically because (at least in the x86
tests from PR14657), we'll have PACKSS and bitcasts in the pattern.
2. This might benefit more of the SSE codegen if we lifted the legal-or-custom requirement, but
that requires a closer look to make sure we don't end up worse.
3. There's a 'vblendv' opportunity that we're missing that results in andn/and/or in some cases.
That should be fixed next.
4. I'm assuming that AVX1 offers the worst of all worlds wrt uneven ISA support with multiple
legal vector sizes, but if there are other targets like that, we should add more tests.
5. There's a codegen miracle in the multi-BB tests from PR14657 (the gcc auto-vectorization tests):
despite IR that is terrible for the target, this patch allows us to generate the optimal loop
code because something post-ISEL is hoisting the splat extends above the vector loops.
Sanjoy Das [Sun, 30 Apr 2017 19:41:19 +0000 (19:41 +0000)]
Rename isKnownNotFullPoison to programUndefinedIfPoison; NFC
Summary:
programUndefinedIfPoison makes more sense, given what the function
does; and I'm about to add a function with a name similar to
isKnownNotFullPoison (so do the rename to avoid confusion).
Do not legalize large add with addc/adde, introduce addcarry and do it with uaddo/addcarry
Summary: As per discution on how to get better codegen an large int legalization, it became clear that using a glue for the carry was preventing several desirable optimizations. Passing the carry down as a value allow for more flexibility.
[APInt] Remove support for wrapping from APInt::setBits.
This features isn't used anywhere in tree. It's existence seems to be preventing selfhost builds from inlining any of the setBits methods including setLowBits, setHighBits, and setBitsFrom. This is because the code makes the method recursive.
If anyone needs this feature in the future we could consider adding a setBitsWithWrap method. This way only the calls that need it would pay for it.
Summary:
Apply canonicalization rules:
1. Input vectors with no elements selected from can be replaced with undef.
2. If only one input vector is constant it shall be the second one.
This allows constant-folding to cover more ad-hoc simplifications that
were in place and avoid duplication for RHS and LHS checks.
There are more rules we may want to add in the future when we see a
justification. e.g. mask elements that select undef elements can be
replaced with undef.
Daniel Sanders [Sat, 29 Apr 2017 17:30:09 +0000 (17:30 +0000)]
[globalisel][tablegen] Compute available feature bits correctly.
Summary:
Predicate<> now has a field to indicate how often it must be recomputed.
Currently, there are two frequencies, per-module (RecomputePerFunction==0)
and per-function (RecomputePerFunction==1). Per-function predicates are
currently recomputed more frequently than necessary since the only predicate
in this category is cheap to test. Per-module predicates are now computed in
getSubtargetImpl() while per-function predicates are computed in selectImpl().
Tablegen now manages the PredicateBitset internally. It should only be
necessary to add the required includes.
Also fixed a problem revealed by the test case where
constrainSelectedInstRegOperands() would attempt to tie operands that
BuildMI had already tied.
[KnownBits] Add methods for determining if the known bits represent a negative/nonnegative number and add methods for changing the negative/nonnegative state
Summary: This patch adds isNegative, isNonNegative for querying whether the sign bit is known. It also adds makeNegative and makeNonNegative for controlling the sign bit.
[ConstantRange] Improve the efficiency of one of the ConstantRange constructors.
We were default constructing the Lower/Upper APInts. Then creating min or max value, then doing a move assignment to Lower and copy assignment to upper. The copy assignment operator in particular has an out of line function call that has to examine whether or not a previous allocation exists that can be reused which of course it can't in this case.
The new code creates the min/max value first, move constructs Lower from it then copy constructs Upper from Lower.
This also seems to have convinced a self host build that this constructor can be inlined more readily into other methods in ConstantRange.
[llvm-pdbdump] Abstract some of the YAML/Raw printing code.
There is a lot of duplicate code for printing line info between
YAML and the raw output printer. This introduces a base class
that can be shared between the two, and makes some minor
cleanups in the process.
[ObjCARC] Do not move a release between a call and a
retainAutoreleasedReturnValue that retains the returned value.
This commit fixes a bug in ARC optimizer where it moves a release
between a call and a retainAutoreleasedReturnValue, causing the returned
object to be released before the retainAutoreleasedReturnValue can
retain it.
This commit accomplishes that by doing a lookahead and checking whether
the call prevents the release from moving upwards. In the long term, we
should treat the region between the retainAutoreleasedReturnValue and
the call as a critical section and disallow moving anything there
(possibly using operand bundles).
Davide Italiano [Sat, 29 Apr 2017 00:18:26 +0000 (00:18 +0000)]
[LoopUnswitch] Make DEBUG output more readable (part 2).
I fixed my miscompile in r301722 and I hope I don't have to take
a look at this code again now that Chandler has a new LoopUnswitch
pass, but maybe this could be of use for somebody else in the
meanwhile.
Fuzzer: Mark test/cxxstring.test UNSUPPORTED: windows
This has been mysteriously failing since r301593, which cleaned up the
types of things like size_t and SIZE_MAX for freestanding targets. Reid
and Kostya suggested marking it as UNSUPPORTED on windows, given that no
one has been able to reproduce locally.
Hans Wennborg [Fri, 28 Apr 2017 23:01:32 +0000 (23:01 +0000)]
Revert r301697 "[IR] Make add/remove Attributes use AttrBuilder instead of AttributeList"
This broke the Clang build. (Clang-side patch missing?)
Original commit message:
> [IR] Make add/remove Attributes use AttrBuilder instead of
> AttributeList
>
> This change cleans up call sites and avoids creating temporary
> AttributeList objects.
>
> NFC
Adrian Prantl [Fri, 28 Apr 2017 22:25:46 +0000 (22:25 +0000)]
Remove line and file from DINamespace.
Fixes the issue highlighted in
http://lists.llvm.org/pipermail/cfe-dev/2014-June/037500.html.
The DW_AT_decl_file and DW_AT_decl_line attributes on namespaces can
prevent LLVM from uniquing types that are in the same namespace. They
also don't carry any meaningful information.
Matt Arsenault [Fri, 28 Apr 2017 22:18:19 +0000 (22:18 +0000)]
InferAddressSpaces: Avoid looking up deleted values
While looking at pure addressing expressions, it's possible
for the value to appear later in Postorder.
I haven't been able to come up with a testcase where this
exhibits an actual issue, but if you insert a dump before
the value map lookup, a few testcases crash.
Matt Arsenault [Fri, 28 Apr 2017 22:18:08 +0000 (22:18 +0000)]
InferAddressSpaces: Infer from just addrspacecasts
Eliminates some more cases where some subset of the addressing
computation remains flat. Some cases with addrspacecasts
in nested constant expressions are still left behind however.
[ConstantRange] Use APInt::isNullValue rather than APInt::isMinValue where it would make more sense to thing of 0 as 0 rather than the minimum unsigned value. NFC
[APInt] Add an isNullValue method to check for all bits being zero. Use it in a couple internal methods where it makes more sense than isMinValue or !getBoolValue. NFC
I used Null rather than Zero to match the getNullValue method name.
There are some other places outside APInt where isNullValue would be more readable than isMinValue even though they do the same thing. I'll update those in future patches.
Davide Italiano [Fri, 28 Apr 2017 21:30:50 +0000 (21:30 +0000)]
[LoopUnswitch] Make DEBUG output more readable.
While debugging a miscompile I realized loopunswitch doesn't
put newlines when printing the instruction being replacement.
Ending up with a single line with many instruction replaced isn't
the best for readability and/or mental sanity.
Matt Arsenault [Fri, 28 Apr 2017 21:01:46 +0000 (21:01 +0000)]
TableGen: Add IntrHasSideEffects property for intrinsics
The IntrNoMem, IntrReadMem, IntrWriteMem, and IntrArgMemOnly intrinsic
properties differ from their corresponding LLVM IR attributes by specifying
that the intrinsic, in addition to its memory properties, has no other side
effects.
The IntrHasSideEffects flag used in combination with one of the memory flags
listed above, makes it possible to define an intrinsic such that its
properties at the CodeGen layer match its properties at the IR layer.
The method is called "get *Param* Alignment", and is only used for
return values exactly once, so it should take argument indices, not
attribute indices.
Adds a new method finalizeLowering to TargetLoweringBase. This is in
preparation for an upcoming commit.
This function is meant for target specific adjustments to
MachineFrameInfo or register reservations.
Move the freezeRegisters() and the hasCopyImplyingStackAdjustment()
handling into the new function to prove the concept. As an added bonus
GlobalISel no longer missed the hasCopyImplyingStackAdjustment()
handling with this.
Bob Haarman [Fri, 28 Apr 2017 20:17:15 +0000 (20:17 +0000)]
limit to 2 parallel links when using thinlto
Summary:
When using ThinLTO, the linker performs its own parallelism. This
change limits the number of parallel link jobs that Ninja will issue
to keep the total number of threads reasonable when linking with
ThinLTO.
Use Argument::hasAttribute and AttributeList::ReturnIndex more
This eliminates many extra 'Idx' induction variables in loops over
arguments in CodeGen/ and Target/. It also reduces the number of places
where we assume that ReturnIndex is 0 and that we should add one to
argument numbers to get the corresponding attribute list index.
Bitcode: Do not remove empty summary entries when reading a per-module summary.
This became no longer necessary after D19462 landed, and will be incompatible
with an upcoming change to the summary data structures that changes how we
represent references.
. swap 4-bit register encoding, 16-bit offset and 32-bit imm to support big endian archs
. add a test
Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301653 91177308-0d34-0410-b5e6-96231b3b80d8