It was an horribly redundant interface since a file not existing is also a valid
error_code. Now we have an access function that returns just an error_code. This
is the only function that has to be implemented for Unix and Windows. The
functions can_write, exists and can_execute an now just wrappers.
One still has to be very careful using these function to avoid introducing
race conditions (Time of check to time of use).
Bill Schmidt [Thu, 11 Sep 2014 20:10:03 +0000 (20:10 +0000)]
[PATCH, PowerPC] Accept 'U' and 'X' constraints in inline asm
Inline asm may specify 'U' and 'X' constraints to print a 'u' for an
update-form memory reference, or an 'x' for an indexed-form memory
reference. However, these are really only useful in GCC internal code
generation. In inline asm the operand of the memory constraint is
typically just a register containing the address, so 'U' and 'X' make
no sense.
This patch quietly accepts 'U' and 'X' in inline asm patterns, but
otherwise does nothing. If we ever unexpectedly see a non-register,
we'll assert and sort it out afterwards.
I've added a new test for these constraints; the test case should be
used for other asm-constraints changes down the road.
Adam Nemet [Thu, 11 Sep 2014 16:51:10 +0000 (16:51 +0000)]
[AVX512] Fix miscompile for unpack
r189189 implemented AVX512 unpack by essentially performing a 256-bit unpack
between the low and the high 256 bits of src1 into the low part of the
destination and another unpack of the low and high 256 bits of src2 into the
high part of the destination.
I don't think that's how unpack works. AVX512 unpack simply has more 128-bit
lanes but other than it works the same way as AVX. So in each 128-bit lane,
we're always interleaving certain parts of both operands rather different
parts of one of the operands.
Combine fmul vector FP constants when unsafe math is allowed.
This is an extension of the change made with r215820:
http://llvm.org/viewvc/llvm-project?view=revision&revision=215820
That patch allowed combining of splatted vector FP constants that are multiplied.
This patch allows combining non-uniform vector FP constants too by relaxing the
check on the type of vector. Also, canonicalize a vector fmul in the
same way that we already do for scalars - if only one operand of the fmul is a
constant, make it operand 1. Otherwise, we miss potential folds.
This fold is also done by -instcombine, but it's possible that extra
fmuls may have been generated during lowering.
Refactored the R600_LDS_1A2D class a bit to get it to actually work.
It seemed to be previously unused and broken.
We also have to disable the conversion to the noret variant for now in
R600ISelLowering because the getLDSNoRetOp method only handles 1A1D LDS ops.
Someone can feel free to modify the AMDGPU::getLDSNoRetOp method to
work for more than 1A1D variants of LDS operations. It's being left as a
future TODO for now.
Signed-off-by: Aaron Watry <awatry at gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217596 91177308-0d34-0410-b5e6-96231b3b80d8
[AArch64] Reenable the PBQP test now that the leak issue has been fixed.
David Blaikie's commits r217563 & r217564, which added shared_ptr to the
CostPool have fixed some memory leak issues exposed by the PBQP with
coalescing constraints.
The sanitizer bot was failing because of those leaks. Now that the leaks
are gone, we can reenable the aarch64/pbqp test.
Hal Finkel [Thu, 11 Sep 2014 08:40:17 +0000 (08:40 +0000)]
[AlignmentFromAssumptions] Don't crash just because the target is 32-bit
We used to crash processing any relevant @llvm.assume on a 32-bit target
(because we'd ask SE to subtract expressions of differing types). I've copied
our 'simple.ll' test, but with the data layout from arm-linux-gnueabihf to get
some meaningful test coverage here.
David Blaikie [Wed, 10 Sep 2014 23:54:45 +0000 (23:54 +0000)]
shared_ptrify ownershp of PoolEntries in PBQP's CostPool
Leveraging both intrusive shared_ptr-ing (std::enable_shared_from_this)
and shared_ptr<T>-owning-U (to allow external users to hold
std::shared_ptr<CostT> while keeping the underlying PoolEntry alive).
The intrusiveness could be removed if we had a weak_set that implicitly
removed items from the set when their underlying data went away.
This /might/ fix an existing memory leak reported by LeakSanitizer in
r217504.
Matt Arsenault [Wed, 10 Sep 2014 23:26:16 +0000 (23:26 +0000)]
R600/SI: Report offset in correct units for st64 DS instructions
Need to convert the 64 element offset into bytes, not just the element
size like the normal case instructions.
Noticed by inspection. This can't be hit now because
st64 instructions aren't emitted during instruction selection,
and the post-RA scheduler isn't enabled.
Rafael Espindola [Wed, 10 Sep 2014 21:27:43 +0000 (21:27 +0000)]
Add doInitialization/doFinalization to DataLayoutPass.
With this a DataLayoutPass can be reused for multiple modules.
Once we have doInitialization/doFinalization, it doesn't seem necessary to pass
a Module to the constructor.
Overall this change seems in line with the idea of making DataLayout a required
part of Module. With it the only way of having a DataLayout used is to add it
to the Module.
Hal Finkel [Wed, 10 Sep 2014 21:05:52 +0000 (21:05 +0000)]
[AlignmentFromAssumptions] Don't divide by zero for unknown starting alignment
The routine that determines an alignment given some SCEV returns zero if the
answer is unknown. In a case where we could determine the increment of an
AddRec but not the starting alignment, we would compute the integer modulus by
zero (which is illegal and traps). Prevent this by returning early if either
the start or increment alignment is unknown (zero).
The increase of the interleave factor to 4 has side-effects
like performance losses eg. due to reminder loops being executed
more frequently and may increase code size. It requires more
analysis and careful heuristic tuning. Expect double digit gains
in small benchmarks like lowercase.c and losses in puzzle.c.
Summary:
Make CallingConv::ID a plain unsigned instead of enum with a
fixed set of valus. LLVM IR allows arbitraty calling conventions (you are
free to write cc12345), and loading them as enum is an undefined
behavior. This was reported by UBSan.
[AArch 64] Use a constant pool load for weak symbol references when
using static relocation model and small code model.
Summary: currently we generate GOT based relocations for weak symbol
references regardless of the underlying relocation model. This should
be change so that in static relocation model we use a constant pool
load instead.
David Majnemer [Wed, 10 Sep 2014 12:51:52 +0000 (12:51 +0000)]
Object: Add support for bigobj
This adds support for reading the "bigobj" variant of COFF produced by
cl's /bigobj and mingw's -mbig-obj.
The most significant difference that bigobj brings is more than 2**16
sections to COFF.
bigobj brings a few interesting differences with it:
- It doesn't have a Characteristics field in the file header.
- It doesn't have a SizeOfOptionalHeader field in the file header (it's
only used in executable files).
- Auxiliary symbol records have the same width as a symbol table entry.
Since symbol table entries are bigger, so are auxiliary symbol
records.
Dan Liew [Wed, 10 Sep 2014 10:18:59 +0000 (10:18 +0000)]
Attempt to fix PR20884
This fixes the generation of broken LLVMExports.cmake file by
the Autoconf/Makefile build system when --enable-shared is passed to
configure.
When --enable_shared is passed the Makefile.rules does not set the
LLVMConfigLibs variable which cmake/modules/Makefile previously relied
on. Now it runs the llvm-config command itself to get the library names.
This still isn't perfect because the generated LLVM targets refer to the
static libraries and not the shared library but that is much larger
problem to fix.
MergeFunctions: FunctionPtr has been renamed to FunctionNode.
It's supposed to store additional pass information for current function here.
That was the reason for name change.
It appears that the -filename-equivalence option for testing llvm-cov
doesn't work correctly with -show-expansions. I'm reverting this test
to get the bots green while I look into fixing that.
David Blaikie [Tue, 9 Sep 2014 23:13:01 +0000 (23:13 +0000)]
Sink PrevCU updating into DwarfUnit::addRange to ensure consistency
So that the two operations in DwarfDebug couldn't get separated (because
I accidentally separated them in some work in progress), put them
together. While we're here, move DwarfUnit::addRange to
DwarfCompileUnit, since it's not relevant to type units.
David Blaikie [Tue, 9 Sep 2014 22:56:36 +0000 (22:56 +0000)]
Remove DwarfDebug::PrevSection, PrevCU is sufficient for handling address range holes.
PrevSection/PrevCU are used to detect holes in the address range of a CU
to ensure the DW_AT_ranges does not include those holes. When we see a
function with no debug info, though it may be in the same range as the
prior and subsequent functions, there should be a gap in the CU's
ranges. By setting PrevCU to null in that case, the range would not be
extended to cover the gap.
Handle common linkage correctly in the gold plugin.
This is the plugin version of pr20882.
This handles the case of every common symbol being in the IR. We will need some
support from gold to handle the case where some symbols are in ELF and some in
the IR.
Add a scheduling model for AMD 16H Jaguar (btver2).
This is a first pass at a scheduling model for Jaguar.
It's structured largely on the existing SandyBridge and SLM sched models.
Using this model, in addition to turning on the PostRA scheduler, results in
some perf wins on internal and 3rd party benchmarks. There's not much difference
in LLVM's test-suite benchmarking subset of tests.
[mips] Add assembler support for .set mips0 directive.
Summary:
This directive is used to reset the assembler options to their initial values.
Assembly programmers use it in conjunction with the ".set mipsX" directives.
This patch depends on the .set push/pop directive (http://reviews.llvm.org/D4821).