Teach Clang how to use response files when calling other tools
Patch by Rafael Auler!
This patch addresses PR15171 and teaches Clang how to call other tools
with response files, when the command line exceeds system limits. This
is a problem for Windows systems, whose maximum command-line length is
32kb.
I introduce the concept of "response file support" for each Tool object.
A given Tool may have full support for response files (e.g. MSVC's
link.exe) or only support file names inside response files, but no flags
(e.g. Apple's ld64, as commented in PR15171), or no support at all (the
default case). Therefore, if you implement a toolchain in the clang
driver and you want clang to be able to use response files in your
tools, you must override a method (getReponseFileSupport()) to tell so.
I designed it to support different kinds of tools and
internationalisation needs:
- VS response files ( UTF-16 )
- GNU tools ( uses system's current code page, windows' legacy intl.
support, with escaped backslashes. On unix, fallback to UTF-8 )
- Clang itself ( UTF-16 on windows, UTF-8 on unix )
- ld64 response files ( only a limited file list, UTF-8 on unix )
With this design, I was able to test input file names with spaces and
international characters for Windows. When the linker input is large
enough, it creates a response file with the correct encoding. On a Mac,
to test ld64, I temporarily changed Clang's behavior to always use
response files regardless of the command size limit (avoiding using huge
command line inputs). I tested clang with the LLVM test suite (compiling
benchmarks) and it did fine.
Test Plan: A LIT test that tests proper response files support. This is
tricky, since, for Unix systems, we need a 2MB response file, otherwise
Clang will simply use regular arguments instead of a response file. To
do this, my LIT test generate the file on the fly by cloning many -DTEST
parameters until we have a 2MB file. I found out that processing 2MB of
arguments is pretty slow, it takes 1 minute using my notebook in a debug
build, or 10s in a Release build. Therefore, I also added "REQUIRES:
long_tests", so it will only run when the user wants to run long tests.
In the full discussion in
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130408/171463.html,
Rafael Espindola discusses a proper way to test
llvm::sys::argumentsFitWithinSystemLimits(), and, there, Chandler
suggests to use 10 times the current system limit (20MB resp file), so
we guarantee that the system will always use response file, even if a
new linux comes up that can handle a few more bytes of arguments.
However, by testing with a 20MB resp file, the test takes long 8 minutes
just to perform a silly check to see if the driver will use a response
file. I found it to be unreasonable. Thus, I discarded this approach and
uses a 2MB response file, which should be enough.
David Blaikie [Mon, 15 Sep 2014 17:30:56 +0000 (17:30 +0000)]
Fix memory leak of raw_ostreams in LogDiagnosticPrinter handling.
This is another case of conditional ownership (in this case a raw
reference, plus a boolean to indicate whether the referenced object
should be deleted). While it's not ideal, I prefer to make the ownership
explicit with a unique_ptr than using a boolean flag (though it does
make the reference and the unique_ptr redundant in the sense that they
both refer to the same memory). At some point we might write a reusable
conditional ownership pointer (a stateful custom deleter for a unique_ptr
may be appropriate).
This adds a flag called -fseh-exceptions that uses the native Windows
.pdata and .xdata unwind mechanism to throw exceptions. The other EH
possibilities are DWARF and SJLJ exceptions.
When pretty printing attributes that have enumeration arguments, print the enumerator identifier (as a string literal) instead of the internal enumerator integral value.
Benjamin Kramer [Mon, 15 Sep 2014 11:47:10 +0000 (11:47 +0000)]
Edit: Do not extend a removal to include trailing whitespace if we're at the end
of the file.
This would run past the end of the buffer. Sadly I don't have a great way to
test it, the only way to trigger the bug is having a removal fix it at the end
of the file, which none of our current warnings can generate.
patch to add missing warning on sizeof wrong parameter
for __builtin___strlcpy_chk/__builtin___strlcat_chk.
Patch by Jacques Fortier with monir change by me and
addition of test. rdar://18259539
[ASan/Win] Rename asan_win_uar_thunk.lib to asan_win_dynamic_runtime_thunk.lib
It turned out that we have to bridge more stuff between the executable
and the ASan RTL DLL than just __asan_option_detect_stack_use_after_return.
See PR20918 for more details.
David Majnemer [Fri, 12 Sep 2014 04:38:08 +0000 (04:38 +0000)]
MS ABI: The latest VC "14" CTP implements deleted virtual functions
Deleted virtual functions get _purecall inserted into the vftable.
Earlier CTPs would simply stick nullptr in there.
N.B. MSVC can't handle deleted virtual functions which require return
adjusting thunks, they give an error that a deleted function couldn't be
called inside of a compiler generated function. We get this correct by
making the thunk have a __purecall entry as well.
Thread Safety Analysis: Avoid infinite recursion in an operator<<
r217556 introduced an operator<<(std::ostream &, StringRef) that seems
to self recurse on some systems, because str.data(), which is a char *,
was being implicitly converted back to StringRef in overload
resolution.
This manifested as SemaCXX/warn-thread-safety-analysis.cpp timing out
in release builds and overflowing the stack in debug builds. One of
the failing systems that saw this is here:
Objective-C. Under a special flag, -Wcstring-format-directive,
off by default, issue a warning if %s directive is used
in formart argument of a function/method declared as
__attribute__((format(CF/NSString, ...)))
To complete rdar://18182443
clang-cl: Add support for the /o option for object files, executables, and preprocessor output
Summary:
cl.exe recognizes /o as a deprecated and undocumented option similar to
/Fe. This patch adds support for this option to clang-cl for /Fe, /Fo
and /Fi. It also ensures that the last option among /o and /F* wins,
if both specified.
This is required at least for building autoconf based software, since
autoconf uses -o to specify the executable output.
Thread Safety Analysis: major update to thread safety TIL.
Numerous changes, including:
* Changed the way variables and instructions are handled in basic blocks to
be more efficient.
* Eliminated SExprRef.
* Simplified futures.
* Fixed documentation.
* Compute dominator and post dominator trees.
Ben Langmuir [Wed, 10 Sep 2014 21:29:41 +0000 (21:29 +0000)]
Avoid a couple of assertions when preprocessing with modules
1. We were hitting the NextIsPrevious assertion because we were trying
to merge decl chains that were independent of each other because we had
no Sema object to allow them to find existing decls. This is fixed by
delaying loading the "preloaded" decls until Sema is available.
2. We were trying to get identifier info from an annotation token, which
asserts. The fix is to special-case the module annotations in the
preprocessed output printer.
Fixed in a single commit because when you hit 1 you almost invariably
hit 2 as well.
Only override the target architecture on -m32 and friends if it is
actually different. Fixes a surprising link error with nodejs on rpi,
where armv6-netbsd-eabihf turned into armv5e-netbsd-eabihf, which
doesn't lacks the necessary VFP support.
In Parser::ParseCXXClassMemberDeclaration(), it was possible to change
isAccessDecl = NextToken().is(tok::kw_operator);
to
isAccessDecl = false;
and no tests would fail. Now there's coverage for this.
Benjamin Kramer [Wed, 10 Sep 2014 12:50:59 +0000 (12:50 +0000)]
CodeGen: Use a fixed alignment for vtables.
Pointer-sized alignment is sufficient as we only ever read single values
from the table. Otherwise we'd bump the alignment to 16 bytes in the
backend if the vtable is larger than 16 bytes. This is great for
structures that are accessed with vector instructions or copied around, but
that's simply not the case for vtables.
Shrinks the data segment of a Release x86_64 clang by 0.3%. The wins are
larger for i386 and code bases that use vtables more often than we do.
Objective-C. Under a special flag, -Wcstring-format-directive,
off by default, issue a warning if %s directive is used in
certain CF/NS formatting APIs, to assist user in deprecating
use of such %s in these APIs. rdar://18182443
Daniel Jasper [Tue, 9 Sep 2014 14:37:39 +0000 (14:37 +0000)]
clang-format: [JS] Support regex literals with trailing escaped slash.
Before:
var regex = / a\//; int i;
After:
var regex = /a\//;
int i;
This required pushing the Lexer into its wrapper class and generating a
new one in this specific case. Otherwise, the sequence get lexed as a
//-comment. This is hacky, but I don't know a better way (short of
supporting regex literals in the Lexer).
Pushing the Lexer down seems to make all the call sites simpler.
Benjamin Kramer [Tue, 9 Sep 2014 13:53:29 +0000 (13:53 +0000)]
Tooling: Ignore file names in tooling::deduplicate.
This was horribly broken due to how the sort predicate works. We would
report a conflict for files with a replacement in the same position but
different names if the length differed. Just ignore paths as this is often
what the user wants. Files can occur with different names (due to symlinks
or relative paths) and we don't ever want to do the same edit in one file
twice.
Summary:
This patch implements a new UBSan check, which verifies
that function arguments declared to be nonnull with __attribute__((nonnull))
are actually nonnull in runtime.
To implement this check, we pass FunctionDecl to CodeGenFunction::EmitCallArgs
(where applicable) and if function declaration has nonnull attribute specified
for a certain formal parameter, we compare the corresponding RValue to null as
soon as it's calculated.
Ben Langmuir [Mon, 8 Sep 2014 16:15:54 +0000 (16:15 +0000)]
Make FileEntry::getName() valid across calls to FileManager::getFile()
Because we may change the name of a FileEntry inside getFile, the name
returned by FileEntry::getName() could be destroyed. This was causing a
use-after-free when searching the HeaderFileInfo on-disk hashtable for a
module or pch.
Hal Finkel [Mon, 8 Sep 2014 00:09:15 +0000 (00:09 +0000)]
Don't test really-large alignments for now
Temporarily comment out the test for really-large powers of two. This seems to
be host-sensitive for some reason... trying to fix the clang-i386-freebsd
builder.
Hal Finkel [Sun, 7 Sep 2014 22:58:14 +0000 (22:58 +0000)]
Add __builtin_assume and __builtin_assume_aligned using @llvm.assume.
This makes use of the recently-added @llvm.assume intrinsic to implement a
__builtin_assume(bool) intrinsic (to provide additional information to the
optimizer). This hooks up __assume in MS-compatibility mode to mirror
__builtin_assume (the semantics have been intentionally kept compatible), and
implements GCC's __builtin_assume_aligned as assume((p - o) & mask == 0). LLVM
now contains special logic to deal with assumptions of this form.
Hal Finkel [Sun, 7 Sep 2014 21:28:53 +0000 (21:28 +0000)]
Adjust test/CodeGenCXX/pr12251.cpp
InstCombine just got a bit smarter about checking known bits of returned
values, and because this test runs the optimizer, it requires an update. We
should really rewrite this test to directly check the IR output from CodeGen.
[x86] Clean up the x86 builtin specs to reflect r217310 in LLVM which
made the 8-bit masks actually 8-bit arguments to these intrinsics.
These builtins are a mess. Many were missing the I qualifier which
I added where obviously correct. Most aren't tested, but I've updated
the relevant tests. I've tried to catch all the things that should
become 'c' in this round.
It's also frustrating because the set of these is really ad-hoc and
doesn't really map that cleanly to the set supported by either GCC or
LLVM. Oh well...
Add -Wunused-local-typedef, a warning that finds unused local typedefs.
The warning warns on TypedefNameDecls -- typedefs and C++11 using aliases --
that are !isReferenced(). Since the isReferenced() bit on TypedefNameDecls
wasn't used for anything before this warning it wasn't always set correctly,
so this patch also adds a few missing MarkAnyDeclReferenced() calls in
various places for TypedefNameDecls.
This is made a bit complicated due to local typedefs possibly being used only
after their local scope has closed. Consider:
Here the typedef in XXX is only used at end-of-translation unit, when YYY in
template_fun() gets instantiated. To handle this, typedefs that are unused when
their scope exits are added to a set of potentially unused typedefs, and that
set gets checked at end-of-TU. Typedefs that are still unused at that point then
get warned on. There's also serialization code for this set, so that the
warning works with precompiled headers and modules. For modules, the warning
is emitted when the module is built, for precompiled headers each time the
header gets used.
Finally, consider a function using C++14 auto return types to return a local
type defined in a header:
auto f() {
struct S { typedef int a; };
return S();
}
Here, the typedef escapes its local scope and could be used by only some
translation units including the header. To not warn on this, add a
RecursiveASTVisitor that marks all delcs on local types returned from auto
functions as referenced. (Except if it's a function with internal linkage, or
the decls are private and the local type has no friends -- in these cases, it
_is_ safe to warn.)
Several of the included testcases (most of the interesting ones) were provided
by Richard Smith.
(gcc's spelling -Wunused-local-typedefs is supported as an alias for this
warning.)
Richard Smith [Sat, 6 Sep 2014 00:24:58 +0000 (00:24 +0000)]
Reword switch/goto diagnostics "protected scope" diagnostics. Making up a term
"protected scope" is very unhelpful here and actively confuses users. Instead,
simply state the nature of the problem in the diagnostic: we cannot jump from
here to there. The notes explain nicely why not.
David Blaikie [Fri, 5 Sep 2014 23:36:59 +0000 (23:36 +0000)]
Fix r217275 to work without the need for standard headers being included
It seems (I guess) in ObjC that va_list is provided without the need for
inclusions. I verified that with this change the test still crashes in
the absence of the fix committed in r217275.
Ben Langmuir [Fri, 5 Sep 2014 20:24:27 +0000 (20:24 +0000)]
Move the initialization of VAListTagName after InitializeSema()
This innocuous statement to get the identifier info for __va_list_tag
was causing an assertion failure:
NextIsPrevious() && "decl became non-canonical unexpectedly"
if the __va_list_tag identifier was found in a PCH in some
circumstances, because it was looked up before the ASTReader had a Sema
object to use to find existing decls to merge with.
We could possibly move getting the identifier info even later, or make
it lazy if we wanted to, but this seemed like the minimal change.
Now why a PCH would have this identifier in the first place is a bit
mysterious. This seems to be related to the global module index in some
way, because when the test case is built without the global module index
it will not emit an identifier for __va_list_tag into the PCH, but with
the global module index it does.
Separate the matchers by type and statically dispatch to the right list.
Summary:
Separate the matchers by type and statically dispatch to the right list.
For any node type that we support, it reduces the number of matchers we
run it through.
For node types we do not support, it makes match() a noop.
This change improves our clang-tidy related benchmark by ~30%.
-frewrite-includes: Normalize line endings to match the main source file
It is very common to include headers with DOS-style line endings, such
as windows.h, from source files with Unix-style line endings.
Previously, we would end up with mixed line endings and #endifs that
appeared to be on the same line:
#if 0 /* expanded by -frewrite-includes */
#include <windows.h>^M#endif /* expanded by -frewrite-includes */
Clang treats either of \r or \n as a line ending character, so this is
purely a cosmetic issue.
This has no automated test because most Unix tools on Windows will
implictly convert CRLF to LF when reading files, making it very hard to
detect line ending mismatches. FileCheck doesn't understand {{\r}}
either.