Aaron Ballman [Thu, 20 Dec 2018 20:20:20 +0000 (20:20 +0000)]
Allow direct navigation to static analysis checker documentation through SARIF exports.
This adds anchors to all of the documented checks so that you can directly link to a check by a stable name. This is useful because the SARIF file format has a field for specifying a URI to documentation for a rule and some viewers, like CodeSonar, make use of this information. These links are then exposed through the SARIF exporter.
Bruno Ricci [Thu, 20 Dec 2018 20:05:11 +0000 (20:05 +0000)]
[Sema] Don't try to account for the size of an incomplete type in CheckArrayAccess
When checking that the array access is not out-of-bounds in CheckArrayAccess
it is possible that the type of the base expression after IgnoreParenCasts is
incomplete, even though the type of the base expression before IgnoreParenCasts
is complete. In this case we have no information about whether the array access
is out-of-bounds and we should just bail-out instead. This fixes PR39746 which
was caused by trying to obtain the size of an incomplete type.
Artem Dergachev [Thu, 20 Dec 2018 19:36:06 +0000 (19:36 +0000)]
Revert "[analyzer] pr38668: Do not attempt to cast loaded values..."
This reverts commit r349701.
The patch was incorrect. The whole point of CastRetrievedVal()
is to handle the case in which the type from which the cast is made
(i.e., the "type" of value `V`) has nothing to do with the type of
the region it was loaded from (i.e., `R->getValueType()`).
Pete Cooper [Thu, 20 Dec 2018 18:05:41 +0000 (18:05 +0000)]
Use @llvm.objc.clang.arc.use intrinsic instead of clang.arc.use function.
Calls to this function are deleted in the ARC optimizer. However when the ARC
optimizer was updated to use intrinsics instead of functions (r349534), the corresponding
clang change (r349535) to use intrinsics missed this one so it wasn't being deleted.
Ulrich Weigand [Thu, 20 Dec 2018 13:10:47 +0000 (13:10 +0000)]
[SystemZ] Improve testing of vecintrin.h intrinsics
This adds assembly-level tests to verify that the high-level
intrinsics generate the instructions they're supposed to.
These tests would have caught the codegen bugs I just fixed.
Michal Gorny [Thu, 20 Dec 2018 13:09:30 +0000 (13:09 +0000)]
Replace getOS() == llvm::Triple::*BSD with isOS*BSD() [NFCI]
Replace multiple comparisons of getOS() value with FreeBSD, NetBSD,
OpenBSD and DragonFly with matching isOS*BSD() methods. This should
improve the consistency of coding style without changing the behavior.
Direct getOS() comparisons were left whenever used in switch or switch-
like context.
Artem Dergachev [Wed, 19 Dec 2018 23:48:44 +0000 (23:48 +0000)]
[analyzer] pr38668: Do not attempt to cast loaded values of non-scalar types.
It is expected to have the same object (memory region) treated as if it has
different types in different program points. The correct behavior for
RegionStore when an object is stored as an object of type T1 but loaded as
an object of type T2 is to store the object as if it has type T1 but cast it
to T2 during load.
Note that the cast here is some sort of a "reinterpret_cast" (even in C). For
instance, if you store a float and load an integer, you won't have your float
rounded to an integer; instead, you will have garbage.
Admit that we cannot perform the cast as long as types we're dealing with are
non-trivial (neither integers, nor pointers).
Of course, if the cast is not necessary (eg, T1 == T2), we can still load the
value just fine.
Artem Dergachev [Wed, 19 Dec 2018 23:14:06 +0000 (23:14 +0000)]
[analyzer] Improve modeling for returning an object from the top frame with RVO.
Static Analyzer processes the program function-by-function, sometimes diving
into other functions ("inlining" them). When an object is returned from an
inlined function, Return Value Optimization is modeled, and the returned object
is constructed at its return location directly.
When an object is returned from the function from which the analysis has started
(the top stack frame of the analysis), the return location is unknown. Model it
with a SymbolicRegion based on a conjured symbol that is specifically tagged for
that purpose, because this is generally the correct way to symbolicate
unknown locations in Static Analyzer.
Fixes leak false positives when an object is returned from top frame in C++17:
objects that are put into a SymbolicRegion-based memory region automatically
"escape" and no longer get reported as leaks. This only applies to C++17 return
values with destructors, because it produces a redundant CXXBindTemporaryExpr
in the call site, which confuses our liveness analysis. The actual fix
for liveness analysis is still pending, but it is no longer causing problems.
Additionally, re-enable temporary destructor tests in C++17.
Artem Dergachev [Wed, 19 Dec 2018 21:50:46 +0000 (21:50 +0000)]
[analyzer] CStringChecker: Fix a crash on C++ overloads of standard functions.
It turns out that it's not all that uncommon to have a C++ override of, say,
memcpy that receives a structure (or two) by reference (or by value, if it's
being copied from) and copies memory from it (or into it, if it's passed
by reference). In this case the argument will be of structure type (recall that
expressions of reference type do not exist: instead, C++ classifies expressions
into prvalues and lvalues and xvalues).
In this scenario we crash because we are trying to assume that, say,
a memory region is equal to an empty CompoundValue (the non-lazy one; this is
what makeZeroVal() return for compound types and it represents prvalue of
an object that is initialized with an empty initializer list).
David Blaikie [Wed, 19 Dec 2018 19:33:35 +0000 (19:33 +0000)]
PR40096: Forwards-compatible with C++20 rule regarding aggregates not having user-declared ctors
Looks like these were in place to make these types move-only. That's
generally not a feature that the type should prescribe (unless it's an
inherent limitation) - instead leaving it up to the users of a type.
Alexey Bataev [Wed, 19 Dec 2018 18:16:37 +0000 (18:16 +0000)]
[OPENMP]Mark the loop as started when initialized.
Need to mark the loop as started when the initialization statement is
found. It is required to prevent possible incorrect loop iteraton
variable detection during template instantiation and fix the compiler
crash during the codegen.
Ilya Biryukov [Wed, 19 Dec 2018 18:01:24 +0000 (18:01 +0000)]
[CodeComplete] Properly determine qualifiers of 'this' in a lambda
Summary:
The clang used to pick up the qualifiers of the lamba's call operator
(which is always const) and fail to show non-const methods of 'this' in
completion results.
Michal Gorny [Wed, 19 Dec 2018 17:25:59 +0000 (17:25 +0000)]
[Driver] [NetBSD] Add -D_REENTRANT when using sanitizers
NetBSD intends to support only reentrant interfaces in interceptors.
When -lpthread is used without _REENTRANT defined, things are
not guaranteed to work.
This is especially important for <stdio.h> and sanitization of
interfaces around FILE. Some APIs have alternative modes depending
on the _REENTRANT definition, and NetBSD intends to support sanitization
of the _REENTRANT ones.
Michal Gorny [Wed, 19 Dec 2018 17:25:55 +0000 (17:25 +0000)]
[Driver] Add .hasAnySanitizer() to SanitizerArgs
Add a simple method to query whether any sanitizer was enabled,
via SanitizerArgs. This will be used in the NetBSD driver to pass
additional definitions that are required by all sanitizers.
Simon Pilgrim [Wed, 19 Dec 2018 14:43:47 +0000 (14:43 +0000)]
[X86][SSE] Auto upgrade PADDUS/PSUBUS intrinsics to UADD_SAT/USUB_SAT generic intrinsics (clang)
Sibling patch to D55855, this emits UADD_SAT/USUB_SAT generic intrinsics for the SSE saturated math intrinsics instead of expanding to a IR code sequence that could be difficult to reassemble.
Bill Wendling [Tue, 18 Dec 2018 22:54:03 +0000 (22:54 +0000)]
Emit ASM input in a constant context
Summary:
Some ASM input constraints (e.g., "i" and "n") require immediate values. At O0,
very few code transformations are performed. So if we cannot resolve to an
immediate when emitting the ASM input we shouldn't delay its processing.
Aaron Ballman [Tue, 18 Dec 2018 21:42:20 +0000 (21:42 +0000)]
Fix errors with the Clang natvis file.
This updates the FunctionProtoType visualizer to use the proper bits for determining parameter information and the DeclarationName visualizer to use the detail namespace. It also adds support for viewing newer special declaration names (like deduction guides).
Vedant Kumar [Tue, 18 Dec 2018 21:05:03 +0000 (21:05 +0000)]
[CodeGen] Handle mixed-width ops in mixed-sign mul-with-overflow lowering
The special lowering for __builtin_mul_overflow introduced in r320902
fixed an ICE seen when passing mixed-sign operands to the builtin.
This patch extends the special lowering to cover mixed-width, mixed-sign
operands. In a few common scenarios, calls to muloti4 will no longer be
emitted.
This should address the latest comments in PR34920 and work around the
link failure seen in:
Alexey Bataev [Tue, 18 Dec 2018 21:01:42 +0000 (21:01 +0000)]
[OPENMP][NVPTX]Emit shared memory buffer for reduction as 128 bytes
buffer.
Seems to me, nvlink has a bug with the proper support of the weakly
linked symbols. It does not allow to define several shared memory buffer
with the different sizes even with the weak linkage. Instead we always
use 128 bytes buffer to prevent nvlink from the error message emission.
Serge Guelton [Tue, 18 Dec 2018 16:04:21 +0000 (16:04 +0000)]
Portable Python script across Python version
In Python3, dict.items, dict.keys, dict.values, zip, map and filter no longer return lists, they create generator instead.
The portability patch consists in forcing an extra `list` call if the result is actually used as a list.
`map` are replaced by list comprehension and `filter` by filtered list comprehension.
Aaron Ballman [Tue, 18 Dec 2018 15:54:38 +0000 (15:54 +0000)]
Emit -Wformat properly for bit-field promotions.
Only explicitly look through integer and floating-point promotion where the result type is actually a promotion, which is not always the case for bit-fields in C.
Haojian Wu [Tue, 18 Dec 2018 15:29:12 +0000 (15:29 +0000)]
[AST] Unify the code paths of traversing lambda expressions.
Summary:
This supposes to be a non-functional change. We have two code paths when
traversing lambda expressions:
1) traverse the function proto typeloc when parameters and return type
are explicit;
2) otherwise fallback to traverse parameter decls and return type loc
individually;
This patch unifies the code path to always traverse parameters and
return type, rather than relying on traversing the full type-loc.
Serge Guelton [Tue, 18 Dec 2018 08:38:50 +0000 (08:38 +0000)]
Portable Python script across Python version
In Python2, division between integer yields an integer, while it yields a float in Python3.
Use a combination of from __future__ import division and // operator to get a portable behavior.
Martin Storsjo [Tue, 18 Dec 2018 08:36:10 +0000 (08:36 +0000)]
[Driver] Automatically enable -munwind-tables if -fseh-exceptions is enabled
For targets where SEH exceptions are used by default (on MinGW,
only x86_64 so far), -munwind-tables are added automatically. If
-fseh-exeptions is enabled on a target where SEH exeptions are
availble but not enabled by default yet (aarch64), we need to
pass -munwind-tables if -fseh-exceptions was specified.
JF Bastien [Tue, 18 Dec 2018 05:12:21 +0000 (05:12 +0000)]
Automatic variable initialization
Summary:
Add an option to initialize automatic variables with either a pattern or with
zeroes. The default is still that automatic variables are uninitialized. Also
add attributes to request uninitialized on a per-variable basis, mainly to disable
initialization of large stack arrays when deemed too expensive.
This isn't meant to change the semantics of C and C++. Rather, it's meant to be
a last-resort when programmers inadvertently have some undefined behavior in
their code. This patch aims to make undefined behavior hurt less, which
security-minded people will be very happy about. Notably, this means that
there's no inadvertent information leak when:
- The compiler re-uses stack slots, and a value is used uninitialized.
- The compiler re-uses a register, and a value is used uninitialized.
- Stack structs / arrays / unions with padding are copied.
This patch only addresses stack and register information leaks. There's many
more infoleaks that we could address, and much more undefined behavior that
could be tamed. Let's keep this patch focused, and I'm happy to address related
issues elsewhere.
To keep the patch simple, only some `undef` is removed for now, see
`replaceUndef`. The padding-related infoleaks are therefore not all gone yet.
This will be addressed in a follow-up, mainly because addressing padding-related
leaks should be a stand-alone option which is implied by variable
initialization.
There are three options when it comes to automatic variable initialization:
0. Uninitialized
This is C and C++'s default. It's not changing. Depending on code
generation, a programmer who runs into undefined behavior by using an
uninialized automatic variable may observe any previous value (including
program secrets), or any value which the compiler saw fit to materialize on
the stack or in a register (this could be to synthesize an immediate, to
refer to code or data locations, to generate cookies, etc).
1. Pattern initialization
This is the recommended initialization approach. Pattern initialization's
goal is to initialize automatic variables with values which will likely
transform logic bugs into crashes down the line, are easily recognizable in
a crash dump, without being values which programmers can rely on for useful
program semantics. At the same time, pattern initialization tries to
generate code which will optimize well. You'll find the following details in
`patternFor`:
- Integers are initialized with repeated 0xAA bytes (infinite scream).
- Vectors of integers are also initialized with infinite scream.
- Pointers are initialized with infinite scream on 64-bit platforms because
it's an unmappable pointer value on architectures I'm aware of. Pointers
are initialize to 0x000000AA (small scream) on 32-bit platforms because
32-bit platforms don't consistently offer unmappable pages. When they do
it's usually the zero page. As people try this out, I expect that we'll
want to allow different platforms to customize this, let's do so later.
- Vectors of pointers are initialized the same way pointers are.
- Floating point values and vectors are initialized with a negative quiet
NaN with repeated 0xFF payload (e.g. 0xffffffff and 0xffffffffffffffff).
NaNs are nice (here, anways) because they propagate on arithmetic, making
it more likely that entire computations become NaN when a single
uninitialized value sneaks in.
- Arrays are initialized to their homogeneous elements' initialization
value, repeated. Stack-based Variable-Length Arrays (VLAs) are
runtime-initialized to the allocated size (no effort is made for negative
size, but zero-sized VLAs are untouched even if technically undefined).
- Structs are initialized to their heterogeneous element's initialization
values. Zero-size structs are initialized as 0xAA since they're allocated
a single byte.
- Unions are initialized using the initialization for the largest member of
the union.
Expect the values used for pattern initialization to change over time, as we
refine heuristics (both for performance and security). The goal is truly to
avoid injecting semantics into undefined behavior, and we should be
comfortable changing these values when there's a worthwhile point in doing
so.
Why so much infinite scream? Repeated byte patterns tend to be easy to
synthesize on most architectures, and otherwise memset is usually very
efficient. For values which aren't entirely repeated byte patterns, LLVM
will often generate code which does memset + a few stores.
2. Zero initialization
Zero initialize all values. This has the unfortunate side-effect of
providing semantics to otherwise undefined behavior, programs therefore
might start to rely on this behavior, and that's sad. However, some
programmers believe that pattern initialization is too expensive for them,
and data might show that they're right. The only way to make these
programmers wrong is to offer zero-initialization as an option, figure out
where they are right, and optimize the compiler into submission. Until the
compiler provides acceptable performance for all security-minded code, zero
initialization is a useful (if blunt) tool.
I've been asked for a fourth initialization option: user-provided byte value.
This might be useful, and can easily be added later.
Why is an out-of band initialization mecanism desired? We could instead use
-Wuninitialized! Indeed we could, but then we're forcing the programmer to
provide semantics for something which doesn't actually have any (it's
uninitialized!). It's then unclear whether `int derp = 0;` lends meaning to `0`,
or whether it's just there to shut that warning up. It's also way easier to use
a compiler flag than it is to manually and intelligently initialize all values
in a program.
Why not just rely on static analysis? Because it cannot reason about all dynamic
code paths effectively, and it has false positives. It's a great tool, could get
even better, but it's simply incapable of catching all uses of uninitialized
values.
Why not just rely on memory sanitizer? Because it's not universally available,
has a 3x performance cost, and shouldn't be deployed in production. Again, it's
a great tool, it'll find the dynamic uses of uninitialized variables that your
test coverage hits, but it won't find the ones that you encounter in production.
What's the performance like? Not too bad! Previous publications [0] have cited
2.7 to 4.5% averages. We've commmitted a few patches over the last few months to
address specific regressions, both in code size and performance. In all cases,
the optimizations are generally useful, but variable initialization benefits
from them a lot more than regular code does. We've got a handful of other
optimizations in mind, but the code is in good enough shape and has found enough
latent issues that it's a good time to get the change reviewed, checked in, and
have others kick the tires. We'll continue reducing overheads as we try this out
on diverse codebases.
Is it a good idea? Security-minded folks think so, and apparently so does the
Microsoft Visual Studio team [1] who say "Between 2017 and mid 2018, this
feature would have killed 49 MSRC cases that involved uninitialized struct data
leaking across a trust boundary. It would have also mitigated a number of bugs
involving uninitialized struct data being used directly.". They seem to use pure
zero initialization, and claim to have taken the overheads down to within noise.
Don't just trust Microsoft though, here's another relevant person asking for
this [2]. It's been proposed for GCC [3] and LLVM [4] before.
What are the caveats? A few!
- Variables declared in unreachable code, and used later, aren't initialized.
This goto, Duff's device, other objectionable uses of switch. This should
instead be a hard-error in any serious codebase.
- Volatile stack variables are still weird. That's pre-existing, it's really
the language's fault and this patch keeps it weird. We should deprecate
volatile [5].
- As noted above, padding isn't fully handled yet.
I don't think these caveats make the patch untenable because they can be
addressed separately.
Should this be on by default? Maybe, in some circumstances. It's a conversation
we can have when we've tried it out sufficiently, and we're confident that we've
eliminated enough of the overheads that most codebases would want to opt-in.
Let's keep our precious undefined behavior until that point in time.
How do I use it:
1. On the command-line:
-ftrivial-auto-var-init=uninitialized (the default)
-ftrivial-auto-var-init=pattern
-ftrivial-auto-var-init=zero -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
2. Using an attribute:
int dont_initialize_me __attribute((uninitialized));
Reid Kleckner [Mon, 17 Dec 2018 23:16:43 +0000 (23:16 +0000)]
Fix ms-layout_version declspec test and add missing new test
Now that MSVC compatibility versions are stored as a four digit number
(1912) instead of a two digit number (19), we need to adjust how we
handle this attribute.
Also add a new test that was intended to be part of r349414.
Reid Kleckner [Mon, 17 Dec 2018 23:10:43 +0000 (23:10 +0000)]
Update Microsoft name mangling scheme for exception specifiers in the type system
Summary:
The msvc exception specifier for noexcept function types has changed
from the prior default of "Z" to "_E" if the function cannot throw when
compiling with /std:C++17.
Alex Lorenz [Mon, 17 Dec 2018 21:01:04 +0000 (21:01 +0000)]
Make test/Driver/darwin-sdk-version.c pass on hosts < macOS10.14
The test test/Driver/darwin-sdk-version.c from r349380 checks if the macOS
deployment target can be correctly inferred from the SDK version. When the
SDK version is > host version, the driver will pick the host version, so
the old test failed on macOS < 10.14. This commit makes this test more
resilient by using an older SDK version.
Alex Lorenz [Mon, 17 Dec 2018 19:30:46 +0000 (19:30 +0000)]
[darwin][arm64] use the "cyclone" CPU for Darwin even when `-arch`
is not specified
The -target option allows the user to specify the build target using LLVM
triple. The triple includes the arch, and so the -arch option is redundant.
This should work just as well without the -arch. However, the driver has a bug
in which it doesn't target the "Cyclone" CPU for darwin if -target is used
without -arch. This commit fixes this issue.
Alex Lorenz [Mon, 17 Dec 2018 19:19:15 +0000 (19:19 +0000)]
[darwin] parse the SDK settings from SDKSettings.json if it exists and
pass in the -target-sdk-version to the compiler and backend
This commit adds support for reading the SDKSettings.json file in the Darwin
driver. This file is used by the driver to determine the SDK's version, and it
uses that information to pass it down to the compiler using the new
-target-sdk-version= option. This option is then used to set the appropriate
SDK Version module metadata introduced in r349119.
Note: I had to adjust the two ast tests as the SDKROOT environment variable
on macOS caused SDK version to be picked up for the compilation of source file
but not the AST.
Gabor Marton [Mon, 17 Dec 2018 13:53:12 +0000 (13:53 +0000)]
[ASTImporter] Add importer specific lookup
Summary:
There are certain cases when normal C/C++ lookup (localUncachedLookup)
does not find AST nodes. E.g.:
Example 1:
template <class T>
struct X {
friend void foo(); // this is never found in the DC of the TU.
};
Example 2:
// The fwd decl to Foo is not found in the lookupPtr of the DC of the
// translation unit decl.
struct A { struct Foo *p; };
In these cases we create a new node instead of returning with the old one.
To fix it we create a new lookup table which holds every node and we are
not interested in any C++ specific visibility considerations.
Simply, we must know if there is an existing Decl in a given DC.
Gabor Marton [Mon, 17 Dec 2018 12:42:12 +0000 (12:42 +0000)]
[ASTImporter] Fix redecl chain of classes and class templates
Summary:
The crux of the issue that is being fixed is that lookup could not find
previous decls of a friend class. The solution involves making the
friend declarations visible in their decl context (i.e. adding them to
the lookup table).
Also, we simplify `VisitRecordDecl` greatly.
This fix involves two other repairs (without these the unittests fail):
(1) We could not handle the addition of injected class types properly
when a redecl chain was involved, now this is fixed.
(2) DeclContext::removeDecl failed if the lookup table in Vector form
did not contain the to be removed element. This caused troubles in
ASTImporter::ImportDeclContext. This is also fixed.
Kristof Umann [Mon, 17 Dec 2018 12:25:48 +0000 (12:25 +0000)]
Revert rC349281 '[analyzer][MallocChecker][NFC] Document and reorganize some functions'
Accidentally commited earlier with the same commit title, but really it
should've been
"Revert rC349283 '[analyzer][MallocChecker] Improve warning messages on double-delete errors'"
Carey Williams [Mon, 17 Dec 2018 10:11:35 +0000 (10:11 +0000)]
[Docs] Expand -fstack-protector and -fstack-protector-all
Improve the description of these command line options
by providing specific heuristic information, as outlined
for the ssp function attribute(s) in LLVM's documentation.
Artem Dergachev [Mon, 17 Dec 2018 06:19:32 +0000 (06:19 +0000)]
[analyzer] MoveChecker: Add an option to suppress warnings on locals.
Re-using a moved-from local variable is most likely a bug because there's
rarely a good motivation for not introducing a separate variable instead.
We plan to keep emitting such warnings by default.
Introduce a flag that allows disabling warnings on local variables that are
not of a known move-unsafe type. If it doesn't work out as we expected,
we'll just flip the flag.
We still warn on move-unsafe objects and unsafe operations on known move-safe
objects.
Artem Dergachev [Sun, 16 Dec 2018 23:44:06 +0000 (23:44 +0000)]
[analyzer] Fix some expressions staying live too long. Add a debug checker.
StaticAnalyzer uses the CFG-based RelaxedLiveVariables analysis in order to,
in particular, figure out values of which expressions are still needed.
When the expression becomes "dead", it is garbage-collected during
the dead binding scan.
Expressions that constitute branches/bodies of control flow statements,
eg. `E1' in `if (C1) E1;' but not `E2' in `if (C2) { E2; }', were kept alive
for too long. This caused false positives in MoveChecker because it relies
on cleaning up loop-local variables when they go out of scope, but some of those
live-for-too-long expressions were keeping a reference to those variables.
Fix liveness analysis to correctly mark these expressions as dead.
Add a debug checker, debug.DumpLiveStmts, in order to test expressions liveness.
Aaron Puchert [Sun, 16 Dec 2018 14:15:30 +0000 (14:15 +0000)]
Thread safety analysis: Allow scoped releasing of capabilities
Summary:
The pattern is problematic with C++ exceptions, and not as widespread as
scoped locks, but it's still used by some, for example Chromium.
We are a bit stricter here at join points, patterns that are allowed for
scoped locks aren't allowed here. That could still be changed in the
future, but I'd argue we should only relax this if people ask for it.
Kristof Umann [Sat, 15 Dec 2018 18:34:00 +0000 (18:34 +0000)]
[analyzer][MallocChecker][NFC] Document and reorganize some functions
This patch merely reorganizes some things, and features no functional change.
In detail:
* Provided documentation, or moved existing documentation in more obvious
places.
* Added dividers. (the //===----------===// thing).
* Moved getAllocationFamily, printAllocDeallocName, printExpectedAllocName and
printExpectedDeallocName in the global namespace on top of the file where
AllocationFamily is declared, as they are very strongly related.
* Moved isReleased and MallocUpdateRefState near RefState's definition for the
same reason.
* Realloc modeling was very poor in terms of variable and structure naming, as
well as documentation, so I renamed some of them and added much needed docs.
* Moved function IdentifierInfos to a separate struct, and moved isMemFunction,
isCMemFunction adn isStandardNewDelete inside it. This makes the patch affect
quite a lot of lines, should I extract it to a separate one?
* Moved MallocBugVisitor out of MallocChecker.
* Preferred switches to long else-if branches in some places.
* Neatly organized some RUN: lines.
Kristof Umann [Sat, 15 Dec 2018 18:11:49 +0000 (18:11 +0000)]
[analyzer][NFC] Merge ClangCheckerRegistry to CheckerRegistry
Now that CheckerRegistry lies in Frontend, we can finally eliminate
ClangCheckerRegistry. Fortunately, this also provides us with a
DiagnosticsEngine, so I went ahead and removed some parameters from it's
methods.
Kristof Umann [Sat, 15 Dec 2018 16:23:51 +0000 (16:23 +0000)]
[analyzer][NFC] Move CheckerRegistry from the Core directory to Frontend
ClangCheckerRegistry is a very non-obvious, poorly documented, weird concept.
It derives from CheckerRegistry, and is placed in lib/StaticAnalyzer/Frontend,
whereas it's base is located in lib/StaticAnalyzer/Core. It was, from what I can
imagine, used to circumvent the problem that the registry functions of the
checkers are located in the clangStaticAnalyzerCheckers library, but that
library depends on clangStaticAnalyzerCore. However, clangStaticAnalyzerFrontend
depends on both of those libraries.
One can make the observation however, that CheckerRegistry has no place in Core,
it isn't used there at all! The only place where it is used is Frontend, which
is where it ultimately belongs.
This move implies that since
include/clang/StaticAnalyzer/Checkers/ClangCheckers.h only contained a single function:
it had to re purposed, as CheckerRegistry is no longer available to
clangStaticAnalyzerCheckers. It was renamed to BuiltinCheckerRegistration.h,
which actually describes it a lot better -- it does not contain the registration
functions for checkers, but only those generated by the tblgen files.
Kristof Umann [Sat, 15 Dec 2018 15:44:05 +0000 (15:44 +0000)]
[analyzer] Prefer returns values to out-params in CheckerRegistry.cpp
Renaming collectCheckers to getEnabledCheckers
Changing the functionality to acquire all enabled checkers, rather then collect
checkers for a specific CheckerOptInfo (for example, collecting all checkers for
{ "core", true }, which meant enabling all checkers from the core package, which
was an unnecessary complication).
Removing CheckerOptInfo, instead of storing whether the option was claimed via a
field, we handle errors immediately, as getEnabledCheckers can now access a
DiagnosticsEngine. Realize that the remaining information it stored is directly
accessible through AnalyzerOptions.CheckerControlList.
Fix a test with -analyzer-disable-checker -verify accidentally left in.
Fangrui Song [Sat, 15 Dec 2018 08:54:06 +0000 (08:54 +0000)]
[libclang] Add dependency on clangSerialization to unbreak -DBUILD_SHARED_LIBS=1 build after rC349237
Frontend headers have undefined reference on the symbol `clang::PCHContainerOperations::PCHContainerOperations()` through some shared_ptr usage. Any dependents will get the undefined reference which can only be resolved by explicit dependency on clangSerialization (due to -z defs).
Martin Storsjo [Sat, 15 Dec 2018 08:08:11 +0000 (08:08 +0000)]
[MinGW] Produce a vtable and RTTI for dllexported classes without a key function
This matches what GCC does in these situations.
This fixes compiling Qt in debug mode. In release mode, references to
the vtable of this particular class ends up optimized away, but in debug
mode, the compiler creates references to the vtable, which is expected
to be dllexported from a different DLL. Make sure the dllexported
version actually ends up emitted.
Artem Dergachev [Sat, 15 Dec 2018 02:13:26 +0000 (02:13 +0000)]
[analyzer] Fix unknown block calls to have zero parameters.
Right now they report to have one parameter with null decl,
because initializing an ArrayRef of pointers with a nullptr
yields an ArrayRef to an array of one null pointer.
Fixes a crash in the OSObject section of RetainCountChecker.
Artem Dergachev [Sat, 15 Dec 2018 02:09:02 +0000 (02:09 +0000)]
[analyzer] ObjCDealloc: Fix a crash when a class attempts to deallocate a class.
The checker wasn't prepared to see the dealloc message sent to the class itself
rather than to an instance, as if it was +dealloc.
Additionally, it wasn't prepared for pure-unknown or undefined self values.
The new guard covers that as well, but it is annoying to test because
both kinds of values shouldn't really appear and we generally want to
get rid of all of them (by modeling unknown values with symbols and
by warning on use of undefined values before they are used).
The CHECK: directive for FileCheck at the end of the test looks useless,
so i removed it.
Artem Dergachev [Sat, 15 Dec 2018 02:06:13 +0000 (02:06 +0000)]
[analyzer] ObjCContainers: Track index values.
Use trackExpressionValue() (previously known as trackNullOrUndefValue())
to track index value in the report, so that the user knew
what Static Analyzer thinks the index is.
Additionally, implement printState() to help debugging the checker later.
Artem Dergachev [Sat, 15 Dec 2018 01:53:38 +0000 (01:53 +0000)]
[analyzer] MoveChecker: Add checks for dereferencing a smart pointer after move.
Calling operator*() or operator->() on a null STL smart pointer is
undefined behavior.
Smart pointers are specified to become null after being moved from.
So we can't warn on arbitrary method calls, but these two operators
definitely make no sense.
The new bug is fatal because it's an immediate UB,
unlike other use-after-move bugs.
The work on a more generic null smart pointer dereference checker
is still pending.