René Scharfe [Sat, 15 Oct 2016 16:23:11 +0000 (18:23 +0200)]
avoid pointer arithmetic involving NULL in FLEX_ALLOC_MEM
Calculating offsets involving a NULL pointer is undefined. It works in
practice (for now?), but we should not rely on it. Allocate first and
then simply refer to the flexible array member by its name instead of
performing pointer arithmetic up front. The resulting code is slightly
shorter, easier to read and doesn't rely on undefined behaviour.
NB: The cast to a (non-const) void pointer is necessary to keep support
for flexible array members declared as const.
Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 50a6c8e (use st_add and st_mult for allocation size
computation, 2016-02-22) fixed up many xmalloc call-sites
including ones in compat/mingw.c.
But I screwed up one of them, which was half-converted to
ALLOC_ARRAY, using a very early prototype of the function.
And I never caught it because I don't build on Windows.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:45:15 +0000 (17:45 -0500)]
ewah: convert to REALLOC_ARRAY, etc
Now that we're built around xmalloc and friends, we can use
helpers like REALLOC_ARRAY, ALLOC_GROW, and so on to make
the code shorter and protect against integer overflow.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:45:12 +0000 (17:45 -0500)]
convert ewah/bitmap code to use xmalloc
This code was originally written with the idea that it could
be spun off into its own ewah library, and uses the
overrideable ewah_malloc to do allocations.
We plug in xmalloc as our ewah_malloc, of course. But over
the years the ewah code itself has become more entangled
with git, and the return value of many ewah_malloc sites is
not checked.
Let's just drop the level of indirection and use xmalloc and
friends directly. This saves a few lines, and will let us
adapt these sites to our more advanced malloc helpers.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:45:08 +0000 (17:45 -0500)]
diff_populate_gitlink: use a strbuf
We allocate 100 bytes to hold the "Submodule commit ..."
text. This is enough, but it's not immediately obvious that
this is the case, and we have to repeat the magic 100 twice.
We could get away with xstrfmt here, but we want to know the
size, as well, so let's use a real strbuf. And while we're
here, we can clean up the logic around size_only. It
currently sets and clears the "data" field pointlessly, and
leaves the "should_free" flag on even after we have cleared
the data.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:45:05 +0000 (17:45 -0500)]
transport_anonymize_url: use xstrfmt
This function uses xcalloc and two memcpy calls to
concatenate two strings. We can do this as an xstrfmt
one-liner, and then it is more clear that we are allocating
the correct amount of memory.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:45:01 +0000 (17:45 -0500)]
git-compat-util: drop mempcpy compat code
There are no callers of this left, as the last one was
dropped in the previous patch. And there are not likely to
be new ones, as the function has been around since 2010
without gaining any new callers.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
It takes advantage of the fact that these strings are
subsets of each other, and allocates only _one_ string, with
pointers into the various parts. Unfortunately, this makes
the string allocation complicated and hard to follow.
Since we keep only one of these in memory at a time, we can
afford to simply allocate three strings. This lets us build
on tools like xstrfmt and avoid manual computation.
While we're here, we can also drop the ad-hoc
reimplementation of get_git_commit_encoding(), and simply
call that function.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The normalize_path_copy function needs an output buffer that
is at least as long as its input (it may shrink the path,
but never expand it). However, this test program feeds it
static PATH_MAX-sized buffers, which have no relation to the
input size.
In the normalize_ceiling_entry case, we do at least check
the size against PATH_MAX and die(), but that case is even
more convoluted. We normalize into a fixed-size buffer, free
the original, and then replace it with a strdup'd copy of
the result. But normalize_path_copy explicitly allows
normalizing in-place, so we can simply do that.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:44:50 +0000 (17:44 -0500)]
fetch-pack: simplify add_sought_entry
We have two variants of this function, one that takes a
string and one that takes a ptr/len combo. But we only call
the latter with the length of a NUL-terminated string, so
our first simplification is to drop it in favor of the
string variant.
Since we know we have a string, we can also replace the
manual memory computation with a call to alloc_ref().
Furthermore, we can rely on get_oid_hex() to complain if it
hits the end of the string. That means we can simplify the
check for "<sha1> <ref>" versus just "<ref>". Rather than
manage the ptr/len pair, we can just bump the start of our
string forward. The original code over-allocated based on
the original "namelen" (which wasn't _wrong_, but was simply
wasteful and confusing).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:44:46 +0000 (17:44 -0500)]
fast-import: simplify allocation in start_packfile
This function allocate a packed_git flex-array, and adds a
mysterious 2 bytes to the length of the pack_name field. One
is for the trailing NUL, but the other has no purpose. This
is probably cargo-culted from add_packed_git, which gets the
".idx" path and needed to allocate enough space to hold the
matching ".pack" (though since 48bcc1c, we calculate the
size there differently).
This site, however, is using the raw path of a tempfile, and
does not need the extra byte. We can just replace the
allocation with FLEX_ALLOC_STR, which handles the allocation
and the NUL for us.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:44:42 +0000 (17:44 -0500)]
write_untracked_extension: use FLEX_ALLOC helper
We perform unchecked additions when computing the size of a
"struct ondisk_untracked_cache". This is unlikely to have an
integer overflow in practice, but we'd like to avoid this
dangerous pattern to make further audits easier.
Note that there's one subtlety here, though. We protect
ourselves against a NULL exclude_per_dir entry in our
source, and avoid calling strlen() on it, keeping "len" at
0. But later, we unconditionally memcpy "len + 1" bytes to
get the trailing NUL byte. If we did have a NULL
exclude_per_dir, we would read from bogus memory.
As it turns out, though, we always create this field
pointing to a string literal, so there's no bug. We can just
get rid of the pointless extra conditional.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:44:39 +0000 (17:44 -0500)]
prepare_{git,shell}_cmd: use argv_array
These functions transform an existing argv into one suitable
for exec-ing or spawning via git or a shell. We can use an
argv_array in each to avoid dealing with manual counting and
allocation.
This also makes the memory allocation more clear and fixes
some leaks. In prepare_shell_cmd, we would sometimes
allocate a new string with "$@" in it and sometimes not,
meaning the caller could not correctly free it. On the
non-Windows side, we are in a child process which will
exec() or exit() immediately, so the leak isn't a big deal.
On Windows, though, we use spawn() from the parent process,
and leak a string for each shell command we run. On top of
that, the Windows code did not free the allocated argv array
at all (but does for the prepare_git_cmd case!).
By switching both of these functions to write into an
argv_array, we can consistently free the result as
appropriate.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:44:35 +0000 (17:44 -0500)]
use st_add and st_mult for allocation size computation
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:44:32 +0000 (17:44 -0500)]
convert trivial cases to FLEX_ARRAY macros
Using FLEX_ARRAY macros reduces the amount of manual
computation size we have to do. It also ensures we don't
overflow size_t, and it makes sure we write the same number
of bytes that we allocated.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:44:28 +0000 (17:44 -0500)]
use xmallocz to avoid size arithmetic
We frequently allocate strings as xmalloc(len + 1), where
the extra 1 is for the NUL terminator. This can be done more
simply with xmallocz, which also checks for integer
overflow.
There's no case where switching xmalloc(n+1) to xmallocz(n)
is wrong; the result is the same length, and malloc made no
guarantees about what was in the buffer anyway. But in some
cases, we can stop manually placing NUL at the end of the
allocated buffer. But that's only safe if it's clear that
the contents will always fill the buffer.
In each case where this patch does so, I manually examined
the control flow, and I tried to err on the side of caution.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:44:21 +0000 (17:44 -0500)]
convert manual allocations to argv_array
There are many manual argv allocations that predate the
argv_array API. Switching to that API brings a few
advantages:
1. We no longer have to manually compute the correct final
array size (so it's one less thing we can screw up).
2. In many cases we had to make a separate pass to count,
then allocate, then fill in the array. Now we can do it
in one pass, making the code shorter and easier to
follow.
3. argv_array handles memory ownership for us, making it
more obvious when things should be free()d and and when
not.
Most of these cases are pretty straightforward. In some, we
switch from "run_command_v" to "run_command" which lets us
directly use the argv_array embedded in "struct
child_process".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:44:15 +0000 (17:44 -0500)]
argv-array: add detach function
The usual pattern for an argv array is to initialize it,
push in some strings, and then clear it when done. Very
occasionally, though, we must do other exotic things with
the memory, like freeing the list but keeping the strings.
Let's provide a detach function so that callers can make use
of our API to build up the array, and then take ownership of
it.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:43:25 +0000 (17:43 -0500)]
add helpers for allocating flex-array structs
Allocating a struct with a flex array is pretty simple in
practice: you over-allocate the struct, then copy some data
into the over-allocation. But it can be a slight pain to
make sure you're allocating and copying the right amounts.
This patch adds a few helpers to turn simple cases of
flex-array struct allocation into a one-liner that properly
checks for overflow. See the embedded documentation for
details.
Ideally we could provide a more flexible version that could
handle multiple strings, like:
FLEX_ALLOC_FMT(ref, name, "%s%s", prefix, name);
But we have to implement this as a macro (because of the
offset calculation of the flex member), which means we would
need all compilers to support variadic macros.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 22 Feb 2016 22:43:18 +0000 (17:43 -0500)]
harden REALLOC_ARRAY and xcalloc against size_t overflow
REALLOC_ARRAY inherently involves a multiplication which can
overflow size_t, resulting in a much smaller buffer than we
think we've allocated. We can easily harden it by using
st_mult() to check for overflow. Likewise, we can add
ALLOC_ARRAY to do the same thing for xmalloc calls.
xcalloc() should already be fine, because it takes the two
factors separately, assuming the system calloc actually
checks for overflow. However, before we even hit the system
calloc(), we do our memory_limit_check, which involves a
multiplication. Let's check for overflow ourselves so that
this limit cannot be bypassed.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 19 Feb 2016 11:21:30 +0000 (06:21 -0500)]
tree-diff: catch integer overflow in combine_diff_path allocation
A combine_diff_path struct has two "flex" members allocated
alongside the struct: a string to hold the pathname, and an
array of parent pointers. We use an "int" to compute this,
meaning we may easily overflow it if the pathname is
extremely long.
We can fix this by using size_t, and checking for overflow
with the st_add helper.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 19 Feb 2016 11:21:19 +0000 (06:21 -0500)]
add helpers for detecting size_t overflow
Performing computations on size_t variables that we feed to
xmalloc and friends can be dangerous, as an integer overflow
can cause us to allocate a much smaller chunk than we
realized.
We already have unsigned_add_overflows(), but let's add
unsigned_mult_overflows() to that. Furthermore, rather than
have each site manually check and die on overflow, we can
provide some helpers that will:
- promote the arguments to size_t, so that we know we are
doing our computation in the same size of integer that
will ultimately be fed to xmalloc
- check and die on overflow
- return the result so that computations can be done in
the parameter list of xmalloc.
These functions are a lot uglier to use than normal
arithmetic operators (you have to do "st_add(foo, bar)"
instead of "foo + bar"). To at least limit the damage, we
also provide multi-valued versions. So rather than:
st_add(st_add(a, b), st_add(c, d));
you can write:
st_add4(a, b, c, d);
This isn't nearly as elegant as a varargs function, but it's
a lot harder to get it wrong. You don't have to remember to
add a sentinel value at the end, and the compiler will
complain if you get the number of arguments wrong. This
patch adds only the numbered variants required to convert
the current code base; we can easily add more later if
needed.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 19 Feb 2016 11:21:08 +0000 (06:21 -0500)]
reflog_expire_cfg: NUL-terminate pattern field
You can tweak the reflog expiration for a particular subset
of refs by configuring gc.foo.reflogexpire. We keep a linked
list of reflog_expire_cfg structs, each of which holds the
pattern and a "len" field for the length of the pattern. The
pattern itself is _not_ NUL-terminated.
However, we feed the pattern directly to wildmatch(), which
expects a NUL-terminated string, meaning it may keep reading
random junk after our struct.
We can fix this by allocating an extra byte for the NUL
(which is already zero because we use xcalloc). Let's also
drop the misleading "len" field, which is no longer
necessary. The existing use of "len" can be converted to use
strncmp().
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 5 Feb 2016 22:54:21 +0000 (14:54 -0800)]
Merge branch 'ss/clone-depth-single-doc' into maint
Documentation for "git fetch --depth" has been updated for clarity.
* ss/clone-depth-single-doc:
docs: clarify that --depth for git-fetch works with newly initialized repos
docs: say "commits" in the --depth option wording for git-clone
docs: clarify that passing --depth to git-clone implies --single-branch
Junio C Hamano [Fri, 5 Feb 2016 22:54:18 +0000 (14:54 -0800)]
Merge branch 'ss/user-manual' into maint
Drop a few old "todo" items by deciding that the change one of them
suggests is not such a good idea, and doing the change the other
one suggested to do.
* ss/user-manual:
user-manual: add addition gitweb information
user-manual: add section documenting shallow clones
glossary: define the term shallow clone
user-manual: remove temporary branch entry from todo list
Junio C Hamano [Fri, 5 Feb 2016 22:54:17 +0000 (14:54 -0800)]
Merge branch 'jk/ref-cache-non-repository-optim' into maint
The underlying machinery used by "ls-files -o" and other commands
have been taught not to create empty submodule ref cache for a
directory that is not a submodule. This removes a ton of wasted
CPU cycles.
* jk/ref-cache-non-repository-optim:
resolve_gitlink_ref: ignore non-repository paths
clean: make is_git_repository a public function
Junio C Hamano [Fri, 5 Feb 2016 22:54:17 +0000 (14:54 -0800)]
Merge branch 'js/dirname-basename' into maint
dirname() emulation has been added, as Msys2 lacks it.
* js/dirname-basename:
mingw: avoid linking to the C library's isalpha()
t0060: loosen overly strict expectations
t0060: verify that basename() and dirname() work as expected
compat/basename.c: provide a dirname() compatibility function
compat/basename: make basename() conform to POSIX
Refactor skipping DOS drive prefixes
Junio C Hamano [Fri, 5 Feb 2016 22:54:15 +0000 (14:54 -0800)]
Merge branch 'jk/list-tag-2.7-regression' into maint
"git tag" started listing a tag "foo" as "tags/foo" when a branch
named "foo" exists in the same repository; remove this unnecessary
disambiguation, which is a regression introduced in v2.7.0.
* jk/list-tag-2.7-regression:
tag: do not show ambiguous tag names as "tags/foo"
t6300: use test_atom for some un-modern tests
Junio C Hamano [Fri, 5 Feb 2016 22:54:13 +0000 (14:54 -0800)]
Merge branch 'js/close-packs-before-gc' into maint
Many codepaths that run "gc --auto" before exiting kept packfiles
mapped and left the file descriptors to them open, which was not
friendly to systems that cannot remove files that are open. They
now close the packs before doing so.
* js/close-packs-before-gc:
receive-pack: release pack files before garbage-collecting
merge: release pack files before garbage-collecting
am: release pack files before garbage-collecting
fetch: release pack files before garbage-collecting
Junio C Hamano [Fri, 5 Feb 2016 22:54:12 +0000 (14:54 -0800)]
Merge branch 'ho/gitweb-squelch-undef-warning' into maint
Asking gitweb for a nonexistent commit left a warning in the server
log.
Somebody may want to follow this up with a new test, perhaps?
IIRC, we do test that no Perl warnings are given to the server log,
so this should have been caught if our test coverage were good.
Junio C Hamano [Fri, 5 Feb 2016 22:54:11 +0000 (14:54 -0800)]
Merge branch 'js/fopen-harder' into maint
Some codepaths used fopen(3) when opening a fixed path in $GIT_DIR
(e.g. COMMIT_EDITMSG) that is meant to be left after the command is
done. This however did not work well if the repository is set to
be shared with core.sharedRepository and the umask of the previous
user is tighter. They have been made to work better by calling
unlink(2) and retrying after fopen(3) fails with EPERM.
* js/fopen-harder:
Handle more file writes correctly in shared repos
commit: allow editing the commit message even in shared repos
Junio C Hamano [Fri, 5 Feb 2016 22:54:11 +0000 (14:54 -0800)]
Merge branch 'nd/exclusion-regression-fix' into maint
The ignore mechanism saw a few regressions around untracked file
listing and sparse checkout selection areas in 2.7.0; the change
that is responsible for the regression has been reverted.
* nd/exclusion-regression-fix:
Revert "dir.c: don't exclude whole dir prematurely if neg pattern may match"
Junio C Hamano [Fri, 5 Feb 2016 22:54:08 +0000 (14:54 -0800)]
Merge branch 'nd/dir-exclude-cleanup' into maint
The "exclude_list" structure has the usual "alloc, nr" pair of
fields to be used by ALLOC_GROW(), but clear_exclude_list() forgot
to reset 'alloc' to 0 when it cleared 'nr' to discard the managed
array.
* nd/dir-exclude-cleanup:
dir.c: clean the entire struct in clear_exclude_list()
Junio C Hamano [Fri, 5 Feb 2016 22:54:07 +0000 (14:54 -0800)]
Merge branch 'nd/stop-setenv-work-tree' into maint
An earlier change in 2.5.x-era broke users' hooks and aliases by
exporting GIT_WORK_TREE to point at the root of the working tree,
interfering when they tried to use a different working tree without
setting GIT_WORK_TREE environment themselves.
* nd/stop-setenv-work-tree:
Revert "setup: set env $GIT_WORK_TREE when work tree is set, like $GIT_DIR"
Eric Wong [Sat, 16 Jan 2016 10:17:19 +0000 (10:17 +0000)]
git-svn: fix auth parameter handling on SVN 1.9.0+
For users with "store-passwords = no" set in the "[auth]" section of
their ~/.subversion/config, SVN 1.9.0+ would fail with the
following message when attempting to call svn_auth_set_parameter:
Value is not a string (or undef) at Git/SVN/Ra.pm
Ironically, this breakage was caused by r1553823 in subversion:
"Make svn_auth_set_parameter() usable from Perl bindings."
Since 2007 (602015e0e6ec), git-svn has used a workaround to make
svn_auth_set_parameter usable internally. However this workaround
breaks under SVN 1.9+, which deals properly with the type mapping
and fails to recognize our workaround.
For pre-1.9.0 SVN, we continue to use the existing workaround for
the lack of proper type mapping in the bindings.
Tested under subversion 1.6.17 and 1.9.3.
I've also verified r1553823 was not backported to SVN 1.8.x:
Jeff King [Tue, 26 Jan 2016 03:00:05 +0000 (22:00 -0500)]
tag: do not show ambiguous tag names as "tags/foo"
Since b7cc53e9 (tag.c: use 'ref-filter' APIs, 2015-07-11),
git-tag has started showing tags with ambiguous names (i.e.,
when both "heads/foo" and "tags/foo" exists) as "tags/foo"
instead of just "foo". This is both:
- pointless; the output of "git tag" includes only
refs/tags, so we know that "foo" means the one in
"refs/tags".
and
- ambiguous; in the original output, we know that the line
"foo" means that "refs/tags/foo" exists. In the new
output, it is unclear whether we mean "refs/tags/foo" or
"refs/tags/tags/foo".
The reason this happens is that commit b7cc53e9 switched
git-tag to use ref-filter's "%(refname:short)" output
formatting, which was adapted from for-each-ref. This more
general code does not know that we care only about tags, and
uses shorten_unambiguous_ref to get the short-name. We need
to tell it that we care only about "refs/tags/", and it
should shorten with respect to that value.
In theory, the ref-filter code could figure this out by us
passing FILTER_REFS_TAGS. But there are two complications
there:
1. The handling of refname:short is deep in formatting
code that does not even have our ref_filter struct, let
alone the arguments to the filter_ref struct.
2. In git v2.7.0, we expose the formatting language to the
user. If we follow this path, it will mean that
"%(refname:short)" behaves differently for "tag" versus
"for-each-ref" (including "for-each-ref refs/tags/"),
which can lead to confusion.
Instead, let's add a new modifier to the formatting
language, "strip", to remove a specific set of prefix
components. This fixes "git tag", and lets users invoke the
same behavior from their own custom formats (for "tag" or
"for-each-ref") while leaving ":short" with its same
consistent meaning in all places.
We introduce a test in t7004 for "git tag", which fails
without this patch. We also add a similar test in t3203 for
"git branch", which does not actually fail. But since it is
likely that "branch" will eventually use the same formatting
code, the test helps defend against future regressions.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Paul Wagland [Tue, 26 Jan 2016 09:37:19 +0000 (10:37 +0100)]
completion: update completion arguments for stash
Add --all and --include-untracked to the git stash save completions.
Add --quiet to the git stash drop completions.
Update git stash branch so that the first argument expands out to the
possible branch names, and the other arguments expand to the stash
names.
Signed-off-by: Paul Wagland <paul@kungfoocoder.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes Sixt [Mon, 25 Jan 2016 21:47:56 +0000 (22:47 +0100)]
mingw: avoid linking to the C library's isalpha()
The implementation of mingw_skip_dos_drive_prefix() calls isalpha() via
has_dos_drive_prefix(). Since the definition occurs long before isalpha()
is defined in git-compat-util.h, my build environment reports:
CC alloc.o
In file included from git-compat-util.h:186,
from cache.h:4,
from alloc.c:12:
compat/mingw.h: In function 'mingw_skip_dos_drive_prefix':
compat/mingw.h:365: warning: implicit declaration of function 'isalpha'
Dscho does not see a similar warning in his build and suspects that
ctype.h is included somehow behind the scenes. This implies that his build
links to the C library's isalpha() and does not use git's isalpha().
To fix both the warning in my build and the inconsistency in Dscho's
build, move the function definition to mingw.c. Then it picks up git's
isalpha() because git-compat-util.h is included at the top of the file.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Sun, 24 Jan 2016 23:08:18 +0000 (18:08 -0500)]
t6300: use test_atom for some un-modern tests
Because this script has to test so many formatters, we have
the nice "test_atom" helper, but we don't use it
consistently. Let's do so. This is shorter, gets rid of some
tests that have their "expected" setup outside of a
test_expect_success block, and lets us organize the changes
better (e.g., putting "refname:short" near "refname").
We also expand the "%(push)" tests a little to match the
"%(upstream)" ones.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 22 Jan 2016 22:29:30 +0000 (17:29 -0500)]
resolve_gitlink_ref: ignore non-repository paths
When we want to look up a submodule ref, we use
get_ref_cache(path) to find or auto-create its ref cache.
But if we feed a path that isn't actually a git repository,
we blindly create the ref cache, and then may die deeper in
the code when we try to access it. This is a problem because
many callers speculatively feed us a path that looks vaguely
like a repository, and expect us to tell them when it is
not.
This patch teaches resolve_gitlink_ref to reject
non-repository paths without creating a ref_cache. This
avoids the die(), and also performs better if you have a
large number of these faux-submodule directories (because
the ref_cache lookup is linear, under the assumption that
there won't be a large number of submodules).
To accomplish this, we also break get_ref_cache into two
pieces: the lookup and auto-creation (the latter is lumped
into create_ref_cache). This lets us first cheaply ask our
cache "is it a submodule we know about?" If so, we can avoid
repeating our filesystem lookup. So lookups of real
submodules are not penalized; they examine the submodule's
.git directory only once.
The test in t3000 demonstrates a case where this improves
correctness (we used to just die). The new perf case in
p7300 shows off the speed improvement in an admittedly
pathological repository:
Test HEAD^ HEAD
----------------------------------------------------------------
7300.4: ls-files -o 66.97(66.15+0.87) 0.33(0.08+0.24) -99.5%
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Fri, 22 Jan 2016 22:27:33 +0000 (17:27 -0500)]
clean: make is_git_repository a public function
We have always had is_git_directory(), for looking at a
specific directory to see if it contains a git repo. In 0179ca7 (clean: improve performance when removing lots of
directories, 2015-06-15), we added is_git_repository() which
checks for a non-bare repository by looking at its ".git"
entry.
However, the fix in 0179ca7 needs to be applied other
places, too. Let's make this new helper globally available.
We need to give it a better name, though, to avoid confusion
with is_git_directory(). This patch does that, documents
both functions with a comment to reduce confusion, and
removes the clean-specific references in the comments.
Based-on-a-patch-by: Andreas Krey <a.krey@gmx.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff-no-index: do not take a redundant prefix argument
Prefix is already set up in "revs". The same prefix should be used for
all options parsing. So kill the last argument. This patch does not
actually change anything because the only caller does use the same
prefix for init_revisions() and diff_no_index().
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lars Vogel [Thu, 21 Jan 2016 10:09:15 +0000 (11:09 +0100)]
git-add doc: do not say working directory when you mean working tree
The usage of working directory is inconsistent in the git add help.
Also http://git-scm.com/docs/git-clone speaks only about working tree.
Remaining entry found by "git grep -B1 '^directory' git-add.txt" really
relates to a directory.
Signed-off-by: Lars Vogel <Lars.Vogel@vogella.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git subtree split' can incorrectly skip a merge even when both parents
act on the subtree, provided the merge results in a tree identical to
one of the parents. Fix by copying the merge if at least one parent is
non-identical, and the non-identical parent is not an ancestor of the
identical parent.
Also, add a test case which checks that a descendant remains a
descendent on the subtree in this case.
Signed-off-by: Dave Ware <davidw@realtimegenomics.com> Reviewed-by: David A. Greene <greened@obbligato.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 19 Jan 2016 22:07:22 +0000 (17:07 -0500)]
filter-branch: resolve $commit^{tree} in no-index case
Commit 348d4f2 (filter-branch: skip index read/write when
possible, 2015-11-06) taught filter-branch to optimize out
the final "git write-tree" when we know we haven't touched
the tree with any of our filters. It does by simply putting
the literal text "$commit^{tree}" into the "$tree" variable,
avoiding a useless rev-parse call.
However, when we pass this to git_commit_non_empty_tree(),
it gets confused; it resolves "$commit^{tree}" itself, and
compares our string to the 40-hex sha1, which obviously
doesn't match. As a result, "--prune-empty" (or any custom
filter using git_commit_non_empty_tree) will fail to drop
an empty commit (when filter-branch is used without a tree
or index filter).
Let's resolve $tree to the 40-hex ourselves, so that
git_commit_non_empty_tree can work. Unfortunately, this is a
bit slower due to the extra process overhead:
$ cd t/perf && ./run 348d4f2 HEAD p7000-filter-branch.sh
[...]
Test 348d4f2 HEAD
--------------------------------------------------------------
7000.2: noop filter 3.76(0.24+0.26) 4.54(0.28+0.24) +20.7%
We could try to make git_commit_non_empty_tree more clever.
However, the value of $tree here is technically
user-visible. The user can provide arbitrary shell code at
this stage, which could itself have a similar assumption to
what is in git_commit_non_empty_tree. So the conservative
choice to fix this regression is to take the 20% hit and
give the pre-348d4f2 behavior. We still end up much faster
than before the optimization:
$ cd t/perf && ./run 348d4f2^ HEAD p7000-filter-branch.sh
[...]
Test 348d4f2^ HEAD
--------------------------------------------------------------
7000.2: noop filter 9.51(4.32+0.40) 4.51(0.28+0.23) -52.6%
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 19 Jan 2016 22:09:37 +0000 (14:09 -0800)]
test-lib: clarify and tighten SANITY
f400e51c (test-lib.sh: set prerequisite SANITY by testing what we
really need, 2015-01-27) improved the way SANITY prerequisite was
determined, but made the resulting code (incorrectly) imply that
SANITY is all about effects of permission bits of the containing
directory has on the files contained in it by the comment it added,
its log message and the actual tests.
State what SANITY is about more clearly in the comment, and test
that a file whose permission bits says should be unreadble truly
cannot be read.
The dirname() tests file were developed and tested on only the five
platforms available to the developer at the time, namely: Linux (both 32
and 64bit), Windows XP 32-bit (MSVC), MinGW 32-bit and Cygwin 32-bit.
http://pubs.opengroup.org/onlinepubs/9699919799/functions/basename.html
(i.e. the POSIX spec) says, in part:
If the string pointed to by path consists entirely of the '/'
character, basename() shall return a pointer to the string "/".
If the string pointed to by path is exactly "//", it is
implementation-defined whether "/" or "//" is returned.
The thinking behind testing precise, OS-dependent output values was to
document that different setups produce different values. However, as the
test failures on MacOSX illustrated eloquently: hardcoding pretty much each
and every setup's expectations is pretty fragile.
This is not limited to the "//" vs "/" case, of course, other inputs are
also allowed to produce multiple outputs by the POSIX specs.
So let's just test for all allowed values and be done with it. This still
documents that Git cannot rely on one particular output value in those
cases, so the intention of the original tests is still met.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Wed, 13 Jan 2016 18:47:18 +0000 (13:47 -0500)]
rebase: ignore failures from "gc --auto"
After rebasing, we call "gc --auto" to clean up if we
created a lot of loose objects. However, we do so inside an
&&-chain. If "gc --auto" fails (e.g., because a previous
background gc blocked us by leaving "gc.log" in place),
then:
1. We will fail to clean up the state directory, leaving
the user stuck in the rebase forever (even "git am
--abort" doesn't work, because it calls "gc --auto"!).
2. In some cases, we may return a bogus exit code from
rebase, indicating failure when everything except the
auto-gc succeeded.
We can fix this by ignoring the exit code of "gc --auto".
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
receive-pack: release pack files before garbage-collecting
Before auto-gc'ing, we need to make sure that the pack files are
released in case they need to be repacked and garbage-collected.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge: release pack files before garbage-collecting
Before auto-gc'ing, we need to make sure that the pack files are
released in case they need to be repacked and garbage-collected.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before auto-gc'ing, we need to make sure that the pack files are
released in case they need to be repacked and garbage-collected.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch: release pack files before garbage-collecting
Before auto-gc'ing, we need to make sure that the pack files are
released in case they need to be repacked and garbage-collected.
This fixes https://github.com/git-for-windows/git/issues/500
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Øyvind A. Holm [Tue, 12 Jan 2016 03:31:56 +0000 (04:31 +0100)]
gitweb: squelch "uninitialized value" warning
git_object() chomps $type that is read from "cat-file -t", but
it does so before checking if $type is defined, resulting in
a Perl warning in the server error log:
gitweb.cgi: Use of uninitialized value $type in scalar chomp at
[...]/gitweb.cgi line 7579., referer: [...]
when trying to access a non-existing commit, for example:
Check the value in $type before chomping. This will cause us to
call href with its action parameter set to undef when formulating
the URL to redirect to, but that is harmless, as the function treats
a parameter that set to undef as if it does not exist.
Signed-off-by: Øyvind A. Holm <sunny@sunbase.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t0060: verify that basename() and dirname() work as expected
Unfortunately, some libgen implementations yield outcomes different
from what Git expects. For example, mingw-w64-crt provides a basename()
function, that shortens `path0/` to `path`!
So let's verify that the basename() and dirname() functions we use
conform to what Git expects.
Derived-from-code-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
compat/basename.c: provide a dirname() compatibility function
When there is no `libgen.h` to our disposal, we miss the `dirname()`
function. Earlier we added basename() compatibility function for
the same reason at e1c06886 (compat: add a basename() compatibility
function, 2009-05-31).
So far, we only had one user of that function: credential-cache--daemon
(which was only compiled when Unix sockets are available, anyway). But
now we also have `builtin/am.c` as user, so we need it.
Since `dirname()` is a sibling of `basename()`, we simply put our very
own `gitdirname()` implementation next to `gitbasename()` and use it
if `NO_LIBGEN_H` has been set.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio noticed that there is an implicit assumption in pretty much
all the code calling has_dos_drive_prefix(): it forces all of its
callsites to hardcode the knowledge that the DOS drive prefix is
always two bytes long.
While this assumption is pretty safe, we can still make the code
more readable and less error-prone by introducing a function that
skips the DOS drive prefix safely.
While at it, we change the has_dos_drive_prefix() return value: it
now returns the number of bytes to be skipped if there is a DOS
drive prefix.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In shared repositories, we have to be careful when writing files whose
permissions do not allow users other than the owner to write them.
In particular, we force the marks file of fast-export and the FETCH_HEAD
when fetching to be rewritten from scratch.
This commit does not touch other calls to fopen() that want to
write files:
- commands that write to working tree files (core.sharedRepository
does not affect permission bits of working tree files),
e.g. .rej file created by "apply --reject", result of applying a
previous conflict resolution by "rerere", "git merge-file".
- git am, when splitting mails (git-am correctly cleans up its directory
after finishing, so there is no need to share those files between users)
- git submodule clone, when writing the .git file, because the file
will not be overwritten
- git_terminal_prompt() in compat/terminal.c, because it is not writing to
a file at all
- git diff --output, because the output file is clearly not intended to be
shared between the users of the current repository
- git fast-import, when writing a crash report, because the reports' file
names are unique due to an embedded process ID
- mailinfo() in mailinfo.c, because the output is clearly not intended to
be shared between the users of the current repository
- check_or_regenerate_marks() in remote-testsvn.c, because this is only
used for Git's internal testing
- git fsck, when writing lost&found blobs (this should probably be
changed, but left as a low-hanging fruit for future contributors).
Note that this patch does not touch callers of write_file() and
write_file_gently(), which would benefit from the same scrutiny as
to usage in shared repositories. Most notable users are branch,
daemon, submodule & worktree, and a worrisome call in transport.c
when updating one ref (which ignores the shared flag).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
docs: clarify that --depth for git-fetch works with newly initialized repos
The original wording sounded as if --depth could only be used to deepen or
shorten the history of existing repos. However, that is not the case. In a
workflow like
docs: say "commits" in the --depth option wording for git-clone
It is not wrong to talk about "revisions" here, but in this context
revisions are always commits, and that is how we already name it in the
git-fetch docs. So align the docs by always referring to "commits".
Revert "dir.c: don't exclude whole dir prematurely if neg pattern may match"
This reverts commit 57534ee77d22e725d971ee89c77dc6aad61c573f. The
feature added in that commit requires that patterns behave the same way
from anywhere. But some patterns can behave differently depending on
current "working" directory. The conditions to catch and avoid these
patterns are too loose. The untracked listing[1] and sparse-checkout
selection[2] can become incorrect as a result.
commit: allow editing the commit message even in shared repos
It was pointed out by Yaroslav Halchenko that the file containing the
commit message is writable only by the owner, which means that we have
to rewrite it from scratch in a shared repository.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
docs: clarify that passing --depth to git-clone implies --single-branch
It is confusing to document how --depth behaves as part of the
--single-branch docs. Better move that part to the --depth docs, saying
that it implies --single-branch by default.
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Modify various document (man page) files to explain
in more detail what --signoff means.
This was inspired by https://lwn.net/Articles/669976/ where
paulj noted, "adding [the] '-s' argument to [a] git commit
doesn't really mean you have even heard of the DCO...".
Extending git's documentation will make it easier to argue
that developers understood --signoff when they use it.
Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
reflog-walk: don't segfault on non-commit sha1's in the reflog
git reflog (ab)uses the log machinery to display its list of log
entries. To do so it must fake commit parent information for the log
walker.
For refs in refs/heads this is no problem, as they should only ever
point to commits. Tags and other refs however can point to anything,
thus their reflog may contain non-commit objects.
To avoid segfaulting, we check whether reflog entries are commits before
feeding them to the log walker and skip any non-commits. This means that
git reflog output will be incomplete for such refs, but that's one step
up from segfaulting. A more complete solution would be to decouple git
reflog from the log walker machinery.
Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net> Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Tue, 5 Jan 2016 10:33:30 +0000 (11:33 +0100)]
t6050-replace: make failing editor test more robust
'git replace --edit' should error out when the invoked editor fails,
but the test checking this behavior would not notice if this weren't
the case.
The test in question, ever since it was added in 85f98fc037ae
(replace: add tests for --edit, 2014-05-17), has simulated a failing
editor in an unconventional way:
I presume the reason for this unconventional editor was the fact that
'git replace --edit' requires the edited object to be different from
the original, but a mere 'false' as editor would leave the object
unchanged and 'git replace --edit' would error out anyway complaining
about the new and the original object files being the same. Running
'fakeeditor' before 'false' was supposed to ensure that the object
file is modified and thus 'git replace --edit' errors out because of
the failed editor.
However, this editor doesn't actually modify the edited object,
because start_command() turns this editor into:
This means that the test's fakeeditor script doesn't even get the path
of the object to be edited as argument, triggering error messages from
the commands executed inside the script ('sed' and 'mv'), and
ultimately leaving the object file unchanged.
If a patch were to remove the die() from the error path after
launch_editor(), the test would not catch it, because 'git replace'
would continue execution past launch_editor() and would error out a
bit later due to the unchanged edited object. Though 'git replace'
would error out for the wrong reason, this would satisfy
'test_must_fail' just as well, and the test would succeed leaving the
undesired change unnoticed.
Create a proper failing fake editor script for this test to ensure
that the edited object is in fact modified and 'git replace --edit'
won't error out because the new and original object files are the
same.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Mon, 4 Jan 2016 22:03:07 +0000 (14:03 -0800)]
Merge branch 'jk/pending-keep-tag-name' into maint
History traversal with "git log --source" that starts with an
annotated tag failed to report the tag as "source", due to an
old regression in the command line parser back in v2.2 days.
* jk/pending-keep-tag-name:
revision.c: propagate tag names from pending array
Junio C Hamano [Mon, 4 Jan 2016 22:02:57 +0000 (14:02 -0800)]
Merge branch 'jk/ident-loosen-getpwuid' into maint
When getpwuid() on the system returned NULL (e.g. the user is not
in the /etc/passwd file or other uid-to-name mappings), the
codepath to find who the user is to record it in the reflog barfed
and died. Loosen the check in this codepath, which already accepts
questionable ident string (e.g. host part of the e-mail address is
obviously bogus), and in general when we operate fmt_ident() function
in non-strict mode.
* jk/ident-loosen-getpwuid:
ident: loosen getpwuid error in non-strict mode
ident: keep a flag for bogus default_email
ident: make xgetpwuid_self() a static local helper