Junio C Hamano [Wed, 15 Aug 2018 22:08:25 +0000 (15:08 -0700)]
Merge branch 'jk/size-t'
Code clean-up to use size_t/ssize_t when they are the right type.
* jk/size-t:
strbuf_humanise: use unsigned variables
pass st.st_size as hint for strbuf_readlink()
strbuf_readlink: use ssize_t
strbuf: use size_t for length in intermediate variables
reencode_string: use size_t for string lengths
reencode_string: use st_add/st_mult helpers
Junio C Hamano [Wed, 15 Aug 2018 22:08:25 +0000 (15:08 -0700)]
Merge branch 'sg/coccicheck-updates'
Update the way we use Coccinelle to find out-of-style code that
need to be modernised.
* sg/coccicheck-updates:
coccinelle: extract dedicated make target to clean Coccinelle's results
coccinelle: put sane filenames into output patches
coccinelle: exclude sha1dc source files from static analysis
coccinelle: use $(addsuffix) in 'coccicheck' make target
coccinelle: mark the 'coccicheck' make target as .PHONY
Junio C Hamano [Wed, 15 Aug 2018 22:08:25 +0000 (15:08 -0700)]
Merge branch 'sb/histogram-less-memory'
"git diff --histogram" had a bad memory usage pattern, which has
been rearranged to reduce the peak usage.
* sb/histogram-less-memory:
xdiff/histogram: remove tail recursion
xdiff/xhistogram: move index allocation into find_lcs
xdiff/xhistogram: factor out memory cleanup into free_index()
xdiff/xhistogram: pass arguments directly to fall_back_to_classic_diff
Junio C Hamano [Wed, 15 Aug 2018 22:08:23 +0000 (15:08 -0700)]
Merge branch 'nd/i18n'
Many more strings are prepared for l10n.
* nd/i18n: (23 commits)
transport-helper.c: mark more strings for translation
transport.c: mark more strings for translation
sha1-file.c: mark more strings for translation
sequencer.c: mark more strings for translation
replace-object.c: mark more strings for translation
refspec.c: mark more strings for translation
refs.c: mark more strings for translation
pkt-line.c: mark more strings for translation
object.c: mark more strings for translation
exec-cmd.c: mark more strings for translation
environment.c: mark more strings for translation
dir.c: mark more strings for translation
convert.c: mark more strings for translation
connect.c: mark more strings for translation
config.c: mark more strings for translation
commit-graph.c: mark more strings for translation
builtin/replace.c: mark more strings for translation
builtin/pack-objects.c: mark more strings for translation
builtin/grep.c: mark strings for translation
builtin/config.c: mark more strings for translation
...
Junio C Hamano [Wed, 15 Aug 2018 22:08:23 +0000 (15:08 -0700)]
Merge branch 'hs/gpgsm'
Teach "git tag -s" etc. a few configuration variables (gpg.format
that can be set to "openpgp" or "x509", and gpg.<format>.program
that is used to specify what program to use to deal with the format)
to allow x.509 certs with CMS via "gpgsm" to be used instead of
openpgp via "gnupg".
* hs/gpgsm:
gpg-interface t: extend the existing GPG tests with GPGSM
gpg-interface: introduce new signature format "x509" using gpgsm
gpg-interface: introduce new config to select per gpg format program
gpg-interface: do not hardcode the key string len anymore
gpg-interface: introduce an abstraction for multiple gpg formats
t/t7510: check the validation of the new config gpg.format
gpg-interface: add new config to select how to sign a commit
Junio C Hamano [Wed, 15 Aug 2018 22:08:23 +0000 (15:08 -0700)]
Merge branch 'bw/clone-ref-prefixes'
The wire-protocol v2 relies on the client to send "ref prefixes" to
limit the bandwidth spent on the initial ref advertisement. "git
clone" when learned to speak v2 forgot to do so, which has been
corrected.
* bw/clone-ref-prefixes:
clone: send ref-prefixes when using protocol v2
Junio C Hamano [Wed, 15 Aug 2018 22:08:22 +0000 (15:08 -0700)]
Merge branch 'jk/core-use-replace-refs'
A new configuration variable core.usereplacerefs has been added,
primarily to help server installations that want to ignore the
replace mechanism altogether.
Junio C Hamano [Wed, 15 Aug 2018 22:08:22 +0000 (15:08 -0700)]
Merge branch 'bb/make-developer-pedantic'
"make DEVELOPER=1 DEVOPTS=pedantic" allows developers to compile
with -pedantic option, which may catch more problematic program
constructs and potential bugs.
* bb/make-developer-pedantic:
Makefile: add a DEVOPTS flag to get pedantic compilation
Junio C Hamano [Wed, 15 Aug 2018 22:08:21 +0000 (15:08 -0700)]
Merge branch 'sg/travis-cocci-diagnose-failure'
Update the way we run static analysis tool at TravisCI to make it
easier to use its findings.
* sg/travis-cocci-diagnose-failure:
travis-ci: fail if Coccinelle static analysis found something to transform
travis-ci: run Coccinelle static analysis with two parallel jobs
Junio C Hamano [Thu, 2 Aug 2018 22:30:46 +0000 (15:30 -0700)]
Merge branch 'es/chain-lint-in-subshell'
Look for broken "&&" chains that are hidden in subshell, many of
which have been found and corrected.
* es/chain-lint-in-subshell:
t/chainlint.sed: drop extra spaces from regex character class
t/chainlint: add chainlint "specialized" test cases
t/chainlint: add chainlint "complex" test cases
t/chainlint: add chainlint "cuddled" test cases
t/chainlint: add chainlint "loop" and "conditional" test cases
t/chainlint: add chainlint "nested subshell" test cases
t/chainlint: add chainlint "one-liner" test cases
t/chainlint: add chainlint "whitespace" test cases
t/chainlint: add chainlint "basic" test cases
t/Makefile: add machinery to check correctness of chainlint.sed
t/test-lib: teach --chain-lint to detect broken &&-chains in subshells
Junio C Hamano [Thu, 2 Aug 2018 22:30:46 +0000 (15:30 -0700)]
Merge branch 'jt/fetch-negotiator-skipping'
Add a server-side knob to skip commits in exponential/fibbonacci
stride in an attempt to cover wider swath of history with a smaller
number of iterations, potentially accepting a larger packfile
transfer, instead of going back one commit a time during common
ancestor discovery during the "git fetch" transaction.
* jt/fetch-negotiator-skipping:
negotiator/skipping: skip commits during fetch
Junio C Hamano [Thu, 2 Aug 2018 22:30:46 +0000 (15:30 -0700)]
Merge branch 'jm/send-email-tls-auth-on-batch'
"git send-email" when using in a batched mode that limits the
number of messages sent in a single SMTP session lost the contents
of the variable used to choose between tls/ssl, unable to send the
second and later batches, which has been fixed.
* jm/send-email-tls-auth-on-batch:
send-email: fix tls AUTH when sending batch
"git rebase" started exporting GIT_DIR environment variable and
exposing it to hook scripts when part of it got rewritten in C.
Instead of matching the old scripted Porcelains' behaviour,
compensate by also exporting GIT_WORK_TREE environment as well to
lessen the damage. This can harm existing hooks that want to
operate on different repository, but the current behaviour is
already broken for them anyway.
* bc/sequencer-export-work-tree-as-well:
sequencer: pass absolute GIT_WORK_TREE to exec commands
Tests to cover conflict cases that involve submodules have been
added for merge-recursive.
* en/t7405-recursive-submodule-conflicts:
t7405: verify 'merge --abort' works after submodule/path conflicts
t7405: add a directory/submodule conflict
t7405: add a file/submodule conflict
Junio C Hamano [Thu, 2 Aug 2018 22:30:45 +0000 (15:30 -0700)]
Merge branch 'en/t6036-merge-recursive-tests'
Tests to cover various conflicting cases have been added for
merge-recursive.
* en/t6036-merge-recursive-tests:
t6036: add a failed conflict detection case: regular files, different modes
t6036: add a failed conflict detection case with conflicting types
t6036: add a failed conflict detection case with submodule add/add
t6036: add a failed conflict detection case with submodule modify/modify
t6036: add a failed conflict detection case with symlink add/add
t6036: add a failed conflict detection case with symlink modify/modify
Junio C Hamano [Thu, 2 Aug 2018 22:30:44 +0000 (15:30 -0700)]
Merge branch 'en/dirty-merge-fixes'
The recursive merge strategy did not properly ensure there was no
change between HEAD and the index before performing its operation,
which has been corrected.
* en/dirty-merge-fixes:
merge: fix misleading pre-merge check documentation
merge-recursive: enforce rule that index matches head before merging
t6044: add more testcases with staged changes before a merge is invoked
merge-recursive: fix assumption that head tree being merged is HEAD
merge-recursive: make sure when we say we abort that we actually abort
t6044: add a testcase for index matching head, when head doesn't match HEAD
t6044: verify that merges expected to abort actually abort
index_has_changes(): avoid assuming operating on the_index
read-cache.c: move index_has_changes() from merge.c
Junio C Hamano [Thu, 2 Aug 2018 22:30:44 +0000 (15:30 -0700)]
Merge branch 'js/rebase-merge-octopus'
"git rebase --rebase-merges" mode now handles octopus merges as
well.
* js/rebase-merge-octopus:
rebase --rebase-merges: adjust man page for octopus support
rebase --rebase-merges: add support for octopus merges
merge: allow reading the merge commit message from a file
Junio C Hamano [Thu, 2 Aug 2018 22:30:43 +0000 (15:30 -0700)]
Merge branch 'kg/gc-auto-windows-workaround'
"git gc --auto" opens file descriptors for the packfiles before
spawning "git repack/prune", which would upset Windows that does
not want a process to work on a file that is open by another
process. The issue has been worked around.
* kg/gc-auto-windows-workaround:
gc --auto: release pack files before auto packing
Junio C Hamano [Thu, 2 Aug 2018 22:30:43 +0000 (15:30 -0700)]
Merge branch 'jm/cache-entry-from-mem-pool'
For a large tree, the index needs to hold many cache entries
allocated on heap. These cache entries are now allocated out of a
dedicated memory pool to amortize malloc(3) overhead.
* jm/cache-entry-from-mem-pool:
block alloc: add validations around cache_entry lifecyle
block alloc: allocate cache entries from mem_pool
mem-pool: fill out functionality
mem-pool: add life cycle management functions
mem-pool: only search head block for available space
block alloc: add lifecycle APIs for cache_entry structs
read-cache: teach make_cache_entry to take object_id
read-cache: teach refresh_cache_entry to take istate
Junio C Hamano [Thu, 2 Aug 2018 22:30:43 +0000 (15:30 -0700)]
Merge branch 'jt/fetch-nego-tip'
"git fetch" learned a new option "--negotiation-tip" to limit the
set of commits it tells the other end as "have", to reduce wasted
bandwidth and cycles, which would be helpful when the receiving
repository has a lot of refs that have little to do with the
history at the remote it is fetching from.
* jt/fetch-nego-tip:
fetch-pack: support negotiation tip whitelist
Junio C Hamano [Thu, 2 Aug 2018 22:30:41 +0000 (15:30 -0700)]
Merge branch 'jt/fetch-pack-negotiator'
Code restructuring and a small fix to transport protocol v2 during
fetching.
* jt/fetch-pack-negotiator:
fetch-pack: introduce negotiator API
fetch-pack: move common check and marking together
fetch-pack: make negotiation-related vars local
fetch-pack: use ref adv. to prune "have" sent
fetch-pack: directly end negotiation if ACK ready
fetch-pack: clear marks before re-marking
fetch-pack: split up everything_local()
Junio C Hamano [Thu, 2 Aug 2018 22:30:41 +0000 (15:30 -0700)]
Merge branch 'ab/checkout-default-remote'
"git checkout" and "git worktree add" learned to honor
checkout.defaultRemote when auto-vivifying a local branch out of a
remote tracking branch in a repository with multiple remotes that
have tracking branches that share the same names.
* ab/checkout-default-remote:
checkout & worktree: introduce checkout.defaultRemote
checkout: add advice for ambiguous "checkout <branch>"
builtin/checkout.c: use "ret" variable for return
checkout: pass the "num_matches" up to callers
checkout.c: change "unique" member to "num_matches"
checkout.c: introduce an *_INIT macro
checkout.h: wrap the arguments to unique_tracking_name()
checkout tests: index should be clean after dwim checkout
Junio C Hamano [Thu, 2 Aug 2018 22:30:40 +0000 (15:30 -0700)]
Merge branch 'sb/diff-color-move-more'
"git diff --color-moved" feature has further been tweaked.
* sb/diff-color-move-more:
diff.c: offer config option to control ws handling in move detection
diff.c: add white space mode to move detection that allows indent changes
diff.c: factor advance_or_nullify out of mark_color_as_moved
diff.c: decouple white space treatment from move detection algorithm
diff.c: add a blocks mode for moved code detection
diff.c: adjust hash function signature to match hashmap expectation
diff.c: do not pass diff options as keydata to hashmap
t4015: avoid git as a pipe input
xdiff/xdiffi.c: remove unneeded function declarations
xdiff/xdiff.h: remove unused flags
Junio C Hamano [Thu, 2 Aug 2018 22:30:39 +0000 (15:30 -0700)]
Merge branch 'jk/fsck-gitmodules-gently'
Recent "security fix" to pay attention to contents of ".gitmodules"
while accepting "git push" was a bit overly strict than necessary,
which has been adjusted.
* jk/fsck-gitmodules-gently:
fsck: downgrade gitmodulesParse default to "info"
fsck: split ".gitmodules too large" error from parse failure
fsck: silence stderr when parsing .gitmodules
config: add options parameter to git_config_from_mem
config: add CONFIG_ERROR_SILENT handler
config: turn die_on_error into caller-facing enum
Junio C Hamano [Thu, 2 Aug 2018 22:30:39 +0000 (15:30 -0700)]
Merge branch 'bc/object-id'
Conversion from uchar[40] to struct object_id continues.
* bc/object-id:
pretty: switch hard-coded constants to the_hash_algo
sha1-file: convert constants to uses of the_hash_algo
log-tree: switch GIT_SHA1_HEXSZ to the_hash_algo->hexsz
diff: switch GIT_SHA1_HEXSZ to use the_hash_algo
builtin/merge-recursive: make hash independent
builtin/merge: switch to use the_hash_algo
builtin/fmt-merge-msg: make hash independent
builtin/update-index: simplify parsing of cacheinfo
builtin/update-index: convert to using the_hash_algo
refs/files-backend: use the_hash_algo for writing refs
sha1-name: use the_hash_algo when parsing object names
strbuf: allocate space with GIT_MAX_HEXSZ
commit: express tree entry constants in terms of the_hash_algo
hex: switch to using the_hash_algo
tree-walk: replace hard-coded constants with the_hash_algo
cache: update object ID functions for the_hash_algo
Junio C Hamano [Thu, 2 Aug 2018 22:30:37 +0000 (15:30 -0700)]
Merge branch 'jk/has-uncommitted-changes-fix'
"git pull --rebase" on a corrupt HEAD caused a segfault. In
general we substitute an empty tree object when running the in-core
equivalent of the diff-index command, and the codepath has been
corrected to do so as well to fix this issue.
* jk/has-uncommitted-changes-fix:
has_uncommitted_changes(): fall back to empty tree
Eric Sunshine [Tue, 31 Jul 2018 05:03:20 +0000 (01:03 -0400)]
t/chainlint.sed: drop extra spaces from regex character class
This character class, like many others in this script, matches
horizontal whitespace consisting of spaces and tabs, however, a few
extra, entirely harmless, spaces somehow slipped into the expression.
Removing them is purely a cosmetic fix.
While at it, re-indent three lines with a single TAB each which were
incorrectly indented with six spaces. Also, a purely cosmetic fix.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eric Sunshine [Tue, 24 Jul 2018 21:58:45 +0000 (17:58 -0400)]
diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"
The --color-moved "dimmed_zebra" mode (with an underscore) is an
anachronism. Most options and modes are hyphenated. It is more difficult
to type and somewhat more difficult to read than those which are
hyphenated. Therefore, rename it to "dimmed-zebra", and nominally
deprecate "dimmed_zebra".
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 24 Jul 2018 21:50:50 +0000 (14:50 -0700)]
Merge branch 'es/test-lint-one-shot-export'
Look for broken use of "VAR=VAL shell_func" in test scripts as part
of test-lint.
* es/test-lint-one-shot-export:
t/check-non-portable-shell: detect "FOO=bar shell_func"
t/check-non-portable-shell: make error messages more compact
t/check-non-portable-shell: stop being so polite
t6046/t9833: fix use of "VAR=VAL cmd" with a shell function
"git rev-parse ':/substring'" did not consider the history leading
only to HEAD when looking for a commit with the given substring,
when the HEAD is detached. This has been fixed.
* wc/find-commit-with-pattern-on-detached-head:
sha1-name.c: for ":/", find detached HEAD commits
Junio C Hamano [Tue, 24 Jul 2018 21:50:48 +0000 (14:50 -0700)]
Merge branch 'mk/merge-in-sparse-checkout'
"git reset --merge" (hence "git merge ---abort") and "git reset --hard"
had trouble working correctly in a sparsely checked out working
tree after a conflict, which has been corrected.
* mk/merge-in-sparse-checkout:
unpack-trees: do not fail reset because of unmerged skipped entry
Junio C Hamano [Tue, 24 Jul 2018 21:50:48 +0000 (14:50 -0700)]
Merge branch 'hs/push-cert-check-cleanup'
Code clean-up.
* hs/push-cert-check-cleanup:
gpg-interface: make parse_gpg_output static and remove from interface header
builtin/receive-pack: use check_signature from gpg-interface
Junio C Hamano [Tue, 24 Jul 2018 21:50:47 +0000 (14:50 -0700)]
Merge branch 'jt/partial-clone-fsck-connectivity'
Partial clone support of "git clone" has been updated to correctly
validate the objects it receives from the other side. The server
side has been corrected to send objects that are directly
requested, even if they may match the filtering criteria (e.g. when
doing a "lazy blob" partial clone).
* jt/partial-clone-fsck-connectivity:
clone: check connectivity even if clone is partial
upload-pack: send refs' objects despite "filter"
Junio C Hamano [Tue, 24 Jul 2018 21:50:47 +0000 (14:50 -0700)]
Merge branch 'bc/send-email-auto-cte'
The content-transfer-encoding of the message "git send-email" sends
out by default was 8bit, which can cause trouble when there is an
overlong line to bust RFC 5322/2822 limit. A new option 'auto' to
automatically switch to quoted-printable when there is such a line
in the payload has been introduced and is made the default.
* bc/send-email-auto-cte:
docs: correct RFC specifying email line length
send-email: automatically determine transfer-encoding
send-email: accept long lines with suitable transfer encoding
send-email: add an auto option for transfer encoding
Junio C Hamano [Tue, 24 Jul 2018 21:50:47 +0000 (14:50 -0700)]
Merge branch 'bb/pedantic'
The codebase has been updated to compile cleanly with -pedantic
option.
* bb/pedantic:
utf8.c: avoid char overflow
string-list.c: avoid conversion from void * to function pointer
sequencer.c: avoid empty statements at top level
convert.c: replace "\e" escapes with "\033".
fixup! refs/refs-internal.h: avoid forward declaration of an enum
refs/refs-internal.h: avoid forward declaration of an enum
fixup! connect.h: avoid forward declaration of an enum
connect.h: avoid forward declaration of an enum
"git fetch" failed to correctly validate the set of objects it
received when making a shallow history deeper, which has been
corrected.
* jt/connectivity-check-after-unshallow:
fetch-pack: write shallow, then check connectivity
fetch-pack: implement ref-in-want
fetch-pack: put shallow info in output parameter
fetch: refactor to make function args narrower
fetch: refactor fetch_refs into two functions
fetch: refactor the population of peer ref OIDs
upload-pack: test negotiation with changing repository
upload-pack: implement ref-in-want
test-pkt-line: add unpack-sideband subcommand
Junio C Hamano [Tue, 24 Jul 2018 21:50:44 +0000 (14:50 -0700)]
Merge branch 'jk/for-each-ref-icase'
The "--ignore-case" option of "git for-each-ref" (and its friends)
did not work correctly, which has been fixed.
* jk/for-each-ref-icase:
ref-filter: avoid backend filtering with --ignore-case
for-each-ref: consistently pass WM_IGNORECASE flag
t6300: add a test for --ignore-case
Junio C Hamano [Tue, 24 Jul 2018 21:50:43 +0000 (14:50 -0700)]
Merge branch 'en/rebase-consistency'
"git rebase" behaved slightly differently depending on which one of
the three backends gets used; this has been documented and an
effort to make them more uniform has begun.
* en/rebase-consistency:
git-rebase: make --allow-empty-message the default
t3401: add directory rename testcases for rebase and am
git-rebase.txt: document behavioral differences between modes
directory-rename-detection.txt: technical docs on abilities and limitations
git-rebase.txt: address confusion between --no-ff vs --force-rebase
git-rebase: error out when incompatible options passed
t3422: new testcases for checking when incompatible options passed
git-rebase.sh: update help messages a bit
git-rebase.txt: document incompatible options
Junio C Hamano [Tue, 24 Jul 2018 21:50:43 +0000 (14:50 -0700)]
Merge branch 'sb/submodule-move-head-error-msg'
"git checkout --recurse-submodules another-branch" did not report
in which submodule it failed to update the working tree, which
resulted in an unhelpful error message.
* sb/submodule-move-head-error-msg:
submodule.c: report the submodule that an error occurs in
Junio C Hamano [Tue, 24 Jul 2018 21:50:42 +0000 (14:50 -0700)]
Merge branch 'rj/submodule-fsck-skip'
"fsck.skipList" did not prevent a blob object listed there from
being inspected for is contents (e.g. we recently started to
inspect the contents of ".gitmodules" for certain malicious
patterns), which has been corrected.
* rj/submodule-fsck-skip:
fsck: check skiplist for object in fsck_blob()
Jeff King [Tue, 24 Jul 2018 10:52:29 +0000 (06:52 -0400)]
strbuf_humanise: use unsigned variables
All of the numeric formatting done by this function uses
"%u", but we pass in a signed "int". The actual range
doesn't matter here, since the conditional makes sure we're
always showing reasonably small numbers. And even gcc's
format-checker does not seem to mind. But it's potentially
confusing to a reader of the code to see the mismatch.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 24 Jul 2018 10:51:39 +0000 (06:51 -0400)]
pass st.st_size as hint for strbuf_readlink()
When we initially added the strbuf_readlink() function in b11b7e13f4 (Add generic 'strbuf_readlink()' helper function,
2008-12-17), the point was that we generally have a _guess_
as to the correct size based on the stat information, but we
can't necessarily trust it.
Over the years, a few callers have grown up that simply pass
in 0, even though they have the stat information. Let's have
them pass in their hint for consistency (and in theory
efficiency, since it may avoid an extra resize/syscall loop,
but neither location is probably performance critical).
Note that st.st_size is actually an off_t, so in theory we
need xsize_t() here. But none of the other callsites use it,
and since this is just a hint, it doesn't matter either way
(if we wrap we'll simply start with a too-small hint and
then eventually complain when we cannot allocate the
memory).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 24 Jul 2018 10:51:25 +0000 (06:51 -0400)]
strbuf_readlink: use ssize_t
The return type of readlink() is ssize_t, not int. This
probably doesn't matter in practice, as it would require a
2GB symlink destination, but it doesn't hurt to be careful.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 24 Jul 2018 10:51:08 +0000 (06:51 -0400)]
strbuf: use size_t for length in intermediate variables
A few strbuf functions store the length of a strbuf in a
temporary variable. We should always use size_t for this, as
it's possible for a strbuf to exceed an "int" (e.g., a 2GB
string on a 64-bit system). This is unlikely in practice,
but we should try to behave sensibly on silly or malicious
input.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 24 Jul 2018 10:50:33 +0000 (06:50 -0400)]
reencode_string: use size_t for string lengths
The iconv interface takes a size_t, which is the appropriate
type for an in-memory buffer. But our reencode_string_*
functions use integers, meaning we may get confusing results
when the sizes exceed INT_MAX. Let's use size_t
consistently.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 24 Jul 2018 10:50:10 +0000 (06:50 -0400)]
reencode_string: use st_add/st_mult helpers
When converting a string with iconv, if the output buffer
isn't big enough, we grow it. But our growth is done without
any concern for integer overflow. So when we add:
outalloc = sofar + insz * 2 + 32;
we may end up wrapping outalloc (which is a size_t), and
allocating a too-small buffer. We then manipulate it
further:
outsz = outalloc - sofar - 1;
and feed outsz back to iconv. If outalloc is wrapped and
smaller than sofar, we'll end up with a small allocation but
feed a very large outsz to iconv, which could result in it
overflowing the buffer.
Can we use this to construct an attack wherein the victim
clones a repository with a very large commit object with an
encoding header, and running "git log" reencodes it into
utf8, causing an overflow?
An attack of this sort is likely impossible in practice.
"sofar" is how many output bytes we've written total, and
"insz" is the number of input bytes remaining. Imagine our
input doubles in size as we output it (which is easy to do
by converting latin1 to utf8, for example), and that we
start with N input bytes. Our initial output buffer also
starts at N bytes, so after the first call we'd have N/2
input bytes remaining (insz), and have written N bytes
(sofar). That means our next allocation will be
(N + N/2 * 2 + 32) bytes, or (2N + 32).
We can therefore overflow a 32-bit size_t with a commit
message that's just under 2^31 bytes, assuming it consists
mostly of "doubling" sequences (e.g., latin1 0xe1 which
becomes utf8 0xc3 0xa1).
But we'll never make it that far with such a message. We'll
be spending 2^31 bytes on the original string. And our
initial output buffer will also be 2^31 bytes. Which is not
going to succeed on a system with a 32-bit size_t, since
there will be other things using the address space, too. The
initial malloc will fail.
If we imagine instead that we can triple the size when
converting, then our second allocation becomes
(N + 2/3N * 2 + 32), or (7/3N + 32). That still requires two
allocations of 3/7 of our address space (6/7 of the total)
to succeed.
If we imagine we can quadruple, it becomes (5/2N + 32); we
need to be able to allocate 4/5 of the address space to
succeed.
This might start to get plausible. But is it possible to get
a 4-to-1 increase in size? Probably if you're converting to
some obscure encoding. But since git defaults to utf8 for
its output, that's the likely destination encoding for an
attack. And while there are 4-character utf8 sequences, it's
unlikely that you'd be able find a single-byte source
sequence in any encoding.
So this is certainly buggy code which should be fixed, but
it is probably not a useful attack vector.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Mon, 23 Jul 2018 13:51:00 +0000 (15:51 +0200)]
coccinelle: extract dedicated make target to clean Coccinelle's results
Sometimes I want to remove only Coccinelle's results, but keep all
other build artifacts left after my usual 'make all man' build. This
new 'cocciclean' make target will allow just that.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Note the lack of 'diff --opts <old> <new>' line, the differing number
of path components on the --- and +++ lines, and the nonsensical
filename on the +++ line. 'patch -p0' can still apply these patches,
as it takes the filename to be modified from the --- line. Alas, 'git
apply' can't, because it takes the filename from the +++ line, and
then complains about the nonexisting file.
Pass the '--patch .' options to Coccinelle via the SPATCH_FLAGS 'make'
variable, as it seems to make it generate proper context diff patches,
with the header starting with a 'diff ...' line and containing sane
filenames. The resulting 'contrib/coccinelle/*.cocci.patch' files
then can be applied both with 'git apply' and 'patch' (even without
'-p0').
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Mon, 23 Jul 2018 13:50:58 +0000 (15:50 +0200)]
coccinelle: exclude sha1dc source files from static analysis
sha1dc is an external library, that we carry in-tree for convenience
or grab as a submodule, so there is no use in applying our semantic
patches to its source files.
Therefore, exclude sha1dc's source files from Coccinelle's static
analysis.
This change also makes the static analysis somewhat faster: presumably
because of the heavy use of repetitive macro declarations, applying
the semantic patches 'array.cocci' and 'swap.cocci' to 'sha1dc/sha1.c'
takes over half a minute each on my machine, which amounts to about a
third of the runtime of applying these two semantic patches to the
whole git source tree.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Mon, 23 Jul 2018 13:50:57 +0000 (15:50 +0200)]
coccinelle: use $(addsuffix) in 'coccicheck' make target
The dependencies of the 'coccicheck' make target are listed with the
help of the $(patsubst) make function, which in this case doesn't do
any pattern substitution, but only adds the '.patch' suffix.
Use the shorter and more idiomatic $(addsuffix) make function instead.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Regression tests are automated tests which try to ensure a specific
behavior. The idea is: if the test case fails, the behavior indicated in
the test case's title regressed.
If a regression test that fails, even occasionally, for any reason other
than to indicate the particular regression(s) it tries to catch, it is
less useful than when it really only fails when there is a bug in the
(non-test) code that needs to be fixed.
In the instance of the test case "submodule update --init --recursive
from subdirectory" of the script t7406-submodule-update.sh, the exact
output of a recursive clone is compared with a pre-generated one. And
this is a racy test because the structure of the submodules only
guarantees a *partial* order. The 'none' and the 'rebasing' submodules
*can* be cloned in any order, which means that a mismatch with the
hard-coded order does not necessarily indicate a bug in the tested code.
See for example:
https://git-for-windows.visualstudio.com/git/_build/results?buildId=14035&view=logs
To prevent such false positives from unnecessarily costing time when
investigating test failures, let's take the exact order of the lines out
of the equation by sorting them before comparing them.
This test script seems not to have any more test cases that try to
verify any specific order in which recursive clones process the
submodules, therefore this is the only test case that is changed in this
manner.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Mon, 23 Jul 2018 13:02:30 +0000 (15:02 +0200)]
travis-ci: fail if Coccinelle static analysis found something to transform
Coccinelle's and in turn 'make coccicheck's exit code only indicates
that Coccinelle managed to finish its analysis without any errors
(e.g. no unknown --options, no missing files, no syntax errors in the
semantic patches, etc.), but it doesn't indicate whether it found any
undesired code patterns to transform or not. To find out the latter,
one has to look closer at 'make coccicheck's standard output and look
for lines like:
And this only indicates that there is something to transform, but to
see what the suggested transformations are one has to actually look
into those '*.cocci.patch' files.
This makes the automated static analysis build job on Travis CI not
particularly useful, because it neither draws our attention to
Coccinelle's findings, nor shows the actual findings. Consequently,
new topics introducing undesired code patterns graduated to master
on several occasions without anyone noticing.
The only way to draw attention in such an automated setting is to fail
the build job. Therefore, modify the 'ci/run-static-analysis.sh'
build script to check all the resulting '*.cocci.patch' files, and
fail the build job if any of them turns out to be not empty. Include
those files' contents, i.e. Coccinelle's suggested transformations, in
the build job's trace log, so we'll know why it failed.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Mon, 23 Jul 2018 13:02:29 +0000 (15:02 +0200)]
travis-ci: run Coccinelle static analysis with two parallel jobs
Currently the static analysis build job runs Coccinelle using a single
'make' job. Using two parallel jobs cuts down the build job's run
time from around 10-12mins to 6-7mins, sometimes even under 6mins
(there is quite large variation between build job runtimes). More
than two parallel jobs don't seem to bring further runtime benefits.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>