Nicolas Pitre [Sat, 26 May 2007 02:16:27 +0000 (22:16 -0400)]
update diff-delta.c copyright
There is actually nothing left from the original LibXDiff code I used
over 2 years ago, and even the GIT implementation has diverged quite a
bit from LibXDiff's at this point. Let's update the copyright notice
to better reflect that fact.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Nicolas Pitre [Sat, 26 May 2007 01:38:58 +0000 (21:38 -0400)]
improve delta long block matching with big files
Martin Koegler noted that create_delta() performs a new hash lookup
after every block copy encoding which are currently limited to 64KB.
In case of larger identical blocks, the next hash lookup would normally
point to the next 64KB block in the reference buffer and multiple block
copy operations will be consecutively encoded.
It is however possible that the reference buffer be sparsely indexed if
hash buckets have been trimmed down in create_delta_index() when hashing
of the reference buffer isn't well balanced. In that case the hash
lookup following a block copy might fail to match anything and the fact
that the reference buffer still matches beyond the previous 64KB block
will be missed.
Let's rework the code so that buffer comparison isn't bounded to 64KB
anymore. The match size should be as large as possible up front and
only then should multiple block copy be encoded to cover it all.
Also, fewer hash lookups will be performed in the end.
According to Martin, this patch should reduce his 92MB pack down to 75MB
with the dataset he has.
Tests performed on the Linux kernel repo show a slightly smaller pack and
a slightly faster repack.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Linus Torvalds [Sat, 26 May 2007 16:25:31 +0000 (09:25 -0700)]
Make the pack-refs interfaces usable from outside
This just basically creates a "pack_refs()" function that could be used by
anybody. You pass it in the flags you want as a bitmask (PACK_REFS_ALL and
PACK_REFS_PRUNE), and it will do all the heavy lifting.
Of course, it's still static, and it's all in the builtin-pack-refs.c
file, so it's not actually visible to the outside, but the next step would
be to just move it all to a library file (probably refs.c) and expose it.
Then we could easily make "git gc" do this too.
While I did it, I also made it check the return value of the fflush and
fsync stage, to make sure that we don't overwrite the old packed-refs file
with something that got truncated due to write errors!
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio C Hamano [Sun, 27 May 2007 01:53:22 +0000 (18:53 -0700)]
Merge branch 'maint'
* maint:
Fix git-svn to handle svn not reporting the md5sum of a file, and test.
Fix mishandling of $Id$ expanded in the repository copy in convert.c
More echo "$user_message" fixes.
Add tests for the last two fixes.
git-commit: use printf '%s\n' instead of echo on user-supplied strings
git-am: use printf instead of echo on user-supplied strings
Documentation: Add definition of "evil merge" to GIT Glossary
Replace the last 'dircache's by 'index'
Documentation: Clean up links in GIT Glossary
Junio C Hamano [Sat, 26 May 2007 08:30:40 +0000 (01:30 -0700)]
Merge branch 'maint-1.5.1' into maint
* maint-1.5.1:
Fix git-svn to handle svn not reporting the md5sum of a file, and test.
More echo "$user_message" fixes.
Add tests for the last two fixes.
git-commit: use printf '%s\n' instead of echo on user-supplied strings
git-am: use printf instead of echo on user-supplied strings
Documentation: Add definition of "evil merge" to GIT Glossary
Replace the last 'dircache's by 'index'
Documentation: Clean up links in GIT Glossary
Andy Parkins [Fri, 25 May 2007 10:50:08 +0000 (11:50 +0100)]
Fix mishandling of $Id$ expanded in the repository copy in convert.c
If the repository contained an expanded ident keyword (i.e. $Id:XXXX$),
then the wrong bytes were discarded, and the Id keyword was not
expanded. The fault was in convert.c:ident_to_worktree().
Previously, when a "$Id:" was found in the repository version,
ident_to_worktree() would search for the next "$" after this, and
discarded everything it found until then. That was done with the loop:
do {
ch = *cp++;
if (ch == '$')
break;
rem--;
} while (rem);
The above loop left cp pointing one character _after_ the final "$"
(because of ch = *cp++). This was different from the non-expanded case,
were cp is left pointing at the "$", and was different from the comment
which stated "discard up to but not including the closing $". This
patch fixes that by making the loop:
do {
ch = *cp;
if (ch == '$')
break;
cp++;
rem--;
} while (rem);
That is, cp is tested _then_ incremented.
This loop exits if it finds a "$" or if it runs out of bytes in the
source. After this loop, if there was no closing "$" the expansion is
skipped, and the outer loop is allowed to continue leaving this
non-keyword as it was. However, when the "$" is found, size is
corrected, before running the expansion:
size -= (cp - src);
This is wrong; size is going to be corrected anyway after the expansion,
so there is no need to do it here. This patch removes that redundant
correction.
To help find this bug, I heavily commented the routine; those comments
are included here as a bonus.
Signed-off-by: Andy Parkins <andyparkins@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio C Hamano [Sat, 26 May 2007 07:26:20 +0000 (00:26 -0700)]
Add tests for the last two fixes.
This updates t4014 to check the two fixes for git-am and git-commit
we observed with "echo" that does backslash interpolation by default
without being asked with -e option.
Junio C Hamano [Sat, 26 May 2007 05:00:54 +0000 (22:00 -0700)]
git-commit: use printf '%s\n' instead of echo on user-supplied strings
This fixes the same issue git-am had, which was fixed by Jeff
King in the previous commit. Cleverly enough, this commit's log
message is a good test case at the same time.
Jeff King [Sat, 26 May 2007 03:42:36 +0000 (23:42 -0400)]
git-am: use printf instead of echo on user-supplied strings
Under some implementations of echo (such as that provided by
dash), backslash escapes are recognized without any other
options. This means that echo-ing user-supplied strings may
cause any backslash sequences in them to be converted. Using
printf resolves the ambiguity.
This bug can be seen when using git-am to apply a patch
whose subject contains the character sequence "\n"; the
characters are converted to a literal newline. Noticed by
Szekeres Istvan.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
Nicolas Pitre [Sat, 26 May 2007 03:37:40 +0000 (23:37 -0400)]
fixes to output of git-verify-pack -v
Now that the default delta depth is 50, it is a good idea to also bump
MAX_CHAIN to 50.
While at it, make the display a bit prettier by making the MAX_CHAIN
limit inclusive, and display the number of deltas that are above that
limit at the end instead of the beginning.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio C Hamano [Fri, 25 May 2007 04:35:29 +0000 (21:35 -0700)]
Merge branch 'maint'
* maint:
fix memory leak in parse_object when check_sha1_signature fails
name-rev: tolerate clock skew in committer dates
Update bash completion for git-config options
Teach bash completion about recent log long options
Teach bash completion about 'git remote update'
Update bash completion header documentation
Remove a duplicate --not option in bash completion
Teach bash completion about git-shortlog
Hide the plumbing diff-{files,index,tree} from bash completion
Update bash completion to ignore some more plumbing commands
Junio C Hamano [Fri, 25 May 2007 04:34:59 +0000 (21:34 -0700)]
Merge branch 'master' of git://repo.or.cz/git/fastimport into maint
* 'master' of git://repo.or.cz/git/fastimport:
Update bash completion for git-config options
Teach bash completion about recent log long options
Teach bash completion about 'git remote update'
Update bash completion header documentation
Remove a duplicate --not option in bash completion
Teach bash completion about git-shortlog
Hide the plumbing diff-{files,index,tree} from bash completion
Update bash completion to ignore some more plumbing commands
Linus Torvalds [Thu, 24 May 2007 18:41:39 +0000 (11:41 -0700)]
Make "git gc" pack all refs by default
I've taught myself to use "git gc" instead of doing the repack explicitly,
but it doesn't actually do what I think it should do.
We've had packed refs for a long time now, and I think it just makes sense
to pack normal branches too. So I end up having to do
git pack-refs --all --prune
in order to get a nice git repo that doesn't have any unnecessary files.
So why not just do that in "git gc"? It's not as if there really is any
downside to packing branches, even if they end up changing later. Quite
often they don't, and even if they do, so what?
Also, make the default for refs packing just be an unambiguous "do it",
rather than "do it by default only for non-bare repositories". If you want
that behaviour, you can always just add a
[gc]
packrefs = notbare
in your ~/.gitconfig file, but I don't actually see why bare would be any
different (except for the broken reason that http-fetching used to be
totally broken, and not doing it just meant that it didn't even get
fixed in a timely manner!).
So here's a trivial patch to make "git gc" do a better job. Hmm?
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Carlos Rica [Fri, 25 May 2007 01:46:22 +0000 (03:46 +0200)]
fix memory leak in parse_object when check_sha1_signature fails
When check_sha1_signature fails, program is not terminated:
it prints an error message and returns NULL, so the
buffer returned by read_sha1_file should be freed before.
Signed-off-by: Carlos Rica <jasampler@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio C Hamano [Thu, 24 May 2007 19:20:42 +0000 (12:20 -0700)]
name-rev: tolerate clock skew in committer dates
In git.git repository, "git-name-rev v1.3.0~158" cannot name the
rev, while adjacent revs can be named.
This was because it gives up traversal from the tips of existing
refs as soon as it sees a commit that has older commit timestamp
than what is being named. This is usually a good heuristics,
but v1.3.0~158 has a slightly older commit timestamp than
v1.3.0~157 (i.e. it's child), as these two were made in a
separate repostiory (in fact, in a different continent).
This adds a hardcoded slop value (1 day) to the cut-off
heuristics to work this kind of problem around. The current
algorithm essentially runs around from the available tips down
to ancient commits and names every single rev available that are
newer than cut-off date, so a single day slop would not add that
much overhead in repositories with long enough history where the
performance of name-rev matters.
I think the algorithm could be made a bit smarter by deepening
the graph on demand as a new commit is asked to be named (this
would require rewriting of name_rev() function not to recurse
itself but use a traversal list like revision.c traverser does),
but that would be a separate issue.
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
Shawn O. Pearce [Thu, 24 May 2007 06:07:45 +0000 (02:07 -0400)]
Update bash completion for git-config options
A few new configuration options grew out of the woodwork during the
1.5.2 series. Most of these are pretty easy to support a completion
of, so we do so.
I wanted to also add completion support for the <driver> part of
merge.<driver>.name but to do that we have to look at all of the
.gitattributes files and guess what the unique set of <driver>
strings would be. Since this appears to be non-trivial I'm punting
on it at this time.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 24 May 2007 05:51:30 +0000 (01:51 -0400)]
Teach bash completion about recent log long options
(Somewhat) recently git-log learned about --reverse (to show commits
in the opposite order) and a looong time ago I think it learned
about --raw (to show the raw diff, rather than a unified diff).
These are both useful options, so we should make them easy for the
user to complete.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 24 May 2007 05:46:49 +0000 (01:46 -0400)]
Teach bash completion about 'git remote update'
Recently the git-remote command grew an update subcommand, which
can be used to execute git-fetch across multiple repositories
in a single step. These can be configured with the 'remotes.*'
configuration options, so we can offer completion for any name that
matches and appears to be useful to git-remote update.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 24 May 2007 05:36:46 +0000 (01:36 -0400)]
Update bash completion header documentation
1) Added a note about supporting the long options for most commands,
as we have been doing so for quite some time.
2) Include a notice that these routines are covered by the GPL,
as that may not be obvious, even though they are distributed
as part of the core Git distribution.
3) Added a short section on how to send patches to the routines,
and to whom they should get sent to. Currently that is me,
as I am the active maintainer.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 24 May 2007 05:26:58 +0000 (01:26 -0400)]
Remove a duplicate --not option in bash completion
This was just me being silly; I put the --not option into the
completion list twice. There's no duplicates shown in the shell
as the shell removes them before showing them to the user. But we
really don't need the duplicates in the source script either.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 24 May 2007 05:25:34 +0000 (01:25 -0400)]
Teach bash completion about git-shortlog
We've had completion for git-log for quite some time, but just
today I noticed we don't have it for the new builtin shortlog
that runs git-log internally. This is indeed a handy thing to
have completion for, especially when your branch names are of
the Very-Very-Long-and-Hard/To-Type/Variety/That-Some-Use.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 24 May 2007 05:01:02 +0000 (01:01 -0400)]
Hide the plumbing diff-{files,index,tree} from bash completion
The diff-* programs are meant to be plumbing for the diff frontend;
most end users aren't invoking these commands directly. Consequently
we should avoid showing them as possible completions.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 24 May 2007 04:32:31 +0000 (00:32 -0400)]
Fix possible coredump with fast-import --import-marks
When e8438420bb7d368bec3647b90c557b9931582267 allowed us to reload
the marks table on subsequent runs of fast-import we really broke
things, as we set pack_id to MAX_PACK_ID for any objects we imported
into the marks table. Creating a branch from that mark should fail
as we attempt to read the object through a non-existant packed_git
pointer. Instead we have to use the normal Git object system to
locate the older commit, as we ourselves do not have a reference
to the packed_git it resides in.
This bug only occurred because t9300 was not complete enough.
When we added the --import-marks feature we didn't actually test
its implementation enough to verify the function worked as intended.
I have corrected that, and included the changes as part of this fix.
Prior versions of fast-import fail the new test(s); this commit
allows them to pass.
Credit for this bug find goes to Simon Hausmann <simon@lst.de> as
he recently identified a similiar bug in the tree lazy-loading path.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 24 May 2007 04:05:19 +0000 (00:05 -0400)]
Refactor fast-import branch creation from existing commit
To resolve a corner case uncovered by Simon Hausmann I need to
reuse the logic for the SHA-1 expression version of the 'from '
command within the mark version of the 'from ' command. This change
doesn't alter any functionality, but is merely breaking the common
code out to a function that I can reuse.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Simon Hausmann [Wed, 23 May 2007 21:01:49 +0000 (23:01 +0200)]
fast-import: Fix crash when referencing already existing objects
Commit a5c1780a0355a71b9fb70f1f1977ce726ee5b8d8 sets the pack_id of existing
objects to MAX_PACK_ID. When the same object is referenced later again it is
found in the local object hash. With such a pack_id fast-import should not try
to locate that object in the newly created pack(s).
Signed-off-by: Simon Hausmann <simon@lst.de> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Mark Levedahl [Sun, 20 May 2007 15:46:46 +0000 (11:46 -0400)]
gitweb.perl - Optionally send archives as .zip files
git-archive already knows how to generate an archive as a tar or a zip
file, but gitweb did not. zip archvies are much more usable in a Windows
environment due to native support and this patch allows a site admin the
option to deliver zip rather than tar files. The selection is done by
inserting
Junio C Hamano [Wed, 23 May 2007 05:52:59 +0000 (22:52 -0700)]
Fix command line parameter parser of revert/cherry-pick
The parser was inconsistently done, in that it did not look at
the last command line parameter to see if it could be an unknown
option, although it was designed to notice unknown options if
they were given in positions the command expects to find them
(i.e. everything except the last parameter, which ought to be
<commit-ish>). This prevented a very natural invocation
While we can easily test the cvs <-> git-cvsserver
communication with :fork: and git-cvsserver server
there are some pserver specifics we should test, too.
Currently this are two tests of the pserver authentication.
Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
Shawn O. Pearce [Mon, 21 May 2007 07:20:25 +0000 (03:20 -0400)]
Teach git-describe how to run name-rev
Often users want to know not which tagged version a commit came
after, but which tagged version a commit is contained within.
This latter task is the job of git-name-rev, but most users are
looking to git-describe to do the job.
Junio suggested we make `git describe --contains` run the correct
tool, `git name-rev`, and that's exactly what we do here. The output
of name-rev was adjusted slightly through the new --name-only option,
allowing describe to execv into name-rev and maintain its current
output format.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio C Hamano [Mon, 21 May 2007 06:51:06 +0000 (23:51 -0700)]
git-apply: Fix removal of new trailing blank lines.
The earlier code removed one newline too many from the hunk that
adds new lines at the end of the file. Also the way the code
counted the added blank lines was somewhat roundabout; I think
the way updated code does it is more direct and easier to
follow:
* We keep track of the number of blank lines added;
* While processing each line, we notice if it adds a blank
line, and increment the counter, or reset it to zero
otherwise;
* When actually we apply the data, we remove the empty lines we
counted earlier if we are applying it at the end of the
file.
Jakub Narebski [Sat, 19 May 2007 20:08:11 +0000 (22:08 +0200)]
Add an option to git-ls-tree to display also the size of blob
Add -l/--long option to git-ls-tree command, which displays
object size of a blob entry. Object size is placed after
object id (left-justified with minimum width of 7 characters).
For non-blob entries `-' is used.
Rationale: for non-blob entries size of an object has no much
meaning, and is not very interesting. Moreover, in planned
pack v4 tree objects would be constructed on demand, so tree
size would need to be calculated.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
Petr Baudis [Sat, 19 May 2007 00:13:29 +0000 (02:13 +0200)]
git-rev-list: Add regexp tuning options
This patch introduces --extended-regexp and --regexp-ignore-case options to
tune what kind of patterns the pattern-limiting options (--grep, --author,
...) accept.
Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio C Hamano [Mon, 21 May 2007 02:58:03 +0000 (19:58 -0700)]
Merge branch 'maint'
* maint:
annotate: make it work from subdirectories.
git-config: Correct asciidoc documentation for --int/--bool
t1300: Add tests for git-config --bool --get
unpack-trees.c: verify_uptodate: remove dead code
Use PATH_MAX instead of TEMPFILE_PATH_LEN
branch: fix segfault when resolving an invalid HEAD
Junio C Hamano [Mon, 21 May 2007 02:57:00 +0000 (19:57 -0700)]
Merge branch 'maint-1.5.1' into maint
* maint-1.5.1:
annotate: make it work from subdirectories.
git-config: Correct asciidoc documentation for --int/--bool
t1300: Add tests for git-config --bool --get
unpack-trees.c: verify_uptodate: remove dead code
Use PATH_MAX instead of TEMPFILE_PATH_LEN
branch: fix segfault when resolving an invalid HEAD
Junio C Hamano [Sun, 20 May 2007 09:18:43 +0000 (02:18 -0700)]
Merge branch 'np/pack'
* np/pack:
deprecate the new loose object header format
make "repack -f" imply "pack-objects --no-reuse-object"
allow for undeltified objects not to be reused
Eric Wong [Sat, 19 May 2007 10:59:02 +0000 (03:59 -0700)]
git-svn: don't minimize-url when doing an init that tracks multiple paths
I didn't have a chance to test the off-by-default minimize-url
stuff enough before, but it's quite broken for people passing
the --trunk/-T, --tags/-t, --branches/-b switches to "init" or
"clone" commands.
Additionally, follow-parent functionality seems broken when we're
not connected to the root of the repository.
Default behavior for "traditional" git-svn users who only track
one directory (without needing follow-parent) should be
reasonable, as those users started using things before
minimize-url functionality existed.
Behavior for users more used to the git-svnimport-like command
line will also benefit from a more-flexible command-line than
svnimport given the assumption they're working with
non-restrictive read permissions on the repository.
I hope to properly fix these bugs when I get a chance to in the
next week or so, but I would like to get this stopgap measure of
reverting to the old behavior as soon as possible.
Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
Eric Wong [Sat, 19 May 2007 09:58:37 +0000 (02:58 -0700)]
git-svn: avoid crashing svnserve when creating new directories
When sorting directory names by depth (slash ("/") count) and
closing the deepest directories first (as the protocol
requires), we failed to put the root baton (with an empty string
as its key "") after top-level directories (which did not have
any slashes).
This resulted in svnserve being in a situation it couldn't
handle and caused a segmentation fault on the remote server.
This bug did not affect users of DAV and filesystem repositories.
Signed-off-by: Eric Wong <normalperson@yhbt.net> Confirmed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <junkio@cox.net>
Johan Herland [Wed, 16 May 2007 00:31:40 +0000 (02:31 +0200)]
user-manual: Add section on ignoring files
The todo list at the end of the user manual says that something must be
said about .gitignore. Also, there seems to be a lack of documentation
on how to choose between the various types of ignore files (.gitignore
vs. .git/info/exclude, etc.).
This patch adds a section on ignoring files which try to introduce how
to tell git about ignored files, and how the different strategies
complement eachother.
The syntax of exclude patterns is explained in a simplified manner, with
a reference to git-ls-files(1) which already contains a more thorough
explanation.
J. Bruce Fields [Tue, 15 May 2007 04:30:58 +0000 (00:30 -0400)]
user-manual: discourage shared repository
I don't really want to look like we're encouraging the shared repository
thing. Take down some of the argument for using purely
single-developer-owned repositories and collaborating using patches and
pulls instead.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
J. Bruce Fields [Fri, 18 May 2007 04:51:42 +0000 (00:51 -0400)]
tutorial: revise index introduction
The embarassing history of this tutorial is that I started it without
really understanding the index well, so I avoided mentioning it.
And we all got the idea that "index" was a word to avoid using around
newbies, but it was reluctantly mentioned that *something* had to be
said. The result is a little awkward: the discussion of the index never
actually uses that word, and isn't well-integrated into the surrounding
material.
Let's just go ahead and use the word "index" from the very start, and
try to demonstrate its use with a minimum of lecturing.
Also, remove discussion of using git-commit with explicit filenames.
We're already a bit slow here to get people to their first commit, and
I'm not convinced this is really so important.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>