This avoids keeping tree entries around, and frees them as it traverses
the list. This avoids building up a huge memory footprint just for these
small but very common allocations.
Note how the minor fault numbers - which end up being how many pages we
needed to map - go down from 42934 (167 MB) to 26718 (104 MB). That is:
Before:
42934 minor pagefaults
After:
26718 minor pagefaults
This is all in _addition_ to the previous fixes. It used to be
~48,000 pagefaults.
That's still a honking big memory footprint, but it's about half of what
it was just a day or two ago (and this is the object list for a pretty big
update - almost 60,000 objects. Smaller updates need less memory).
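As a rough check of the figures quoted above, the page counts can be converted to megabytes. This is only a sketch: the 4 KiB page size is an assumption (it is not stated in the message, but it matches the quoted 167 MB and 104 MB), and the helper name is made up for illustration.

```python
PAGE_SIZE = 4096  # bytes per page; assumed, not stated in the commit message


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a number of mapped pages to mebibytes."""
    return pages * page_size / (1024 * 1024)


before = pages_to_mib(42934)  # minor faults before the change
after = pages_to_mib(26718)   # minor faults after the change
print(f"before: {before:.1f} MiB, after: {after:.1f} MiB, "
      f"saved: {before - after:.1f} MiB")
```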
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Chris Wright [Fri, 16 Sep 2005 20:38:49 +0000 (13:38 -0700)]
[PATCH] Update git-core.spec.in
Update git-core spec file based on feedback from Fedora Extras review.
- update BuildRoot to be more specific
- eliminate Requires that must be satisfied for base system install
- drop Vendor
- use dist tag to differentiate between branches
- own %{_datadir}/git-core/
- use RPM_OPT_FLAGS in spec file
Signed-off-by: Chris Wright <chrisw@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
An earlier commit causes a mismatch in <emphasis> and <superscript>
tags. One way of fixing it is to have no more than one caret symbol per
line, which is the only solution I found in the asciidoc
documentation. Ugly, but it works.
[jc: ugly indeed but that is not Peter's fault.]
Signed-off-by: Peter Hagervall <hager@cs.umu.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The logic to calculate the full object list used to be very intertwined
with the logic that looked up the commits.
For no good reason - it's actually a lot simpler to just do that logic
as a separate pass.
This improves performance a bit, and uses slightly less memory in my
tests, but more importantly it makes the code simpler to work with and
follow what it does.
The performance win is less than I had hoped for, but I get:
i.e. it improved by 2 seconds, and took 5000+ fewer pages (hey, that's
20MB out of 174MB to go). And got the same number of objects (in theory,
the more expensive one might find some more shared objects to avoid. In
practice it obviously doesn't).
I know how to make it use _lots_ less memory, which will probably speed it
up. But that's for another time, and I'd prefer to see this go in first.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
As pointed out on the list, git-rev-list can use a lot of memory.
One low-hanging fruit is to free the commit buffer for commits that we
parse. By default, parse_commit() will save away the buffer, since a lot
of cases do want it, and re-reading it continually would be unnecessary.
However, in many cases the buffer isn't actually necessary and saving it
just wastes memory.
We could just free the buffer ourselves, but especially in git-rev-list,
we actually end up using the helper functions that automatically add
parent commits to the commit lists, so we don't actually control the
commit parsing directly.
Instead, just make this behaviour of "parse_commit()" a global flag.
Maybe this is a bit tasteless, but it's very simple, and it makes a
noticeable difference in memory usage.
Note how the minor faults have decreased from 3714 pages to 2433 pages.
That's all due to the fewer anonymous pages allocated to hold the commit
buffers and their metadata.
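The idea of a global switch that controls whether the parser keeps the raw commit text can be sketched as follows. This is a minimal illustration, not the actual C code: the names save_commit_buffer, set_save_commit_buffer, Commit and read_object are all assumptions made for the example.

```python
# Module-level switch; callers that never look at the raw text turn it off.
save_commit_buffer = True


def set_save_commit_buffer(value):
    """Flip the global flag (hypothetical helper for this sketch)."""
    global save_commit_buffer
    save_commit_buffer = bool(value)


class Commit:
    def __init__(self, sha1):
        self.sha1 = sha1
        self.parents = []
        self.buffer = None
        self.parsed = False


def parse_commit(commit, read_object):
    """Parse the parent links; keep the raw text only if the flag says so."""
    if commit.parsed:
        return commit
    raw = read_object(commit.sha1)
    for line in raw.splitlines():
        if not line:
            break  # a blank line ends the commit header
        if line.startswith("parent "):
            commit.parents.append(line.split(" ", 1)[1])
    # Helpers that parse parents on our behalf obey the same flag,
    # so no caller has to free the buffer by hand.
    commit.buffer = raw if save_commit_buffer else None
    commit.parsed = True
    return commit
```

A list-walking tool would call set_save_commit_buffer(False) once at startup; tools that print commit messages would leave the default alone.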
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio C Hamano [Thu, 15 Sep 2005 21:56:37 +0000 (14:56 -0700)]
Be more backward compatible with git-ssh-{push,pull}.
HPA reminded me that these programs know about the name of the
counterpart on the other end and simply symlinking the old name to
new name locally would not be enough.
H. Peter Anvin [Thu, 15 Sep 2005 19:33:14 +0000 (12:33 -0700)]
[PATCH] rsh.c env and quoting cleanup, take 2
This patch does proper quoting, and uses "env" to be compatible with
tcsh. As a side benefit, I believe the code is a lot cleaner to read.
[jc: I am accepting this not because I necessarily agree with the
quoting approach taken by it, but because (1) the code is only used
by ssh-fetch/ssh-upload pair which I do not care much about (if you
have an ssh account on the remote end you should be using the
git-send-pack/git-fetch-pack pair over ssh anyway), and (2) HPA is one of the more
important customers belonging to the Linux kernel community and I
want to help his workflow -- which includes not wasting his time by
asking him to switch to git-send-pack/git-fetch-pack pair, nor to use
a better shell ;-). I might not have taken this patch if it mucked
with git_connect in connect.c in its current form.]
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio C Hamano [Wed, 14 Sep 2005 20:15:53 +0000 (13:15 -0700)]
Unoptimize info/refs creation.
The code did not catch the case where you removed an existing ref
without changing anything else. We are not talking about hundreds of
refs anyway, so remove that optimization.
[PATCH] git-http-fetch: Allow caching of retrieved objects by proxy servers
By default the curl library adds "Pragma: no-cache" header to all
requests, which disables caching by proxy servers. However, most
files in a GIT repository are immutable, and caching them is safe and
could be useful.
This patch removes the "Pragma: no-cache" header from requests for all
files except the pack list (objects/info/packs) and references
(refs/*), which are really mutable and should not be cached.
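The rule described above can be sketched as a small decision function. This is an illustration only, written in Python rather than against the curl API; the function names are hypothetical.

```python
def is_mutable_path(path):
    """The pack list and the refs can change; other files are immutable."""
    return path == "objects/info/packs" or path.startswith("refs/")


def request_headers(path):
    """Extra HTTP headers to send when fetching `path` from a repository."""
    if is_mutable_path(path):
        # Keep proxies from serving a stale pack list or stale refs.
        return {"Pragma": "no-cache"}
    # Immutable files: send no Pragma header, so proxies may cache them.
    return {}
```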
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from 3b2a4c46fd5093ec79fb60e1b14b8d4a58c74612 commit)
Junio C Hamano [Wed, 14 Sep 2005 08:43:53 +0000 (01:43 -0700)]
git-branch -d <branch>: delete unused branch.
The new flag '-d' lets you delete a branch. For safety, it does not
let you delete the branch you are currently on, nor a branch that
has not been fully merged into your current branch.
The credit for the safety check idea goes to Daniel Barkalow.
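The safety check amounts to a reachability test: a branch may go only if it is not the current one and its tip is already reachable from the current branch. A minimal sketch, assuming the history is given as a plain dict mapping each commit to its parents (all names here are made up for the example):

```python
def ancestors(commit, parents):
    """All commits reachable from `commit`, including the commit itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def can_delete_branch(branch, current, heads, parents):
    """True if deleting `branch` loses nothing from the current branch's view."""
    if branch == current:
        return False  # never delete the branch we are on
    # Fully merged means the branch tip is an ancestor of the current tip.
    return heads[branch] in ancestors(heads[current], parents)
```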
An entry in the alternates file can name a directory relative to
the object store it describes. A typical linux-2.6 maintainer
repository would have "../../../torvalds/linux-2.6.git/objects" there,
because the subsystem maintainer object store would live in
This unfortunately is different from GIT_ALTERNATE_OBJECT_DIRECTORIES
which is relative to the cwd of the running process, but there is no
way to make it consistent with the behaviour of the environment
variable. The process typically is run in $system.git/ directory for
a naked repository, or one level up for a repository with a working
tree, so we just define it to be relative to the objects/ directory
to be different from either ;-).
Later, the dumb transport could be updated to read from info/alternates
and make requests to the repository that the repository borrows from.
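Resolving an alternates entry relative to the objects/ directory can be sketched like this. The function name and the /repos/... layout in the usage note are hypothetical; only the relative entry itself comes from the message above.

```python
import posixpath


def resolve_alternate(objects_dir, entry):
    """Resolve one alternates entry against the object store it lives in."""
    if posixpath.isabs(entry):
        return posixpath.normpath(entry)
    # Relative entries are relative to objects/, not to the process cwd.
    return posixpath.normpath(posixpath.join(objects_dir, entry))
```

With a made-up store at /repos/maintainer/subsystem.git/objects, the entry "../../../torvalds/linux-2.6.git/objects" resolves to /repos/torvalds/linux-2.6.git/objects no matter where the process was started.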
Junio C Hamano [Tue, 13 Sep 2005 02:47:07 +0000 (19:47 -0700)]
Fix CDPATH problem.
CDPATH has two problems:
* It takes scripts to unexpected places (somebody had
CDPATH=..:../..:$HOME and the "cd" in git-clone.sh:get_repo_base
took him to $HOME/.git when he said "clone foo bar" to clone a
repository in "foo" which had "foo/.git"). CDPATH mechanism does
not implicitly give "." at the beginning of CDPATH, which is
the most irritating part.
* The extra echo when it does its thing confuses scripts further.
Most of our scripts that use "cd" include git-sh-setup so the problem
is primarily fixed there. git-clone starts without a repository, and
it needs its own fix.
Junio C Hamano [Sun, 11 Sep 2005 21:12:08 +0000 (14:12 -0700)]
[PATCH] Make 'git checkout' a bit more forgiving when switching branches.
If you make a commit on a path, and then make the path
cache-dirty afterwards without changing its contents, 'git
checkout' to switch to another branch is prevented. Switching
branches is done with 'read-tree -m -u $current $next', which
detects that the path is cache-dirty but does not bother
noticing that the contents of the path have not actually
changed.
Since switching branches involves checking out the paths that
differ between the two branches, it is a reasonably expensive
operation anyway, so we can afford to run update-index before
running read-tree to keep this kind of false change from
triggering the check needlessly.
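The difference between "cache-dirty" and "really changed" can be sketched with a toy index. This is a simplification under stated assumptions: the index is a dict, only the mtime stands in for the cached stat data, and a plain SHA-1 of the contents stands in for the object name.

```python
import hashlib


def content_id(data):
    """Stand-in for the object name; plain SHA-1 of the bytes (simplified)."""
    return hashlib.sha1(data).hexdigest()


def is_cache_dirty(entry, mtime):
    """Cheap check: the cached stat data no longer matches the file."""
    return entry["mtime"] != mtime


def refresh(index, worktree):
    """Re-stat entries whose contents are in fact unchanged.

    index:    path -> {"mtime": int, "id": str}
    worktree: path -> (mtime, bytes)
    """
    for path, entry in index.items():
        mtime, data = worktree[path]
        if is_cache_dirty(entry, mtime) and content_id(data) == entry["id"]:
            entry["mtime"] = mtime  # false alarm: only the stat data moved
```

Running the refresh first means the later cheap check only trips on paths whose contents really differ.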
Junio C Hamano [Sun, 11 Sep 2005 18:35:20 +0000 (11:35 -0700)]
[PATCH] Omit patches that have already been merged from format-patch output.
This switches the logic to pick which commits to include in the output
from git-rev-list to git-cherry; as a side effect, 'format-patch ^up mine'
would stop working although up..mine would continue to work.
[PATCH] There are several undocumented dependencies
There are several undocumented dependencies in the .spec and in the
INSTALL files. The following is from Fedora, perhaps other RPM
distributions call the packages differently.
Also, the manpages aren't always installed gzipped.
Updates to git-core.spec.in file:
- Some git scripts use Perl
- gitk needs wish, which is part of TCL/Tk.
- curl is used all over
- Need the ssh program from openssh-clients
Updates to INSTALL:
- Mention wish
- Mention ssh
Signed-off-by: Horst H. von Brand <vonbrand@inf.utfsm.cl>
This allows any arbitrary flags to "grep", and knows about the few
special grep flags that take an argument too.
It also allows some flags for git-ls-files, although their usefulness
is questionable.
With this, something like
git grep -w -1 pattern
works, without the script enumerating every possible flag.
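The argument handling can be sketched as follows: pass every leading flag through untouched, and only special-case the flags that consume the next word. This is a Python illustration, not the shell script itself, and the exact set of argument-taking flags below is an assumption.

```python
# grep flags that take a separate argument; the exact set is assumed here.
FLAGS_WITH_ARG = {"-A", "-B", "-C", "-m"}


def split_grep_args(argv):
    """Split argv into (grep flags, pattern, remaining paths)."""
    flags = []
    i = 0
    while i < len(argv) and argv[i].startswith("-"):
        flags.append(argv[i])
        if argv[i] in FLAGS_WITH_ARG and i + 1 < len(argv):
            flags.append(argv[i + 1])  # the flag's own argument
            i += 1
        i += 1
    pattern = argv[i] if i < len(argv) else None
    return flags, pattern, argv[i + 1:]
```

Because unknown flags are simply forwarded, "-w -1 pattern" needs no table entry of its own.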
[jc: this is the version Linus sent out after I showed him a
barf-o-meter test version that avoids shell arrays. He must
have typed this version blindly, since he said:
I'm not barfing, but that's probably because my brain just shut
down and is desperately trying to gouge my eyes out with a spoon.
I slightly fixed it to catch the remaining arguments meant to be
given git-ls-files.]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio C Hamano [Sun, 11 Sep 2005 20:58:41 +0000 (13:58 -0700)]
Use int instead of socklen_t
This should work around the compilation problem Johannes Schindelin
and others had on Mac OS/X.
Quoting Linus:
Any operating system where socklen_t is anything else than "int" is
terminally broken. The people who introduced that typedef were confused,
and I actually had to argue with them that it was fundamentally wrong:
there is no other valid type than "int" that makes sense for it.
Herbert Xu [Mon, 12 Sep 2005 01:03:43 +0000 (11:03 +1000)]
[PATCH] Apply N -> A status change in diff-helper
When the git diff status 'N' was changed to 'A', diff-helper.c was
not updated accordingly. This means that it no longer shows the
diff for newly added files.
This patch makes that change in diff-helper.c.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Switched from backwards hard-coded tmp directory creation to using
File::Temp::tempdir() to create the directory inside $TMP_PATH or
what the user has provided via the -t parameter.
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Updated the usage/help message to match asciidoc documentation. The perldoc
documentation now includes the first paragraph from the asciidoc documentation
and points users to the manpage.
Updated TODO section.
Removed some redundant options from the getopt() invocation.
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
AsciiDoc replaces '--' with em-dash (—) by default. em-dash
looks a lot like a single long dash and it's very confusing when
we are talking about command options.
Section 21.2.8 'Replacements' of AsciiDoc's User Guide says that a
backslash in front of a double dash prevents the replacement. This
patch does just that.
Signed-off-by: Yasushi SHOJI <yashi@atmark-techno.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio C Hamano [Sun, 11 Sep 2005 00:46:27 +0000 (17:46 -0700)]
Add a new merge strategy by Fredrik Kuivinen.
I really wanted to try this out, instead of asking for an adjustment
to the 'git merge' driver and waiting. For now the new strategy is
called 'fredrik' and not in the list of default strategies to be tried.
The script wants Python 2.4 so this commit also adjusts Debian and RPM
build procedure files.
Junio C Hamano [Thu, 8 Sep 2005 20:47:12 +0000 (13:47 -0700)]
Multi-backend merge driver.
The new command 'git merge' takes the current head and one or more
remote heads, with the commit log message for the automated case.
If the heads being merged are simple fast-forwards, it acts the
same way as the current 'git resolve'. Otherwise, it tries
different merge strategies and takes the result from the one that
succeeded auto-merging, if there is any.
If no merge strategy succeeds auto-merging, their results are
evaluated for number of paths needed for hand resolving, and the
one with the least number of such paths is left in the working
tree. The user is asked to resolve them by hand and make a
commit manually.
The calling convention from the 'git merge' driver to merge
strategy programs is very simple:
- A strategy program is to be called 'git-merge-<strategy>'.
It is given one or more of the common ancestors, a double dash,
the current head, and one or more remote heads being merged into
the current branch.
- Before a strategy program is called, the working tree is
matched to the current <head>.
- The strategy program exits with status code 0 when it
successfully auto-merges the given heads. It should do
update-cache for all the merged paths when it does so -- the
index file will be used to record the merge result as a
commit by the driver.
- The strategy program exits with status code 1 when it leaves
conflicts behind. It should do update-cache for all the
merged paths that it successfully auto-merged, and leave the
cache entry in the index file as the same as <head> for paths
it could not auto-merge, and leave its best-effort result
with conflict markers in the working tree when it does so.
- The strategy program exits with status code other than 0 or
1 if it does not handle the given merge at all.
As examples, this commit comes with merge strategies based on
'git resolve' and 'git octopus'.
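The driver logic and calling convention described above can be sketched as follows. This is an illustration, not the shell driver itself: the `run` callback, which returns the exit status and the number of paths left for hand resolution, is a hypothetical stand-in for actually invoking a strategy program.

```python
def strategy_argv(strategy, commons, head, remotes):
    """Command line for one strategy: ancestors, '--', head, remote heads."""
    return ["git-merge-" + strategy, *commons, "--", head, *remotes]


def pick_strategy(strategies, run):
    """Try each strategy in order and decide whose result to keep.

    run(name) -> (exit_status, paths_needing_hand_resolution)
    Returns (name, conflicts), or None if no strategy handled the merge.
    """
    best = None
    for name in strategies:
        status, conflicts = run(name)
        if status == 0:
            return name, 0  # clean auto-merge: take it and stop
        if status == 1 and (best is None or conflicts < best[1]):
            best = (name, conflicts)  # fewest conflicted paths so far
        # Any other status: this strategy does not handle the merge.
    return best
```

On a tie the earlier strategy wins in this sketch; the message above does not say how ties are broken.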
Junio C Hamano [Tue, 6 Sep 2005 19:53:56 +0000 (12:53 -0700)]
[PATCH] Add debugging help for case #16 to read-tree.c
This will help us detect if real-world example merges have multiple
merge-base candidates and one of them matches one head while another
matches the other head.
Junio C Hamano [Sat, 10 Sep 2005 22:18:31 +0000 (15:18 -0700)]
Keep bisection log so that it can be replayed later.
The 'git bisect' command was very unforgiving in that once you made a
mistake telling it good/bad it was very hard to take it back. Keep a
log of what you told it in an earlier session, so that it can be
replayed after removing everything after what you botched last time.
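Replaying a log amounts to feeding the recorded answers back in order, after cutting off the part that went wrong. A minimal sketch, assuming the log is a list of (verdict, revision) pairs; the format and names are made up for the example and are not the on-disk format.

```python
def replay_bisect_log(entries):
    """Rebuild the bisection state from recorded (verdict, revision) pairs."""
    good, bad = [], None
    for verdict, rev in entries:
        if verdict == "good":
            good.append(rev)
        elif verdict == "bad":
            bad = rev  # the most recent 'bad' narrows the range
        else:
            raise ValueError("unknown verdict: " + verdict)
    return good, bad


log = [("bad", "v3"), ("good", "v1"), ("good", "v2-oops")]
# The last answer was a mistake: drop it and replay the rest.
good, bad = replay_bisect_log(log[:-1])
```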
Junio C Hamano [Sat, 10 Sep 2005 19:42:32 +0000 (12:42 -0700)]
Fix copy marking from diffcore-rename.
When (A,B) ==> (B,C) rename-copy was detected, we incorrectly said
that C was created by copying B. This is because we only check if the
path of rename/copy source still exists in the resulting tree to see
if the file is renamed out of existence. In this case, the new B is
created by copying or renaming A, so the original B is lost and we
should say C is a rename of B not a copy of B.
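The corrected rule can be sketched like this: a source counts as surviving only if its path is still present and that path was not itself produced by another rename or copy. This is a simplified illustration with made-up names; it ignores the case of one source with several destinations.

```python
def classify(pairs, result_paths):
    """Label each destination as a 'rename' or a 'copy' of its source.

    pairs:        list of (source_path, destination_path)
    result_paths: set of paths present in the resulting tree
    """
    created = {dst for _, dst in pairs}
    labels = {}
    for src, dst in pairs:
        # A path that was re-created from something else does not keep
        # the original source alive.
        survives = src in result_paths and src not in created
        labels[dst] = "copy" if survives else "rename"
    return labels
```

For (A,B) ==> (B,C) the pairs are A->B and B->C; the new B comes from A, so both destinations are labelled as renames.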
We now keep track of the patches merged in each branch since they have
diverged, using the records that the Arch "logs" provide. Merge parents
for a commit are defined if we are merging a series of patches that starts
from the mergebase.
If patches from a related branch are merged out-of-order, we keep track of
how much has been merged sequentially -- the tip of that sequential merge
is our new parent from that branch.
This mechanism works very well for branches that merge in dovetail and/or
flying fish patterns, probably less well for others.
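Finding the tip of the sequential merge is a longest-prefix scan over the other branch's patches. A minimal sketch, assuming the patches are listed in order starting from the merge base and the merged ones are known as a set (the names are illustrative, not taken from the import script):

```python
def sequential_merge_tip(branch_patches, merged):
    """Last patch of the longest fully merged prefix, or None.

    branch_patches: patch ids of the related branch, in order from the
                    merge base
    merged:         set of patch ids recorded as merged
    """
    tip = None
    for patch in branch_patches:
        if patch not in merged:
            break  # first gap: later out-of-order merges do not count
        tip = patch
    return tip
```

With patches p1..p4 and p1, p2, p4 merged, the new parent is p2; p4 was merged out of order and does not move the tip.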
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio C Hamano [Fri, 9 Sep 2005 22:40:45 +0000 (15:40 -0700)]
show-branch: --list and --independent
The --list option is what 'git branch' without parameter should
have been; it shows the one-line commit message for each branch
name. The --independent option is used to filter out commits
that are reachable from other commits, to make detection of
fast forward condition in multi-head merge easier.
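The --independent filter can be sketched as a reachability test between the given heads. This is an illustration under assumptions: history is a dict mapping each commit to its parents, and the function names are invented for the example.

```python
def reachable_from(commit, parents):
    """All commits reachable from `commit`, including the commit itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def independent(commits, parents):
    """Keep only the commits not reachable from any other given commit."""
    result = []
    for c in commits:
        others = [o for o in commits if o != c]
        if not any(c in reachable_from(o, parents) for o in others):
            result.append(c)
    return result
```

If only one head survives the filter, every other head is already contained in it, which is the fast forward case.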
Marco Roeland [Fri, 9 Sep 2005 18:08:50 +0000 (20:08 +0200)]
[PATCH] remove duplicate git-send-email-script.perl target in Makefile
Remove duplicate git-send-email-perl target in Makefile.
When WITH_SEND_EMAIL was defined, as in the Debian 'deb' target,
git-send-email-perl was added twice to SCRIPT_PERL, leading to a
duplicate definition in the Makefile. Creating a ".deb" then failed.
Signed-off-by: Marco Roeland <marco.roeland@xs4all.nl>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio C Hamano [Fri, 9 Sep 2005 01:50:33 +0000 (18:50 -0700)]
'build' scripts before installing.
Earlier we renamed git-foo.sh to git-foo while installing, which
was done more by inertia than anything else. This however
made writing tests to use scripts harder.
This patch builds the scripts the same way as we build binaries
from their sources. As a side effect, you can now specify
a non-standard path to your Perl binary when running
the make.
git-daemon run from inetd does not work properly. inetd routes stderr onto the
network line just like stdout, which was apparently not expected to be so.
As the result of this, the stream is closed by the receiver, because some
"Packing %d objects\n" originating from pack_objects is first reported over
the line instead of the expected pack_header, and so the SIGNATURE test
fails. Here is a workaround.
[PATCH] Do not create bogus branch from flag to git branch
If you run `git branch --help', you will unexpectedly have created a new
branch named "--help". This simple patch adds logic and a usage
statement to catch this and similar problems, and adds a testcase for it.
Signed-off-by: Amos Waterland <apw@rossby.metr.ou.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Patrick Mauritz [Mon, 5 Sep 2005 23:24:03 +0000 (01:24 +0200)]
[PATCH] Portability fix for Solaris 10/x86
* getdomainname unavailable there.
* needs -lsocket for linkage.
* needs __EXTENSIONS__ at the beginning of convert-objects.c
[JC: I've done this slightly differently from what Patrick originally
sent to the list and dropped the bit that deals with installations
that have the curl header and library at a non-default location. I am
resisting the slippery slope called autoconf.]
Junio C Hamano [Thu, 8 Sep 2005 00:26:23 +0000 (17:26 -0700)]
Big tool rename.
As promised, this is the "big tool rename" patch. The primary differences
since 0.99.6 are:
(1) git-*-script are no more. The commands installed do not
have any such suffix so users do not have to remember if
something is implemented as a shell script or not.
(2) Many command names with 'cache' in them are renamed with
'index' if that is what they mean.
There are backward compatibility symbolic links so that you and
Porcelains can keep using the old names, but the backward
compatibility support is expected to be removed in the near
future.