It looks like cached patches are truncated to the nearest 1024-byte
boundary in the patch body. E.g.:
> mricon@nikko:[/tmp]$ wget -O no-cache
> "http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/?id=
6e1b4fdad5157bb9e88777d525704aba24389bee"
...
> 2014-06-11 15:34:51 (80.4 MB/s) - ‘no-cache’ saved [4767]
Patch is complete, without truncation. Next hit, with cache in place:
> mricon@nikko:[/tmp]$ wget -O yes-cache
> "http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/?id=6e1b4
>
fdad5157bb9e88777d525704aba24389bee"
...
> 2014-06-11 15:35:01 (17.0 MB/s) - ‘yes-cache’ saved [4096/4096]
Length truncated to 4096. The cache on disk looks truncated as well, so
the bug must me during the process of saving cache. The same is true for
larger patches:
> mricon@nikko:[/tmp]$ wget -O no-cache
> "http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/?id=
2840c566e95599cd60c7143762ca8b49d9395050"
...
> 2014-06-11 15:41:33 (1.07 MB/s) - ‘no-cache’ saved [979644]
979644 bytes with a cache-miss
> mricon@nikko:[/tmp]$ wget -O yes-cache
> "http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/?id=2840c
>
566e95599cd60c7143762ca8b49d9395050"
...
> 2014-06-11 15:41:46 (1.05 MB/s) - ‘yes-cache’ saved [978944]
978944 (956KB exactly) with a cache-hit
Since the "html" functions use raw write(2) to STDIO_FILENO, we don't
notice problems with most pages, but raw patches write using printf(3).
This is fine if we're outputting straight to stdout since the buffers
are flushed on exit, but we close the cache output before this, so the
cached output ends up being truncated.
Make sure the buffers are flushed when we finish outputting a patch so
that we avoid this.
No other UIs use printf(3) so we do not need to worry about them.
Actually, it's slightly more interesting than this... since we don't set
GIT_FLUSH, Git decides whether or not it will flush stdout after writing
each commit based on whether or not stdout points to a regular file (in
maybe_flush_or_die()).
Which means that when writing directly to the webserver, Git flushes
stdout for us, but when we redirect stdout to the cache it points to a
regular file so Git no longer flushes the output for us.
The patch is still correct, but perhaps the full explanation is
interesting!
Reported-by: Konstantin Ryabitsev <mricon@kernel.org>