Lasse Collin [Fri, 10 Sep 2010 07:30:33 +0000 (10:30 +0300)]
xz: Multiple fixes.
The code assumed that printing numbers with thousand separators
and decimal points would always produce only US-ASCII characters.
This was used for buffer sizes (with snprintf(), no overflows)
and aligning columns of the progress indicator and --list. That
assumption was wrong (e.g. LC_ALL=fi_FI.UTF-8 with glibc), so
multibyte character support was added in this commit. The old
way is used if the operating system doesn't have enough multibyte
support (e.g. lacks wcwidth()).
The sizes of buffers were increased to accomodate multibyte
characters. I don't know how big they should be exactly, but
they aren't used for anything critical, so it's not too bad.
If they still aren't big enough, I hopefully get a bug report.
snprintf() takes care of avoiding buffer overflows.
Some static buffers were replaced with buffers allocated on
stack. double_to_str() was removed. uint64_to_str() and
uint64_to_nicestr() now share the static buffer and test
for thousand separator support.
Integrity check names "None" and "Unknown-N" (2 <= N <= 15)
were marked to be translated. I had forgot these, plus they
wouldn't have worked correctly anyway before this commit,
because printing tables with multibyte strings didn't work.
Thanks to Marek Černocký for reporting the bug about
misaligned table columns in --list output.
Lasse Collin [Sat, 4 Sep 2010 19:16:28 +0000 (22:16 +0300)]
Don't set lc=4 with --extreme.
This should reduce the cases where --extreme makes
compression worse. On the other hand, some other
files may now benefit slightly less from --extreme.
Lasse Collin [Fri, 3 Sep 2010 09:28:41 +0000 (12:28 +0300)]
liblzma: Adjust default depth calculation for HC3 and HC4.
It was 8 + nice_len / 4, now it is 4 + nice_len / 4.
This allows faster settings at lower nice_len values,
even though it seems that I won't use automatic depth
calcuation with HC3 and HC4 in the presets.
Lasse Collin [Sat, 7 Aug 2010 17:45:18 +0000 (20:45 +0300)]
Disable the memory usage limiter by default.
For several people, the limiter causes bigger problems that
it solves, so it is better to have it disabled by default.
Those who want to have a limiter by default need to enable
it via the environment variable XZ_DEFAULTS.
Support for environment variable XZ_DEFAULTS was added. It is
parsed before XZ_OPT and technically identical with it. The
intended uses differ quite a bit though; see the man page.
The memory usage limit can now be set separately for
compression and decompression using --memlimit-compress and
--memlimit-decompress. To set both at once, -M or --memlimit
can be used. --memory was retained as a legacy alias for
--memlimit for backwards compatibility.
The semantics of --info-memory were changed in backwards
incompatible way. Compatibility wasn't meaningful due to
changes in the memory usage limiter functionality.
The memory usage limiter info is no longer shown at the
bottom of xz --long -help.
The memory usage limiter support for removed completely from xzdec.
xz's man page was updated to match the above changes. Various
unrelated fixes were also made to the man page.
Lasse Collin [Tue, 13 Jul 2010 16:55:50 +0000 (19:55 +0300)]
Add two simple example programs.
Hopefully these help a bit when learning the basics
of liblzma API. I plan to write detailed examples about
both basic and advanced features with lots of comments,
but these two examples are good have right now.
The examples were written by Daniel Mealha Cabrita. Thanks.
Lasse Collin [Wed, 2 Jun 2010 20:09:22 +0000 (23:09 +0300)]
Silence a bogus Valgrind warning.
When using -O2 with GCC, it liked to swap two comparisons
in one "if" statement. It's otherwise fine except that
the latter part, which is seemingly never executed, got
executed (nothing wrong with that) and then triggered
warning in Valgrind about conditional jump depending on
uninitialized variable. A few people find this annoying
so do things a bit differently to avoid the warning.
Lasse Collin [Mon, 12 Apr 2010 18:55:56 +0000 (21:55 +0300)]
Show both elapsed time and estimated remaining time in xz -v.
The extra space for showing both has been taken from the
sizes field. If the sizes grow big, bigger units than MiB
will be used. It makes it slightly difficult to see that
progress is still happening with huge files, but it should
be OK in practice.
Thanks to Trent W. Buck for <http://bugs.debian.org/574583>
and Jonathan Nieder for suggestions how to fix it.
Lasse Collin [Sun, 7 Mar 2010 11:59:32 +0000 (13:59 +0200)]
Treat all integer multiplier suffixes as base-2.
Originally both base-2 and base-10 were supported, but since
there seems to be little need for base-10 in XZ Utils, treat
everything as base-2 and also be more relaxed about the case
of the first letter of the suffix. Now xz will accept e.g.
KiB, Ki, k, K, kB, and KB, and interpret them all as 1024. The
recommended spelling of the suffixes are still KiB, MiB, and GiB.
Lasse Collin [Sun, 7 Mar 2010 11:50:23 +0000 (13:50 +0200)]
Consistently round up the memory usage limit in messages.
It still feels a bit wrong to round 1 byte to 1 MiB but
at least it is now done consistently so that the same
byte value is always rounded the same way to MiB.
Lasse Collin [Sun, 7 Mar 2010 11:29:28 +0000 (13:29 +0200)]
Increase the default memory usage limit on "low-memory" systems.
Previously the default limit was always 40 % of RAM. The
new limit is a little bit more complex:
- If 40 % of RAM is at least 80 MiB, 40 % of RAM is used
as the limit.
- If 80 % of RAM is over 80 MiB, 80 MiB is used as the limit.
- Otherwise 80 % of RAM is used as the limit.
This should make it possible to decompress files created with
"xz -9" on more systems. Swapping is generally more expected
on systems with less RAM, so higher default limit on them
shouldn't cause too bad surprises in terms of heavy swapping.
Instead, the higher default limit should reduce the number of
bad surprises when it used to prevent decompression of files
created with "xz -9". The DoS prevention system shouldn't be
a DoS itself.
Note that even with the new default limit, a system with 64 MiB
RAM cannot decompress files created with "xz -9" without user
overriding the limit. This should be OK, because if xz is going
to need more memory than the system has RAM, it will run very
very slowly and thus it's good that user has to override the limit
in that case.
Lasse Collin [Sat, 6 Mar 2010 19:17:20 +0000 (21:17 +0200)]
Fix missing initialization in lzma_strm_init().
With bad luck, lzma_code() could return LZMA_BUF_ERROR
when it shouldn't.
This has been here since the early days of liblzma.
It got triggered by the modifications made to the xz
tool in commit 18c10c30d2833f394cd7bce0e6a821044b15832f
but only when decompressing .lzma files. Somehow I managed
to miss testing that with Valgrind earlier.
This fixes <http://bugs.gentoo.org/show_bug.cgi?id=305591>.
Thanks to Rafał Mużyło for helping to debug it on IRC.
Lasse Collin [Sun, 7 Feb 2010 17:48:06 +0000 (19:48 +0200)]
Subtle change to liblzma Block handling API.
lzma_block.version has to be initialized even for
lzma_block_header_decode(). This way a future version
of liblzma won't allocate memory in a way that an old
application doesn't know how to free it.
The subtlety of this change is that all current apps
using lzma_block_header_decode() will keep working for
now, because the only possible version value is zero,
and lzma_block_header_decode() unconditionally sets the
version to zero even now. Unless fixed, these apps will
break in the future if a new version of the Block options
is ever needed.
Lasse Collin [Sun, 31 Jan 2010 21:28:51 +0000 (23:28 +0200)]
Revise the Windows build files.
The old Makefile + config.h was deleted, because it
becomes outdated too easily and building with the
Autotools based build system works fine even on Windows.
windows/build.sh hasn't got much testing, but it should
work to build 32-bit x86 and x86-64 versions of XZ Utils
using MSYS, MinGW or MinGW-w32, and MinGW-w64.
windows/INSTALL-Windows.txt describes what packages are
needed and how to install them.
windows/README-Windows.txt is a readme file for the binary
package that build.sh hopefully builds.
There are no instructions about using Autotools for now,
so those using a git snapshot may want to run
"autoreconf -fi && ./configure && make mydist" on a UN*X
box and then copy the resulting .tar.gz to a Windows.
Lasse Collin [Sun, 31 Jan 2010 10:53:56 +0000 (12:53 +0200)]
Don't use uninitialized sigset_t.
If signal handlers haven't been established, then it's
useless to try to block them, especially since the sigset_t
used for blocking hasn't been initialized yet.
Lasse Collin [Sun, 31 Jan 2010 10:01:54 +0000 (12:01 +0200)]
Delay opening the destionation file and other fixes.
The opening of the destination file is now delayed a little.
The coder is initialized, and if decompressing, the memory
usage of the first Block compared against the memory
usage limit before the destination file is opened. This
means that if --force was used, the old "target" file won't
be deleted so easily when something goes wrong very early.
Thanks to Mark K for the bug report.
The above fix required some changes to progress message
handling. Now there is a separate function for setting and
printing the filename. It is used also in list.c.
list_file() now handles stdin correctly (gives an error).
A useless check for user_abort was removed from file_io.c.
Lasse Collin [Sun, 17 Jan 2010 09:59:54 +0000 (11:59 +0200)]
Updated windows/Makefile.
Thanks to Dan Shechter for the patch.
It is likely that windows/Makefile will be removed
completely, because Autotols based build nowadays
works well with both 32-bit and 64-bit MinGW (I
just need to update the docs).