From: Lasse Collin Date: Sun, 19 Jul 2009 10:14:20 +0000 (+0300) Subject: Major documentation update. X-Git-Tag: v4.999.9beta~42 X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=99f9e879a6a8bb54a65da99c12e0f390216c152a;p=xz Major documentation update. Installation and packaging instructions were added. README and other generic docs were revised. Some of the documentation files are now installed to $docdir. --- diff --git a/AUTHORS b/AUTHORS index d7bf3a9f..63a9815b 100644 --- a/AUTHORS +++ b/AUTHORS @@ -2,17 +2,26 @@ Authors of XZ Utils =================== -Igor Pavlov - * designed LZMA as an algorithm; - * wrote an implementation known as LZMA SDK, which is part of - the bigger 7-Zip project. - -Ville Koskinen - * wrote the first version of the gzip-like lzma command line - utility (C++) - * helped a lot with the documentation. - -Lasse Collin - * ported LZMA SDK to C and zlib-like API (liblzma); - * rewrote the command line tool again to use liblzma and pthreads. + XZ Utils is developed and maintained by Lasse Collin + . + + Major parts of liblzma are based on code written by Igor Pavlov, + specifically the LZMA SDK . Without + this code, XZ Utils wouldn't exist. + + The SHA-256 implementation in liblzma is based on the code found from + 7-Zip , which has a modified version of the SHA-256 + code found from Crypto++ . The SHA-256 code + in Crypto++ was written by Kevin Springle and Wei Dai. + + Some scripts have been adapted from gzip. The original versions + were written by Jean-loup Gailly, Charles Levert, and Paul Eggert. + Andrew Dudman helped adapt the scripts and their man pages for + XZ Utils. + + The GNU Autotools based build system contains files from many authors, + which I'm not trying to list here. + + Several people have contributed fixes or reported bugs. Most of them + are mentioned in the file THANKS. 
diff --git a/ChangeLog b/ChangeLog index 8382de76..ff22a974 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,2 +1,7 @@ See the commit log in the git repository: -git://ctrl.tukaani.org/xz.git + + git://ctrl.tukaani.org/xz.git + +Note that "make dist" doesn't put this tiny file into the package. +Instead, the git commit log is used as ChangeLog. See dist-hook in +Makefile.am for details. diff --git a/INSTALL b/INSTALL new file mode 100644 index 00000000..b0970d17 --- /dev/null +++ b/INSTALL @@ -0,0 +1,327 @@ + +XZ Utils Installation +===================== + + 0. Preface + 1. Supported platforms + 1.1. Compilers + 1.2. Platform-specific notes + 1.2.1. Darwin (Mac OS X) + 1.2.2. Tru64 + 1.2.3. Windows + 1.2.4. DOS + 1.2.5. OS/2 + 1.3. Adding support for new platforms + 2. configure options + 3. xzgrep and other scripts + 3.1. Dependencies + 3.2. PATH + 4. Troubleshooting + 4.1. "No C99 compiler was found." + 4.1. "No POSIX conforming shell (sh) was found." + 4.2. configure works but build fails at crc32_x86.S + + +0. Preface +---------- + + If you aren't familiar with building packages that use GNU Autotools, + see the file INSTALL.generic for generic instructions before reading + further. + + If you are going to build a package for distribution, see also the + file PACKAGERS. It contains information that should help making the + binary packages as good as possible, but the information isn't very + interesting to those making local builds for private use or for use + in special situations like embedded systems. + + +1. Supported platforms +---------------------- + + XZ Utils are developed on GNU/Linux, but they should work on many + POSIX-like operating systems like *BSDs and Solaris, and even on + a few non-POSIX operating systems. + + +1.1. Compilers + + A C99 compiler is required to compile XZ Utils. If you use GCC, you + need at least version 3.x.x. 
GCC version 2.xx.x doesn't support some + C99 features used in XZ Utils source code, thus GCC 2 won't compile + XZ Utils. + + XZ Utils takes advantage of some GNU C extensions when building + with GCC. Because these extensions are used only when building + with GCC, it should be possible to use any C99 compiler. + + +1.2. Platform-specific notes + +1.2.1. Darwin (Mac OS X) + + You may need --disable-assembler if building universal binaries on + Darwin. This is because different files are built when assembler is + enabled, and there's no way to make it work with a universal build. + If you want to keep the assembler code, consider building one + architecture at a time, and then combining the results to create + universal binaries (see lipo(1)). + + +1.2.2. Tru64 + + If you try to use the native C compiler on Tru64 (passing CC=cc to + configure), it is possible that the configure script will complain + that no C99 compiler was found even when the native compiler supports + C99. You can safely override the test for C99 compiler by passing + ac_cv_prog_cc_c99= as the argument to the configure script. + + +1.2.3. Windows + + Building XZ Utils on Windows is supported under MinGW and Cygwin. + If the Autotools based build gives you trouble with MinGW, you may + want to try the alternative method found in the "windows" directory. + + MSVC doesn't support C99, thus it is not possible to use MSVC to + compile XZ Utils. However, it is possible to use liblzma.dll from + MSVC once liblzma.dll has been built with MinGW. The required + import library for MSVC can be created from liblzma.def using the + "lib" command shipped in MSVC: + + lib /def:liblzma.def /out:liblzma.lib /machine:ix86 + + On x86-64, the /machine argument naturally has to be changed: + + lib /def:liblzma.def /out:liblzma.lib /machine:x64 + + +1.2.4. DOS + + There is an experimental Makefile in the "dos" directory to build + XZ Utils on DOS using DJGPP. Support for long file names (LFN) is + needed. 
+ + GNU Autotools based build hasn't been tried on DOS. + + +1.2.5. OS/2 + + You will need to pass --disable-assembler to configure when building + on OS/2. + + +1.3. Adding support for new platforms + + If you have written patches to make XZ Utils work on a previously + unsupported platform, please send the patches to me! I will consider + including them in the official version. It's nice to minimize the + need for third-party patching. + + One exception: Don't request or send patches to change the whole + source package to C89. I find C99 substantially nicer to write and + maintain. However, the public library headers must be in C89 to + avoid frustrating those who maintain programs that are strictly + in C89 or C++. + + +2. configure options +-------------------- + + In most cases, the defaults are what you want. Most of the options + below are useful only when building a size-optimized version of + liblzma or command line tools. + + --enable-encoders=LIST + --disable-encoders + Specify a comma-separated LIST of filter encoders to + build. See "./configure --help" for the exact list of + available filter encoders. The default is to build all + supported encoders. + + If LIST is empty or --disable-encoders is used, no filter + encoders will be built and also the code shared between + encoders will be omitted. + + Disabling encoders will remove some symbols from the + liblzma ABI, so this option should be used only when it + is known to not cause problems. + + --enable-decoders=LIST + --disable-decoders + This is like --enable-encoders but for decoders. The + default is to build all supported decoders. + + --enable-match-finders=LIST + liblzma includes two categories of match finders: + hash chains and binary trees. Hash chains (hc3 and hc4) + are quite fast but they don't provide the best compression + ratio. Binary trees (bt2, bt3 and bt4) give excellent + compression ratio, but they are slower and need more + memory than hash chains. 
+ + You need to enable at least one match finder to build the + LZMA1 or LZMA2 filter encoders. Usually hash chains are + used only in the fast mode, while binary trees are used + when the best compression ratio is wanted. + + The default is to build all the match finders if LZMA1 + or LZMA2 filter encoders are being built. + + --enable-checks=LIST + liblzma supports multiple integrity checks. CRC32 is + mandatory, and cannot be omitted. See "./configure --help" + for the exact list of available integrity check types. + + liblzma and the command line tools can decompress files + which use an unsupported integrity check type, but naturally + the file integrity cannot be verified in that case. + + Disabling integrity checks may remove some symbols from + the liblzma ABI, so this option should be used only when + it is known to not cause problems. + + --disable-assembler + liblzma includes some assembler optimizations. Currently + there is only assembler code for CRC32 and CRC64 for + 32-bit x86. + + All the assembler code in liblzma is position-independent + code, which is suitable for use in shared libraries and + position-independent executables. So far only i386 + instructions are used, but the code is optimized for i686 + class CPUs. If you are compiling liblzma exclusively for + pre-i686 systems, you may want to disable the assembler + code. + + --enable-unaligned-access + Allow liblzma to use unaligned memory access for 16-bit + and 32-bit loads and stores. This should be enabled only + when the hardware supports this, i.e. when unaligned + access is fast. Some operating system kernels emulate + unaligned access, which is extremely slow. This option + shouldn't be used on systems that rely on such emulation. + + Unaligned access is enabled by default on x86, x86-64, + and big endian PowerPC. + + --enable-small + Reduce the size of liblzma by selecting smaller but + semantically equivalent versions of some functions, and + omitting precomputed lookup tables. 
This option tends to + make liblzma slightly slower. + + Note that while omitting the precomputed tables makes + liblzma smaller on disk, the tables are still needed at + run time, and need to be computed at startup. This also + means that the RAM holding the tables won't be shared + between applications linked against shared liblzma. + + --disable-threads + Disable threading support. This makes some things + thread-unsafe, meaning that if a multithreaded application + calls liblzma functions from more than one thread, + something bad may happen. + + Use this option if threading support causes you trouble, + or if you know that you will use liblzma only from + single-threaded applications and want to avoid dependency + on libpthread. + + --enable-dynamic + Link the command line tools against shared liblzma. The + default (and recommended way) is to link the command line + tools against static liblzma. + + This option is mostly useful for packagers, if distro + policy requires linking against shared libraries. See the + file PACKAGERS for more information about pros and cons + of this option. + + --enable-debug + This enables the assert() macro and possibly some other + run-time consistency checks. It makes the code slower, so + you normally don't want to have this enabled. + + --enable-werror + If building with GCC, make all compiler warnings errors + that abort the compilation. This may help catch bugs, + and should work on most systems. This has no effect on the + resulting binaries. + + +3. xzgrep and other scripts +--------------------------- + +3.1. Dependencies + + POSIX shell (sh) and a bunch of other standard POSIX tools are required + to run the scripts. The configure script tries to find a POSIX + compliant sh, but if it fails, you can force the shell by passing + gl_cv_posix_shell=/path/to/posix-sh as an argument to the configure + script. + + Some of the scripts also require mktemp. The original mktemp can be + found from . 
On GNU, most will use the mktemp + program from GNU coreutils instead of the original implementation. + Both mktemp versions are fine for XZ Utils (and practically for + everything else too). + + +3.2. PATH + + The scripts assume that the required tools (standard POSIX utilities, + mktemp, and xz) are in PATH; the scripts don't set the PATH themselves. + Some people like this while some think this is a bug. Those in the + latter group can easily patch the scripts before running the configure + script by taking advantage of a placeholder line in the scripts. + + For example, to make the scripts prefix /usr/bin:/bin to PATH: + + perl -pi -e 's|^#SET_PATH.*$|PATH=/usr/bin:/bin:\$PATH|' \ + src/scripts/xz*.in + + +4. Troubleshooting +------------------ + +4.1. "No C99 compiler was found." + + You need a C99 compiler to build XZ Utils. If the configure script + cannot find a C99 compiler and you think you have such a compiler + installed, set the compiler command by passing CC=/path/to/c99 as + an argument to the configure script. + + If you get this error even when you think your compiler supports C99, + you can override the test by passing ac_cv_prog_cc_c99= as an argument + to the configure script. The test for C99 compiler is not perfect (and + it is not as easy to make it perfect as it sounds), so sometimes this + may be needed. You will get a compile error if your compiler doesn't + support enough C99. + + +4.1. "No POSIX conforming shell (sh) was found." + + xzgrep and other scripts need a shell that (roughly) conforms + to POSIX. The configure script tries to find such a shell. If + it fails, you can force the shell to be used by passing + gl_cv_posix_shell=/path/to/posix-sh as an argument to the configure + script. + + +4.2. configure works but build fails at crc32_x86.S + + The easy fix is to pass --disable-assembler to the configure script. 
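As a rough illustration of the easy fix, the sketch below re-runs configure with the assembler code disabled and then rebuilds. The function name rebuild_without_asm and the RUN variable are hypothetical helpers, not part of XZ Utils. By default the commands are only printed, so nothing is executed unless RUN is set to an empty value.

```shell
# Dry-run sketch: RUN defaults to "echo", so commands are printed, not run.
# Invoke with RUN= (empty) in the environment to execute them for real.
RUN=${RUN-echo}

# Hypothetical helper, not shipped with XZ Utils.
rebuild_without_asm() {
    # Reconfigure with the assembler code disabled, then rebuild.
    # Any extra arguments are passed through to configure unchanged.
    $RUN ./configure --disable-assembler "$@" && $RUN make
}

rebuild_without_asm
```

Extra configure arguments, such as --prefix, can be passed to the function and are forwarded as given.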
+ + The configure script determines if assembler code can be used by + looking at the configure triplet; there is currently no check if + the assembler code can actually be built. The x86 assembler + code should work on x86 GNU/Linux, *BSDs, Solaris, Darwin, MinGW, + Cygwin, and DJGPP. On other x86 systems, there may be problems and + the assembler code may need to be disabled with the configure option. + + If you get this error when building for x86-64, either you have specified + the wrong architecture or the configure script has misguessed it. Pass the + correct configure triplet using the --build=CPU-COMPANY-SYSTEM option + (see INSTALL.generic). + diff --git a/INSTALL.generic b/INSTALL.generic new file mode 100644 index 00000000..2550dab7 --- /dev/null +++ b/INSTALL.generic @@ -0,0 +1,302 @@ +Installation Instructions +************************* + +Copyright (C) 1994, 1995, 1996, 1999, 2000, 2001, 2002, 2004, 2005, +2006, 2007, 2008, 2009 Free Software Foundation, Inc. + + This file is free documentation; the Free Software Foundation gives +unlimited permission to copy, distribute and modify it. + +Basic Installation +================== + + Briefly, the shell commands `./configure; make; make install' should +configure, build, and install this package. The following +more-detailed instructions are generic; see the `README' file for +instructions specific to this package. + + The `configure' shell script attempts to guess correct values for +various system-dependent variables used during compilation. It uses +those values to create a `Makefile' in each directory of the package. +It may also create one or more `.h' files containing system-dependent +definitions. Finally, it creates a shell script `config.status' that +you can run in the future to recreate the current configuration, and a +file `config.log' containing compiler output (useful mainly for +debugging `configure'). 
+ + It can also use an optional file (typically called `config.cache' +and enabled with `--cache-file=config.cache' or simply `-C') that saves +the results of its tests to speed up reconfiguring. Caching is +disabled by default to prevent problems with accidental use of stale +cache files. + + If you need to do unusual things to compile the package, please try +to figure out how `configure' could check whether to do them, and mail +diffs or instructions to the address given in the `README' so they can +be considered for the next release. If you are using the cache, and at +some point `config.cache' contains results you don't want to keep, you +may remove or edit it. + + The file `configure.ac' (or `configure.in') is used to create +`configure' by a program called `autoconf'. You need `configure.ac' if +you want to change it or regenerate `configure' using a newer version +of `autoconf'. + +The simplest way to compile this package is: + + 1. `cd' to the directory containing the package's source code and type + `./configure' to configure the package for your system. + + Running `configure' might take a while. While running, it prints + some messages telling which features it is checking for. + + 2. Type `make' to compile the package. + + 3. Optionally, type `make check' to run any self-tests that come with + the package. + + 4. Type `make install' to install the programs and any data files and + documentation. + + 5. You can remove the program binaries and object files from the + source code directory by typing `make clean'. To also remove the + files that `configure' created (so you can compile the package for + a different kind of computer), type `make distclean'. There is + also a `make maintainer-clean' target, but that is intended mainly + for the package's developers. If you use it, you may have to get + all sorts of other programs in order to regenerate files that came + with the distribution. + + 6. 
Often, you can also type `make uninstall' to remove the installed + files again. + +Compilers and Options +===================== + + Some systems require unusual options for compilation or linking that +the `configure' script does not know about. Run `./configure --help' +for details on some of the pertinent environment variables. + + You can give `configure' initial values for configuration parameters +by setting variables in the command line or in the environment. Here +is an example: + + ./configure CC=c99 CFLAGS=-g LIBS=-lposix + + *Note Defining Variables::, for more details. + +Compiling For Multiple Architectures +==================================== + + You can compile the package for more than one kind of computer at the +same time, by placing the object files for each architecture in their +own directory. To do this, you can use GNU `make'. `cd' to the +directory where you want the object files and executables to go and run +the `configure' script. `configure' automatically checks for the +source code in the directory that `configure' is in and in `..'. + + With a non-GNU `make', it is safer to compile the package for one +architecture at a time in the source code directory. After you have +installed the package for one architecture, use `make distclean' before +reconfiguring for another architecture. + + On MacOS X 10.5 and later systems, you can create libraries and +executables that work on multiple system types--known as "fat" or +"universal" binaries--by specifying multiple `-arch' options to the +compiler but only a single `-arch' option to the preprocessor. Like +this: + + ./configure CC="gcc -arch i386 -arch x86_64 -arch ppc -arch ppc64" \ + CXX="g++ -arch i386 -arch x86_64 -arch ppc -arch ppc64" \ + CPP="gcc -E" CXXCPP="g++ -E" + + This is not guaranteed to produce working output in all cases, you +may have to build one architecture at a time and combine the results +using the `lipo' tool if you have problems. 
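The fallback of building one architecture at a time and combining the results could look roughly like the sketch below. The function name combine_universal, the RUN variable, and the build-i386/ and build-x86_64/ paths are assumptions made for illustration; the real file names depend on how each per-architecture build was configured. The lipo command is only printed unless RUN is set to an empty value.

```shell
# Dry-run sketch: RUN defaults to "echo", so the command is printed, not run.
RUN=${RUN-echo}

# Hypothetical helper: merge per-architecture builds of one file.
combine_universal() {
    # $1 = output file, remaining arguments = single-architecture inputs
    out=$1
    shift
    $RUN lipo -create "$@" -output "$out"
}

# Example paths are made up; adjust them to the actual build directories.
combine_universal liblzma.dylib \
    build-i386/liblzma.dylib build-x86_64/liblzma.dylib
```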
+ +Installation Names +================== + + By default, `make install' installs the package's commands under +`/usr/local/bin', include files under `/usr/local/include', etc. You +can specify an installation prefix other than `/usr/local' by giving +`configure' the option `--prefix=PREFIX'. + + You can specify separate installation prefixes for +architecture-specific files and architecture-independent files. If you +pass the option `--exec-prefix=PREFIX' to `configure', the package uses +PREFIX as the prefix for installing programs and libraries. +Documentation and other data files still use the regular prefix. + + In addition, if you use an unusual directory layout you can give +options like `--bindir=DIR' to specify different values for particular +kinds of files. Run `configure --help' for a list of the directories +you can set and what kinds of files go in them. + + If the package supports it, you can cause programs to be installed +with an extra prefix or suffix on their names by giving `configure' the +option `--program-prefix=PREFIX' or `--program-suffix=SUFFIX'. + +Optional Features +================= + + Some packages pay attention to `--enable-FEATURE' options to +`configure', where FEATURE indicates an optional part of the package. +They may also pay attention to `--with-PACKAGE' options, where PACKAGE +is something like `gnu-as' or `x' (for the X Window System). The +`README' should mention any `--enable-' and `--with-' options that the +package recognizes. + + For packages that use the X Window System, `configure' can usually +find the X include and library files automatically, but if it doesn't, +you can use the `configure' options `--x-includes=DIR' and +`--x-libraries=DIR' to specify their locations. + +Particular systems +================== + + On HP-UX, the default C compiler is not ANSI C compatible. 
If GNU +CC is not installed, it is recommended to use the following options in +order to use an ANSI C compiler: + + ./configure CC="cc -Ae -D_XOPEN_SOURCE=500" + +and if that doesn't work, install pre-built binaries of GCC for HP-UX. + + On OSF/1 a.k.a. Tru64, some versions of the default C compiler cannot +parse its `' header file. The option `-nodtk' can be used as +a workaround. If GNU CC is not installed, it is therefore recommended +to try + + ./configure CC="cc" + +and if that doesn't work, try + + ./configure CC="cc -nodtk" + + On Solaris, don't put `/usr/ucb' early in your `PATH'. This +directory contains several dysfunctional programs; working variants of +these programs are available in `/usr/bin'. So, if you need `/usr/ucb' +in your `PATH', put it _after_ `/usr/bin'. + + On Haiku, software installed for all users goes in `/boot/common', +not `/usr/local'. It is recommended to use the following options: + + ./configure --prefix=/boot/common + +Specifying the System Type +========================== + + There may be some features `configure' cannot figure out +automatically, but needs to determine by the type of machine the package +will run on. Usually, assuming the package is built to be run on the +_same_ architectures, `configure' can figure that out, but if it prints +a message saying it cannot guess the machine type, give it the +`--build=TYPE' option. TYPE can either be a short name for the system +type, such as `sun4', or a canonical name which has the form: + + CPU-COMPANY-SYSTEM + +where SYSTEM can have one of these forms: + + OS + KERNEL-OS + + See the file `config.sub' for the possible values of each field. If +`config.sub' isn't included in this package, then this package doesn't +need to know the machine type. + + If you are _building_ compiler tools for cross-compiling, you should +use the option `--target=TYPE' to select the type of system they will +produce code for. 
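To make the canonical name format concrete, here is a minimal shell sketch that splits a CPU-COMPANY-SYSTEM string into its fields. The function name split_triplet is hypothetical, and the sketch assumes a full three-part name; `config.sub' remains the authority on which values are valid.

```shell
# Hypothetical helper: split a canonical system name into its fields.
# SYSTEM is everything after the second dash, so it may be OS or KERNEL-OS.
split_triplet() {
    name=$1
    cpu=${name%%-*}        # text before the first dash
    rest=${name#*-}        # drop the leading "CPU-"
    company=${rest%%-*}    # text before the next dash
    system=${rest#*-}      # remainder: OS or KERNEL-OS
    printf 'cpu=%s company=%s system=%s\n' "$cpu" "$company" "$system"
}

split_triplet i686-pc-linux-gnu
```

Here the SYSTEM field comes out as linux-gnu, which is the KERNEL-OS form.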
+ + If you want to _use_ a cross compiler, that generates code for a +platform different from the build platform, you should specify the +"host" platform (i.e., that on which the generated programs will +eventually be run) with `--host=TYPE'. + +Sharing Defaults +================ + + If you want to set default values for `configure' scripts to share, +you can create a site shell script called `config.site' that gives +default values for variables like `CC', `cache_file', and `prefix'. +`configure' looks for `PREFIX/share/config.site' if it exists, then +`PREFIX/etc/config.site' if it exists. Or, you can set the +`CONFIG_SITE' environment variable to the location of the site script. +A warning: not all `configure' scripts look for a site script. + +Defining Variables +================== + + Variables not defined in a site shell script can be set in the +environment passed to `configure'. However, some packages may run +configure again during the build, and the customized values of these +variables may be lost. In order to avoid this problem, you should set +them in the `configure' command line, using `VAR=value'. For example: + + ./configure CC=/usr/local2/bin/gcc + +causes the specified `gcc' to be used as the C compiler (unless it is +overridden in the site shell script). + +Unfortunately, this technique does not work for `CONFIG_SHELL' due to +an Autoconf bug. Until the bug is fixed you can use this workaround: + + CONFIG_SHELL=/bin/bash /bin/bash ./configure CONFIG_SHELL=/bin/bash + +`configure' Invocation +====================== + + `configure' recognizes the following options to control how it +operates. + +`--help' +`-h' + Print a summary of all of the options to `configure', and exit. + +`--help=short' +`--help=recursive' + Print a summary of the options unique to this package's + `configure', and exit. The `short' variant lists options used + only in the top level, while the `recursive' variant lists options + also present in any nested packages. 
+ +`--version' +`-V' + Print the version of Autoconf used to generate the `configure' + script, and exit. + +`--cache-file=FILE' + Enable the cache: use and save the results of the tests in FILE, + traditionally `config.cache'. FILE defaults to `/dev/null' to + disable caching. + +`--config-cache' +`-C' + Alias for `--cache-file=config.cache'. + +`--quiet' +`--silent' +`-q' + Do not print messages saying which checks are being made. To + suppress all normal output, redirect it to `/dev/null' (any error + messages will still be shown). + +`--srcdir=DIR' + Look for the package's source code in directory DIR. Usually + `configure' can determine that directory automatically. + +`--prefix=DIR' + Use DIR as the installation prefix. *Note Installation Names:: + for more details, including other options available for fine-tuning + the installation locations. + +`--no-create' +`-n' + Run the configure checks, but stop before creating any output + files. + +`configure' also accepts some other, not widely useful, options. Run +`configure --help' for more details. + diff --git a/Makefile.am b/Makefile.am index 6543361f..b04a096c 100644 --- a/Makefile.am +++ b/Makefile.am @@ -14,6 +14,17 @@ endif SUBDIRS += src po tests +doc_DATA = \ + AUTHORS \ + COPYING \ + COPYING.GPLv2 \ + NEWS \ + README \ + THANKS \ + TODO \ + doc/xz-file-format.txt \ + doc/lzma-file-format.txt + EXTRA_DIST = \ version.sh \ Doxyfile.in \ diff --git a/PACKAGERS b/PACKAGERS new file mode 100644 index 00000000..da5158ce --- /dev/null +++ b/PACKAGERS @@ -0,0 +1,278 @@ + +Information to packagers of XZ Utils +==================================== + + 0. Preface + 1. Package naming + 2. Package description + 3. License + 4. configure options + 4.1. Static vs. dynamic linking of liblzma + 4.2. Optimizing xzdec and lzmadec + 5. Additional documentation + 6. Extra files + 7. Installing XZ Utils and LZMA Utils in parallel + 8. Example + + +0. 
Preface +---------- + + This document is meant for people who create and maintain XZ Utils + packages for operating system distributions. The focus is on GNU/Linux + systems, but most things apply to other systems too. + + While the standard "configure && make DESTDIR=$PKG install" should + give a pretty good package, there are some details which packagers + may want to tweak. + + Packagers should also read the INSTALL file. + + +1. Package naming +----------------- + + The preferred name for the XZ Utils package is "xz", because that's + the name of the upstream tarball. Naturally you may have good reasons + to use some other name; I won't get angry about it. ;-) It's just nice + to be able to point people to the correct package name without asking + what distro they have. + + If your distro policy is to split things into small pieces, here is + one suggestion: + + xz xz, xzdec, scripts (xzdiff, xzgrep, etc.), docs + xz-lzma lzma, unlzma, lzcat, lzgrep etc. symlinks and + lzmadec binary for compatibility with LZMA Utils + liblzma liblzma.so.* + liblzma-devel liblzma.so, liblzma.a, API headers + + +2. Package description +---------------------- + + Here is a suggestion which you may use as the package description. + If you can use only one-line description, pick only the first line. + Naturally, feel free to use some other description if you find it + better, and maybe send it to me too. + + Library and command line tools for XZ and LZMA compressed files + + XZ Utils provide a general purpose data compression library + and command line tools. The native file format is the .xz + format, but also the legacy .lzma format is supported. The .xz + format supports multiple compression algorithms, of which LZMA2 + is currently the primary algorithm. With typical files, XZ Utils + create about 30 % smaller files than gzip. 
+ + If you are splitting XZ Utils into multiple packages, here are some + suggestions for package descriptions: + + xz: + + Command line tools for XZ and LZMA compressed files + + This package includes the xz compression tool and other command + line tools from XZ Utils. xz has command line syntax similar to + that of gzip. The native file format is the .xz format, but also + the legacy .lzma format is supported. The .xz format supports + multiple compression algorithms, of which LZMA2 is currently the + primary algorithm. With typical files, XZ Utils create about 30 % + smaller files than gzip. + + Note that this package doesn't include the files needed for + LZMA Utils 4.32.x compatibility. Install also the xz-lzma + package to make XZ Utils emulate LZMA Utils 4.32.x. + + xz-lzma: + + LZMA Utils emulation with XZ Utils + + This package includes executables and symlinks to make + XZ Utils emulate lzma, unlzma, lzcat, and other command + line tools found from the legacy LZMA Utils 4.32.x package. + + liblzma: + + Library for XZ and LZMA compressed files + + liblzma is a general purpose data compression library with + an API similar to that of zlib. liblzma supports multiple + algorithms, of which LZMA2 is currently the primary algorithm. + The native file format is .xz, but also the legacy .lzma + format and raw streams (no headers at all) are supported. + + This package includes the shared library. + + liblzma-devel: + + Library for XZ and LZMA compressed files + + This package includes the API headers, static library, and + other development files related to liblzma. + + +3. License +---------- + + If the package manager supports a license field, you probably should + put GPLv2+ there (GNU GPL v2 or later). The interesting parts of + XZ Utils are in the public domain, but some less important files + ending up into the binary package are under GPLv2+. So it is simplest + to just say GPLv2+ if you cannot specify "public domain and GPLv2+". 
+ + If you split XZ Utils into multiple packages as described earlier + in this file, the liblzma and liblzma-devel packages will contain only + public domain code (from XZ Utils at least; compiler or linker may + add some third-party code, which may be copyrighted). + + +4. configure options +-------------------- + + Unless you are building a package for a distribution that is meant + only for embedded systems, don't use the following configure options: + + --enable-debug + --enable-encoders (*) + --enable-decoders + --enable-match-finders + --enable-checks + --enable-small (*) + --disable-threads (*) + + (*) These are OK when building xzdec and lzmadec as explained later. + + You may use --enable-werror but be careful with it since it may break + the build due to some useless warning when the build environment + changes (like CPU architecture or compiler version). + + +4.1. Static vs. dynamic linking of liblzma + + The default is to link the command line tools against static liblzma. + This can be changed by passing --enable-dynamic to configure, or by + not building static libraries at all by passing --disable-static to + configure. It is mildly recommended that you use the default and link + the command line tools against static liblzma, but the configure + options make it easy to do otherwise if the distro policy so requires. + + On 32-bit x86, linking against static liblzma can give a minor + speed improvement. Static libraries on x86 are usually compiled as + position-dependent code (non-PIC) and shared libraries are built as + position-independent code (PIC). PIC wastes one register, which can + make the code slightly slower compared to a non-PIC version. (Note + that this doesn't apply to x86-64.) + + Linking against static liblzma avoids a dependency on the liblzma shared + library, and makes it slightly easier to copy the command line tools + between systems (e.g. quick 'n' dirty emergency recovery of some + files). 
It also allows putting the command line tools in /bin while + leaving liblzma in /usr/lib (assuming that your distribution uses + such a file system hierarchy), if no other file in /bin would require + liblzma. + + If you don't want to distribute static libraries but you still + want to link the command line tools against static liblzma, it is + probably easiest to build both static and shared liblzma, but after + "make DESTDIR=$PKG install" remove liblzma.a and modify liblzma.la + to not contain a reference to liblzma.a. + + +4.2. Optimizing xzdec and lzmadec + + xzdec and lzmadec are intended to be relatively small rather than + optimized for the best speed. Thus, it is a good idea to build + xzdec and lzmadec separately: + + - Only decoder code is needed, so you can speed up the build + slightly by passing --disable-encoders to configure. This + shouldn't affect the final size of the executables though, + because the linker is able to omit the encoder code anyway. + + - xzdec and lzmadec will never use the multithreading capabilities + of liblzma. You can avoid a dependency on libpthread by passing + --disable-threads to configure. + + - There are and will be no translated messages for xzdec and + lzmadec, so it is fine to also pass --disable-nls to configure. + + - To select a somewhat size-optimized variant of some things in + liblzma, pass --enable-small to configure. + + - Tell the compiler to optimize for size instead of speed. + E.g. with GCC, put -Os into CFLAGS. + + +5. Additional documentation +--------------------------- + + "make install" copies some additional documentation to $docdir + (--docdir in configure). These include a copy of the GNU GPL v2, which + can be replaced with a symlink if your distro ships with shared copies + of the common license texts. + + +6. Extra files +-------------- + + The "extra" directory contains some small extra tools or other files. + The exact set of extra files can vary between XZ Utils releases.
The + extra files have only limited use or they are too dangerous to be + put directly to $bindir (7z2lzma.sh is a good example, since it can + silently create corrupt output if certain conditions are not met). + + If you feel like it, you may copy the extra directory under the doc + directory (e.g. /usr/share/doc/xz/extra). Maybe some people will find + them useful. However, most people needing these tools probably are + able to find them from the source package too. + + The "debug" directory contains some tools that are useful only when + hacking on XZ Utils. Don't package these tools. + + +7. Installing XZ Utils and LZMA Utils in parallel +------------------------------------------------- + + XZ Utils and LZMA Utils 4.32.x can be installed in parallel by + omitting the compatibility symlinks (lzma, unlzma, lzcat, lzgrep etc.) + from the XZ Utils package. It's probably a good idea to still package + the symlinks into a separate package so that users may choose if they + want to use XZ Utils or LZMA Utils for handling .lzma files. + + +8. Example +---------- + + Here is an example for i686 GNU/Linux that + - links xz against static liblzma; + - includes only shared liblzma in the final package; + - links xzdec and lzmadec against static liblzma while + avoiding libpthread dependency. 
+ + PKG=/tmp/xz-pkg + tar xf xz-x.y.z.tar.gz + cd xz-x.y.z + ./configure \ + --prefix=/usr \ + --sysconfdir=/etc \ + CFLAGS='-march=i686 -O2' + make + make DESTDIR=$PKG install-strip + rm -f $PKG/usr/lib/lib*.a + sed -i "s/^old_library=.*$/old_library=''/" $PKG/usr/lib/lib*.la + make clean + ./configure \ + --prefix=/usr \ + --sysconfdir=/etc \ + --disable-shared \ + --disable-nls \ + --disable-encoders \ + --enable-small \ + --disable-threads \ + CFLAGS='-march=i686 -Os' + make -C src/liblzma + make -C src/xzdec + make -C src/xzdec DESTDIR=$PKG install-strip + cp -a extra $PKG/usr/share/doc/xz + diff --git a/README b/README index 24467cd0..0c25e722 100644 --- a/README +++ b/README @@ -2,89 +2,121 @@ XZ Utils ======== -Important + 0. Overview + 1. Documentation + 1.1. Overall documentation + 1.2. Documentation for command line tools + 1.3. Documentation for liblzma + 2. Version numbering + 3. Other implementations of the .xz format + 4. Contact information + - This is a beta version. The .xz file format is now stable though, - which means that files created with the beta version will be - decompressible with all future XZ Utils versions too (assuming - that there are no catastrophic bugs). +0. Overview +----------- - liblzma API is pretty stable now, although minor tweaks may still - be done if really needed. The ABI is not stable yet. The major - soname will be bumped right before the first stable release. - Probably it will be bumped to something like .so.5.0.0 because - some distributions using the alpha versions already had to use - other versions than .so.0.0.0. + XZ Utils provide a general purpose data compression library and + command line tools. The native file format is the .xz format, but + also the legacy .lzma format is supported. The .xz format supports + multiple compression algorithms, which are called "filters" in + the context of XZ Utils. The primary filter is currently LZMA2. With + typical files, XZ Utils create about 30 % smaller files than gzip.
- Excluding the Doxygen style docs in liblzma API headers, the - documentation in this package (including the rest of this - README) is not very up to date, and may contain incorrect or - misleading information. + To ease adding support for the .xz format to existing applications + and scripts, the API of liblzma is somewhat similar to the API of the + popular zlib library. For the same reason, the command line tool xz + has a command line syntax similar to that of gzip. + When aiming for the highest compression ratio, the LZMA2 encoder uses + a lot of CPU time and may use, depending on the settings, even + hundreds of megabytes of RAM. However, in fast modes, the LZMA2 encoder + competes with bzip2 in compression speed, RAM usage, and compression + ratio. -Overview + LZMA2 is reasonably fast to decompress. It is a little slower than + gzip, but a lot faster than bzip2. Being fast to decompress means + that the .xz format is especially nice when the same file will be + decompressed very many times (usually on different computers), which + is the case e.g. when distributing software packages. In such + situations, it's not too bad if the compression takes some time, + since that needs to be done only once to benefit many people. - LZMA is a general purpose compression algorithm designed by - Igor Pavlov as part of 7-Zip. It provides high compression ratio - while keeping the decompression speed fast. + With some file types, combining (or "chaining") LZMA2 with an + additional filter can improve the compression ratio. A filter chain may + contain up to four filters, although usually only one or two are used. + For example, putting a BCJ (Branch/Call/Jump) filter before LZMA2 + in the filter chain can improve the compression ratio of executable files. - XZ Utils are an attempt to make LZMA compression easy to use - on free (as in freedom) operating systems.
This is achieved by - providing tools and libraries which are similar to use than the - equivalents of the most popular existing compression algorithms. + Since the .xz format allows adding new filter IDs, it is possible that + some day there will be a filter that is, for example, much faster to + compress than LZMA2 (but probably with worse compression ratio). + Similarly, it is possible that some day there is a filter that will + compress better than LZMA2. - XZ Utils consist of a few relatively separate parts: - * liblzma is an encoder/decoder library with support for several - filters (algorithm implementations). The primary filter is LZMA. - * libzfile (or whatever the name will be) enables reading from and - writing to gzip, bzip2 and LZMA compressed and uncompressed files - with an API similar to the standard ANSI-C file I/O. - [ NOTE: libzfile is not implemented yet. ] - * xz command line tool has almost identical syntax than gzip - and bzip2. It makes LZMA easy for average users, but also - provides advanced options to finetune the compression settings. - * A few shell scripts make diffing and grepping LZMA compressed - files easy. The scripts were adapted from gzip and bzip2. + XZ Utils doesn't support multithreaded compression or decompression + yet. It has been planned though and taken into account when designing + the .xz file format. -Supported platforms +1. Documentation +---------------- - XZ Utils are developed on GNU+Linux, but they should work at - least on *BSDs and Solaris. They probably work on some other - POSIX-like operating systems too. +1.1. Overall documentation - If you use GCC to compile XZ Utils, you need at least version - 3.x.x. GCC version 2.xx.x doesn't support some C99 features used - in XZ Utils source code, thus GCC 2 won't compile XZ Utils. + README This file - If you have written patches to make XZ Utils to work on previously - unsupported platform, please send the patches to me! 
I will consider - including them to the official version. It's nice to minimize the - need of third-party patching. + INSTALL.generic Generic install instructions for those not familiar + with packages using GNU Autotools + INSTALL Installation instructions specific to XZ Utils + PACKAGERS Information to packagers of XZ Utils - One exception: Don't request or send patches to change the whole - source package to C89. I find C99 substantially nicer to write and - maintain. However, the public library headers must be in C89 to - avoid frustrating those who maintain programs, which are strictly - in C89 or C++. + COPYING XZ Utils copyright and license information + COPYING.GPLv2 GNU General Public License version 2 + COPYING.GPLv3 GNU General Public License version 3 + COPYING.LGPLv2.1 GNU Lesser General Public License version 2.1 + AUTHORS The main authors of XZ Utils + THANKS Incomplete list of people who have helped making + this software + NEWS User-visible changes between XZ Utils releases + ChangeLog Detailed list of changes (commit log) -Platform-specific notes + Note that only some of the above files are included in binary + packages. - On some Tru64 systems using the native C99 compiler, the configure - script may reject the compiler as non-C99 compiler. This may happen - if there is no stdbool.h available. You can still compile XZ Utils - on such a system by passing ac_cv_prog_cc_c99= to configure script. - Fixing this bug seems to be non-trivial since if the configure - doesn't check for stdbool.h, it runs into problems at least on - Solaris. +1.2. Documentation for command line tools -Version numbering + The command line tools are documented as man pages. In source code + releases (and possibly also in some binary packages), the man pages + are also provided in plain text (ASCII only) and PDF formats in the + directory "doc/man" to make the man pages more accessible to those + whose operating system doesn't provide an easy way to view man pages. 
- The version number of XZ Utils has absolutely nothing to do with - the version number of LZMA SDK or 7-Zip. The new version number - format of XZ Utils is X.Y.ZS: + +1.3. Documentation for liblzma + + The liblzma API headers include short docs about each function + and data type as Doxygen tags. These docs should be quite OK as + a quick reference. + + I have planned to write a bunch of very well documented example + programs, which (due to comments) should work as a tutorial to + various features of liblzma. No such example programs have been + written yet. + + For now, if you have never used liblzma, libbzip2, or zlib, I + recommend learning *basics* of zlib API. Once you know that, it + should be easier to learn liblzma. + + http://zlib.net/manual.html + http://zlib.net/zlib_how.html + + +2. Version numbering +-------------------- + + The version number format of XZ Utils is X.Y.ZS: - X is the major version. When this is incremented, the library API and ABI break. @@ -109,97 +141,32 @@ Version numbering the next version can be X.Y.(Z+1)beta but not X.Y.Zbeta. -configure options - - If you are not familiar with `configure' scripts, read the file - INSTALL first. - - In most cases, the default --enable/--disable/--with/--without options - are what you want. Don't touch them if you are unsure. - - --disable-encoder - Do not compile the encoder component of liblzma. This - implies --disable-match-finders. If you need only - the decoder, you can decrease the library size - dramatically with this option. - - The default is to build the encoder. - - --disable-decoder - Do not compile the decoder component of liblzma. - - The default is to build the decoder. - - --enable-filters= - liblzma supports several filters. See liblzma-intro.txt - for a little more information about these. - - The default is to build all the filters. - - --enable-match-finders= - liblzma includes two categories of match finders: - hash chains and binary trees. 
Hash chains (hc3 and hc4) - are quite fast but they don't provide the best compression - ratio. Binary trees (bt2, bt3 and bt4) give excellent - compression ratio, but they are slower and need more - memory than hash chains. - - You need to enable at least one match finder to build the - LZMA filter encoder. Usually hash chains are used only in - the fast mode, while binary trees are used to when the best - compression ratio is wanted. - - The default is to build all the match finders. - - --enable-checks= - liblzma support multiple integrity checks. CRC32 is - mandatory, and cannot be omitted. See liblzma-intro.txt - for more information about usage of the integrity checks. - - --disable-assembler - liblzma includes some assembler optimizations. Currently - there is only assembler code for CRC32 and CRC64 for - 32-bit x86. - - All the assembler code in liblzma is position-independent - code, which is suitable for use in shared libraries and - position-independent executables. So far only i386 - instructions are used, but the code is optimized for i686 - class CPUs. If you are compiling liblzma exclusively for - pre-i686 systems, you may want to disable the assembler - code. - - --enable-small - Omits precomputed tables. This makes liblzma a few KiB - smaller. Startup time increases, because the tables need - to be computed first. - - --enable-debug - This enables the assert() macro and possibly some other - run-time consistency checks. It slows down things somewhat, - so you normally don't want to have this enabled. +3. Other implementations of the .xz format +------------------------------------------ - --enable-werror - Makes all compiler warnings an error, that abort the - compilation. This may help catching bugs, and should work - on most systems. This has no effect on the resulting - binaries. + 7-Zip and the p7zip port of 7-Zip support the .xz format starting + from the version 9.00alpha. + http://7-zip.org/ + http://p7zip.sourceforge.net/ -Static vs. 
dynamic linking of the command line tools + XZ Embedded is a limited implementation written for use in the Linux + kernel, but it is also suitable for other embedded use. - By default, the command line tools are linked statically against - liblzma. There a are a few reasons: + http://tukaani.org/xz-embedded/ - - The executable(s) can be in /bin while the shared liblzma can still - be in /usr/lib (if the distro uses such file system hierarchy). - - It's easier to copy the executables to other systems, since they - depend only on libc. +4. Contact information +---------------------- - - It's slightly faster on some architectures like x86. + If you have questions, bug reports, patches etc. related to XZ Utils, + contact Lasse Collin . tukaani.org uses + greylisting to reduce spam, thus when you send your first email, it + may get delayed by a few hours. In addition to that, I'm sometimes + slow at replying. If you haven't got a reply within two weeks, assume + that your email has got lost and resend it or use IRC. - If you don't like this, you can get the command line tools linked - against the shared liblzma by specifying --disable-static to configure. - This disables building static liblzma completely. + You can find me also from #tukaani on Freenode; my nick is Larhzu. + The channel tends to be pretty quiet, so just ask your question and + someone may wake up. diff --git a/THANKS b/THANKS index e66c4a51..e038ea3d 100644 --- a/THANKS +++ b/THANKS @@ -1,11 +1,11 @@ Thanks ------- +====== -Some people have helped more, some less, some don't even know they have -been helpful, but nevertheless everyone's help has been important. :-) -In alphabetical order: +Some people have helped more, some less, but nevertheless everyone's help +has been important. :-) In alphabetical order: - Mark Adler + - H. Peter Anvin - Nelson H. F. Beebe - Anders F. 
Björklund - Emmanuel Blot @@ -13,7 +13,6 @@ In alphabetical order: - Andrew Dudman - İsmail Dönmez - Mike Frysinger - - Jean-loup Gailly - Per Øyvind Karlsen - Ville Koskinen - Stephan Kulow @@ -26,7 +25,6 @@ In alphabetical order: - Bernhard Reutner-Fischer - Alexandre Sauvé - Andreas Schwab - - Julian Seward - Dan Shechter - Paul Townsend - Mohammed Adnène Trojette @@ -34,8 +32,11 @@ In alphabetical order: - Bert Wesarg - Ralf Wildenhues - Charles Wilson + - Lars Wirzenius - Andreas Zieringer -Also thanks to all the people who have participated the Tukaani project -and others who I have forgot. +Also thanks to all the people who have participated in the Tukaani project. + +I have probably forgotten to add some names to the above list. Sorry about +that and thanks for your help.
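The sed step in the PACKAGERS example (section 8), which empties the old_library field of a libtool .la file after liblzma.a has been removed, can be tried in isolation. The following is only a minimal sketch under stated assumptions: the temporary directory and the three-line liblzma.la contents are hypothetical stand-ins for $PKG/usr/lib and a real libtool-generated file, and the substitution is written through a temporary file rather than with "sed -i" so that it does not depend on GNU sed.

```shell
#!/bin/sh
# Sketch: blank out the static-library reference in a libtool .la file,
# following the sed substitution shown in the PACKAGERS example.
set -eu

# Hypothetical staging directory standing in for $PKG/usr/lib.
workdir=$(mktemp -d)
la_file="$workdir/liblzma.la"

# Illustrative .la contents; a real libtool-generated file has more fields.
cat > "$la_file" <<'EOF'
dlname='liblzma.so.0'
library_names='liblzma.so.0.0.0 liblzma.so.0 liblzma.so'
old_library='liblzma.a'
EOF

# Same substitution as in the example, written via a temporary file
# instead of "sed -i" (an assumption made here for portability).
sed "s/^old_library=.*\$/old_library=''/" "$la_file" > "$la_file.new"
mv "$la_file.new" "$la_file"

# Show the rewritten field, then clean up the scratch directory.
result=$(grep '^old_library=' "$la_file")
echo "$result"
rm -rf "$workdir"
```

After the substitution, the old_library line no longer names liblzma.a, so libtool-based builds that read the .la file should not look for the removed static library. The other fields are left untouched.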