From: Ian Darwin Date: Fri, 28 Aug 1987 21:40:40 +0000 (+0000) Subject: Initial revision X-Git-Tag: FILE3_27~420 X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=98cbbcf41a48a5d27c9c8839fa800f15b854ef33;p=file Initial revision --- diff --git a/doc/file.man b/doc/file.man new file mode 100644 index 00000000..fe82be8d --- /dev/null +++ b/doc/file.man @@ -0,0 +1,186 @@ +.TH FILE 1 "Public Domain" +.SH NAME +.I file +\- determine file type +.SH SYNOPSIS +.B file +[ +.B -c +] +[ +.B -f +file ] +[ +.B -m +magicfile ] +file ... +.SH DESCRIPTION +.I File +performs a series of tests on each argument in an attempt to classify it. +There are three sets of tests, performed in this order: +filesystem tests, magic number tests, and language tests. +The program prints the +.I first +type that matches the file, and goes on to the next file argument if any. +.PP +The type printed will usually contain one of the words +.B text +(the file contains only ASCII characters and is +moderately safe to read on an ASCII terminal), +.B executable +(the file contains the result of compiling a program +in a form understandable to some \s-1UNIX\s0 kernel or another), +or +.B data +meaning anything else (data is usually `binary' or non-printable). +Exceptions are well-known file formats (core files, tar archives) +that are known to contain binary data. +When modifying the file +.I /etc/magic +or the program itself, +.B "preserve these keywords" . +People depend on knowing that all the files in a directory +that are readable contain the word ``text''. +Don't do as one computer vendor did \- change ``shell commands text'' +to ``shell script''. +.PP +The filesystem tests are based on examining the return from a +.I stat (2) +system call. +The program checks to see if the file is empty, +or if it's some sort of special file. +Any known file types appropriate to the system you are running on +(sockets and symbolic links on 4.2BSD, named pipes (fifos) on System V) +are intuited if they are defined in +the appropriate system header file +.I sys/stat.h . +.PP +The magic number tests are used to check for files with data in +particular fixed formats. +The canonical example of this is a binary executable (compiled program) +.I a.out +file, whose format is defined in +.I a.out.h +and possibly +.I exec.h +in the standard include directory. +These files have a `magic number' stored in a particular place +near the beginning of the file that tells the \s-1UNIX\s0 operating system +that the file is a binary executable, and which of several types thereof. +The concept of `magic number' has been applied by extention to data files. +Any file with some invariant identifier at a fixed +offset into the file can usually be described in this way. +The information in these files is read from the magic file +.I /etc/magic . +.PP +If an argument appears to be +.SM ASCII , +.I file +attempts to guess its language. +The language tests look for particular strings that can appear +anywhere in the first few blocks of the file. +For example, the keyword +.I dimension +indicates that the file is most likely a \s-1FORTRAN\s0 program, +just as the keyword +.I struct +indicates a C program. +These tests are less reliable than the previous +two groups, so they are performed last. +The keywords used in the language tests are stored in the +source file +.I names.h . +The routine that handles the language tests also tests for some miscellany +(such as +.I tar +archives) and determines whether an unknown file should be +labelled as `ascii text' or `data'. +.PP +Use +.B -m +.I file +to specify an alternate file of magic numbers. +The option +.B -c +causes a checking printout of the parsed form of the magic file. +This is usually used in conjunction with +.B -m +to debug a new magic file. +.PP +The +.B -f +.I file +option specifies that the names of the files to be examined +is to come from +.I file +rather than (or in addition to) the argument list. +.SH FILES +.I /etc/magic +\- default list of magic numbers +.SH SEE ALSO +.IR File (5) +\- description of magic file format. +.IR Strings (1), " od" (1) +\- tools for examining non-textfiles. +.SH STANDARDS CONFORMANCE +This program is believed to exceed the System V Interface Definition +of FILE(CMD), as near as one can determine from the vague language +contained therein. +Its behaviour is mostly compatible with the System V program of the same name. +This version knows more magic, however, so it will produce +different (albeit more accurate) output in many cases. +.SH HISTORY +There has been a +.I file +command in every UNIX since at least Research Version 6 +(man page dated January, 1975). +The System V version introduced one significant major change: +the external list of magic number types. +This slowed the program down slightly but made it significantly +more flexible. +.PP +This program, based on the System V version, +was written by Ian Darwin without looking at anybody else's source code +(I looked at one later, and was glad I hadn't!). +.SH NOTICE +This program is copyright \(co 1986, Ian Darwin. +.B "Unmodified copies +of the source may be freely copied for any purpose +provided this notice is maintained. +.B "Redistribution of modified source copies is prohibited; +redistribute the original source with your changes as diff listings. +Send the author your changes; if I like 'em I'll incorporate them. +.PP +Author's addresses: +Postal: Box 603, Station F, Toronto, CANADA M4Y 2L8; +uUCp: {utzoo|ihnp4}!darwin!ian; +InterNet: ian@sq.com. +.PP +A few files (such as +.I is_tar.c , +.I strtok.c , +and +.I strtol.c ) +are not covered by the above restrictions. +.SH BUGS +You can't use `-' (or a null argument list) to determine the file type +of the standard input, since the program insists on doing a +.I stat (2) +call to glean some information about the file. +.PP +.I File +uses several algorithms that favor speed over accuracy, +thus it is often misled about the contents of ASCII files. +.PP +The support for ASCII files (primarily for programming languages) +is simplistic, inefficient and requires recompilation to update. +.PP +The magic file should be compiled into a binary +(or better yet, fixed-length ASCII strings +for use in heterogenous network environments) +for faster startup. +Then the program would run as fast as the Version 7 program of the same name, +with the flexibility of the System V version. +But then there would have to be yet another magic number for the +.I magic.out +file.