>10 leshort x height=%d pixels,
>12 leshort x depth=%d,
>16 leshort x ticks/frame=%d
-# newer FLI or FLC format
-4 leshort 0xAF12 newer FLI or FLC file
+# FLC animation format
+4 leshort 0xAF12 FLC file
>6 leshort x - %d frames
>8 leshort x width=%d pixels,
>10 leshort x height=%d pixels,
#------------------------------------------------------------------------------
-# archive: file(1) magic for archive formats (see also "dos" for self-
+# archive: file(1) magic for archive formats (see also "msdos" for self-
# extracting compressed archives)
#
# cpio, ar, arc, arj, hpack, lha/lharc, rar, squish, uc2, zip, zoo, etc.
0 string 070707 ASCII cpio archive (pre-SVR4 or odc)
0 string 070701 ASCII cpio archive (SVR4 with no CRC)
0 string 070702 ASCII cpio archive (SVR4 with CRC)
+
+# other archives
0 long 0177555 very old archive
0 short 0177555 very old PDP-11 archive
0 long 0177545 old archive
#
0 string -h- Software Tools format archive text
-# ARC archiver (empirical entries), from Daniel Quinlan (quinlan@yggdrasil.com)
-#
-# Other entries seem likely, but single byte magic entries collide
-# with many things. I think the real ARC magic is a single byte.
+# ARC archiver, from Daniel Quinlan (quinlan@yggdrasil.com)
#
-# the first entry covers about 84% of ARC files, the second 7%, the third 5%
-0 string \032\010 ARC archive data
-0 string \032\011 ARC archive data
-0 string \032\002 ARC archive data
-# these seem to be rather uncommon, at less than 3% and 2%, respectively
-0 string \032\003 ARC archive data
-0 string \032\004 ARC archive data
+# The first byte is the magic (0x1a), byte 2 is the compression type for
+# the first file (0x01 through 0x09), and bytes 3 to 15 are the MS-DOS
+# filename of the first file (null terminated). Since some types collide,
+# we only test some types on the basis of frequency: 0x08 (83%), 0x09 (5%),
+# 0x02 (5%), 0x03 (3%), 0x04 (2%), 0x06 (2%). 0x01 collides with terminfo.
+0 lelong&0x8080ffff 0x0000081a ARC archive data, dynamic LZW
+0 lelong&0x8080ffff 0x0000091a ARC archive data, squashed
+0 lelong&0x8080ffff 0x0000021a ARC archive data, uncompressed
+0 lelong&0x8080ffff 0x0000031a ARC archive data, packed
+0 lelong&0x8080ffff 0x0000041a ARC archive data, squeezed
+0 lelong&0x8080ffff 0x0000061a ARC archive data, crunched
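+# Worked example of the mask (illustrative; assumes a first filename
+# starting with "FO"): a file beginning with bytes 1a 08 46 4f reads as
+# lelong 0x4f46081a, and 0x4f46081a & 0x8080ffff = 0x0000081a, so the
+# "dynamic LZW" entry matches. The 0x8080 part of the mask also requires
+# the first two filename characters to have their high bit clear.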
# Acorn archive formats (Disaster prone simpleton, m91dps@ecs.ox.ac.uk)
# I can't create either SPARK or ArcFS archives so I have not tested this stuff
# ARJ archiver (jason@jarthur.Claremont.EDU)
0 leshort 0xea60 ARJ archive data
->5 byte x - version %d,
->8 byte >0 flags:
->>8 byte &0x04 multi-volume,
->>8 byte &0x10 slash switched,
->>8 byte &0x20 backup,
+>5 byte x \b, v%d,
+>8 byte &0x04 multi-volume,
+>8 byte &0x10 slash-switched,
+>8 byte &0x20 backup,
>34 string x original name: %s,
->7 byte 0 os: MS/DOS
+>7 byte 0 os: MS-DOS
>7 byte 1 os: PRIMOS
->7 byte 2 os: UNIX
+>7 byte 2 os: Unix
>7 byte 3 os: Amiga
>7 byte 4 os: Macintosh
>7 byte 5 os: OS/2
>3 byte >0 %d]
# HA archiver (Greg Roelofs, newt@uchicago.edu)
-0 string HA HA archive data,
->2 leshort x %u files,
->4 byte&0x0f =0 first is type CPY
->4 byte&0x0f =1 first is type ASC
->4 byte&0x0f =2 first is type HSC
->4 byte&0x0f =0x0e first is type DIR
->4 byte&0x0f =0x0f first is type SPECIAL
+# This is a really bad format. A file containing HAWAII will match this...
+#0 string HA HA archive data,
+#>2 leshort =1 1 file,
+#>2 leshort >1 %u files,
+#>4 byte&0x0f =0 first is type CPY
+#>4 byte&0x0f =1 first is type ASC
+#>4 byte&0x0f =2 first is type HSC
+#>4 byte&0x0f =0x0e first is type DIR
+#>4 byte&0x0f =0x0f first is type SPECIAL
# HPACK archiver (Peter Gutmann, pgut1@cs.aukuni.ac.nz)
0 string HPAK HPACK archive data
+# JAM Archive volume format, by Dmitry.Kohmanyuk@UA.net
+0 string \351,\001JAM\ JAM archive,
+>7 string >\0 version %.4s
+>0x26 byte =0x27 -
+>>0x2b string >\0 label %.11s,
+>>0x27 lelong x serial %08x,
+>>0x36 string >\0 fstype %.8s
+
# LHARC/LHA archiver (Greg Roelofs, newt@uchicago.edu)
-2 string -lh0- Lharc 1.x archive data [lh0]
-2 string -lh1- Lharc 1.x archive data [lh1]
-2 string -lz4- Lharc 1.x archive data [lz4]
-2 string -lz5- Lharc 1.x archive data [lz5]
+2 string -lh0- LHarc 1.x archive data [lh0]
+2 string -lh1- LHarc 1.x archive data [lh1]
+2 string -lz4- LHarc 1.x archive data [lz4]
+2 string -lz5- LHarc 1.x archive data [lz5]
# [never seen any but the last; -lh4- reported in comp.compression:]
2 string -lzs- LHa 2.x? archive data [lzs]
2 string -lh - LHa 2.x? archive data [lh ]
# SQUISH archiver (Greg Roelofs, newt@uchicago.edu)
0 string SQSH squished archive data (Acorn RISCOS)
-# ZIP archiver (Greg Roelofs, c/o zip-bugs@wkuvx1.wku.edu)
+# UC2 archiver (Greg Roelofs, newt@uchicago.edu)
+# I can't figure out the self-extracting form of these buggers...
+0 string UC2\x1a UC2 archive data
+
+# ZIP archives (Greg Roelofs, c/o zip-bugs@wkuvx1.wku.edu)
0 string PK\003\004 Zip archive data
->4 byte 0x09 (at least v0.9 to extract)
->4 byte 0x0a (at least v1.0 to extract)
->4 byte 0x0b (at least v1.1 to extract)
->4 byte 0x14 (at least v2.0 to extract)
-
-# ZOO archiver (Greg Roelofs, newt@uchicago.edu)
-#0 string ZOO Zoo archive data
-# above is an alternate identifier
-20 string \xdc\xa7\xc4\xfd Zoo archive data
-# I don't know if all of these versions exist, or if some are missing...
->4 string 1.00 (v%4s)
->4 string 1.10 (v%4s)
->4 string 1.20 (v%4s)
->4 string 1.30 (v%4s)
->4 string 1.40 (v%4s)
->4 string 1.50 (v%4s)
->4 string 1.60 (v%4s)
->4 string 1.70 (v%4s)
->4 string 1.71 (v%4s)
->4 string 2.00 (v%4s)
->4 string 2.01 (v%4s)
->4 string 2.10 (v%4s)
->32 string \001\000 (modify: v1.0+)
->32 string \001\004 (modify: v1.4+)
->32 string \002\000 (modify: v2.0+)
->70 string \001\000 (extract: v1.0+)
->70 string \002\001 (extract: v2.1+)
+>4 byte 0x09 \b, at least v0.9 to extract
+>4 byte 0x0a \b, at least v1.0 to extract
+>4 byte 0x0b \b, at least v1.1 to extract
+>4 byte 0x14 \b, at least v2.0 to extract
+
+# Zoo archiver
+20 lelong 0xfdc4a7dc Zoo archive data
+>4 byte >48 \b, v%c.
+>>6 byte >47 \b%c
+>>>7 byte >47 \b%c
+>32 byte >0 \b, modify: v%d
+>>33 byte x \b.%d+
+>42 lelong 0xfdc4a7dc \b,
+>>70 byte >0 extract: v%d
+>>>71 byte x \b.%d+
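+# Note on the Zoo magic (illustrative): the four bytes dc a7 c4 fd at
+# offset 20, read as a little-endian long, give 0xfdc4a7dc, so this lelong
+# test matches the same bytes as a string test for \xdc\xa7\xc4\xfd.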
# Shell archives
10 string #\ This\ is\ a\ shell\ archive shell archive text
->2 string >\0 (%s)
-
-# JAM Archive volume format, by Dmitry.Kohmanyuk@UA.net
-0 string \351,\001JAM\ JAM archive,
->7 string >\0 version %.4s
->0x26 byte =0x27 -
->>0x2b string >\0 label %.11s,
->>0x27 lelong x serial %08x,
->>0x36 string >\0 fstype %.8s
#------------------------------------------------------------------------------
# audio: file(1) magic for sound formats
#
-# from Jan Nicolai Langfeldt <janl@ifi.uio.no>,
+# Jan Nicolai Langfeldt (janl@ifi.uio.no), Dan Quinlan (quinlan@yggdrasil.com),
+# and others
#
# Sun/NeXT audio data
# Microsoft WAVE format (*.wav)
# [GRR 950115: probably all of the shorts and longs should be leshort/lelong]
0 string RIFF Microsoft RIFF
->8 string WAVE - WAVE format
->34 short >0 %d bit
->22 short =1 Mono
->22 short =2 Stereo
->22 short >2 %d Channels
+>8 string WAVE \b, WAVE audio data
+>34 short >0 \b, %d bit
+>22 short =1 \b, mono
+>22 short =2 \b, stereo
+>22 short >2 \b, %d channels
>24 long >0 %d Hz
+
+# Extended MOD format (*.emd) (Greg Roelofs, newt@uchicago.edu); NOT TESTED
+# [based on posting 940824 by "Dirk/Elastik", husberg@lehtori.cc.tut.fi]
+0 string EMOD Extended MOD sound data,
+>4 byte&0xf0 x version %d
+>4 byte&0x0f x \b.%d,
+>45 byte x %d instruments
+>83 byte 0 (module)
+>83 byte 1 (song)
#
# XPM icons (Greg Roelofs, newt@uchicago.edu)
-# ideally should go into "images", but entries below would tag XPM as C source
-0 string /*\ XPM X pixmap image data
+# if you uncomment "/*" for C/REXX below, also uncomment this entry
+#0 string /*\ XPM\ */ X pixmap image data
# this first will upset you if you're a PL/1 shop...
# in which case rm it; ascmagic will catch real C programs
# commands: file(1) magic for various shells and interpreters
#
0 string :\ shell archive or commands for antique kernel text
-0 string #!/bin/sh Bourne Shell script text
-0 string #!\ /bin/sh Bourne Shell script text
-0 string #!/bin/csh C Shell script text
-0 string #!\ /bin/csh C Shell script text
+0 string #!/bin/sh Bourne shell script text
+0 string #!\ /bin/sh Bourne shell script text
+0 string #!/bin/csh C shell script text
+0 string #!\ /bin/csh C shell script text
# korn shell magic, sent by George Wu, gwu@clyde.att.com
-0 string #!/bin/ksh Korn Shell script text
-0 string #!\ /bin/ksh Korn Shell script text
-0 string #!/bin/tcsh Tenex C Shell script text
-0 string #!\ /bin/tcsh Tenex C Shell script text
-0 string #!/usr/local/tcsh Tenex C Shell script text
-0 string #!\ /usr/local/tcsh Tenex C Shell script text
-0 string #!/usr/local/bin/tcsh Tenex C Shell script text
-0 string #!\ /usr/local/bin/tcsh Tenex C Shell script text
+0 string #!/bin/ksh Korn shell script text
+0 string #!\ /bin/ksh Korn shell script text
+0 string #!/bin/tcsh Tenex C shell script text
+0 string #!\ /bin/tcsh Tenex C shell script text
+0 string #!/usr/local/tcsh Tenex C shell script text
+0 string #!\ /usr/local/tcsh Tenex C shell script text
+0 string #!/usr/local/bin/tcsh Tenex C shell script text
+0 string #!\ /usr/local/bin/tcsh Tenex C shell script text
#
# zsh/ash/ae/nawk/gawk magic from cameron@cs.unsw.oz.au (Cameron Simpson)
-0 string #!/usr/local/bin/zsh - Paul Falstad's zsh
-0 string #!\ /usr/local/bin/zsh - Paul Falstad's zsh
-0 string #!/usr/local/bin/ash - NeilBrown's ash
-0 string #!\ /usr/local/bin/ash - NeilBrown's ash
-0 string #!/usr/local/bin/ae - NeilBrown's ae
-0 string #!\ /usr/local/bin/ae - NeilBrown's ae
-0 string #!/bin/nawk - New Awk script text
-0 string #!\ /bin/nawk - New Awk script text
-0 string #!/usr/bin/nawk - New Awk script text
-0 string #!\ /usr/bin/nawk - New Awk script text
-0 string #!/usr/local/bin/nawk - New Awk script text
-0 string #!\ /usr/local/bin/nawk - New Awk script text
-0 string #!/bin/gawk - GNU awk script text
-0 string #!\ /bin/gawk - GNU awk script text
-0 string #!/usr/bin/gawk - GNU awk script text
-0 string #!\ /usr/bin/gawk - GNU awk script text
-0 string #!/usr/local/bin/gawk - GNU awk script text
-0 string #!\ /usr/local/bin/gawk - GNU awk script text
+0 string #!/usr/local/bin/zsh Paul Falstad's zsh
+0 string #!\ /usr/local/bin/zsh Paul Falstad's zsh
+0 string #!/usr/local/bin/ash Neil Brown's ash
+0 string #!\ /usr/local/bin/ash Neil Brown's ash
+0 string #!/usr/local/bin/ae Neil Brown's ae
+0 string #!\ /usr/local/bin/ae Neil Brown's ae
+0 string #!/bin/nawk new awk script text
+0 string #!\ /bin/nawk new awk script text
+0 string #!/usr/bin/nawk new awk script text
+0 string #!\ /usr/bin/nawk new awk script text
+0 string #!/usr/local/bin/nawk new awk script text
+0 string #!\ /usr/local/bin/nawk new awk script text
+0 string #!/bin/gawk GNU awk script text
+0 string #!\ /bin/gawk GNU awk script text
+0 string #!/usr/bin/gawk GNU awk script text
+0 string #!\ /usr/bin/gawk GNU awk script text
+0 string #!/usr/local/bin/gawk GNU awk script text
+0 string #!\ /usr/local/bin/gawk GNU awk script text
#
-0 string #!/bin/awk Awk Commands text
-0 string #!\ /bin/awk Awk Commands text
-0 string #!/usr/bin/awk Awk Commands text
-0 string #!\ /usr/bin/awk Awk Commands text
+0 string #!/bin/awk awk commands text
+0 string #!\ /bin/awk awk commands text
+0 string #!/usr/bin/awk awk commands text
+0 string #!\ /usr/bin/awk awk commands text
# For Larry Wall's perl language. The ``eval'' line recognizes an
# outrageously clever hack for USG systems.
0 string eval\ "exec\ /usr/local/bin/perl perl commands text
# AT&T Bell Labs' Plan 9 shell
-0 string #!/bin/rc Plan 9 rc Shell script text
-0 string #!\ /bin/rc Plan 9 rc Shell script text
+0 string #!/bin/rc Plan 9 rc shell script text
+0 string #!\ /bin/rc Plan 9 rc shell script text
# bash shell magic, from Peter Tobias (tobias@server.et-inf.fho-emden.de)
-0 string #!/bin/bash Bourne-Again Shell script text
-0 string #!\ /bin/bash Bourne-Again Shell script text
-0 string #!/usr/local/bin/bash Bourne-Again Shell script text
-0 string #!\ /usr/local/bin/bash Bourne-Again Shell script text
+0 string #!/bin/bash Bourne-Again shell script text
+0 string #!\ /bin/bash Bourne-Again shell script text
+0 string #!/usr/local/bin/bash Bourne-Again shell script text
+0 string #!\ /usr/local/bin/bash Bourne-Again shell script text
# generic shell magic
0 string #!\ / a
#------------------------------------------------------------------------------
# compress: file(1) magic for pure-compression formats (no archives)
#
-# compress, gzip, pack, compact, huf, squeeze, crunch, freeze, yabba, whap, etc.
+# compress, gzip, pack, compact, huf, squeeze, crunch, freeze, yabba, etc.
#
# Formats for various forms of compressed data
# Formats for "compress" proper have been moved into "compress.c",
>2 byte&0x80 >0 block compressed
>2 byte&0x1f x %d bits
-# gzip (GNU zip, not to be confused with [Info-ZIP/PKWARE] zip archiver)
+# gzip (GNU zip, not to be confused with Info-ZIP or PKWARE zip archiver)
0 string \037\213 gzip compressed data
->2 byte <8 - reserved method,
->2 byte 8 - deflate method,
->3 byte &0x01 ascii,
+>2 byte <8 \b, reserved method,
+>2 byte 8 \b, deflated,
+>3 byte &0x01 ASCII,
>3 byte &0x02 continuation,
>3 byte &0x04 extra field,
->3 byte &0x08 original file name,
+>3 byte &0x08 original filename,
>3 byte &0x10 comment,
>3 byte &0x20 encrypted,
>4 ledate x last modified: %s,
>9 byte =0x0A os: Tops/20
>9 byte =0x0B os: Win/32
-# According to gzip.h, this is the correct byte order for packed data.
-#
+# packed data, Huffman (minimum redundancy) codes on a byte-by-byte basis
0 string \037\036 packed data
->2 belong >1 %d characters originally
->2 belong =1 %d character originally
-#
-# This magic number is byte-order-independent.
+>2 belong >1 \b, %d characters originally
+>2 belong =1 \b, %d character originally
#
+# This magic number is byte-order-independent. XXX - Does that mean this
+# is big-endian, little-endian, either, or that you can't tell?
0 short 017437 old packed data
# XXX - why *two* entries for "compacted data", one of which is
0 leshort 0x76FE crunched data (CP/M, DOS)
# Freeze
-0 string \037\237 Frozen file 2.1
-0 string \037\236 Frozen file 1.0 (or gzip 0.5)
+0 string \037\237 frozen file 2.1
+0 string \037\236 frozen file 1.0 (or gzip 0.5)
-# lzh?
-0 string \037\240 LZH compressed data
+# SCO compress -H (LZH)
+0 string \037\240 SCO compress -H (LZH) data
# European GSM 06.10 is a provisional standard for full-rate speech
# transcoding, prI-ETS 300 036, which uses RPE/LTP (residual pulse
# excitation/long term prediction) coding at 13 kbit/s.
#
-# WEAK - There's only a magic nibble (4 bits); but that nibble repeats
-# every 33 bytes. This magic is NOT suited for use, but maybe we can
-# use it someday.
+# There's only a magic nibble (4 bits); that nibble repeats every 33
+# bytes. This isn't suited for use, but maybe we can use it someday.
#
# This will cause very short GSM files to be declared as data and
# mismatches to be declared as data too!
# We have to check the byte order flag to see what byte order all the
# other stuff in the header is in.
#
-# MIPS, i486 added by Daniel Quinlan (quinlan@yggdrasil.com)
+# Byte order is probably big-endian for MIPS RS3000 and Amdahl.
+# The MIPS RS3000 value may also be used for the MIPS RS2000.
+#
+# updated by Daniel Quinlan (quinlan@yggdrasil.com)
0 string \177ELF ELF
>4 byte 0 invalid class
>4 byte 1 32-bit
>4 byte 2 64-bit
>5 byte 0 invalid byte order
>5 byte 1 LSB
->>16 leshort 0 unknown type
->>16 leshort 1 relocatable
->>16 leshort 2 executable
->>16 leshort 3 dynamic lib
->>16 leshort 4 core file
->>18 leshort 0 unknown machine
->>18 leshort 1 WE32100 and up
->>18 leshort 2 SPARC - invalid byte order
->>18 leshort 3 i386 (386 and up)
->>18 leshort 4 M68000 - invalid byte order
->>18 leshort 5 M88000 - invalud byte order
->>18 leshort 6 i486
->>18 leshort 7 i860
->>18 leshort 8 MIPS
->>20 lelong 1 Version 1
+>>16 leshort 0 no file type,
+>>16 leshort 1 relocatable,
+>>16 leshort 2 executable,
+>>16 leshort 3 shared object,
+>>16 leshort 4 core file,
+>>16 leshort &0xff00 processor-specific,
+>>18 leshort 0 no machine,
+>>18 leshort 1 AT&T WE32100 - invalid byte order,
+>>18 leshort 2 SPARC - invalid byte order,
+>>18 leshort 3 Intel 80386,
+>>18 leshort 4 Motorola 68000 - invalid byte order,
+>>18 leshort 5 Motorola 88000 - invalid byte order,
+>>18 leshort 6 Intel 80486,
+>>18 leshort 7 Intel 80860,
+>>18 leshort 8 MIPS RS3000,
+>>18 leshort 9 Amdahl,
+>>20 lelong 0 invalid version
+>>20 lelong 1 version 1
>>36 lelong 1 MathCoPro/FPU/MAU Required
>5 byte 2 MSB
->>16 beshort 0 unknown type
->>16 beshort 1 relocatable
->>16 beshort 2 executable
->>16 beshort 3 dynamic lib
->>16 beshort 4 core file
->>18 beshort 0 unknown machine
->>18 beshort 1 WE32100 and up
->>18 beshort 2 SPARC
->>18 beshort 3 i386 (386 and up) - invalid byte order
->>18 beshort 4 M68000
->>18 beshort 5 M88000
->>18 beshort 6 i486 - invalid byte order
->>18 beshort 7 i860
->>18 beshort 8 MIPS
->>20 belong 1 Version 1
+>>16 beshort 0 no file type,
+>>16 beshort 1 relocatable,
+>>16 beshort 2 executable,
+>>16 beshort 3 shared object,
+>>16 beshort 4 core file,
+>>16 beshort &0xff00 processor-specific,
+>>18 beshort 0 no machine,
+>>18 beshort 1 AT&T WE32100,
+>>18 beshort 2 SPARC,
+>>18 beshort 3 Intel 80386 - invalid byte order,
+>>18 beshort 4 Motorola 68000,
+>>18 beshort 5 Motorola 88000,
+>>18 beshort 6 Intel 80486 - invalid byte order,
+>>18 beshort 7 Intel 80860,
+>>18 beshort 8 MIPS RS3000,
+>>18 beshort 9 Amdahl,
+>>20 belong 0 invalid version
+>>20 belong 1 version 1
>>36 belong 1 MathCoPro/FPU/MAU Required
--- /dev/null
+
+#------------------------------------------------------------------------------
+# filesystems: file(1) magic for different filesystems
+#
+0x438 leshort 0xEF53 Linux/i386 ext2 filesystem
+0 string \366\366\366\366 PC formatted floppy with no filesystem
# UNIX environment atop the "SUN kernel"; dunno whether it was
# big-endian or little-endian.
#
-# I'm guessing that the 200 series was 68K-based; the 300 and 400 series
-# are.
+# Daniel Quinlan (quinlan@yggdrasil.com): hp200 machines are 68010 based;
+# hp300 are 68020+68881 based; hp400 are also 68k. The following basic
+# HP magic is useful for reference, but using "long" magic is a better
+# practice in order to avoid collisions.
#
-# Daniel Quinlan (quinlan@yggdrasil.com): hp200 machines are 68010
-# based; hp300 are 68020+68881 based. I think the following "basic"
-# magic should probably be integrated and the various flavors of
-# binaries be implemented with ">2" in case some flavors have been missed.
-# 0 beshort 200 hp200 (68010) BSD binary
-# 0 beshort 300 hp300 (68020+68881) BSD binary
-# 0 beshort 0x20c hp200/300 HP-UX binary
-# 0 beshort 0x20b hp800 HP-UX binary
+# 0 beshort 200 hp200 (68010) BSD binary
+# 0 beshort 300 hp300 (68020+68881) BSD binary
+# 0 beshort 0x20c hp200/300 HP-UX binary
+# 0 beshort 0x20b hp800 HP-UX binary
#
# The "misc" stuff needs a byte order; the archives look suspiciously
# like the old 177545 archives (0xff65 = 0177545).
#
#### Old Apollo stuff
0 beshort 0627 Apollo m68k COFF executable
->18 beshort ^040000 not stripped
+>18 beshort ^040000 not stripped
>22 beshort >0 - version %ld
0 beshort 0624 apollo a88k COFF executable
->18 beshort ^040000 not stripped
+>18 beshort ^040000 not stripped
>22 beshort >0 - version %ld
0 long 01203604016 TML 0123 byte-order format
0 long 01702407010 TML 1032 byte-order format
0 beshort 0535 SVR2 executable (USS/370)
>12 belong >0 not stripped
>24 belong >0 - version %ld
-
#------------------------------------------------------------------------------
-# images: file(1) magic for image formats (see also "c-lang" for XPM bitmaps)
+# images: file(1) magic for image formats
#
# originally from jef@helios.ee.lbl.gov (Jef Poskanzer),
# additions by janl@ifi.uio.no as well as others. Jan also suggested
# merging several one- and two-line files into here.
#
-# XXX - byte order for GIF and TIFF fields?
-# [GRR: TIFF allows both byte orders; GIF is probably little-endian]
-#
-
-# [GRR: what the hell is this doing in here?]
-0 string xbtoa btoa'd file
+# little magic: PCX (first byte is 0x0a)
+# no magic: Targa
# PBMPLUS
0 string P1 PBM file
0 string P5 PGM "rawbits" file
0 string P6 PPM "rawbits" file
-# NIFF (Navy Interchange File Format, a modification of TIFF)
-# [GRR: this *must* go before TIFF]
+# NIFF (Navy Interchange File Format, a modification of TIFF) images
0 string IIN1 NIFF raster data
-# TIFF and friends
-0 string MM TIFF file, big-endian
->2 short >0 version %d
-0 string II TIFF file, little-endian
->2 short >0 version %d
+# Tag Image File Format, from Daniel Quinlan (quinlan@yggdrasil.com)
+# The second word of TIFF files is the TIFF version number, 42, which has
+# never changed. The TIFF specification recommends testing for it.
+0 string MM\x00\x2a TIFF file, big-endian
+0 string II\x2a\x00 TIFF file, little-endian
-# possible GIF replacements; none yet released!
+# PNG [Portable Network Graphics, or "PNG's Not GIF"] images
# (Greg Roelofs, newt@uchicago.edu)
#
-# GRR 950115: this was mine ("Zip GIF"):
-0 string GIF94z ZIF image (GIF+deflate alpha)
-#
-# GRR 950115: this is Jeremy Wohl's Free Graphics Format (better):
-0 string FGF95a FGF image (GIF+deflate beta)
+# 137 P N G \r \n ^Z \n [4-byte length] H E A D [HEAD data] [HEAD crc] ...
#
-# GRR 950115: this is Thomas Boutell's Portable Bitmap Format proposal
-# (best; not yet implemented):
-0 string .PBF PBF image (deflate compression)
+0 string \x89PNG PNG image,
+>4 belong !0x0d0a1a0a CORRUPTED,
+>16 belong x %ld x
+>20 belong x %ld,
+>24 byte x %d-bit
+>25 byte 0 grayscale,
+>25 byte 2 \b/color RGB,
+>25 byte 3 colormap,
+>25 byte 4 gray+alpha,
+>25 byte 6 \b/color RGBA,
+#>26 byte 0 deflate/32K,
+>28 byte 0 non-interlaced
+>28 byte 1 interlaced
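+# Offsets used above, assuming the layout sketched in the comment: the
+# 8-byte signature, a 4-byte chunk length and a 4-byte chunk name put the
+# header data at offset 16, giving width (16), height (20), bit depth (24),
+# color type (25) and interlace flag (28).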
# GIF
0 string GIF GIF image
->3 string 87a - version %s,
->3 string 89a - version %s,
+>3 string 87a \b, version %s,
+>3 string 89a \b, version %s,
>6 leshort >0 %hd x
>8 leshort >0 %hd,
#>10 byte &0x80 color mapped,
-# GRR 950118: the following is not accurate for xv-created GIFs:
+# GRR 950428: the following is wrong; interlace flag is at a variable offset
#>10 byte &0x40 interlaced,
>10 byte&0x07 =0x00 2 colors
>10 byte&0x07 =0x01 4 colors
>10 byte&0x07 =0x06 128 colors
>10 byte&0x07 =0x07 256 colors
-# Miscellany
+# ITC (CMU WM) raster files. The format is essentially a byte-reversed Sun
+# raster, 1 plane, no encoding.
+0 string \361\0\100\273 CMU window manager raster image data
+>4 lelong >0 %d x
+>8 lelong >0 %d,
+>12 lelong >0 %d-bit
+
+# Magick Image File Format
+0 string id=ImageMagick MIFF image file
+
+# miscellaneous images
0 long 1123028772 Artisan image file
->4 long 1 rectangular 24-bit image
->4 long 2 rectangular 8-bit image with colormap
+>4 long 1 rectangular 24-bit image
+>4 long 2 rectangular 8-bit image with colormap
>4 long 3 rectangular 32-bit image (24-bit with matte)
-0 string \361\0\100\273 CMU window manager bitmap
0 string #FIG FIG graphics savefile text
>6 string 2.1 Version 2.1
>6 string 2.0 Version 2.0
0 string GKSM GKS Metafile
8 string ILBM IFF ILBM file
-0 string ARF_BEGARF PHIGS clear text archive
-
-# More miscellany from Daniel Quinlan (quinlan@yggdrasil.com)
0 string This\ is\ a\ BitMap\ file Lisp Machine bit-array-file
-0 string !! Bennet Yee's "face" format
+0 string !! Bennet Yee's "face" format
+
+# PHIGS
+0 string ARF_BEGARF PHIGS clear text archive
0 string @(#)SunPHIGS SunPHIGS
# version number follows, in the form m.n
>40 string SunBin binary
>32 string archive archive
+
+# CGM image files
0 string BEGMF clear text Computer Graphics Metafile
-# these should be beshort, but not sure
0 beshort&0xffe0 0x0020 binary Computer Graphics Metafile
0 beshort 0x3020 character Computer Graphics Metafile
+# MGR bitmaps (Michael Haardt, u31b3hs@pool.informatik.rwth-aachen.de)
+0 string yz MGR bitmap, modern format, 8-bit aligned
+0 string zz MGR bitmap, old format, 1 bit deep, 16-bit aligned
+0 string xz MGR bitmap, old format, 1 bit deep, 32-bit aligned
+0 string yx MGR bitmap, modern format, squeezed
-# From: <u31b3hs@pool.informatik.rwth-aachen.de> (Michael Haardt)
-0 string yz MGR bitmap, modern format, 8 bit aligned
-0 string zz MGR bitmap, old format, 1 bit deep, 16 bit aligned
-0 string xz MGR bitmap, old format, 1 bit deep, 32 bit aligned
-0 string yx MGR bitmap, modern format, squeezed
-
-0 string %bitmap FBM pixmap
->30 long 0x31 (mono)
->30 long 0x33 (color)
+# Fuzzy Bitmap (FBM) images
+0 string %bitmap\0 FBM image data
+>30 long 0x31 \b, mono
+>30 long 0x33 \b, color
-4 string Research, Digifax-G3-File
->29 byte 1 , fine resolution
->29 byte 0 , normal resolution
+# facsimile data
+1 string PC\ Research,\ Inc group 3 fax image
+>29 byte 0 \b, normal resolution (204x98 DPI)
+>29 byte 1 \b, fine resolution (204x196 DPI)
# JPEG images
-0 beshort 0xffd8 JPEG image
->6 string JFIF - JFIF standard
-# from cameron@cs.unsw.oz.au (Cameron Simpson):
-0 string hsi1 JPEG image - HSI encoded (proprietary)
+0 beshort 0xffd8 JPEG image data
+>6 string JFIF \b, JFIF standard
+# HSI is Handmade Software's proprietary JPEG encoding scheme
+0 string hsi1 JPEG image data, HSI proprietary
# PC bitmaps (OS/2, Windoze BMP files) (Greg Roelofs, newt@uchicago.edu)
-0 string BM bitmap
->14 byte 12 (OS/2 1.x format)
->14 byte 64 (OS/2 2.x format)
->14 byte 40 (Windows 3.x format)
-0 string IC icon
-0 string PI pointer
-0 string CI color icon
-0 string CP color pointer
-0 string BA bitmap array
-
-# Utah Raster Toolkit RLE images (two versions)
-#
-# From <janl@ifi.uio.no>
-# I made this with the help of the man page for rle(5). Ihey missing
-# from the magic numbers I have:
+0 string BM PC bitmap data
+>14 byte 12 \b, OS/2 1.x format
+>14 byte 64 \b, OS/2 2.x format
+>14 byte 40 \b, Windows 3.x format
+0 string IC PC icon data
+0 string PI PC pointer image data
+0 string CI PC color icon data
+0 string CP PC color pointer image data
+0 string BA PC bitmap array data
+
+# XPM icons (Greg Roelofs, newt@uchicago.edu)
+# note possible collision with C/REXX entry in c-lang; currently commented out
+0 string /*\ XPM\ */ X pixmap image text
+
+# Utah Raster Toolkit RLE images (janl@ifi.uio.no)
0 beshort 0xcc52 Utah Raster Toolkit RLE
->2 beshort >0 lower left corner: %d
->4 beshort >0 lower right corner: %d
->6 beshort >0 %d x
->8 beshort >0 %d
->10 byte&0x1 =0x1 CLEARFIRST
->10 byte&0x2 =0x2 NO_BACKGROUND
->10 byte&0x4 =0x4 ALPHA
->10 byte&0x8 =0x8 COMMENT
->11 byte >0 %d colour channels
->12 byte >0 %d bits per pixel
+>6 beshort >0 \b, %d x
+>8 beshort >0 %d,
+>2 beshort >0 lower left corner: %d,
+>4 beshort >0 lower right corner: %d,
+>10 byte&0x1 =0x1 CLEARFIRST,
+>10 byte&0x2 =0x2 NO_BACKGROUND,
+>10 byte&0x4 =0x4 ALPHA,
+>10 byte&0x8 =0x8 COMMENT,
+>11 byte >0 %d colour channels,
+>12 byte >0 %d bits per pixel,
>13 byte >0 %d colour map channels
-#
-# RLE images (Disaster prone simpleton, m91dps@ecs.ox.ac.uk)
-# Here's a magic file entry for rle images. 24-bit images tend to produce
-# foo.rle size 42x42, 3 comps each 8 bits
-# (for arbitary, prossibly different values of 42).
-# freely redistribuable under the GPL
-# [GRR: which endianness? big?]
-0 short 0xcc55 RLE image data
->6 short >0 %d x
->8 short >0 %d,
->2 short >0 x offset by %d,
->4 short >0 y offset by %d,
->11 byte =0 colour map
->11 byte >1 %d comps each
->12 byte =1 1 bit
->12 byte >1 %d bits
-
-# FBM images, culled from xli source (d. p. simpleton, m91dps@ecs.ox.ac.uk)
-0 string %bitmap fbm image data
# image file format (Robert Potter, potter@cs.rochester.edu)
0 string Imagefile\ version- iff image data
# Sun rasterfiles, from Daniel Quinlan (quinlan@yggdrasil.com)
#
-# XXX - Does the Sun 386i use the same byte order?
-#
0 belong 0x59a66a95 Sun raster image
>4 belong >0 %d x
>8 belong >0 %d,
# this doesn't impart much useful information (or does it?)
#>28 belong >0 colormap is %d bytes long
-# Daniel Quinlan (quinlan@yggdrasil.com) -- from an SGI machine
-0 string IT01 FIT image file
->4 belong x (%d x
->8 belong x %d x
->12 belong x %d)
+# SGI image file format, from Daniel Quinlan (quinlan@yggdrasil.com)
+# file://sgi.com/graphics/SGIIMAGESPEC
+0 beshort 474 SGI image
+#>2 byte 0 \b, verbatim
+>2 byte 1 \b, RLE
+#>3 byte 1 \b, normal precision
+>3 byte 2 \b, high precision
+>4 beshort x \b, %d-D
+>6 beshort x \b, %d x
+>8 beshort x %d
+>10 beshort x \b, %d channel
+>10 beshort !1 \bs
+>80 string >0 \b, "%s"
+
+0 string IT01 FIT image file
+>4 belong x \b, %d x
+>8 belong x %d x
+>12 belong x %d
#
-0 string IT02 FIT image file
->4 belong x (%d x
->8 belong x %d x
->12 belong x %d)
+0 string IT02 FIT image file
+>4 belong x \b, %d x
+>8 belong x %d x
+>12 belong x %d
#
-2048 string PCD_IPI Kodak Photo CD image pack file
-0 string PCD_OPA Kodak Photo CD overview pack file
+2048 string PCD_IPI Kodak Photo CD image pack file
+0 string PCD_OPA Kodak Photo CD overview pack file
-# Jeff Uphoff <juphoff@tarsier.cv.nrao.edu>
+# FITS format. Jeff Uphoff <juphoff@tarsier.cv.nrao.edu>
# FITS is the Flexible Image Transport System, the de facto standard for
# data and image transfer, storage, etc., for the astronomical community.
-# FITS format.
0 string SIMPLE\ \ = FITS
->107 string -32 32 bits per pixel, IEEE big endian float
->107 string \ 32 32 bits per pixel, unsigned integer
->108 string 16 16 bits per pixel, unsigned integer
->109 string 8 8 bits per pixel, unsigned integer
+>107 string -32 \b, 32 bits per pixel, IEEE big-endian float
+>107 string \ 32 \b, 32 bits per pixel, unsigned integer
+>108 string 16 \b, 16 bits per pixel, unsigned integer
+>109 string 8 \b, 8 bits per pixel, unsigned integer
# intel: file(1) magic for x86 Unix
#
# Various flavors of x86 UNIX executable/object (other than Xenix, which
-# is in "microsoft"). DOS is in "ms-dos"; the ambitious soul can do
+# is in "microsoft"). DOS is in "msdos"; the ambitious soul can do
# Windows as well.
#
# Windows NT belongs elsewhere, as you need x86 and MIPS and Alpha and
0 string =<!OPS Interleaf document text
>5 string ,\ Version\ (version
>>14 string >\0 %s)
-
#------------------------------------------------------------------------------
# ispell: file(1) magic for ispell
#
-# XXX - byte order?
+# Ispell 3.0 has a magic of 0x9601 and ispell 3.1 has 0x9602. This magic
+# will match 0x9600 through 0x9603 in *both* little endian and big endian.
+# (No other current magic entries collide.)
#
-0 short 0xffff9601 ispell hash file
->2 short 0x00 - 8-bit, no capitalization, 26 flags
->2 short 0x01 - 7-bit, no capitalization, 26 flags
->2 short 0x02 - 8-bit, capitalization, 26 flags
->2 short 0x03 - 7-bit, capitalization, 26 flags
->2 short 0x04 - 8-bit, no capitalization, 52 flags
->2 short 0x05 - 7-bit, no capitalization, 52 flags
->2 short 0x06 - 8-bit, capitalization, 52 flags
->2 short 0x07 - 7-bit, capitalization, 52 flags
->2 short 0x08 - 8-bit, no capitalization, 128 flags
->2 short 0x09 - 7-bit, no capitalization, 128 flags
->2 short 0x0A - 8-bit, capitalization, 128 flags
->2 short 0x0B - 7-bit, capitalization, 128 flags
->2 short 0x0C - 8-bit, no capitalization, 256 flags
->2 short 0x0D - 7-bit, no capitalization, 256 flags
->2 short 0x0E - 8-bit, capitalization, 256 flags
->2 short 0x0F - 7-bit, capitalization, 256 flags
->4 short >0 and %d string characters
+# Updated by Daniel Quinlan (quinlan@yggdrasil.com)
+#
+0 leshort&0xFFFC 0x9600 little endian ispell
+>0 byte 0 hash file (?),
+>0 byte 1 3.0 hash file,
+>0 byte 2 3.1 hash file,
+>0 byte 3 hash file (?),
+>2 leshort 0x00 8-bit, no capitalization, 26 flags
+>2 leshort 0x01 7-bit, no capitalization, 26 flags
+>2 leshort 0x02 8-bit, capitalization, 26 flags
+>2 leshort 0x03 7-bit, capitalization, 26 flags
+>2 leshort 0x04 8-bit, no capitalization, 52 flags
+>2 leshort 0x05 7-bit, no capitalization, 52 flags
+>2 leshort 0x06 8-bit, capitalization, 52 flags
+>2 leshort 0x07 7-bit, capitalization, 52 flags
+>2 leshort 0x08 8-bit, no capitalization, 128 flags
+>2 leshort 0x09 7-bit, no capitalization, 128 flags
+>2 leshort 0x0A 8-bit, capitalization, 128 flags
+>2 leshort 0x0B 7-bit, capitalization, 128 flags
+>2 leshort 0x0C 8-bit, no capitalization, 256 flags
+>2 leshort 0x0D 7-bit, no capitalization, 256 flags
+>2 leshort 0x0E 8-bit, capitalization, 256 flags
+>2 leshort 0x0F 7-bit, capitalization, 256 flags
+>4 leshort >0 and %d string characters
+0 beshort&0xFFFC 0x9600 big endian ispell
+>1 byte 0 hash file (?),
+>1 byte 1 3.0 hash file,
+>1 byte 2 3.1 hash file,
+>1 byte 3 hash file (?),
+>2 beshort 0x00 8-bit, no capitalization, 26 flags
+>2 beshort 0x01 7-bit, no capitalization, 26 flags
+>2 beshort 0x02 8-bit, capitalization, 26 flags
+>2 beshort 0x03 7-bit, capitalization, 26 flags
+>2 beshort 0x04 8-bit, no capitalization, 52 flags
+>2 beshort 0x05 7-bit, no capitalization, 52 flags
+>2 beshort 0x06 8-bit, capitalization, 52 flags
+>2 beshort 0x07 7-bit, capitalization, 52 flags
+>2 beshort 0x08 8-bit, no capitalization, 128 flags
+>2 beshort 0x09 7-bit, no capitalization, 128 flags
+>2 beshort 0x0A 8-bit, capitalization, 128 flags
+>2 beshort 0x0B 7-bit, capitalization, 128 flags
+>2 beshort 0x0C 8-bit, no capitalization, 256 flags
+>2 beshort 0x0D 7-bit, no capitalization, 256 flags
+>2 beshort 0x0E 8-bit, capitalization, 256 flags
+>2 beshort 0x0F 7-bit, capitalization, 256 flags
+>4 beshort >0 and %d string characters
0 string KarmaRHD Version Karma Data Structure Version
>16 long x %lu
-
# lex: file(1) magic for lex
#
# derived empirically, your offsets may vary!
-53 string yyprevious c program text (from lex)
+53 string yyprevious C program text (from lex)
>3 string >\0 for %s
# C program text from GNU flex, from Daniel Quinlan <quinlan@yggdrasil.com>
21 string generated\ by\ flex C program text (from flex)
#------------------------------------------------------------------------------
# linux: file(1) magic for Linux files
#
-# Values for Linux/i386 binaries, from Rik Faith <faith@cs.unc.edu>,
-# Peter Tobias <tobias@server.et-inf.fho-emden.de>, and Daniel Quinlan
-# <quinlan@yggdrasil.com>
+# Values for Linux/i386 binaries, from Daniel Quinlan <quinlan@yggdrasil.com>
+# The following basic Linux magic is useful for reference, but using
+# "long" magic is a better practice in order to avoid collisions.
+#
+# 2 leshort 100 Linux/i386
+# >0 leshort 0407 impure executable (OMAGIC)
+# >0 leshort 0410 pure executable (NMAGIC)
+# >0 leshort 0413 demand-paged executable (ZMAGIC)
+# >0 leshort 0314 demand-paged executable (QMAGIC)
+#
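+# Worked example (a sketch of the arithmetic, reading offset 0 as one
+# little-endian 32-bit value): the short at offset 0 becomes the low half
+# and the short at offset 2 becomes the high half, so
+#   (100 << 16) | 0407 = (0x0064 << 16) | 0x0107 = 0x00640107   (OMAGIC)
+#   (100 << 16) | 0314 = (0x0064 << 16) | 0x00cc = 0x006400cc   (QMAGIC)
+#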
+0 lelong 0x00640107 Linux/i386 impure executable (OMAGIC)
+>16 lelong 0 - stripped
+0 lelong 0x00640108 Linux/i386 pure executable (NMAGIC)
+>16 lelong 0 - stripped
+0 lelong 0x0064010b Linux/i386 demand-paged executable (ZMAGIC)
+>16 lelong 0 - stripped
+0 lelong 0x006400cc Linux/i386 demand-paged executable (QMAGIC)
+>16 lelong 0 - stripped
#
-2 leshort 100 Linux/i386
->0 leshort 0407 impure executable (OMAGIC)
->0 leshort 0410 pure executable (NMAGIC)
->0 leshort 0413 demand-paged executable (ZMAGIC)
->0 leshort 0314 demand-paged executable (QMAGIC)
->16 lelong >0 not stripped
->16 lelong 0 stripped
->0 string Jump jump
-# object files
-# first entry is absolutely correct, but may conflict with PDP-11 executable
-#0 leshort 0407 Linux/i386 object file
0 string \007\001\000 Linux/i386 object file
->20 long >0x1020 - DLL library
+>20 lelong >0x1020 - DLL library
# message catalogs, from Mitchum DSouza <m.dsouza@mrc-apu.cam.ac.uk>
0 string *nazgul* compiled message catalog
->8 long >0 - version %ld
+>8 lelong >0 - version %ld
# core dump file, from Bill Reynolds <bill@goshawk.lanl.gov>
216 lelong 0421 Linux/i386 core file
>220 string >\0 of '%s'
+>200 lelong >0 (signal %d)
#
# LILO boot/chain loaders, from Daniel Quinlan <quinlan@yggdrasil.com>
# this can be overridden by the DOS executable (COM) entry
-# XXX - moved to "dos"
+2 string LILO Linux/i386 LILO boot/chain loader
+#
# Debian Packages, from Peter Tobias <tobias@server.et-inf.fho-emden.de>
0 string 0.9
>8 byte 0x0a Debian Binary Package -
>>3 byte >0 created by dpkg 0.9%c
>>4 byte >0 pl%c
# PSF fonts, from H. Peter Anvin <hpa@yggdrasil.com>
-0 leshort 0x0436 Pc Screen Font data
->2 byte 0 - 256 characters, no directory
->2 byte 1 - 512 characters, no directory
->2 byte 2 - 256 characters, Unicode directory
->2 byte 3 - 512 characters, Unicode directory
->3 byte >0 - 8x%d
+0 leshort 0x0436 Linux/i386 PC Screen Font data
+>2 byte 0 - 256 characters, no directory,
+>2 byte 1 - 512 characters, no directory,
+>2 byte 2 - 256 characters, Unicode directory,
+>2 byte 3 - 512 characters, Unicode directory,
+>3 byte >0 8x%d
# Linux swap file, from Daniel Quinlan <quinlan@yggdrasil.com>
4086 string SWAP-SPACE Linux/i386 swap file
0 string Xref: news text
0 string From: news or mail text
0 string Article saved news text
+0 string BABYL Emacs RMAIL text
#------------------------------------------------------------------------------
-# dos: file(1) magic for MS-DOS files (formerly "ms-dos")
-#
-# These must come before the Linux/i386 entries, with the exception of
-# Linux LILO images.
-#
-# LILO boot/chain loaders, from Daniel Quinlan <quinlan@yggdrasil.com>
-# this can be overridden by the DOS executable (COM) entry
-# XXX - this was moved from "linux"
-2 string LILO LILO boot/chain loader
+# msdos: file(1) magic for MS-DOS files
#
+
# .EXE formats (Greg Roelofs, newt@uchicago.edu)
-# [GRR: some company sells a self-extractor/displayer for image data(!)]
#
0 string MZ MS-DOS executable (EXE)
->24 string @ (OS/2 or Windows format)
->1638 string -lh5- (LHa SFX archive v2.13S)
->7195 string Rar! (RAR self-extracting archive)
+>24 string @ \b, OS/2 or Windows
+>1638 string -lh5- \b, LHa SFX archive v2.13S
+>7195 string Rar! \b, RAR self-extracting archive
#
# [GRR 950118: file 3.15 has a buffer-size limitation; offsets bigger than
# 8161 bytes are ignored. To make the following entries work, increase
# HOWMANY in file.h to 32K at least, and maybe to 70K or more for OS/2,
# NT/Win32 and VMS.]
+# [GRR: some company sells a self-extractor/displayer for image data(!)]
+#
+>11696 string PK\003\004 \b, PKZIP SFX archive v1.1
+>13297 string PK\003\004 \b, PKZIP SFX archive v1.93a
+>15588 string PK\003\004 \b, PKZIP2 SFX archive v1.09
+>15770 string PK\003\004 \b, PKZIP SFX archive v2.04g
+>28374 string PK\003\004 \b, PKZIP2 SFX archive v1.02
+#
+# Info-ZIP self-extractors
+# these are the DOS versions:
+>25115 string PK\003\004 \b, Info-ZIP SFX archive v5.12
+>26331 string PK\003\004 \b, Info-ZIP SFX archive v5.12 w/decryption
+# these are the OS/2 versions (OS/2 is flagged above):
+>47031 string PK\003\004 \b, Info-ZIP SFX archive v5.12
+>49845 string PK\003\004 \b, Info-ZIP SFX archive v5.12 w/decryption
+# this is the NT/Win32 version:
+>69120 string PK\003\004 \b, Info-ZIP NT SFX archive v5.12 w/decryption
#
->13297 string PK\003\004 (PKZIP SFX archive v1.93a)
->15770 string PK\003\004 (PKZIP SFX archive v2.04g)
-# these are the DOS versions:
->25115 string PK\003\004 (Info-ZIP SFX archive v5.12)
->26331 string PK\003\004 (Info-ZIP SFX archive v5.12 w/decryption)
-# these are the OS/2 versions (OS/2 is flagged above):
->47031 string PK\003\004 (Info-ZIP SFX archive v5.12)
->49845 string PK\003\004 (Info-ZIP SFX archive v5.12 w/decryption)
-# this is the NT/Win32 version:
->69120 string PK\003\004 (Info-ZIP NT SFX archive v5.12 w/decryption)
+# TELVOX Teleinformatica CODEC self-extractor for OS/2:
+>49801 string \x79\xff\x80\xff\x76\xff \b, CODEC archive v3.21
+>>49824 leshort =1 \b, 1 file
+>>49824 leshort >1 \b, %u files
+
+# .COM formats (Daniel Quinlan, quinlan@yggdrasil.com)
+# Uncommenting only the first two lines will cover about 2/3 of COM files,
+# but it isn't feasible to match all COM files since there must be at least
+# two dozen different one-byte "magics".
+#0 byte 0xe9 MS-DOS executable (COM)
+#0 byte 0x8c MS-DOS executable (COM)
+# 0xeb conflicts with "sequent" magic
+#0 byte 0xeb MS-DOS executable (COM)
+#0 byte 0xb8 MS-DOS executable (COM)
# miscellaneous formats
0 string LZ MS-DOS executable (built-in)
-0 byte 0xe9 MS-DOS executable (COM)
-0 byte 0xeb MS-DOS executable (COM)
-0 byte 0xf0 MS-DOS program library
+#0 byte 0xf0 MS-DOS program library data
+#
+
+# Popular applications
+2080 string Microsoft\ Word\ 6.0\ Document %s
+#
+0 belong 0x31be0000 Microsoft Word Document
+#
+2080 string Microsoft\ Excel\ 5.0\ Worksheet %s
+#
+0 belong 0x00001a00 Lotus 1-2-3
+>4 belong 0x00100400 wk3 document
+>4 belong 0x02100400 wk4 document
+>4 belong 0x07800100 fm3 or fmb document
+>4 belong 0x07800000 fm3 or fmb document
+#
+0 belong 0x00000200 Lotus 1-2-3
+>4 belong 0x06040600 wk1 document
+>4 belong 0x06800200 fmt document
+
#------------------------------------------------------------------------------
-# mirage: file(1) magic for NetBSD executables
+# netbsd: file(1) magic for NetBSD objects
#
# All new-style magic numbers are in network byte order.
#
#------------------------------------------------------------------------------
-# compress: file(1) magic for Pyramids
+# pyramid: file(1) magic for Pyramids
#
# XXX - byte order?
#
#------------------------------------------------------------------------------
-# rtf: file(1) magic for RTF (Rich text format)
+# rtf: file(1) magic for Rich Text Format (RTF)
#
-# This information was gleaned from the version 1 documentation by
-# D.P.Simpson@dcs.warwick.ac.uk (Duncan P Simpson)
+# Duncan P. Simpson, D.P.Simpson@dcs.warwick.ac.uk
#
-0 string {\\rtf Rich Text Format data (version
->5 byte x %c,
->6 string \\mac Apple Macintosh characters)
->6 string \\ansi ANSI characters)
->6 string \\pc IBM PC characters)
->6 string \\pca IBM PS/2 characters)
+0 string {\\rtf Rich Text Format data,
+>5 byte x version %c,
+>6 string \\ansi ANSI
+>6 string \\mac Apple Macintosh
+>6 string \\pc IBM PC, code page 437
+>6 string \\pca IBM PS/2, code page 850
# is a checksum that could (presumably) have any leading digit,
# and we don't have regular expression matching yet.
# Hence the following official kludge:
-8 string \001s\ SCCS archive.
+8 string \001s\ SCCS archive data
0 string SGIAUDIT SGI Audit file
>8 byte x - version %d
>9 byte x \b.%ld
-#
-0 beshort 000732 SGI imagelib image
->6 beshort x (%d x
->8 beshort x %d)
-0 beshort 0155001 SGI imagelib image byte-swapped
-0 beshort 017436 packed data
-0 beshort 017037 packed data (byte swapped)
# Are these three SGI-based file types or general ones?
0 string WNGZWZSC Wingz compiled script
0 string WNGZWZSS Wingz spreadsheet
-
#------------------------------------------------------------------------------
-# sgml: file(1) magic for Standard(?) Generalized Mark-up Language
-#
-# $Id: sgml,v 1.5 1995/03/25 22:04:56 christos Exp $
-# SGML goop, mostly from rph@sq.
-0 string \<!DOCTYPE Exported SGML document
-0 string \<!doctype Exported SGML document
-0 string \<!SUBDOC Exported SGML subdocument
-0 string \<!subdoc Exported SGML subdocument
-0 string \<!-- Exported SGML document
+# sgml: file(1) magic for Standard Generalized Markup Language
+
+# HyperText Markup Language (HTML) is an SGML document type,
+# from Daniel Quinlan (quinlan@yggdrasil.com)
+0 string \<!DOCTYPE\ HTML HTML document text
+0 string \<!doctype\ html HTML document text
+0 string \<HEAD HTML document text
+0 string \<head HTML document text
+0 string \<TITLE HTML document text
+0 string \<title HTML document text
+0 string \<html HTML document text
+0 string \<HTML HTML document text
+
+# SGML, mostly from rph@sq
+0 string \<!DOCTYPE exported SGML document text
+0 string \<!doctype exported SGML document text
+0 string \<!SUBDOC exported SGML subdocument text
+0 string \<!subdoc exported SGML subdocument text
+0 string \<!-- exported SGML document text
# breaking them apart and reading the data. The following patterns
# match most *.tfm files generated by METAFONT or afm2tfm.
2 string \000\021 TeX font metric data
+>33 string >\0 (%s)
2 string \000\022 TeX font metric data
>33 string >\0 (%s)
0 string \\input\ texinfo Texinfo source text
0 string This\ is\ Info\ file GNU Info text
-# TeX document additions from Daniel Quinlan, quinlan@yggdrasil.com
-0 string \\input TeX or TeX-derivative document text
-0 string \\chapter TeX or TeX-derivative document text
-0 string \\documentstyle TeX or TeX-derivative document text
-0 string \\section TeX or TeX-derivative document text
-0 string \\setlength TeX or TeX-derivative document text
+# TeX documents, from Daniel Quinlan (quinlan@yggdrasil.com)
+0 string \\input TeX document text
+0 string \\section LaTeX document text
+0 string \\setlength LaTeX document text
+0 string \\documentstyle LaTeX document text
+0 string \\chapter LaTeX document text
0 string \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\3\0 timezone data
0 string \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0 timezone data
0 string \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0 timezone data
+0 string \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\6\0 timezone data
#------------------------------------------------------------------------------
-# uuencoded: file(1) magic for ASCII-encoded files
+# uuencode: file(1) magic for ASCII-encoded files
#
-0 string begin uuencoded mail text
-# Btoa(1) is an alternative to uuencode that requires less space.
+
+# GRR: the first line of xxencoded files is identical to that in uuencoded
+# files, but the first character in most subsequent lines is 'h' instead of
+# 'M'. (xxencoding uses lowercase letters in place of most of uuencode's
+# punctuation and survives BITNET gateways better.) If regular expressions
+# were supported, this entry could possibly be split into two with
+# "begin\040\.\*\012M" or "begin\040\.\*\012h" (where \. and \* are REs).
+0 string begin\040 uuencoded or xxencoded text
+
+# btoa(1) is an alternative to uuencode that requires less space.
0 string xbtoa\ Begin btoa'd text
+
+# ship(1) is another, much cooler alternative to uuencode.
+# Greg Roelofs, newt@uchicago.edu
+0 string $\012ship ship'd binary text
+
+# bencode(8) is used to encode compressed news batches (Bnews/Cnews only?)
+# Greg Roelofs, newt@uchicago.edu
+0 string Decode\ the\ following\ with\ bdeco bencoded News text
+
+# GRR: is MIME BASE64 encoding handled somewhere?
# and deleted if they duplicate other entries.
#
0 short 0610 Perkin-Elmer executable
+# AMD 29K
+0 beshort 0572 amd 29k coff noprebar executable
+0 beshort 01572 amd 29k coff prebar executable
+0 beshort 0160007 amd 29k coff archive
+# Cray
+6 beshort 0407 unicos (cray) executable
# vms: file(1) magic for VMS executables (experimental)
#
# VMS .exe formats, both VAX and AXP (Greg Roelofs, newt@uchicago.edu)
-# This file should be renamed to "vms" eventually...
# GRR 950122: I'm just guessing on these, based on inspection of the headers
# of three executables each for Alpha and VAX architectures. The VAX files
# 00000 b0 00 30 00 44 00 60 00 00 00 00 00 30 32 30 35 ..0.D.`.....0205
# 00010 01 01 00 00 ff ff ff ff ff ff ff ff 00 00 00 00 ................
#
-0 string \xb0\x00\x30\x00 VMS VAX executable
->44032 string PK\003\004 (Info-ZIP SFX archive v5.12 w/decryption)
+0 string \xb0\0\x30\0 VMS VAX executable
+>44032 string PK\003\004 \b, Info-ZIP SFX archive v5.12 w/decryption
#
# The AXP files all looked like this, except that the byte at offset 0x22
# was 06 in some of them and 07 in others:
# 00040 00 00 00 00 ff ff ff ff ff ff ff ff 02 00 00 00 ................
#
0 belong 0x03000000 VMS Alpha executable
->75264 string PK\003\004 (Info-ZIP SFX archive v5.12 w/decryption)
+>75264 string PK\003\004 \b, Info-ZIP SFX archive v5.12 w/decryption
0 long 0xe808 pure object file (z8000 a.out)
0 long 0xe809 separate object file (z8000 a.out)
0 long 0xe805 overlay object file (z8000 a.out)
-