From: Christos Zoulas
Date: Wed, 9 Jan 2013 22:37:23 +0000 (+0000)
Subject: From Guy Harris:
X-Git-Tag: FILE5_13~38
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=92be55fef9388d6f82fc181ca3efdcac98e5edb2;p=file

From Guy Harris:

There are several entries in the magic database for files that begin
with a 4-byte big-endian or little-endian octal 407, 410, and 413,
because several different flavors of UN*X used, at least in their
earliest days, the 32-bit a.out format with the standard magic numbers.

I've removed them and placed entries in a new "aout" file, and just
labeled them as "little-endian 32-bit" and "big-endian 32-bit"
executables, so that, for example, UNIX/32V VAX binaries aren't
misidentified as 386 binaries, or *vice versa*; unless we look at the
actual code, or find some other way of distinguishing between them,
there's no way to identify those as anything other than little-endian
32-bit binaries.

I also commented out some entries in "unknown" that would have matched
the same files that other entries would also have matched. I've also
added "a.out" to the description strings for several a.out file formats.

As "mips" contained some of those entries, as a result of being a bit of
a mix between MIPS stuff and SGI stuff, I also moved all the stuff that
has nothing to do with the MIPS architecture into "sgi". (Yes, SGI did
own MIPS Technologies for a while, but it didn't do so originally and
doesn't do so now, and the stuff that got moved has nothing to do with
the MIPS architecture.)
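
As a rough illustration of what the merged a.out entries test, here is a
minimal Python sketch, not part of the patch and not how file(1) is
implemented. The function name `classify_aout` and the description table
are hypothetical, and the labels are simplified. It reads the 32-bit
magic number in both byte orders and, as the entries do, treats a
positive 32-bit value at offset 16 as "not stripped".

```python
import struct
from typing import Optional

# Octal a.out magic numbers named in the commit message.
# The description strings are simplified, illustrative labels.
AOUT_KINDS = {
    0o407: "executable",
    0o410: "pure executable",
    0o413: "demand paged executable",
}


def classify_aout(header: bytes) -> Optional[str]:
    """Return a generic description of a 32-bit a.out header, or None."""
    if len(header) < 4:
        return None
    for prefix, endian in (("<", "little-endian"), (">", "big-endian")):
        # The magic number is a 4-byte integer at offset 0.
        (magic,) = struct.unpack_from(prefix + "I", header, 0)
        kind = AOUT_KINDS.get(magic)
        if kind is None:
            continue
        desc = f"a.out {endian} 32-bit {kind}"
        # Mirrors the ">16 lelong >0" and ">16 belong >0" tests.
        if len(header) >= 20:
            (value,) = struct.unpack_from(prefix + "i", header, 16)
            if value > 0:
                desc += " not stripped"
        return desc
    return None
```

Nothing in the magic number says which machine the code is for, which is
why the result names only the byte order and word size.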

In addition, I cleaned up the "ar" archive file entries, removing some
duplicates, fixing some "random" archive checks (they were using an
offset of 8 for older archive formats, but in those older archive
formats the archive entries start at an earlier offset, so the name of
the first entry is at an offset of 2 or 4), and adding an entry for the
entry Apple's ranlib adds (it has a longer name, and OS X uses the BSD
flavor of the "portable" archive format, so the name is at an offset of
68 rather than 8). I also added a comment to indicate what I suspect
"thin" archives are (a type of archive produced by GNU ar).
---

diff --git a/magic/Magdir/aout b/magic/Magdir/aout
new file mode 100644
index 00000000..ba9630a0
--- /dev/null
+++ b/magic/Magdir/aout
@@ -0,0 +1,46 @@
+
+#------------------------------------------------------------------------------
+# $File$
+# aout: file(1) magic for a.out executable/object/etc entries that
+# handle executables on multiple platforms.
+#
+
+#
+# Little-endian 32-bit-int a.out, merged from bsdi (for BSD/OS, from
+# BSDI), netbsd, and vax (for UNIX/32V and BSD)
+#
+# XXX - is there anything we can look at to distinguish BSD/OS 386 from
+# NetBSD 386 from various VAX binaries? The BSD/OS shared library flag
+# works only for binaries using shared libraries. Grabbing the entry
+# point from the a.out header, using it to find the first code executed
+# in the program, and looking at that might help.
+#
+0 lelong 0407 a.out little-endian 32-bit executable
+>16 lelong >0 not stripped
+>32 byte 0x6a (uses BSD/OS shared libs)
+
+0 lelong 0410 a.out little-endian 32-bit pure executable
+>16 lelong >0 not stripped
+>32 byte 0x6a (uses BSD/OS shared libs)
+
+0 lelong 0413 a.out little-endian 32-bit demand paged pure executable
+>16 lelong >0 not stripped
+>32 byte 0x6a (uses BSD/OS shared libs)
+
+#
+# Big-endian 32-bit-int a.out, merged from sun (for old 68010 SunOS a.out),
+# mips (for old 68020(!) SGI a.out), and netbsd (for old big-endian a.out).
+#
+# XXX - is there anything we can look at to distinguish old SunOS 68010
+# from old 68020 IRIX from old NetBSD? Again, I guess we could look at
+# the first instruction or instructions in the program.
+#
+0 belong 0407 a.out big-endian 32-bit executable
+>16 belong >0 not stripped
+
+0 belong 0410 a.out big-endian 32-bit pure executable
+>16 belong >0 not stripped
+
+0 belong 0413 a.out big-endian 32-bit demand paged executable
+>16 belong >0 not stripped
+
diff --git a/magic/Magdir/archive b/magic/Magdir/archive
index c399156b..103d36ee 100644
--- a/magic/Magdir/archive
+++ b/magic/Magdir/archive
@@ -1,5 +1,5 @@
 #------------------------------------------------------------------------------
-# $File: archive,v 1.73 2012/11/09 22:59:30 christos Exp $
+# $File: archive,v 1.74 2013/01/08 17:02:50 christos Exp $
 # archive: file(1) magic for archive formats (see also "msdos" for self-
 # extracting compressed archives)
 #
@@ -36,7 +36,60 @@
 0 string 070701 ASCII cpio archive (SVR4 with no CRC)
 0 string 070702 ASCII cpio archive (SVR4 with CRC)
 
-# Debian package (needs to go before regular portable archives)
+#
+# Various archive formats used by various versions of the "ar"
+# command.
+#
+
+#
+# Original UNIX archive formats.
+# They were written with binary values in host byte order, and
+# the magic number was a host "int", which might have been 16 bits
+# or 32 bits. We don't say "PDP-11" or "VAX", as there might have
+# been ports to little-endian 16-bit-int or 32-bit-int platforms
+# (x86?) using some of those formats; if none existed, feel free
+# to use "PDP-11" for little-endian 16-bit and "VAX" for little-endian
+# 32-bit. There might have been big-endian ports of that sort as
+# well.
+#
+0 leshort 0177555 very old 16-bit-int little-endian archive
+0 beshort 0177555 very old 16-bit-int big-endian archive
+0 lelong 0177555 very old 32-bit-int little-endian archive
+0 belong 0177555 very old 32-bit-int big-endian archive
+
+0 leshort 0177545 old 16-bit-int little-endian archive
+>2 string __.SYMDEF random library
+0 beshort 0177545 old 16-bit-int big-endian archive
+>2 string __.SYMDEF random library
+0 lelong 0177545 old 32-bit-int little-endian archive
+>4 string __.SYMDEF random library
+0 belong 0177545 old 32-bit-int big-endian archive
+>4 string __.SYMDEF random library
+
+#
+# From "pdp" (but why a 4-byte quantity?)
+#
+0 lelong 0x39bed PDP-11 old archive
+0 lelong 0x39bee PDP-11 4.0 archive
+
+#
+# XXX - what flavor of APL used this, and was it a variant of
+# some ar archive format? It's similar to, but not the same
+# as, the APL workspace magic numbers in pdp.
+#
+0 long 0100554 apl workspace
+
+#
+# System V Release 1 portable(?) archive format.
+#
+0 string = System V Release 1 ar archive
+!:mime application/x-archive
+
+#
+# Debian package; it's in the portable archive format, and needs to go
+# before the entry for regular portable archives, as it's recognized as
+# a portable archive whose first member has a name beginning with
+# "debian".
 #
 0 string =!\ndebian
 !:mime application/x-debian-package
@@ -53,23 +106,14 @@
 #>84 string gz \b, uses gzip compression
 #>136 ledate x created: %s
 
-0 string =!\n thin archive with
->68 belong 0 no symbol entries
->68 belong 1 %d symbol entry
->68 belong >1 %d symbol entries
-
-# other archives
-0 long 0177555 very old archive
-0 short 0177555 very old PDP-11 archive
-0 long 0177545 old archive
-0 short 0177545 old PDP-11 archive
-0 long 0100554 apl workspace
-0 string = archive
-!:mime application/x-archive
-
-# MIPS archive (needs to go before regular portable archives)
+#
+# MIPS archive; they're in the portable archive format, and need to go
+# before the entry for regular portable archives, as it's recognized as
+# a portable archive whose first member has a name beginning with
+# "__________E".
 #
 0 string =!\n__________E MIPS archive
+!:mime application/x-archive
 >20 string U with MIPS Ucode members
 >21 string L with MIPSEL members
 >21 string B with MIPSEB members
@@ -80,56 +124,26 @@
 0 search/1 -h- Software Tools format archive text
 
 #
-# XXX - why are there multiple thingies? Note that 0x213c6172 is
-# "! current ar archive
-# 0 long 0x213c6172 archive file
-#
-# and for SVR1 archives, we have:
-#
-# 0 string \ System V Release 1 ar archive
-# 0 string = archive
-#
-# XXX - did Aegis really store shared libraries, breakpointed modules,
-# and absolute code program modules in the same format as new-style
-# "ar" archives?
+# BSD/SVR2-and-later portable archive formats.
 #
 0 string =! current ar archive
 !:mime application/x-archive
 >8 string __.SYMDEF random library
+>68 string __.SYMDEF\ SORTED random library
 >0 belong =65538 - pre SR9.5
 >0 belong =65539 - post SR9.5
 >0 beshort 2 - object archive
 >0 beshort 3 - shared library module
 >0 beshort 4 - debug break-pointed module
 >0 beshort 5 - absolute code program module
-0 string \ System V Release 1 ar archive
-0 string = archive
-#
-# XXX - from "vax", which appears to collect a bunch of byte-swapped
-# thingies, to help you recognize VAX files on big-endian machines;
-# with "leshort", "lelong", and "string", that's no longer necessary....
-#
-0 belong 0x65ff0000 VAX 3.0 archive
-0 belong 0x3c61723e VAX 5.0 archive
-#
-0 long 0x213c6172 archive file
-0 lelong 0177555 very old VAX archive
-0 leshort 0177555 very old PDP-11 archive
-#
-# XXX - "pdp" claims that 0177545 can have an __.SYMDEF member and thus
-# be a random library (it said 0xff65 rather than 0177545).
-#
-0 lelong 0177545 old VAX archive
->8 string __.SYMDEF random library
-0 leshort 0177545 old PDP-11 archive
->8 string __.SYMDEF random library
+
 #
-# From "pdp" (but why a 4-byte quantity?)
+# "Thin" archive, as can be produced by GNU ar.
 #
-0 lelong 0x39bed PDP-11 old archive
-0 lelong 0x39bee PDP-11 4.0 archive
+0 string =!\n thin archive with
+>68 belong 0 no symbol entries
+>68 belong 1 %d symbol entry
+>68 belong >1 %d symbol entries
 
 # ARC archiver, from Daniel Quinlan (quinlan@yggdrasil.com)
 #
diff --git a/magic/Magdir/bsdi b/magic/Magdir/bsdi
index bce14273..0609a811 100644
--- a/magic/Magdir/bsdi
+++ b/magic/Magdir/bsdi
@@ -1,25 +1,15 @@
 
 #------------------------------------------------------------------------------
-# $File$
+# $File: bsdi,v 1.5 2009/09/19 16:28:08 christos Exp $
 # bsdi: file(1) magic for BSD/OS (from BSDI) objects
+# Some object/executable formats use the same magic numbers as are used
+# in other OSes; those are handled by entries in aout.
 #
 
 0 lelong 0314 386 compact demand paged pure executable
 >16 lelong >0 not stripped
 >32 byte 0x6a (uses shared libs)
 
-0 lelong 0407 386 executable
->16 lelong >0 not stripped
->32 byte 0x6a (uses shared libs)
-
-0 lelong 0410 386 pure executable
->16 lelong >0 not stripped
->32 byte 0x6a (uses shared libs)
-
-0 lelong 0413 386 demand paged pure executable
->16 lelong >0 not stripped
->32 byte 0x6a (uses shared libs)
-
 # same as in SunOS 4.x, except for static shared libraries
 0 belong&077777777 0600413 sparc demand paged
 >0 byte &0x80
diff --git a/magic/Magdir/mips b/magic/Magdir/mips
index c932b7f5..2bab394c 100644
--- a/magic/Magdir/mips
+++ b/magic/Magdir/mips
@@ -1,6 +1,6 @@
 
 #------------------------------------------------------------------------------
-# $File: mips,v 1.6 2010/08/13 16:12:30 christos Exp $
+# $File: mips,v 1.7 2011/05/03 01:44:17 christos Exp $
 # mips: file(1) magic for Silicon Graphics (MIPS, IRIS, IRIX, etc.)
 # Dec Ultrix (MIPS)
 # all of SGI's *current* machines and OSes run in big-endian mode on the
@@ -142,41 +142,3 @@
 #
 0 beshort 0x180 MIPSEB Ucode
 0 beshort 0x182 MIPSEL-BE Ucode
-# 32bit core file
-0 belong 0xdeadadb0 IRIX core dump
->4 belong 1 of
->16 string >\0 '%s'
-# 64bit core file
-0 belong 0xdeadad40 IRIX 64-bit core dump
->4 belong 1 of
->16 string >\0 '%s'
-# N32bit core file
-0 belong 0xbabec0bb IRIX N32 core dump
->4 belong 1 of
->16 string >\0 '%s'
-# New style crash dump file
-0 string \x43\x72\x73\x68\x44\x75\x6d\x70 IRIX vmcore dump of
->36 string >\0 '%s'
-# Trusted IRIX info
-0 string SGIAUDIT SGI Audit file
->8 byte x - version %d
->9 byte x \b.%ld
-#
-0 string WNGZWZSC Wingz compiled script
-0 string WNGZWZSS Wingz spreadsheet
-0 string WNGZWZHP Wingz help file
-#
-0 string #Inventor V IRIS Inventor 1.0 file
-0 string #Inventor V2 Open Inventor 2.0 file
-# GLF is OpenGL stream encoding
-0 string glfHeadMagic(); GLF_TEXT
-4 belong 0x7d000000 GLF_BINARY_LSB_FIRST
-!:strength -30
-4 belong 0x0000007d GLF_BINARY_MSB_FIRST
-!:strength -30
-# GLS is OpenGL stream encoding; GLS is the successor of GLF
-0 string glsBeginGLS( GLS_TEXT
-4 belong 0x10000000 GLS_BINARY_LSB_FIRST
-!:strength -30
-4 belong 0x00000010 GLS_BINARY_MSB_FIRST
-!:strength -30
diff --git a/magic/Magdir/netbsd b/magic/Magdir/netbsd
index 3a741003..15a6b886 100644
--- a/magic/Magdir/netbsd
+++ b/magic/Magdir/netbsd
@@ -1,16 +1,14 @@
 
 #------------------------------------------------------------------------------
-# $File: netbsd,v 1.18 2009/09/19 16:28:11 christos Exp $
+# $File: netbsd,v 1.19 2011/10/31 17:23:34 christos Exp $
 # netbsd: file(1) magic for NetBSD objects
 #
 # All new-style magic numbers are in network byte order.
+# The old-style magic numbers are indistinguishable from the same magic
+# numbers used in other systems, and are handled, for all those systems,
+# in aout.
 #
 
-0 lelong 000000407 a.out NetBSD little-endian object file
->16 lelong >0 not stripped
-0 belong 000000407 a.out NetBSD big-endian object file
->16 belong >0 not stripped
-
 0 belong&0377777777 041400413 a.out NetBSD/i386 demand paged
 >0 byte &0x80
 >>20 lelong <4096 shared library
diff --git a/magic/Magdir/sun b/magic/Magdir/sun
index 0357a00c..821525f4 100644
--- a/magic/Magdir/sun
+++ b/magic/Magdir/sun
@@ -1,12 +1,15 @@
 
 #------------------------------------------------------------------------------
-# $File: sun,v 1.23 2013/01/06 01:09:42 christos Exp $
+# $File: sun,v 1.24 2013/01/08 01:43:18 christos Exp $
 # sun: file(1) magic for Sun machines
 #
 # Values for big-endian Sun (MC680x0, SPARC) binaries on pre-5.x
-# releases. (5.x uses ELF.)
+# releases. (5.x uses ELF.) Entries for executables without an
+# architecture type, used before the 68020-based Sun-3's came out,
+# are in aout, as they're indistinguishable from other big-endian
+# 32-bit a.out files.
 #
-0 belong&077777777 0600413 sparc demand paged
+0 belong&077777777 0600413 a.out SunOS sparc demand paged
 >0 byte &0x80
 >>20 belong <4096 shared library
 >>20 belong =4096 dynamically linked executable
@@ -14,17 +17,17 @@
 >0 byte ^0x80 executable
 >16 belong >0 not stripped
 
-0 belong&077777777 0600410 sparc pure
+0 belong&077777777 0600410 a.out SunOS sparc pure
 >0 byte &0x80 dynamically linked executable
 >0 byte ^0x80 executable
 >16 belong >0 not stripped
 
-0 belong&077777777 0600407 sparc
+0 belong&077777777 0600407 a.out SunOS sparc
 >0 byte &0x80 dynamically linked executable
 >0 byte ^0x80 executable
 >16 belong >0 not stripped
 
-0 belong&077777777 0400413 mc68020 demand paged
+0 belong&077777777 0400413 a.out SunOS mc68020 demand paged
 >0 byte &0x80
 >>20 belong <4096 shared library
 >>20 belong =4096 dynamically linked executable
@@ -32,17 +35,17 @@
 >0 byte ^0x80 executable
 >16 belong >0 not stripped
 
-0 belong&077777777 0400410 mc68020 pure
+0 belong&077777777 0400410 a.out SunOS mc68020 pure
 >0 byte &0x80 dynamically linked executable
 >0 byte ^0x80 executable
 >16 belong >0 not stripped
 
-0 belong&077777777 0400407 mc68020
+0 belong&077777777 0400407 a.out SunOS mc68020
 >0 byte &0x80 dynamically linked executable
 >0 byte ^0x80 executable
 >16 belong >0 not stripped
 
-0 belong&077777777 0200413 mc68010 demand paged
+0 belong&077777777 0200413 a.out SunOS mc68010 demand paged
 >0 byte &0x80
 >>20 belong <4096 shared library
 >>20 belong =4096 dynamically linked executable
@@ -50,24 +53,16 @@
 >0 byte ^0x80 executable
 >16 belong >0 not stripped
 
-0 belong&077777777 0200410 mc68010 pure
+0 belong&077777777 0200410 a.out SunOS mc68010 pure
 >0 byte &0x80 dynamically linked executable
 >0 byte ^0x80 executable
 >16 belong >0 not stripped
 
-0 belong&077777777 0200407 mc68010
+0 belong&077777777 0200407 a.out SunOS mc68010
 >0 byte &0x80 dynamically linked executable
 >0 byte ^0x80 executable
 >16 belong >0 not stripped
 
-# reworked these to avoid anything beginning with zero becoming "old sun-2"
-0 belong 0407 old sun-2 executable
->16 belong >0 not stripped
-0 belong 0410 old sun-2 pure executable
->16 belong >0 not stripped
-0 belong 0413 old sun-2 demand paged executable
->16 belong >0 not stripped
-
 #
 # Core files. "SPARC 4.x BCP" means "core file from a SunOS 4.x SPARC
 # binary executed in compatibility mode under SunOS 5.x".
@@ -144,5 +139,3 @@
 # New format for Sun/Cobalt boot ROMs is annoying, it stores the version code
 # at the very end where file(1) can't get it.
 0 string CRfs COBALT boot rom data (Flat boot rom or file system)
-
-
diff --git a/magic/Magdir/unknown b/magic/Magdir/unknown
index caf842e3..bac7a6e5 100644
--- a/magic/Magdir/unknown
+++ b/magic/Magdir/unknown
@@ -1,35 +1,34 @@
 
 #------------------------------------------------------------------------------
-# $File$
+# $File: unknown,v 1.7 2009/09/19 16:28:13 christos Exp $
 # unknown: file(1) magic for unknown machines
 #
-# XXX - this probably should be pruned, as it'll match PDP-11 and
-# VAX image formats.
-#
-# 0x107 is 0407; 0x108 is 0410; both are PDP-11 (executable and pure,
-# respectively).
-#
-# 0x109 is 0411; that's PDP-11 split I&D, but the PDP-11 version doesn't
-# have the "version %ld", which may be a bogus COFFism (I don't think
-# there ever was COFF for the PDP-11).
+# 0x107 is 0407, 0x108 is 0410, and 0x109 is 0411; those are all PDP-11
+# (executable, pure, and split I&D, respectively), but the PDP-11 version
+# doesn't have the "version %ld", which may be a bogus COFFism (I don't
+# think there was ever COFF for the PDP-11).
 #
 # 0x10B is 0413; that's VAX demand-paged, but this is a short, not a
-# long, as it would be on a VAX.
+# long, as it would be on a VAX. In any case, that could collide with
+# VAX demand-paged files, as the magic number is little-endian on those
+# binaries, so the first 16 bits of the file would contain 0x10B.
+#
+# Therefore, those entries are commented out.
 #
-# 0x10C is 0414 and 0x10E is 416; those *are* unknown.
+# 0x10C is 0414 and 0x10E is 0416; those *are* unknown.
 #
-0 short 0x107 unknown machine executable
->8 short >0 not stripped
->15 byte >0 - version %ld
-0 short 0x108 unknown pure executable
->8 short >0 not stripped
->15 byte >0 - version %ld
-0 short 0x109 PDP-11 separate I&D
->8 short >0 not stripped
->15 byte >0 - version %ld
-0 short 0x10b unknown pure executable
->8 short >0 not stripped
->15 byte >0 - version %ld
+#0 short 0x107 unknown machine executable
+#>8 short >0 not stripped
+#>15 byte >0 - version %ld
+#0 short 0x108 unknown pure executable
+#>8 short >0 not stripped
+#>15 byte >0 - version %ld
+#0 short 0x109 PDP-11 separate I&D
+#>8 short >0 not stripped
+#>15 byte >0 - version %ld
+#0 short 0x10b unknown pure executable
+#>8 short >0 not stripped
+#>15 byte >0 - version %ld
 0 long 0x10c unknown demand paged pure executable
 >16 long >0 not stripped
 0 long 0x10e unknown readable demand paged pure executable
diff --git a/magic/Magdir/vax b/magic/Magdir/vax
index 9df3acfb..303f5171 100644
--- a/magic/Magdir/vax
+++ b/magic/Magdir/vax
@@ -1,30 +1,22 @@
 
 #------------------------------------------------------------------------------
-# $File$
+# $File: vax,v 1.7 2009/09/19 16:28:13 christos Exp $
 # vax: file(1) magic for VAX executable/object and APL workspace
 #
 0 lelong 0101557 VAX single precision APL workspace
 0 lelong 0101556 VAX double precision APL workspace
 
 #
-# VAX a.out (32V, BSD)
+# VAX a.out (BSD; others collide with 386 and other 32-bit little-endian
+# executables, and are handled in aout)
 #
-0 lelong 0407 VAX executable
->16 lelong >0 not stripped
-
-0 lelong 0410 VAX pure executable
->16 lelong >0 not stripped
-
-0 lelong 0413 VAX demand paged pure executable
->16 lelong >0 not stripped
-
-0 lelong 0420 VAX demand paged (first page unmapped) pure executable
+0 lelong 0420 a.out VAX demand paged (first page unmapped) pure executable
 >16 lelong >0 not stripped
 
 #
 # VAX COFF
 #
-# The `versions' should be un-commented if they work for you.
+# The `versions' were commented out, but have been un-commented out.
 # (Was the problem just one of endianness?)
 #
 0 leshort 0570 VAX COFF executable