Christos Zoulas [Wed, 9 Jan 2013 22:37:23 +0000 (22:37 +0000)]
From Guy Harris:
There are several entries in the magic database for files that begin
with a 4-byte big-endian or little-endian octal 407, 410, and 413,
because several different flavors of UN*X used, at least in their
earliest days, the 32-bit a.out format with the standard magic numbers.
I've removed them and placed entries in a new "aout" file, and just
labeled them as "little-endian 32-bit" and "big-endian 32-bit"
executables, so that, for example, UNIX/32V VAX binaries aren't
misidentified as 386 binaries, or *vice versa*; unless we look at the
actual code, or find some other way of distinguishing between them,
there's no way to identify those as anything other than little-endian
32-bit binaries. I also commented out some entries in "unknown" that
would have matched the same files that other entries would also have
matched.
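
As a rough illustration of the generic classification described above, here is a minimal Python sketch. The function name classify_aout and the returned strings are hypothetical and are not the wording used in the "aout" magic file; OMAGIC, NMAGIC and ZMAGIC are the conventional names for octal 407, 410 and 413.

```python
import struct

# Classic a.out magic numbers, in octal as in the text. The labels are the
# conventional names for these values.
AOUT_MAGICS = {0o407: "OMAGIC", 0o410: "NMAGIC", 0o413: "ZMAGIC"}


def classify_aout(header: bytes):
    """Describe a 32-bit a.out header generically, or return None."""
    if len(header) < 4:
        return None
    # Interpret the same four bytes in both byte orders.
    (little,) = struct.unpack("<I", header[:4])
    (big,) = struct.unpack(">I", header[:4])
    if little in AOUT_MAGICS:
        return f"little-endian 32-bit a.out executable ({AOUT_MAGICS[little]})"
    if big in AOUT_MAGICS:
        return f"big-endian 32-bit a.out executable ({AOUT_MAGICS[big]})"
    return None
```

A UNIX/32V VAX binary and an early 386 binary would both come back as "little-endian" here, since the magic number alone carries no more information than that.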
I've also added "a.out" to the description strings for several a.out
file formats.
As "mips" contained some of those entries, as a result of being a bit of
a mix between MIPS stuff and SGI stuff, I also moved all the stuff that
has nothing to do with the MIPS architecture into "sgi". (Yes, SGI did
own MIPS Technologies for a while, but it didn't do so originally and
doesn't do so now, and the stuff that got moved has nothing to do with
the MIPS architecture.)
In addition, I cleaned up the "ar" archive file entries, removing some
duplicates, fixing some "random" archive checks (they were using an
offset of 8 for older archive formats, but in those older archive
formats the archive entries start at an earlier offset, so the name of
the first entry is at an offset of 2 or 4), and adding an entry for the
entry Apple's ranlib adds (it has a longer name, and OS X uses the BSD
flavor of the "portable" archive format, so the name is at an offset of
68 rather than 8). I also added a comment to indicate what I suspect
"thin" archives are (a type of archive produced by GNU ar).
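
The offset of 68 can be read off the portable archive layout: an 8-byte global header ("!<arch>\n"), then a 60-byte member header whose first 16 bytes are the name field, with BSD-style long names ("#1/<length>") stored immediately after that header. The following is a minimal Python sketch under that assumption; first_member_name is a hypothetical helper, not the way file(1) itself performs the check.

```python
AR_MAGIC = b"!<arch>\n"   # 8-byte global header of the portable format
AR_HEADER_LEN = 60        # fixed-size member header
AR_NAME_LEN = 16          # name field at the start of the member header


def first_member_name(data: bytes):
    """Return the name of the first archive member, or None."""
    if not data.startswith(AR_MAGIC):
        return None
    start = len(AR_MAGIC)                       # name field at offset 8
    header = data[start:start + AR_HEADER_LEN]
    if len(header) < AR_HEADER_LEN:
        return None
    field = header[:AR_NAME_LEN].decode("ascii", "replace").rstrip()
    if field.startswith("#1/") and field[3:].isdigit():
        # BSD flavor: the real name follows the header, at 8 + 60 = 68.
        length = int(field[3:])
        name_at = start + AR_HEADER_LEN
        raw = data[name_at:name_at + length]
        return raw.rstrip(b"\0").decode("ascii", "replace")
    return field
```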
Christos Zoulas [Fri, 4 Jan 2013 16:37:54 +0000 (16:37 +0000)]
Need to pass the returnval that the child match determined in the use case.
This broke the elf mime printing, where softmagic returned a non-match although
the child match() actually printed something.
Christos Zoulas [Fri, 12 Oct 2012 16:10:39 +0000 (16:10 +0000)]
From Joerg Jenderek:
Hi,
Two files (TDSK-5120x32b.img and TDSK-5120x64b.img) in the bootsector
directory are misidentified (see output bootsector-5.11.txt).
In reality they are DOS images with a sector size smaller than 512. Because
the smallest DOS sector size is 32, a new test at level 0 searches for the
boot signature 0xAA55 in the range from 32 to 512:
30 search/481 \x55\xAA
This test also succeeds for some zip files. But if the next test for 0xAA55
at offset 0x1FE succeeds
>0x1FE leshort 0xAA55
I get the old examples and print "x86 boot sector".
Alternatively, test for a sector size smaller than 512 at offset 11
>11 uleshort <512
>>(11.s-2) uleshort 0xAA55 x86 boot sector
and look for the boot signature at the end of the sector. If these tests
succeed, also display the "x86 boot sector" text.
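
The two signature checks above can be sketched in Python as follows. This is only an illustration of the logic, not the magic engine itself; has_boot_signature is a hypothetical name, and the lower bound of 32 is taken from the remark about the smallest DOS sector size rather than from the magic lines.

```python
import struct

BOOT_SIGNATURE = b"\x55\xAA"  # 0xAA55 stored as a little-endian short


def has_boot_signature(image: bytes) -> bool:
    """Check for the boot signature at 0x1FE or at the end of a small sector."""
    # Classic case: 512-byte sector, signature at offset 0x1FE.
    if image[0x1FE:0x200] == BOOT_SIGNATURE:
        return True
    if len(image) < 13:
        return False
    # Sector size is the little-endian short at offset 11.
    (sector_size,) = struct.unpack_from("<H", image, 11)
    # Assumption: accept only sizes from 32 up to (not including) 512.
    if 32 <= sector_size < 512:
        # Equivalent of the indirect offset (11.s-2).
        return image[sector_size - 2:sector_size] == BOOT_SIGNATURE
    return False
```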
Because I found bootloader and MBR information only for boot sector sizes
greater than or equal to 512, I kept the old test sequences and only
replaced the "x86 boot sector" string with an empty one, because displaying
this text is now done by the new additional test.
Some steps are needed to get the old look, like "x86 boot sector, YY
Bootloader, code offset 0xnn, OEM-ID ...".
To display that text before the old SYSLINUX MBR and DOS BPB information,
as in the previous file version, a strength of 72 has to be added.
In the current version, a search for the end-of-sector marker 0xAA55 is
done first. If it succeeds, additional information such as the DOS BPB and
MBR type is printed. As a result, some boot sector templates without a boot
signature are identified as "data". Therefore I separated the tests for the
DOS sector from "x86 boot sector".
Furthermore, I have made some minor bug fixes and cosmetic changes.
The jump assembler instruction uses relative addresses, so one has to add 2
to get the real code offset inside the file. The value is a ubyte for the
0xEB instruction, but a uleshort for 0xE9.
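
A minimal Python sketch of that calculation, with the hypothetical name code_offset. The displacement is treated as unsigned, as in the text. Adding 2 matches the two-byte 0xEB form; for 0xE9 this sketch adds 3 instead, on the assumption that the displacement is counted from the end of the three-byte instruction.

```python
def code_offset(sector: bytes):
    """Return the file offset a leading x86 jump points to, or None."""
    if len(sector) < 3:
        return None
    opcode = sector[0]
    if opcode == 0xEB:
        # Short jump: opcode + 8-bit displacement (2 bytes in total).
        return sector[1] + 2
    if opcode == 0xE9:
        # Near jump: opcode + 16-bit little-endian displacement.
        # Assumption of this sketch: add the instruction length of 3.
        return int.from_bytes(sector[1:3], "little") + 3
    return None
```

For a boot sector starting with the bytes EB 3C 90, this gives a code offset of 0x3E.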
The values for "reserved1" through "reserved3" for DOS boot sectors are
wrong, because they start at offset 52 and not at 54, according to the web
page http://thestarman.pcministry.com/asm/mbr/MSWIN41.htm#FSINFO .
This mistake went unnoticed for a long time, because these values are
normally zero except for some files like hda9data.bin.
I also display the information about "sectors/track" at offset 24.
For the "physical drive" value 0xFF, those words are displayed twice, once
with a wrong value (see the files hda1fd95.bin, sdb2-xp.bin, ... in the
subdirectory physical_drive_2).
If the DOS boot sector is followed by the media descriptor byte 0xFn and
some 0xFF bytes ((11.s) ulelong&0x00ffffF0 0x00ffffF0), this is
characteristic of a DOS File Allocation Table (FAT). The whole thing is the
start of a DOS disk image, so the MIME type "application/x-ima" is printed
for floppy images (no fixed disk with FAT12).
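
The indirect test in parentheses can be sketched in Python as follows: read the sector size at offset 11, then mask the little-endian long found at that offset. The name looks_like_fat_start is hypothetical, and the sketch only mirrors this single test.

```python
import struct

FAT_MASK = 0x00FFFFF0  # media descriptor 0xFn followed by 0xFF 0xFF


def looks_like_fat_start(image: bytes) -> bool:
    """Check whether a FAT appears to start right after the first sector."""
    if len(image) < 13:
        return False
    # Sector size is the little-endian short at offset 11.
    (sector_size,) = struct.unpack_from("<H", image, 11)
    if sector_size == 0 or len(image) < sector_size + 4:
        return False
    # Equivalent of the indirect offset (11.s) with a ulelong read.
    (value,) = struct.unpack_from("<I", image, sector_size)
    return (value & FAT_MASK) == FAT_MASK
```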
NTFS and DOS share the beginning parts of the BIOS parameter block (BPB)
according to
http://homepage.ntlworld.com/jonathan.deboynepollard/FGA/bios-parameter-block.html .
For some x86 boot sectors (files in directory sample/ntfs/), some
information such as the media descriptor or heads is displayed correctly
(see output ntfs-5.11.txt). But the interesting facts about the NTFS file
system are not displayed.
Based on information from http://thestarman.pcministry.com/asm/mbr/NTFSBR.htm
I began to patch the filesystems magic file. If a file looks like a DOS
boot sector and has zero values for the four fields FATs, root entries,
DOS sectors and sectors/FAT, it is an NTFS boot sector, and the following
bytes contain information such as the $MFT of the NTFS filesystem (see
output ntfs-DOSsector.txt).
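
The four-field check can be sketched in Python as below. The field offsets (16, 17, 19 and 22) are not given in the text and are assumed from the standard DOS BPB layout; bpb_suggests_ntfs is a hypothetical name, and the caller is expected to have already established that the data looks like a DOS boot sector.

```python
import struct


def bpb_suggests_ntfs(sector: bytes) -> bool:
    """Return True if the four FAT-specific BPB fields are all zero."""
    if len(sector) < 24:
        return False
    fats = sector[16]                                   # number of FATs
    # Root entries at offset 17 and 16-bit sector count at offset 19.
    root_entries, dos_sectors = struct.unpack_from("<HH", sector, 17)
    (sectors_per_fat,) = struct.unpack_from("<H", sector, 22)
    return (fats, root_entries, dos_sectors, sectors_per_fat) == (0, 0, 0, 0)
```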
After applying the changes (file-5.11-filesystems-DOSsector.diff), a final
output file bootsector-DOSsector.txt for the files in directory bootsector
is obtained.
All diffs, output and sample files are stored under
http://mitglied.multimania.de/jenderek/file/