From 8a667072e65294efa6a7b7d9a3bc417e145e0aea Mon Sep 17 00:00:00 2001
From: Christos Zoulas
Date: Wed, 20 Jul 2016 11:28:40 +0000
Subject: [PATCH] Use signed offsets to reduce false positives (Christoph Biedl)

And here is your reward: Reduce the number of false-positive detections
of "DOS executable (COM)" big time, especially for small files. In my
case, modulo further checks, the number of files reported that way went
down from some 2500 to 35, with perhaps 6 false-positives.

The trick: The target of the JMP instruction at offset 0 must be valid
and sound:

For 0xeb (8bit offset) the offset must be positive as negative offsets
would lead into the program segment prefix (PSP), and the file must be
long enough that jump target actually exists.

Similar for 0xe9 (16bit offset). Here negative offsets (wrapped around
at 16bit) are acceptable as long as they don't lead into the PSP. Such
files do exist. And that's where I needed a signed indirect offset.
---
 magic/Magdir/msdos | 48 ++++++++++++++++++++++++++++++----------------
 1 file changed, 32 insertions(+), 16 deletions(-)

diff --git a/magic/Magdir/msdos b/magic/Magdir/msdos
index c1ed03c5..eb848a3b 100644
--- a/magic/Magdir/msdos
+++ b/magic/Magdir/msdos
@@ -1,6 +1,6 @@
 
 #------------------------------------------------------------------------------
-# $File: msdos,v 1.107 2016/07/05 12:40:09 christos Exp $
+# $File: msdos,v 1.108 2016/07/14 17:37:12 christos Exp $
 # msdos: file(1) magic for MS-DOS files
 #
 
@@ -328,15 +328,6 @@
 0 string \xffKEYB\ \ \ \0\0\0\0
 >12 string \0\0\0\0`\004\360 MS-DOS KEYBoard Layout file
 
-# .COM formats (Daniel Quinlan, quinlan@yggdrasil.com)
-# Uncommenting only the first two lines will cover about 2/3 of COM files,
-# but it isn't feasible to match all COM files since there must be at least
-# two dozen different one-byte "magics".
-# test too generic ?
-0 byte 0xe9 DOS executable (COM)
->0x1FE leshort 0xAA55 \b, boot code
->6 string SFX\ of\ LHarc (%s)
-
 # DOS device driver updated by Joerg Jenderek at May 2011
 # http://maben.homeip.net/static/S100/IBM/software/DOS/DOS%20techref/CHAPTER.009
 0 ulequad&0x07a0ffffffff 0xffffffff DOS executable (
@@ -439,12 +430,37 @@
 # byte 0xeb conflicts with "sequent" magic leshort 0xn2eb
 0 ubeshort&0xeb8d >0xeb00
 # DR-DOS STACKER.COM SCREATE.SYS missed
->0 byte 0xeb
->>0x1FE leshort 0xAA55 DOS executable (COM), boot code
->>85 string UPX DOS executable (COM), UPX compressed
->>4 string \ $ARX DOS executable (COM), ARX self-extracting archive
->>4 string \ $LHarc DOS executable (COM), LHarc self-extracting archive
->>0x20e string SFX\ by\ LARC DOS executable (COM), LARC self-extracting archive
+
+0 name msdos-com
+>0 byte x DOS executable (COM)
+>6 string SFX\ of\ LHarc \b, %s
+>0x1FE leshort 0xAA55 \b, boot code
+>85 string UPX \b, UPX compressed
+>4 string \ $ARX \b, ARX self-extracting archive
+>4 string \ $LHarc \b, LHarc self-extracting archive
+>0x20e string SFX\ by\ LARC \b, LARC self-extracting archive
+
+# JMP 8bit
+0 byte 0xeb
+# allow forward jumps only
+>1 byte >-1
+# that offset must be accessible
+>>(1.b+2) byte x
+>>>0 use msdos-com
+
+# JMP 16bit
+0 byte 0xe9
+# forward jumps
+>1 short >-1
+# that offset must be accessible
+>>(1.s+3) byte x
+>>>0 use msdos-com
+# negative offset, must not lead into PSP
+>1 short <-259
+# that offset must be accessible
+>>(1,s+65539) byte x
+>>>0 use msdos-com
+
 # updated by Joerg Jenderek at Oct 2008,2015
 # following line is too general
 0 ubyte 0xb8
-- 
2.40.0
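
Illustrative note, not part of the patch: the jump-target checks described in
the commit message can be sketched in Python roughly as follows. This is a
minimal sketch, not how file(1) evaluates magic entries. The function name
`plausible_com_jump` and the constant `PSP_SIZE` are hypothetical, and the
16-bit displacement is read as little-endian (the x86 encoding), whereas the
magic entries use the host-order `short` type.

```python
import struct

PSP_SIZE = 0x100  # assumption: the COM image is loaded right after a 256-byte PSP


def plausible_com_jump(data: bytes) -> bool:
    """Return True if the JMP at offset 0 targets a byte that exists in the file."""
    if len(data) < 2:
        return False
    opcode = data[0]
    if opcode == 0xEB:  # JMP with signed 8-bit displacement
        (rel,) = struct.unpack_from("<b", data, 1)
        if rel < 0:  # a backward jump would land in the jump itself or the PSP
            return False
        target = rel + 2  # mirrors the indirect offset (1.b+2)
    elif opcode == 0xE9:  # JMP with signed 16-bit displacement
        if len(data) < 3:
            return False
        (rel,) = struct.unpack_from("<h", data, 1)
        if rel >= 0:
            target = rel + 3  # mirrors (1.s+3)
        elif rel < -(PSP_SIZE + 3):  # same bound as "short <-259"
            target = rel + 3 + 0x10000  # wraps at 16 bit, mirrors (1,s+65539)
        else:
            return False  # lands in the PSP or inside the jump instruction
    else:
        return False
    # "that offset must be accessible": the target byte must be inside the file
    return target < len(data)
```

For instance, a three-byte file `EB 00 90` passes (the target is offset 2),
while the two-byte file `EB 10` fails because offset 18 lies past the end of
the file.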