Reverse Engineering an old custom (?) .DAT file from a DOS game

vortexspin
Posts: 5
Joined: Wed Mar 02, 2022 2:49 pm


Post by vortexspin »

Hey all. Trying to unpack a .DAT file from an old DOS game.

The file is not large, 1823 KB.

Magic header seems to be q\x01 (hex 71 01), and you can see filenames in the hex editor: "BOSS2....ART", etc.

That said, I ran a generic file extractor on it and got a couple of results that seem promising, e.g. COMPRLIB_RLE3 (which seems to be an old DOS compression algo) produced a 36,447 KB file that still has readable file names in it.

I have two questions:

How do I know if the original 1823 KB file was actually compressed or not? Is the COMPRLIB_RLE3 decompression result possibly a red herring?
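One rough way to approach this question is to measure byte entropy. The sketch below is only a heuristic, and the function names and the 4096-byte chunk size are assumptions made for illustration. Data that is already compressed or encrypted tends to sit close to 8 bits per byte, while plain text, directory tables, and uncompressed bitmaps usually score noticeably lower. A result that grows from 1823 KB to 36,447 KB (roughly 20x) is also worth treating with suspicion.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_by_chunk(data: bytes, chunk_size: int = 4096) -> list:
    """Entropy of each consecutive chunk, to spot where packed data begins."""
    return [
        shannon_entropy(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]
```

Running `entropy_by_chunk` over the whole .DAT should show whether a low-entropy region (e.g. a filename table) is followed by high-entropy payloads, which would suggest the individual files are packed rather than the archive as a whole.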

Additionally, I would appreciate some advice about possible patterns in the header.

Each filename seems to take up 12 bytes.
- Why would there be a 00 00 00 01 sequence between each filename and its extension? To allow for filenames of different lengths?

Then there are 12 bytes of additional information between one filename and the next.
- I assume this is offset and length? I tried reading it as two chunks of 6 bytes, or as three chunks of 4 bytes in different orders, and using QuickBMS to log the result out to a file, but so far I haven't had much luck.
- Then again, I'm not sure whether that's because the files being unpacked (e.g. BOSS2.ART) are in yet another proprietary format.

Any advice appreciated!
Last edited by vortexspin on Wed Mar 02, 2022 5:59 pm, edited 2 times in total.

Post by vortexspin »

I thought I made some progress...

Using the original 1.8MB file, I tried to write out the first file with a QuickBMS script like this:

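# all multi-byte values are little endian
endian little
# 2-byte signature
idstring "q\x01"
# 12-byte name field
getdstring name 0xC
get offset long
# unknown 4-byte value
get null long
get length long
# write LENGTH bytes starting at OFFSET to a file called NAME
log name offset length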

This seemed to produce something promising: a file identified by TrID (https://mark0.net/onlinetrid.py) as C64 Raster Format (BG/BIN).
But that format is black-and-white only, so it can't be what this game uses.

The approach sure seems right, though...

q.              magic (2 bytes)
BOSS2....ART    name (12 bytes)
9A 22 00 00     offset (4 bytes, 8858, little endian)
11 04 00 00     null / bogus? (4 bytes)
40 02 00 00     length (4 bytes, 576, little endian)

This gives you a 576-byte file that starts with the magic bytes "GC" and runs right up to the next "GC" magic string, which sits at the offset given by the second file definition.
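As a quick cross-check of the hex values in that entry, here is a minimal Python sketch that decodes the 12 bytes following the name field as three little-endian 32-bit integers. The variable names are only labels, and `unknown` is the field whose meaning isn't clear yet.

```python
import struct

# The 12 bytes that follow the name field, copied from the hex dump.
raw_fields = bytes.fromhex("9A220000" "11040000" "40020000")

# "<III" = three little-endian unsigned 32-bit integers.
offset, unknown, length = struct.unpack("<III", raw_fields)

print(offset, unknown, length)  # 8858 1041 576
print(offset + length)          # 9434, where the next file should start
```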

Post by vortexspin »

Okay, this gave me 100% coverage...

However I have a bunch of files that I don't really understand the format of! I guess that is the next step...

endian little
idstring "q\x01"
for i = 0 < 369
    getdstring name 0xC
    get offset long
    get null long
    get length long
    log name offset length
next i
rabatini
Posts: 179
Joined: Tue Jan 18, 2022 12:21 am


Post by rabatini »


Hello

try this.

Code:

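# the first two bytes are the number of entries, not a signature
get entries short

for rip = 1 to entries
   # 8-byte base name, one separator byte, 3-byte extension
   getdstring name 0x08
   get null byte
   getdstring extension 0x03
   String FILENAME P "%name%.%extension%"
   get offset long
   get unknow long
   get size long
   log filename offset size
next rip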


Choose the "r" (rename) option when QuickBMS asks about an existing file, because two files share the same name.
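For anyone who prefers to work outside QuickBMS, the same directory layout can be sketched in Python. This is an illustration under the assumptions of the script above (2-byte entry count, then 24-byte entries), not a verified tool. The function names and the `_1`, `_2` rename scheme for duplicate names are made up here.

```python
import struct
from pathlib import Path

# 8-byte name, 1 skipped byte, 3-byte extension, then offset/unknown/size.
ENTRY = struct.Struct("<8sx3sIII")


def _text(raw: bytes) -> str:
    """Cut a fixed-width field at its first NUL and decode it."""
    return raw.split(b"\0", 1)[0].decode("ascii", "replace")


def read_directory(data: bytes) -> list:
    """Return (filename, offset, unknown, size) for every entry."""
    (count,) = struct.unpack_from("<H", data, 0)
    entries = []
    for i in range(count):
        name, ext, offset, unknown, size = ENTRY.unpack_from(data, 2 + i * ENTRY.size)
        entries.append((f"{_text(name)}.{_text(ext)}", offset, unknown, size))
    return entries


def extract_all(data: bytes, out_dir: str) -> list:
    """Write every entry to out_dir, renaming duplicates instead of overwriting."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    seen = {}
    written = []
    for filename, offset, _unknown, size in read_directory(data):
        n = seen.get(filename, 0)
        seen[filename] = n + 1
        p = Path(filename)
        target = filename if n == 0 else f"{p.stem}_{n}{p.suffix}"
        (out / target).write_bytes(data[offset:offset + size])
        written.append(target)
    return written
```

Usage would look like `extract_all(Path("GAME.DAT").read_bytes(), "out")`, where `GAME.DAT` is a placeholder for the actual archive name. The extracted files are still in whatever format the game stores them in.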

Post by vortexspin »

Oh duh! The "magic bytes" of 71 01 are actually the number of entries, 369. Makes sense. Your script is cleaner than mine and preserves the file extensions, nice work.
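The arithmetic backs this up. Assuming 24 bytes per entry (12 for the name field plus three 4-byte values), the directory should end exactly where the first file's offset (9A 22 00 00) points:

```python
import struct

ENTRY_SIZE = 12 + 4 + 4 + 4  # name/extension field plus three 32-bit values

# Read the first two bytes of the archive as a little-endian 16-bit count.
(entry_count,) = struct.unpack("<H", bytes.fromhex("7101"))

# 2 bytes for the count itself, then one fixed-size record per entry.
directory_end = 2 + entry_count * ENTRY_SIZE

print(entry_count)         # 369
print(hex(directory_end))  # 0x229a
```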

Sadly the files are packed with some custom compression algo; each file starts with a 0x47 0x43 magic byte sequence ("GC"). Some of the files are XMI MIDIs, which have a well-known format. However, I ran about 1000 algos over the compressed files using QuickBMS, and none of them decompressed the XMIs into standard XMI files.

GC (whatever that is) compressed XMI on the left, uncompressed XMI from a different DOS game on the right:

Post by rabatini »


I presume the "unknow" long may be the size of the file once decompressed.
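One way to sanity-check that guess is to compare the two fields across all entries. The sketch below assumes the directory has already been parsed into (filename, offset, unknown, size) tuples; the function name is hypothetical. If the unknown value is at least as large as the stored size for nearly every entry, that would be consistent with it being a decompressed size, though it would not prove it.

```python
def summarize_size_fields(entries):
    """Compare the unknown field with the stored size for each entry.

    entries: iterable of (filename, offset, unknown, size) tuples.
    """
    count = larger = total_stored = total_unknown = 0
    for _name, _offset, unknown, size in entries:
        count += 1
        total_stored += size
        total_unknown += unknown
        if unknown >= size:
            larger += 1
    ratio = total_unknown / total_stored if total_stored else 0.0
    return {"entries": count, "unknown_ge_size": larger, "overall_ratio": ratio}


# The only entry quoted in full in this thread: 1041 vs. 576 stored bytes.
sample = [("BOSS2.ART", 8858, 1041, 576)]
print(summarize_size_fields(sample))
```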

Post by vortexspin »

Yeah I agree.

Post by rabatini »


Try downloading this:

https://www.ibm.com/support/pages/ibm-r ... dy-84#DNLD

It has a free or trial version.
It converts XMI files.

Another resource about XMI:
https://github.com/tilkinsc/XM-File-Format#readme