[Help] Harvest Moon - A Wonderful Life - .clz compression

lschmitty
Posts: 2
Joined: Sun Nov 11, 2018 5:54 am

[Help] Harvest Moon - A Wonderful Life - .clz compression

Post by lschmitty »

Hello! I've hit a wall in ripping the assets from Harvest Moon - A Wonderful Life on GameCube. Most of the assets are compressed in .clz files, which are an absolute mystery to me. I assumed it was some variation of the LZ compression that is common in a lot of Nintendo games, but the tools available for those formats fail on it, which suggests the compression is non-standard. As far as I can tell, this compression was only used in this game, the release of the same game with a female main character, and the PS2 port.

I'm sorry this isn't terribly specific - if someone could point me in a helpful direction, that would be great. I'm still playing around with some tools from QuickBMS, but I have a growing sense of dread that I need to reverse engineer the compression. Luckily, there are test files included on the disc that appear to hold the same information, where only one is compressed! I've included one example here. Optimistically, this would make determining the file's compression easy.

I'm sure you could tell, but this is my first exploration into reverse engineering something like this. Please pardon my ignorance.

Thanks! :)
aluigi
Site Admin
Posts: 12984
Joined: Wed Jul 30, 2014 9:32 pm

Re: [Help] Harvest Moon - A Wonderful Life - .clz compression

Post by aluigi »

Can you provide bigger compressed samples?
The only good results I obtained (bcl_rice, lzfu_raw and SCUMMVM39) look like false positives, so I don't think I have a ready solution.

Dummy_Uncompressed.txt is not related to Dummy_Compressed.txt.
lschmitty
Posts: 2
Joined: Sun Nov 11, 2018 5:54 am

Re: [Help] Harvest Moon - A Wonderful Life - .clz compression

Post by lschmitty »

Sure, most of the data for the game is included in .arc.clz files. I think this is similar to a compressed tarball, so after decompression it should be a valid .arc file. I've used the comtype_scan2.bat tool and found a couple of outputs that were close to being a valid .arc file, but none that worked.

Thank you so much for looking at this, and sorry for the unrelated files. I made a correlation between file sizes, which in hindsight was a bad assumption.
seangibbz
Posts: 9
Joined: Sun Dec 02, 2018 5:53 am

Re: [Help] Harvest Moon - A Wonderful Life - .clz compression

Post by seangibbz »

I've also been looking into the clz compression from A Wonderful Life.

From what I can tell, the file header is composed of several parts.
  1. 4 bytes at 0x00000000 which is the CLZ identifier (i.e. 43 4C 5A 00)
  2. 4 bytes at 0x00000004 of the size (in bytes) of the decompressed data, in hex (e.g. 00 53 54 90 [5.46MB] for AWL’s commonall.arc)
    Currently this is only speculated. I am unable to confirm that this is what this variable actually is until I successfully decompress a clz file.
  3. 4 bytes at 0x00000008 with blank space (i.e. 00 00 00 00)
  4. A repeat of the size in bytes at 0x0000000C (e.g. 00 53 54 90 for the above file)
  5. One null byte at 0x00000010 (e.g. 00)
  6. The compressed file data starting at 0x00000011 (e.g. 55 AA 38 2D as this file contains a U8 [arc] Archive)
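
For anyone who wants to poke at the header, the layout above can be sketched in Python as follows. This is only a reading aid under the same speculation as the list: the names ClzHeader and parse_clz_header are made up for illustration, and treating the size fields as big-endian is an assumption based on the example bytes 00 53 54 90.

```python
import struct
from typing import NamedTuple


class ClzHeader(NamedTuple):
    """Hypothetical view of the 0x11-byte header described above."""
    magic: bytes       # expected to be b"CLZ\x00"
    size: int          # speculated decompressed size (assumed big-endian)
    reserved: int      # the four blank bytes at 0x08
    size_repeat: int   # repeat of the size at 0x0C
    pad: int           # single byte at 0x10


def parse_clz_header(data: bytes) -> ClzHeader:
    """Split the first 0x11 bytes of a .clz file into the speculated fields."""
    if len(data) < 0x11:
        raise ValueError("too short to hold the speculated header")
    magic, size, reserved, size_repeat, pad = struct.unpack_from(">4sIIIB", data, 0)
    if magic != b"CLZ\x00":
        raise ValueError("missing CLZ identifier")
    return ClzHeader(magic, size, reserved, size_repeat, pad)


# Example bytes taken from the list above (commonall.arc.clz).
sample = bytes.fromhex("434C5A00 00535490 00000000 00535490 00 55AA382D")
header = parse_clz_header(sample)
print(header.size)  # 5461136, i.e. 0x00535490
```

If the size field really is the decompressed size, header.size should equal header.size_repeat for every .clz file on the disc, which is cheap to check before attempting any decompression.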

I ran signsrch on the game executables and got the following results:
A Wonderful Life: dvdroot/&&systemdata/Start.dol

  offset   num  description [bits.endian.size]
  --------------------------------------------
  0024bc70 3049 DMC compression [32.be.16&]
  0024bee1 1038 padding used in hashing algorithms (0x80 0 ... 0) [..64]
  002521c8 2304 zinflate_distanceExtraBits [32.be.120]
  002521cb 2303 zinflate_distanceExtraBits [32.le.120]
  0028e19b 1040 SSL3 #define [32.le.176&]
  0028e7a8 2417 MBC2 [32.le.248&]
  0028e7ab 2418 MBC2 [32.be.248&]
  002939c8 1563 libavcodec ff_zigzag_direct [..64]

- 8 signatures found in the file in 1 seconds


Another Wonderful Life (girl version of the game): dvdroot/&&systemdata/Start.dol

  offset   num  description [bits.endian.size]
  --------------------------------------------
  0023bd54 2417 MBC2 [32.le.248&]
  0023c36b 2418 MBC2 [32.be.248&]
  0024d3c4 3049 DMC compression [32.be.16&]
  0024d5d1 1038 padding used in hashing algorithms (0x80 0 ... 0) [..64]
  00250dd0 2304 zinflate_distanceExtraBits [32.be.120]
  00250dd3 2303 zinflate_distanceExtraBits [32.le.120]
  0028ebb8 1563 libavcodec ff_zigzag_direct [..64]

- 7 signatures found in the file in 1 seconds


Interestingly, the PS2 version of A Wonderful Life Special Edition contains both a compressed and uncompressed version of what appears to be the same file (mainchapter0.arc.clz and mainchapter0.arc).
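
That pair should make it possible to test the guess about the size field without decompressing anything. A minimal sketch, assuming the field sits at offset 0x04 as described earlier; the function name matching_byte_orders is hypothetical, and both byte orders are tried because the PS2 build may not store the value the same way as the GameCube one.

```python
def matching_byte_orders(clz_data: bytes, raw_data: bytes) -> list:
    """Return the byte orders under which the field at 0x04 equals len(raw_data)."""
    if clz_data[:4] != b"CLZ\x00":
        raise ValueError("missing CLZ identifier")
    field = clz_data[4:8]
    return [order for order in ("big", "little")
            if int.from_bytes(field, order) == len(raw_data)]


# Synthetic stand-ins; with the real files you would read
# mainchapter0.arc.clz and mainchapter0.arc from disk instead.
fake_clz = b"CLZ\x00" + (10).to_bytes(4, "big") + bytes(9)
fake_raw = bytes(10)
print(matching_byte_orders(fake_clz, fake_raw))
```

An empty list for the real pair would mean either the field is not the decompressed size or the two files are not actually the same data.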
seangibbz
Posts: 9
Joined: Sun Dec 02, 2018 5:53 am

Re: [Help] Harvest Moon - A Wonderful Life - .clz compression

Post by seangibbz »

I ran another one of the files (preload.arc.clz) through comtype_scan2 and it seems like the best candidate would be some variant of either LZFU (most likely) or FIN (less likely).
seangibbz
Posts: 9
Joined: Sun Dec 02, 2018 5:53 am

Re: [Help] Harvest Moon - A Wonderful Life - .clz compression

Post by seangibbz »

I also tried scanning the above file (preload.arc.clz) using offzip, and got the following results:
offzip_output_preload.arc.clz.txt


Summary of valid compressed streams:

+------------+-----+----------------------------+----------------------+
| hex_offset | ... | zip -> unzip size / offset | spaces before | info |
+------------+-----+----------------------------+----------------------+
  0x00000fd1  61201 -> 61187 / 0x0000fee2 _ 4049
  0x00019447  45209 -> 45199 / 0x000244e0 _ 38245
  0x0002d8b0  36618 -> 36608 / 0x000367ba _ 37840
  0x00037206  46 -> 321 / 0x00037234 _ 2636
  0x0003a263  65375 -> 65365 / 0x0004a1c2 _ 12335
  0x0004aeca  42 -> 342 / 0x0004aef4 _ 3336
  0x0004d330  34 -> 347 / 0x0004d352 _ 9276
  0x00052f48  55 -> 665 / 0x00052f7f _ 23542
  0x00057771  37 -> 47 / 0x00057796 _ 18418
  0x0005ab5a  36 -> 85 / 0x0005ab7e _ 13252
  0x0005f9b6  34 -> 103 / 0x0005f9d8 _ 20024
 
- 11 valid compressed streams found
- 0x00032f2f -> 0x0003355d bytes covering the 51% of the file
aluigi
Site Admin
Posts: 12984
Joined: Wed Jul 30, 2014 9:32 pm

Re: [Help] Harvest Moon - A Wonderful Life - .clz compression

Post by aluigi »

deflate is prone to many false positives because it's just the raw compressed data without any checksum or header (both of which a zlib stream does have).
So you can ignore those results.
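
To illustrate the difference, here is a small sketch using Python's standard zlib module. The sample data is arbitrary; the point is only that a zlib stream wraps the deflate data in a 2-byte header and a 4-byte Adler-32 trailer that a scanner can verify, while a raw deflate stream has nothing to check against.

```python
import zlib

data = b"_0.arc" * 50  # arbitrary repetitive sample data

# zlib container: 2-byte header + deflate data + 4-byte Adler-32 of the input.
zlib_stream = zlib.compress(data)
header, body, trailer = zlib_stream[:2], zlib_stream[2:-4], zlib_stream[-4:]

# Raw deflate: negative wbits tells zlib to omit header and trailer.
compressor = zlib.compressobj(wbits=-15)
raw_stream = compressor.compress(data) + compressor.flush()

# The zlib header is self-checking (16-bit value divisible by 31)
# and the trailer must match the Adler-32 of the decompressed data.
header_ok = ((header[0] << 8) | header[1]) % 31 == 0
trailer_ok = int.from_bytes(trailer, "big") == zlib.adler32(data)

# The body of the zlib stream is itself a raw deflate stream.
body_plain = zlib.decompress(body, wbits=-15)
raw_plain = zlib.decompress(raw_stream, wbits=-15)
```

With raw deflate the only test available is "did the decoder run without error", which random bytes pass surprisingly often, especially for short streams like the 34 to 55 byte hits in the offzip summary.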
seangibbz
Posts: 9
Joined: Sun Dec 02, 2018 5:53 am

Re: [Help] Harvest Moon - A Wonderful Life - .clz compression

Post by seangibbz »

After running mainchapter0 through the candidate decompression formats, I'm beginning to think that I was wrong in thinking it might be LZFU.

When viewing the output, I examined the section of the file (in this case, a compressed U8 archive) which would include a list of filenames.
A lot of the data seems to be missing when attempting to decompress using either LZFU or MSLZSS1.

Original compressed data:
Image

Expected output:
Image

LZFU:
Image

LZFU_RAW:
Image

MSLZSS1:
Image

Overall, there seems to be an issue with repeated strings (e.g. "_0.arc"), where they'll show up once, but then be missing in subsequent entries.

I'll try examining some other decompressed dumps from comtype and will update if I find anything of significance.
seangibbz
Posts: 9
Joined: Sun Dec 02, 2018 5:53 am

Re: [Help] Harvest Moon - A Wonderful Life - .clz compression

Post by seangibbz »

If it helps, I was able to decompile the game's main executable (Start.dol) into a python-formatted script using RetDec 3.0.

There are some references to clz files, but I can't quite make sense of it.
seangibbz
Posts: 9
Joined: Sun Dec 02, 2018 5:53 am

Re: [Help] Harvest Moon - A Wonderful Life - .clz compression

Post by seangibbz »

I also tried opening Start.dol in BrawlBox's memory editor, and found that the string mainchapter%d.arc.clz appears to be related to some sort of SceneInit function.

There are also references to preload.arc.clz and commonall.arc.clz.
seangibbz
Posts: 9
Joined: Sun Dec 02, 2018 5:53 am

Re: [Help] Harvest Moon - A Wonderful Life - .clz compression

Post by seangibbz »

I'm wondering if the algorithm could be some variant of LZSS.

Is there a way to batch-test possible lzss configurations, similar to how comtype runs through the different compression types?
aluigi
Site Admin
Posts: 12984
Joined: Wed Jul 30, 2014 9:32 pm

Re: [Help] Harvest Moon - A Wonderful Life - .clz compression

Post by aluigi »

Yes and no. It's extremely rare for lzss to be used with settings other than the usual "12 4 2 2 0x20" or "12 4 2 2 0" (lzss0).

In 10 years the only non-standard lzss has been the following:
comtype lzss "11 5 2 2 0"

You can build a sort of fuzzer by generating the first 4 fields of the settings, but it's just a waste of time since it's 99.9% likely not the classical lzss algorithm.
It's probably faster to analyze or debug (via emulator) the game.
Anyway I have no additional suggestions.
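
For reference, a rough Python sketch of what trying those settings could look like. It implements the classic Okumura-style LZSS decoder, which per the above is very likely not the .clz algorithm. The reading of the five fields as (offset bits, length bits, threshold, start-position adjustment, window fill byte) is an assumption, and lzss_decode and matching_settings are names made up for this example.

```python
def lzss_decode(src, ei=12, ej=4, p=2, rless=2, init_byte=0x20):
    """Classic LZSS decode with a ring buffer; only handles ei + ej == 16."""
    if ei + ej != 16:
        raise ValueError("this sketch only handles 16-bit offset/length pairs")
    n = 1 << ei                          # ring buffer size
    mask = n - 1
    window = bytearray([init_byte]) * n  # buffer pre-filled with init_byte
    r = (n - (1 << ej) - rless) & mask   # initial write position (assumed)
    out = bytearray()
    pos = 0
    flags = 0
    while pos < len(src):
        flags >>= 1
        if not flags & 0x100:            # all 8 flag bits consumed
            flags = src[pos] | 0xFF00    # upper bits count the remaining flags
            pos += 1
            if pos >= len(src):
                break
        if flags & 1:                    # flag bit 1: literal byte
            c = src[pos]
            pos += 1
            out.append(c)
            window[r] = c
            r = (r + 1) & mask
        else:                            # flag bit 0: (offset, length) pair
            if pos + 1 >= len(src):
                break
            lo, hi = src[pos], src[pos + 1]
            pos += 2
            offset = lo | ((hi >> ej) << 8)
            length = (hi & ((1 << ej) - 1)) + p + 1
            for k in range(length):
                c = window[(offset + k) & mask]
                out.append(c)
                window[r] = c
                r = (r + 1) & mask
    return bytes(out)


# The three settings mentioned above, as (EI, EJ, P, rless, init).
THREAD_SETTINGS = [(12, 4, 2, 2, 0x20), (12, 4, 2, 2, 0x00), (11, 5, 2, 2, 0x00)]


def matching_settings(payload, expected_prefix):
    """Return the settings whose decoded output starts with expected_prefix."""
    return [s for s in THREAD_SETTINGS
            if lzss_decode(payload, *s).startswith(expected_prefix)]


# Hand-built stream: three literals "abc", then a copy of those three bytes.
print(lzss_decode(bytes.fromhex("07616263EEF0")))  # b'abcabc'
```

With a known compressed/uncompressed pair, expected_prefix could be the first few bytes of the uncompressed file. Whether the payload of a .clz file starts at 0x10 or 0x11 is left open here, since it is not settled whether the byte at 0x10 belongs to the header or to the stream.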
seangibbz
Posts: 9
Joined: Sun Dec 02, 2018 5:53 am

Re: [Help] Harvest Moon - A Wonderful Life - .clz compression

Post by seangibbz »

Someone managed to reverse-engineer the compression method and created a tool to decompress/recompress the files.

Not sure if this could be ported to a quickbms script or not.