[Help] Harvest Moon - A Wonderful Life - .clz compression
-
- Posts: 2
- Joined: Sun Nov 11, 2018 5:54 am
[Help] Harvest Moon - A Wonderful Life - .clz compression
Hello! I've hit a wall in making progress on ripping the assets from Harvest Moon - A Wonderful Life on GameCube. The issue is that most of the assets are compressed in .clz files, which are an absolute mystery to me. I'm assuming it's some variation of the LZ compression that seems to be common for a lot of Nintendo stuff, but the tools available for that confirm that it's a non-standard compression. As far as I can tell, this compression was only used on this game, the release of the same game with a female main character, and the PS2 port.
I'm sorry this isn't terribly specific - if someone could point me in a helpful direction, that would be great. I'm still playing around with some tools from QuickBMS, but I have a growing sense of dread that I need to reverse engineer the compression. Luckily, there are test files included on the disc that have the same information, where only one is compressed! I've included one example here. Optimistically, this would make it easy to determine the file's compression.
I'm sure you could tell, but this is my first exploration into reverse engineering something like this. Please pardon my ignorance.
Thanks!
-
- Site Admin
- Posts: 12984
- Joined: Wed Jul 30, 2014 9:32 pm
Re: [Help] Harvest Moon - A Wonderful Life - .clz compression
Can you provide bigger compressed samples?
The only good results I obtained (bcl_rice, lzfu_raw and SCUMMVM39) look like false positives, so I don't think I have a ready solution.
Dummy_Uncompressed.txt is not related to Dummy_Compressed.txt.
-
- Posts: 2
- Joined: Sun Nov 11, 2018 5:54 am
Re: [Help] Harvest Moon - A Wonderful Life - .clz compression
Sure, most of the data for the game is included in .arc.clz files. I think this is similar to a tarball, so after decompressing it should be a valid .arc file. I've used the comtype_scan2.bat tool and similarly found a couple of outputs that were close to being a valid .arc file, but none that worked.
Thank you so much for looking at this, and sorry for the unrelated files. I made a correlation between file sizes, which in hindsight was a bad assumption.
-
- Posts: 9
- Joined: Sun Dec 02, 2018 5:53 am
Re: [Help] Harvest Moon - A Wonderful Life - .clz compression
I've also been looking into the clz compression from A Wonderful Life.
From what I can tell, the file header is composed of several parts.
- 4 bytes at 0x00000000 which is the CLZ identifier (i.e. 43 4C 5A 00)
- 4 bytes at 0x00000004 of the size (in bytes) of the decompressed data, in hex (e.g. 00 53 54 90 [5.46MB] for AWL’s commonall.arc)
  Currently this is only speculated. I am unable to confirm that this is what this variable actually is until I successfully decompress a clz file.
- 4 bytes at 0x00000008 with blank space (i.e. 00 00 00 00)
- A repeat at 0x0000000C of the size in bytes (in hex). (e.g. 00 53 54 90 for the above file)
- One null byte at 0x00000010 (e.g. 00)
- The compressed file data starting at 0x00000011 (e.g. 55 AA 38 2D as this file contains a U8 [arc] Archive)
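The layout listed above can be sketched as a small parser. This is only an illustration of the speculated header, not a confirmed format description: the field names are hypothetical, and big-endian byte order is an assumption based on the example bytes 00 53 54 90 being read as roughly 5.46 MB.

```python
import struct
from typing import NamedTuple

CLZ_MAGIC = b"CLZ\x00"   # 43 4C 5A 00
HEADER_SIZE = 0x11       # compressed data is reported to start at 0x11


class ClzHeader(NamedTuple):
    # All field names are hypothetical; the meanings are speculated in the post.
    decompressed_size: int  # 4 bytes at 0x04
    reserved: int           # 4 bytes at 0x08, observed as zero
    size_repeat: int        # 4 bytes at 0x0C, observed equal to the size field
    pad_byte: int           # 1 byte at 0x10, observed as zero


def parse_clz_header(data: bytes) -> ClzHeader:
    """Parse the speculated 0x11-byte CLZ header (big-endian assumed)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file too short for a CLZ header")
    if data[:4] != CLZ_MAGIC:
        raise ValueError("missing CLZ identifier")
    # ">IIIB": three big-endian unsigned 32-bit values, then one byte.
    size, reserved, size_repeat, pad = struct.unpack_from(">IIIB", data, 4)
    return ClzHeader(size, reserved, size_repeat, pad)


def payload(data: bytes) -> bytes:
    """Return everything after the speculated header."""
    return data[HEADER_SIZE:]
```

If an uncompressed counterpart of a file is available, comparing its length against `decompressed_size` would be one way to test the speculation about the size field.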
I ran signsrch on the game executables and got the following results:
A Wonderful Life: dvdroot/&&systemdata/Start.dol
offset num description [bits.endian.size]
--------------------------------------------
0024bc70 3049 DMC compression [32.be.16&]
0024bee1 1038 padding used in hashing algorithms (0x80 0 ... 0) [..64]
002521c8 2304 zinflate_distanceExtraBits [32.be.120]
002521cb 2303 zinflate_distanceExtraBits [32.le.120]
0028e19b 1040 SSL3 #define [32.le.176&]
0028e7a8 2417 MBC2 [32.le.248&]
0028e7ab 2418 MBC2 [32.be.248&]
002939c8 1563 libavcodec ff_zigzag_direct [..64]
- 8 signatures found in the file in 1 seconds
Another Wonderful Life (girl version of the game): dvdroot/&&systemdata/Start.dol
offset num description [bits.endian.size]
--------------------------------------------
0023bd54 2417 MBC2 [32.le.248&]
0023c36b 2418 MBC2 [32.be.248&]
0024d3c4 3049 DMC compression [32.be.16&]
0024d5d1 1038 padding used in hashing algorithms (0x80 0 ... 0) [..64]
00250dd0 2304 zinflate_distanceExtraBits [32.be.120]
00250dd3 2303 zinflate_distanceExtraBits [32.le.120]
0028ebb8 1563 libavcodec ff_zigzag_direct [..64]
- 7 signatures found in the file in 1 seconds
Interestingly, the PS2 version of A Wonderful Life Special Edition contains both a compressed and uncompressed version of what appears to be the same file (mainchapter0.arc.clz and mainchapter0.arc).
-
- Posts: 9
- Joined: Sun Dec 02, 2018 5:53 am
Re: [Help] Harvest Moon - A Wonderful Life - .clz compression
I ran another one of the files (preload.arc.clz) through comtype_scan2 and it seems like the best candidate would be some variant of either LZFU (most likely) or FIN (less likely).
-
- Posts: 9
- Joined: Sun Dec 02, 2018 5:53 am
Re: [Help] Harvest Moon - A Wonderful Life - .clz compression
I also tried scanning the above file (preload.arc.clz) using offzip, and got the following results:
Summary of valid compressed streams:
+------------+-----+----------------------------+----------------------+
| hex_offset | ... | zip -> unzip size / offset | spaces before | info |
+------------+-----+----------------------------+----------------------+
0x00000fd1 61201 -> 61187 / 0x0000fee2 _ 4049
0x00019447 45209 -> 45199 / 0x000244e0 _ 38245
0x0002d8b0 36618 -> 36608 / 0x000367ba _ 37840
0x00037206 46 -> 321 / 0x00037234 _ 2636
0x0003a263 65375 -> 65365 / 0x0004a1c2 _ 12335
0x0004aeca 42 -> 342 / 0x0004aef4 _ 3336
0x0004d330 34 -> 347 / 0x0004d352 _ 9276
0x00052f48 55 -> 665 / 0x00052f7f _ 23542
0x00057771 37 -> 47 / 0x00057796 _ 18418
0x0005ab5a 36 -> 85 / 0x0005ab7e _ 13252
0x0005f9b6 34 -> 103 / 0x0005f9d8 _ 20024
- 11 valid compressed streams found
- 0x00032f2f -> 0x0003355d bytes covering the 51% of the file
-
- Site Admin
- Posts: 12984
- Joined: Wed Jul 30, 2014 9:32 pm
Re: [Help] Harvest Moon - A Wonderful Life - .clz compression
Raw deflate is prone to many false positives because it's just the compressed data, without the header or checksum that zlib adds.
So you can ignore those results.
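To illustrate the difference, here is a minimal sketch of the sanity check that a zlib stream allows and a raw deflate stream does not. It follows the two-byte zlib header defined in RFC 1950; the function name is hypothetical and is not part of offzip.

```python
def looks_like_zlib_header(data: bytes) -> bool:
    """Check the two-byte zlib header (CMF, FLG) described in RFC 1950."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    method_is_deflate = (cmf & 0x0F) == 8      # CM = 8 means deflate
    window_ok = (cmf >> 4) <= 7                # CINFO above 7 is not allowed
    check_ok = ((cmf << 8) | flg) % 31 == 0    # FCHECK makes the pair a multiple of 31
    return method_is_deflate and window_ok and check_ok
```

A zlib stream also ends with an Adler-32 checksum of the uncompressed data, so a wrong guess is normally rejected. Raw deflate has neither safeguard, which is why short "valid" streams can turn up in unrelated data.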
-
- Posts: 9
- Joined: Sun Dec 02, 2018 5:53 am
Re: [Help] Harvest Moon - A Wonderful Life - .clz compression
After running mainchapter0 through the speculated decompression formats, I'm beginning to think I was wrong to suspect LZFU.
When viewing the output, I examined the section of the file (in this case, a compressed U8 archive) which would include a list of filenames.
A lot of the data seems to be missing when attempting to decompress using either LZFU or MSLZSS1.
Original compressed data:
Expected output:
LZFU:
LZFU_RAW:
MSLZSS1:
Overall, there seems to be an issue with repeated strings (e.g. "_0.arc"), where they'll show up once, but then be missing in subsequent entries.
I'll try examining some other decompressed dumps from comtype and will update if I find anything of significance.
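Since an uncompressed mainchapter0.arc reportedly ships alongside the compressed one in the PS2 release, one way to rank the candidate dumps is to measure how far each agrees with it before diverging. The sketch below assumes both files are already loaded as bytes; the function names and the candidate labels are hypothetical.

```python
def common_prefix_len(candidate: bytes, expected: bytes) -> int:
    """Return how many leading bytes of candidate match expected."""
    limit = min(len(candidate), len(expected))
    for i in range(limit):
        if candidate[i] != expected[i]:
            return i
    return limit


def rank_candidates(candidates: dict, expected: bytes) -> list:
    """Sort (label, prefix_len) pairs, best-matching candidate first."""
    scores = [(label, common_prefix_len(out, expected))
              for label, out in candidates.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

The offset where the best candidate first diverges is a reasonable place to start looking in a hex editor, since that is likely where the real format differs from the guessed one.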
-
- Posts: 9
- Joined: Sun Dec 02, 2018 5:53 am
Re: [Help] Harvest Moon - A Wonderful Life - .clz compression
If it helps, I was able to decompile the game's main executable (Start.dol) into a python-formatted script using RetDec 3.0.
There are some references to clz files, but I can't quite make sense of it.
-
- Posts: 9
- Joined: Sun Dec 02, 2018 5:53 am
Re: [Help] Harvest Moon - A Wonderful Life - .clz compression
I also tried opening Start.dol in BrawlBox's memory editor, and found that the mainchapter%d.arc.clz seems to be possibly related to some sort of SceneInit function.
There are also references to preload.arc.clz and commonall.arc.clz.
-
- Posts: 9
- Joined: Sun Dec 02, 2018 5:53 am
Re: [Help] Harvest Moon - A Wonderful Life - .clz compression
I'm wondering if the algorithm could be some variant of LZSS.
Is there a way to batch-test possible lzss configurations, similar to how comtype runs through the different compression types?
-
- Site Admin
- Posts: 12984
- Joined: Wed Jul 30, 2014 9:32 pm
Re: [Help] Harvest Moon - A Wonderful Life - .clz compression
Yes and no: it's extremely rare for lzss to be used with settings other than the usual "12 4 2 2 0x20" or "12 4 2 2 0" (lzss0).
In 10 years the only non-standard lzss has been the following:
comtype lzss "11 5 2 2 0"
You can build a sort of fuzzer by generating the first 4 fields of the settings, but it's just a waste of time since it's 99.9% certain that this is not the classical lzss algorithm.
It's probably faster to analyze or debug (via emulator) the game.
Anyway I have no additional suggestions.
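For anyone who still wants to try the fuzzer idea, a rough sketch of generating candidate settings strings might look like this. It only produces the strings; feeding them to comtype lzss is left to a separate script. The parameter names are hypothetical, and the rule that the first two fields add up to 16 is an assumption taken from the two examples quoted above ("12 4" and "11 5"), not a documented constraint.

```python
from itertools import product


def lzss_setting_candidates(first_field=range(8, 15),
                            third_field=(1, 2),
                            fourth_field=(1, 2),
                            fifth_field="0"):
    """Yield settings strings such as "12 4 2 2 0" for batch testing."""
    for a, c, d in product(first_field, third_field, fourth_field):
        b = 16 - a  # assumption: the first two fields sum to 16
        yield f"{a} {b} {c} {d} {fifth_field}"


if __name__ == "__main__":
    for settings in lzss_setting_candidates():
        print(settings)
```

With the default ranges this yields 28 strings, including both the standard "12 4 2 2 0" and the non-standard "11 5 2 2 0".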
-
- Posts: 9
- Joined: Sun Dec 02, 2018 5:53 am
Re: [Help] Harvest Moon - A Wonderful Life - .clz compression
Someone managed to reverse-engineer the compression method and created a tool to decompress/recompress the files.
Not sure if this could be ported to a quickbms script or not.