Broken Sword 2 - The Smoking Mirror (1997 Original, NOT-Remastered)

ner0
Posts: 15
Joined: Sat Jan 16, 2016 9:53 pm

Broken Sword 2 - The Smoking Mirror (1997 Original, NOT-Remastered)

Post by ner0 »

I have searched pretty much everywhere and did not find any usable tool that can help me unpack and repack resources from this game, the dialogue script in particular. I tried a tool from this forum intended for the remastered version, but it doesn't seem to be compatible with the original resources (which makes sense, since ScummVM doesn't support the Remaster due to an engine swap or heavy modification).

So, I'm asking for help with either a standalone tool or a QuickBMS script that would allow me to unpack and repack the file TEXT.CLU, which contains the dialogues. I've given it a try with a hex editor, and while I can change the text, I cannot change the size of the file, which was to be expected. Anyway, I'll share the file from the game's demo, which contains essentially the full game's dialogue (unpolished) but is available for free.

I appreciate any help with this.
ner0
Posts: 15
Joined: Sat Jan 16, 2016 9:53 pm

Re: Broken Sword 2 - The Smoking Mirror (1997 Original, NOT-Remastered)

Post by ner0 »

I'm not much at reading code, but skimming through ScummVM's source I can see the following in the resource manager:

Code:

	// 1st DWORD of a cluster is an offset to the look-up table
	uint32 table_offset = file->readUint32LE();
	debug(6, "table offset = %d", table_offset);
	uint32 tableSize = file->size() - table_offset; // the table is stored at the end of the file
	
	assert((tableSize % 8) == 0);
source for resman.cpp: https://github.com/scummvm/scummvm/blob ... n.cpp#L469

The first thing that stands out is the last comment, which says that the file's look-up table is stored at the end of the file. The second useful thing is the first comment, which says that the first DWORD (4 bytes) of the file is the offset of that look-up table. Note the endianness used to read the bytes: little-endian in this case (readUint32LE).

The first 4 bytes of this file, as they appear in a hex editor, are: D2 F0 04 00
Since the value is stored little-endian (least significant byte first), reversing the byte order gives the actual number: 00 04 F0 D2, i.e. 0x0004F0D2

Going to the offset 0004F0D2 puts us around 6 bytes after the last text dialogue that there is in the file. Next it calculates the table size, which is equal to the file size minus the offset.
File size is 4F452 (324690 bytes), the table offset is 4F0D2. tableSize = 4F452 - 4F0D2 = 380 (896 bytes). The last bit of code tests if the value of tableSize is divisible by eight, which 896 is.
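
The arithmetic above can be reproduced with a few lines of Python. This is only a minimal sketch, not code from ScummVM: `read_table_info` is a made-up helper name, the input is a synthetic zero-filled buffer with the same size and first 4 bytes as the demo file, and treating each 8 bytes of the table as one entry is an assumption suggested by the `% 8` check.

```python
import struct

def read_table_info(data: bytes):
    """Return (table_offset, table_size, entry_count) for a cluster image."""
    # First DWORD, read as little-endian: offset of the look-up table.
    (table_offset,) = struct.unpack_from("<I", data, 0)
    # The table runs from that offset to the end of the file.
    table_size = len(data) - table_offset
    if table_size % 8 != 0:
        raise ValueError("look-up table size is not a multiple of 8")
    # ASSUMPTION: one table entry per 8 bytes.
    return table_offset, table_size, table_size // 8

# Synthetic stand-in: same length and header bytes as the demo's TEXT.CLU.
fake = bytearray(0x4F452)
fake[0:4] = bytes.fromhex("D2F00400")
print([hex(v) for v in read_table_info(bytes(fake))])
```

With the real file, `data` would simply be the result of `open("TEXT.CLU", "rb").read()`.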

All well and good, but I'm still not sure what to do with this info or where to go from here (I haven't read all the code, and much of it I probably can't understand). I would be happy enough if I understood how to change the block sizes in the look-up table, so that I could rewrite dialogue without the length constraint that breaks the file structure.
ner0
Posts: 15
Joined: Sat Jan 16, 2016 9:53 pm

Re: Broken Sword 2 - The Smoking Mirror (1997 Original, NOT-Remastered)

Post by ner0 »

Following a helpful post about this subject on Xentax, I did my own research and confirmed essentially the same things, although with a slight deviation in interpretation.
  • File header
    4 bytes, lookup table offset
    4 bytes, null
  • Text Resource lookup tables
    24 bytes, resource header
    4 bytes, resource id
    16 bytes, null
    4 bytes, total # of dialogue lines
    4 bytes, offset for each dialogue
    (loop the one above until # of total dialogue lines)
    2 bytes, dialogue ID
    ? bytes, dialogue text, null terminated
    (loop the one above until # of total dialogue lines)
  • File lookup table
    4 bytes, text resource offset
    4 bytes, text resource chunk size
    (loop the two above until EOF)
Note: The 4 bytes previous to the Lookup Table are self-referencing (weird checksum?)
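
The layout listed above can be turned into a small reader. The following Python is a sketch under stated assumptions, not a verified parser: the function names are invented for illustration, the dialogue ID is read as an unsigned 16-bit value, and the per-line offsets are assumed to be relative to the start of the resource and to point at the 2-byte dialogue ID. The data it runs on is synthetic (zeroed header, no padding, no 4 bytes before the table).

```python
import struct

RES_HEADER_SIZE = 44  # 24 + 4 + 16 bytes, per the layout above

def read_lookup_table(clu: bytes):
    """Return [(offset, size), ...] from the table at the end of the container."""
    (table_offset,) = struct.unpack_from("<I", clu, 0)
    return [struct.unpack_from("<II", clu, pos)
            for pos in range(table_offset, len(clu), 8)]

def read_text_resource(res: bytes):
    """Return [(dialogue_id, raw_text), ...] for one text resource."""
    (count,) = struct.unpack_from("<I", res, RES_HEADER_SIZE)
    offsets = struct.unpack_from(f"<{count}I", res, RES_HEADER_SIZE + 4)
    lines = []
    for off in offsets:
        # ASSUMPTION: `off` is relative to the resource start and points
        # at the 2-byte dialogue ID that precedes the text.
        (dialogue_id,) = struct.unpack_from("<H", res, off)
        end = res.index(b"\x00", off + 2)  # text is null terminated
        lines.append((dialogue_id, res[off + 2:end]))
    return lines

def build_text_resource(lines):
    """Build a synthetic resource (zeroed header) to exercise the reader."""
    table_end = RES_HEADER_SIZE + 4 + 4 * len(lines)
    body, offsets = b"", []
    for dialogue_id, text in lines:
        offsets.append(table_end + len(body))
        body += struct.pack("<H", dialogue_id) + text + b"\x00"
    head = bytes(RES_HEADER_SIZE) + struct.pack("<I", len(lines))
    return head + struct.pack(f"<{len(lines)}I", *offsets) + body

res = build_text_resource([(1, b"Hello."), (2, b"Goodbye.")])
# Synthetic container: 8-byte file header, one resource, one table entry.
clu = struct.pack("<II", 8 + len(res), 0) + res + struct.pack("<II", 8, len(res))
print(read_lookup_table(clu))
print(read_text_resource(res))
```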

When changing a single dialogue line, for example, the following must be considered... let's assume that the original dialogue is 80 bytes and we add an extra 10 bytes to that line (90 bytes in total), the following must then be fixed:

- File header (lookup table offset): if it was B1AA04 before, it should now be BBAA04 (10 bytes added)

- Text Resource: assuming this is the 3rd dialogue line, all subsequent offsets of this particular resource, for 4th dialogue line and onward, now need to be shifted 10 bytes forward (ex: 4th offset was C302, it will become CD02, and so on for all remaining offsets after the offset of the changed dialogue).

- File lookup table: assuming that the text resource that was changed is the 11th in this lookup table (which lists Text Resource offsets and their respective chunk sizes), you'd need to change the 11th chunk size (let's say from 5903 to 6303) and then shift the offsets (not the chunk sizes) of all subsequent resources 10 bytes forward, until the end of the file.

Note: It's possible that the 4 bytes before the File lookup table may need to be readjusted to the new offset (+10 bytes in this case).
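
The file lookup table part of these fixes can be sketched as a small Python function. It is illustrative only: `grow_resource` is a hypothetical name, the table values are made up, and it assumes the resources are stored back to back in the same order as the table, with the table right after them. It does not deal with the 4 bytes before the table.

```python
def grow_resource(table_offset, table, index, delta):
    """Return (new_table_offset, new_table) after entry `index` grows by `delta` bytes.

    `table` is a list of (offset, size) pairs; `index` is 0-based.
    """
    new_table = []
    for i, (offset, size) in enumerate(table):
        if i == index:
            size += delta      # only the edited resource changes size
        elif i > index:
            offset += delta    # everything stored after it moves forward
        new_table.append((offset, size))
    # ASSUMPTION: the table sits after all resources, so it moves as well.
    return table_offset + delta, new_table

# Made-up three-entry table; the middle resource grows by 10 bytes.
table = [(0x8, 0x100), (0x108, 0x359), (0x461, 0x40)]
new_offset, new_table = grow_resource(0x4A1, table, 1, 10)
print(hex(new_offset), [(hex(o), hex(s)) for o, s in new_table])
```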

That being said, I'm still not positive about what's missing since the file seems to become corrupt (crashes the game) when hex-editing the dialogue lines past a certain length. I have not been able to figure out why, so everything is still slightly incomplete.

Any help is appreciated.
rabatini
Posts: 179
Joined: Tue Jan 18, 2022 12:21 am

Re: Broken Sword 2 - The Smoking Mirror (1997 Original, NOT-Remastered)

Post by rabatini »

This .CLU is a container with a lot of text and file resources.

I made a script to split them.

Code:

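# Split the container into one file per "Text Resource" block
findloc OFFSET STRING "Text Resource "
math i = 0
do
    goto OFFSET
    # step past the current match so the next findloc finds the following block
    get DUMMY long
    findloc NEXT_OFFSET STRING "Text Resource " 0 ""

    if NEXT_OFFSET == ""
        # last block: it runs to the end of the file
        get SIZE asize
    else
        math SIZE = NEXT_OFFSET
    endif
    math SIZE -= OFFSET
    string NAME p= OFFSET SIZE i
    STRING PATH = NAME

    log "" OFFSET SIZE
    math i += 1
    math OFFSET = NEXT_OFFSET
while NEXT_OFFSET != ""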
Dumping and reinserting the text can be done with a tool.

The bms script to extract the text is something like this:

Code:

# Run on a split file: dump each string using the pointer table
GETDSTRING NAME 0x22
get file asize
get entries long
for rip = 1 to entries
    get offset long
    savepos temp
    get size long
    if size < file
        # "size" holds the next pointer here; both pointers are biased by 8
        xmath offset "(offset - 8)"
        xmath size "(size - 8)"
        xmath size "(size - offset)"
        goto temp
        slog "" offset size
    else
        # last string: it runs to the end of the file
        xmath offset "(offset - 8)"
        xmath size "(file - offset)"
        slog "" offset size
    endif
next rip
ner0
Posts: 15
Joined: Sat Jan 16, 2016 9:53 pm

Re: Broken Sword 2 - The Smoking Mirror (1997 Original, NOT-Remastered)

Post by ner0 »

Thanks for sharing those scripts.
Unfortunately, the tricky part is the repacking - I'm using the term loosely.
Extracting the data and coding a repacker/importer only makes sense once everything about this file/container is understood. The technical description that I made pretty much established that this file is a container with a lookup table and multiple resources, each with its own index table.

Like I mentioned, I have an issue where patching the file works, but crashes the game when some lines have more than X characters, which I'm pretty sure is a problem with the way the file is being patched and not the game engine itself. Priority is still understanding this.
rabatini
Posts: 179
Joined: Tue Jan 18, 2022 12:21 am

Re: Broken Sword 2 - The Smoking Mirror (1997 Original, NOT-Remastered)

Post by rabatini »

ner0 wrote:Thanks for sharing those scripts.
Unfortunately, the tricky part is the repacking - I'm using the term loosely.
Extracting the data and coding a repacker/importer only makes sense once everything about this file/container is understood. The technical description that I made pretty much established that this file is a container with a lookup table and multiple resources, each with its own index table.

Like I mentioned, I have an issue where patching the file works, but crashes the game when some lines have more than X characters, which I'm pretty sure is a problem with the way the file is being patched and not the game engine itself. Priority is still understanding this.
The first script works to extract and reimport.

About the text, the game crashes because of the pointers.
Each split file from the first script contains a text resource with pointers giving the location of each string.

The second script has to be used on those split files.

My understanding is that at offset 0x22 you have the number of pointers, then the next long (4 bytes) is the pointer to the first string; subtract 0x08 from it to get the string's offset in the split file.
The next 4 bytes point to the next string, so it's simple math:
next pointer - current pointer gives the correct string size (size - offset in the script).
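
The same pointer math, written out in Python as a rough sketch that mirrors the second script. `string_spans` is a made-up name, the pointer values below are invented for illustration, and the 8-byte bias only applies to files split by the first script.

```python
def string_spans(pointers, file_size, bias=8):
    """Return [(start, length), ...] for each string in a split file.

    Each start is `pointer - bias`; each length is the distance to the
    next string, or to the end of the file for the last one.
    """
    spans = []
    for i, ptr in enumerate(pointers):
        start = ptr - bias
        if i + 1 < len(pointers):
            end = pointers[i + 1] - bias
        else:
            end = file_size
        spans.append((start, end - start))
    return spans

# Invented values: two pointers in a 0x400-byte split file.
for start, length in string_spans([0x2A3, 0x2E0], 0x400):
    print(hex(start), hex(length))
```

Note that, as in the script, each length covers everything up to the next string's start, so it includes the terminating null and whatever precedes the next text.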
0000000a.tex.txt
ner0
Posts: 15
Joined: Sat Jan 16, 2016 9:53 pm

Re: Broken Sword 2 - The Smoking Mirror (1997 Original, NOT-Remastered)

Post by ner0 »

rabatini wrote:The first script works to extract and reimport.

About the text, the game crashes because of the pointers.
Each split file from the first script contains a text resource with pointers giving the location of each string.

The second script has to be used on those split files.

My understanding is that at offset 0x22 you have the number of pointers, then the next long (4 bytes) is the pointer to the first string; subtract 0x08 from it to get the string's offset in the split file.
The next 4 bytes point to the next string, so it's simple math:
next pointer - current pointer gives the correct string size (size - offset in the script).
It seems that we're saying the same things but getting through in a different way.

First of all, I entirely agree that this comes down to simple math in the case you described, but that's not the problem at all.
The game crashing is unrelated to the resource pointer offsets themselves; I know this because I shift them properly, or at least I think I do.

Offset 0x22 is indeed storing the number of pointers, except for me it's offset 0x2a (0x2c if properly extracted) just because this way the index offsets correspond to the exact relative position instead of having to do -8 when stepping through each offset, but that's irrelevant at the moment, just a personal approach that doesn't change the end result. So, let me give a specific example using one resource extracted by your script (including attached samples for illustration):

00000015.tex

- Offset 0x22 = 1c (28 pointers)

- Pointer #6, offset 0x2a3-8 (0x29b)

- Original text in #6 (0x3a bytes long), offset 0x29b:

Code:

Getting past the night watchman and his dog had been easy.
- New text in #6 (0x5f bytes long), offset 0x29b:

Code:

Getting past the night watchman and his dog had been easy.-EXTENDS STRING LENGTH BY 0x25 BYTES-
- The resource index table needs to shift forward 0x25 bytes on the offsets for pointers #7 through #28, offsets for pointers previous to #7 are left unchanged.

- The container's lookup table offset in the first 4 bytes needs to be increased by 0x25 bytes, from d2f00400 (0x4f0d2) to f7f00400 (0x4f0f7)

- The file lookup table at offset 0x4f0f7, which contains the pairs of offsets and chunk sizes for every text resource, needs to increase by 0x25 bytes the size of the resource just changed (Text Resource 3257), and to also shift forward 0x25 bytes on the offsets of every subsequent resource until EOF.
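
The header bytes and the pointer shift from the steps above can be double-checked with a short Python sketch. `shift_line_pointers` is a hypothetical helper, and the pointer list is invented apart from 0x2a3; only the header bytes and the 0x25 delta come from the example.

```python
import struct

DELTA = 0x25  # bytes added to the edited line in the example above

def shift_line_pointers(pointers, edited_line, delta):
    """Shift the pointer of every line after `edited_line` (1-based)."""
    return [p + delta if n > edited_line else p
            for n, p in enumerate(pointers, start=1)]

# Container header: stored bytes -> integer -> patched bytes.
(old_offset,) = struct.unpack("<I", bytes.fromhex("d2f00400"))
new_header = struct.pack("<I", old_offset + DELTA)
print(hex(old_offset), new_header.hex())  # 0x4f0d2 f7f00400

# Invented 4-line table; line 2 is the one that grew.
print([hex(p) for p in shift_line_pointers([0x100, 0x2A3, 0x2E0, 0x300], 2, DELTA)])
```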

This is it, but the game will crash nonetheless.

If there is anything at all about what I just described that you disagree with, or any misstep in it or in the attached samples, please let me know. Like I said initially, I'm fairly confident about what I found and what needs to be changed, but somehow it still crashes the game in some instances (not all), so obviously something else besides the things described above needs to be fixed; I just don't see what yet.

Thanks for taking the time to help out, much appreciated.
Last edited by ner0 on Thu Jan 05, 2023 12:22 am, edited 2 times in total.
ner0
Posts: 15
Joined: Sat Jan 16, 2016 9:53 pm

Re: Broken Sword 2 - The Smoking Mirror (1997 Original, NOT-Remastered)

Post by ner0 »

So, it turns out that it was a math problem after all, but possibly not the expected one: it's a quirk of how the game engine parses the file, not something defined by the file structure itself.

The byte-size of all Text Resources must be divisible by 4, which means that if the length of the text is changed, and the size of the Text Resource is not divisible by 4, then you must add null padding at the end of the resource until it is divisible by 4.

We can use 00000015_edited+fixed.tex.txt as an example, but first a note: the current bms script skips the first 10 bytes of every resource it extracts and appends 11 bytes from the beginning of the next one, which is incorrect. This means that the actual length of this particular example should be 0xa + 0x7b5 - 0xb = 0x7ca. Now, 0x7ca is also not divisible by 4, so padding the end with nulls until it reaches 0x7cc would fix it.
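
The padding rule is easy to express in code. A minimal Python sketch follows, with `pad_to_4` as a made-up helper name and a zero-filled buffer standing in for the real resource. Presumably the padded length is also what goes into the chunk size in the file lookup table, but that part is an assumption.

```python
def pad_to_4(resource: bytes) -> bytes:
    """Append null bytes until the length is a multiple of 4."""
    # -len % 4 is the number of bytes missing to the next multiple of 4
    # (0 when the length is already aligned).
    return resource + bytes(-len(resource) % 4)

# Stand-in for a resource of the length discussed above.
padded = pad_to_4(bytes(0x7CA))
print(hex(len(padded)))  # 0x7cc
```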