Wrong obfuscation result on the files with size > 2 GB

Post by grandshot »

I'm trying to implement the file table obfuscation algorithm used in the *.grp archives of the game Boiling Point: Road to Hell.
The pseudocode taken from one of the engine's *.dll files works fine for archives smaller than 2 GB, but gives wrong results for larger ones.

I wrote this simple C++ program to demonstrate:

Code: Select all

#include <stdio.h>
#include <stdlib.h>

int main()
{
    unsigned int newgrpSize = 2859709129, newgrp1Size = 1168718524, initialValue = 47536;
    unsigned char data[22] = {7, 211, 11, 95, 141, 76, 227, 180, 107, 133, 207, 242, 88, 9, 168, 238, 124, 46, 30, 15, 251, 38};
    unsigned char data2[22] = { 31, 40, 102, 193, 185, 74, 118, 82, 147, 67, 151, 16, 205, 161, 54, 63, 230, 105, 167, 64, 171, 182 };

    // Archive smaller than 2 GB
    srand((newgrp1Size + initialValue | newgrp1Size + initialValue >> 31 << 32) % 65535);

    for (int i = 0; i < 22; i++)
    {
        int v2 = rand();
        data[i] ^= (v2 | v2 >> 31 << 32) % 255;
        printf("%c", data[i]);
    }

    // Archive larger than 2 GB
    srand((newgrpSize + initialValue | newgrpSize + initialValue >> 31 << 32) % 65535);

    for (int i = 0; i < 22; i++)
    {
        int v2 = rand();
        data2[i] ^= (v2 | v2 >> 31 << 32) % 255;
        printf("%c", data2[i]);
    }

    return 0;
}

The bytes in data and data2 are obfuscated path strings cut from the file tables of two different archives. As the output shows, after re-obfuscation data turns into a readable string, but data2 does not.


Where might the issue be?

Re: Wrong obfuscation result on the files with size > 2 GB

Post by spiritovod »

@grandshot: I would suggest using an IDE with some code assistant; errors like that will be explained there. srand accepts a uint32, but a left shift by 32 bits needs a wider type (at least int64) to hold the result for any non-zero value. On a 32-bit operand the result is undefined, and the compiler probably truncates it to uint32, which may work in some cases and not in others. The same goes for the use of v2 with the left shift.
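To make the width issue concrete, here is a minimal Python sketch. Python integers are unbounded, so it shows what the expression would evaluate to in a wide enough type and what is left after truncation to 32 bits. It does not reproduce what any particular C++ compiler does with the undefined shift. The name decompiler_expr is made up for this example; the sizes and the initial value are the ones from the first post.

```python
def decompiler_expr(v: int) -> int:
    """The expression from the pseudocode, evaluated with unbounded integers.

    Precedence matches the C++ version: (v) | ((v >> 31) << 32).
    """
    return v | v >> 31 << 32


INITIAL_VALUE = 47536
small = 1168718524 + INITIAL_VALUE   # archive below 2 GB, bit 31 is clear
large = 2859709129 + INITIAL_VALUE   # archive above 2 GB, bit 31 is set

for v in (small, large):
    wide = decompiler_expr(v)        # value in a type wider than 32 bits
    narrow = wide & 0xFFFFFFFF       # what fits into a 32-bit variable
    print(f"v={v}  wide={wide}  narrow={narrow}  wide%65535={wide % 65535}")
```

For the small archive the shifted term is zero and the expression is a no-op. For the large one it sets bit 32, so the value that reaches the modulo depends entirely on how wide the intermediate type is.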

Re: Wrong obfuscation result on the files with size > 2 GB

Post by grandshot »

@spiritovod: Thanks for the reply.

I suspected as much, but in the original function from the dll the rand() result is always cast to int32_t and everything works correctly.


'size' is the size of the obfuscated data, 'pwd' is the total size of the grp archive.

Actually, I wrote the C++ sample only for testing. I prefer Python and have already implemented my own version of the obfuscator. It runs as fast as the C++ version (with the same issue), and even has a small improvement: it can parse the file table on the fly without allocating the whole table data.

Code: Select all


    def _randomizer_set_seed(self, initial_value=XENUS_INITIAL_VALUE) -> None:
        value = self.group_file_size + initial_value
        self._seed = (value | value >> 31 << 32) % 0xFFFF

    def _randomizer(self) -> int:
        # implementation of C++ rand() function.
        self._seed = (self._seed * 214013 + 2531011) % 2**64
        return (self._seed >> 16) & 0x7fff

    def _obfuscate_bytes(self, data: bytes) -> bytes:
        return bytes(i ^ ((randomizer := self._randomizer()) | randomizer >> 31 << 32) % 255 for i in data)

Code: Select all

    def _parse_file_table(self):
        for file_id in range(self.num_files):
            path_length = unpack('<H', self._obfuscate_bytes(self.file_stream.read(2)))[0]
            entry_size = path_length + 12
            if self.file_stream.tell() + entry_size > (self.file_table_size + 16):
                self.error_message = 'Can\'t allocate %d bytes for file entry %d from offset 0x%X.' % (entry_size, file_id, self.file_stream.tell())
                return 0
            file = SingleFile(unpack('<%ds3I' % (path_length), self._obfuscate_bytes(self.file_stream.read(entry_size))))
            #print(list(i for i in self.files[-1].path))
        return 1

Well, I could call the original dll function from Python, but I want to keep my code clean and independent of the engine files.

I presume I need to hook the variables in the assembly code and trace how they change at every step of the process, so I can compare them with the results of my code. I can't say I know how to do that.

Re: Wrong obfuscation result on the files with size > 2 GB

Post by spiritovod »

@grandshot: This function looks much simpler in IDA:

Code: Select all

void __cdecl gfLoadMajor(unsigned __int8 *data, int size, int pwd)
{
  int i; // esi

  srand((pwd + 47536) % 0xFFFF);
  for ( i = 0; i < size; data[i - 1] ^= rand() % 255 )
    ++i;
}

which is the same implementation as in aluigi's original script, mentioned in this topic.

It means you can simply use "srand((fileSize + initialValue) % 0xFFFF)" and "data[i] ^= v2 % 255" in both cases. Just declare the file sizes and the seed/initial value as int instead of unsigned int, and it will work properly.
For reference, compare the behavior of the C++ app with the quickbms script. The script works with grp files smaller than 0x7FFFFFFF bytes (about 2 GB, the maximum value of a signed int) in both regular quickbms and the 4gb_files version, but for grp files above that size only regular quickbms works. This comes down to how values are processed in the two versions.
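A minimal Python sketch of why the signed declaration matters, assuming 32-bit int and the C/C++ rule that % truncates toward zero (Python's % floors instead). The helper names to_int32 and c_mod are made up for this example; the sizes and the constant 47536 come from the posts above.

```python
def to_int32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def c_mod(a: int, b: int) -> int:
    """C-style remainder: the result takes the sign of the dividend."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


INITIAL_VALUE = 47536

for size in (1168718524, 2859709129):
    pwd = to_int32(size)                          # file size stored in an int
    value = to_int32(pwd + INITIAL_VALUE)         # 32-bit signed addition
    seed = c_mod(value, 0xFFFF)                   # (pwd + 47536) % 0xFFFF as in C
    python_mod = value % 0xFFFF                   # Python's floored % for comparison
    # seed & 0xFFFFFFFF is the value as srand(unsigned int) receives it
    print(size, seed, seed & 0xFFFFFFFF, python_mod)
```

For the archive below 2 GB both remainders agree. For the one above 2 GB the signed value is negative, so the C-style remainder is negative as well and differs from what Python's % returns.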

Re: Wrong obfuscation result on the files with size > 2 GB

Post by grandshot »

Thanks. I edited my C++ test program and everything works fine.

Now I need to change my Python _randomizer function somehow to get the same result.

I suspect the modulus and exponent operations at the end are to blame:

Code: Select all

    def _randomizer(self) -> int:
        # implementation of C++ rand() function.
        self._seed = (self._seed * 214013 + 2531011) % 2**64  # This converts signed to unsigned, which is not needed
        return (self._seed >> 16) & 0x7FFF

But without them the code runs very, very slowly. I couldn't even wait for it to finish: I held out for two minutes and then hit a keyboard interrupt :)

Re: Wrong obfuscation result on the files with size > 2 GB

Post by spiritovod »

@grandshot: I'm not familiar with Python, but I still can't understand why "% 2**64" is even required there. If you want to explicitly cast the value to int, there should be other ways to do so. Maybe it's worth taking a look at numpy / ctypes functionality, or just using bitwise operations.
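A minimal sketch of the bitwise route, under stated assumptions: the generator keeps a 32-bit state like the MSVC-style rand() quoted above (constants 214013 and 2531011 are taken from the posts), and the seed follows the IDA listing with 32-bit signed int semantics. The names c_seed, MsvcRand and xor_stream are hypothetical. Some reduction of the state is needed in Python because its integers are unbounded: without one the state grows by about 18 bits per call, which would explain the slowdown. Since only bits 16 to 30 of the state are returned, "% 2**64" and "& 0xFFFFFFFF" give the same outputs, so the difference from the C++ result is more likely in the seed.

```python
from struct import unpack

MASK32 = 0xFFFFFFFF


def c_seed(archive_size: int, initial_value: int = 47536) -> int:
    """(pwd + initial_value) % 0xFFFF with signed 32-bit semantics, as srand() sees it."""
    value = (archive_size + initial_value) & MASK32
    if value & 0x80000000:               # reinterpret as a negative int32
        value -= 1 << 32
    remainder = abs(value) % 0xFFFF      # C's % truncates toward zero
    if value < 0:
        remainder = -remainder
    return remainder & MASK32            # srand() takes an unsigned int


class MsvcRand:
    """Linear congruential generator with a 32-bit state (assumed MSVC-style)."""

    def __init__(self, seed: int) -> None:
        self.state = seed & MASK32

    def next(self) -> int:
        # The mask keeps the state at 32 bits, like an unsigned long in the runtime.
        self.state = (self.state * 214013 + 2531011) & MASK32
        return (self.state >> 16) & 0x7FFF


def xor_stream(data: bytes, archive_size: int) -> bytes:
    """XOR each byte with rand() % 255; the same call obfuscates and restores."""
    rng = MsvcRand(c_seed(archive_size))
    return bytes(b ^ (rng.next() % 255) for b in data)


# First two bytes of each sample from the first post, read as a '<H' path length.
print(unpack('<H', xor_stream(bytes([7, 211]), 1168718524))[0])   # 20
print(unpack('<H', xor_stream(bytes([31, 40]), 2859709129))[0])   # 14
```

Both samples start with a small little-endian value followed by a zero byte, which would fit the 2-byte path length that _parse_file_table reads first. ctypes.c_uint32 or numpy.uint32 could serve the same purpose as the mask, at the cost of an extra dependency or object creation per call.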