Team Ico Sample Thread

AnonBaiter
Posts: 1125
Joined: Tue Feb 02, 2016 2:35 am

Team Ico Sample Thread

Post by AnonBaiter »

So I came across a game by Team Ico/Fumito Ueda that uses this type of archive. Because of the regional differences, I divided the samples into three folders, each with its respective (filecutter'd) .df file. Here goes nothing:
https://mega.nz/#!UcswzRKA!V56Me_brCn_zaJZKX34W7mlpuXHFMbzvemfPolOyF9E
Last edited by AnonBaiter on Tue May 10, 2016 2:53 pm, edited 1 time in total.
aluigi
Site Admin
Posts: 12984
Joined: Wed Jul 30, 2014 9:32 pm

Re: ICO .DF

Post by aluigi »

http://aluigi.org/bms/teamico.bms

The archived DF files are automatically decompressed (deflate) on the fly.
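
For anyone who wants to reproduce the decompression step outside QuickBMS, here is a minimal Python sketch. It assumes the payload is a raw deflate stream without a zlib header (an assumption, since the thread does not say which framing the DF archive uses), and `inflate_raw` is a hypothetical helper name, not something taken from teamico.bms.

```python
import zlib


def inflate_raw(data: bytes) -> bytes:
    """Decompress a headerless (raw) deflate stream.

    Assumption: the archive stores raw deflate. If the data turns out to
    carry a zlib header, zlib.decompress(data) is the call to use instead.
    """
    decomp = zlib.decompressobj(-zlib.MAX_WBITS)  # negative wbits = raw deflate
    return decomp.decompress(data) + decomp.flush()


# Round-trip check on synthetic data (not taken from a real DF archive).
comp = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS)
sample = comp.compress(b"VAGp" * 64) + comp.flush()
restored = inflate_raw(sample)
```

The compressed bytes would come from the offset and size that the archive index gives for each entry.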

Re: ICO .DF

Post by AnonBaiter »

Well, the DATA.DF archive also contains .DF files. So I assume I must reuse the script for these files, right?

And here's one for Shadow of the Colossus:
https://mega.nz/#!1MdHVR7B!1uZbzzqpx5du1CAJyfiXnN83lvk3ugpNg4NhqDybbJU

Re: Team Ico Sample Thread

Post by aluigi »

The extracted DF files don't have the same format as the archive.
At first glance they don't seem easy to parse: there are some sequential SIZE DATA pairs, but inside this DATA there are also other files (for example VAG data in the middle without references).

As for the other game, I don't see a common format for those files. The reason is the lack of an index: XAD starts immediately with raw VAG audio data, and there is no index at its end either.

Re: Team Ico Sample Thread

Post by AnonBaiter »

I see... in this case, can you take a look at the attached file?

As for the Colossus game, I'll see what I can do to find the index file. The full NICO.DAT file (2.6GB, as contained in the USA, KOR and JAP versions) does have an index scattered around somewhere at the beginning of the file, though. In the EUR version, however, the archive itself was split into 5 parts (NICO.DAT->XAB->XAC->XAD->XAE).

EDIT: And here is what I could find in two regional versions of the game (USA, EUR/AUS):
https://mega.nz/#!kUcSySxK!BSUGDqjjS8Ele6E_r-4_3GU4Qt92QJyQfCAJrH40rwo
https://mega.nz/#!VEV0mQAY!xRrrpyc9m9c4wOEktdOtATjRE2eNahwa_-4FHXfvh2M

EDIT2: And here's what I found in the other two regional versions of the game (KOR, JAP):
https://mega.nz/#!hBs3UahK!bolv-KU5uYMLdtV6Itji_f4V3vIUaSM8jp5mri8ICk4
https://mega.nz/#!VJlEHJzJ!Ewy_n7ohzc3naSbZY44kROkhYoJVxRhPdzOGf-q66nk
https://mega.nz/#!4ZUyTC6D!OIGAk9VmKUh9aeAWl_TiTMZjIYVsTDhlyCGkg4cSdts <- two regions added(KOR, JAP)

Re: Team Ico Sample Thread

Post by aluigi »

Have you already checked this thread?
http://forum.xentax.com/viewtopic.php?p=29025

Almost all the posts on xentax are deleted so it's probably totally useless but who knows...

Re: Team Ico Sample Thread

Post by AnonBaiter »

Yeah but I found this one too:
https://github.com/moosotc/dormin

It can only extract NICO.DAT->XAB->XAC->XAD->XAE files though (maybe it was made with the PAL version in mind), and I don't know how to compile it.

Re: Team Ico Sample Thread

Post by AnonBaiter »

...and i'm back with a new update.

i finally managed to figure out the whole archive format as used on Shadow of the Colossus/Wander to Kyozou.
the structure is all a mess in that the archive is nothing but file-fetching, yet i managed to conquer this one anyway.

here's all of its madness in action

Code:

open FDSE "NICO.DAT" 0
#open FDSE "nico.lst" 1 EXISTS # in case that one exists at all...

# (todo) organize (and probably recode) this entire mess...

math build_index = 0
# ^ set this to 1 *if* you want to build a separate .index file
math demo = 0
# ^ set this to 1 *if* you have the demo(either as its own disc or out of some "PS2 magazine edition" kind of disc which contains said demo)
# it contains some extra "folders" not found anywhere else
math pal_release = 0
# ^ no need to set this to any number actually, this script will automatically check the entire NICO.DAT size for this one

get nico_size asize 0

if nico_size == 0x40000000
   math pal_release = 1
endif

print " okay, gathering all files out of this archive. \n "

math i = 0
for first_i = 0 < 2
   get offset1 long
   get size1 long
   putarray 1 i offset1
   putarray 2 i size1
   math i + 1
next first_i

get entries1 long
get name_size1 long
for name1_i = 0 < entries1
   get name1 string
next name1_i

for second_i = 0 < entries1
   get chunks1 long
   for first_j = 0 < chunks1
      get main_chunk1 short
   next first_j
   get sub_entries1 long
   get sub_entries2 long
   if pal_release == 0
      xmath total_sub_entries1 "sub_entries1 + sub_entries2"
   elif pal_release == 1
      xmath total_sub_entries1 "sub_entries1 + (sub_entries2 * 5)"
   endif
   math i = i
   for second_j = 0 < total_sub_entries1
      get offset2 long
      get size2 long
      putarray 1 i offset2
      putarray 2 i size2
      math i + 1
   next second_j
next second_i

get entries2 long
get name_size2 long
for name2_i = 0 < entries2
   get name2 string
next name2_i

for third_i = 0 < entries2
   get chunks2 long
   for third_j = 0 < chunks2
      get main_chunk2 short
   next third_j
   get sub_entries3 long
   get sub_entries4 long
   xmath total_sub_entries2 "(sub_entries3 + sub_entries4) * 2"
   math i = i
   for fourth_j = 0 < total_sub_entries2
      get offset3 long
      get size3 long
      putarray 1 i offset3
      putarray 2 i size3
      math i + 1
   next fourth_j
   
   # (todo) maybe the below cycles aren't necessary here, although i'm not too sure about this though...
   get unknown_chunks1 long
   for fifth_j = 0 < unknown_chunks1
      get unknown01 long
      get unknown02 long
      get unknown03 long
   next fifth_j
   
   get sub_entries5 long
   math i = i
   for sixth_j = 0 < sub_entries5
      get offset4 long
      get size4 long
      putarray 1 i offset4
      putarray 2 i size4
      math i + 1
   next sixth_j
   
   get sub_entries6 long
   math i = i
   for seventh_j = 0 < sub_entries6
      get offset5 long
      get size5 long
      putarray 1 i offset5
      putarray 2 i size5
      math i + 1
   next seventh_j
next third_i

if demo = 1
   get entries3 long
   get name_size3 long
   for name3_i = 0 < entries3
      get name3 string
   next name3_i
   
   math i = i
   for fourth_i = 0 < entries3
      get sub_entries7 long
      get entry_size1 long
      for eighth_j = 0 < sub_entries7
         get aps01 long
         get aps02 long
         get offset7 long
         get size7 long
         putarray 1 i offset7
         putarray 2 i size7
         math i + 1
      next eighth_j
   next fourth_i
endif

print " the script has now gathered all known files out of this archive. \n this script will now sort them out for extraction. \n please wait patiently... "

#sortarray 1 2

math full_files = i
if build_index = 1
   #putvarchr MEMORY_FILE10 -1 0
   log MEMORY_FILE10 0 0x100000
   put full_files long MEMORY_FILE10
   if pal_release = 0
      put 1 long MEMORY_FILE10
      if EXISTS == 1
         xmath name_off "(full_files * 22) + 8"
      endif
   elif pal_release = 1
      put 2 long MEMORY_FILE10
      xmath chunks_off "(full_files * 4) + 8"
   endif
endif

if build_index = 1
   print "\n okay, building index file right away... "
endif

for x = 0 < full_files
   getarray offset 1 x
   getarray size 2 x
   if pal_release = 1
      xmath split_archive_number "(offset * 0x800) / nico_size"
      xmath split_archive_next_number "split_archive_number + 1"
      xmath split_archive_test "offset * 0x800"
      xmath split_archive_draft2 "split_archive_number * nico_size"
      xmath split_archive_file_offset "(offset * 0x800) - split_archive_draft2"
      xmath split_archive_draft4 "split_archive_file_offset + size"
      xmath split_archive_file_chunks "(split_archive_draft4 / nico_size) + 1"
      if split_archive_file_chunks == 2
         xmath split_archive_size1 "nico_size - split_archive_file_offset"
         xmath split_archive_size2 "split_archive_draft4 - nico_size"
      endif
   endif
   
   if split_archive_number = 0
      set split_archive_temp_fileload1 string "NICO.DAT"
   elif split_archive_number = 1
      set split_archive_temp_fileload1 string "xab"
   elif split_archive_number = 2
      set split_archive_temp_fileload1 string "xac"
   elif split_archive_number = 3
      set split_archive_temp_fileload1 string "xad"
   elif split_archive_number = 4
      set split_archive_temp_fileload1 string "xae"
   endif
   set split_archive_temp_namesize1 strlen split_archive_temp_fileload1
   math split_archive_temp_namesize1 + 1
   
   if split_archive_next_number = 1
      set split_archive_temp_fileload2 string "xab"
   elif split_archive_next_number = 2
      set split_archive_temp_fileload2 string "xac"
   elif split_archive_next_number = 3
      set split_archive_temp_fileload2 string "xad"
   elif split_archive_next_number = 4
      set split_archive_temp_fileload2 string "xae"
   endif
   set split_archive_temp_namesize2 strlen split_archive_temp_fileload2
   math split_archive_temp_namesize2 + 1
   
   if build_index = 0
      if pal_release = 0
         math offset * 0x800
         log "" offset size
      elif pal_release = 1
         if split_archive_file_chunks != 1
            putvarchr MEMORY_FILE size 0
            log MEMORY_FILE 0 0
         endif
         
         append
         for y = 0 < split_archive_file_chunks
            if y = 0
               open FDSE split_archive_temp_fileload1 split_archive_number
               if split_archive_file_chunks = 1
                  log "" split_archive_file_offset size split_archive_number
               else
                  log MEMORY_FILE split_archive_file_offset split_archive_size1 split_archive_number
               endif
            elif y = 1
               open FDSE split_archive_temp_fileload2 split_archive_next_number
               log MEMORY_FILE 0 split_archive_size2 split_archive_next_number
            endif
         next y
         append
         
         if split_archive_file_chunks != 1
            log "" 0 size MEMORY_FILE
         endif
      endif
   elif build_index = 1
      if pal_release = 0
         put 9 byte MEMORY_FILE10
         putct "NICO.DAT" string -1 MEMORY_FILE10
         put 0 byte MEMORY_FILE10
         put offset long MEMORY_FILE10
         put size long MEMORY_FILE10
         if EXISTS == 1
            put name_off long MEMORY_FILE10
            savepos index_tmp1 MEMORY_FILE10
            goto name_off MEMORY_FILE10
            # the .lst file is not finished yet
            if offset >= 0xf9624 && offset <= 0x142130
               get name line 1
               putct name string -1 MEMORY_FILE10
               put 0x00 byte MEMORY_FILE10
            else
               put 0xff byte MEMORY_FILE10
               put 0x00 byte MEMORY_FILE10
            endif
            savepos name_off MEMORY_FILE10
            savepos index_size MEMORY_FILE10
            goto index_tmp1 MEMORY_FILE10
         else
            savepos index_size MEMORY_FILE10
         endif
         putarray 999 x index_size
      elif pal_release = 1
         put chunks_off long MEMORY_FILE10
         savepos index_tmp1 MEMORY_FILE10
         goto chunks_off MEMORY_FILE10
         if EXISTS == 1
            # the .lst file is not finished yet
            if offset >= 0x1696ba && offset <= 0x1b26a2
               get name line 1
               set name_size strlen name
               math name_size + 1
               put name_size byte MEMORY_FILE10
               putct name string -1 MEMORY_FILE10
               put 0x00 byte MEMORY_FILE10
            else
               put 2 byte MEMORY_FILE10
               put 0xff byte MEMORY_FILE10
               put 0x00 byte MEMORY_FILE10
            endif
         endif
         put split_archive_file_chunks long MEMORY_FILE10
         for y = 0 < split_archive_file_chunks
            if y = 0
               if split_archive_number >= 0 && split_archive_number <= 4
                  put split_archive_temp_namesize1 byte MEMORY_FILE10
                  putct split_archive_temp_fileload1 string -1 MEMORY_FILE10
                  put 0 byte MEMORY_FILE10
               endif
               xmath offset2 "split_archive_file_offset / 0x800"
               put offset2 long MEMORY_FILE10
               if split_archive_file_chunks = 1
                  put size long MEMORY_FILE10
               else
                  put split_archive_size1 long MEMORY_FILE10
               endif
            elif y = 1
               if split_archive_next_number >= 1 && split_archive_next_number <= 4
                  put split_archive_temp_namesize2 byte MEMORY_FILE10
                  putct split_archive_temp_fileload2 string -1 MEMORY_FILE10
                  put 0 byte MEMORY_FILE10
               endif
               put 0 long MEMORY_FILE10
               put split_archive_size2 long MEMORY_FILE10
            endif
         next y
         savepos chunks_off MEMORY_FILE10
         savepos index_size MEMORY_FILE10
         goto index_tmp1 MEMORY_FILE10
      endif
   endif
next x

if build_index = 1
   log "NICO.INDEX" 0 index_size MEMORY_FILE10
endif
oh yeah, as you can see the script can also "build" an entire file index from the NICO.DAT file, so long as "build_index" is set to 1 at the start of the script. to support this idea, here are two generated *.INDEX files.
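
The PAL branch of the script above comes down to one piece of arithmetic: a sector offset is turned into a byte offset, then mapped to one of the split parts, and a file that crosses a part boundary is read in two pieces. Here is a rough Python sketch of that mapping. The part size 0x40000000 and the part names come from the script; `locate` is a hypothetical helper name, and the loop is a generalisation that would also handle more than two pieces (the script only handles up to two).

```python
SECTOR = 0x800            # offsets are counted in 2048-byte sectors
PART_SIZE = 0x40000000    # size of NICO.DAT in the PAL release, as checked by the script
PART_NAMES = ["NICO.DAT", "xab", "xac", "xad", "xae"]


def locate(sector_offset: int, size: int, part_size: int = PART_SIZE):
    """Return a list of (part_name, offset_in_part, length) pieces for one file."""
    absolute = sector_offset * SECTOR
    part = absolute // part_size             # which split part the file starts in
    local = absolute - part * part_size      # byte offset inside that part
    pieces = []
    remaining = size
    while remaining > 0:
        take = min(remaining, part_size - local)
        pieces.append((PART_NAMES[part], local, take))
        remaining -= take
        part += 1                            # continue at the start of the next part
        local = 0
    return pieces
```

For example, `locate(0x7FFFF, 0x1000)` describes a file that starts in the last sector of NICO.DAT and continues into xab. One small difference from the script: when a file ends exactly on a part boundary, the script's formula still counts two chunks (the second with size 0), while this sketch returns a single piece.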

Re: Team Ico Sample Thread

Post by AnonBaiter »

okay, the .INDEX format i built specifically for Shadow of the Colossus/Wander to Kyozou is still under development, so i have decided to update not only the script but the .INDEX files themselves as well.

anyway, the INDEX format was built to make NICO.DAT extraction easier for everyone. the structure goes as follows:

Code:

0x00 - total number of files (32-bit)
0x04 - index version (32-bit, 1 or 2 depending on the file)

if version 1 (uses the usual "offset/size/blah" structure):
0x08 and beyond - general file information, repeated in a loop once per file
(current loop position (example: offset 0x08) + 0x00) - archive name size (8-bit, padding byte included - usually fixed for version 1 to make name-seeking per-file a bit easier)
(current loop position (example: offset 0x08) + 0x01) - archive name (its length is given by "archive name size")
(current loop position (example: offset 0x08) + (archive name size + 1) + 0x00) - file offset (32-bit, counted in 2048/0x800-byte sectors; multiply by 0x800 for the byte offset)
(current loop position (example: offset 0x08) + (archive name size + 1) + 0x04) - file size (32-bit)
(current loop position (example: offset 0x08) + (archive name size + 1) + 0x08) - filename location (32-bit)
(filename location) - actual filename (null-terminated; when no name is available, "\xff\x00" is stored instead)

if version 2 (uses a "file-divided-into-chunks" structure for the PAL version, because of how the archives were stored):
(current loop position (example: offset 0x08) + 0x00) - file location (32-bit)
(file location + 0x00) - filename size (8-bit, same as archive name size in the "if version 1" part, but for a single file)
(file location + 0x01) - filename (same as archive name in the "if version 1" part, but for a single file)
(file location + (filename size + 1) + 0x00) - number of chunks needed to extract the file (32-bit; 2 when a file straddles two of the split archives (example: xad->xae for a .pss file), otherwise 1)
(file location + (filename size + 1) + 0x04) - archive name size (same info as version 1)
(file location + (filename size + 1) + 0x05) - archive name (same info as version 1)
(file location + (filename size + 1) + (archive name size + 1) + current chunk loop position (example: offset 0x10cf9) + 0x00) - chunk offset (same as file offset in the "if version 1" part, but per chunk)
(file location + (filename size + 1) + (archive name size + 1) + current chunk loop position (example: offset 0x10cf9) + 0x04) - chunk size (same as file size in the "if version 1" part, but per chunk)
in short i'm trying to get the most out of NICO.DAT and its derivatives.
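
To make the layout above a bit more concrete, here is a rough Python sketch of a reader for both index versions. It is only an illustration under a few assumptions: all integers are taken to be little-endian (the QuickBMS default), every version 1 record is assumed to carry the filename location field, and, following the script posted earlier, each version 2 chunk is assumed to carry its own archive name. `parse_index` and the two helpers are hypothetical names.

```python
import struct


def _cstr(raw: bytes) -> str:
    """Decode bytes up to the first NUL; a lone 0xFF is the 'no name' marker."""
    raw = raw.split(b"\x00", 1)[0]
    return "" if raw == b"\xff" else raw.decode("ascii", "replace")


def _sized_name(buf: bytes, pos: int):
    """Read an 8-bit length followed by that many bytes (NUL included)."""
    n = buf[pos]
    return _cstr(buf[pos + 1:pos + 1 + n]), pos + 1 + n


def parse_index(buf: bytes):
    """Return a list of (filename, [(archive, sector_offset, size), ...])."""
    count, version = struct.unpack_from("<II", buf, 0)
    files = []
    pos = 8
    for _ in range(count):
        if version == 1:
            archive, pos = _sized_name(buf, pos)
            offset, size, name_loc = struct.unpack_from("<III", buf, pos)
            pos += 12
            end = buf.index(b"\x00", name_loc)   # filename is NUL-terminated
            files.append((_cstr(buf[name_loc:end]), [(archive, offset, size)]))
        elif version == 2:
            (loc,) = struct.unpack_from("<I", buf, pos)   # file location
            pos += 4
            name, loc = _sized_name(buf, loc)
            (chunk_count,) = struct.unpack_from("<I", buf, loc)
            loc += 4
            chunks = []
            for _chunk in range(chunk_count):
                archive, loc = _sized_name(buf, loc)
                offset, size = struct.unpack_from("<II", buf, loc)
                loc += 8
                chunks.append((archive, offset, size))
            files.append((name, chunks))
        else:
            raise ValueError(f"unknown index version {version}")
    return files
```

The offsets returned here are still in 0x800-byte sectors, so they need to be multiplied by 0x800 before seeking into NICO.DAT or one of the split parts.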