Apex Legends .MSTR

Codecs, formats, encoding/decoding of game audio, video and music
Durandal217
Posts: 31
Joined: Sun Apr 10, 2016 3:54 pm

Apex Legends .MSTR

Post by Durandal217 »

Two years ago Titanfall 2 released with an amazing sound design that I fell in love with. However, Respawn switched from the built-in Source engine audio codecs, which used uncompressed headerless .wavs, to the Miles Sound System with the extension .mstr. It was very difficult to get anyone to look into it and very little progress was made. A few days ago Apex Legends dropped with awesome music and an incredible sound design to boot. However, the same system is used yet again...

What is known is the following: the files are compressed and of variable size. Judging by the frame sizes, they do not appear to be mp3 or ogg. From what was observed of the codec, it looks to use some form of Bink Audio, since it uses 8-bit quantizers, which can easily be seen in the stream.

That is where everything ends...

This is what the file begins with:

Code: Select all

52 54 53 43 02 00 FF FF 70 07 4A 05 00 00 00 00

or

Code: Select all

RTSC..ÿÿp.J.....
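
For anyone following along, the two dumps above are the same 16 bytes. Here is a minimal Python sketch of how the text column is derived from the hex. The helper name `ascii_column` is made up for illustration, and the Latin-1 rendering of high bytes (0xFF shown as ÿ) is an assumption about the hex editor used. Only the four-character magic is read; nothing is assumed about what the remaining fields mean.

```python
HEADER_HEX = "52 54 53 43 02 00 FF FF 70 07 4A 05 00 00 00 00"


def ascii_column(data: bytes) -> str:
    """Render bytes like a hex editor's text pane: printable chars as-is, others as '.'."""
    out = []
    for b in data:
        ch = chr(b)  # Latin-1 view: the byte value is used as the code point
        out.append(ch if ch.isprintable() else ".")
    return "".join(out)


header = bytes.fromhex(HEADER_HEX)
magic = header[:4].decode("ascii")  # the first four bytes spell the file magic
text_view = ascii_column(header)
```

Printing `magic` and `text_view` side by side is a quick way to sanity-check a header copied from another bank.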


Attached is a sample file; if anything else is needed, please let me know. I'd love to get the sounds, plus the drop music, for this incredible game. If anyone can discover or come up with a way to extract this, it would be a breakthrough.
Luriam
Posts: 14
Joined: Sat Feb 09, 2019 4:07 pm

Re: Apex Legends .MSTR

Post by Luriam »

Created an account just to bump this thread
happyend
Posts: 157
Joined: Sun Aug 24, 2014 8:54 am

Re: Apex Legends .MSTR

Post by happyend »

All of the large files indexed in .mstr have a similar header and start with "1FCB".
Files starting with 1FCB are actually Bink Audio (.binka) files.
Durandal217
Posts: 31
Joined: Sun Apr 10, 2016 3:54 pm

Re: Apex Legends .MSTR

Post by Durandal217 »

OK! I was able to extract a piece of audio by looking for the headers, and I was able to convert and play it successfully. The question now is: is there an automated script I can use to separate and convert all of the audio?
Luriam
Posts: 14
Joined: Sat Feb 09, 2019 4:07 pm

Re: Apex Legends .MSTR

Post by Luriam »

Wow! That's fantastic news :o Please keep it up, maybe this breakthrough will help us get Titanfall 2's audio files as well
Durandal217
Posts: 31
Joined: Sun Apr 10, 2016 3:54 pm

Re: Apex Legends .MSTR

Post by Durandal217 »

Upon further research I might have hit a brick wall. The .binka data looks easy enough to extract; however, previous extractors would use binkawin.asi and mss32.dll to aid in extracting the files. These are not present in Apex Legends. Instead, what we have are the bink2w64, mileswin64 and binkawin64 DLLs, which by the look of the names were changed from 32-bit to 64-bit.

I've attached a DLL in case anyone can have a look and see whether the original decode function from mss32.dll is still present and, if so, how to get it to convert to wav.

As of right now, the only possible way to extract any audio is to go line by line in a hex editor looking for the 1FCB header, copy the data that follows, and save it as a new file. It would be extremely painful to do it this way.
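
The manual hex-editor search could be automated along these lines. This is only a rough sketch under two assumptions: that the marker is the four ASCII characters `1FCB` (not the two bytes 1F CB), and that each sound simply runs from one marker to the next. The function names are made up for illustration, and a marker that happens to appear inside compressed data would show up as a false hit.

```python
BINKA_MAGIC = b"1FCB"  # assumption: ASCII characters '1', 'F', 'C', 'B'


def find_magic_offsets(data: bytes, magic: bytes = BINKA_MAGIC) -> list[int]:
    """Return the offset of every occurrence of `magic` in `data`."""
    offsets = []
    pos = data.find(magic)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(magic, pos + 1)
    return offsets


def slice_at_offsets(data: bytes, offsets: list[int]) -> list[bytes]:
    """Naive split: each slice runs from one header to the next, or to the end of the data."""
    bounds = list(offsets) + [len(data)]
    return [data[start:end] for start, end in zip(bounds, bounds[1:])]
```

In practice `data` would come from something like `pathlib.Path("bank.mstr").read_bytes()` (file name hypothetical), and each slice would be written out as its own `.binka` file. Very large banks may need memory-mapped or chunked reading instead of loading everything at once.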
Durandal217
Posts: 31
Joined: Sun Apr 10, 2016 3:54 pm

Re: Apex Legends .MSTR

Post by Durandal217 »

So... here's where I'm at. I used this:
Image
to help me separate the audio, and BinkA2Wav to convert it. However, the problem I'm running into is that the audio keeps cutting off. I cannot figure out what I'm doing wrong or why it keeps doing that, and I've run out of solutions...

Does anybody have any idea what can be done to remedy the cutoff, or to extract this stuff with proper file names?
Luriam
Posts: 14
Joined: Sat Feb 09, 2019 4:07 pm

Re: Apex Legends .MSTR

Post by Luriam »

Image link seems to be broken, so I can't see what you did. Also could you please upload one of the .wavs, any .wav. I want to check something. Thank you!
Durandal217
Posts: 31
Joined: Sun Apr 10, 2016 3:54 pm

Re: Apex Legends .MSTR

Post by Durandal217 »

I've attached the image in question, as well as a sample wave and the file before it was converted.
xyx0826
Posts: 26
Joined: Sat Feb 16, 2019 10:29 pm

Re: Apex Legends .MSTR

Post by xyx0826 »

I put together a C# script for slicing/listing Binka files in a MSTR bank: https://gist.github.com/xyx0826/3186510 ... fe8ba7eb4f
It's crude but it works.

I've got 16589 file entries in Titanfall 2's main stream bank. Average length of the files is 5047 bytes, which is pretty small.
Some of the extracted files are broken after conversion with BinkA2Wav; the playable ones sound like truncated audio fragments.
Durandal217
Posts: 31
Joined: Sun Apr 10, 2016 3:54 pm

Re: Apex Legends .MSTR

Post by Durandal217 »

First and foremost, excellent work! Amazing job. I tried your first version and I'm still going through converting, but I managed to get around 20,000 files. I saw you updated it today, so I will go back and try it out.

The question now is: why is the audio coming out as truncated fragments? Could it be the archive? Or is this the way the audio was placed....
xyx0826
Posts: 26
Joined: Sat Feb 16, 2019 10:29 pm

Re: Apex Legends .MSTR

Post by xyx0826 »

I tested the script on smaller stream banks (patch banks). It seems like data chunks with a BinkA header only occupy a small portion of the file - The rest of the file does not contain any header. Either it is stored with a special format or it is literally a big chunk of audio.

The script struggles with big files. I'll refactor it when I have time.

Edit: the unknown data chunk can also be some sort of "descriptor" data - e.g. file size/checksum, track mixing instructions etc.
xyx0826
Posts: 26
Joined: Sat Feb 16, 2019 10:29 pm

Re: Apex Legends .MSTR

Post by xyx0826 »

The script is now rewritten to use buffered FileStream. There's a huge performance boost, and bugs are squished.

Still, all sound files extracted are fragments. I tried extracting voice banks; the fragments all sound like the beginning of voice lines.

The last slice of the bank is very suspicious; it's very big, though I'm pretty sure it's not a big audio file. There is definitely other data in it.
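
One way to spot a suspicious slice like that without eyeballing file sizes is to compare every slice length against the median. A small sketch, assuming the header offsets have already been collected; the function names and the factor of 10 are arbitrary choices for illustration, not anything taken from the script.

```python
import statistics


def slice_lengths(offsets, total_size):
    """Length of each slice when the data is cut at every header offset."""
    bounds = sorted(offsets) + [total_size]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def oversized_slices(lengths, factor=10.0):
    """Indices of slices longer than `factor` times the median slice length."""
    if not lengths:
        return []
    median = statistics.median(lengths)
    return [i for i, n in enumerate(lengths) if n > factor * median]
```

Any index this returns is a slice worth opening in a hex editor before handing it to a converter.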
Durandal217
Posts: 31
Joined: Sun Apr 10, 2016 3:54 pm

Re: Apex Legends .MSTR

Post by Durandal217 »

Thanks for this, I'm using the updated version right now. This makes me wonder now... The original Titanfall utilized multi-channel (6-channel surround sound) uncompressed audio. I wonder if the audio for Titanfall 2 and Apex Legends is multi-channel as well, which is the only other way I can think of to explain the fragmented audio.

Is there any possibility of decompressing the .mstr archive to read the data, and perhaps even find out what audio goes where (file names, etc.)? How difficult does that look?
xyx0826
Posts: 26
Joined: Sat Feb 16, 2019 10:29 pm

Re: Apex Legends .MSTR

Post by xyx0826 »

You can find stringtables in the mbnk, mbnk_digest and mprj files. Some of these look like names of audio assets, while others refer to audio qualities (occlusion, reverb, etc.).

Another possibility is that the fragments are the "indices" of the audio files, and the rest of the corresponding audio asset is buried in the big chunk of data at the end of the file.

How do you decompress the archive though?
Luriam
Posts: 14
Joined: Sat Feb 09, 2019 4:07 pm

Re: Apex Legends .MSTR

Post by Luriam »

Durandal217 wrote:
Thanks for this, I'm using the updated version right now. This makes me wonder now... The original Titanfall utilized multi-channel (6-channel surround sound) uncompressed audio. I wonder if the audio for Titanfall 2 and Apex Legends is multi-channel as well, which is the only other way I can think of to explain the fragmented audio.

Is there any possibility of decompressing the .mstr archive to read the data, and perhaps even find out what audio goes where (file names, etc.)? How difficult does that look?



You were right about them being multi-channel. That's exactly why I asked you to upload it for me: that very same file you uploaded above, general_stream_00000256_converted, has 6 channels:

Image
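
The channel count can also be checked without an audio editor by reading the `fmt ` chunk of the converted WAV. A minimal sketch, assuming the converter writes a standard RIFF/WAVE file; the function name is made up for illustration. It walks the chunks by hand rather than relying on a WAV library, since multi-channel files sometimes use the extensible format tag.

```python
import struct


def wav_channel_count(data: bytes) -> int:
    """Return the channel count stored in the 'fmt ' chunk of a RIFF/WAVE file."""
    if len(data) < 12 or data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (chunk_size,) = struct.unpack_from("<I", data, pos + 4)
        if chunk_id == b"fmt ":
            if pos + 12 > len(data):
                raise ValueError("truncated fmt chunk")
            # fmt chunk body starts with: uint16 format tag, uint16 channel count
            _format_tag, channels = struct.unpack_from("<HH", data, pos + 8)
            return channels
        # chunk bodies are padded to an even number of bytes
        pos += 8 + chunk_size + (chunk_size & 1)
    raise ValueError("no fmt chunk found")
```

Feeding it the bytes of a converted file (for example via `pathlib.Path(...).read_bytes()`) should report the same number an audio editor shows.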
xyx0826
Posts: 26
Joined: Sat Feb 16, 2019 10:29 pm

Re: Apex Legends .MSTR

Post by xyx0826 »

I found multiple blocks of twelve 0x00's in the tail blob. They seem like terminator blocks. Each null block is followed by two 0x99's, probably the beginning of a new data chunk.

Here is a tail blob sample. The beginning of the file should be a normal bink audio file.
tail blob sample
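
Locating those blocks by script instead of by eye could look like the sketch below. It assumes the pattern is exactly as described: twelve 0x00 bytes immediately followed by 0x99 0x99. The constant and function names are made up for illustration.

```python
NULL_BLOCK = bytes(12)        # twelve 0x00 bytes
CHUNK_START = b"\x99\x99"     # the two bytes seen right after each null block


def find_null_blocks(data: bytes) -> list[int]:
    """Offsets where twelve 0x00 bytes are immediately followed by 0x99 0x99."""
    pattern = NULL_BLOCK + CHUNK_START
    hits = []
    pos = data.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = data.find(pattern, pos + len(pattern))
    return hits
```

Each returned offset points at the first zero byte of a block, so the following data chunk would begin 12 bytes later. If a run of zeros is longer than twelve, the offset points at the last twelve of them.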
Durandal217
Posts: 31
Joined: Sun Apr 10, 2016 3:54 pm

Re: Apex Legends .MSTR

Post by Durandal217 »

It is! When I converted it, it gave me a full voice line: "Focus, Fight, Win." However, the beginning sounded like there was another line playing.

Here is the converted sample.
xyx0826
Posts: 26
Joined: Sat Feb 16, 2019 10:29 pm

Re: Apex Legends .MSTR

Post by xyx0826 »

That's very cool! This file comes from a patch for Titanfall 2's voice bank. The exact voice line can be heard here at the beginning: https://www.youtube.com/watch?v=vT6zuy84RTo

Now here's where the fun begins. Remember the terminator blocks I found? I imagine the sample file to look like this:

Code: Select all

|binka_header|some_audio_data|null_block|some_audio_data|null_block|some_audio_data|null_block|......|


I hex-edited the file and removed data from after the first null block (0x298) until before the second null block (0x4eb). Remember that { 0x99, 0x99 } is the beginning of a new data block.

Voila, this modified file converts to five voice lines. Each one of them has a bit trimmed off at the beginning. I guess that's why we keep getting audio fragments at the front of the archive.
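
For anyone who wants to repeat the experiment on another bank, the hex edit can be expressed as a short sketch. It assumes each null block is exactly twelve 0x00 bytes and that the removed span runs from the end of the first null block up to the start of the second; whether 0x298 and 0x4eb mark the start or the end of their blocks is not stated, so treat the boundaries as a guess. The function name is made up for illustration.

```python
NULL_BLOCK = bytes(12)  # assumption: a terminator is exactly twelve 0x00 bytes


def remove_segment_after_first_null_block(data: bytes) -> bytes:
    """Drop the bytes between the end of the first null block and the start of the second."""
    first = data.find(NULL_BLOCK)
    if first == -1:
        raise ValueError("no null block found")
    cut_from = first + len(NULL_BLOCK)
    second = data.find(NULL_BLOCK, cut_from)
    if second == -1:
        raise ValueError("only one null block found")
    # keep everything up to and including the first null block,
    # then resume at the second null block
    return data[:cut_from] + data[second:]
```

The result keeps the Bink Audio header and the first run of audio data untouched, which is what lets the converter read further into the file.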

I've attached original and modified binka files with their converted wav's.
11_compare.7z


Edit: I'll see if this works on other stream banks. I haven't touched Apex Legends banks yet since they're quite big but I think they have the same format.
Durandal217
Posts: 31
Joined: Sun Apr 10, 2016 3:54 pm

Re: Apex Legends .MSTR

Post by Durandal217 »

Two questions: I tend to find these by hex values or offsets. Is the offset always the same, or is it a different offset every time? And is it possible your script could be modified to find and split these, and/or get them to convert? Hope I worded that right.