Slowly making progress on this. It seems there are string lengths stored in a table, but as far as I can tell the table doesn't have a fixed structure.
Header
0x0 - 4 byte magic
0x4 - 4 byte entry count
0x8 - 4 byte TOC (possibly)
0xc - 4 byte Table 1 pos - ASCII null-terminated contents - seems to be the language keys used by the engine
0x10 - 4 byte Table 2 pos - ASCII null-terminated contents
0x14 - 4 byte Table 3 pos - ASCII null-terminated contents
0x18 - 4 byte Table 4 pos - language strings are here in Unicode form
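For anyone who wants to poke at this, here is a minimal sketch of reading those seven header fields in Python. It assumes the values are little-endian unsigned 32-bit integers and that the field at 0x8 is a file offset, neither of which is confirmed. The names `Header`, `parse_header` and `read_cstring` are made up for illustration.

```python
import struct
from typing import NamedTuple


class Header(NamedTuple):
    magic: int
    entry_count: int
    toc_pos: int      # assumption: 0x8 is the TOC offset ("possibly")
    table1_pos: int
    table2_pos: int
    table3_pos: int
    table4_pos: int


# Assumption: little-endian; switch "<" to ">" if the values look wrong.
HEADER_FORMAT = "<7I"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 7 * 4 = 28 bytes (0x1c)


def parse_header(data: bytes) -> Header:
    """Unpack the seven 4-byte fields at the start of the file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file is too short to contain the header")
    return Header(*struct.unpack_from(HEADER_FORMAT, data, 0))


def read_cstring(data: bytes, pos: int) -> bytes:
    """Return the null-terminated ASCII bytes starting at pos (terminator excluded)."""
    end = data.index(b"\x00", pos)
    return data[pos:end]
```

Calling `read_cstring(data, header.table1_pos)` should give the first key in table 1 if the offsets are interpreted correctly.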
TOC Entry
0x0 - 4 byte unk (seems to be the size of the string in table 1 in dlc files, else 0; in cosmostextdb it's a large number..)
0x4 - 4 byte unk
0x8 - 4 byte unk
0xc - 4 byte unk (seems to be the size of the string in table 4 in dlc files, else 0; in cosmostextdb it's a large number..)
0x10 - 4 byte crc32 of the table 1 entry, else 0xffffffff (BZIP2 type, possibly with bytes reversed)
0x14 - 4 byte unk
0x18 - 4 byte unk
0x1c - 4 byte unk
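A rough sketch for unpacking one 0x20-byte TOC entry and testing the CRC guess. The CRC routine is the standard CRC-32/BZIP2 variant (polynomial 0x04C11DB7, MSB-first, initial value and final XOR of 0xFFFFFFFF). Little-endian fields, the `unk_*` field names, and whether the null terminator is included in the checksummed bytes are all assumptions, so both byte orders are tried.

```python
import struct

# Assumption: eight little-endian uint32 fields per entry (0x20 bytes).
TOC_ENTRY_FORMAT = "<8I"
TOC_ENTRY_SIZE = struct.calcsize(TOC_ENTRY_FORMAT)
FIELD_NAMES = ("unk_00", "unk_04", "unk_08", "unk_0c",
               "key_crc", "unk_14", "unk_18", "unk_1c")


def parse_toc_entry(data: bytes, pos: int) -> dict:
    """Unpack one TOC entry at pos into a dict keyed by hypothetical field names."""
    return dict(zip(FIELD_NAMES, struct.unpack_from(TOC_ENTRY_FORMAT, data, pos)))


def crc32_bzip2(data: bytes) -> int:
    """CRC-32/BZIP2: non-reflected CRC-32, unlike zlib.crc32."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc ^ 0xFFFFFFFF


def byteswap32(value: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")


def key_matches(key: bytes, stored_crc: int) -> bool:
    """True if stored_crc equals the key's CRC in either byte order."""
    crc = crc32_bzip2(key)
    return stored_crc in (crc, byteswap32(crc))
```

If `key_matches` never returns true for keys from table 1, it is worth retrying with the trailing null byte appended to the key, or with a different case, before ruling out this CRC variant.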
Just adding bytes to the Unicode strings seems to corrupt the database for other entries, yet I'm still not seeing whether there is an offset / size listed in the TOC..
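One way to hunt for the missing offset / size is to list every string in table 4 with its relative offset and byte length, then look for those numbers among the unknown TOC fields. This is only a sketch: it assumes the "unicode form" is UTF-16LE with a two-byte null terminator, and `scan_utf16_strings` is a made-up helper name.

```python
def scan_utf16_strings(data: bytes, start: int, end: int) -> list:
    """Walk data[start:end] and return (relative_offset, byte_length, text)
    for each string, assuming UTF-16LE with a 2-byte null terminator."""
    results = []
    pos = start
    while pos + 1 < end:
        cur = pos
        # Advance one 16-bit code unit at a time until the terminator.
        while cur + 1 < end and data[cur:cur + 2] != b"\x00\x00":
            cur += 2
        text = data[pos:cur].decode("utf-16-le", errors="replace")
        results.append((pos - start, cur - pos, text))
        pos = cur + 2  # skip the terminator
    return results
```

If growing a string breaks later entries, the offsets of the strings after it would be the first values to search for in the TOC, both as byte counts and as character counts (byte length divided by two).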
Note: in dlc files, tables 2/3/4 all have the same offset specified (possibly a different file type completely).