I'm trying to rip the Japanese text using GetCT,
but when I use
GetCT TEXT unicode 0xFF or GetCT TEXT unicode 0xFFF8, it can't seem to find the terminator and reads to the end of the file.
31 00 32 00 33 00 34 00 35 00 36 00 37 00 38 00
39 00 30 00 2E 00 DD 52 26 62 2F 00 3A 00 70 00
74 00 73 00 FF F8 5B 30 44 30 5B 30 4D 30 92 30
4B 30 4F 30 6B 30 93 30 67 30 4D 30 8B 30 88 30
FF F8
1234567890.勝戦/:ptsせいせきをかくにんできるよ
GetCT TEXT string 0x73 seems to work fine and gets the 1234567890.勝戦/:pt text.
Or maybe I'm doing this wrong and there are other commands and variables better suited for this?
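For reference, the dump above can be checked outside quickbms. This is a minimal Python sketch, not a quickbms script: it assumes the bytes are UTF-16LE and reads the FF F8 pair as one little-endian 16-bit value, 0xF8FF, which may be why a terminator written as 0xFFF8 never matches. The helper name split_strings is hypothetical.

```python
import struct

# Hex dump copied from the post above (assumed to be UTF-16LE text).
DUMP = """
31 00 32 00 33 00 34 00 35 00 36 00 37 00 38 00
39 00 30 00 2E 00 DD 52 26 62 2F 00 3A 00 70 00
74 00 73 00 FF F8 5B 30 44 30 5B 30 4D 30 92 30
4B 30 4F 30 6B 30 93 30 67 30 4D 30 8B 30 88 30
FF F8
"""

# The byte pair FF F8, read as a little-endian 16-bit value.
TERMINATOR = int.from_bytes(b"\xff\xf8", "little")  # 0xF8FF


def split_strings(data, terminator=TERMINATOR):
    """Split UTF-16LE data into strings at each terminator code unit."""
    units = struct.unpack("<%dH" % (len(data) // 2), data)
    strings, current = [], []
    for unit in units:
        if unit == terminator:
            strings.append(current)
            current = []
        else:
            current.append(unit)
    if current:  # keep trailing text that has no terminator
        strings.append(current)
    # Re-pack each run of code units and decode it as UTF-16LE.
    return [struct.pack("<%dH" % len(s), *s).decode("utf-16-le") for s in strings]


data = bytes.fromhex("".join(DUMP.split()))
texts = split_strings(data)
for text in texts:
    print(text)
```

Run on this dump, the sketch yields the two strings separately: the "1234567890.勝戦/:pts" label and the Japanese sentence that follows it.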
troubles with getCT on unicode japanese characters
-
- Posts: 81
- Joined: Sun Jul 10, 2016 11:07 am
-
- Site Admin
- Posts: 12984
- Joined: Wed Jul 30, 2014 9:32 pm
Re: troubles with getCT on unicode japanese characters
I guess I'll have to fix it in the next version of quickbms.
Honestly, it's the first time I've seen that end-of-line character. Very unlucky.
-
- Site Admin
- Posts: 12984
- Joined: Wed Jul 30, 2014 9:32 pm
Re: troubles with getCT on unicode japanese characters
In current quickbms use:
codepage 931
goto 0x6d8
getct NAME unicode 0xff
print "%NAME%"
In next version 0.9.1:
codepage 931
goto 0x6d8
getct NAME unicode 0xf8ff
print "%NAME%"
Basically, the default utf8 codepage will always result in invalid chars because that code uses the JIS codepage.
Let me know if it's all ok and the result is what you expect. Please note that "print" is not the best way to display utf8 data.
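One possible reading of the codepage remark, sketched in Python rather than quickbms: if the text is emitted in a Japanese codepage, those bytes are not valid UTF-8, so anything that assumes UTF-8 shows invalid characters. The cp932 codec (Python's Shift-JIS variant) is an assumption used as a stand-in here; it is not the 931 value from the scripts above, and the helper decodes_as_utf8 is hypothetical.

```python
# Sample text taken from the decoded dump in the first post.
text = "せいせきをかくにんできるよ"

# Assumption: cp932 stands in for "the JIS codepage" mentioned above.
sjis_bytes = text.encode("cp932")
utf8_bytes = text.encode("utf-8")


def decodes_as_utf8(raw):
    """Return True if raw is a valid UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(utf8_bytes))  # UTF-8 bytes read as UTF-8
print(decodes_as_utf8(sjis_bytes))  # Shift-JIS bytes read as UTF-8
print(sjis_bytes.decode("cp932") == text)  # same bytes read with the matching codec
```

The same bytes round-trip cleanly when read with the codec they were written in, which is the point of setting the codepage before extracting the text.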