Reading a JSON file with Chinese characters using StreamReader
I have a JSON text file that contains Chinese characters.
The characters display correctly in Notepad++ when I choose UTF-8 as the encoding.
Hence I set the StreamReader's encoding to UTF8 as well.
But the characters are not read correctly and are logged as unknown characters.
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using UnityEngine;
using SimpleJSON; // assuming the SimpleJSON parser mentioned below lives in this namespace

public class SpeechManager : MonoBehaviour {
    public string fileName;
    public string speechJson;
    public int startId;
    public int endId;
    public Dictionary<int, Speech> speechDictionary = new Dictionary<int, Speech>();

    void Start() {
        getSpeechData();
        putAllSpeechToDictionary();
        getSpeechByID(100001);
        getSpeechByID(100002);
        getSpeechByID(100003);
    }

    // Reads the whole JSON file into speechJson, decoding it as UTF-8.
    public void getSpeechData() {
        try
        {
            string fileFullPath = Application.dataPath + "/textFiles/" + fileName;
            StreamReader theReader = new StreamReader(fileFullPath, Encoding.UTF8);
            speechJson = theReader.ReadToEnd();
            theReader.Close();
        }
        catch (Exception e)
        {
            Debug.Log(e);
        }
    }

    // Parses speechJson and stores one Speech per array entry, keyed by ChatID.
    public void putAllSpeechToDictionary() {
        var N = JSON.Parse(speechJson);
        Debug.Log("試試 " + N.Count); // the Chinese characters print perfectly here
        for (int i = 0; i < N.Count; i++) {
            Debug.Log(N[i]); // but fail here
            int id = N[i]["ChatID"].AsInt;
            int headiconID = N[i]["HeadiconID"].AsInt;
            int chatRole = N[i]["ChatRole"].AsInt;
            int chatstring = N[i]["ChatString"].AsInt;
            int triggerType = N[i]["TriggerType"].AsInt;
            int triggerScene = N[i]["TriggerScene"].AsInt;
            int triggerPic = N[i]["TriggerPic"].AsInt;
            int nextChatID = N[i]["NextChat"].AsInt;
            string speechCharacter = N[i]["SpeechCharacter"];
            string speechString = N[i]["SpeechString"];
            Speech tempSpeech = new Speech(id, headiconID, chatRole, chatstring, triggerType, triggerScene, triggerPic, nextChatID, speechCharacter, speechString);
            Debug.Log("Adding speech " + id);
            speechDictionary.Add(id, tempSpeech);
        }
    }

    // Looks up a speech by ID and logs it.
    public void getSpeechByID(int speechID) {
        Debug.Log("getting speech" + speechID);
        Speech speechReturn = speechDictionary[speechID];
        Debug.Log(speechReturn.ToString());
    }
}
If you Debug.Log speechJson, does the character show correctly? What are you using to parse the JSON? Perhaps it isn't handling Unicode characters correctly?
No, speechJson also prints the characters incorrectly. I tried setting the encoding to Unicode in the StreamReader, but that results in an exception:
System.ArgumentException: invalid utf-16 sequence at 2282288
And I am using SimpleJSON to parse the string.
When creating your stream, try...
new StreamReader(fileFullPath, Encoding.Default, true);
Passing true as the last argument lets the StreamReader try to detect the encoding from the "byte order mark" (BOM) written at the beginning of the file (Notepad writes one when saving as UTF-8). Passing Encoding.Default as the second parameter makes it fall back to the system's default encoding if no BOM is found.
Anyway, kind of a guess, but give it a go.
Out of curiosity, are there other Chinese or Unicode characters that work? Or do they all fail?
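To check what the byte order mark detection described above actually does with your file, something like the following might help. It is only a sketch: the class and method names (TextFileProbe, ReadWithBomDetection, FirstBytesHex) are made up for illustration. It is plain .NET rather than Unity-specific code, so inside Unity you would pass the results to Debug.Log yourself.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of Unity or SimpleJSON.
public static class TextFileProbe
{
    // Reads the whole file. If the file starts with a BOM, StreamReader
    // switches to the matching encoding; otherwise it decodes with 'fallback'.
    // 'used' reports the encoding the reader ended up with.
    public static string ReadWithBomDetection(string path, Encoding fallback, out Encoding used)
    {
        using (var reader = new StreamReader(path, fallback, true))
        {
            string text = reader.ReadToEnd();
            // CurrentEncoding is only meaningful after the first read.
            used = reader.CurrentEncoding;
            return text;
        }
    }

    // Returns the first 'count' bytes of the file as hex (e.g. "EF-BB-BF"
    // for a UTF-8 BOM, "FF-FE" for a UTF-16 little-endian BOM).
    public static string FirstBytesHex(string path, int count)
    {
        byte[] bytes = File.ReadAllBytes(path);
        int n = Math.Min(count, bytes.Length);
        return BitConverter.ToString(bytes, 0, n);
    }
}
```

If FirstBytesHex shows no BOM and the text is still garbled when read with Encoding.UTF8, the file on disk is probably not stored as UTF-8 in the first place, whatever the editor displays.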