This question was closed Dec 09, 2014 at 07:30 AM by Andres-Fernandez for the following reason:

The question is answered, right answer was accepted

1
Question by Andres-Fernandez · Dec 01, 2014 at 08:54 AM · json, encoding, utf-8

WWW.text string not correct?

Hi there, I've been hitting my head against the wall for a few weeks now because of some weird behaviour I can't quite figure out.

My webplayer has to read a JSON data file from a folder (inside the same server, in a subfolder, so the security sandbox doesn't complain) getting the file via WWW class and using the JSONObject class to parse the data. I've coded it so in the case of the file not being available, it reads some default data from a JSON file from the resources folder. The thing is that when I use the WWW.text property to access the data from the downloaded file, the JSONObject class can't read it, while whenever it reads the (very same) data from the file in the resources folder (using TextAsset.text property) there's no problem at all. Both files are UTF-8 encoded.

I've narrowed the problem down to the WWW.text property and the TextAsset.text property reading UTF-8 encoded files and returning differently encoded strings (or something similar; I'm lost). The docs on WWW.text say that the contents of the web page must be in the UTF-8 or ASCII character set, which they are, but the string I get from WWW.text still can't be read by the parser. The manual on TextAsset says that it can read .json files (which it does), so I really think the problem may come from reading a UTF-8 encoded file with the WWW class.

The (updated) code:

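 // Runs inside a coroutine: start the download and wait for it to finish.
 WWW myWWW = new WWW(Application.dataPath + JSONurl);
 yield return myWWW;
 string jsonData = "";
 if (myWWW.error == null) {
     jsonData = myWWW.text;   // downloaded file: the parser rejects this string
 } else {
     // Fall back to the default data stored in the Resources folder.
     TextAsset myData = (TextAsset)Resources.Load(JSONFile, typeof(TextAsset));
     jsonData = myData.text;  // same data via TextAsset: parses fine
 }
 JSONObject json = new JSONObject(jsonData);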

Any ideas on why WWW.text could be returning a wrong string while reading from a UTF-8 encoded file?

Thanks in advance.

[UPDATE] I've updated the code and done some more tests. The string that receives the data from WWW.text displays the contents of the file correctly, no matter the encoding of the file (ANSI or UTF-8). It looks like the strings returned by WWW.text and TextAsset.text might be in different encodings, and JSONObject only accepts the encoding of the TextAsset.text string. I'll keep working on it.

Andres-Fernandez · Dec 01, 2014 at 09:34 AM

Also note that I use JSONObject and if the file downloaded is ANSI encoded, it is read properly (but JSON files must be UTF-8 encoded and the file I read is UTF-8 encoded).

ajketan · Dec 01, 2014 at 10:21 AM

Hi Andres... I would like to know which JSON parser you are using. I have exactly the same code as yours with SimpleJSON as my JSON parser, and it simply works without any hassle.

Andres-Fernandez · Dec 01, 2014 at 10:31 AM

Hi, I'm using JSONObject from the asset store.

Andres-Fernandez · Dec 01, 2014 at 11:36 AM

So, I tried changing the encoding of the WWW response, since I know it'll be UTF-8 encoded:

 // Requires "using System.Text;" at the top of the file for Encoding.
 WWW myWWW = new WWW(Application.dataPath + JSONurl);
 yield return myWWW;
 string jsonData = "";
 if (myWWW.error == null) {
 
     Encoding utf8 = Encoding.UTF8;
     Encoding ansi = Encoding.Default;
 
     byte[] utf8bytes = utf8.GetBytes(myWWW.text);
     byte[] ansibytes = Encoding.Convert(utf8, ansi, utf8bytes);
     char[] ansichars = new char[ansi.GetCharCount(ansibytes, 0, ansibytes.Length)];
     ansi.GetChars(ansibytes, 0, ansibytes.Length, ansichars, 0);
     jsonData = new string(ansichars);
 
 } else {
     TextAsset myData = (TextAsset)Resources.Load(JSONFile, typeof(TextAsset));
     jsonData = myData.text;
 }
 JSONObject json = new JSONObject(jsonData);

Still no luck. The string reads the contents perfectly, but the parser doesn't like the string it receives if it comes from a UTF-8 encoded file.

Ruuts · Dec 02, 2014 at 02:21 PM

I'm having a similar problem. I went through all the possible encoders (though utf8 should have done the trick) without any luck. Didn't even load a JSON string in the end, but a simple "Hello World". Decoding the byte array in bulk results in a single character (e.g. "<" or "É" - mostly �). Grabbing every byte on its own and printing it to the console shows a lot of fancy characters, none of which resemble the source remotely.

Since I "solved" the problem by rolling back to Unity 4.3, I can only guess the problem lies hidden somewhere in the WWW class.

I still hope I am missing something - I'd hate to miss out on the new UI =)


2 Replies

9
Best Answer

Answer by Andres-Fernandez · Dec 03, 2014 at 08:22 AM

Finally the solution came from a colleague of mine, who told me that while debugging he noticed that the first three bytes of the WWW.text property were not readable characters. And yes, they were EF BB BF, i.e. the UTF-8 BOM. His workaround (that works at least on webplayer, I haven't checked other platforms) is as simple as not reading the first three bytes, since it seems that the WWW class includes the BOM in the WWW.text string:

 WWW myWWW = new WWW(Application.dataPath + JSONurl);   // UTF-8 encoded json file on the server
 yield return myWWW;
 string jsonData = "";
 if (string.IsNullOrEmpty(myWWW.error)) {
     jsonData = System.Text.Encoding.UTF8.GetString(myWWW.bytes, 3, myWWW.bytes.Length - 3);  // Skip the first 3 bytes (i.e. the UTF-8 BOM)
     JSONObject json = new JSONObject(jsonData);   // JSONObject works now
 }

I haven't checked with other parsers or in other situations, but if any of you are experiencing problems with WWW.text, you may want to look for the BOM inside your strings.
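
To check whether a BOM is what is biting you, here is a minimal sketch in plain C# (no Unity types). The names `BomCheck`, `HasUtf8Bom` and `HexPrefix` are made up for this example; the idea is to pass in the raw bytes, for instance the array returned by WWW.bytes, and log what they start with.

```csharp
using System;

public static class BomCheck
{
    // True when the buffer starts with the UTF-8 byte order mark EF BB BF.
    public static bool HasUtf8Bom(byte[] data)
    {
        return data != null && data.Length >= 3
            && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;
    }

    // Hex dump of the first `count` bytes (fewer if the buffer is shorter),
    // formatted by BitConverter as hyphen-separated pairs.
    public static string HexPrefix(byte[] data, int count)
    {
        int n = Math.Min(count, data.Length);
        return BitConverter.ToString(data, 0, n);
    }
}
```

In a Unity script something like `Debug.Log(BomCheck.HexPrefix(myWWW.bytes, 4));` should be enough: a file saved with a BOM whose JSON starts with `{` would log `EF-BB-BF-7B`.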

Andres-Fernandez · Dec 03, 2014 at 03:15 PM

Obviously, if the file is saved without a BOM, you can read WWW.text directly. In my case, I check the first 3 bytes of WWW.bytes. If they match EF BB BF, I skip them (since a JSON file can't start with those bytes); if they don't match, I use the whole WWW.text directly.
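
The check described here could look roughly like the sketch below. It is not the exact code from my project: `Utf8Text` and `Decode` are hypothetical names, and it assumes the payload is UTF-8 either way, so it decodes the bytes itself in both branches instead of falling back to WWW.text.

```csharp
using System.Text;

public static class Utf8Text
{
    // Decodes UTF-8 bytes, skipping a leading EF BB BF only when it is present.
    // Assumption: the data is UTF-8 whether or not it carries a BOM.
    public static string Decode(byte[] data)
    {
        bool hasBom = data.Length >= 3
            && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;
        int offset = hasBom ? 3 : 0;
        return Encoding.UTF8.GetString(data, offset, data.Length - offset);
    }
}
```

With this, the line in the answer would become `jsonData = Utf8Text.Decode(myWWW.bytes);`. Checking first matters because skipping 3 bytes unconditionally would cut off the start of the JSON in a file saved without a BOM.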

JEDressler · Feb 19, 2016 at 08:14 PM

This solution saved my day! I was wondering why my decryption wasn't working, and before that my JSON object conversion was giving an error. Now both work, thanks so much!

viciousesque · Aug 06, 2016 at 11:52 PM

YES!

This solution totally worked. Thank you Andres! I've had like two dozen browser windows open for the last four days hunting for a solution to this problem, and this is the only correct solution out there. Totally strange, as this just seems like a very basic bug devs would be running across all the time.

Anyway, thank you for posting this Andres! I've shared this link both in another Unity forums post as well as Stackoverflow.

BTW, I'm not using any additional plug-ins to work with the JSON (as I'm seeing pretty consistently referenced). I'm just parsing and working with the object in native Unity code via JsonUtility.FromJson
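
For cases like this where only the string is at hand (for example WWW.text handed straight to JsonUtility.FromJson), a string-level variant may be simpler. This is only a sketch: `JsonText` and `StripBom` are hypothetical names, and it assumes the text was decoded as UTF-8 without the mark being removed, in which case the three BOM bytes show up as the single character U+FEFF at index 0.

```csharp
public static class JsonText
{
    // Removes one leading U+FEFF (the decoded UTF-8 BOM) if the string has one.
    public static string StripBom(string text)
    {
        if (!string.IsNullOrEmpty(text) && text[0] == '\uFEFF')
        {
            return text.Substring(1);
        }
        return text;
    }
}
```

The cleaned string can then go to whichever parser is in use, e.g. `JsonText.StripBom(myWWW.text)`. Strings without the mark come back unchanged, so calling it on every response should be harmless.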

NareshKhandla · Nov 27, 2017 at 06:42 AM

I have solved this problem.

Please use "JSONObject" from asset store which is available free.

Download from here : https://assetstore.unity.com/packages/tools/input-management/json-object-710

 WWW myWWW = new WWW(Application.dataPath + JSONurl);   // UTF-8 encoded json file on the server
 yield return myWWW;
 string jsonData = "";
 if (string.IsNullOrEmpty(myWWW.error)) {
     jsonData = System.Text.Encoding.UTF8.GetString(myWWW.bytes, 3, myWWW.bytes.Length - 3);  // Skip the first 3 bytes (i.e. the UTF-8 BOM)
     JSONObject json = new JSONObject(jsonData);   // JSONObject works now
     Debug.Log(json.ToString());
 }

Now you can use json.ToString() in your JSON deserialization, and it should no longer give the null reference error.

1

Answer by unimechanic · Dec 02, 2014 at 01:42 PM

"I'm using JSONObject from the asset store."

"... but the parser doesn't like the string it receives if it comes from a UTF-8 encoded file"

You might get a faster answer by getting in contact with the author. Probably they know whether this is a limitation of the plugin, or could provide a solution.

Andres-Fernandez · Dec 02, 2014 at 02:47 PM

Thanks for the suggestion...
