Question by Nis Baggesen · Jan 02, 2014 at 02:31 PM · www, unicode, mime

WWW Object and UTF8 to Unicode conversion

I'm using the WWW object to retrieve data from a webservice.

The data in question is a JSON structure, and typical data might look like this:

 [{"aktivitet_id":18,"afviklinger":[{"afvikling_id":33,"aktivitet_id":18,"lokale_id":"11","lokale_navn":"B39","start":{"day":"28","month":"3","year":"2013","h":"11","m":0,"date":"28-3-2013","datetime":"28-3-2013 11:00","timestamp":1364464800,"mysql":"2013-03-28 11:00:00"},"end":{"day":"28","month":"3","year":"2013","h":"16","m":0,"date":"28-3-2013","datetime":"28-3-2013 16:00","timestamp":1364482800,"mysql":"2013-03-28 16:00:00"},"linked":0,"length":5},{"afvikling_id":34,"aktivitet_id":18,"lokale_id":"30","lokale_navn":"2.02","start":{"day":"29","month":"3","year":"2013","h":"20","m":0,"date":"29-3-2013","datetime":"29-3-2013 20:00","timestamp":1364583600,"mysql":"2013-03-29 20:00:00"},"end":{"day":"30","month":"3","year":"2013","h":"1","m":0,"date":"30-3-2013","datetime":"30-3-2013 1:00","timestamp":1364601600,"mysql":"2013-03-30 01:00:00"},"linked":0,"length":5}],"info":{"title_da":"Klip en h\u00e6l og hug en t\u00e5","text_da":"Af Michael Sonne-J\u00f8rgensen\nDenne bog var ikke som de andre rejsebeskrivelser, der var dumpet ned fra hullet i paddehatten over\u00a0Nimbuuksens hytte. Forfatteren var dens absolutte yndling, Thotvalius Pl\u00f8, den ber\u00f8mte Klyngenkender, legenden der havde sat sin f\u00f8dder overalt p\u00e5 Klyngen.\nBogen,som var blevet kastet i Utrov\u00e6rdighedens Hul af Ord Arsenalet., beskrev 3 historier, som p\u00e5 forunderligvis blev v\u00e6vet sammen til et storsl\u00e5et eventyr p\u00e5 en megetuforudsigelig dag i Kr\u00e6mmerby.\nTag med p\u00e5 eventyr og oplev de dovne skokkeskubbere, der er blevet sendt til byen for at k\u00f8be l\u00f8gposer,\u00a0men som i stedet har lagt en snedig plan om, at bruge alle pengene til at k\u00f8be flamboyant t\u00f8j, og skifte\u00a0navne til deres helte: Ghungas Gumbanik og det beskidte sl\u00e6ng. 
Hvis de alts\u00e5 lige kan komme af med\u00a0bondemandens nev\u00f8, som er sendt med for at holde \u00f8je med dem.\nOplev, hvordan k\u00f8dsnedkeren Mester Mesterhaks samsurlinge har f\u00e5et en genial ide til at f\u00e5 deres elskede\u00a0og ret s\u00e5 talentl\u00f8se mester \u00f8verst i det nepotistiske k\u00f8dsnedkerhierarki. De vil sl\u00e5 byens absolut hotteste\u00a0cirkustrup ihjel, men f\u00f8rst skal de lige ud af K\u00f8dbyen og over lambruskoernes bro.\nOg sidst, men ikke mindst, tr\u00e6d ind i manegen med B-truppen fra Cirkus Let P\u00e5 T\u00e5, som pr\u00f8ver, med livet\u00a0som indsats, at g\u00f8gle sig igennem en spektakul\u00e6r forestilling for at tilfredsstille Klyngens absolut farligste\u00a0bande: Ghungas Gumbanik og det beskidte sl\u00e6ng, fordi den rigtige cirkustrup p\u00e5 mystisk vis, aldrig dukkede\u00a0op. Men f\u00f8rst skal de lige overbevise de blodt\u00f8rstige k\u00f8dnsedkere, som har hyret Cirkus Let P\u00e5 T\u00e5, om\u00a0at de rent faktisk har noget med cirkusset at g\u00f8re. Og s\u00e5 lige finde ud af, hvordan de f\u00e5r transporteret\u00a0artillerinissens k\u00e6mpe kanon ind i manegen.\n\\\"Klip en h\u00e6l og hug en t\u00e5\\\" er en blanding af almindeligt rollespil, hvor spillerne har ansvaret for, at drive\u00a0eventyret frem via deres roller, og fort\u00e6llerrollespil, hvor spillerne selv skal bidrage med input til\u00a0verdenen.\nVarighed: 3-5 timer\nAntal spillere: 4 spillere og 1 spilleleder\nGenre: Eventyr\nSpillertype: Modne spillere, som har lyst til at v\u00e6re drivkraften i eventyret og tage et stort ansvar i den\u00a0f\u00e6lles oplevelse for at bringe Klyngen til live.\nSpilleledertype: Erfaren spilleleder, der kan improvisere p\u00e5 stedet og holde den strenge disciplin det\u00a0kr\u00e6ver at bevare alvoren i en eventyrlig verden.\nOm forfatteren: Michael har skrevet en masse scenarier i alle mulige genre. 
Dette er det andet scenarie som foreg\u00e5r i den\u00a0eventyrlige og sk\u00f8re verden Klyngen.","description_da":"","title_en":"Heel and toe, cut and go","text_en":"","description_en":"","author":["Michael Sonne-J\u00f8rgensen"],"price":0,"min_player":4,"max_player":4,"type":"rolle","play_hours":5,"language":"dansk+engelsk","wp_id":"4784"}}]

So it is pure textual data encoded as UTF-8. As you can see, it contains a number of Unicode escape sequences as well as line breaks etc.

However, when I read the www.text member of the WWW object, these escape codes are preserved even though the text has been converted into a default C# Unicode string. So even the Unicode version of the string contains e.g. the substring "\u00f8" instead of the properly converted 'ø' character.

I've tried grabbing www.bytes instead and running those through System.Text.Encoding.Convert, but that gives me the same result.

Is there some way of getting the raw data, or of making the WWW object aware of the MIME type of the data it is receiving (which is properly specified in the header by the webservice)?

2 Replies

Answer by Briksins · Jan 02, 2014 at 04:21 PM

I believe you should decode the string as UTF-8 manually.

It can be done like this:

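 // Requires: using System.Text;
 // Get the string's bytes in the system default encoding, then re-read them as UTF-8.
 byte[] bytes = Encoding.Default.GetBytes(myString);
 myString = Encoding.UTF8.GetString(bytes);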

In your case you can read the WWW response as bytes straight away, and then convert those bytes to a string using the encoder:

 yourString = Encoding.UTF8.GetString(<your_WWW_bytes>);
Comment by Nis Baggesen · Jan 02, 2014 at 05:10 PM

But that is exactly what I'm saying is not working: "I've tried grabbing www.bytes instead and running those through System.Text.Encoding.Convert, but that gives me the same result."

I've tried

 string sText= Encoding.UTF8.GetString(www.bytes);

and

 byte[] bConvert = UnicodeEncoding.Convert(Encoding.UTF8,Encoding.Unicode,www.bytes);
 string sText= Encoding.Unicode.GetString(bConvert);

and of course just grabbing the raw www.text.

In all cases I end up with unconverted escape codes in the resulting unicode string.
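
A minimal sketch of why none of these decoding steps can help, assuming the webservice sends each escape as six plain ASCII characters (a backslash, "u" and four hex digits), which JSON permits. In that case the escape is part of the JSON syntax, not of the byte encoding, so any correct UTF-8 decoder hands it back unchanged. The class and method names here are hypothetical, and the short payload stands in for the real response.

```csharp
using System;
using System.Text;

public static class EscapeDemo
{
    // Hypothetical helper: decodes a sample payload the way www.bytes would be decoded.
    public static string DecodeSample()
    {
        // Assumption: the wire data holds the literal characters \ u 0 0 f 8,
        // not the two-byte UTF-8 form of 'ø'.
        byte[] payload = Encoding.UTF8.GetBytes("k\\u00f8be");

        // For ASCII input, UTF-8 decoding maps every byte to one character,
        // so the backslash sequence survives untouched.
        return Encoding.UTF8.GetString(payload);
    }
}
```

Here EscapeDemo.DecodeSample() returns a nine-character string that still contains the backslash sequence, rather than the four-character word with 'ø'. Resolving the escape is the job of whatever interprets the JSON, not of the Encoding classes.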

Comment by Nis Baggesen · Jan 02, 2014 at 05:11 PM

But thanks for the answer - At least it tells me that I wasn't quite wrong in what I thought should be working. :)


Answer by Nis Baggesen · Jan 02, 2014 at 06:21 PM

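Oh well. It seems like this is a basic problem with C# and Unicode escape sequences. They are only actually resolved in very specific places (by the compiler, in identifiers, character literals and string literals in source code), not in strings received at runtime: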

http://msdn.microsoft.com/en-us/library/aa664669%28v=vs.71%29.aspx

Guess I will have to write my own converter.
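
One way such a converter could look, as a sketch rather than the code that was actually written for this project: a regular expression finds each backslash-u sequence and replaces it with the UTF-16 code unit it names. The class name UnicodeEscapes and the method Unescape are hypothetical. The sketch deliberately ignores the other JSON escapes (\n, \" and so on), and it assumes the input never contains an escaped backslash directly followed by "u" and four hex digits, which it would convert by mistake.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeEscapes
{
    // Matches a backslash, the letter 'u', and exactly four hex digits.
    private static readonly Regex Escape = new Regex(@"\\u([0-9a-fA-F]{4})");

    // Replaces every \uXXXX sequence with the UTF-16 code unit it names.
    // Surrogate pairs written as two consecutive escapes come out as two
    // consecutive code units, which is how a C# string stores them anyway.
    public static string Unescape(string input)
    {
        return Escape.Replace(input, delegate(Match m)
        {
            int codeUnit = Convert.ToInt32(m.Groups[1].Value, 16);
            return ((char)codeUnit).ToString();
        });
    }
}
```

It is probably safest to call UnicodeEscapes.Unescape on individual string values after they have been pulled out of the JSON, not on the whole of www.text: an escaped quote such as \u0022 would otherwise turn into a real quote and break the structure. A JSON parser that resolves escapes itself makes the converter unnecessary.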

Comment by DaveA · Jan 02, 2014 at 06:29 PM

You may still leverage the Encoding class(es) but probably will need more than one line of code.

Comment by anonymous · Nov 22, 2017 at 03:24 AM

Did you ever figure this one out? Three and a half years later having similar issues (see: https://answers.unity.com/questions/1433716/jsonutility-deserialization-via-json-file-doesnt-w.html?childToView=1433926#comment-1433926) and the above MSDN link is expired.

Sorry to bring up a cold case, just seeing if you found a workaround.
