Downloading text from web gives html code.
Hey,
So I managed to connect to the server and download data. The problem is that I want the text from the file (.html or .txt), but when I run the code it returns HTML code. I did it with the WWW class and used www.text.
Any ideas?
Zotey
If I understood you right, you want the text as displayed by HTML, but you are getting the HTML source from www.text.
This is expected. HTML is in essence a text file, and www.text just gives you the raw text, so the HTML code will be there. It does not parse the HTML file to get the text content between tags.
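To illustrate what "parsing the file to get the text content between tags" would involve, here is a minimal sketch in Python rather than Unity C#. The names `TextExtractor` and `visible_text` are made up for this example, and it only approximates what a browser displays (it drops tags and skips script/style contents).

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not part of a script or stylesheet.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def visible_text(html_source):
    """Return the text between tags, joined with single spaces."""
    parser = TextExtractor()
    parser.feed(html_source)
    parser.close()
    return " ".join(parser.parts)
```

A raw download corresponds to `html_source` here; the extraction step is something the client has to do itself.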
You understood me right, but in my HTML file there is just a sentence; there are not even any tags. My problem is that the code that is displayed doesn't even contain the actual sentence. If I got my own HTML (or my sentence) back, it would be easy to locate it between the tags.
Here you go. Remember that the only thing I want to do is download a string of text.
It seems like you are trying to execute client-side JavaScript code. I am not a JS expert, but WWW can't execute JavaScript for you on the client side. You have to get text output from the server, and any script execution has to happen on the server.
Answer by Bunny83 · Sep 06, 2015 at 12:16 PM
Well, it turns out your web server / web framework injects an HTML page in front of the actual content. That page is meant to set a cookie via JavaScript and then immediately reload the actual data with the suffix: News.html?ckattempt=1. This went by almost unnoticed when testing the link in the browser. However, I use a local proxy that has a log window ^^.
So the problem is your web server. It might have some kind of strange session management. Whatever it is, it is not compatible with your Unity application, as you can't execute (web) JavaScript code in your Unity application.
The only solution is to either:
find a better webserver / host which allows plain content access
find a way to disable this automatic cookie insertion.
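Neither option can be fixed from the client, but a client can at least notice that it received the challenge page instead of the file. The following is a rough sketch in Python (not Unity C#). The function name `find_challenge_redirect` and the heuristic itself are assumptions for illustration, based only on the markers visible in the page quoted in this answer (`document.cookie` and a `location.href` assignment).

```python
import re

# Matches: location.href="http://...";  (double-quoted target only)
REDIRECT_RE = re.compile(r'location\.href\s*=\s*"([^"]+)"')


def find_challenge_redirect(body):
    """Return the redirect URL if body looks like a cookie-setting
    JavaScript challenge page, otherwise None.

    Heuristic only: requires a <script> tag, a document.cookie
    assignment and a location.href redirect in the response body.
    """
    if "<script" not in body.lower() or "document.cookie" not in body:
        return None
    match = REDIRECT_RE.search(body)
    return match.group(1) if match else None
```

If this returns a URL, the downloaded text is the injected page and not the file's content. Requesting that URL without the expected cookie would presumably just return the challenge again.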
edit
The actual page returned is:
<html>
<body>
<script type="text/javascript" src="/aes.js" >
</script>
<script>
function toNumbers(d){
var e=[];
d.replace(/(..)/g,function(d){
e.push(parseInt(d,16))}
);
return e}
function toHex(){for(var d=[],d=1==arguments.length&&arguments[0].constructor==Array?arguments[0]:arguments,e="",f=0;f<d.length;f++)e+=(16>d[f]?"0":"")+d[f].toString(16);
return e.toLowerCase()}
var a=toNumbers("f655ba9d09a112d4968c63579db590b4"),b=toNumbers("98344c2eee86c3994890592585b49f80"),c=toNumbers("a67301d8765d38398ec776b3e21b132e");
document.cookie="__test="+toHex(slowAES.decrypt(c,2,a,b))+"; expires=Thu, 31-Dec-37 23:55:55 GMT; path=/";
location.href="http://dgdataserver.byethost17.com/Data/News/News.html?ckattempt=1";
</script>
<noscript>
This site requires Javascript to work, please enable Javascript in your browser or use a browser with Javascript support
</noscript>
</body>
</html>
Furthermore that page loads the aes.js file from your root which is used to decrypt the cookie content.
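For anyone reading the script: `toNumbers` and `toHex` are plain hex helpers, and only the `slowAES.decrypt` call needs aes.js. A minimal Python sketch of the two helpers follows. The names `to_numbers` and `to_hex` are my own, and the sketch assumes an even-length hex string (the JavaScript version silently ignores a trailing odd character, while `bytes.fromhex` raises an error).

```python
def to_numbers(hex_string):
    """Hex string -> list of byte values, like toNumbers() in the page."""
    return list(bytes.fromhex(hex_string))


def to_hex(values):
    """List of byte values -> lowercase hex string, like toHex() in the page."""
    return bytes(values).hex()


# The three constants in the page are each 32 hex digits, i.e. 16 bytes.
a = to_numbers("f655ba9d09a112d4968c63579db590b4")
b = to_numbers("98344c2eee86c3994890592585b49f80")
c = to_numbers("a67301d8765d38398ec776b3e21b132e")
```

The cookie value is the hex encoding of whatever `slowAES.decrypt(c,2,a,b)` returns. Which argument is the key and which is the IV is not stated in the thread, so the sketch does not attempt the decryption.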
I've just found this one:
https://clicky.com/forums/?id=18265
It seems that:
"This is a newly introduced bot detection feature on free hosting to prevent unwanted bots". See if you can disable that "feature" somehow. If not, you'll have to look for a new host ^^.
I guess it's that module:
Indeed. After I contacted our webmaster, he said it's a protection against bots. But that seems ridiculous to me. Time to switch back to our own servers.
Thanks for the help all.
Answer by wibble82 · Sep 06, 2015 at 10:48 AM
Hi there
It seems that what your server file contains is:
News: <color=reg>Test</color>
This is a test.
I'm not sure if you're generating text on the server, but if what you want is genuinely just a text file, then store it as a '.txt' file. Depending on the web server, it may provide extra information about the file if you're serving HTML. There's no script I can see on that page though, so I'm guessing it's just a plain text file you want?
I know quite a few developers who have done exactly that for their 'news' pages. Then if you want anything funky, you can just add a bit of your own 'markup' to your text file and interpret how you like on the client.
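As a rough idea of what "interpreting your own markup on the client" can look like, here is a small Python sketch (not Unity C#) that splits a line into plain and coloured segments. The function name `parse_color_markup` and the `<color=name>...</color>` tag syntax are assumptions taken from the sample file content above. It does not handle nested tags.

```python
import re

# Matches <color=NAME>TEXT</color>, non-greedy so adjacent tags stay separate.
TAG_RE = re.compile(r"<color=(\w+)>(.*?)</color>", re.DOTALL)


def parse_color_markup(text):
    """Split text into (color, segment) pairs; color is None for plain text."""
    segments = []
    pos = 0
    for match in TAG_RE.finditer(text):
        if match.start() > pos:
            # Plain text between the previous tag and this one.
            segments.append((None, text[pos:match.start()]))
        segments.append((match.group(1), match.group(2)))
        pos = match.end()
    if pos < len(text):
        # Trailing plain text after the last tag.
        segments.append((None, text[pos:]))
    return segments
```

The client can then render each segment however it likes, for example by mapping the colour name to a UI colour.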
-Chris
@zotey:
Right. If the text should only be text, use ".txt" and not ".html". HTML stands for Hyper Text Markup Language.
Apart from that, the text doesn't contain a valid HTML tag. I guess that the "color=reg" tag is meant to represent a Unity rich text tag. This is probably a typo and you wanted to write "color=red".
Anyway, those rich text tags are not valid HTML tags. So using .html makes no sense, as the content is not HTML and a browser can't interpret those tags correctly.
The .text property of the WWW class returns the file as it is on your server.
I know I need to store it in a .txt file, and that is exactly what I did the first time. But the problem is the same: if I store the data in a .txt file, I get the same HTML code as if I store the data in an .html file.
Could you post the HTML you're getting? As I say, the actual file on the server contains the above, so that's what you're going to get sent.