Question by AbdurrahmanKhallouf · Nov 10, 2016 at 09:06 AM · unity 5 · www · web · html

scraping from a web page with minimal use?

So all I want to do, basically, is get the first 3 letters from this page (or similar ones), because the Amazon API won't support average user rating... I found some answers that suggested using libraries like WatiN or HTML Agility Pack. However, as I said, all I want is a simple 3 letters.

And when I use www.text, it returns some JavaScript and HTML that has nothing to do with the actual page we see when following the link.

So is there any way to scrape without using a library? And if not, what is the most straightforward/fastest tool to do it? Also, if I used a library, would it even help, since I am getting unrelated HTML? This is a sample of print(www.text):

 <!DOCTYPE html>
 <!--[if lt IE 7]> <html lang="en-us" class="a-no-js a-lt-ie9 a-lt-ie8 a-lt-ie7"> <![endif]-->
 <!--[if IE 7]>    <html lang="en-us" class="a-no-js a-lt-ie9 a-lt-ie8"> <![endif]-->
 <!--[if IE 8]>    <html lang="en-us" class="a-no-js a-lt-ie9"> <![endif]-->
 <!--[if gt IE 8]><!-->
 <html class="a-no-js" lang="en-us"><!--<![endif]--><head>
 <meta http-equiv="content-type" content="text/html; charset=UTF-8">
 <meta charset="utf-8">
 <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
 <title dir="ltr">Robot Check</title>
 <meta name="viewport" content="width=device-width">
 <link rel="stylesheet" href="https://images-na.ssl-images-amazon.com/images/G/01/AUIClients/AmazonUI-3c913031596ca78a3768f4e934b1cc02ce238101.secure.min._V1_.css">
 <script>
 
 if (true === true) {
     var ue_t0 = (+ new Date()),
         ue_csm = window,
         ue = { t0: ue_t0, d: function() { return (+new Date() - ue_t0); } },
         ue_furl = "fls-na.amazon.com",
         ue_mid = "ATVPDKIKX0DER",
         ue_sid = (document.cookie.match(/session-id=([0-9-]+)/) || [])[1],
         ue_sn = "opfcaptcha.amazon.com",
         ue_id = '0ZM10RV1AJWTWTYPNSA7';
 }
 </script>
 </head>
 <body>
 
 <!--
         To discuss automated access to Amazon data please contact api-services-support@amazon.com.
         For information about migrating to our APIs refer to our Marketplace APIs at https://developer.amazonservices.com/ref=rm_c_sv, or our Product Advertising API at https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html/ref=rm_c_ac for advertising use cases.
 -->
 
 <!--
 Correios.DoNotSend
 -->
 
 <div class="a-container a-padding-double-large" style="min-width:350px;padding:44px 0 !important">
 
     <div class="a-row a-spacing-double-large" style="width: 350px; margin: 0 auto">
 
         <div class="a-row a-spacing-medium a-text-center"><i class="a-icon a-logo"></i></div>
 
         <div class="a-box a-alert a-alert-info a-spacing-base">
             <div class="a-box-inner">
                 <i class="a-icon a-icon-alert"></i>
                 <h4>Enter the characters you see below</h4>
                 <p class="a-last">Sorry, we just need to make sure you're not a robot. For best results, please make sure your browser is accepting cookies.</p>
                 </div>
             </div>
 
             <div class="a-section">
 
                 <div class="a-box a-color-offset-background">
                     <div class="a-box-inner a-padding-extra-large">
 
                         <form method="get" action="/errors/validateCaptcha" name="">
                             <input type=hidden name="amzn" value="pS3mS9njBQknPlFyK0aYHg==" /><input type=hidden name="amzn-r" value="&#047;gp&#047;customer&#045;reviews&#047;widgets&#047;average&#045;customer&#045;review&#047;popover&#047;ref&#061;dpx_acr_pop_?contextId&#061;dpx&amp;asin&#061;B01MCRBY4X" /><input type=hidden name="amzn-pt" value="CustomerReviews" />
                             <div class="a-row a-spacing-large">
                                 <div class="a-box">
                                     <div class="a-box-inner">
                                         <h4>Type the characters you see in this image:</h4>
                                         <div class="a-row a-text-center">
                                             <img src="https://images-na.ssl-images-amazon.com/captcha/nzwwotmg/Captcha_cytjkvkwvw.jpg">
                                         </div>
                                         <div class="a-row a-spacing-base">
                                             <div class="a-row">
                                                 <div class="a-column a-span6">
                                                 </div>
                                                 <div class="a-column a-span6 a-span-last a-text-right">
                                                     <a onclick="window.location.reload()">Try different image</a>
                                                 </div>
                                             </div>
                                             <input autocomplete="off" spellcheck="false" placeholder="Type characters" id="captchacharacters" name="field-keywords" class="a-span12" autocapitalize="off" autocorrect="off" type="text">
                                         </div>
                                     </div>
                                 </div>
                             </div>
 
                             <div class="a-section a-spacing-extra-large">
 
                                 <div class="a-row">
                                     <span class="a-button a-button-primary a-span12">
                                         <span class="a-button-inner">
                                             <button type="submit" class="a-button-text">Continue shopping</button>
                                         </span>
                                     </span>
                                 </div>
 
                             </div>
                         </form>
 
                     </div>
                 </div>
 
             </div>
 
         </div>
 
         <div class="a-divider a-divider-section"><div class="a-divider-inner"></div></div>
 
         <div class="a-text-center a-spacing-small a-size-mini">
             <a href="http://www.amazon.com/gp/help/customer/display.html/ref=footer_cou?ie=UTF8&nodeId=508088">Conditions of Use</a>
             <span class="a-letter-space"></span>
             <span class="a-letter-space"></span>
             <span class="a-letter-space"></span>
             <span class="a-letter-space"></span>
             <a href="http://www.amazon.com/gp/help/customer/display.html/ref=footer_privacy?ie=UTF8&nodeId=468496">Privacy Policy</a>
         </div>
         <div class="a-text-center a-size-mini a-color-secondary">
           &copy; 1996-2014, Amazon.com, Inc. or its affiliates
           <script>
            if (true === true) {
              document.write('<img src="https://fls-na.amaz'+'on.com/'+'1/oc-csi/1/OP/requestId=0ZM10RV1AJWTWTYPNSA7&js=1" />');
            };
           </script>
           <noscript>
             <img src="https://fls-na.amazon.com/1/oc-csi/1/OP/requestId=0ZM10RV1AJWTWTYPNSA7&js=0" />
           </noscript>
         </div>
     </div>
     <script>
     if (true === true) {
         var elem = document.createElement("script");
         elem.src = "https://images-na.ssl-images-amazon.com/images/G/01/csminstrumentation/csm-captcha-instrumentation.min._V" + (+ new Date()) + "_.js";
         document.getElementsByTagName('head')[0].appendChild(elem);
     }
     </script>
 </body></html>
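
The response above is not the product page: its title is "Robot Check" and its form posts to /errors/validateCaptcha, so it appears to be a CAPTCHA page served to automated requests. A minimal sketch of how such a response could be recognised before trying to parse it follows. It is written in Python only to illustrate the logic; the helper names `page_title` and `is_robot_check` are hypothetical, and the two markers it checks are taken from this one sample, so they may not hold for other responses.

```python
import re


def page_title(html):
    """Return the text of the first <title> element, or None if there is none."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def is_robot_check(html):
    """Heuristic only: True if the response looks like the CAPTCHA page above.

    Both markers ("Robot Check" title, validateCaptcha form action) come from
    the single sample in this post and are assumptions about other responses.
    """
    return page_title(html) == "Robot Check" or "/errors/validateCaptcha" in html


if __name__ == "__main__":
    sample = '<html><head><title dir="ltr">Robot Check</title></head><body></body></html>'
    print(is_robot_check(sample))
```

If a check like this returns true, no parsing library would help, because the rating is simply not present in the text that was downloaded.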

Also, using print(www.text) on some pages like this one gives an empty string, even though it's a valid link...

please, any info might help
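
For the "just the first 3 letters" part, a regular expression may be enough without a full HTML library, provided the downloaded text is the real page. The sketch below is in Python purely for illustration, and the same pattern should carry over to other regex engines. It assumes the rating appears in the markup as text like "4.5 out of 5 stars"; that format is an assumption and is not confirmed anywhere in this post. The name `extract_rating` is hypothetical. It also treats an empty response, like the empty string mentioned above, as "no rating found".

```python
import re

# Assumed text format: "<digit>[.<digit>] out of 5 stars". Not confirmed by the post.
RATING_PATTERN = re.compile(r"(\d(?:\.\d)?) out of 5 stars")


def extract_rating(html):
    """Return the rating as a short string such as "4.5", or None if not found."""
    if not html:
        # An empty or missing response has nothing to search.
        return None
    match = RATING_PATTERN.search(html)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_rating('<span class="a-icon-alt">4.5 out of 5 stars</span>'))
    print(extract_rating(""))
```

The `None` result is deliberate: it lets the caller tell "page fetched but no rating in it" apart from a real value, instead of slicing the first three characters of whatever came back.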


0 Replies
