Latest Inquiries - Data Extraction Software

Couple of Issues while using VWR

Submitted: 1/4/2016

Hello,

I need help with some issues I am facing in web scraping.

I want to scrape some data from the following link:

http://products.tradeindia.com/apparel-fashion/

Issue 1: For paging through the data I am using the page navigation function, but only 30 pages are scraped. After that, the site shows pages 30 to 60, 60 to 90, and so on. Can you help me scrape the data from all pages?

Issue 2: For "view contact details", clicking the link shows "javascript%20return%20false" in the HTML section (please refer to the 2nd file) and the proper data does not load. This happens often for link data. Can you guide me in solving this problem?

Issue 3: I want to use a proxy server for http://products.tradeindia.com/

 

Can you please help me solve all of these problems ASAP?

Regards

Hitesh

Tradeindia Final.rip

Replied: 2/13/2016 6:46:45 AM

Hello team,

None of the issues are resolved.

For issue 1, please refer to the image "Issue 1".

When there are more than 30 pages, the site has an option to click through to the remaining pages, and the data on those remaining pages is not scraped.

Let's discuss a case:

http://products.tradeindia.com/apparel-fashion/footwear/safety-shoes/

It has 765 records available in 39 pages (20 each). The .rip file attached here has the page navigation function, which scrapes data up to page 30. After that, the website has an option to click "next 9 of remaining 9", which is not recorded in this .rip file.
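The page counts in this case can be checked with a short sketch. It assumes the pager shows at most 30 page links per block, as observed above; the function name `pager_blocks` is made up for illustration and is not part of Visual Web Ripper.

```python
import math

def pager_blocks(total_records, per_page, pages_per_block):
    """Return the total page count and the (first, last) page of each pager block."""
    total_pages = math.ceil(total_records / per_page)
    blocks = []
    start = 1
    while start <= total_pages:
        # A block ends after pages_per_block pages or at the last page.
        end = min(start + pages_per_block - 1, total_pages)
        blocks.append((start, end))
        start = end + 1
    return total_pages, blocks

# Figures from the safety-shoes category: 765 records, 20 per page,
# and an assumed 30 page links per pager block.
total_pages, blocks = pager_blocks(765, 20, 30)
remaining_after_first_block = total_pages - blocks[0][1]
```

Under these assumptions the second block covers pages 31 to 39, which is consistent with the "next 9 of remaining 9" link.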

I need help scraping the remaining 9 pages as well, so that I scrape 100% of the data.

Please let me know if any further clarification is required.


Hitesh


Replied: 1/5/2016 7:47:12 AM

For issue 1, you might try setting 'Dynamic list of links' instead of 'List of links' for the page navigation template. However, I can't see where pages 30, 60, 90 ... appear. Can you please attach your log file to clarify further? It would also help if you told me which category shows this case.

For issue 2, you applied a link transformation to the 'detail' link template, so the JavaScript link has been converted to a direct link. In any case, I'm unable to see 'Javascript%20return%20false' in the output log lines.
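As a rough illustration of what the reported string means, the sketch below URL-decodes it and flags hrefs that are JavaScript stubs rather than real addresses. This is generic Python, not how Visual Web Ripper performs its link transformation; the helper name `is_javascript_stub` is hypothetical.

```python
from urllib.parse import unquote

def is_javascript_stub(href):
    """Return True when an href decodes to a JavaScript stub rather than a real URL."""
    decoded = unquote(href).strip().lower()
    return decoded.startswith("javascript")

# The value reported in issue 2, decoded from its percent-encoded form.
decoded_value = unquote("javascript%20return%20false")
```

A link that decodes this way carries no target address of its own, so any data behind it has to be reached some other way, for example through a transformed direct link.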

For issue 3, you can try the free private proxy switch. Please refer to the topic linked below:

http://manual.visualwebripper.com/default.aspx?manual_id=2033

Replied: 2/15/2016 4:02:36 AM

A good practice is to activate the 'Navigate in browser' button in the toolbar, then manually navigate to the special-case link:

http://products.tradeindia.com/apparel-fashion/footwear/safety-shoes/

After you deactivate 'Navigate in browser' and return to edit mode, you can add a new page navigation template to reach the next remaining 9 pages.

See the attached new project. I revised the XPath for the 'page list' page navigation template so that it selects numeric links only. The new 'next' page navigation template selects the last special link, matching 'next x of remaining x' (where x is a number).

Tradeindia Final.rip
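
The split between numeric page links and the 'next x of remaining x' link can be sketched outside the tool as follows. The actual XPath in the attached project is not shown in this thread, so this sketch uses regular expressions on link text as a stand-in; the sample link texts and the function name `classify_pager_links` are assumptions for illustration.

```python
import re

# Link text made up only of digits, e.g. "17".
NUMERIC = re.compile(r"^\d+$")
# Link text of the form "next x of remaining x", where x is a number.
NEXT_BLOCK = re.compile(r"^next\s+\d+\s+of\s+remaining\s+\d+$", re.IGNORECASE)

def classify_pager_links(texts):
    """Split pager link texts into numeric page numbers and the 'next block' link."""
    page_numbers = []
    next_link = None
    for raw in texts:
        text = " ".join(raw.split())  # collapse runs of whitespace
        if NUMERIC.match(text):
            page_numbers.append(int(text))
        elif NEXT_BLOCK.match(text):
            next_link = text  # keep the last matching link
    return page_numbers, next_link

# Hypothetical pager contents, not copied from the site.
pages, next_link = classify_pager_links(
    ["1", "2", "30", "next 9  of remaining 9", "Home"]
)
```

Keeping the two kinds of link in separate templates means the numeric links cover the pages in the current block, while the 'next' link moves to the following block where the numeric links are collected again.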