Latest Inquiries - Data Extraction Software

Yell.com

Submitted: 8/28/2013

Dear Sir/Madam,

I would like to have my project created to acquire the following data from Yell.com.

- Company name

- Company address

- Telephone number

- Mobile number

- Website (if any)

A very important function I want is the ability to scrape only data newly added to Yell.com for specific keyword and location searches. For example, I would search for the keyword 'Plumber' and the location 'Dorset' today. If I ran the same search tomorrow, the scraper would extract only new data which wasn't there previously. Your office have advised that the software can be customised so that only newly added listings are included in the scrape. Is this correct?

Am I also correct that the software will remember the various search criteria and therefore provide the desired results, as described above?

Any other information you think would be helpful would be greatly appreciated.

This about sums up my project requirements.

Awaiting your feedback.

Thanks & Kind Regards.

Mike

Replied: 8/28/2013 10:01:05 PM
Please check the attached demo project.

You need to place the project file in the default projects folder, then run the project in the VWR program.

I've ticked "Duplicate check" for the "Company URL" element, on the assumption that the URL is the unique key for flagging duplicates. The project has been set to "Add to existing data" in Project > Project options > Project data tab, so each run will check for duplicates and export only the new data.
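For readers who want to see the idea outside the tool, here is a minimal sketch of the same incremental approach in Python. It is not how the software is implemented; the function name `export_new_rows`, the CSV store, and the sample column names are assumptions made for illustration. It treats "Company URL" as the unique key and appends only rows whose key is not already stored.

```python
import csv
from pathlib import Path


def export_new_rows(scraped_rows, store_path, key="Company URL"):
    """Append only rows whose key is not yet in the store; return the new rows."""
    store = Path(store_path)

    # Load the keys saved by earlier runs (the "existing data").
    seen = set()
    if store.exists():
        with store.open(newline="", encoding="utf-8") as f:
            seen = {row[key] for row in csv.DictReader(f)}

    # Keep a row only if its key has not been seen, in this run or before.
    new_rows = []
    for row in scraped_rows:
        if row[key] not in seen:
            seen.add(row[key])
            new_rows.append(row)

    # Append the new rows, writing a header only when the store is created.
    if new_rows:
        write_header = not store.exists()
        with store.open("a", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(new_rows[0].keys()))
            if write_header:
                writer.writeheader()
            writer.writerows(new_rows)
    return new_rows
```

Calling the function twice with the same rows returns an empty list the second time, which mirrors the "only export the new data for each run" behaviour described above.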

F.Y.I:

Incremental web scraping
Yell_demo.rip

Replied: 8/30/2013 11:24:59 PM
The design of this project looks good. I've only renamed the element from "Company Name" to "Company URL" under the "Companies" page area template, and ticked the "Duplicate check" option in the Misc tab for "Company URL".

The form field inputs you entered are fine, but please note that this will combine all possible pairings of keyword and location. To make exactly one query per row instead, you can use an input data source (e.g. a CSV file) and assign a specific column of that source to each form field.
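The difference between the two input styles can be sketched in Python as follows. This is only an illustration of the behaviour described, not the tool's own code; the function names and the `keyword` / `location` column headers are assumptions.

```python
import csv
import io
from itertools import product

keywords = ["accountants", "bookkeeping services"]
locations = ["AB", "AL", "B"]


def all_combinations(keywords, locations):
    """Form-field lists: every keyword is paired with every location."""
    # product() varies the last argument fastest, so each keyword is run
    # against all locations before moving on to the next keyword.
    return list(product(keywords, locations))


def queries_from_csv(csv_text):
    """CSV input source: each row yields exactly one (keyword, location) query."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["keyword"], row["location"]) for row in reader]
```

With two keywords and three locations, `all_combinations` yields six queries, whereas a CSV with two rows yields only the two pairings actually listed.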

F.Y.I:

Yell-Electricians.rip

Replied: 8/30/2013 5:40:17 AM
Hi Simon,

As always, thanks very much for your response.

Please note that I have now purchased a full version of the product.

I have now amended the project file, so may I kindly ask that you check the following;

- Data entered into the Form fields for both Keyword and Location are workable as is.
- Can you set the 'duplicate check' for the 'Company URL' element as previously described. I cannot for the life of me figure how to do this.

Am I correct that, the way I have now set everything up, the project will scrape the keyword 'accountants' for all locations (AB, AL, B etc.), navigating through all page links numbered 1-10, then move on to the next keyword, 'bookkeeping services', and do the same, and so on for each remaining keyword?

Awaiting your feedback.


Thanks & Kind Regards.

Mike
Yell-Electricians.rip

Replied: 8/29/2013 8:02:54 PM
See the attached new project.

You can enter two lines, one for each keyword to be submitted. You can still feed the keywords from an external input data source into the "keyword" form field.

F.Y.I:

FormSubmit

Yes, the page navigation template does navigate to the next page; you set that up correctly.

I've noticed that you set the "duplicate check" option for "Company name". That is fine if you want to use that element as the unique key.

If you want to test the project fully, you should purchase the professional version, which has no limitations. We provide a 30-day refund policy.
Yell-Electricians.rip

Replied: 8/29/2013 4:11:37 PM
Hi Simon,

Further to my previous communication, I think I have managed to further customise the project successfully.

Regarding my objective of;

- performing multiple location searches to one keyword 'electricians & electrical contractors'
- whilst having only new data exported, in comparison to previous searches (removing duplicate data via 'Company URL')
- whilst also having the scraper navigate through page links numbered 1-10 at the bottom of the page
- to produce an Excel report containing Company name, Company address, Telephone, Mobile and Website

Would you please be able to review the attached project file and advise if all is correct in relation to my above objectives.

On a separate note, is it possible to have multiple keywords in the keyword form field (as I have in the location form field), alongside multiple locations in the location form field, and complete a successful scrape in one 'Project Run'?

I'm sorry if I'm asking a lot, but I'm unable to test fully as the software limits me while I am on the Trial.

Your time and assistance are greatly appreciated.


Thanks & Kind Regards.

Mike

Yell-Electricians.rip