Latest Inquiries - Data Extraction Software

College Extraction

Submitted: 4/14/2013
I simply want to be able to go to a specific web site, extract name, email and phone number, and place them on an Excel spreadsheet. I want to extract that information for two different groups: students on one and staff & faculty on the other. So I will have two spreadsheets.
Replied: 4/14/2013 9:44:25 PM
Please give me some guidance on what you have in mind. I'm not sure what you need from the website. Thanks.
Replied: 4/14/2013 5:29:27 PM
Could you please explain how to reach the two different groups and the particular data on the website? It would help if you could attach a few screenshots. Thanks.
Replied: 4/18/2013 12:34:46 AM
Thank you, got it.

Please see the attached project and sample data extract. Place both the project file and the input file in the default Visual Web Ripper project folder (My Documents\Visual Web Ripper\Projects).

The project extracts data for staff only. I'm not sure how to get to the student data without having a username/password for that area of the website.
UscStaff.xls
First Names - Popular.txt
UscStaff.rip

Replied: 4/17/2013 11:26:38 PM
Simon is asking you how to do this on the website in a standard web browser, not how to do it in our software.

I assume you want to submit a web form for all the names in your text file and extract name, email and phone number. This is an easy task, but we need to know where the search form is located on the website. This is a huge website, and it would take a long time to browse the whole site trying to find this search form. All we need is for you to provide us with the URL to this search form, or tell us how to navigate to the search form from the homepage.

The URL (https://my.usc.edu/wp/Faculty/SearchForm.do) that is shown on your screenshot cannot be navigated directly, so we need the URL to the page where you submit the search form.
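
To illustrate what "submit a web form for all the names in your text file" means in general terms, here is a minimal Python sketch. It is not how Visual Web Ripper works internally and it does not touch the USC website. The function names (load_names, run_searches, to_csv), the record keys and the fake_search stub are all assumptions made for illustration; a real project would replace fake_search with the actual form submission once the search form URL is known.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


def load_names(text: str) -> List[str]:
    """Return non-empty names from a one-name-per-line list, dropping repeats (case-insensitive)."""
    seen = set()
    names = []
    for line in text.splitlines():
        name = line.strip()
        if name and name.lower() not in seen:
            seen.add(name.lower())
            names.append(name)
    return names


def run_searches(names: Iterable[str], search: Callable[[str], List[Record]]) -> List[Record]:
    """Call search(name) once per name and keep each email address only once."""
    rows = []
    seen_emails = set()
    for name in names:
        for record in search(name):
            email = record["email"].lower()
            if email not in seen_emails:
                seen_emails.add(email)
                rows.append(record)
    return rows


def to_csv(rows: List[Record]) -> str:
    """Render the rows as CSV text that Excel can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["name", "email", "phone"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def fake_search(first_name: str) -> List[Record]:
    """Hypothetical stand-in for one search form submission; the data is made up."""
    directory = [
        {"name": "Mary Example", "email": "mary@example.edu", "phone": "555-0100"},
        {"name": "John Example", "email": "john@example.edu", "phone": "555-0101"},
    ]
    return [r for r in directory if r["name"].split()[0].lower() == first_name.lower()]


# One search per name in the input list, then one spreadsheet of results.
names = load_names("Mary\n\njohn\nMARY\n")
print(to_csv(run_searches(names, fake_search)))
```

The search step is passed in as a function so the loop over the input list stays the same no matter where the form lives, which is why the form's location is the only missing piece.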
Replied: 4/14/2013 10:37:10 PM
I simply want to be able to go to a specific web site, extract name, email and phone number, and place them on an Excel spreadsheet. I want to extract that information for two different groups: students on one and staff & faculty on the other. I am looking for a template of some kind to do this. If there isn't one, send me the instructions and I will do it myself.
Replied: 4/17/2013 9:43:26 PM
Hello again. I have to tell you that I misunderstood my specs and now understand better what I need. I actually called the office (Sequentum) and discussed it with them, and they told me it CAN be done. So here are the proper specs.

I have attached a screen print of the USC Faculty web page. I want to be able to go there with a list of first names, last names, famous names, popular names, not-so-popular names and alternate names, and use these as a template to determine which email addresses I pull to place on my Excel faculty list. Then I do the same thing with students, again using the list of names. These names come in txt files. I will attach one to give you an idea.

The person I spoke to earlier thought it was easy to do, so please let me know when.

Thanks,

Clive.

Screen Print of USC Faculty Web Site.docx
First Names - Popular.txt

Replied: 4/17/2013 7:48:40 PM
I'm not able to visit the URL shown in your screenshot:


Do I need a username/password to sign in first?

I'm not sure what the list of names (i.e., the txt file you attached) will be used for. Do you want to run a search for each name in the list? If so, please tell me how to do it.

It would help if you could attach a screenshot pointing out where the fields are.

Thanks.
Replied: 4/17/2013 9:59:27 PM
YOU SAID:

"I'm not sure what the list of names(i.e, the txt file you attached) will be used for? not sure you would want to do a search in the list of names, if so, please tell me how to do it?"


OK, I am trying again. I have attached the screenshots from the home page. I don't have a password. The people at your office told me to cut and paste the names in the text file into the capture area. I got the impression that you knew how to use this software. Are you not certified or something? That's why I am asking YOU. If you are not sure, can you please ask someone there?
Screen Print of Web Ripper Software.docx

Replied: 4/18/2013 8:15:18 AM
Thanks much, I will check it out.

Clive.

Replied: 4/18/2013 6:47:52 PM
Hi Guys, I have a few questions. I did follow your directions and the project went well. I was able to extract the data to the limits of the trial version.

My first question is where do I go to edit the code?

For example, the project is set up for first names. What happens if I want to use last names instead? Where do I go to change that?

Also, where do I specify the name of the txt file I read in? Can I have several txt files in the Projects directory?

Thanks in advance for the answers,

Clive.

Replied: 4/18/2013 8:36:53 PM
Unfortunately, there's no simple answer to your first question. You'll need to learn how to develop simple data extraction projects. I suggest you start by watching some of the training videos listed on this webpage:


The beginner video will show you how to develop a simple data extraction project.

You can set the input file on the "Input Data Source" screen. Please see this page in the manual for more information.



Replied: 4/14/2013 8:10:55 PM
Hi Simon, my thought was to look for "Student" in the URLs or meta-tags for each page searched and put all the others in Staff and Faculty. If you have other ways of making that determination, please feel free to do it too.
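
The keyword idea above can be sketched in a few lines of Python. This is only an illustration of the heuristic itself, applied to records that have already been collected. It is not a Visual Web Ripper feature, and the function names (classify, split_by_group), the record keys and the example URLs are assumptions made up for the example.

```python
from typing import Dict, List

Record = Dict[str, str]


def classify(url: str, meta_text: str = "") -> str:
    """Return 'students' if 'student' appears in the URL or meta text, else 'staff_faculty'."""
    haystack = f"{url} {meta_text}".lower()
    return "students" if "student" in haystack else "staff_faculty"


def split_by_group(records: List[Record]) -> Dict[str, List[Record]]:
    """Sort already-extracted records into the two target spreadsheets."""
    groups: Dict[str, List[Record]] = {"students": [], "staff_faculty": []}
    for record in records:
        group = classify(record["url"], record.get("meta", ""))
        groups[group].append(record)
    return groups


# Hypothetical records; the URLs and meta text are invented for the example.
sample = [
    {"name": "Jane Example", "url": "https://example.edu/Students/jane"},
    {"name": "Bob Example", "url": "https://example.edu/faculty/bob"},
    {"name": "Ann Example", "url": "https://example.edu/p/1", "meta": "Student directory"},
]
for group, members in split_by_group(sample).items():
    print(group, [m["name"] for m in members])
```

A plain substring match like this would also catch pages such as "student-affairs" staff listings, so it works only as a rough first pass.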
Replied: 4/14/2013 10:47:15 PM

Unfortunately, we cannot provide a demo project based on the instructions you have provided.

We need to configure a data extraction project to navigate to the part of the website where the data can be extracted. A data extraction project cannot go to every page on a website and search for something that looks like student or staff data.

Basically, the software cannot find data for you. It can only extract data if you know exactly where the data is located on a website.

Replied: 4/17/2013 11:57:14 PM
Hi, I sent you a series of screen shots and directions to get to the page from the home page www.usc.edu. I am sending it again.
USC SITE UPDATED SCREENSHOT.docx