Latest Inquiries - Data Extraction Software

Running RIP via Scheduler not working

Submitted: 12/3/2016

Hi, my company asked me to search for scraper tools. I found VWR while googling and downloaded the trial version.

My first assignment is to scrape this site, https://www.breeze.ca.gov/, and what I see is that the site uses a JavaScript redirect to a different page when the start URL loads.

I wrote a RIP for the site with these sample license numbers: 6281469, 1815409. It runs and gives me the exported XLS data file perfectly when I run it manually via VWR, but when I set up a scheduler for the same project and run it, it gives me a time-out error and then a blank exported XLS file.

As I am new to VWR, please help me as soon as possible so that I can finalize the scraper for my company and move on to other scraping assignments. I am attaching the RIP and the log files that I got for both the VWR and Scheduler runs.

CAGuardCardLicVerify.rip
CAGuardCardLicVerify_info_16_12_03-VWR-Run.log
CAGuardCardLicVerify_info_16_12_03-Scheduer-Run.log

Replied: 12/7/2016 12:32:45 PM

Hi Virendra,

You can try out Content Grabber: https://contentgrabber.com/

Replied: 12/7/2016 6:32:36 AM

Hi,

Have you tried the following option? Some AJAX sites can only execute their JavaScript when the task uses 'Run only when user is logged on' in Windows Task Scheduler. I also suggest that you upgrade to IE 11 to make sure that AJAX sites work smoothly.

When you run a Windows task with the option Run whether logged on or not, the task is run without an interactive desktop. JavaScript on some websites does not work correctly without an interactive desktop, and in those cases you need to use the option Run only when user is logged on and then make sure the user is always logged on when the task is running.

F.Y.I:

http://manual.visualwebripper.com/default.aspx?manual_id=28
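For anyone who prefers to script this instead of clicking through the Task Scheduler dialog, here is a minimal sketch in Python that builds a `schtasks` command with the `/IT` switch, which is the command-line counterpart of 'Run only when user is logged on'. The task name, the program path, the start time, and the user name below are all placeholders I made up for illustration; replace them with whatever command line you actually use to run your project. The script only prints the command and does not create anything by itself.

```python
import subprocess


def build_schtasks_command(task_name, program, start_time, user):
    """Return the schtasks arguments for a daily, interactive-only task."""
    return [
        "schtasks", "/Create",
        "/TN", task_name,    # name shown in Task Scheduler
        "/TR", program,      # command line the task will run
        "/SC", "DAILY",      # run once a day
        "/ST", start_time,   # 24-hour HH:mm start time
        "/RU", user,         # account the task runs under
        "/IT",               # run only while that user is logged on
        "/F",                # overwrite an existing task with the same name
    ]


if __name__ == "__main__":
    # All four values are hypothetical placeholders, not real VWR paths.
    cmd = build_schtasks_command(
        task_name="CAGuardCardLicVerify",
        program=r"C:\path\to\your\run-project.cmd",
        start_time="09:00",
        user="YOURPC\\youruser",
    )
    # Print the command so it can be reviewed and pasted into a Windows
    # command prompt; nothing is executed here.
    print(subprocess.list2cmdline(cmd))
```

Remember that with `/IT` the task starts only if that user has an active session at the scheduled time, so the machine must stay logged on.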

Replied: 12/5/2016 3:19:22 AM

Hi

I made some changes on the Advanced tab of the agent's Project options. I set the browser to use a sized window and increased the page load timeout and the random page delay.

Let me know if this works for you.

CAGuardCardLicVerify.rip

Replied: 12/5/2016 1:50:06 PM

Hi Brian,

Sorry, it is still not working when run via the scheduler. I am not getting any exported data in the XLS file, and the log file still reports a timeout. The log file from this run is attached. Please help me ASAP.

Thanks,
Virendra

CAGuardCardLicVerify_info_16_12_05.log

Replied: 12/7/2016 6:34:14 AM

Hi Brian,

Do you think this site can be scraped with your ripper? I am still struggling with the timeout error. If it is not possible, then I am out of luck and may have to hunt for a different ripper or crawler.

Thanks,
Virendra