Latest Inquiries - Data Extraction Software

Some rips will not run from a non-interactive session

Submitted: 12/22/2015

Having a problem that I can't seem to diagnose very well.

We run rips nightly in a batch, and a small handful will ONLY run from either the webripper program itself or directly from the command line. In either case, they only run manually (a person has to run them while logged into the machine).

The problem scrapes are run from a Windows service that logs in as the local administrator account, which is the same account used to create the rip file.

The command line looks like this:

"c:\Program Files (x86)\Visual Web Ripper\RunProject.exe"  divc retry_errors
I have tried with and without retry_errors

When there are problems, it just won't get any information from the scrape, as if the first page is giving it issues. I have used "wait for element", set a 10 second pause in the wait code, and told it to "ignore errors".

At this point I am at a loss. Could you maybe point me in the right direction?

Thanks for any help you can provide.
Scott?
divc.rip

Replied: 12/25/2015 3:43:43 AM
If the task is not configured to run only when the user is logged on, there is no interactive desktop session, and JavaScript won't work properly. This depends on how Windows handles the session. Hope that makes sense.
Replied: 12/23/2015 6:33:04 AM

I don't think you can run the project from a Windows service. A Windows service never interacts with the desktop, but most websites need a desktop user interface; otherwise, JavaScript won't work properly.

Have you tried simply executing RunProject.exe in a console? Does it work with this project?

Replied: 12/24/2015 3:18:04 AM

Ok, I just tried setting it to ONLY run when user was logged in.  That DOES appear to allow it to run properly.

This is a problem.  Do you happen to know of a way to allow interactive session if no one is logged in?

Replied: 12/24/2015 2:43:00 AM

When you run a Windows task with the option Run whether logged on or not, the task is run without an interactive desktop. JavaScript on some websites does not work correctly without an interactive desktop, and in those cases you need to use the option Run only when user is logged on and then make sure the user is always logged on when the task is running.

http://manual.visualwebripper.com/default.aspx?manual_id=28

Please try ticking the option "Run only when user is logged on" in the Windows Task Scheduler for the project you want to schedule, then see whether the project runs properly. If not, please attach the info.log file so we can diagnose further. Thanks.
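For anyone who prefers to script the task rather than tick the box by hand, here is a minimal sketch in Python that builds a schtasks command with the /IT switch, which is the command-line counterpart of "Run only when user is logged on". The task name, start time, and user below are made-up placeholders; only the RunProject.exe path and the divc project name come from this thread. Treat it as an illustration, not as the method Visual Web Ripper itself uses.

```python
import subprocess

# Path copied from the command line quoted earlier in this thread.
RUN_PROJECT = r"c:\Program Files (x86)\Visual Web Ripper\RunProject.exe"


def build_schtasks_command(task_name, project, start_time="02:00", user="Administrator"):
    """Return the schtasks argument list for a daily, interactive-only task.

    task_name, start_time and user are placeholders chosen for this example.
    """
    # /TR takes a single string; the exe path contains spaces, so it is
    # wrapped in its own quotes inside that string.
    action = f'"{RUN_PROJECT}" {project}'
    return [
        "schtasks", "/Create",
        "/TN", task_name,
        "/TR", action,
        "/SC", "DAILY",
        "/ST", start_time,
        "/RU", user,
        "/IT",  # run only if the /RU user is logged on when the task starts
        "/F",   # overwrite an existing task with the same name
    ]


def create_task(task_name, project):
    """Register the task; only meaningful on a Windows machine."""
    subprocess.run(build_schtasks_command(task_name, project), check=True)
```

Calling create_task("VWR divc nightly", "divc") from an elevated prompt should register the task. The user still has to be logged on at the scheduled time, exactly as with the checkbox.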

Replied: 12/23/2015 7:52:30 PM

I just tried running it via the built-in "Schedule" option, and it does not run there either.


Please advise

Replied: 12/23/2015 6:41:02 AM

Yes, it does work from an interactive cmd.exe session.  All our other 180 some rips run properly when run the same way this one is failing.

Also, I misspoke in the earlier message.  What we use is not technically a Windows service.  I have a program that gets kicked off by the task scheduler that starts the process.  The task scheduler runs it as the administrator account.

I believe this is the same method that is used when you "schedule" the rip to run from within VWR itself.

There seems to be something unique about this particular rip that is causing it to fail when run from a schedule.
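One way to check which kind of session a scheduled run actually lands in is to log the Windows session ID from the launcher. The sketch below is a rough Python illustration and an assumption on my part, not something from the VWR manual: it calls the Win32 functions GetCurrentProcessId and ProcessIdToSessionId through ctypes. On current Windows versions, services and tasks set to run whether the user is logged on or not normally run in session 0, which has no interactive desktop.

```python
import ctypes
import sys


def current_session_id():
    """Return the Windows session ID of this process, or None if unavailable."""
    if sys.platform != "win32":
        return None  # the Win32 calls below only exist on Windows
    kernel32 = ctypes.windll.kernel32
    session_id = ctypes.c_ulong()  # DWORD output parameter
    pid = kernel32.GetCurrentProcessId()
    if not kernel32.ProcessIdToSessionId(pid, ctypes.byref(session_id)):
        return None  # the API call failed
    return session_id.value


def describe_session(session_id):
    """Turn a session ID into a short, human-readable log line."""
    if session_id is None:
        return "session unknown"
    if session_id == 0:
        return "session 0 (no interactive desktop)"
    return f"session {session_id} (a logged-on user's session)"


if __name__ == "__main__":
    print(describe_session(current_session_id()))
```

If the failing rips only break when this reports session 0, that would point at the missing desktop rather than at the rip files themselves.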

Replied: 12/22/2015 8:59:16 PM
I just wanted to add that we have ~180 scrapes that all seem to work correctly with this setup.  It is just 5 or so that are having issues.  If I can figure out one, I am hoping it will help me fix the others.