Latest Inquiries - Data Extraction Software

Capturing the total row count in the export table

Submitted: 2/29/2016
Dear Support,

I would like to check that a column, for example CompanyName, is not empty before inserting the data into the database.

After the crawl, in the export script (or through any other option), I would like to check whether CompanyName is empty for all the rows in the internal table.

My plan is to run a select statement with the condition companyname = '' and compare its count with the total row count of the internal table.
If the two counts are the same, none of the data will be inserted into the database.
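For readers who want to see the count comparison spelled out, here is a minimal sketch using Python's built-in sqlite3 module. It is not the product's export script API; the table name crawl_results, the JobTitle column, and the function name all_company_names_empty are assumptions made only for illustration, and NULL or whitespace-only values are treated as empty.

```python
import sqlite3


def all_company_names_empty(conn: sqlite3.Connection) -> bool:
    """Return True when no row in the (hypothetical) crawl_results table has a company name."""
    total = conn.execute("SELECT COUNT(*) FROM crawl_results").fetchone()[0]
    empty = conn.execute(
        "SELECT COUNT(*) FROM crawl_results "
        "WHERE CompanyName IS NULL OR TRIM(CompanyName) = ''"
    ).fetchone()[0]
    # An empty table also counts as "nothing worth inserting".
    return total == 0 or empty == total


# Stand-in for the internal table filled by the crawl (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crawl_results (JobTitle TEXT, CompanyName TEXT)")
conn.executemany(
    "INSERT INTO crawl_results VALUES (?, ?)",
    [("Engineer", ""), ("Analyst", None), ("Designer", "  ")],
)

# True here means the whole batch should be skipped instead of inserted.
skip_insert = all_company_names_empty(conn)
print(skip_insert)
```

If at least one row has a company name, the function returns False and the whole batch, including rows with no company, can still be inserted.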

Is there any other option that you could suggest?
Or is there any way to set a condition script for when the website changes its template/XPath?

Replied: 3/2/2016 2:42:31 AM
You can try checking the 'Required element' option for the 'Company' element. Then, if the field is missing in the current row, the whole row won't be exported. Hope that's what you expect.

Replied: 3/1/2016 9:10:33 AM

For example, in this project file I purposely made the CompanyName impossible to capture.
In the export script, I would like to have a script that compares the total row count with the count of rows where the CompanyName is empty. If the numbers are the same, it means none of the crawled data has a CompanyName, and these rows will not be inserted into the database.

Is there any way to do this kind of checking?
Or maybe we could do something when the CompanyName turns "yellow"?


Replied: 3/2/2016 3:18:05 AM

See the attached new project.

I made a new element called 'all_company'. It has condition scripts that check whether the 'all_company' value is empty; if it is, they disable the next 'ContentArea' page area template and the current table rows won't be exported. This is the basic idea of how to filter out data you don't want to export.

However, I was unable to find where 'company' exists, since it just kept showing as yellow in the VWR editor. You will need to change the XPath for the 'all_company' element to make sure that every company in the rows can be selected with 'all_company', so that the detection in the condition scripts works.

Replied: 3/2/2016 2:50:25 AM
We still need the data without 'company', as the website sometimes does not show the 'company' element for certain job postings.

Is there any option for us to identify that all the 'company' elements are unselected (turned yellow)?
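The distinction being asked for (keep rows where only some postings lack a company, but hold back the batch when every row lacks one) can be sketched in plain Python as follows. This is only an illustration of the logic, not the tool's condition-script syntax; the function name rows_to_export, the field name "company", and the sample rows are hypothetical.

```python
from typing import Any, Dict, List


def rows_to_export(rows: List[Dict[str, Any]], field: str = "company") -> List[Dict[str, Any]]:
    """Return all rows, unless every row is missing `field`, in which case return none."""

    def has_value(row: Dict[str, Any]) -> bool:
        value = row.get(field)
        return value is not None and str(value).strip() != ""

    # If no row at all has the field, assume the selector (XPath) is broken
    # and hold back the whole batch.
    if rows and not any(has_value(row) for row in rows):
        return []
    # Otherwise keep everything, including rows where the field is absent.
    return rows


# Hypothetical sample data: some postings legitimately lack a company.
partial = [{"title": "Engineer", "company": "Acme"}, {"title": "Analyst", "company": ""}]
# Hypothetical sample data: the company element was never captured.
broken = [{"title": "Engineer", "company": ""}, {"title": "Analyst"}]

kept_partial = rows_to_export(partial)
kept_broken = rows_to_export(broken)
print(len(kept_partial), len(kept_broken))
```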
Replied: 3/1/2016 2:13:08 AM
It's better to attach the specific project, then show me where the company name should appear and explain the detailed logic again, so I will be able to figure out exactly what you expect. Thanks!