Latest Inquiries - Data Extraction Software

Suggestion: Buffer data output

Submitted: 5/23/2017
Hello,

we have very large scrapers (~2,000,000 rows) and are facing some performance issues. I've read the performance optimization chapter in the manual and adjusted my projects accordingly. However, I still see very heavy disk usage (tested with both MySQL and SQLite) while my project is running.

The problem seems to be the many small chunks that get written to disk while the project runs. Would it be possible to cache e.g. ~10-50 MB in RAM before persisting it to disk/database (or better: make the cache size configurable)? This would reduce the disk load considerably, in my opinion.
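To sketch what I mean (Python with the built-in sqlite3 module; the table name, schema, and 10 MB threshold are just placeholders for illustration, not your actual internals):

    import sqlite3

    FLUSH_BYTES = 10 * 1024 * 1024  # flush roughly every 10 MB of buffered rows

    conn = sqlite3.connect("scrape.db")
    conn.execute("CREATE TABLE IF NOT EXISTS rows (url TEXT, value TEXT)")

    buffer, buffered_bytes = [], 0

    def add_row(url, value):
        """Buffer a scraped row in RAM; flush once enough has accumulated."""
        global buffered_bytes
        buffer.append((url, value))
        buffered_bytes += len(url) + len(value)
        if buffered_bytes >= FLUSH_BYTES:
            flush()

    def flush():
        """Write the whole buffer in one transaction instead of many small writes."""
        global buffered_bytes
        conn.executemany("INSERT INTO rows VALUES (?, ?)", buffer)
        conn.commit()
        buffer.clear()
        buffered_bytes = 0

One big batched transaction every 10 MB should hit the disk far less often than one commit per row.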

On the SQL side we have "LOAD DATA INFILE" to insert data efficiently - maybe there is a way to load the data into the database that way. Another possibility would be a binary stream of data written to a proprietary file format.
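For example, something along these lines (Python with the mysql-connector-python package; the connection details, table, and file name are made up, and local_infile has to be enabled on the server as well):

    import mysql.connector  # pip install mysql-connector-python

    # allow_local_infile must be enabled on the client; the server needs
    # local_infile=ON too. All connection details here are placeholders.
    conn = mysql.connector.connect(
        host="localhost", user="scraper", password="secret",
        database="scrape", allow_local_infile=True,
    )
    cur = conn.cursor()
    # Bulk-load a CSV the scraper wrote out beforehand - one sequential
    # write plus one bulk insert instead of millions of small ones.
    cur.execute(
        "LOAD DATA LOCAL INFILE 'rows.csv' "
        "INTO TABLE rows "
        "FIELDS TERMINATED BY ',' ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n'"
    )
    conn.commit()
    conn.close()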

It would be great if you had a solution for better performance! If such an option already exists, please point me in the right direction.

Sincerely,
Georg Füsslin
Replied: 5/25/2017 4:36:12 AM

Hi,

Our newer software, Content Grabber (ContentGrabber.com), is a lot faster than Visual Web Ripper. Visual Web Ripper is not really geared for very large data sets, and 2 million rows is a lot of data.

We could offer you a discount since you are a Visual Web Ripper customer. You can try downloading the trial version here: https://contentgrabber.com/download

Best wishes,

Kyle Correia