Document converters convert documents into HTML so that they can be processed like any other HTML page. For example, if a website links to PDF or MS Word files, you may want to extract text or images from within these documents instead of just downloading them.
Important: Most file formats, including PDF and Word, are not designed to be converted into HTML, so the converted HTML pages will be much more difficult to work with than standard HTML pages. In most cases, you'll have to select the entire HTML page and use regular expressions to extract the target text.
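The regular expression approach can be illustrated outside Visual Web Ripper with a short Python sketch. It is only an illustration of the idea, not the way Visual Web Ripper applies regular expressions internally. The HTML fragment, the field labels and the helper names (html_to_text, extract) are all made up for this example; real converter output will look different and will usually be messier.

```python
import re
from html import unescape

# Hypothetical fragment resembling converter output: the text is split
# across many small elements, so element-based selection is unreliable.
converted_html = """
<div><p><span>Invoice&nbsp;Number:</span> <span>INV-</span><span>2041</span></p>
<p><span>Total due: </span><span>$1,250.00</span></p></div>
"""

def html_to_text(html):
    """Flatten the whole page into a single line of plain text."""
    text = re.sub(r"<[^>]+>", "", html)      # remove every tag
    text = unescape(text)                    # decode entities such as &nbsp;
    text = text.replace("\xa0", " ")         # non-breaking space -> space
    return re.sub(r"\s+", " ", text).strip() # collapse runs of whitespace

def extract(pattern, text):
    """Return the first capture group of pattern, or None if no match."""
    match = re.search(pattern, text)
    return match.group(1) if match else None

text = html_to_text(converted_html)
invoice_number = extract(r"Invoice Number:\s*(\S+)", text)
total_due = extract(r"Total due:\s*\$([\d,]+\.\d{2})", text)

print(invoice_number)
print(total_due)
```

Flattening the page first means the patterns only depend on the visible text, not on how the converter happened to split it into elements.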
Please read the Visual Web Ripper manual for more information about how to use a document converter with Visual Web Ripper.
Docx to HTML
You can use this document converter to convert Office 2007+ Word documents (.docx) into HTML.
Download the zip file below and extract its contents into the Converters folder in the default Visual Web Ripper document folder (My Documents\Visual Web Ripper\Converters).
The docxtohtml utility uses PowerTools for Open XML. See the PowerTools for Open XML project for more information about the library.