You can configure a data extraction project to extract content from a webpage by adding content elements to a template. You can add content elements to a template by using one of these methods:
After you have added a content element of type Element, you can choose Capture Type from the options window. Visual Web Ripper can extract any property from the selected HTML element, such as text, HTML or an element attribute.
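The difference between capturing text, HTML and an attribute can be illustrated outside the product. The following is a minimal sketch in Python using only the standard library; it is not Visual Web Ripper's own API, and the snippet, the `capture` function and the capture type names are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet standing in for a selected HTML element.
SNIPPET = '<a href="/products/42" title="Blue widget">Blue <b>widget</b></a>'


def capture(element, capture_type, attribute=None):
    """Return one property of an element; the capture type names are illustrative."""
    if capture_type == "text":
        # All text inside the element, with nested tags stripped.
        return "".join(element.itertext())
    if capture_type == "html":
        # The element's own markup, including nested tags.
        return ET.tostring(element, encoding="unicode", method="html")
    if capture_type == "attribute":
        # The value of a named attribute, or None if it is missing.
        return element.get(attribute)
    raise ValueError(f"unknown capture type: {capture_type}")


link = ET.fromstring(SNIPPET)
text_value = capture(link, "text")
html_value = capture(link, "html")
href_value = capture(link, "attribute", "href")
```

The same element yields three different values depending on which property is captured, which is why the capture type matters when the target data lives in an attribute such as a link URL rather than in the visible text.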
The same content may appear in different locations on webpages of the same type. For example, search results may span several pages, and a content selection that works well on the first page may fail on the following pages. In such a scenario, you need to fine-tune the content selection manually so that it works for all pages in the search results.
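To see why a selection can work on the first page and fail on later ones, consider the sketch below. It uses Python's standard library rather than Visual Web Ripper's selection tools, and the two pages, the path strings and the `extract` function are all hypothetical. It assumes the later pages contain an extra banner that shifts the position of the results.

```python
import xml.etree.ElementTree as ET

# Hypothetical result pages: page 2 has an extra banner before the results.
PAGE_1 = """<body>
  <div class="header">Search</div>
  <div class="results"><p>Item A</p><p>Item B</p></div>
</body>"""

PAGE_2 = """<body>
  <div class="header">Search</div>
  <div class="banner">Page 2 of 5</div>
  <div class="results"><p>Item C</p><p>Item D</p></div>
</body>"""

POSITIONAL = "./div[2]/p"               # tuned on page 1 only: "second div"
BY_CLASS = "./div[@class='results']/p"  # anchored on a stable attribute


def extract(page, path):
    """Return the text of every element matched by path on the given page."""
    root = ET.fromstring(page)
    return [p.text for p in root.findall(path)]


page_1_positional = extract(PAGE_1, POSITIONAL)
page_2_positional = extract(PAGE_2, POSITIONAL)
page_2_by_class = extract(PAGE_2, BY_CLASS)
```

The positional path matches the results on the first page but lands on the banner on the second page and returns nothing, while the attribute-based path matches on both. Fine-tuning a selection usually means moving from the first kind of path toward the second.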
A content transformation script is often used in conjunction with content elements to modify the extracted data. It can also extract smaller pieces from the extracted data. For example, a single HTML element may contain a full address. The content element extracts the full address into one data field, but you can use content transformation to extract the state or zip code from the full address.
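The address example can be sketched with a regular expression. The code below is plain Python, not a Visual Web Ripper transformation script, and it assumes US-style addresses that end in a two-letter state code followed by a five-digit or ZIP+4 code. The `split_state_zip` function and the sample address are hypothetical.

```python
import re

# Assumes addresses ending in "<STATE> <ZIP>"; other formats need another pattern.
STATE_ZIP = re.compile(r"\b([A-Z]{2})\s+(\d{5}(?:-\d{4})?)\s*$")


def split_state_zip(full_address):
    """Return (state, zip_code) from a full address, or (None, None) if absent."""
    match = STATE_ZIP.search(full_address.strip())
    if match is None:
        return None, None
    return match.group(1), match.group(2)


state, zip_code = split_state_zip("123 Main Street, Springfield, IL 62704")
```

Returning `(None, None)` when the pattern does not match keeps a malformed address from producing a misleading partial value in the output fields.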