Latest Inquiries - Data Extraction Software

Need help with regex transformation

Submitted: 2/29/2016


Please find attached my latest .rip project, which grabs business addresses from a chamber of commerce website.

Using the regex examples you provided in my previous support requests, I'm having difficulty building a regular expression to extract the fax number and phone number. I think I am getting the hang of regex, as you can see from the other transformations I managed to pull off.

Thanks for providing the regex code to get the phone numbers and for returning the project.


Replied: 3/1/2016 2:34:11 AM
When I open the 'liver list' link template, the 'directorypage' page area template is highlighted yellow as not found. Can you please verify this first? Or could you give me an example URL containing a fax or phone number that you couldn't extract using the content transformation regex scripts?
Replied: 3/2/2016 1:34:41 AM

Please try the following regex script to extract the general phone number:


Please try the following regex script to extract the fax number:

Fax: (\(\d+\)\d+\-\d+)

Note: the input to the content transformation regex scripts should be the full HTML block, like the example you showed me.
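
For anyone who wants to check the fax pattern outside the tool, here is a minimal sketch in Python. It assumes the tool's regex engine treats this pattern the same way Python's `re` module does, which has not been confirmed in this thread. The `html` string is only the fragment of the quoted HTML block that holds the numbers, and the variable names are made up for illustration.

```python
import re

# Fragment of the HTML block quoted in this thread (only the part with the numbers).
html = '<BR>Livermore CA 94550-7006<BR>(925)449-9114 &nbsp; Fax: (925)449-3915<BR>'

# Fax pattern from the reply above; group 1 captures the number without the "Fax: " label.
fax_pattern = re.compile(r'Fax: (\(\d+\)\d+\-\d+)')

match = fax_pattern.search(html)
fax = match.group(1) if match else None
print(fax)  # (925)449-3915
```

The literal `Fax: ` prefix is what keeps the pattern from picking up the phone number earlier on the same line; only the parenthesized part is returned.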

Replied: 3/1/2016 5:08:57 AM

Yes. The page area under the directorypage template selects just the address blocks, which took a little trial and error to get right because the page area was also selecting the upper search form and extra titles along with the address areas. Anyhow, I need to know the regex transformation that extracts just the fax and phone numbers from the HTML block below:

----------------------- html block paste ------------------------------/
<P><B><A href="" target=_blank>The Arbors Apartments</A></B><BR>3550 Pacific Avenue<BR>Livermore CA 94550-7006<BR>(925)449-9114 &nbsp; Fax: (925)449-3915<BR><I>Desiree Delfoe, General Manager</I><BR><BR><A href="">Request Info</A><BR><A href=";addr=3550+pacific+avenue&amp;csz=livermore,+ca+94550" target=_blank><IMG border=0 alt=Map src="web_map.gif"></A> &nbsp;Member Since 1986&nbsp;<BR><BR><STRONG>About Us...</STRONG><BR>Centrally located yet comfortably secluded. <BR><BR></P>

----------------- end of paste ------------.
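
One possible way to pull the phone number out of a block like this, without also matching the fax number, is to require that the number is not directly preceded by `Fax: `. The sketch below is in Python, and the pattern is a hypothetical one, not the script supplied by support. It assumes numbers are always formatted as `(ddd)ddd-dddd` and that the tool's regex engine supports negative lookbehind, which would need to be checked.

```python
import re

# First part of the HTML block pasted above (shortened after the contact line).
html_block = (
    '<P><B><A href="" target=_blank>The Arbors Apartments</A></B><BR>'
    '3550 Pacific Avenue<BR>Livermore CA 94550-7006<BR>'
    '(925)449-9114 &nbsp; Fax: (925)449-3915<BR>'
    '<I>Desiree Delfoe, General Manager</I><BR><BR>'
)

# Hypothetical phone pattern (an assumption, not from this thread):
# a "(ddd)ddd-dddd" number that is NOT directly preceded by "Fax: ".
phone_pattern = re.compile(r'(?<!Fax: )(\(\d{3}\)\d{3}-\d{4})')

phones = phone_pattern.findall(html_block)
print(phones)  # ['(925)449-9114']
```

The ZIP code `94550-7006` is skipped because the pattern requires the area code in parentheses. Listings that write numbers in another format, such as `925-449-9114`, would need a looser pattern.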