existing php libraries for parsing street addresses?
Posted: Thu Sep 04, 2008 3:32 pm
by yekibud
I've got descriptive address strings like this:
Code:
104 N Main Street, 3rd floor Masonic Bldg
West Water Street next to Bobbys Auto Sales
436 Industry Rd 1/2 M from US 27 South of
111 Bridge St off US 31W @ N end of
Main Street over Blue Daisey Flower Shop
112 S. Main Street WILLIAMSTOWN
Hwy 55, Springfield Rd & Corporate Drive
US Highway 68 West, behing Subway CADIZ
etc., that I have to parse the street address from. I was thinking of coming up with my own regex, but then I got tired of writing all the possibilities for Rd, Rd., Road, St, St., Street, and so on. Then I thought there must be some generic libraries to help me out here.
I found Geo::StreetAddress::US in CPAN, which I guess I could get to work - but I wanted to check if there was anything natively in PHP to help me out.
Thanks for the tips.
Re: existing php libraries for parsing street addresses?
Posted: Sat Sep 06, 2008 7:37 am
by GeertDD
PCRE is included in PHP by default. I think it should be able to do the job.
The first step in solving this problem is to determine where and what street names are on each line. Then build a regex for it.
Some statements that should help us get started (correct them if needed):
- Each line contains one street name;
- The street name can optionally be preceded by a number;
- Every word of a street name begins with a capital.
I'm not familiar with English addresses and so I'm not sure what to do about the "N" in the first line of your examples. Also "Hwy 55, Springfield Rd & Corporate Drive" could be tricky if "Hwy" is not the street name you want to extract.
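A minimal PCRE sketch of the three statements above, assuming PHP 7.1 or later. The function name extractStreet and the short suffix list are illustrative assumptions, not part of any existing library. It only accepts an optional leading number followed by capitalized words that end in a known suffix, so lines that start with a highway designation (such as "Hwy 55, ...") are deliberately left unmatched.

```php
<?php
// Hypothetical helper: returns the leading "number + capitalized words" run
// that ends in a known street suffix, or null when the line does not fit.
function extractStreet(string $line): ?string
{
    // Illustrative suffix list only; real data needs many more variants.
    $suffix = '(?:Street|St|Road|Rd|Drive|Dr)\.?';

    // Optional house number, then one or more capitalized words
    // (an optional trailing dot covers "S." or "N."), then the suffix.
    // The lookahead stops "St" from matching inside a longer word.
    $pattern = '/^(?:\d+\s+)?(?:[A-Z][A-Za-z]*\.?\s+)+?' . $suffix . '(?![A-Za-z])/';

    return preg_match($pattern, $line, $m) ? $m[0] : null;
}

$samples = [
    '104 N Main Street, 3rd floor Masonic Bldg',
    '112 S. Main Street WILLIAMSTOWN',
    'Hwy 55, Springfield Rd & Corporate Drive',
];

foreach ($samples as $sample) {
    // Prints the extracted street, or a marker when the pattern does not apply.
    echo $sample, ' => ', extractStreet($sample) ?? '(no match)', "\n";
}
```

With these samples the first two lines give "104 N Main Street" and "112 S. Main Street", and the "Hwy 55" line gives no match, which is the tricky case mentioned above.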
Re: existing php libraries for parsing street addresses?
Posted: Mon Sep 08, 2008 11:32 am
by yekibud
Thanks for your reply, GeertDD.
What I'm trying to do is avoid writing my own regex. I started to do so, but it felt like reinventing the wheel.
It seems like there should be pre-cooked regex libraries for tasks like this - I'm sure I'm not the only one who has had to pick postal addresses out of strings of text.
Re: existing php libraries for parsing street addresses?
Posted: Mon Sep 08, 2008 2:51 pm
by GeertDD
Re: existing php libraries for parsing street addresses?
Posted: Mon Sep 08, 2008 3:06 pm
by yekibud
That's a great link! I'll see if I can find anything there.
Thanks.
Re: existing php libraries for parsing street addresses?
Posted: Mon Sep 08, 2008 3:13 pm
by marcth
I'm not sure what country you live in, but it may be worthwhile visiting your federal postal service website to see what their addressing standards are. Seems to me that if you're going to clean up those addresses, you may as well follow your country's standards.
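As a rough illustration of following a postal standard, the sketch below rewrites the last word of an already extracted street name to a standard abbreviation and upper-cases the result. The map is a small hand-picked subset of the suffix abbreviations in USPS Publication 28, and the names SUFFIX_MAP and normalizeSuffix are hypothetical. It assumes PHP 7.1 or later and that the suffix is the final word of the string.

```php
<?php
// Hand-picked subset of USPS Publication 28 suffix abbreviations.
// Abbreviated forms map to themselves so "St." and "Street" agree.
const SUFFIX_MAP = [
    'STREET'  => 'ST',  'ST'  => 'ST',
    'ROAD'    => 'RD',  'RD'  => 'RD',
    'DRIVE'   => 'DR',  'DR'  => 'DR',
    'HIGHWAY' => 'HWY', 'HWY' => 'HWY',
];

// Hypothetical helper: normalizes only the final word of the street name.
function normalizeSuffix(string $street): string
{
    $words = preg_split('/\s+/', trim($street));

    // Drop a trailing dot ("Rd." -> "RD") before looking the word up.
    $last = strtoupper(rtrim(array_pop($words), '.'));

    // Unknown suffixes are kept, just upper-cased.
    $words[] = SUFFIX_MAP[$last] ?? $last;

    return strtoupper(implode(' ', $words));
}

echo normalizeSuffix('104 N Main Street'), "\n";
echo normalizeSuffix('436 Industry Rd.'), "\n";
```

Collapsing the variants to one form after extraction means the matching regex only has to recognize the suffixes, not decide how to spell them.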
Re: existing php libraries for parsing street addresses?
Posted: Mon Sep 08, 2008 3:26 pm
by yekibud
Thanks for your reply, marcth. I'm in the US.
marcth wrote: "it may be worthwhile visiting your federal postal service website to see what their addressing standards are."
Right - that's what I would do if I wanted to write the regex myself. I'm hoping that somebody has already gone through that trouble and I can just implement the solution - like from the Perl package I mentioned previously, or maybe just grabbing a snippet from regexlib.com, as GeertDD suggested.