PHP Developers Network

A community of PHP developers offering assistance, advice, discussion, and friendship.
 

All times are UTC - 5 hours




 Post subject: Address parsing
PostPosted: Sun Jan 22, 2012 5:53 pm 
Forum Newbie

Joined: Sun Jan 22, 2012 5:44 pm
Posts: 6
Input
LEVEL 1 1234 EXAMPLE SMELBOURNE VIC
LOT 1234 EXAMPLE ST PORT HEDLAND WA

Desired output
LEVEL 1 1234 EXAMPLE S:MELBOURNE:VIC
LOT 1234 EXAMPLE ST:PORT HEDLAND:WA

The subsections have been merged badly into a short text field, and I can't access the original data.
I'm new to regular expressions. Isolating the State field was easy, but I don't see a way to isolate the city.

Any ideas?


 Post subject: Re: Address parsing
PostPosted: Sun Jan 22, 2012 6:22 pm 
Forum Commoner

Joined: Thu Dec 15, 2011 2:40 pm
Posts: 85
Location: Nelson, NZ
Inserting the colon before the State is straightforward: something like
Search: (?m)[ ](\w+)$
Replace: :\1

In PHP:
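<?php
// The m modifier lets $ match at the end of every line, not just the last.
// The lookahead tolerates a trailing \r (Windows line endings) without
// consuming it, so the pattern does not depend on how the file was saved.
$regex = '/ (\w+)(?=\r?$)/m';
$string = 'LEVEL 1 1234 EXAMPLE SMELBOURNE VIC
LOT 1234 EXAMPLE ST PORT HEDLAND WA';
// Replace the space before the last word (the state) with a colon.
echo '<pre>' . preg_replace($regex, ':$1', $string) . '</pre>';
?>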
 


Output:
LEVEL 1 1234 EXAMPLE SMELBOURNE:VIC
LOT 1234 EXAMPLE ST PORT HEDLAND:WA

For the colon with the street, I don't have a good idea right now: what is the rule to let regex know that SMELBOURNE is not the town?
If you can give me a rule in plain English, I'm happy to have a look.

:)


Last edited by ragax on Sun Jan 22, 2012 6:34 pm, edited 1 time in total.

 Post subject: Re: Address parsing
PostPosted: Sun Jan 22, 2012 6:32 pm 
Forum Regular

Joined: Tue Sep 28, 2010 11:41 am
Posts: 984
Location: Columbus, Ohio
To help with this, I would need a lot more examples of the data to work out what separates the street address from the city.

What if you have LOT 1234 EXAMPLE ST ST MARY WA, assuming there is a city named St. Mary? Not trying to be a pain, but without a better sampling I can only suggest using preg_match() to pull out the pieces. (And look around on here, you will see I love figuring things like this out.)


 Post subject: Re: Address parsing
PostPosted: Sun Jan 22, 2012 6:49 pm 
Forum Newbie

Joined: Sun Jan 22, 2012 5:44 pm
Posts: 6
"can give me a rule in plain English"
Understood, thanks for the advice, and for the welcoming writing style.

With the state, it's either the last two or three characters, separated by a space. That part I've coded now.

With the city, I'm not sure I can. I know from experience that the word "MELBOURNE" is a town but that "SMELBOURNE" isn't, but from what I've read so far, that's not the way regex works.
I know that the first part is never longer than 22 characters, and the third part (state) is from the last space onwards, but I don't see how to isolate the town.
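For anyone following along, the "state is everything after the last space" rule can be sketched without regex at all. This is only a rough illustration, not the code the poster actually wrote; the function name splitState is made up for the example.

```php
<?php
// Hypothetical helper (not from the thread): splits a merged line at its
// last space, returning array(street-plus-town, state).
function splitState($line)
{
    $line = rtrim($line);              // drop trailing newline or spaces
    $pos = strrpos($line, ' ');        // position of the last space
    if ($pos === false) {
        return array($line, '');       // no space: nothing to split off
    }
    return array(substr($line, 0, $pos), substr($line, $pos + 1));
}

list($rest, $state) = splitState('LOT 1234 EXAMPLE ST PORT HEDLAND WA');
echo $rest . ':' . $state . "\n";
```

What is left in $rest is still street and town run together, which is the hard part discussed in the rest of the thread.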

I'll look into what distinguishes the town and just work on describing it in English.

Thanks for your time.


 Post subject: Re: Address parsing
PostPosted: Sun Jan 22, 2012 7:17 pm 
Forum Commoner

Joined: Thu Dec 15, 2011 2:40 pm
Posts: 85
Location: Nelson, NZ
No worries!

If you can get it in plain English, regex probably has the grammar to make it work for you.
But the plain English rule looks hairy to me. :)


 Post subject: Re: Address parsing
PostPosted: Sun Jan 22, 2012 8:05 pm 
Forum Newbie

Joined: Sun Jan 22, 2012 5:44 pm
Posts: 6


 Post subject: Re: Address parsing
PostPosted: Sun Jan 22, 2012 8:42 pm 
Forum Newbie

Joined: Fri Jan 06, 2012 2:43 am
Posts: 9


 Post subject: Re: Address parsing
PostPosted: Sun Jan 22, 2012 9:42 pm 
Forum Newbie

Joined: Sun Jan 22, 2012 5:44 pm
Posts: 6
@abareplace

A fine idea, and nice bit of googling too. Will try it out.

------

As a newbie to this forum and PHP, I'm pleasantly stunned by the speed, quality and diversity of answers. Cheers.


 Post subject: Re: Address parsing
PostPosted: Sun Jan 22, 2012 10:00 pm 
Forum Commoner

Joined: Thu Dec 15, 2011 2:40 pm
Posts: 85
Location: Nelson, NZ
Very cool idea, ABA!


 Post subject: Re: Address parsing
PostPosted: Mon Jan 23, 2012 12:15 am 
Forum Newbie

Joined: Sun Jan 22, 2012 5:44 pm
Posts: 6
95% there. Hurrah - I can figure out the rest.

An epilogue if you are interested...

I've done as you suggested and am running it against a list of town names, and it's working.

One fell through the net though.
"EXAMPLE ST BELL BAY TAS"
There is a town called BELL, and also a town called BELL BAY, and it picked up the first one.

I'm working on trimming off the 3-letter state code, then finding the match position and using the rest of the string as the town.
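A rough sketch of that idea, for the record. It assumes the state code has already been removed and that a town name (here BELL) has already been found somewhere in what is left; the function name extendTownToEnd is made up for the illustration and the real code may look quite different.

```php
<?php
// Hypothetical helper: given the line without its state code and a town
// name that matched inside it, treat everything from the match position
// to the end of the string as the town.
function extendTownToEnd($rest, $matchedTown)
{
    $pos = strrpos($rest, $matchedTown);   // last occurrence of the match
    if ($pos === false) {
        return null;                       // the name is not in the string
    }
    $street = rtrim(substr($rest, 0, $pos));
    $town = substr($rest, $pos);           // match position to end of string
    return array($street, $town);
}

list($street, $town) = extendTownToEnd('EXAMPLE ST BELL BAY', 'BELL');
echo $street . ':' . $town . "\n";
```

The weak spot is that anything after the match is assumed to belong to the town, so a town name that also appears in a street name could still give a wrong split.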

Thanks again.


 Post subject: Re: Address parsing
PostPosted: Mon Jan 23, 2012 12:50 am 
Forum Regular

Joined: Tue Sep 28, 2010 11:41 am
Posts: 984
Location: Columbus, Ohio
To keep from matching against the wrong value, put the list in order from longest to shortest. Also, when you are looping through the list of cities, do a break; as soon as you find a match, so the loop can't go on to match a shorter one.

To take your raw list (cities.txt) and put it in order, use:
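// Read one city per line, dropping line endings and blank lines so the
// newline is not counted in the length.
$cities = file('cities.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$cities = array_unique(array_map('trim', $cities));

// Sort from longest to shortest so "BELL BAY" is tested before "BELL".
usort($cities, function ($a, $b) {
        return strlen($b) - strlen($a);
});

// Write the sorted list back out, one city per line.
file_put_contents('citiesBySize.txt', implode("\n", $cities) . "\n");

echo "Done!";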
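With the list sorted, the matching loop itself might look something like the sketch below. It is only an illustration under a few assumptions: the function name formatAddress and the four-town sample list are made up, the state is taken to be everything after the last space, and the town is assumed to sit at the very end of what remains (so only the tail is compared).

```php
<?php
// Hypothetical helper: formats one merged line as STREET:TOWN:STATE.
// $towns is assumed to be sorted longest-first already.
function formatAddress($line, array $towns)
{
    $line = rtrim($line);
    $pos = strrpos($line, ' ');
    if ($pos === false) {
        return $line;                      // nothing to split
    }
    $state = substr($line, $pos + 1);      // text after the last space
    $rest = substr($line, 0, $pos);        // street and town, still merged

    foreach ($towns as $town) {
        // Compare only the tail of the string against the town name.
        if ($town !== '' && substr($rest, -strlen($town)) === $town) {
            $street = rtrim(substr($rest, 0, -strlen($town)));
            // Returning here plays the role of the break: no shorter
            // town gets a chance to match.
            return $street . ':' . $town . ':' . $state;
        }
    }
    return $rest . ':' . $state;           // no known town: split state only
}

// Assumed sample list, already in longest-first order.
$towns = array('PORT HEDLAND', 'MELBOURNE', 'BELL BAY', 'BELL');
echo formatAddress('LEVEL 1 1234 EXAMPLE SMELBOURNE VIC', $towns), "\n";
echo formatAddress('EXAMPLE ST BELL BAY TAS', $towns), "\n";
```

Because only the end of the string is compared, the longest-first order matters mainly when one town name is the ending of another. The ST MARY style of ambiguity raised earlier in the thread is not solved by this; a line whose street type and town both fit would need more rules.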


 Post subject: Re: Address parsing
PostPosted: Mon Jan 23, 2012 5:08 pm 
Forum Newbie

Joined: Sun Jan 22, 2012 5:44 pm
Posts: 6
Of course!

That's more elegant than my solution.

