Confusing string

Any questions involving matching text strings to patterns - the pattern is called a "regular expression."

jnorton
Forum Newbie
Posts: 6
Joined: Tue Jun 19, 2007 4:24 am

Post by jnorton »

Hello,

I really need some help with the following string:

M1;, Bedfordshire, Flitwick, M1; 12, A5120;

What I ideally want is for the string to look like:

M1, Bedfordshire, Flitwick, M1 - Junction 12, A5120

So basically I need to match the string part: M1; 12 and turn it into M1 - Junction 12 and then match any remaining semi-colon and delete it.

I also need to convert the following string:

M1;, Hertfordshire, Hemel Hempstead, M1; 8, A414;, M1;, Bedfordshire, Luton Airport, M1; 10, M1;

Again I need to delete the semi colons and replace any string part that reads M[number]; followed by a space then [0-9] with: M[number] - junction [0-9]

Please help as I have spent a couple of days scratching my head on this one.

Thanks,

Justin.
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Post by Ollie Saunders »

Remove semi colons

Code:

$subject = str_replace(';', '', $subject);
Then you might want to split the string up by commas or spaces. Something like

Code:

$parts = array_map('trim', explode(',', $subject));
If the input string is

Code:

M1;, Bedfordshire, Flitwick, M1; 12, A5120;
then, provided you split before stripping the semicolons, the penultimate element of $parts will be "M1; 12", and str_replace('; ', ' - Junction ', $part) turns it into "M1 - Junction 12".
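
A minimal sketch of the split-and-rebuild approach described above, assuming the junction part is rewritten before the remaining semicolons are stripped. The variable names $index and $result are illustrative, and the element position is hard-coded for this one input, so this is not a general solution.

```php
<?php
$subject = 'M1;, Bedfordshire, Flitwick, M1; 12, A5120;';

// Split on commas and trim the whitespace around each part.
$parts = array_map('trim', explode(',', $subject));

// For this particular input the penultimate element is "M1; 12".
$index = count($parts) - 2;
$parts[$index] = str_replace('; ', ' - Junction ', $parts[$index]);

// Glue the parts back together, then drop the remaining semicolons.
$result = str_replace(';', '', implode(', ', $parts));

echo $result, "\n";
```

Because the position is fixed, this only works when the junction part happens to sit second from the end.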
jnorton
Forum Newbie
Posts: 6
Joined: Tue Jun 19, 2007 4:24 am

Post by jnorton »

Hello,

This is fine, but the root of the problem is that the string part M1; 12 may not always be the penultimate array element. In fact the M1; 12 part could appear anywhere within the string. So I ideally need to match the letter M followed by a number sequence followed by a semi-colon followed by a space followed by a number sequence. Confused?!

Thanks,

Justin.
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Re: Confusing string

Post by superdezign »

jnorton wrote:Again I need to delete the semi colons and replace any string part that reads M[number]; followed by a space then [0-9] with: M[number] - junction [0-9]
Then just regex that part. You've basically already written it.
jnorton
Forum Newbie
Posts: 6
Joined: Tue Jun 19, 2007 4:24 am

Post by jnorton »

I would but I am pretty rubbish at regular expressions - basically dumbfounded newbie.

So if you guys know the syntax then it would really help, otherwise I will be scratching my head again wondering where to put a dot, plus sign etc.

Thanks,

Justin.
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

Well, I purposely didn't tell you because I wanted you to figure it out on your own... We have an entire board on Regex here, you know.

However, I'll help you out this once. But you're going to need to know how to use regex on your own as you get more into web development.

Code:

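$str = 'M1;, Bedfordshire, Flitwick, M1; 12, A5120;';

// Rewrite "M<digits>; <digits>" as "M<digits> - Junction <digits>".
$str = preg_replace('#(M[0-9]+);\s+([0-9]+)#', '$1 - Junction $2', $str);
// Delete whatever semicolons remain.
$str = str_replace(';', '', $str);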
preg_replace does the regex replacement.
str_replace does the literal replacement.

If you have any questions on how the regex works, feel free to ask.
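
As a rough sketch, the two steps above could be wrapped in a function and run against both strings from the original post. The function name tidyRoadString is hypothetical, and the pattern here uses M[0-9]+ on the assumption that multi-digit road numbers (M25, say) should match as well.

```php
<?php
// Hypothetical helper name; it does not come from the thread.
function tidyRoadString(string $str): string
{
    // "M<digits>; <digits>" becomes "M<digits> - Junction <digits>".
    $str = preg_replace('#(M[0-9]+);\s+([0-9]+)#', '$1 - Junction $2', $str);

    // Whatever semicolons are left get deleted.
    return str_replace(';', '', $str);
}

$first  = 'M1;, Bedfordshire, Flitwick, M1; 12, A5120;';
$second = 'M1;, Hertfordshire, Hemel Hempstead, M1; 8, A414;, M1;, Bedfordshire, Luton Airport, M1; 10, M1;';

echo tidyRoadString($first), "\n";
echo tidyRoadString($second), "\n";
```

The order matters: if str_replace ran first, the semicolon the pattern relies on would already be gone.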
jnorton
Forum Newbie
Posts: 6
Joined: Tue Jun 19, 2007 4:24 am

Post by jnorton »

I almost got there with my own regex, which was: M[0-9];\s[0-9]

I guess I need to figure out how the +'s and #'s work with regard to repeating the logic over the entire string.

Thank you very much for your help on this one, as it really would have taken me days to figure out.

Cheers,

Justin.
Last edited by jnorton on Tue Jun 19, 2007 8:41 am, edited 1 time in total.
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

The '#' is just a delimiter. It could be replaced with a lot of different symbols such as '/', '|', or '!' as long as you use the same delimiter at the start and end of the pattern.

And the plus means one or more of the preceding character/character class/subpattern. I added a plus to the \s just in case you have more than one space, which I assume could happen easily. The plus after [0-9] lets the junction number have more than one digit.
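
To illustrate both points, here is a small sketch, assuming a test input with three spaces after the semicolon. The variable names are made up for the example.

```php
<?php
$input = 'M1;   12';  // three spaces between the semicolon and the number

// The same pattern written with three different delimiters.
$patterns = [
    '#(M[0-9]+);\s+([0-9]+)#',
    '/(M[0-9]+);\s+([0-9]+)/',
    '!(M[0-9]+);\s+([0-9]+)!',
];

$results = [];
foreach ($patterns as $pattern) {
    $results[] = preg_replace($pattern, '$1 - Junction $2', $input);
}
print_r($results);

// Without the plus, \s matches exactly one whitespace character,
// so this input does not match and comes back unchanged.
$withoutPlus = preg_replace('#(M[0-9]+);\s([0-9]+)#', '$1 - Junction $2', $input);
echo $withoutPlus, "\n";
```

All three delimited patterns behave identically; the delimiter only needs changing when the same character appears inside the pattern itself.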