
Probably yet another regular expressions question

Posted: Sun Mar 27, 2005 11:36 am
by Vodex
Hi. I'm knocking up a small app that will convert blog/forum posts to have links to Wikipedia. The idea is the user enters a string like

"A really good place to go is the London Eye - it's tall!"

And it picks out the capitalised phrase 'London Eye', turning it into a link to "http://en.wikipedia.org/wiki/London_Eye". But this means wrestling with regexps.

ereg with '[A-Z][a-z]+' finds a single capitalised word ('London'). How can I find a run of consecutive capitalised words?

I'm not a regexp expert, and all tutorials seem to stop at 'validate an email address' - & not all flavours of regexps seem to be the same...

Whilst I'm at it, how can I find a string containing underscores, so a user could go 'UK_general_election,_2001' for non-capitalised entries?

Thanks!

Posted: Sun Mar 27, 2005 12:04 pm
by feyd

Code: Select all

preg_match_all('#([A-Z][a-z]+(\s+[A-Z][a-z]+)*)#m', $text, $matches);

var_export($matches);
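
The pattern above handles runs of capitalised words, but not the underscore case from the original question. A rough sketch for that follows. It assumes an entry is made of letters and digits joined by underscores, with an optional comma before an underscore (as in 'UK_general_election,_2001'). The sample sentence and the exact character classes are assumptions, not something settled in this thread.

```php
<?php

// Hypothetical sample input containing one underscore-joined entry.
$text = 'See UK_general_election,_2001 for details.';

// One alphanumeric chunk, then one or more "_chunk" parts,
// each optionally preceded by a comma. (?: ) groups without capturing.
preg_match_all('#[A-Za-z0-9]+(?:,?_[A-Za-z0-9]+)+#', $text, $underscored);

var_export($underscored[0]);
```

Plain words such as 'See' are skipped because the pattern requires at least one underscore.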

Posted: Sun Apr 03, 2005 7:36 am
by Vodex
feyd wrote:

Code: Select all

preg_match_all('#([A-Z][a-z]+(\s+[A-Z][a-z]+)*)#m', $text, $matches);

var_export($matches);
That didn't seem to do it but have got it fixed now, thanks.

Posted: Sun Apr 03, 2005 2:19 pm
by feyd
odd.. it sure works for me.

Code: Select all

<?php

$text = 'A really good place to go is the London Eye - it\'s tall!';

preg_match_all('#([A-Z][a-z]+(\s+[A-Z][a-z]+)*)#m', $text, $matches);
 
var_export($matches);

?>

Code: Select all

array (
  0 =>
  array (
    0 => 'London Eye',
  ),
  1 =>
  array (
    0 => 'London Eye',
  ),
  2 =>
  array (
    0 => ' Eye',
  ),
)

Posted: Tue Apr 05, 2005 1:40 pm
by Vodex
feyd wrote:odd.. it sure works for me.
Why do I want the phrase 'London Eye' twice, and then the word 'Eye' afterwards? I just want 'London Eye'.

Posted: Tue Apr 05, 2005 2:00 pm
by feyd
That's because of how the pattern is written. Index 0 holds the entire match, and each set of parens gets its own array after that. That is the standard return from the regular expression matching functions.

You can simply do the following to get what you want.

Code: Select all

$matches = $matches[0];
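
As a possible next step, here is a minimal sketch of turning those full matches into the Wikipedia URLs described in the first post. The `$links` array and the space-to-underscore replacement are assumptions for illustration. Real article titles may need further encoding.

```php
<?php

$text = 'A really good place to go is the London Eye - it\'s tall!';

preg_match_all('#([A-Z][a-z]+(\s+[A-Z][a-z]+)*)#m', $text, $matches);

// Keep only the full matches; the per-group arrays are not needed here.
$matches = $matches[0];

$links = array();
foreach ($matches as $phrase) {
    // Wikipedia URLs use underscores where the title has spaces.
    $title = preg_replace('#\s+#', '_', $phrase);
    $links[$phrase] = 'http://en.wikipedia.org/wiki/' . $title;
}

var_export($links);
```

For the sample sentence this maps 'London Eye' to 'http://en.wikipedia.org/wiki/London_Eye'.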

Posted: Tue Apr 05, 2005 5:32 pm
by Chris Corbyn
feyd wrote:That's because of how the pattern is written. Index 0 holds the entire match, and each set of parens gets its own array after that. That is the standard return from the regular expression matching functions.

You can simply do the following to get what you want.

Code: Select all

$matches = $matches[0];
Just another quick side note: a group written as (?:pattern) is non-capturing, so it is only grouped, not stored in $matches.
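
To illustrate, here is a small sketch of the same pattern with the inner group made non-capturing, using the sample sentence from earlier in the thread. The stray ' Eye' array should disappear, leaving only the full match and the one outer group.

```php
<?php

$text = 'A really good place to go is the London Eye - it\'s tall!';

// (?: ... ) groups the repeated "whitespace + capitalised word" part
// without storing it, so only the outer parens produce an array.
preg_match_all('#([A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)#', $text, $matches);

var_export($matches);
```

Dropping the outer parens as well would leave just `$matches[0]`, which makes the `$matches = $matches[0];` step the only unpacking needed.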