preg_split problems

Questions about matching text strings against patterns. Such a pattern is called a "regular expression."

Moderator: General Moderators

davidjwest
Forum Commoner
Posts: 67
Joined: Sat Nov 06, 2004 5:26 am
Location: Leeds, Yorkshire, England

Post by davidjwest »

Code:

$line = preg_split("\s[0-9]+\s03\sDavid West\s",$description);
I get this error:
Warning: preg_split(): Delimiter must not be alphanumeric or backslash in /home/qhwslos/public_html/results_entry.php on line 59
I've never used this function before and I'm new to regexes too. What did I do wrong?
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

Try this:

Code:

$line = preg_split('|\s[0-9]+\s03\sDavid West\s|', $description);
The delimiters that wrap your regex can't be a backslash or an alphanumeric character (a-z, A-Z, 0-9).
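
To make the delimiter rule concrete, here is a minimal sketch. The sample $description string is made up for illustration and is not taken from the results page; only preg_split() itself comes from PHP.

```php
<?php
// Hypothetical sample input, invented for this example.
$description = "x 12 03 David West y";

// Any non-alphanumeric, non-backslash, non-whitespace character can act
// as the delimiter, as long as the same character closes the pattern.
$withPipes   = preg_split('|\s[0-9]+\s03\sDavid West\s|', $description);
$withSlashes = preg_split('/\s[0-9]+\s03\sDavid West\s/', $description);
$withHashes  = preg_split('#\s[0-9]+\s03\sDavid West\s#', $description);

// All three calls split the string at the same place.
var_dump($withPipes === $withSlashes && $withSlashes === $withHashes); // bool(true)
print_r($withPipes); // the pieces "x" and "y"
```

Single-quoted strings leave sequences like \s untouched, which is why the patterns are written that way here.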
Burrito
Spockulator
Posts: 4715
Joined: Wed Feb 04, 2004 8:15 pm
Location: Eden, Utah

Post by Burrito »

Moved to regex
davidjwest
Forum Commoner
Posts: 67
Joined: Sat Nov 06, 2004 5:26 am
Location: Leeds, Yorkshire, England

Post by davidjwest »

Thanks, that works, or at least it doesn't give any errors!

However, my lack of knowledge means it's not doing what I expected. Here's more detail; if anyone can spare the time to help, I would be grateful!

I want to extract data from this page:

http://v8lites.teamecosse.org.uk/Autumn ... /Race.html

I was hoping my code would return just the information covered by my regex ("Pos" is what I'm after), but it brings back the whole load of data. Am I using the wrong function, or is it my regex?
sweatje
Forum Contributor
Posts: 277
Joined: Wed Jun 29, 2005 10:04 pm
Location: Iowa, USA

Post by sweatje »

This should give you some ideas. Expressed as a SimpleTest test case:

Code:

function testParseRaceResults() {

$page = "RACE RESULTS (After 60 laps)

Pos No   Driver                           Laps   Race Time       Diff    Speed Points
 1    29 Sarah O'Hallaran                   60  50m44.086s             141.915    180
 2    31 Marc van Brakel                    60  50m44.191s    00.105s  141.910    180
 3    77 Matt Screaton                      60  50m44.251s    00.165s  141.907    165
 4    03 David West                         60  50m44.363s    00.277s  141.902    165
 5    90 Jacques Richard                    60  50m47.406s    03.320s  141.760    155
 6    08 Peter Andersson                    60  50m52.704s    08.618s  141.514    150
 7    04 Scott Dryden                       60  50m56.407s    12.321s  141.343    146
 8    40 Bruce Duncan                       60  51m14.388s    30.302s  140.516    142
 9    05 Neil Marlow                        59  51m11.893s   1 lap(s)  138.286    143
10     5 John Power                         57  50m44.535s   3 lap(s)  134.799    134
11    87 Sharon Owen                        57  51m18.429s   3 lap(s)  133.315    130
12   111 brian freeman SPX                  41  39m28.969s  18 lap(s)  125.647    127
13   177 Ian Newman                         34  30m17.997s  25 lap(s)  132.738    124
14    30 Craig Teasdale                     31  31m40.318s  29 lap(s)  117.454    121
15    82 marcus litwinow                    17  15m02.164s  43 lap(s)  135.674    118
16    07 Alan Strang                         0 DidNotStart  60 lap(s)         
17    -1 Pace Car                            0 DidNotStart  60 lap(s)         
18    86 Simon m Savage                      0 DidNotStart  60 lap(s)         
19   000 The Player                          0 DidNotStart  60 lap(s)    
";

$regex = "~	# tilde-delimited regex
^	# anchor at the start of each line (relies on the m flag)
\s*	# and then possibly some whitespace
(?P<Pos> \b\d+\b)	# a capture named 'Pos' (PHP 4.3.3+ only)
\s+
(?P<No> -?\d+\b)	# car number; the pace car is -1
\s+
(?P<Driver> \b[a-z' ]+?)	# lazy, so it stops at the gap before Laps
\s+
(?P<Laps> \b\d+\b)
\s+
(?P<RaceTime> \b(?:\d+m\d+\.\d+s|DidNotStart)\b)
\s+
(?P<Diff> \b(?:\d+\s+[laps()]+|\d+\.\d+s))?	# absent for the winner
(\s+
(?P<Speed> \b\d+\.\d+)
\s+
(?P<Points> \b\d+))?	# absent for cars that did not start
	# i = case insensitive
	# x = extended syntax (whitespace ignored, these comments allowed)
	# m = multi-line (^ matches at the start of every line)
	# s = dot also matches newline
~ixms";

preg_match_all($regex, $page, $matches);

$this->assertEqual(19, count($matches['Driver']));
$i = array_search('David West', $matches['Driver']);
$this->assertEqual(3, $i);
$this->assertEqual(4, $matches['Pos'][$i]);
$this->assertEqual('03', $matches['No'][$i]);
$this->assertEqual(60, $matches['Laps'][$i]);
$this->assertEqual('50m44.363s', $matches['RaceTime'][$i]);
$this->assertEqual('00.277s', $matches['Diff'][$i]);
$this->assertEqual(141.902, $matches['Speed'][$i]);
$this->assertEqual(165, $matches['Points'][$i]);

}
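
Outside a test harness, the same idea can answer the original question directly. The sketch below is only an illustration under stated assumptions: the two-row $page excerpt and the simplified patterns are invented for this example and are not the full regex from the test case. It shows why preg_split() hands back "the whole load of data" (it returns the text around each match) and how preg_match() with a named group returns just the position.

```php
<?php
// Illustrative two-row excerpt of the results table (columns abbreviated).
$page = " 3    77 Matt Screaton                      60  50m44.251s\n"
      . " 4    03 David West                         60  50m44.363s\n";

// preg_split() cuts the string at each match and returns the pieces
// around it, so the matched text itself is discarded.
$pieces = preg_split('/\s[0-9]+\s+03\s+David West\s/', $page);
var_dump(count($pieces)); // int(2): the text before and after the match

// preg_match() fills $m with what the pattern captured instead.
if (preg_match('/^\s*(?P<Pos>\d+)\s+03\s+David West\b/m', $page, $m)) {
    echo $m['Pos'], "\n"; // 4
}
```

For every row at once, preg_match_all() with the full pattern from the test case is probably the better fit; preg_match() is enough when only one driver matters.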