
Delimiter woes

Posted: Fri Mar 30, 2007 5:57 am
by Maugrim_The_Reaper
I'm currently working on a parser for the YAML format. Since the lexer needs to scan for tokens, it uses quite a bit of regex and this one has me stumped. Maybe after another few cups of coffee I'll break it ;).

The regex itself looks fine: it checks that there are no "special" characters at the start of a new line and, for the subset of those that are allowed, that they are not followed by tabs, spaces, null bytes or other wonky things. So far so good. But when I use it, the lexer coughs up an Exception and PHP sends out a Warning:
Warning: preg_match() [function.preg-match]: No ending delimiter '£' found in Yaml\Lexer.php on line 222
I used a unique delimiter in case the cause was a loose unescaped forward slash or something else obvious that I missed. I've left the double quotes in place around the regex string:

Code:

"£^([^\0 \t\r\n\x85\-?:,\[\]{}#&*!|>'\"%@]|([\-?:][^\0 \t\r\n\x85]))£"
Anyone have a clue where it's gone wrong?

Posted: Fri Mar 30, 2007 6:22 am
by Maugrim_The_Reaper
I figured out it's the null byte "\0" truncating the string - need to find PHP's alternative...;).

Edit: \x00 not working either...hmm

Posted: Fri Mar 30, 2007 6:38 am
by Maugrim_The_Reaper
It's likely a stupid question, but what is the regex for an empty string? Sure, I can fire off a [$str == ''] comparison, but I'm wondering whether PHP's regex has something for this (presumably it does).
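
For reference, one way to express "empty string" in PCRE is a pattern made of nothing but a start anchor and an end anchor. The sketch below is only an illustration; the helper name is_empty_string() is made up here and is not part of the lexer discussed in this thread.

```php
<?php
// Hypothetical helper (not from the thread): true only for the empty string.
function is_empty_string($str)
{
    // \A anchors at the very start of the subject, \z at the very end.
    // A plain /^$/ also matches "\n", because $ may match just before
    // a trailing newline unless the D modifier is used.
    return preg_match('/\A\z/', $str) === 1;
}

var_dump(is_empty_string(''));       // bool(true)
var_dump(is_empty_string("\n"));     // bool(false)
var_dump(preg_match('/^$/', "\n"));  // int(1)
```

A direct comparison such as $str === '' does the same job without the regex engine, so the pattern is mainly useful when the check has to live inside a larger expression.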

Posted: Fri Mar 30, 2007 7:48 am
by feyd
Because the pattern is in a double-quoted string, PHP processes backslash escapes such as \0 itself before the regex engine ever sees them, so several of the backslashes need to be doubled.

The following compiled fine for me.

Code:

<?php

$p = "£^([^\\0 \\t\\r\\n\x85\\-?:,\\[\\]{}#&*!|>'\"%@]|([-?:][^\\0 \\t\\r\\n\x85]))£";

preg_match($p, '');

?>
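
A small sketch of what the doubled backslashes change, assuming nothing beyond standard PHP string handling and PCRE; the variable names are made up for illustration. In double quotes PHP turns \0 into a real NUL byte, which, as reported earlier in the thread, cut the pattern short. With the backslash doubled (or in single quotes), the two characters \ and 0 reach PCRE, which reads them as an escape for NUL.

```php
<?php
// Double quotes: PHP converts the octal escape \0 into one NUL byte.
$raw = "\0";
// Doubled backslash: the string holds a backslash followed by "0".
$escaped = "\\0";
// Single quotes: the backslash sequence is left untouched.
$single = '\0';

var_dump(strlen($raw));          // int(1)
var_dump(strlen($escaped));      // int(2)
var_dump($escaped === $single);  // bool(true)

// PCRE interprets \0 itself, so this class matches a NUL byte in the subject.
var_dump(preg_match('/[\0 \t]/', "a\0b"));  // int(1)
```

Writing the pattern in single quotes is an alternative to doubling, although the literal ' inside the character class would then need a backslash of its own.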