regex problem
Posted: Mon Dec 20, 2004 11:33 am
by sebnewyork
Hi all
I'm trying to use a regular expression, and PHP always returns:
Warning: No ending delimiter '^' found in...
(that's if I use the caret at the beginning of my regex)
or:
Warning: Delimiter must not be alphanumeric or backslash in...
(that's if I use a backslash to escape characters in my regex)
what's going on?
here's my code:
Code:
<?php
if (isset ($_POST['Submit'])) {
$subject = file_get_contents ($_POST['edit_choice']);
$pattern ='include\?';
preg_match_all ($pattern, $subject, $matches, PREG_SET_ORDER );
foreach ($matches as $match) {
echo $match [0] . "\n" ;
}
}
?>
thanks for any help!
Posted: Mon Dec 20, 2004 1:38 pm
by rehfeld
With PCRE (Perl Compatible Regular Expressions, i.e. the preg_* functions)
you need to specify delimiters. Think of them kind of like quotes.
And like the error message says, the delimiter can't be a backslash or an alphanumeric character.
The most common delimiter is a forward slash /
but you are able to use other characters.
In your case, this should work:
Code:
$pattern ='/include\?/';
// but if your pattern has forward slashes in it, like so:
$pattern ='/some/file/path/'; // won't work: the second slash ends the pattern early
$pattern ='/some\/file\/path/'; // works, because I escaped the forward slashes in my pattern
// but this is easier: the delimiter char is not present in the pattern, so nothing needs escaping
$pattern ='#some/file/path#';
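A minimal sketch of the delimiter rule described above, in case it helps to see it run. The sample string is made up for illustration; preg_match() and preg_quote() are standard PHP functions, and passing the delimiter as the second argument to preg_quote() is one way to avoid escaping by hand.

```php
<?php
// Hypothetical sample text, used only for illustration.
$subject = "see /some/file/path and include? here";

// Same pattern, two delimiter choices: both match the literal path.
$escaped   = '/some\/file\/path/'; // "/" delimiter, inner slashes escaped
$alternate = '#some/file/path#';   // "#" delimiter, nothing to escape

var_dump(preg_match($escaped, $subject));   // int(1)
var_dump(preg_match($alternate, $subject)); // int(1)

// preg_quote() escapes regex metacharacters; the second argument
// tells it to escape that delimiter character as well.
$literal = 'include?';
$pattern = '/' . preg_quote($literal, '/') . '/';
echo $pattern, "\n"; // /include\?/
```

Whichever delimiter is chosen, the same character has to open and close the pattern, which is why a leading ^ produced the "No ending delimiter" warning in the first post.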
Posted: Mon Dec 20, 2004 7:39 pm
by sebnewyork
thanks but I don't get it!
I tried
$pattern ='/include\?/';
but this doesn't return anything.
What I want to match is any occurrence of the word "include" in my html page. I know there are 4 in the page I'm looking in, and the pattern doesn't return anything.
Why do I need delimiters? I understand from some tutorials that if I'm looking to match "abc", I should use the pattern "abc".
You are saying I should use
"/abc/" ?...
Posted: Mon Dec 20, 2004 7:53 pm
by rehfeld
PCRE uses delimiters, so you need to use them.
The PHP ereg functions are NOT PCRE, so you don't use delimiters with them. That's probably why you're confused.
I would recommend sticking to PCRE though; the preg functions are more powerful, and usually a good deal faster.
They need delimiters because you can specify modifiers outside of the pattern.
For example, "i" makes the whole pattern case insensitive:
Code:
// match all occurrences of the word include, case insensitive because of the i modifier
$pattern = '/include/i'; // the "i" is outside of the actual pattern
Why do you have \? in there?
The question mark is a metacharacter, and you escaped it with a backslash, which tells the regex to match a literal question mark.
So unless you're searching for
include? you don't want to do that.
BTW, you don't need regex for this. Look at substr_count()
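To make the points above concrete, here is a small sketch comparing the patterns discussed so far with substr_count(). The sample string is invented for illustration; the counts in the comments follow from that string only.

```php
<?php
// Hypothetical sample text, used only for illustration.
$subject = "include('a.html') and Include('b.html'), then include? once";

// Case-sensitive: finds the two lowercase occurrences.
echo preg_match_all('/include/', $subject), "\n";   // 2

// The "i" modifier sits after the closing delimiter and also matches "Include".
echo preg_match_all('/include/i', $subject), "\n";  // 3

// An escaped question mark only matches the literal text "include?".
echo preg_match_all('/include\?/', $subject), "\n"; // 1

// For a fixed word, no regex is needed at all.
echo substr_count($subject, 'include'), "\n";             // 2
echo substr_count(strtolower($subject), 'include'), "\n"; // 3
```

Note that substr_count() is case-sensitive, so lowercasing the subject first is one simple way to get the equivalent of the "i" modifier.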
Posted: Mon Dec 20, 2004 8:13 pm
by sebnewyork
thanks a lot for your reply.
I have to say, I'm losing my hair on this. Major headache.
I don't know what "pcre" is or is not. I don't know how it can make a difference, since I'm using PHP, and PHP can't know whether I mean my regex to be "pcre" or not, can it?
At this point I just want the easiest solution, not the best one.
I'm trying to match all occurrences of
"include('blablabla')"
in an html page, where blablabla can be anything.
The regex I've come up with after two days of trying different things, is:
/include\('?'\)/
and it does not seem to match anything.
I don't even know if maybe it matches things but I can't see it because I don't have the right "echo" code.
I have:
$pattern ="/include\('?'\)/";
preg_match_all ($pattern, $subject, $matches, PREG_SET_ORDER );
echo $matches [0][ 0] . "zz" .$matches [0][ 1] . "\n" ;
it only returns zz of course. I understand nothing. Please help
Posted: Mon Dec 20, 2004 10:02 pm
by sebnewyork
I have found the regex I needed!
here it is:
/include ?\('.*'\)/
this matches all my include tags in my html code.
Now, I want to extract from those matches the content that's between the single quotes, i.e. from the match
include ('mypage.html')
I want to extract
mypage.html
How would I achieve that?
I thought of using preg_split, and use the single quotes as what splits. but I'm not sure how to do that.
Here's my first attempt, in context, which obviously doesn't work:
Code:
<?php
$subject = file_get_contents ("mypage.html");
$pattern ="/include ?\('.*'\)/";
preg_match_all ($pattern, $subject, $matches);
for ($i = 0;$i<count($matches[0]);$i++){
echo "matched:" . $matches [0][$i] . "<br />"; //works up to here
$chars = preg_split("/'/", $match);
for ($i = 0;$i<count($chars[0]);$i++){
echo "matched:" . $chars [0][$i] . "<br />";
}
}
?>
any help would be greatly appreciated. It took me 3 days to find the regex!
thanks
Posted: Tue Dec 21, 2004 1:16 am
by rehfeld
You don't need preg_split in this case; splitting on a fixed character doesn't require a regular expression.
Just use explode(). Faster, simpler.
Your code wasn't working because you're using a variable named $match, which doesn't exist.
Fix that and it would work.
Code:
<?php
$subject = file_get_contents ("mypage.html");
$pattern ="/include ?\('.*'\)/";
preg_match_all ($pattern, $subject, $matches);
foreach ($matches[0] as $match) {
$parts = explode("'", $match);
echo $parts[1] . '<br>';
}
?>
When I told you before that it is difficult to isolate the filename, that was because I was taking into account the possibility that you may use single or double quotes in the include statement, and may or may not use parentheses,
like
include "foo.php";
But if your syntax is always the same, you could just use this pattern,
and then there's no need to use explode or anything. The filenames end up in $matches[1]:
Code:
$pattern = "/include ?\('([^']+)'\)/";
preg_match_all ($pattern, $subject, $matches);
print_r($matches);
Oh, and the preg_* functions are all PCRE.
The ereg* functions are NOT.
That's how to tell the difference: the p stands for Perl.
PCRE = Perl Compatible Regular Expressions
BTW, if you haven't seen this site, take a look. I found it very helpful:
http://www.regular-expressions.info/tutorial.html
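For anyone following along, here is a rough sketch of the capture-group approach from this post. The sample strings and the variable names $greedy, $flex and $mixed are made up for illustration, and the last pattern is only one possible way to handle the quote and parenthesis variations mentioned above, not something tested against real pages.

```php
<?php
// Hypothetical page content on a single line, used only for illustration.
$subject = "include('header.html'); text include ('menu.html'); include('footer.html');";

// Greedy .* runs on to the LAST ') on the line, so the three
// includes collapse into one long match.
preg_match_all("/include ?\('.*'\)/", $subject, $greedy);
echo count($greedy[0]), "\n"; // 1

// [^']+ cannot cross a quote, so each include matches separately,
// and the parentheses capture the filename into $matches[1].
preg_match_all("/include ?\('([^']+)'\)/", $subject, $matches);
print_r($matches[1]); // header.html, menu.html, footer.html

// Assumed variant: either quote style, parentheses optional.
// Group 1 remembers the opening quote, \1 requires the same one to close.
$mixed = 'include "foo.php"; include(\'bar.php\');';
preg_match_all('/include\s*\(?\s*([\'"])(.+?)\1\s*\)?/', $mixed, $flex);
print_r($flex[2]); // foo.php, bar.php
```

The greedy pattern happens to work when every include sits on its own line, because . does not match a newline by default; the [^']+ version does not depend on that.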
Posted: Tue Dec 21, 2004 10:39 am
by sebnewyork
thanks!
I got everything to work.