regex problem

sebnewyork
Forum Commoner
Posts: 43
Joined: Wed Mar 17, 2004 10:20 pm

regex problem

Post by sebnewyork »

Hi all

I'm trying to use a regular expression, and PHP always returns:

Warning: No ending delimiter '^' found in...
(that's if I use the caret at the beginning of my regex)

or:

Warning: Delimiter must not be alphanumeric or backslash in...
(that's if I use a backslash to escape characters in my regex)

What's going on?
Here's my code:

Code: Select all

<?php
if (isset ($_POST['Submit'])) {
	$subject = file_get_contents ($_POST['edit_choice']); 
	$pattern ='include\?';
	preg_match_all ($pattern, $subject, $matches, PREG_SET_ORDER ); 
	foreach ($matches as $match) {
		echo $match [0] . "\n" ;
	}
}
?>
thanks for any help!
rehfeld
Forum Regular
Posts: 741
Joined: Mon Oct 18, 2004 8:14 pm

Post by rehfeld »

With PCRE (Perl-compatible regular expressions, i.e. the preg_* functions such as preg_match and preg_match_all)

you need to specify delimiters. Think of them kind of like quotes.
And like the error message says, the delimiter can't be a backslash or an alphanumeric character.

The most common delimiter is a forward slash /
but you are able to use other characters.

In your case, this should work:

Code: Select all

$pattern ='/include\?/';
// but if your pattern has forward slashes in it, like so:

$pattern ='/some/file/path/'; // won't work: the pattern ends at the second slash and the rest is read as modifiers
$pattern ='/some\/file\/path/'; // works, because the forward slashes inside the pattern are escaped

// this is easier: # is not present in the pattern, so nothing needs escaping
$pattern ='#some/file/path#';
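To see the delimiter rules in action, here is a minimal sketch. The sample text in $subject is made up for illustration. preg_quote() is a standard PHP function, but using it here is my suggestion and not something from the original code: it escapes metacharacters in a literal string, and its second argument also escapes the delimiter you picked.

```php
<?php
// Sample text; the contents are made up for illustration.
$subject = "see include? and /some/file/path here";

// The original pattern, now wrapped in / delimiters.
$count = preg_match_all('/include\?/', $subject, $matches);
echo $count . "\n"; // number of times the literal text "include?" was found

// Using # as the delimiter avoids escaping the slashes in the path.
$found = preg_match('#some/file/path#', $subject);
echo $found . "\n"; // 1 if the path was found, 0 if not

// preg_quote() escapes regex metacharacters in a literal string;
// passing '/' as the second argument escapes the delimiter as well.
$literal = 'some/file/path';
$pattern = '/' . preg_quote($literal, '/') . '/';
echo $pattern . "\n";
```

Building the pattern with preg_quote() is handy when the text to search for comes from somewhere else and you don't know which characters it contains.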
sebnewyork
Forum Commoner
Posts: 43
Joined: Wed Mar 17, 2004 10:20 pm

Post by sebnewyork »

Thanks, but I don't get it!
I tried
$pattern ='/include\?/';
but this doesn't return anything.
What I want to match is any occurrence of the word "include" in my HTML page. I know there are 4 in the page I'm looking in, and the pattern doesn't return anything.
Why do I need delimiters? I understand from some tutorials that if I'm looking to match "abc", I should use the pattern "abc".
Are you saying I should use
"/abc/"?
rehfeld
Forum Regular
Posts: 741
Joined: Mon Oct 18, 2004 8:14 pm

Post by rehfeld »

PCRE uses delimiters, so you need to use them.


The PHP ereg functions are NOT PCRE, so you don't use delimiters with them. That's probably why you're confused.

I would recommend sticking to PCRE though; the preg functions are more powerful, and usually a good deal faster.



They need delimiters because you can specify modifiers outside of the pattern.

For example, "i" makes the whole pattern case insensitive:

Code: Select all

// match all occurrences of the word include, case insensitive because of the i modifier
$pattern = '/include/i'; // the "i" is outside of the actual pattern
Why do you have \? in there?

The question mark is a metacharacter, and you escaped it with a backslash, which tells the regex to match a literal question mark.

So unless you're searching for "include?" you don't want to do that.


By the way, you don't need regex for this. Look at substr_count().
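Here is a rough sketch that puts those options side by side. The sample text in $subject is invented for the example, and the variable names are just illustrative.

```php
<?php
// Made-up page content for illustration.
$subject = "include('a.html');\nINCLUDE('b.html');\nWhy include? Because.";

// Plain substring count: case sensitive, no regex involved.
$plain = substr_count($subject, 'include');

// Regex with the i modifier: also counts the upper-case INCLUDE.
$insensitive = preg_match_all('/include/i', $subject, $m1);

// Escaped question mark: only matches the literal text "include?".
$literalQuestion = preg_match_all('/include\?/', $subject, $m2);

// Unescaped question mark: makes the final "e" optional,
// so the pattern matches "include" as well as "includ".
$optionalE = preg_match_all('/include?/', $subject, $m3);

echo "$plain $insensitive $literalQuestion $optionalE\n";
```

If all you need is a count of a fixed word, substr_count() is the simplest of the four; the regex versions only pay off once you need case insensitivity or a variable part in the match.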
sebnewyork
Forum Commoner
Posts: 43
Joined: Wed Mar 17, 2004 10:20 pm

Post by sebnewyork »

Thanks a lot for your reply.
I have to say, I'm losing my hair over this. Major headache.
I don't know what "PCRE" is or is not. I don't know how it can make a difference, since I'm using PHP, and PHP can't know whether I mean my regex to be "PCRE" or not, can it?
At this point I just want the easiest solution, not the best one.
I'm trying to match all occurrences of
"include('blablabla')"
in an HTML page, where blablabla can be anything.
The regex I've come up with after two days of trying different things is:
/include\('?'\)/
and it does not seem to match anything.
I don't even know if maybe it matches things but I can't see it because I don't have the right "echo" code.
I have:

$pattern ="/include\('?'\)/";
preg_match_all ($pattern, $subject, $matches, PREG_SET_ORDER );
echo $matches [0][ 0] . "zz" .$matches [0][ 1] . "\n" ;

It only returns zz, of course. I don't understand any of it. Please help.
sebnewyork
Forum Commoner
Posts: 43
Joined: Wed Mar 17, 2004 10:20 pm

Post by sebnewyork »

I have found the regex I needed!
Here it is:

/include ?\('.*'\)/

This matches all the include tags in my HTML code.
Now I want to extract from those matches the content that's between the single quotes, i.e. from the match

include ('mypage.html')

I want to extract

mypage.html

How would I achieve that?
I thought of using preg_split, with the single quotes as the separator, but I'm not sure how to do that.

Here's my first attempt, in context, which obviously doesn't work:

Code: Select all

<?php
$subject = file_get_contents ("mypage.html");
$pattern ="/include ?\('.*'\)/";
preg_match_all ($pattern, $subject, $matches); 
for ($i = 0;$i<count($matches[0]);$i++){
	echo "matched:" . $matches [0][$i] . "<br />"; //works up to here
	$chars = preg_split("/'/", $match); 
	for ($i = 0;$i<count($chars[0]);$i++){
		echo "matched:" . $chars [0][$i] . "<br />";
	}
}

?>
Any help would be greatly appreciated. It took me 3 days to find the regex!
Thanks
rehfeld
Forum Regular
Posts: 741
Joined: Mon Oct 18, 2004 8:14 pm

Post by rehfeld »

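You don't need preg_split in this case; splitting on a single quote doesn't need a regular expression.

Just use explode(). Faster, simpler.

Your code wasn't working because you're using a variable named $match, which doesn't exist. The inner loop also reuses $i and treats $chars as a nested array, when preg_split returns a flat one.
Fix those and it would work.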




Code: Select all

<?php

$subject = file_get_contents ("mypage.html");
$pattern ="/include ?\('.*'\)/";
preg_match_all ($pattern, $subject, $matches); 

// $matches[0] holds the full text of each match
foreach ($matches[0] as $match) {
    // splitting on ' leaves the filename in element 1
    $parts = explode("'", $match);
    echo $parts[1] . '<br>';
}

?>



When I said before that it is difficult to isolate the filename, that was because I was taking into account the possibility that you may use single or double quotes in the include statement, and may or may not use parentheses,
like
include "foo.php";
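For that looser syntax, one possible pattern is sketched below. This is only an assumption about what "any include statement" looks like: it handles single or double quotes and optional parentheses around a plain string, and it deliberately ignores include_once, require, and includes built from variables. The sample lines in $subject are invented.

```php
<?php
// Made-up include statements in the three styles discussed.
$subject = <<<'TXT'
include "foo.php";
include('bar.php');
include ("baz.php");
TXT;

// ~ is the delimiter. Group 1 captures the opening quote, group 2 the
// filename, and \1 requires the closing quote to match the opening one.
$pattern = '~include\s*\(?\s*(["\'])([^"\']+)\1~';
preg_match_all($pattern, $subject, $matches);

// $matches[2] holds the captured filenames.
foreach ($matches[2] as $filename) {
    echo $filename . "\n";
}
```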


But if your syntax is always the same, you could just use this pattern,
and then there's no need to use explode or anything:

Code: Select all

$pattern = "/include ?\('([^']+)'\)/";
preg_match_all ($pattern, $subject, $matches); 
print_r($matches);
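A small sketch of what comes back from that call, using an invented $subject with two includes on one line. It also shows, as a side note, why [^']+ is safer than .* here: the greedy .* runs to the last quote on the line and swallows both includes in a single match.

```php
<?php
// Made-up page content; the file names are placeholders.
$subject = "include('header.html'); include ('mypage.html');";

// Capture group version: ([^']+) stops at the first closing quote.
$pattern = "/include ?\('([^']+)'\)/";
preg_match_all($pattern, $subject, $matches);

// $matches[0] holds the full matches, $matches[1] the captured filenames.
foreach ($matches[1] as $filename) {
    echo $filename . "\n";
}

// Greedy version for comparison: .* runs to the last '), so both
// includes on the line end up inside one match.
$greedyCount = preg_match_all("/include ?\('.*'\)/", $subject, $greedy);
echo $greedyCount . "\n";
```

With one include per line the two patterns find the same statements, since . does not match a newline by default.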




Oh, and the preg_* functions are all PCRE;
the ereg* functions are NOT.

That's how to tell the difference. The p means Perl:
PCRE = Perl-compatible regular expressions.


By the way, if you haven't seen this site, take a look. I found it very helpful:
http://www.regular-expressions.info/tutorial.html
sebnewyork
Forum Commoner
Posts: 43
Joined: Wed Mar 17, 2004 10:20 pm

Post by sebnewyork »

thanks!
I got everything to work.