preg_match()

phoenix121
Forum Commoner
Posts: 28
Joined: Sun Sep 25, 2005 9:09 pm
Location: New Zealand

preg_match()

Post by phoenix121 »

Hi,
it's me again.
I was wondering if there is any way to match something only when it is not followed by something else, such as

finding src="xxx" but not src="http://xxx"

Some sample code would be very helpful! Thanks
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Code: Select all

$p = '#src=\s*(["\'])?(?!http://)(.*?)\\1#';
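For anyone who wants that pattern spelled out, here is a rough sketch of the same expression written with the x (extended) flag so each piece can carry a comment. The ~ delimiter and the sample HTML string are my own choices for illustration (# starts a comment in extended mode, so it is awkward as a delimiter here); the pattern itself is unchanged.

```php
<?php
// Same expression as above, split over several lines with the x flag.
$p = '~
    src=\s*        # the literal text src= followed by optional whitespace
    (["\'])?       # group 1: an optional opening quote, double or single
    (?!http://)    # negative lookahead: the value must not begin with http://
    (.*?)          # group 2: the value, taking as few characters as possible
    \\1            # backreference: the same quote character group 1 captured
~x';

// Hypothetical sample input, not from the thread.
if (preg_match($p, '<img src="logo.png">', $m)) {
    echo $m[2], "\n"; // the attribute value without its quotes
}
```

The lookahead only peeks at what follows; it consumes nothing, which is why the value still ends up whole in group 2.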

Post by phoenix121 »

Can you please explain that regular expression in English? I'm no good with these expressions! :?

Post by phoenix121 »

OK, here's my code so far, and it's not working:

Code: Select all

$searchstring = array('#src=\s*(["])?(?!http://)(.*?)\\1#', '#href=\s*(["])?(?!http://)(.*?)\\1#');
	$replacestring = array("src=\"$address", "href=\"$address");
	$contents = str_replace($searchstring, $replacestring, $contents);
$contents is the data
$address is the address (e.g. http://www.google.com)

Is it the (.*?) bits that are stopping it from working?

edit:
Maybe I should have stated my problem more clearly:
I want it to find src=" or href=" and replace it with src="$address or href="$address
That's a double quote, and I also want the search to match with or without the double quote.

Post by feyd »

It's not working how? The pattern will match the entire attribute, not just the beginning part.

You can do what you are trying in one expression.

Code: Select all

$searchstring = '#(src|href)=\s*(["\'])?(?!http://)(.*?)\\2#i';
    $replacestring = "\\1=\\2{$address}\\3\\2";
    $contents = str_replace($searchstring, $replacestring, $contents);
d11wtq | Ouch 8O

Post by phoenix121 »

I really need help; somehow it's not working.

I can't think of any way to do it, since I'm quite new to PHP. All I want to do is somehow find all the src=" and href=" that don't have http:// after them, and then replace them with src="$address or href="$address

Isn't there any way to do that? :?
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Post by Chris Corbyn »

*cough* str_replace() ?? Hint: preg_replace() ;)
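To make the hint concrete, here is a small sketch of the difference. The base address and the sample HTML are made up for illustration; the pattern and replacement are the ones posted earlier in the thread.

```php
<?php
$address  = 'http://www.example.com/'; // hypothetical base address
$contents = '<img src="logo.png"> <a href="http://other.example/">x</a>';

$pattern     = '#(src|href)=\s*(["\'])?(?!http://)(.*?)\\2#i';
$replacement = "\\1=\\2{$address}\\3\\2";

// str_replace() searches for the pattern text itself, character for
// character. That text never occurs in $contents, so nothing changes.
$literal = str_replace($pattern, $replacement, $contents);

// preg_replace() compiles the same string as a regular expression and
// expands \1, \2, \3 in the replacement.
$regex = preg_replace($pattern, $replacement, $contents);

echo $literal, "\n";
echo $regex, "\n";
```

Only the relative src should come back rewritten; the href that already starts with http:// is skipped by the lookahead.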

Post by feyd »

:oops: I didn't even notice that.. 8O

Post by Chris Corbyn »

feyd wrote: :oops: I didn't even notice that.. 8O
LOL you're too busy answering posts at lightning speed :P

Post by phoenix121 »

Hah, thanks! I should have noticed that; you've already told me to use preg_replace() before! ;)

edit:
One last thing: how do you search for a double quote, a single quote, or nothing (neither)? I got the ["\'] bit, but how about the "nothing" bit?

Post by phoenix121 »

Can you use (["\'(space)] to do it?

Post by feyd »

the regular expression I posted last will handle a double quote, single quote or nothing being there.
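One caveat worth checking on your own install: as far as I know, PCRE fails a backreference to a group that took no part in the match, so with (["\'])? an unquoted value may not match at all. The sketch below tests that and shows one possible variant, which is my own assumption and not something posted in this thread: it makes the quote group always participate (["\']?+), uses possessive quantifiers so the engine cannot back off the quote to dodge the lookahead, and only matches up to the start of the value, which is enough for putting $address in front.

```php
<?php
$address = 'http://www.example.com/'; // hypothetical base address

// Pattern from the thread, tried against an unquoted attribute.
$original = '#(src|href)=\s*(["\'])?(?!http://)(.*?)\\2#i';
$matched  = preg_match($original, '<img src=logo.png>');
var_dump($matched);

// Variant (an assumption, see above): group 2 holds the quote or an empty
// string, and the possessive *+ and ?+ forbid giving characters back.
$variant = '#(src|href)=\s*+(["\']?+)(?!http://)#i';

// Hypothetical sample input covering unquoted, double-quoted and
// single-quoted absolute values.
$html   = '<img src=logo.png> <img src="a.png"> <a href=\'http://x.example/\'>';
$result = preg_replace($variant, '${1}=${2}' . $address, $html);
echo $result, "\n";
```

The ${1} form in the replacement avoids any ambiguity if the address ever begins with a digit.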

adding in http://.../

Post by phoenix121 »

Code: Select all

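<?php
// Run only when the form has been submitted.
if (!empty($_POST['submit'])) {
	// NOTE: $address comes straight from user input; validate it before fetching in real use.
	$address = $_POST['address'];
	if (!$contents = file_get_contents($address)) {
		echo "<b>Error: Cannot read file! Either file is empty or file does not exist.</b>";
		exit;
	}
	// Match src/href values that do not start with http:// ...
	$searchstring = '#(src|href)=\s*(["\'])?(?!http://)(.*?)\\2#i';
	// ... and rebuild the attribute with $address in front of the value.
	$replacestring = "\\1=\\2{$address}\\3\\2";
	$contents = preg_replace($searchstring, $replacestring, $contents);

	echo $contents;
}
?>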
That's my code. Because it's grabbing the HTML from another website, some links and images might not work, depending on how that website was coded. So I want it to detect all the hrefs and srcs that don't include the http:// bit and put the $address variable, which holds the website address, in front. The only ones it changes are:

src="xxx" to src="http://.../xxx" and href="xxx" to href="http://.../xxx"

Problems:
*Some srcs and hrefs don't have the single or double quote... some start with a slash or just a folder name.
*Another problem is if the link is "../", i.e. going up a directory.

I think I can fix the folder name one, but can anyone help with the slash or the ../?

Thanks in advance

Post by feyd »

Use preg_replace_callback(). You can then analyze what you find in the attribute and replace it as you wish.
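A minimal sketch of that idea, under some loud assumptions: resolve_url() and absolutize() are hypothetical helper names, the base address is made up, ports and query strings in the base are ignored, and every rewritten attribute is written back with double quotes. It handles the leading slash and the ../ cases by building a path and collapsing the dot segments.

```php
<?php
// Hypothetical helper: resolve $url against $base (an absolute http URL).
function resolve_url(string $url, string $base): string
{
    // Leave absolute URLs, protocol-relative URLs and fragments alone.
    if (preg_match('~^(?:[a-z][a-z0-9+.-]*:|//|#)~i', $url)) {
        return $url;
    }
    $parts = parse_url($base);
    $root  = $parts['scheme'] . '://' . $parts['host'];
    $path  = $parts['path'] ?? '/';

    if ($url !== '' && $url[0] === '/') {
        $full = $url; // root-relative: starts from the site root
    } else {
        // Relative: starts from the directory of the base page.
        $full = substr($path, 0, strrpos($path, '/') + 1) . $url;
    }

    // Collapse "." and ".." segments.
    $out = [];
    foreach (explode('/', $full) as $seg) {
        if ($seg === '..') {
            array_pop($out);
        } elseif ($seg !== '.') {
            $out[] = $seg;
        }
    }
    return $root . '/' . ltrim(implode('/', $out), '/');
}

// Hypothetical helper: rewrite every src/href in $html against $base.
function absolutize(string $html, string $base): string
{
    // The value may be double-quoted, single-quoted or unquoted; the
    // branch-reset group (?|...) puts it in group 2 in all three cases.
    $pattern = '~\b(src|href)\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s>"\']+))~i';

    return preg_replace_callback($pattern, function (array $m) use ($base) {
        $url = resolve_url($m[2], $base);
        return $m[1] . '="' . str_replace('"', '&quot;', $url) . '"';
    }, $html);
}

$base = 'http://www.example.com/dir/page.html'; // hypothetical page address
$html = '<img src=logo.png> <a href="/top.html"> <img src=\'../img/a.png\'> <a href="http://other.example/">';
$fixed = absolutize($html, $base);
echo $fixed, "\n";
```

Doing the URL logic in plain PHP inside the callback keeps the regular expression small; the pattern only has to find the attribute, not decide what kind of link it is.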