looking for a string in a file?...

sebnewyork
Forum Commoner
Posts: 43
Joined: Wed Mar 17, 2004 10:20 pm

Post by sebnewyork »

Hi all

I have used the fgetcsv() function before to detect a character while opening and reading a file. Now I need to look for not just one character but a whole string (for example "<table>" or "<table width=\"100\">", or any other string) within the file.

How can I do that?

The code I was using before is

Code: Select all

<?php
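foreach ($pages as $file) {
	$fp = fopen($file, 'rb');
	// fgetcsv() splits each line on "n", so more than one field means an "n" was found
	while (($line = fgetcsv($fp, 200, "n")) !== false) {
		if (count($line) > 1) {
			echo "one n found";
			break;
		}
	}
	fclose($fp);
}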

?>
What should I use if I wanted to achieve the same thing but with a string rather than a single character? I know the following code is not valid, but this is what I'd like to do:

Code: Select all

<?php
foreach($pages as $file){
	$fp = fopen ("$file", 'rb');
	while ($line = fgetcsv ($fp, 200, "<table>")) {
		foreach('<table>') {
			echo "one table tag found";
			break;
		}
	}
	fclose ($fp);
}
?>

Thanks for your help!
rehfeld
Forum Regular
Posts: 741
Joined: Mon Oct 18, 2004 8:14 pm

Post by rehfeld »

Code: Select all

<?php

$haystack = file_get_contents($filename);

// faster if you just need to see if needle exists
if (false !== strpos($haystack, $needle)) {
    echo 'needle found';
}

// or

$occurances = substr_count($haystack, $needle);

echo "$needle was found $occurances times";

?>

If you need a case-insensitive search, call strtolower() on both $haystack and $needle before using either of the methods above, or, if you are using PHP 5, use stripos().
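
A small sketch of both case-insensitive approaches. The sample strings are made up for illustration; in practice $haystack would come from file_get_contents() as above.

```php
<?php
// Hypothetical sample data standing in for a file's contents.
$haystack = '<TABLE width="100"><tr><td>cell</td></tr></TABLE>';
$needle   = '<table';

// Portable approach: lower-case both strings, then search as usual.
$foundLower = (false !== strpos(strtolower($haystack), strtolower($needle)));

// PHP 5 and later: stripos() ignores case directly.
$foundStripos = (false !== stripos($haystack, $needle));

// Case-insensitive count (substr_count() itself is case-sensitive).
$count = substr_count(strtolower($haystack), strtolower($needle));

echo $foundLower ? "needle found\n" : "needle not found\n";
echo "$needle was found $count times\n";
```

Note the strict comparison against false in both checks: a match at position 0 returns 0, which a loose comparison would treat as "not found".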
sebnewyork
Forum Commoner
Posts: 43
Joined: Wed Mar 17, 2004 10:20 pm

Post by sebnewyork »

thanks rehfeld

I assume the "$needle" can be any string, for example I could have:

$occurances = substr_count($myPage, "<? include('path_to_file.html') ?>");

right?

Basically my goal is to return a list of all the included files in a page, like in my example: "path_to_file.html"
So I'd need somehow to get PHP to look for the opening include tag up to the actual file path

"<? include('"

and the next closing tag

"') ?>"

so that it can return just the actual file path, between those two strings.

Can I use the haystack and needle system for that, or is there a special function just for that (returning any content between each occurrence of 2 specific strings)?

Thanks a lot for your help
rehfeld
Forum Regular
Posts: 741
Joined: Mon Oct 18, 2004 8:14 pm

Post by rehfeld »

You could use strpos() and substr() in a while loop, but that becomes extremely difficult as soon as the syntax varies.
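
For the two fixed delimiters in the question, a rough sketch of such a loop might look like this. extract_between() is a hypothetical helper name, not a built-in function, and it only works when the opening and closing strings are always written exactly the same way.

```php
<?php
// Return every substring found between $start and $end in $haystack.
// extract_between() is a hypothetical helper, not a built-in PHP function.
function extract_between($haystack, $start, $end)
{
    $found  = array();
    $offset = 0;

    // Look for the next opening delimiter, starting where the last match ended.
    while (false !== ($from = strpos($haystack, $start, $offset))) {
        $from += strlen($start);                 // first character after the opening delimiter
        $to = strpos($haystack, $end, $from);    // next closing delimiter
        if (false === $to) {
            break;                               // unterminated: stop searching
        }
        $found[] = substr($haystack, $from, $to - $from);
        $offset  = $to + strlen($end);           // continue after the closing delimiter
    }

    return $found;
}

// Made-up sample page for illustration.
$page = "<? include('header.html') ?><p>text</p><? include('footer.html') ?>";
print_r(extract_between($page, "<? include('", "') ?>"));
```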

You should use regular expressions for this; it's what they are made for.
Take a look at preg_match_all().

This is a tough one, though, because there are so many variations in the syntax for include and require.

I'm not that great at regex, but I believe this should get them all.
It doesn't isolate the filename, though; it just grabs the whole include statement. Isolating the filename is where it gets difficult.

Code: Select all

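<pre>
<?php

// Matches include and require statements (including the _once variants)
// and everything that follows the keyword up to the next semicolon.
$pattern = '/(include|require(_once)?)([^;]+)/i';

preg_match_all($pattern, $subject, $matches);

print_r($matches);

?>
</pre>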
You can learn more about regex on Google, or here's a nice tutorial:
http://www.regular-expressions.info/tutorial.html
It takes a long time to learn the seemingly cryptic syntax, but it's well worth it.
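
As a rough idea of how the filename could be isolated, here is a sketch with a capture group around the quoted path. It assumes the path is always a plain quoted string; includes built from variables, constants or concatenation are not handled, and the sample $subject is made up.

```php
<?php
// Hypothetical sample standing in for file_get_contents($file).
$subject = "<? include('path_to_file.html') ?>\n<?php require_once \"lib/db.php\"; ?>";

// \b            word boundary, so "preinclude" is not matched
// (?:...)       non-capturing groups for the keyword and the optional _once
// \s*\(?\s*     optional whitespace and an optional opening parenthesis
// (['"])        capture the opening quote ...
// (.+?)\1       ... then the shortest run of characters up to the same quote
$pattern = '/\b(?:include|require)(?:_once)?\s*\(?\s*([\'"])(.+?)\1/i';

preg_match_all($pattern, $subject, $matches);

// $matches[2] holds what the second capture group matched: the file paths.
$files = $matches[2];
print_r($files);
```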
sebnewyork
Forum Commoner
Posts: 43
Joined: Wed Mar 17, 2004 10:20 pm

Post by sebnewyork »

thank you.
Do I need to define the variables $subject and $matches?
If yes, how do I do that? Is $subject the page I want to open and read through?

And does the preg_match_all() function open the pages, or do I need to use
file_get_contents() or fopen()?

Sorry I'm lost.
rehfeld
Forum Regular
Posts: 741
Joined: Mon Oct 18, 2004 8:14 pm

Post by rehfeld »

Yes,
$subject = file_get_contents('some_file.php');

If you're using an old version of PHP (before 4.3.0),
you may need to use fopen() and fread() instead of file_get_contents().

$matches will be filled by preg_match_all(),
so no, you do not need to give it a value beforehand.
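
A minimal sketch of the fopen()/fread() route, in case file_get_contents() is not available. read_whole_file() is a made-up helper name.

```php
<?php
// read_whole_file() is a hypothetical helper name, not part of PHP.
function read_whole_file($path)
{
    $fp = fopen($path, 'rb');
    if (false === $fp) {
        return false;               // could not open the file
    }

    $contents = '';
    while (!feof($fp)) {
        $chunk = fread($fp, 8192);  // read up to 8 KB at a time
        if (false === $chunk) {
            break;
        }
        $contents .= $chunk;
    }
    fclose($fp);

    return $contents;
}
```

It would then be used the same way: $subject = read_whole_file('some_file.php');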
Last edited by rehfeld on Sat Dec 18, 2004 1:50 pm, edited 1 time in total.
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Post by John Cartwright »

Code: Select all

<?php
$subject = file_get_contents($file);
$pattern = '/(include|require(_once)?)([^;]+)/i';
preg_match_all($pattern, $subject, $matches);
print_r($matches);
?>
Subject is your content, Matches is an array of all the matches your regex has found....
sebnewyork
Forum Commoner
Posts: 43
Joined: Wed Mar 17, 2004 10:20 pm

Post by sebnewyork »

Ah, thanks, it seems to work... sort of!
But not really.
Here's the code I'm using now:

Code: Select all

<?php
$subject = file_get_contents ($file); 
$pattern ='/(include|require(_once)?)([^;]+)/i';
if (preg_match_all ($pattern, $subject, $matches)) {
	foreach ($matches as $match) {
		echo "<p>" . $match ."</p>"; 
	}
}
?>
And depending on what $file is, it echoes a different number of matches, in the form of

Array

Array

Array

etc.
And it returns 4 of them for a page that doesn't have any includes!... So there must be something wrong with the regular expression; I'll have to look into it.

But instead of "Array" I'd like to have the matching string returned, i.e.
"include('blabla.htm')"
How would I do that?
markl999
DevNet Resident
Posts: 1972
Joined: Thu Oct 16, 2003 5:49 pm
Location: Manchester (UK)

Post by markl999 »

Code: Select all

foreach ($matches[0] as $match) {
 echo "<p>" . $match ."</p>";
}
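
To show why index 0 is the one to loop over, here is a sketch of how preg_match_all() lays out $matches for the pattern used in this thread. The sample $subject is made up, and the htmlspecialchars() call is an addition so that any markup in a match is shown rather than rendered.

```php
<?php
// Hypothetical sample standing in for file_get_contents($file).
$subject = "<?php include('a.php'); require_once 'b.php'; ?>";
$pattern = '/(include|require(_once)?)([^;]+)/i';

preg_match_all($pattern, $subject, $matches);

// With the default PREG_PATTERN_ORDER flag:
//   $matches[0] = every full match
//   $matches[1] = what the first group, the keyword, captured in each match
//   $matches[2] = what the second group, the optional _once, captured
//   $matches[3] = what the third group, the rest up to the semicolon, captured
foreach ($matches[0] as $match) {
    // htmlspecialchars() keeps any markup in the match from being rendered
    echo "<p>" . htmlspecialchars($match) . "</p>\n";
}

// Looping over $matches itself visits these four inner arrays,
// which is why echo printed "Array" each time.
echo count($matches), " inner arrays\n";
```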
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Post by John Cartwright »

Take a look at http://ca.php.net/preg_match_all to understand how matches are returned.