Splitting parts of a String in Variables

HoboJoey
Forum Newbie
Posts: 2
Joined: Wed Oct 13, 2010 8:19 am

Post by HoboJoey »

Hello there,

I have a file full of list elements like this:

Code: Select all

<li><a href="http://website.com/?somestuff">The Link Name</a> </li>
Now I want to open the file, read the first line, and store "http://website.com/?somestuff" in one variable and "The Link Name" in another. The script should then read the next line and overwrite both variables with that line's link and link name, and so on for every line.

This is what I tried:

Code: Select all

<?php
error_reporting(E_ALL);
ini_set('display_errors', 1);
$fd = fopen("list","r");
$i='0';
while (!feof($fd)) {
	$z = fgetss($fd,1000,"<a>");
	echo $z;
}
?>
Now I somehow have to get both parts into variables. I think I have to use regular expressions, but I can't figure out exactly what to do.

Thanks in advance,

HoboJoey
HoboJoey
Forum Newbie
Posts: 2
Joined: Wed Oct 13, 2010 8:19 am

Re: Splitting parts of a String in Variables

Post by HoboJoey »

Solved it like this:

Code: Select all

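<?php
error_reporting(E_ALL);
ini_set('display_errors', 1);
$fd = fopen("list","r");
// Note: fgetss() was deprecated in PHP 7.3 and removed in PHP 8.0
while (($zeile = fgetss($fd,1000,"<a>")) !== false) {
	// Start of the URL inside the href attribute
	$pos1 = strpos($zeile, "http://");
	// End of the href attribute: the closing quote followed by ">"
	$pos2 = strpos($zeile, "\">");
	if ($pos1 === false || $pos2 === false) {
		continue; // no link on this line
	}
	// substr() expects a length, so convert the end position into one
	$subString = substr($zeile, $pos1, $pos2 - $pos1);

	echo "subString = $subString <br>";
}
fclose($fd);
?>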
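
The snippet above only captures the URL. A minimal sketch for getting the link name as well, assuming every line has exactly the shape shown in the first post (double-quoted href, no nested tags inside the link). The helper name parseListLine is hypothetical, and the sample line is hard-coded here instead of being read from the file:

```php
<?php
// Hypothetical helper: pull the URL and the link text out of one line of the list.
function parseListLine(string $line): ?array
{
    // Group 1: the href value, group 2: the text between <a ...> and </a>
    if (preg_match('%<a\s[^>]*href="([^"]*)"[^>]*>(.*?)</a>%i', $line, $m)) {
        return array('url' => $m[1], 'name' => $m[2]);
    }
    return null; // no link on this line
}

// Sample line copied from the first post
$line = '<li><a href="http://website.com/?somestuff">The Link Name</a> </li>';
$parts = parseListLine($line);
if ($parts !== null) {
    $url  = $parts['url'];
    $name = $parts['name'];
    echo "$name => $url\n";
}
```

In a real script the same call would sit inside a loop that reads the file line by line with fgets(), which also avoids the removed fgetss().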
AbraCadaver
DevNet Master
Posts: 2572
Joined: Mon Feb 24, 2003 10:12 am
Location: The Republic of Texas

Re: Splitting parts of a String in Variables

Post by AbraCadaver »

Try DOMDocument:

Code: Select all

$doc = new DOMDocument();
$doc->loadHTMLFile("list");

foreach($doc->getElementsByTagName('a') as $link) {
	$links[] = array('url' => $link->getAttribute('href'), 'text' => $link->nodeValue);
}
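
To see what this yields for the sample line from the first post, here is a minimal sketch. It assumes the markup is held in a string and loaded with loadHTML() rather than read from the file "list"; the variable names $url and $name are only for illustration:

```php
<?php
// Sample markup taken from the first post; a real script would load the file instead.
$html = '<li><a href="http://website.com/?somestuff">The Link Name</a> </li>';

// Collect parser warnings internally instead of printing them for loose HTML fragments
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

$links = array();
foreach ($doc->getElementsByTagName('a') as $link) {
    $links[] = array('url' => $link->getAttribute('href'), 'text' => $link->nodeValue);
}

// Each entry now carries both parts the original question asked for
foreach ($links as $entry) {
    $url  = $entry['url'];
    $name = $entry['text'];
    echo $name, ' => ', $url, "\n";
}
```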
twinedev
Forum Regular
Posts: 984
Joined: Tue Sep 28, 2010 11:41 am
Location: Columbus, Ohio

Re: Splitting parts of a String in Variables

Post by twinedev »

I like the DOMDocument method; I'm going to go read up on it. Here is a solution using the regex method:

Code: Select all

<?php

	// Get the file into this variable, I just set it to grab CNN for testing...
	$strFile = file_get_contents("http://www.cnn.com/");

	//                           _1__  _2_        _3_
	preg_match_all('%<a .*?href=("|\')(.*?)\1.*?>(.*?)</a>%i', $strFile, $aryMatch, PREG_PATTERN_ORDER);

	// in $aryMatch:
	//   [0] is array of all complete matches
	//   [1] is array of the opening quote, either single or double, so it can match the closing
	//   [2] is array of the actual URL of the link
	//   [3] is array of the text for the link

	if (isset($aryMatch[2]) && count($aryMatch[2]) > 0) {
		foreach ($aryMatch[2] as $key=>$strURL) {
			$strLinkText = $aryMatch[3][$key]; // Added this for easier readability
			echo ($key+1),': ';
			if (preg_match('/^javascript:/i',$strURL)) {
				echo "<strong><em>Javascript Call</em></strong><br>\n";
			}
			else {
				echo htmlspecialchars($strLinkText),'<strong> LINKS TO </strong>',htmlspecialchars($strURL),"<br>\n";
			}
		}
	}
	else {
		echo "Sorry, no links found...";
	}

?>
A note before you copy and paste that: the editor here kept changing the code on me. On the line that has the preg_match for javascript, the pattern is supposed to be /^javascript:/i.

Another thing you may want to consider, depending on how you use the data, is checking whether a link starts with #, which means it just links to an anchor on the same page. If it does, change it from #whatever to /path/to/file#whatever
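
A minimal sketch of that check, assuming you already know the path of the page the links came from. The helper name resolveAnchor and the example path are hypothetical:

```php
<?php
// Hypothetical helper: prefix same-page anchors with the path of the page they came from.
function resolveAnchor(string $url, string $pagePath): string
{
    // A link that starts with "#" points at an anchor on the page itself
    if (strncmp($url, '#', 1) === 0) {
        return $pagePath . $url;
    }
    return $url; // anything else is left untouched
}

echo resolveAnchor('#whatever', '/path/to/file'), "\n";             // /path/to/file#whatever
echo resolveAnchor('http://website.com/', '/path/to/file'), "\n";   // http://website.com/
```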

-Greg