Simple link popularity script help

Posted: Wed Feb 02, 2005 9:23 pm
by ron_j_m
I am attempting to write a link popularity script. I would consider myself very new to PHP, as you will no doubt see from my code. Much of the code below has been cobbled together from various other scripts I have found, plus a little of my own.

The problem I am having with the script has to do with addition. The script searches for links to 3 domains (ebay.com, slashdot.org, and purediva.com) on two search engines (Yahoo! and AltaVista).
It then reports the link popularity for each domain from both search engines. It seems to do this correctly. I would also like to add the results from both search engines for each domain, to display the total link popularity for both search engines combined. I can't seem to get this to work and am hoping one of you geniuses can help...
Anyway, here is my code:

Code:

<?

$total=0;

	$results = array("ebay.com","slashdot.org","purediva.com");
	foreach ($results as $domain) {
   
	// check Yahoo!
	$path ="http://search.yahoo.com/search?p=linkdomain%3A".$domain."&ei=UTF-8&fr=fp-tab-web-t&cop=mss&tab=";
	if(!file_exists($path)) {
		$data = strtolower(implode("", file($path)));
		$data = substr($data, strpos($data, "of about")+9, strlen($data));
		$data = strip_tags(substr($data, 0, strpos($data, " ")));
		if(eregi("[[:alpha:]]", $data)) {
			$results['yahoo'] = array('0', $path);
		} else {
			$results['yahoo'] = array($data, $path);
			$total+=str_replace(',', '', $data);
		}
	} else {
		$results['yahoo'] = array('n/a', $path);
	}
	
	// check AltaVista
	$path ="http://www.altavista.com/web/results?q=linkdomain%3A".$domain."&kgs=1&kls=0&stq=10";
	if(!file_exists($path)) {
		$data = strtolower(strip_tags(implode("", file($path))));
		$data = substr($data, strpos($data, "altavista found")+15, strlen($data));
		$data = trim(substr($data, 0, strpos($data, "results"))); //echo "$data<br>"; // TEST
		if(eregi("[[:alpha:]]", $data)) {
			$results['altavista'] = array('0', $path);
		} else {
			$results['altavista'] = array($data, $path);
			$total+=str_replace(',', '', $data);
		}
	} else {
		$results['altavista'] = array('n/a', $path);
	}

$enginetotals = number_format($total);
$yahooresults = $results['yahoo'][0];
$yahoodomain = $results['yahoo'][1];
$altavistaresults = $results['altavista'][0];
$altavistadomain = $results['altavista'][1];
$total1 = ("$yahooresults" + "$altavistaresults");

//====PRINT RESULTS
echo ("Domain = <b>$domain</b> has<br>");
echo ("<a href=\"$yahoodomain\">$yahooresults</a> Yahoo! Results<br>");
echo ("<a href=\"$altavistadomain\">$altavistaresults</a> Altavista Results<br>");
echo ("$enginetotals Total Search Engine Results using enginetotals<br>");
echo ("$total1 Total Search Engine results using total1<br><br>");

	}

?>
And these are the results:

Code:

Domain = ebay.com has
10,100,000 Yahoo! Results
9,720,000 Altavista Results
19,820,000 Total Search Engine Results using enginetotals
19 Total Search Engine results using total1

Domain = slashdot.org has
5,990,000 Yahoo! Results
6,070,000 Altavista Results
31,880,000 Total Search Engine Results using enginetotals
11 Total Search Engine results using total1

Domain = purediva.com has
23 Yahoo! Results
25 Altavista Results
31,880,048 Total Search Engine Results using enginetotals
48 Total Search Engine results using total1
As you can see, the totals don't work. I have tried many different ways and I just can't seem to get them right. The first adds the totals for all domains together, and the second only reports the first 2 digits of the total.
If anyone has any ideas on how to get this to work properly, please help me out...
Also, if you see anything in this code that would help speed things up or make it run better, let me know. Eventually I'm hoping to add many more search engines to this, along with a file upload for processing domain lists.

Thanks for your time.
Ron

Posted: Wed Feb 02, 2005 9:41 pm
by feyd
You can speed up the page by removing the call to file_exists. Currently, you ask PHP to fetch each page twice because of this.

file_get_contents() is faster than implode(file())

preg_* functions are faster than ereg*

"$yahooresults" + "$altavistaresults" is done faster as $yahooresults + $altavistaresults


as for your total being off, it's because $data is inserted into the $results array before it's stripped of commas.
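
To illustrate the comma point, here is a minimal sketch, not the poster's actual script. The `$scraped` array is a hypothetical stand-in for the values the script pulls from the result pages, filled with the counts from the output posted above. It also resets the running total for each domain, assuming a per-domain total is what is wanted rather than one that carries over from domain to domain.

```php
<?php
// Hypothetical scraped counts, keyed by domain (values taken from the posted output).
$scraped = array(
    "ebay.com"     => array("10,100,000", "9,720,000"),
    "slashdot.org" => array("5,990,000", "6,070,000"),
);

$totals = array();
foreach ($scraped as $domain => $counts) {
    $total = 0; // reset for every domain so earlier domains do not leak in
    foreach ($counts as $raw) {
        // (int) "10,100,000" would stop at the first comma and yield 10,
        // so drop every non-digit character before converting.
        $total += (int) preg_replace('/[^0-9]/', '', $raw);
    }
    // Format only at the very end, after the arithmetic is done.
    $totals[$domain] = number_format($total);
}

print_r($totals);
```

Keeping the raw integers around and calling number_format() only when printing avoids the commas getting back into the arithmetic.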

Posted: Thu Feb 03, 2005 12:55 am
by ron_j_m
Thanks for the help. I was able to strip out the commas using preg_replace, and that worked perfectly. I also tried some of your other suggestions. I can't quite figure out how to use preg_match instead of eregi.
Here is the updated code. Not sure if it's all right or not, but it does work :D

Code:

<?
// SCRIPT START TIME ----------------------------------------------------
$timeparts = explode(' ', microtime()); $starttime = $timeparts[1].substr($timeparts[0],1);
// ---
$total=0;

	$results = array("ebay.com","slashdot.org","purediva.com");
	foreach ($results as $domain) {
   
	// check Yahoo!
	$path ="http://search.yahoo.com/search?p=linkdomain%3A".$domain."&ei=UTF-8&fr=fp-tab-web-t&cop=mss&tab=";
	{
		$data = strtolower(file_get_contents("$path"));
		$data = substr($data, strpos($data, "of about")+9, strlen($data));
		$data = strip_tags(substr($data, 0, strpos($data, " "))); 
		if(eregi("[[:alpha:]]", $data)) {
			$results['yahoo'] = array('0', $path);   
		} else {
		    $data = preg_replace('/[^0-9]/','',$data);
			$results['yahoo'] = array($data, $path);  
		}
	}
	// check AltaVista
	$path ="http://www.altavista.com/web/results?q=linkdomain%3A".$domain."&kgs=1&kls=0&stq=10";
	{
		$data = strtolower(strip_tags(file_get_contents("$path")));
		$data = substr($data, strpos($data, "altavista found")+15, strlen($data));
		$data = trim(substr($data, 0, strpos($data, "results"))); 
		if(eregi("[[:alpha:]]", $data)) {

			$results['altavista'] = array('0', $path);
		} else {
		    $data = preg_replace('/[^0-9]/','',$data); 
			$results['altavista'] = array($data, $path);
		}
	}
	
//$enginetotals = number_format($total);
//ENGINE TOTALS
$yahoototal = $results['yahoo'][0];
$altavistatotal = $results['altavista'][0];
$total1 = number_format($yahoototal + $altavistatotal);
$yahooresults = number_format($yahoototal);
$altavistaresults = number_format($altavistatotal);
//ENGINE DOMAIN
$altavistadomain = $results['altavista'][1];
$yahoodomain = $results['yahoo'][1];
//TOTALS


//====PRINT RESULTS
echo ("Domain = <b>$domain</b> has<br>");
echo ("<a href=\"$yahoodomain\">$yahooresults</a> Yahoo! Results<br>");
echo ("<a href=\"$altavistadomain\">$altavistaresults</a> Altavista Results<br>");
echo ("$total1 Total Search Engine results.<br><br>");

	}
//echo ("$enginetotals Total Search Engine Results<br>");

// SCRIPT END TIME ------------------------------------------------------
$timeparts = explode(' ', microtime()); $endtime = $timeparts[1].substr($timeparts[0],1); $execution = round(($endtime - $starttime), 2);
// ----------------------------------------------------------------------
?>
Script executed in <?=$execution?> seconds.<br>
Any other suggestions?

Thanks again.
Ron

Posted: Thu Feb 03, 2005 1:04 am
by feyd

Code:

preg_match('#[a-z]#i', $data);
is what your eregi calls do.
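
As a rough sketch of how that check might slot into the script: the `$samples` values below are hypothetical examples of what `$data` could hold after the substr() calls, not real search engine output. preg_match() returns 1 when the pattern matches and 0 when it does not, so it can go straight into the existing if.

```php
<?php
// Hypothetical samples of what $data might contain after trimming.
$samples = array("10,100,000", "no results", "");

$counts = array();
foreach ($samples as $data) {
    if (preg_match('#[a-z]#i', $data)) {
        // Letters present: the page did not yield a number, so treat it as zero.
        $counts[] = 0;
    } else {
        // Only digits and separators: strip the separators and convert.
        $counts[] = (int) preg_replace('/[^0-9]/', '', $data);
    }
}

print_r($counts);
```

An empty string contains no letters, so it falls through to the else branch and converts to 0 as well.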