all links from website

shivam0101
Forum Contributor
Posts: 197
Joined: Sat Jun 09, 2007 12:09 am

Post by shivam0101 »

Hello,

I am trying to collect all the links from a website. Below is the code I tried. It fails with "Allowed memory size of 8388608 bytes exhausted". If anything in the code needs to be changed, please let me know.

Code: Select all

 
 <?php
$first_link = mysql_query("SELECT * FROM urls WHERE url='http://localhost/mysite/'");
if(mysql_num_rows($first_link) == 0)
{
    mysql_query("INSERT INTO urls SET url='http://localhost/mysite/'");
}
 
callMain();
 
 
 
function callMain()
{
    usleep(100000); 
    
    $get_links_res = mysql_query("SELECT * FROM urls WHERE depth < 2");
    if(mysql_num_rows($get_links_res) >0)
    {
        while($get_links_ret = mysql_fetch_assoc($get_links_res))
        {
            $url = $get_links_ret['url'];
            $url_id = $get_links_ret['url_id'];
            
                $fl = @fopen($url, "r");
                if ($fl) {
                    while ($buffer = @fgets($fl, 4096)) {
                        $contents .= $buffer;
                    }
                    // Close the handle only if fopen() succeeded
                    fclose($fl);
                } else {
                    echo 'error in reading file <br/>';
                }
            
            
            preg_match_all("/href\s*=\s*[\'\"]?([+:%\/\?~=&;\\\(\),._a-zA-Z0-9-]*)(#[.a-zA-Z0-9-]*)?[\'\" ]?(\s*rel\s*=\s*[\'\"]?(nofollow)[\'\"]?)?/i", $contents, $regs, PREG_SET_ORDER);
            
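            // Escape values before putting them into SQL; a URL containing a quote would break the query
            $safe_url = mysql_real_escape_string($url);
            mysql_query("UPDATE urls SET depth = depth+1 WHERE url='$safe_url'");
            foreach($regs as $val)
            {
                $link = mysql_real_escape_string($val[1]);
                $check_outer_links_res = mysql_query("SELECT * FROM urls WHERE url = '$link'");
                if(mysql_num_rows($check_outer_links_res)== 0)
                {
                    echo "INSERT INTO urls SET url = '$link' <br/>";
                    mysql_query("INSERT INTO urls SET url = '$link'");
                }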
                
            }
        }
 
        callMain();
    }
}
 
 
Thanks.
novice4eva
Forum Contributor
Posts: 327
Joined: Thu Mar 29, 2007 3:48 am
Location: Nepal

Re: all links from website

Post by novice4eva »

You are calling callMain() within callMain(); isn't that going to loop forever?

EDIT: I would reset $contents at the top of the loop:

Code: Select all

 
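while($get_links_ret = mysql_fetch_assoc($get_links_res))
{
    // Start each URL with an empty buffer so pages do not pile up in memory
    $contents = '';
    // ... rest of the loop body as before ...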
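
To show why the reset matters, here is a minimal sketch that runs without a database or network. The extractLinks() helper, its simplified pattern, and the $pages array are made up for illustration; they stand in for the preg_match_all() call and the fopen()/fgets() loop in the original script.

```php
<?php
// Hypothetical helper: returns the href targets found in one page's HTML.
// The pattern is a simplified stand-in for the one in the original post.
function extractLinks($html)
{
    preg_match_all('/href\s*=\s*["\']([^"\'#\s>]+)/i', $html, $regs, PREG_SET_ORDER);
    $links = array();
    foreach ($regs as $val) {
        $links[] = $val[1];
    }
    return $links;
}

// Stand-in pages; in the real script these would be read with fopen()/fgets().
$pages = array(
    '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    '<a href="/c.html">C</a>',
);

$found = array();
foreach ($pages as $page) {
    $contents = '';       // reset for every page
    $contents .= $page;   // mimics the fgets() append loop
    $found[] = extractLinks($contents);
}

print_r($found);
```

With the reset in place, the second entry of $found holds only /c.html. Without it, $contents would still contain the first page, so the second entry would list all three links again and the buffer would keep growing with every URL.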
 
shivam0101
Forum Contributor
Posts: 197
Joined: Sat Jun 09, 2007 12:09 am

Re: all links from website

Post by shivam0101 »

I am calling the function recursively. It should stop when there are no more links to be added and the depth has reached 2.
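
For comparison, the same stopping rule can be sketched with a queue instead of recursion. This is only an illustration under assumptions: the $site array is a made-up stand-in for fetching pages and for the urls table, and "depth" here means link distance from the start page, which is not quite what the posted code stores in its depth column.

```php
<?php
// Hypothetical in-memory "site": URL => links found on that page.
// It stands in for the fopen() fetch and the urls table.
$site = array(
    '/'  => array('/a', '/b'),
    '/a' => array('/b', '/c'),
    '/b' => array('/'),
    '/c' => array('/d'),
    '/d' => array(),
);

// Breadth-first crawl with a queue instead of recursion.
function crawl(array $site, $start, $maxDepth)
{
    $depthOf = array($start => 0);   // every URL seen so far and its depth
    $queue = array($start);

    while ($queue) {
        $url = array_shift($queue);
        if ($depthOf[$url] >= $maxDepth) {
            continue;                // recorded, but its links are not followed
        }
        $links = isset($site[$url]) ? $site[$url] : array();
        foreach ($links as $link) {
            if (!isset($depthOf[$link])) {   // skip URLs already recorded
                $depthOf[$link] = $depthOf[$url] + 1;
                $queue[] = $link;
            }
        }
    }
    return $depthOf;
}

$result = crawl($site, '/', 2);
print_r($result);
```

Here $result contains '/', '/a', '/b' and '/c' with depths 0, 1, 1 and 2; '/d' is never added because '/c' is already at the depth limit. The loop ends when the queue is empty, so the call stack does not grow with the number of passes.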
novice4eva
Forum Contributor
Posts: 327
Joined: Thu Mar 29, 2007 3:48 am
Location: Nepal

Re: all links from website

Post by novice4eva »

Sorry about the recursion thing; I thought the call was outside the if condition, my bad. But did you try the $contents change I posted earlier?
shivam0101
Forum Contributor
Posts: 197
Joined: Sat Jun 09, 2007 12:09 am

Re: all links from website

Post by shivam0101 »

I tried it. It gives the same error.
novice4eva
Forum Contributor
Posts: 327
Joined: Thu Mar 29, 2007 3:48 am
Location: Nepal

Re: all links from website

Post by novice4eva »

OK, then let's try enabling error reporting and tracking which variable is being fed lots of data. You might already know how to enable it, but here it is anyway:

Code: Select all

 
ini_set('display_errors', 1);
ini_set('error_reporting', E_ALL);
 
Or, since your default memory_limit is 8M, we could work around the problem by raising it:

Code: Select all

 
ini_set("memory_limit","12M");//IF NOT SATISFIED INCREASE IT EVEN MORE
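
To track which variable is growing, one option is to print memory_get_usage() next to the length of the suspect buffer on each pass. The sketch below is only an illustration: the loop and the str_repeat() call are stand-ins for appending fetched pages to $contents in the real script.

```php
<?php
$contents = '';
$before = memory_get_usage();

echo "memory_limit: " . ini_get('memory_limit') . "\n";

for ($i = 1; $i <= 3; $i++) {
    // Stand-in for appending one fetched page (1 MB here) to the buffer.
    $contents .= str_repeat('x', 1024 * 1024);

    echo "pass $i: buffer = " . strlen($contents) . " bytes, "
       . "memory in use = " . memory_get_usage() . " bytes\n";
}

$added = memory_get_usage() - $before;
echo "memory added by the loop: $added bytes\n";
echo "peak usage: " . memory_get_peak_usage() . " bytes\n";
```

If the buffer length climbs on every pass and the memory in use climbs with it, that variable is the one to reset or shrink; raising memory_limit only postpones the error.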
 