
Need some advice on improving the efficiency of this code

Posted: Sat Sep 29, 2007 7:29 pm
by legend986
I've started a new thread because this question is completely different. I have written a script that reads an IP address from a database, fetches its AS number, and stores that number back in the corresponding row. If you have the patience, you can have a look at my code; otherwise you can jump straight to my question at the end of my post. Here's the code:

<?php
include "conf_global.php"; // expected to define $url and $table_main and to open the MySQL connection

// One cURL handle is reused for every lookup.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

// Fetch only the rows whose AS number has not been filled in yet.
$sql_ip = "SELECT * FROM ".$table_main." WHERE asn=0";
echo $sql_ip."<br>";
$result_ip = mysql_query($sql_ip);

while ($row_ip = mysql_fetch_assoc($result_ip)) {
	curl_setopt($ch, CURLOPT_POSTFIELDS,
	            "action=do_whois&family=ipv4&method_whois=whois&bulk_paste=".urlencode($row_ip['ip'])."&submit_paste=Submit");
	echo "<br>".$row_ip['ip'];
	$result = curl_exec($ch);

	// Match "<ASN>   | <IP>" in the response; the capture group holds the AS number.
	$pattern = "/([0-9]{1,9})\s{3,4}[|]\s[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/";
	if ($result === false || !preg_match($pattern, $result, $req_info)) {
		continue; // request failed or no AS number found; the row keeps asn=0
	}
	$asn = (int) $req_info[1];

	echo "<br>".$asn;
	$sql_asn = "UPDATE ".$table_main." SET asn = '".$asn."' WHERE ip='".mysql_real_escape_string($row_ip['ip'])."'";
	echo $sql_asn;
	$result_asn = mysql_query($sql_asn);
}

curl_close($ch);
?>
Workflow:

Every time I execute this script, it checks the database for rows with an empty AS number, fetches those IP addresses, and works on them. The script works fine for a small database, but what should I do with a large one? In my other post it was suggested that I look into pagination, but I don't know how to relate the concepts.

Basically the question is: how would I execute a limited number of queries, then Meta Refresh the page and continue from where I left off the previous time? Please advise...
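One way to picture the "limited set, then continue" idea: because the SELECT only picks rows where asn is still 0, adding a LIMIT makes every run pick up where the previous one stopped, with no extra bookkeeping. Below is a minimal sketch in Python rather than PHP, using an in-memory SQLite table as a stand-in for the real one. The table name `main`, the batch size, and `fake_lookup` are all made up for illustration; the real lookup would be the whois request.

```python
import sqlite3

BATCH_SIZE = 2  # hypothetical; choose what fits inside the script time limit


def fake_lookup(ip):
    """Stand-in for the whois request; returns a made-up, non-zero AS number."""
    return 64512 + int(ip.rsplit(".", 1)[1])


def process_batch(conn, lookup, batch_size=BATCH_SIZE):
    """Fill in asn for at most batch_size rows; return how many rows were handled."""
    rows = conn.execute(
        "SELECT id, ip FROM main WHERE asn = 0 ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for row_id, ip in rows:
        conn.execute("UPDATE main SET asn = ? WHERE id = ?", (lookup(ip), row_id))
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main (id INTEGER PRIMARY KEY, ip TEXT, asn INTEGER DEFAULT 0)")
conn.executemany(
    "INSERT INTO main (ip) VALUES (?)",
    [("192.0.2.%d" % n,) for n in range(1, 6)],
)

# Each call plays the role of one page load; a return value of 0 means "stop refreshing".
batches = 0
while process_batch(conn, fake_lookup) > 0:
    batches += 1
```

In the PHP version the same effect would come from appending a LIMIT clause to the SELECT and only emitting the Meta Refresh while rows were still found. One caveat: a row whose lookup keeps failing stays at asn = 0 and would be selected again on every run, so failed rows probably need some marker of their own.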

Posted: Sat Sep 29, 2007 7:52 pm
by Begby
I would start by seeing if you can do this in a single query. By the way, what the heck is an AS number? Show some example records before and after they are updated.

Posted: Sat Sep 29, 2007 7:58 pm
by legend986
The program works for a single query or, for that matter, any number of queries that can be executed before the script times out. An AS number is an Autonomous System Number: to explain in simple terms, it can be thought of as a number assigned to a group of networks controlled by a single ISP. It looks like this:

Before Update:

id IP AS
1 x.x.x.x 0

After updating:

id IP AS
1 x.x.x.x 75646

Hope I was able to explain. To keep it simple, please think of both the IP and the AS number as plain numbers. At first there's just one number, and after updating we get two numbers...

Posted: Sun Sep 30, 2007 4:18 am
by volka

Posted: Sun Sep 30, 2007 1:31 pm
by legend986
Nobody? I've thought of something:

Run php script
After 25 seconds, refresh using Meta-Refresh
Restart Php script

But I don't know how well this would work for a large database...

Posted: Sun Sep 30, 2007 9:49 pm
by Begby
Hrmmm.... The bottleneck here appears to be the curl connection, for every record you are doing a curl connection. The wait time for a response could become considerable. The update for every record is also a bottleneck but not as much.
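If the per-record request is the slow part, one thing that might help is sending several IPs in one request. The form field in the posted code is called `bulk_paste`, which hints that the service may accept more than one address at a time, but that is only a guess, as is the one-IP-per-line format used here. A rough sketch of the grouping and of parsing a multi-line reply, written in Python with a made-up sample response:

```python
import re

# Same "ASN   | IP" shape that the PHP pattern in the first post looks for.
LINE_RE = re.compile(r"(\d{1,9})\s+\|\s+(\d{1,3}(?:\.\d{1,3}){3})")


def chunks(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_paste(ips):
    """Join one group of IPs into a single bulk_paste value, one per line (assumed format)."""
    return "\n".join(ips)


def parse_bulk(response_text):
    """Map each IP found in a response to its AS number."""
    return {ip: int(asn) for asn, ip in LINE_RE.findall(response_text)}


ips = ["192.0.2.1", "192.0.2.2", "198.51.100.7"]
groups = chunks(ips, 2)

# Made-up response text, for illustration only.
sample = "64500   | 192.0.2.1\n64500   | 192.0.2.2\n64501   | 198.51.100.7\n"
asn_by_ip = parse_bulk(sample)
```

With groups of, say, 50 addresses, the number of round trips drops by roughly that factor, provided the service really does handle bulk input this way.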

Would it be possible to update the AS when the record is added? Then you wouldn't have to do them in batch.

What about writing a program in java, python, or C to do this instead? That makes more sense than having PHP do it, and you could also view the update progress in real time.

Posted: Sun Sep 30, 2007 11:00 pm
by legend986
Begby wrote: Hrmmm.... The bottleneck here appears to be the curl connection, for every record you are doing a curl connection. The wait time for a response could become considerable. The update for every record is also a bottleneck but not as much.
Yeah. The curl connection is a bottleneck. Well, in order to get an AS number, I first need to post an IP address to a website, which gives me a response that I have to record. So I don't see a way of avoiding it, though I may be wrong.
Begby wrote: Would it be possible to update the AS when the record is added? Then you wouldn't have to do them in batch.
In that case, where would I fetch an IP from in the first place? If you meant files, someone actually told me they're not very efficient for handling large amounts of data.
Begby wrote: What about writing a program in java, python, or C to do this instead? That makes more sense than having PHP do it, and you could also view the update progress in real time.
Could you elaborate, please? I'm not into Java. I know some Python, but not very much, and I'm not sure how this would be done in Python...
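For what it's worth, here is a rough idea of what the Python route could look like, as a sketch only. It posts the same form fields as the PHP script, pulls the AS number out with the same pattern, and prints progress line by line when run from a command line. The function names are invented for this example, and the whois URL is left as a parameter because it is not shown in the thread.

```python
import re
import urllib.parse
import urllib.request

# Same pattern as the PHP script: "<ASN>   | <IP>", capturing the AS number.
ASN_RE = re.compile(r"(\d{1,9})\s{3,4}\|\s\d{1,3}(?:\.\d{1,3}){3}")


def build_post_body(ip):
    """Encode the same form fields the PHP script sends."""
    return urllib.parse.urlencode({
        "action": "do_whois",
        "family": "ipv4",
        "method_whois": "whois",
        "bulk_paste": ip,
        "submit_paste": "Submit",
    }).encode("ascii")


def extract_asn(response_text):
    """Return the AS number from a response, or None if the pattern is absent."""
    match = ASN_RE.search(response_text)
    return int(match.group(1)) if match else None


def fetch_asn(url, ip, timeout=10):
    """POST one IP to the whois form at `url`; return its AS number or None."""
    request = urllib.request.Request(url, data=build_post_body(ip))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_asn(response.read().decode("utf-8", "replace"))


def run(ips, lookup):
    """Look up every IP, printing progress as it goes; return {ip: asn}."""
    results = {}
    for done, ip in enumerate(ips, start=1):
        results[ip] = lookup(ip)
        print("%d/%d %s -> %s" % (done, len(ips), ip, results[ip]))
    return results
```

It would be driven by something like `run(ips, lambda ip: fetch_asn(whois_url, ip))`, with `ips` read from the database and `whois_url` set to whatever address the PHP script uses. Since the script is not tied to a page load, there is no need for the Meta Refresh trick.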