Page 1 of 1

Contact Database Cleaner

Posted: Tue Oct 27, 2009 4:12 pm
by tymlls05
I have just finished tweaking a contact/mailing list database to fit my needs. At the moment I am bulk uploading contacts through phpmyadmin, using csv files. Excell likes to occasionally overlook quotes, messing up the import and causing fields to be assigned to the wrong columns.

So, for one, is there a more efficient way of importing these csv's into mysql database so I don't have to do in step two below? I don't really feel that my php "skill" is quite ready to do a custom importer using csv.

Two, I wrote a script to help me clean up and correct some of the failed imports and delete rows that were just completely corrupted.

Code: Select all

<?php
    
        //REMOVES BLANKS
    $conn = mysql_connect('localhost', 'uname', 'password') or die ('Error connecting to mysql');
    mysql_select_db('MYDB');
    mysql_query("DELETE FROM contacts WHERE lastname='' AND firstname=''");
    mysql_query("DELETE FROM contacts WHERE `printed`!='0' AND `printed`!='1'");
    mysql_close($conn);
        //END REMOVE BLANKS
    
        //Trade Contact To ADMIN If Blank
    $conn = mysql_connect('localhost', 'uname', 'password') or die ('Error connecting to mysql');
    mysql_select_db('MYDB');
    mysql_query("update `contacts` set `user_group` = 'ADMIN' where `user_group`=''");
    mysql_close($conn);
        //End Contact Trader
    
        //REMOVES DUPLICATES
                /*--This only removes duplicates under each owner. The contact can exist on more than one account but not under the same user*/
    $conn = mysql_connect('localhost', 'uname', 'password') or die ('Error connecting to mysql');
    mysql_select_db('MYDB');
    $query="SELECT lastname,address,user_group FROM contacts WHERE user_group";
    $getuser=mysql_query($query);
    echo mysql_error();
        while($row = mysql_fetch_array($getuser, MYSQL_BOTH)){
            $owner=$row['user_group'];
            $address=$row['address'];
            $lastname=$row['lastname'];
            echo 'We are searching for duplicates in '.$owner.'\'s group.<p></p><p></p>';
            $query1="SELECT lastname,address,user_group FROM contacts WHERE user_group = '".$owner."'";
            $result2=mysql_query($query1);
            $count = mysql_num_rows($result2) - 1;
            while($match = mysql_fetch_array($result2, MYSQL_BOTH)) {
                if ($lastname==$match['lastname']&&$address==$match['address']){
                    mysql_query("DELETE FROM contacts WHERE lastname='".$match['lastname']."' AND address='".$match['address']."' AND user_group='".$match['user_group']."' LIMIT $count");
                    echo '<p>Removed '.$match['lastname'] .' @ '.$match['address'].', belonging to '.$match['user_group'];
                }
            }
        }       
    mysql_close($conn); 
    echo mysql_error();
      //End Duplicate Remover
    ?>

The code above makes a valiant effort to process the few thousand contacts I have imported, but really bogs down the server looping through each row. Do you guys have anything more efficient/optimized I can implement.

Re: Contact Database Cleaner

Posted: Tue Oct 27, 2009 4:19 pm
by Eran
The problem is the the nested query loop. It's very inefficient - you should be able to find all duplicates with one query. What are the conditions for a duplicate? same last name and address belonging to the same user group?

Re: Contact Database Cleaner

Posted: Tue Oct 27, 2009 11:33 pm
by tymlls05
yes. i'm not too familiar with sql conditionals and all of its sorting capabilities. since not all contact details are provided the the same way, I figured my query was the closest I could get to finding the duplicates. I don't know why I didn't try to put it into one query. maybe because I tried to imitate another similar script that nested in order to accomplish the same effect.

but yes, that is how I am attempting to spot the duplicates. last name, address, under the same user group