Insert into a table only if the data is not already there.
Posted: Tue Sep 16, 2003 5:58 pm
by jkmcgrath
Hi,
I want to ensure that once a file has been parsed and its data inserted into the DB table, parsing the same file again does not insert the data a second time.
I am unsure how to do this. Right now it just keeps inserting the same data over and over.
Any help welcomed.
Thanks
John
Posted: Tue Sep 16, 2003 6:15 pm
by JAM
A bit unsure of what you mean... What does the file contain, how do you parse it, and what do you use it for?
Posted: Tue Sep 16, 2003 6:20 pm
by jkmcgrath
I have stat files from NFL games that I parse and insert into a DB.
What I want to ensure is that once a file has been parsed and inserted, accidentally running the same file again will not insert the data a second time and create doubled stats for that file.
Posted: Tue Sep 16, 2003 6:24 pm
by JAM
There are a couple of ways, depending on what you use the files for before/after the insertion into the DB:
1. Delete the file after processing.
2. Move the file after processing.
3. Open the file and write a marker at the beginning. If the parser tries to read that file again and finds the marker, abort.
4. Read the file's content and read the last inserted data from the DB; compare the two and skip the file if they match.
Posted: Tue Sep 16, 2003 6:53 pm
by McGruff
Another option: create a unique index in the table.
Venue and date columns might be a good choice - unless there could be two matches in the same stadium on one day?
Posted: Tue Sep 16, 2003 7:00 pm
by JAM
Duh! Very true. As for two matches at the same stadium on the same day... that could be worth considering.
The game's ending time, stored as a DATETIME (or at least something that stores the TIME), could be used. Two matches would never finish at exactly the same time...

Posted: Tue Sep 16, 2003 7:15 pm
by jkmcgrath
Well, there are 570 records per file, so I don't think I could use a unique index.
I was looking through the file and noticed that the file's name appears in the file itself, so I could use it as an id of sorts: parse that line first, compare it to a field in the DB, and if it's already there skip the file; if not, parse it into the DB.
What do you think?
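That filename check could look something like this. It's only a sketch (using SQLite for illustration, and an invented `processed_files` table; the actual table and column names would be whatever the schema uses):

```python
import sqlite3

# A small bookkeeping table: one row per file already imported.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_files (filename VARCHAR(255) PRIMARY KEY)")

def already_processed(conn, filename):
    """Return True if this file's name has been recorded already."""
    row = conn.execute(
        "SELECT 1 FROM processed_files WHERE filename = ?", (filename,)
    ).fetchone()
    return row is not None

def mark_processed(conn, filename):
    """Record the file's name after a successful import."""
    conn.execute("INSERT INTO processed_files (filename) VALUES (?)", (filename,))
```

The importer would call `already_processed` first and skip the whole file on a hit, then call `mark_processed` after the inserts succeed.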
Posted: Tue Sep 16, 2003 7:35 pm
by McGruff
What columns do you have in the table?
Posted: Tue Sep 16, 2003 9:22 pm
by jkmcgrath
I have 35 columns total, with approximately 570 records per week. That column total doesn't include the id column.
Posted: Tue Sep 16, 2003 9:56 pm
by McGruff
Are you using LOAD DATA INFILE to update?
Posted: Tue Sep 16, 2003 10:14 pm
by jkmcgrath
No, I am just using INSERT.
Posted: Tue Sep 16, 2003 10:56 pm
by McGruff
OK. I was going to check whether LOAD DATA complains if unique indexes are breached - I expect it would.
Anyway, unique indexes will definitely offer a solution with INSERT.
Work out the minimum columns required to uniquely specify a row. I don't know what they will be but, for comparison, think of a books table. An author can have many books, so author isn't unique. Several authors could have written a book with the same title, so title isn't unique. A two-column index on both author and title almost does it, but then the author might bring out a revised version with the same title a few years down the line. Finally, a three-column index on author, title and publication date locks it down: a single book can only appear once in the table.
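The books example above can be run as a short sketch. SQLite is used here just for illustration (the multi-column UNIQUE syntax is essentially the same in MySQL), and the table and values are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE books (
        id       INTEGER PRIMARY KEY,
        author   TEXT NOT NULL,
        title    TEXT NOT NULL,
        pub_date TEXT NOT NULL,
        UNIQUE (author, title, pub_date)   -- the three-column unique index
    )
""")

row = ("A. Author", "Some Title", "1999-01-01")
conn.execute("INSERT INTO books (author, title, pub_date) VALUES (?, ?, ?)", row)

duplicate_rejected = False
try:
    # Inserting the exact same book again violates the unique index.
    conn.execute("INSERT INTO books (author, title, pub_date) VALUES (?, ?, ?)", row)
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The same book can only ever appear once; a re-run of the importer fails loudly instead of silently doubling the data.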
Posted: Wed Sep 17, 2003 5:59 am
by JAM
jkmcgrath wrote:
Well, there are 570 records per file, so I don't think I could use a unique index.
I was looking through the file and noticed that the file's name appears in the file itself, so I could use it as an id of sorts: parse that line first, compare it to a field in the DB, and if it's already there skip the file; if not, parse it into the DB.
What do you think?
If you don't go with the unique-index ideas McGruff posted, using the file's name as you described could be a workable solution.
Posted: Wed Sep 17, 2003 7:04 am
by Heavy
If you can't make a UNIQUE index on one column, you can create one across several columns.
I don't know which database engine you are using, but for MySQL here goes:
You can use varchar(n) as the column type for the filenames and put a unique index on that column.
Assume col1 has UNIQUE index defined.
The relevant manual pages are http://www.mysql.com/doc/en/INSERT.html and http://www.mysql.com/doc/en/REPLACE.html

INSERT into tablename set col1 = 1 , col2 = 1;
INSERT into tablename set col1 = 1 , col2 = 2;
INSERT into tablename set col1 = 2 , col2 = 3;

This won't finish the second query; it will generate an error, since you can't insert duplicates into col1.

INSERT IGNORE into tablename set col1 = 1 , col2 = 1;
INSERT IGNORE into tablename set col1 = 1 , col2 = 2;
INSERT IGNORE into tablename set col1 = 2 , col2 = 3;

This won't finish the second query either, but it won't generate an error.

REPLACE into tablename set col1 = 1 , col2 = 1;
REPLACE into tablename set col1 = 1 , col2 = 2;
REPLACE into tablename set col1 = 2 , col2 = 3;

This will finish the second query as well as the first, but it will overwrite the row inserted by the first query. It won't generate an error.
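The same three behaviours can be demonstrated end to end with a short Python sketch. Note it uses SQLite, whose spellings are INSERT OR IGNORE and INSERT OR REPLACE, rather than MySQL's INSERT IGNORE and REPLACE; the semantics shown are the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (col1 INTEGER UNIQUE, col2 INTEGER)")

# Plain INSERT: a duplicate col1 value raises an error.
conn.execute("INSERT INTO tablename (col1, col2) VALUES (1, 1)")
try:
    conn.execute("INSERT INTO tablename (col1, col2) VALUES (1, 2)")
except sqlite3.IntegrityError:
    pass  # duplicate rejected; the row (1, 1) is untouched

# INSERT OR IGNORE (MySQL: INSERT IGNORE): the duplicate is
# silently skipped, no error, and (1, 1) still stands.
conn.execute("INSERT OR IGNORE INTO tablename (col1, col2) VALUES (1, 2)")

# INSERT OR REPLACE (MySQL: REPLACE): the duplicate overwrites
# the existing row, so col1 = 1 now has col2 = 2.
conn.execute("INSERT OR REPLACE INTO tablename (col1, col2) VALUES (1, 2)")
conn.execute("INSERT OR REPLACE INTO tablename (col1, col2) VALUES (2, 3)")
```

For the stats-import problem, the IGNORE flavour is usually what you want: re-running a file becomes a harmless no-op instead of an error or a silent overwrite.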
Posted: Wed Sep 17, 2003 9:35 am
by jkmcgrath
Roger that guys, you all have been a great help.
I am using MySQL (I have MSSQL but try to stay away from MS solutions when I can). Also, I would like it to be DB-independent if possible, so a user could run it on any DB, but that is still in the future. I need to get a base going first.
Thanks Again
John