messy LOAD DATA INFILE query

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

yarons
Forum Newbie
Posts: 24
Joined: Mon Sep 11, 2006 7:25 am

messy LOAD DATA INFILE query

Post by yarons »

Hi all,

Ok, I've been struggling with this for hours now so I thought I may find my answer here.

What I am trying to do is to load a CSV file into a table.
Sounds simple enough but here are my problems:
My file is full of rows looking like this:
20060711~ ~GAFA ~ ~1000002B ~ ~2~ ~ 2078385~ ~20051220~ ~20060320~ ~99991231~ ~20060320~ ~0~ ~5000000.0~ ~.31~ ~0.0~ ~0.0~ ~0.0~ ~3875.0~ ~LALA~ ~3875.0~ ~0.0~ ~0~ ~INT~ ~0.0~ ~0.0~ ~TEST
Each iteam with surrounded by ~ is a different value (the first and last value are missing 1 ~). However, not all values should be included. Fields number 1, 5, 7 and 12 should be omitted and not entered to the table.

Any ideas?

Thanks
GM
Forum Contributor
Posts: 365
Joined: Wed Apr 26, 2006 4:19 am
Location: Italy

Post by GM »

*cough* excel *cough* :wink:
yarons
Forum Newbie
Posts: 24
Joined: Mon Sep 11, 2006 7:25 am

LOL

Post by yarons »

yes i am aware of excel ;)

The thing is, this is an automated process so I get the file in real time and need to store it.
Must be possible with SQL, right?
GM
Forum Contributor
Posts: 365
Joined: Wed Apr 26, 2006 4:19 am
Location: Italy

Post by GM »

Ah, sorry, I thought it was a one time operation :oops:

A quick scan of the manual (p. 703) gives this:
MySQL Manual wrote: You can also discard an input value by assigning it to a user variable and not assigning the variable
to a table column:

LOAD DATA INFILE 'file.txt'
INTO TABLE t1
(column1, @dummy, column2, @dummy, column3);
Could this be useful to you?
yarons
Forum Newbie
Posts: 24
Joined: Mon Sep 11, 2006 7:25 am

hmmm

Post by yarons »

You can also discard an input value by assigning it to a user variable and not assigning the variable
to a table column:

LOAD DATA INFILE 'file.txt'
INTO TABLE t1
(column1, @dummy, column2, @dummy, column3);
I am not too sure how to use this. What I need is to ignore a few columns and get rid of the extra whitespaces and ~'s.
I know you can specify the delimiter in LOAD DATA INFILE but this appears to be more messy as there are 2 ~'s plus spaces.

I didn't expect to struggle with this all day but I am. still no real progress....
GM
Forum Contributor
Posts: 365
Joined: Wed Apr 26, 2006 4:19 am
Location: Italy

Post by GM »

Can you try:

Code: Select all

LOAD DATA INFILE 'C:\\input.txt' INTO TABLE table_name
FIELDS TERMINATED BY '~ ~' 
(@dummy, col1, col2, col3, @dummy, col4, @dummy, col5, col6, col7, col8, @dummy, col9, col10, col11, col12, col13, col14, col15, col16, col17, col18, col19, col20);
???

EDIT - bad syntax - sorry... changed query
Last edited by GM on Tue Sep 12, 2006 11:33 am, edited 1 time in total.
GM
Forum Contributor
Posts: 365
Joined: Wed Apr 26, 2006 4:19 am
Location: Italy

Post by GM »

I've managed to get it to read in the row that you gave as an example, using the (improved) query above.
yarons
Forum Newbie
Posts: 24
Joined: Mon Sep 11, 2006 7:25 am

thanks

Post by yarons »

I have 2 problems though.

1. I use MySQL 4 and @dummy is not supported. my workaround is to get all the columns in and then drop the ones I don't need. Any better way to achieve that with MySQL 4?

2.

Code: Select all

FIELDS TERMINATED BY '~ ~'
doesn't work. I get all nulls. I tried this

Code: Select all

FIELDS ENCLOSED BY '~'
which works almost perfectly only I get on the last column 2 values (the first one and the last one). Can't seem to find a workaround for that.

Thanks
User avatar
onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Re: thanks

Post by onion2k »

yarons wrote:1. I use MySQL 4 and @dummy is not supported.
What makes you think that?
yarons
Forum Newbie
Posts: 24
Joined: Mon Sep 11, 2006 7:25 am

Re: thanks

Post by yarons »

onion2k wrote:
yarons wrote:1. I use MySQL 4 and @dummy is not supported.
What makes you think that?
well, I tried the query with the @dummy fields and got a syntax error. When I took them out the same query worked fine.
I also didn't find any reference to @dummy in MySQL 4 manual. Am I wrong in thinking that?
User avatar
onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Re: thanks

Post by onion2k »

yarons wrote:
onion2k wrote:
yarons wrote:1. I use MySQL 4 and @dummy is not supported.
What makes you think that?
well, I tried the query with the @dummy fields and got a syntax error. When I took them out the same query worked fine.
I also didn't find any reference to @dummy in MySQL 4 manual. Am I wrong in thinking that?
@dummy is an SQL variable. MySQL 4 definitely supports them in most operations (selects, updates, inserts, etc). It's possible that it doesn't like them in a load, but highly unlikely. I reckon they should work fine.
yarons
Forum Newbie
Posts: 24
Joined: Mon Sep 11, 2006 7:25 am

Re: thanks

Post by yarons »

onion2k wrote:
yarons wrote:
onion2k wrote: What makes you think that?
well, I tried the query with the @dummy fields and got a syntax error. When I took them out the same query worked fine.
I also didn't find any reference to @dummy in MySQL 4 manual. Am I wrong in thinking that?
@dummy is an SQL variable. MySQL 4 definitely supports them in most operations (selects, updates, inserts, etc). It's possible that it doesn't like them in a load, but highly unlikely. I reckon they should work fine.
Just tried it again. still no joy.
If you are right then I'm doing something terribly wrong here.
I ran the exact same query as GM suggested. Didn't work with @dummy but worked without.

Can you see what I'm doing wrong here?
GM
Forum Contributor
Posts: 365
Joined: Wed Apr 26, 2006 4:19 am
Location: Italy

Post by GM »

Hi again,

I've had a look in the MySQL 4 manual for LOAD DATA INFILE, and it seems to be quite scarse compared to the MySQL 5 manual. So I can't say with 100% certainty that MySQL 4 does support assigning the input to variables. Unfortunately, I'm without a MySQL 4 environment.

Do you have the option to clean the file up in some other way? (At it's source would be best, if possible).

Could you, for example, parse it into a PHP script and clean it there, and then send it to the database? You'd still be doing it on the fly, but it'd be a little bit of extra work.
yarons
Forum Newbie
Posts: 24
Joined: Mon Sep 11, 2006 7:25 am

Post by yarons »

GM wrote:Hi again,

I've had a look in the MySQL 4 manual for LOAD DATA INFILE, and it seems to be quite scarse compared to the MySQL 5 manual. So I can't say with 100% certainty that MySQL 4 does support assigning the input to variables. Unfortunately, I'm without a MySQL 4 environment.

Do you have the option to clean the file up in some other way? (At it's source would be best, if possible).

Could you, for example, parse it into a PHP script and clean it there, and then send it to the database? You'd still be doing it on the fly, but it'd be a little bit of extra work.
This is possible but it won't be a great idea as I'm dealing with very large files here (~2G).
I have the possiblity to upgrade to MySQL 5. Will this solve both my problems?

Thanks,
GM
Forum Contributor
Posts: 365
Joined: Wed Apr 26, 2006 4:19 am
Location: Italy

Post by GM »

I can't make any guarantees, but for me the query I posted seemed to work in MySQL 5.0.2 (albeit with a single line of input).

Upgrading may affect other databases you've got, so proceed with caution if you choose this route.

Maybe someone else out there will be good enough to run a couple of tests in a MySQL 4 environment? As onion2k says, I find it strange that it isn't supported.

What is the exact error that you get when you try to run the query I posted?
Post Reply