Page 1 of 1
Need quick help with code
Posted: Mon Jun 21, 2010 4:30 pm
by bigfoot
I have some php code that downloads a specific csv file from a government website, pulls specific sets of data from it and creates a report. The problem is that they started using comma's in the data. For example instead of 1000 they started using 1,000. This is of course throwing off my code as it sees it as two seperate values. Not really sure where to start to fix this. I have no control over the data file.
Thanks
Re: Need quick help with code
Posted: Mon Jun 21, 2010 5:25 pm
by hider
Are the numbers save as a variable?
I.E. is the variable "1,000", or are you trying to search for these numbers within a string?
Re: Need quick help with code
Posted: Mon Jun 21, 2010 6:03 pm
by Sofw_Arch_Dev
If you knew how many columns of data there were going to be and if you knew which columns were strings versus numbers, you may be able to infer when a comma was a separator and where it was numeric punctuation.
A better option is to educate your client on what a CSV file is and have them re-do the data file. Tell them that in the context of a CSV file the comma character is explicitly reserved for the separation of values only, and that using it anywhere else in the data corrupts the data. If they want commas in the numbers you can handle that in the display portion of the code.
Re: Need quick help with code
Posted: Tue Jun 22, 2010 1:15 am
by bigfoot
Thank you for your responses.
Here is a copy of my code that I am working with. It may not be the most elegent as I am a newbie at php but it has gotten the job done well for the last 3 years.
Code: Select all
?php
$filename="petroldata.csv";
$webpage="http://ir.eia.gov/wpsr/table9.csv";
$fp= fopen($webpage,"r") or die ("couldn't open $webpage");
$sp= fopen($filename, "w") or die ("couldn't open $filename");
while (! feof($fp)) {
$line=fgets($fp);
$line=str_replace("\r\n",",",$line);
fputs($sp, $line);
fclose($sp);
$d=file_get_contents($filename);
$da= explode(",",$d);
$c1=round($da [34] - $da [35])
The code goes on to call specific cells and display them on an html page. The rest of it is working correctly
The problem is this. In the last line shown $c1=round($da [34] - $da [35]) this should simply take cell 34 and subtract cell 35 and assign it to the variable c1. This has stopped working because the source csv file now contains comma's in the data for any number over 1000. For example, if the data in cell 5 is 2,340, the code is seeing this comma as a delimiter and 2 seperate cells. Not sure that it matters but all of the data is now surrounded with double quotes as well.
Here is one line of the csv file.
"Crude Oil Production ","Domestic Production","5,503","5,445","5,273","5,145","5,452","5,338"
What should be 8 colums is counting as 14.
Unfortunately since this data is taken from a csv file release weekly by the government, there is no way I can just ask them to modify their file structure.
Re: Need quick help with code
Posted: Tue Jun 22, 2010 3:51 pm
by Sofw_Arch_Dev
Ok, getting the data in better shape is out of the question. No problem but worth inquiring since it nips the problem in the bud. The good thing for you is that all of the values, at least according to what you show, are separate strings with double quotes around them. This makes it easy to detect individual values by simply changing the
delimiter value in your explode statement. Try...
This gives you all the individual values of $d in $da, as you wanted. The only thing you'd need to do on the individual values of $da (really just the first and last ones) is trim() the leading or trailing double quote character. To demonstrate, here was my test:
Code: Select all
$str = '"Crude Oil Production ","Domestic Production","5,503","5,445","5,273","5,145","5,452","5,338"';
echo "string is " . $str . "\n\n";
$tokens = explode( "\",\"", $str );
print_r( $tokens );
The output is:
Code: Select all
string is "Crude Oil Production ","Domestic Production","5,503","5,445","5,273","5,145","5,452","5,338"
Array
(
[0] => "Crude Oil Production
[1] => Domestic Production
[2] => 5,503
[3] => 5,445
[4] => 5,273
[5] => 5,145
[6] => 5,452
[7] => 5,338"
)
As you can see, the first one and last one have an extra double quote that needs to be trimmed off. Easy enough to do.
Re: Need quick help with code
Posted: Tue Jun 22, 2010 4:27 pm
by bigfoot
how do I go about removing the first and last double quote?
Re: Need quick help with code
Posted: Wed Jun 23, 2010 8:11 pm
by Sofw_Arch_Dev
You could do something like this.
Code: Select all
$tokens = explode( "\",\"", $str );
foreach( $tokens as &$token ) {
$token = trim( $token, "\"" );
} // foreach
print_r( $tokens );
Make sure you remember the & in the foreach, otherwise you will be modifying a COPY of the array value, not the array value itself.