I have some php code that downloads a specific csv file from a government website, pulls specific sets of data from it and creates a report. The problem is that they started using comma's in the data. For example instead of 1000 they started using 1,000. This is of course throwing off my code as it sees it as two seperate values. Not really sure where to start to fix this. I have no control over the data file.
Thanks
Need quick help with code
Moderator: General Moderators
Re: Need quick help with code
Are the numbers save as a variable?
I.E. is the variable "1,000", or are you trying to search for these numbers within a string?
I.E. is the variable "1,000", or are you trying to search for these numbers within a string?
- Sofw_Arch_Dev
- Forum Commoner
- Posts: 60
- Joined: Tue Mar 16, 2010 4:06 pm
- Location: San Francisco, California, US
Re: Need quick help with code
If you knew how many columns of data there were going to be and if you knew which columns were strings versus numbers, you may be able to infer when a comma was a separator and where it was numeric punctuation.
A better option is to educate your client on what a CSV file is and have them re-do the data file. Tell them that in the context of a CSV file the comma character is explicitly reserved for the separation of values only, and that using it anywhere else in the data corrupts the data. If they want commas in the numbers you can handle that in the display portion of the code.
A better option is to educate your client on what a CSV file is and have them re-do the data file. Tell them that in the context of a CSV file the comma character is explicitly reserved for the separation of values only, and that using it anywhere else in the data corrupts the data. If they want commas in the numbers you can handle that in the display portion of the code.
Re: Need quick help with code
Thank you for your responses.
Here is a copy of my code that I am working with. It may not be the most elegent as I am a newbie at php but it has gotten the job done well for the last 3 years.
The code goes on to call specific cells and display them on an html page. The rest of it is working correctly
The problem is this. In the last line shown $c1=round($da [34] - $da [35]) this should simply take cell 34 and subtract cell 35 and assign it to the variable c1. This has stopped working because the source csv file now contains comma's in the data for any number over 1000. For example, if the data in cell 5 is 2,340, the code is seeing this comma as a delimiter and 2 seperate cells. Not sure that it matters but all of the data is now surrounded with double quotes as well.
Here is one line of the csv file.
"Crude Oil Production ","Domestic Production","5,503","5,445","5,273","5,145","5,452","5,338"
What should be 8 colums is counting as 14.
Unfortunately since this data is taken from a csv file release weekly by the government, there is no way I can just ask them to modify their file structure.
Here is a copy of my code that I am working with. It may not be the most elegent as I am a newbie at php but it has gotten the job done well for the last 3 years.
Code: Select all
?php
$filename="petroldata.csv";
$webpage="http://ir.eia.gov/wpsr/table9.csv";
$fp= fopen($webpage,"r") or die ("couldn't open $webpage");
$sp= fopen($filename, "w") or die ("couldn't open $filename");
while (! feof($fp)) {
$line=fgets($fp);
$line=str_replace("\r\n",",",$line);
fputs($sp, $line);
fclose($sp);
$d=file_get_contents($filename);
$da= explode(",",$d);
$c1=round($da [34] - $da [35])The problem is this. In the last line shown $c1=round($da [34] - $da [35]) this should simply take cell 34 and subtract cell 35 and assign it to the variable c1. This has stopped working because the source csv file now contains comma's in the data for any number over 1000. For example, if the data in cell 5 is 2,340, the code is seeing this comma as a delimiter and 2 seperate cells. Not sure that it matters but all of the data is now surrounded with double quotes as well.
Here is one line of the csv file.
"Crude Oil Production ","Domestic Production","5,503","5,445","5,273","5,145","5,452","5,338"
What should be 8 colums is counting as 14.
Unfortunately since this data is taken from a csv file release weekly by the government, there is no way I can just ask them to modify their file structure.
- Sofw_Arch_Dev
- Forum Commoner
- Posts: 60
- Joined: Tue Mar 16, 2010 4:06 pm
- Location: San Francisco, California, US
Re: Need quick help with code
Ok, getting the data in better shape is out of the question. No problem but worth inquiring since it nips the problem in the bud. The good thing for you is that all of the values, at least according to what you show, are separate strings with double quotes around them. This makes it easy to detect individual values by simply changing the delimiter value in your explode statement. Try...
This gives you all the individual values of $d in $da, as you wanted. The only thing you'd need to do on the individual values of $da (really just the first and last ones) is trim() the leading or trailing double quote character. To demonstrate, here was my test:
The output is:
As you can see, the first one and last one have an extra double quote that needs to be trimmed off. Easy enough to do.
Code: Select all
$da= explode("\",\"",$d);
Code: Select all
$str = '"Crude Oil Production ","Domestic Production","5,503","5,445","5,273","5,145","5,452","5,338"';
echo "string is " . $str . "\n\n";
$tokens = explode( "\",\"", $str );
print_r( $tokens );
Code: Select all
string is "Crude Oil Production ","Domestic Production","5,503","5,445","5,273","5,145","5,452","5,338"
Array
(
[0] => "Crude Oil Production
[1] => Domestic Production
[2] => 5,503
[3] => 5,445
[4] => 5,273
[5] => 5,145
[6] => 5,452
[7] => 5,338"
)
Re: Need quick help with code
how do I go about removing the first and last double quote?
- Sofw_Arch_Dev
- Forum Commoner
- Posts: 60
- Joined: Tue Mar 16, 2010 4:06 pm
- Location: San Francisco, California, US
Re: Need quick help with code
You could do something like this.
Make sure you remember the & in the foreach, otherwise you will be modifying a COPY of the array value, not the array value itself.
Code: Select all
$tokens = explode( "\",\"", $str );
foreach( $tokens as &$token ) {
$token = trim( $token, "\"" );
} // foreach
print_r( $tokens );