Reading text problem (carriage return character)
Posted: Wed Jul 25, 2007 8:18 am
by coool
Hi
I'm using PHP to read a text file row by row and insert it into my database.
Everything is working fine, BUT I've noticed that about 10 columns contain a return character. That causes a problem because PHP treats these return characters as a new line (new row).
example:
apple 1 fruit
orange 2 fruit
bana
na 1 fruit
___________________
Code:
$output = str_replace("\t", "|", $data);
$values = explode("\n", $output, -1);
foreach ($values as $row)
{
    $col = explode("|", $row);
    //... etc
}
How can I escape or replace the \n inside a column value with a space?
I tried to str_replace every \n with a space,
but that doesn't work: it removes all the \n characters, so there is only one row to read, which contains all the data.
Posted: Wed Jul 25, 2007 8:25 am
by superdezign
You'll need to create some sort of pattern to determine how to separate these, then remove all return characters.
Code:
$fileContents = preg_replace('#(.*?\d.*?)\n#', '$1|', $fileContents);
$data = explode('|', str_replace("\n", '', $fileContents));
Posted: Wed Jul 25, 2007 8:25 am
by Gente
Read the file line by line and check the number of columns in each line. If it's correct, insert the row into the DB; otherwise collect the lines in a temporary array until the size of this array becomes correct.
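A minimal sketch of the approach described above, assuming the file is tab-separated and every complete row has a known number of columns. The function name mergeBrokenRows, its parameters, and the choice of a space as glue are hypothetical and not something from this thread:

```php
<?php
// Hypothetical helper: glue physical lines together until the buffered
// text contains the expected number of tab-separated columns.
function mergeBrokenRows(array $lines, int $expectedCols, string $glue = ' '): array
{
    $rows = [];
    $buffer = null;
    foreach ($lines as $line) {
        $line = rtrim($line, "\r\n");
        // Start a new buffer, or append this line to the unfinished one.
        $buffer = ($buffer === null) ? $line : $buffer . $glue . $line;
        $cols = explode("\t", $buffer);
        if (count($cols) >= $expectedCols) {
            $rows[] = $cols;   // the row is complete, ready for the DB insert
            $buffer = null;
        }
    }
    // Note: an unfinished buffer left over at the end of the input is dropped.
    return $rows;
}

$lines = ["apple\t1\tfruit\n", "orange\t2\tfruit\n", "bana\n", "na\t1\tfruit\n"];
$rows = mergeBrokenRows($lines, 3);
// $rows[2] is ["bana na", "1", "fruit"]
```

Counting columns cannot detect a break inside the last column: such a line already has the full number of columns, so the leftover fragment would be glued onto the start of the next row.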
Posted: Wed Jul 25, 2007 8:28 am
by Gente
One more thing. What is the correct data in your example:
apple 1 fruit
orange 2 fruit
banana 1 fruit
or
apple 1 fruit
orange 2 fruitbana
na 1 fruit
?

Posted: Wed Jul 25, 2007 9:14 am
by coool
superdezign wrote:
You'll need to create some sort of pattern to determine how to separate these, then remove all return characters.
Code:
$fileContents = preg_replace('#(.*?\d.*?)\n#', '$1|', $fileContents);
$data = explode('|', str_replace("\n", '', $fileContents));

Can you explain the code you've written a bit more, please?
For example, what does this mean: #(.*?\d.*?)\n#
And what is the $1| part? Is it a variable? Of what?
Posted: Wed Jul 25, 2007 9:16 am
by coool
Gente wrote:
One more thing. What is the correct data in your example:
apple 1 fruit
orange 2 fruit
banana 1 fruit
or
apple 1 fruit
orange 2 fruitbana
na 1 fruit
?

OH NO
this is it:
row[0] apple 1 fruit
row[1] orange 2 fruit
row[2] bana
row[3] na 1 fruit
Posted: Wed Jul 25, 2007 9:27 am
by superdezign
coool wrote:
Can you explain the code you've written a bit more, please?
For example, what does this mean: #(.*?\d.*?)\n#
And what is the $1| part? Is it a variable? Of what?

That code was written under the assumption that you had a basic understanding of regular expressions.
It says, literally, to match anything up to a numeric character, then that numeric character, then anything up to a newline character, and it captures the entire pattern except for the newline character. The '$1|' means to take the first captured pattern (hence the '1') and a pipe character ('|'), and replace the entire matched expression with the new one (as preg_replace is made to do).
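To see what the capture group and '$1' do, here is a small sketch on a single line of the sample data. The variable names are only for illustration:

```php
<?php
$pattern = '#(.*?\d.*?)\n#';
$line = "apple 1 fruit\n";

// preg_match fills $m: $m[0] is the whole match, $m[1] is the first capture group.
preg_match($pattern, $line, $m);
// $m[0] is "apple 1 fruit\n" (newline included)
// $m[1] is "apple 1 fruit"   (newline excluded, it sits outside the parentheses)

// In the replacement string, $1 stands for the text of group 1,
// so the matched newline is dropped and a pipe is appended instead.
$replaced = preg_replace($pattern, '$1|', $line);
// $replaced is "apple 1 fruit|"
```

Because the replacement is written in single quotes, PHP does not treat $1 as a PHP variable; it is passed as-is to preg_replace, which interprets it as a reference to the first captured group.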
Posted: Wed Jul 25, 2007 9:37 am
by coool
It doesn't solve the problem!
Code:
$data = preg_replace('#(.*?\d.*?)\n#', '$1|-|', $data);
$data = explode('|-|', str_replace("\n", '', $data));
echo "<pre>";
print_r($data);
echo "</pre>";
result:
...
[2] => bana
[3] => na 1 fruit
...
Posted: Wed Jul 25, 2007 9:46 am
by superdezign
Then work with it until it does. The problem is in the regex... I think you need to add the 's' modifier to it. And, if the file doesn't end with a newline, it will miss the last entry.
I suggest you learn regex. It's not as hard as you may think, and once you get it, you'll be set for a lot of difficult issues.
EDIT: I added an 'm' modifier as well, and mixing the two together should give the desired result.
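For reference, a small sketch of what the two modifiers change, using a fragment of the sample data. Note that 'm' only affects the ^ and $ anchors, which the pattern posted earlier does not use, so treat this as an illustration of the modifiers rather than a confirmed fix:

```php
<?php
$text = "bana\nna 1 fruit\n";

// Without 's', the dot does not match a newline.
$dotPlain = preg_match('#bana.na#', $text);    // 0
// With 's' (PCRE_DOTALL), the dot matches a newline too.
$dotAll = preg_match('#bana.na#s', $text);     // 1

// Without 'm', ^ matches only at the very start of the string.
$anchorPlain = preg_match('#^na#', $text);     // 0
// With 'm' (PCRE_MULTILINE), ^ also matches right after each newline.
$anchorMulti = preg_match('#^na#m', $text);    // 1
```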
Posted: Wed Jul 25, 2007 9:52 am
by Gente
coool wrote:
OH NO
this is it:
row[0] apple 1 fruit
row[1] orange 2 fruit
row[2] bana
row[3] na 1 fruit

It seems you didn't understand me.
If the $row you posted is correct, there's nothing to discuss. I just want to draw your attention to the fact that in your example it's easy to see that the wrong break is in line 2. In the real situation, as I understand it, it could be in line 1. So how do you propose to handle that situation?
Posted: Wed Jul 25, 2007 11:17 am
by coool
This doesn't work!
Okay, the last field of every line is a number (not a letter),
so why don't I just replace every \n that comes after a letter with a space?
I'm trying to form the regex.
This doesn't work, but it's close:
Code:
$data = preg_replace("\[a-zA-Z]\\n\$", "\\s", $data);
Any help forming the regex, please?
Posted: Wed Jul 25, 2007 11:41 am
by coool
Imagine that this is the example:
apple 1 fruit 74
orange 2 fruit 213
bana
na 1 fruit 345
So I'm sure that the final \n has a number before it,
but I don't know about the value.
So, assuming that the value contains only letters,
I want to replace any \n or \r found inside that value with a space.
____________
I forgot the delimiters! I just added them, but this still isn't solving the problem:
Code:
$data = preg_replace("/\[a-zA-Z]\\n\$/", "\\s", $data);
Any help forming the regex, please?
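A possible fix for the pattern above, as a sketch under the assumption stated in this post: every complete row ends in a digit, and the broken values contain only letters. Three things differ from the attempt above: the character class is not escaped (\[ matches a literal bracket), the $ anchor is dropped (without the 'm' modifier it only matches at the very end of the string), and the replacement is a real space, since \s only has a meaning inside a pattern. Whether this is what finally solved the problem is not known from the thread.

```php
<?php
// Sample data from the post, assumed to be tab-separated.
$data = "apple\t1\tfruit\t74\n"
      . "orange\t2\tfruit\t213\n"
      . "bana\nna\t1\tfruit\t345\n";

// A letter followed by a line break (\r\n, \r or \n):
// keep the letter ($1) and replace the break with a space.
$fixed = preg_replace('/([a-zA-Z])(?:\r\n|\r|\n)/', '$1 ', $data);

// Line breaks that follow a digit are untouched, so they still separate rows.
$rows = [];
foreach (explode("\n", rtrim($fixed, "\r\n")) as $line) {
    $rows[] = explode("\t", rtrim($line, "\r"));
}
// $rows[2] is ["bana na", "1", "fruit", "345"]
```

This would also join rows whose last field ends in a letter, so it only fits data shaped like the example in this post.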
Posted: Thu Jul 26, 2007 2:12 pm
by coool
Problem is solved.
Thanks for your help.
