fopen() and UTF-8: 2 problems
Posted: Tue Jul 19, 2005 6:42 am
by Perfidus
I have several TXT files encoded in UTF-8. Each one is a single string of variables concatenated with "&".
These files look like this example:
Code: Select all
&age=34&address=90&var3=77&var4=<strong>samba</strong><br>Let's dance
I'm trying to edit them using fopen(). Because some vars contain HTML, I'm using a very simple system to know whether a var is HTML formatted or not: I'm simply checking if it contains a "<":
Code: Select all
$value[$i] = strpos($var_value[$i], "<");
if ($value[$i] !== false) { // strpos() returns false when "<" is absent; do whatever...
}
I know this is very poor and suggestions are welcome. This is the first problem.
The second problem is that the variables in the TXT files contain Spanish accents and characters like Ñ. It would be good to know how to open the files without PHP making a real mess of all those characters.
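For the first problem, a minimal sketch of two ways to test a value for markup. The helper names containsAngleBracket() and containsHtmlTags() are made up for illustration; strpos() and strip_tags() are standard PHP functions. The first helper is the same "<" check as above with a strict comparison; the second is a slightly stricter alternative that asks whether strip_tags() would remove anything.

```php
<?php
// Hypothetical helper: true when the value contains a "<" anywhere.
function containsAngleBracket(string $value): bool
{
    // strpos() returns an integer offset (possibly 0) or false,
    // so the result has to be compared with !== false.
    return strpos($value, '<') !== false;
}

// Hypothetical helper: true when strip_tags() changes the value,
// i.e. when PHP finds something it considers a tag.
function containsHtmlTags(string $value): bool
{
    return strip_tags($value) !== $value;
}

$plain = '34';
$html  = "<strong>samba</strong><br>Let's dance";

var_dump(containsAngleBracket($plain)); // bool(false)
var_dump(containsAngleBracket($html));  // bool(true)
var_dump(containsHtmlTags($plain));     // bool(false)
var_dump(containsHtmlTags($html));      // bool(true)
```

Neither check validates the HTML; both only flag values that probably need to be treated as markup.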
Posted: Tue Jul 19, 2005 7:01 am
by onion2k
If you're just opening the text file, using explode() to split it up, and echo'ing the resulting variables, PHP will cope fine with foreign characters. Remember to set the charset of the HTML page though.
If you want to do anything else with the variables, remember to use the mb_ version of any relevant string function.
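A small sketch of what that advice looks like in practice. It assumes the mbstring extension is loaded and that the script itself is saved as UTF-8; the sample word is made up for illustration.

```php
<?php
// Tell the browser the page is UTF-8 (has no effect once output has started).
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

$word = 'Ñandú'; // assumed sample value; "Ñ" and "ú" take two bytes each in UTF-8

echo strlen($word), "\n";                   // 7: strlen() counts bytes
echo mb_strlen($word, 'UTF-8'), "\n";       // 5: mb_strlen() counts characters

echo mb_substr($word, 0, 1, 'UTF-8'), "\n"; // Ñ (substr($word, 0, 1) would return half a character)
echo mb_strtoupper($word, 'UTF-8'), "\n";   // ÑANDÚ
```

Passing the encoding explicitly avoids depending on whatever mbstring's internal encoding happens to be set to.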
Posted: Tue Jul 19, 2005 7:12 am
by Perfidus
Well, I'm not using explode() right now but split(). Because I don't know in advance what is inside each TXT file, I need to take the var names and values separately:
Code: Select all
$filename = "es/" . $destiny . ".txt";
$handle = fopen($filename, "r");
$contents = fread($handle, filesize($filename));
fclose($handle);
$contenidos = split("&", $contents); // Get the string chopped in pieces
$num = 2;
echo "<form action=\"lectorarchivos.php\" method=\"post\" enctype=\"multipart/form-data\" name=\"restaurantes\" id=\"restaurantes\"><table>";
// Start at 1: the file begins with "&", so piece 0 is empty
for ($i = 1; $i < sizeof($contenidos); $i++) {
    $puntocorte[$i] = strpos($contenidos[$i], "="); // A cutting point is defined in order to take what is before and after the "="
    $var_nombre[$i] = substr($contenidos[$i], 0, $puntocorte[$i]); // before "="
    $var_value[$i] = substr($contenidos[$i], $puntocorte[$i] + 1); // after "="
    $value[$i] = strpos($var_value[$i], "<"); // Now I try to know if what comes after "=" is HTML formatted or not...
    if ($value[$i] !== false) { // Whatever
    }
}
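The same name/value split can be sketched with explode() alone, since "&" and "=" are plain delimiters here. parsePairs() is a hypothetical helper name, and the sample string is modelled on the example file from the first post.

```php
<?php
// Hypothetical helper: turns "&name=value&name2=value2" into an associative array.
function parsePairs(string $contents): array
{
    $pairs = [];
    foreach (explode('&', $contents) as $chunk) {
        // Skip the empty piece before the leading "&" and any chunk without "=".
        if ($chunk === '' || strpos($chunk, '=') === false) {
            continue;
        }
        // A limit of 2 splits on the first "=" only, so an "=" inside the
        // value (for example in an HTML attribute) stays in the value.
        list($name, $value) = explode('=', $chunk, 2);
        $pairs[$name] = $value;
    }
    return $pairs;
}

$sample = "&age=34&address=90&var3=77&var4=<strong>samba</strong><br>Let's dance";
print_r(parsePairs($sample));
```

Like the original loop, this breaks if a value itself contains "&" (an HTML entity such as &amp;amp; would be cut in two), so it only suits files where that cannot happen.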
Posted: Tue Jul 19, 2005 9:25 am
by onion2k
Perfidus wrote: Well, I'm not using explode() right now but split()
Explode() and split() are aliases of each other. They're the same command.
Posted: Tue Jul 19, 2005 9:33 am
by Chris Corbyn
onion2k wrote: Perfidus wrote: Well, I'm not using explode() right now but split()
Explode() and split() are aliases of each other. They're the same command.
Ouch...
Actually, split() takes a regex pattern; explode() doesn't.
Posted: Tue Jul 19, 2005 1:17 pm
by onion2k
d11wtq wrote: Actually, split() takes a regex pattern; explode() doesn't.
Oh crumbs.. I was miles off. I've always thought they were the same thing. Good job I always use explode() then..
Posted: Tue Jul 19, 2005 1:37 pm
by Chris Corbyn
LOL... well, generally you wouldn't notice anyway unless you were using split() with a "." or any other regex metacharacter such as [, ], (, ), ^ etc.

split() takes a POSIX pattern, not a Perl-style regex ( /pattern/ ); for that you'd use preg_split(), which is much nicer.
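To make the difference concrete, here is a short sketch comparing explode() with preg_split() on a "." delimiter. The sample strings are made up; preg_split() stands in for the regex behaviour because split() is not available in current PHP versions.

```php
<?php
$version = '5.0.4'; // assumed sample input

// explode() treats "." as a literal delimiter.
$literal = explode('.', $version);          // ['5', '0', '4']

// In a regex an unescaped "." matches any character, so every
// character is consumed as a delimiter and only empty pieces remain.
$unescaped = preg_split('/./', $version);   // six empty strings

// Escaping the dot (by hand or with preg_quote()) restores the literal meaning.
$escaped = preg_split('/\./', $version);    // ['5', '0', '4']
$quoted  = preg_split('/' . preg_quote('.', '/') . '/', $version);

// Where a regex earns its keep: several delimiters at once, empty pieces dropped.
$mixed = preg_split('/[&;]/', '&a=1;b=2&c=3', -1, PREG_SPLIT_NO_EMPTY); // ['a=1', 'b=2', 'c=3']

print_r($literal);
print_r($unescaped);
print_r($mixed);
```

For a fixed single-character delimiter such as "&", explode() is the simpler choice and has no metacharacters to worry about.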