Space Characters Getting Converted in Drupal

Supershaba
Forum Newbie
Posts: 4
Joined: Fri Jun 24, 2011 7:58 am

Space Characters Getting Converted in Drupal

Post by Supershaba »

I've got a problem with my PHP code using Drupal. What I have is an array of names, which has a particular format and can contain characters as well as spaces. The name can be of two formats:

$name[0] = "--> Psychic Barrier";
$name[1] = "    Initial Presence";

In my code, I take each line in the array and use preg_match to see whether it matches one of these two patterns: a line that starts with two dashes and a '>' followed by a space, or a line that starts with four spaces. This is my preg_match statement:

while (preg_match('/^(--> |--&gt; |--&amp;gt; | {4}|&nbsp;&nbsp;&nbsp;&nbsp;|Â+)(.+)$/', $name[$i], $capturedname))

The problem is that both types of names pass the preg_match, but when a name starts with just the four spaces, the captured data in $capturedname doesn't add up. I expect $capturedname[0] to hold the whole line, $capturedname[1] just the spaces, and $capturedname[2] just the name. This is what I get instead:

$capturedname[0]=" Evil Presence"
$capturedname[1] = "�"
$capturedname[2] = "� Evil Presence"

$capturedname[1] plus $capturedname[2] should equal $capturedname[0], but that's not the case here. The spaces are getting converted to a diamond with a question mark in it, and $capturedname[2] has the spaces again, so it just doesn't add up. Any help would be greatly appreciated; it's been bugging me for three days now. Thanks in advance.
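
One way to check what the leading "spaces" really are is to dump the raw bytes of a line before matching it. The following is a minimal sketch, not code from the original post: the helper name dump_bytes and both sample strings are made up for illustration, and it assumes the names reach PHP as UTF-8.

```php
<?php
// Hypothetical helper: render a string as space-separated hex bytes.
function dump_bytes(string $s): string
{
    return implode(' ', str_split(bin2hex($s), 2));
}

// Four ordinary ASCII spaces (0x20) before the name.
$plain = "    Evil";

// Assumed alternative: four UTF-8 non-breaking spaces (0xC2 0xA0 each).
$nbsp = str_repeat("\xC2\xA0", 4) . "Evil";

echo dump_bytes($plain), "\n"; // 20 20 20 20 45 76 69 6c
echo dump_bytes($nbsp), "\n";  // c2 a0 c2 a0 c2 a0 c2 a0 45 76 69 6c
```

If the dump of a real line shows c2 a0 pairs rather than 20, the "spaces" are non-breaking spaces and a pattern written for plain spaces will not match them.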
Supershaba
Forum Newbie
Posts: 4
Joined: Fri Jun 24, 2011 7:58 am

Re: Space Characters Getting Converted in Drupal

Post by Supershaba »

Any help on this would be appreciated, thanks.
Apollo
Forum Regular
Posts: 794
Joined: Wed Apr 30, 2008 2:34 am

Re: Space Characters Getting Converted in Drupal

Post by Apollo »

Most likely an encoding issue.
How are the input strings (the $name array) encoded, and how do you output the result? (i.e. what encoding does your html use?)

Furthermore, you have an Â character in your regexp, but since PHP files have no standard encoding, it's a complete guess how this will be saved and interpreted (it also depends on the editor you happen to use).
I assume the Â is an attempt to represent the non-breaking space character (U+00A0), similar to &nbsp;, but that's incorrect. You should match either the single byte 0xA0 as a non-breaking space (if the input is ANSI-encoded), or the two bytes 0xC2 0xA0 (for one non-breaking space) if it's UTF-8 encoded.
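
To make the byte-level point concrete, here is a small sketch of what may be going on. It rests on assumptions the thread does not confirm: that the input is UTF-8, that the leading "spaces" are non-breaking spaces, and that the Â in the pattern was saved as the single byte 0xC2 (as it would be in an ISO-8859-1 or Windows-1252 source file). The sample line is made up.

```php
<?php
// Assumed input: four UTF-8 non-breaking spaces followed by the name.
$line = str_repeat("\xC2\xA0", 4) . "Evil Presence";

// Without the /u modifier PCRE works byte by byte. "\xC2+" stands in for
// a literal Â that was saved as one byte in a Latin-1 encoded file.
preg_match('/^( {4}|\xC2+)(.+)$/', $line, $m);

echo bin2hex($m[1]), "\n";               // c2
echo bin2hex(substr($m[2], 0, 3)), "\n"; // a0c2a0

// The pieces still add up at the byte level...
var_dump($m[1] . $m[2] === $m[0]);       // bool(true)

// ...but the first group on its own is not valid UTF-8, so this check fails.
var_dump(preg_match('//u', $m[1]) === 1); // bool(false)
```

Under these assumptions the first group holds only the lead byte 0xC2, and the second group starts with the orphaned 0xA0. Neither is valid UTF-8 by itself, so a browser would show each as a replacement character, while the three intact non-breaking spaces left in the second group still look like spaces.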

Are you sure you want to literally catch "--> xxx" as well as "--&gt; yyy" and "--&amp;gt; zzz"? (It seems you're not sure how many times the input has been htmlspecialchars()'d.) And wouldn't you need to include &amp;nbsp; as well then?

Solution: make sure you know EXACTLY how your input is encoded: htmlspecialchars()'d or not (or twice)? UTF-8, ISO-8859-1 or Windows-1252? Then write your expression accordingly.
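
If the exact encoding of the input cannot be pinned down, one alternative is to bring every line into a single known form first and then match with a simple pattern. This is only a sketch under assumptions: the helper name normalise_line and the sample inputs are hypothetical, and it assumes the data is UTF-8 and that decoding entities inside the names is acceptable.

```php
<?php
// Hypothetical helper: bring a line into one known form before matching.
function normalise_line(string $line): string
{
    // Undo entity encoding however many times it was applied
    // (capped so an unusual input cannot loop forever).
    for ($i = 0; $i < 5; $i++) {
        $decoded = html_entity_decode($line, ENT_QUOTES, 'UTF-8');
        if ($decoded === $line) {
            break;
        }
        $line = $decoded;
    }
    // Turn UTF-8 non-breaking spaces (0xC2 0xA0) into ordinary spaces.
    return str_replace("\xC2\xA0", ' ', $line);
}

// The /u modifier makes PCRE treat pattern and subject as UTF-8.
$pattern = '/^(--> | {4})(.+)$/u';

$samples = ['--&amp;gt; Psychic Barrier', '&nbsp;&nbsp;&nbsp;&nbsp;Initial Presence'];
foreach ($samples as $raw) {
    if (preg_match($pattern, normalise_line($raw), $m)) {
        printf("[%s] [%s]\n", $m[1], $m[2]);
    }
}
```

With the input normalised, the pattern only needs the two prefixes from the original question, and the Â alternative is no longer required. The trade-off is that entities that legitimately belong to a name are decoded too.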
Supershaba
Forum Newbie
Posts: 4
Joined: Fri Jun 24, 2011 7:58 am

Re: Space Characters Getting Converted in Drupal

Post by Supershaba »

It's encoded in UTF-8, so it shouldn't be a problem, but I don't know why it is. As for the Â, I know it looks odd, but the only way I got the line to pass the preg_match was to use that character. I tried using '    ', then '&nbsp;&nbsp;&nbsp;&nbsp;', but neither worked, so it's just mind-boggling to me. As for matching "--> xxx" as well as "--&gt; yyy" and "--&amp;gt; zzz", I have those in there in case the user doesn't use the editor or there are some other differences.
Supershaba
Forum Newbie
Posts: 4
Joined: Fri Jun 24, 2011 7:58 am

Re: Space Characters Getting Converted in Drupal

Post by Supershaba »

Any help regarding this issue would be appreciated.