benefits of using encapsulation characters?
Moderator: General Moderators
Someone tried to persuade me to use encapsulation characters in addition to delimiting each variable in a string that is going to be parsed. I've never used encapsulation characters before, and would like to know why this would be beneficial. As far as I can tell, it would just cause more work. The idea, I suppose, is to make it less likely that you would separate the variables incorrectly. To me, adding a single-character encapsulation to a single-character-delimited string of variables is the same as having a three-character delimiter.
|variable1|,|variable2|,|variable3| // delimiter is the "comma" and the encaps char is "pipe"
What's the difference in the next example?
variable1|,|variable2|,|variable3 // delimiter is "pipe comma pipe"
In both instances, you have to separate them using the same three characters. And there's the same possibility that that combination of characters resides in any of the variables. So, using encapsulation characters doesn't seem to be any safer. The only difference is that now you have to remove the encapsulation characters on the ends.
Just wondering if someone knows why using encapsulation characters would be beneficial. I can't get a solid explanation from this someone or anywhere on the net.
Thanks in advance for any information about this topic.
-
RobertPaul
- Forum Contributor
- Posts: 122
- Joined: Sun Sep 18, 2005 8:54 pm
- Location: OCNY
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
serialize()
Personally, I think the way it's implemented is genius. Instead of muddling through escaping mumbo-jumbo, you just store the length of the following string, and that's it. A third solution. And a really fast one too.
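To see what that looks like, here is a small sketch (the sample values are made up for illustration). Each string is stored as s:<length>:"<value>";, so a comma or quote inside a value is never mistaken for structure:

```php
<?php
// Sample values are illustrative; the first one contains the would-be delimiter.
$values = array('a,b', 'c');

$packed = serialize($values);
echo $packed, "\n"; // a:2:{i:0;s:3:"a,b";i:1;s:1:"c";}

// unserialize() reads the stored lengths, so no escaping is needed.
$restored = unserialize($packed);
var_dump($restored === $values); // bool(true)
```

One caveat: unserialize() should not be fed data from a party you don't trust, so this fits best when you control both ends.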
I have tried to ask this person... I don't think they really understand it themselves. They're not really sure - I think they just heard it from someone else. While searching for this earlier, I found that it was recommended by a credit card processor. It had to do with sending data back and forth, and they recommend using encapsulation characters for that data. But they don't describe the reason why. That's why I came here. Obviously, there is an important reason that they would recommend using it. I like to be on the up-and-up on these things. If I can make my code more efficient, safe, stable... then I want to know how... but I also want to know why. This question is all for my curiosity's sake. I don't NEED to know this, just trying to learn about something new.
The only advantage I can see is that if you don't have control over what characters are allowed in the variables, then using the encapsulation chars would help lower the chance that you'd separate the variables incorrectly. Which I can see happening if you're sharing data with another party. You don't necessarily have control over what type of data/chars they send. So, that makes sense.
But, as someone stated... I'll use what works for me. And I don't think this would be useful unless I'm getting data that I don't have control over. If I am sure that these variables don't have the pipe character, then I can use the pipe char as my delimiter without worrying about it.
jshpro2, you mention multiple advantages. Are there any other advantages for using the encapsulation characters?
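One advantage that can be shown concretely: with an encapsulation character, a field can safely contain the delimiter, and even the sequence "|,|", as long as pipes inside a field are doubled. The sketch below assumes that CSV-style doubling convention and uses made-up field values; str_getcsv() is the built-in PHP parser, called here with "|" as the enclosure.

```php
<?php
// Hypothetical record: the first field contains the exact sequence "|,|".
$fields = array('x|,|y', 'z');

// Encode: double any enclosure character inside a field, then wrap it in pipes.
$encoded = array();
foreach ($fields as $field) {
    $encoded[] = '|' . str_replace('|', '||', $field) . '|';
}
$line = implode(',', $encoded);
echo $line, "\n"; // |x||,||y|,|z|

// A plain three-character delimiter splits inside the first field.
$naive = explode('|,|', 'x|,|y' . '|,|' . 'z');
echo count($naive), "\n"; // 3

// str_getcsv() treats "|" as the enclosure and "||" as a literal pipe.
$parsed = str_getcsv($line, ',', '|', '\\');
var_dump($parsed === $fields); // bool(true)
```

So the two formats only look equivalent while the data never contains a pipe. Once it can, the enclosure gives the parser a rule for telling data from structure, which a three-character delimiter does not.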
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
Quote: That is true, he could store his variables in an array and then serialize it, but if he wants the string to be human editable (by humans other than the ones who know the inner workings of the serialize function), he will need a more "comma separated value" approach.
Serialize is quite human editable actually. Try serializing something and then echoing it.
OK, the main thing is ambiguity. Let's take this example for instance:
Code:
$array = array('array','to','be','transferred');
$string = implode(',',$array);
echo $string; //array,to,be,transferred
This breaks as soon as one of the values contains a comma, because the parser can't tell it from a delimiter. A quick fix would be to use a set of characters that nobody would ever think of using:
Code:
$array = array('array','to','be','transferred');
$string = implode('<>',$array);
echo $string; //array<>to<>be<>transferred
Then, you consider escaping characters. This requires a bit more code logic:
Code:
$array = array('ar"ray','to','be','transferred');
$string = '';
foreach ($array as $key => $value) {
    if ($key) $string .= ','; // delimiter goes before every value except the first
    $value = str_replace('\\', '\\\\', $value); // escape backslashes first
    $value = str_replace(',', '\\,', $value); // then escape the delimiter
    $string .= $value;
}
These are all fine and dandy for serializing the value, but a bit harder to parse.
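To give an idea of what "harder to parse" means here, this is a rough sketch of a reader for the backslash-escaped format. The function name split_escaped() is made up for this example; the point is that explode() is no longer enough and the string has to be walked character by character.

```php
<?php
// Hypothetical helper: split on commas that are not escaped with a backslash.
function split_escaped($string)
{
    $fields = array('');
    $last = 0;
    $length = strlen($string);
    for ($i = 0; $i < $length; $i++) {
        $char = $string[$i];
        if ($char === '\\' && $i + 1 < $length) {
            // Escaped character: take the next byte literally.
            $fields[$last] .= $string[++$i];
        } elseif ($char === ',') {
            // Unescaped comma: start a new field.
            $fields[] = '';
            $last++;
        } else {
            $fields[$last] .= $char;
        }
    }
    return $fields;
}

// The input reads a\,b,c\\d,e once PHP has processed the string literal.
$parsed = split_escaped('a\\,b,c\\\\d,e');
print_r($parsed); // three fields: "a,b", "c\d", "e"
```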
The final step is to give the lengths of the strings, and remove the dependency on escaping characters. This makes the parser very fast, and only sacrifices readability slightly.
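A hand-rolled version of the length idea might look like the sketch below. The "<byte length>:<value>" layout and the names pack_fields() and unpack_fields() are assumptions made for this example; this is not how serialize() lays things out internally.

```php
<?php
// Hypothetical format: each value is written as "<byte length>:<value>".
function pack_fields(array $fields)
{
    $out = '';
    foreach ($fields as $field) {
        $out .= strlen($field) . ':' . $field;
    }
    return $out;
}

function unpack_fields($string)
{
    $fields = array();
    $pos = 0;
    $total = strlen($string);
    while ($pos < $total) {
        // The length prefix runs up to the next colon.
        $colon = strpos($string, ':', $pos);
        if ($colon === false) {
            throw new InvalidArgumentException('Missing length prefix');
        }
        $len = (int) substr($string, $pos, $colon - $pos);
        // Read exactly $len bytes; their content is never inspected.
        $fields[] = substr($string, $colon + 1, $len);
        $pos = $colon + 1 + $len;
    }
    return $fields;
}

$packed = pack_fields(array('a,b', '', '3:x'));
echo $packed, "\n"; // 3:a,b0:3:3:x
var_dump(unpack_fields($packed) === array('a,b', '', '3:x')); // bool(true)
```

The reader jumps straight over each value, so commas, colons or digits inside the data can't confuse it. The price is that editing a value by hand means updating its length too.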
When you put data into a string, you've created a document format for it. This format can be as simple or as complex as you want: it's all about what you need.