unsure of a good title for this
Posted: Fri Dec 09, 2005 4:13 am
by s.dot
I've been debating this one for quite a while.
Storing things in the database already 'configured' or letting your PHP script do the configuring? For instance, say you had an emoticon system. When processing form input, all instances of ; ) would be turned into <img src="emoticons/wink.gif" alt="wink">.
You could str_replace() this when processing it and store it in the database as an IMG tag. However there are some cons to this. 1) It takes up more database space. 2) Say you wanted to change the alt="wink" to alt="big wink"... that would require you to edit the database.
However, if you just store it as ; ) in the database and have your PHP script str_replace() it (or use whatever other method) every time it is displayed, this will use more CPU, but would allow for greater (and easier) manipulation of the input and take less database space.
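A minimal sketch of the two options, for illustration only. The function name, the `;)` code and the image path are assumptions. It uses strtr() rather than the str_replace() mentioned above, so that HTML escaping and the emoticon swap happen in a single pass and an escaped `&quot;)` is not mistaken for a wink.

```php
<?php
// Hypothetical helper: escape the text and swap emoticon codes in one pass.
// strtr() with an array never re-scans text it has already replaced.
function render_emoticons(string $raw): string
{
    return strtr($raw, [
        '&'  => '&amp;',
        '<'  => '&lt;',
        '>'  => '&gt;',
        '"'  => '&quot;',
        ';)' => '<img src="emoticons/wink.gif" alt="wink">',
    ]);
}

// Option A: parse once when the form is processed, store the HTML.
$storedHtml = render_emoticons('Nice one ;)');

// Option B: store the raw text, parse on every page view.
$storedRaw = 'Nice one ;)';
echo render_emoticons($storedRaw), "\n";
```

With option A, changing alt="wink" later means rewriting stored rows. With option B it means editing one array entry.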
The emoticon was a simple example, but when doing BB tags or other things, it can get quite complex.
Discuss?
Posted: Fri Dec 09, 2005 4:57 am
by shiznatix
Wow, that's a good question, never thought of that.
What I have always done is just str_replace() it and put that into the database instead of doing the str_replace() every time it's called... BUT! The other way you mentioned would probably be better. Unless you are dealing with 5000 hits an hour and an extremely large number of strings to be str_replaced on every page call, I could not imagine that it would cost that much. str_replace is not very resource hungry at all.
Just my thoughts.
Posted: Fri Dec 09, 2005 5:38 am
by onion2k
CPU time is more precious than storage space in my opinion .. I parse the data as it goes into the database. What's more, I store the original as it was entered into the form in one column, and the parsed "display" version in a second column. That way if the user wants to edit it later I don't have to 'unparse' it to put it into the form.
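A rough sketch of the two-column idea, assuming hypothetical column names `body_raw` and `body_html` and a stand-in parser. The actual INSERT or UPDATE is left out.

```php
<?php
// Stand-in for whatever bbcode/emoticon parser is in use (an assumption).
function parse_for_display(string $raw): string
{
    return strtr($raw, [
        '&'  => '&amp;',
        '<'  => '&lt;',
        '>'  => '&gt;',
        '"'  => '&quot;',
        ';)' => '<img src="emoticons/wink.gif" alt="wink">',
    ]);
}

// Build the row to save: body_raw feeds the edit form,
// body_html is what the page prints without any further parsing.
function build_post_row(string $raw): array
{
    return [
        'body_raw'  => $raw,
        'body_html' => parse_for_display($raw),
    ];
}

$row = build_post_row('See you later ;)');
echo $row['body_html'], "\n";
```

When the user edits the post, the form is filled from `body_raw` and both columns are rewritten on save.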
Posted: Fri Dec 09, 2005 6:58 am
by Buddha443556
I parse it when it comes out. However, my bbcode/emoticons process is simply a bunch of str_replace calls with no regard for HTML standards, and therefore very fast. There's one more reason that hasn't been mentioned: my bbcode/emoticons are tied to the theme/layout, so they can change with the layout. I'm not sure how you would handle a layout change if you pre-process (besides the CSS, I mean)?
Posted: Fri Dec 09, 2005 8:59 am
by timvw
As already mentioned, you end up (as almost always) with a dilemma between the following:
- use cpu and calculate stuff again and again
- use datastore (memory, file, ...) and do it only once
Anyway, I would suggest the following: create an extra content table that contains the "manipulated" text. This way, for simple views you can fetch this. For manipulations you use the original (without phpbb replacements etc...)
Posted: Fri Dec 09, 2005 10:57 am
by Chris Corbyn
For your example I'd probably just parse it as it comes out of the DB.... that's not too intensive an example.
However, things like BBCode that use tokenizing and recursion... I'd definitely do it as it goes into the DB to save my poor CPU/Memory. Take my JavaScript beautifier for example.... that takes a little amount of time to process as well as making the server work.... so I convert once and collect the markup.
Posted: Fri Dec 09, 2005 7:32 pm
by s.dot
Seeing as my server is starting to get a lot of requests and my server load is getting hammered, I'm starting to switch it all to storing it in the database already formatted =) I was just curious what other people did.
Posted: Mon Dec 12, 2005 2:50 am
by GRemm
The alternative isn't really that bad as far as cpu is concerned.
Use php's output buffering to grab the output of the whole page and pass the buffered page through some sort of intercepting filter (both the pattern and the general idea).
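A minimal sketch of that output-buffering filter. The callback name and the `:wink:` code are assumptions for illustration. ob_start() accepts a callback that receives the buffered page and returns the text to send.

```php
<?php
// Hypothetical filter: runs once over the whole buffered page.
function emoticon_filter(string $buffer): string
{
    return str_replace(
        ':wink:',
        '<img src="emoticons/wink.gif" alt="wink">',
        $buffer
    );
}

ob_start('emoticon_filter');           // start buffering, filter on flush

echo '<p>Thanks for the help :wink:</p>', "\n";

ob_end_flush();                        // filter runs here, then output is sent
```

Because the filter sees the entire page, including attributes and scripts, it only makes sense with codes that cannot occur in ordinary markup.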
A more important concern is what happens when someone actually wants to use a character string that gets translated into a smiley. We have all seen forums where some poor kid's code ends up filled with sad faces or stupid img bbcode tags. Yuck.
Keep your tags more complicated and exclusive than the [smiley image] and ;=] sort of things. This is why bbcode seems to work so well. It has a specific tag structure designed not to interfere with the basics of filling in a form field.
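To illustrate the collision, here is a small sketch. The `:wink:` code and the `[wink]` stand-in for the image tag are assumptions.

```php
<?php
// A snippet someone might post: an endless loop in C-style syntax.
$code = 'for (;;) { poll(); }';

// A short code like ";)" also matches the end of "for (;;)".
$naive = str_replace(';)', '[wink]', $code);

// A more exclusive, delimited code leaves the snippet alone.
$tagged = str_replace(':wink:', '[wink]', $code);

echo $naive, "\n";   // for (;[wink] { poll(); }
echo $tagged, "\n";  // for (;;) { poll(); }
```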
Posted: Mon Dec 12, 2005 7:48 am
by dbevfat
IMO converting only before storing to DB has some drawbacks. Basically, because the data is not in its "raw" format anymore, you can't have:
1. "don't show emoticons" option (I use it everywhere),
2. changes of parser engine (old entries will be in old format),
3. theme switching (different paths for emoticon images).
I believe these drawbacks alone (there must be others) are too big to even consider this option. A solution would be (as suggested) to hold both the original and the parsed text, but this only really addresses point 2.
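As a sketch of why raw storage keeps points 1 and 3 open, a display-time renderer can take them as parameters. The function name, the flag and the `themes/<name>/emoticons/` path layout are assumptions.

```php
<?php
// Hypothetical display-time renderer working from the raw text.
function render_post(string $raw, bool $showEmoticons = true, string $theme = 'default'): string
{
    // Always escape the user's text.
    $pairs = ['&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;'];

    if ($showEmoticons) {
        // The image path depends on the viewer's theme.
        $pairs[';)'] = '<img src="themes/' . rawurlencode($theme)
                     . '/emoticons/wink.gif" alt="wink">';
    }

    return strtr($raw, $pairs);
}

echo render_post('ok ;)', false), "\n";         // viewer turned emoticons off
echo render_post('ok ;)', true, 'dark'), "\n";  // viewer uses the "dark" theme
```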
Regards
Posted: Mon Dec 12, 2005 8:23 am
by Chris Corbyn
dbevfat wrote: IMO converting only before storing to DB has some drawbacks. Basically, because the data is not in its "raw" format anymore, you can't have:
1. "don't show emoticons" option (I use it everywhere),
2. changes of parser engine (old entries will be in old format),
3. theme switching (different paths for emoticon images).
I believe these drawbacks alone (there must be others) are too big to even consider this option. A solution would be (as suggested) to hold both the original and the parsed text, but this only really addresses point 2.
Regards
It does cause side issues, yes. You just need to decide if those outweigh the bonuses. If you have the space you could actually store both versions in case you ever need to change the data -- or even just store the `diff' between the two. Some of the phpBB mods have caused us issues in the past. After making updates, you get strange things like this all over:
class foo
{
    function foo()
    {
        $this->isBroken();
    }
}
Posted: Mon Dec 12, 2005 9:41 am
by Maugrim_The_Reaper
Each strategy is going to have costs associated, so you either choose between flexibility and speed, or compromise on both to limit the cost. Since scrotaye's issue is primarily server processing load, store it after processing. If you really need flexibility (hard to discount with some applications), store the original raw form as well. I think that's the most adaptable method, assuming you don't also have an issue with database size!

Posted: Mon Dec 12, 2005 11:49 am
by GRemm
What is going to take longer in the end?
Process form input -> store raw input -> parse out bbcode -> store changed input -> redirect to confirmation / whatever -> query db and display output.
Or..
Process form input -> store raw input -> redirect to confirmation / whatever -> query db -> parse bbcode and display output.
From my perspective, parsing at display time actually has one fewer storage query and one fewer step to render the output.
A good question to pose to the experts here: what takes more CPU load / resources / time, regex / str_replace on buffered output, or regex / str_replace plus storing an entry twice at edit time?
From my limited tests I see the output buffering taking more memory and only a small spike in cpu on one process vs the medium amount of memory and multiple threads spiking when the db gets hit twice.
The output buffering method is faster with my tests as well, but by just a small amount.
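A rough way to measure the parsing side of that comparison yourself. The helper names and sample text are assumptions, and this sketch times only the string replacement, not the extra database write.

```php
<?php
// Stand-in parser (an assumption): a single emoticon replacement.
function parse_post(string $raw): string
{
    return str_replace(';)', '<img src="emoticons/wink.gif" alt="wink">', $raw);
}

// Average wall-clock seconds per call over $runs repetitions.
function time_it(callable $fn, int $runs): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $runs;
}

// A post of roughly 1.5 KB with fifty emoticons in it.
$raw = str_repeat('Some text with a wink ;) in it. ', 50);

$perCall = time_it(function () use ($raw) {
    parse_post($raw);
}, 10000);

printf("%.2f microseconds per parse\n", $perCall * 1e6);
```

Multiplying the per-parse figure by posts per page and page views per hour gives something to set against the cost of one extra write per edit.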