Parsing page content - is it a bad idea?
Posted: Fri May 01, 2009 8:47 pm
Right now, my pages are made up of standard headers and footers, with variable content in the middle that comes from external html files that are included on the page.
I was thinking of ways to create a "glossary" effect. Originally I thought about having a script automatically insert links around certain phrases as the page loads (without adding them to the original source file), but after more thought, that might not work so well. Next I thought of using markup, like Wikipedia's [[link|text to display]] format. The script would use "link" to construct a link tag and wrap it around the text, unless there's no entry for that term, in which case it would just show the text normally. That way, when an entry for a term is eventually added, every instance of the marked-up term would turn into a link.
But to do this, instead of just include'ing the content files like it does now, the script would load the file into a variable, run preg_match_all to find all occurrences of the markup, go through the result array to check whether a matching glossary entry exists for each term, construct links for the ones that do, then preg_replace on the content to replace each occurrence of the markup with either the linked version or the stripped "plain" version.
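Something like this is what I have in mind; it's only a rough sketch, and the $glossary array, the function name, and the link URLs are placeholders I made up (a single preg_replace_callback pass can do the find-check-replace steps in one go):

```php
<?php
// Turn [[term|display text]] markup into links where a glossary entry
// exists, or plain text where one doesn't. $glossary maps a term to
// its link target (here a hypothetical glossary.php page).
function link_glossary_terms(string $content, array $glossary): string {
    return preg_replace_callback(
        '/\[\[([^|\]]+)\|([^\]]+)\]\]/',
        function ($m) use ($glossary) {
            $term = $m[1];
            $text = $m[2];
            if (isset($glossary[$term])) {
                // Entry exists: wrap the display text in a link tag.
                return '<a href="' . htmlspecialchars($glossary[$term]) . '">'
                    . $text . '</a>';
            }
            // No entry yet: strip the markup and show the bare text.
            return $text;
        },
        $content
    );
}

// Instead of include'ing the content file, load it and parse it:
$glossary = ['widget' => 'glossary.php?term=widget'];
$content  = file_get_contents('content/page.html');
echo link_glossary_terms($content, $glossary);
```

The callback runs once per match, so there's no need for a separate preg_match_all pass followed by a preg_replace.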
Would it be too taxing on the server to do this for the content of every page every time someone wants to look at it? Some of the pages are 15 KB+, and maybe someday there will be even bigger ones. Is there a better way to do this? Would it be better to have it create new files with the parsed content and just include them normally?
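For the "create new files" idea, I imagine something like this: parse only when the cache file is missing or older than the source, otherwise serve the saved copy. This is just a sketch of the idea; the function name and the md5-based cache filename are things I made up:

```php
<?php
// Return parsed HTML for $src, regenerating the cached copy only when
// the source file has changed since the cache was written.
function cached_parse(string $src, string $cacheDir, callable $parse): string {
    $cacheFile = $cacheDir . '/' . md5($src) . '.html';
    if (is_file($cacheFile) && filemtime($cacheFile) >= filemtime($src)) {
        // Cache is up to date: skip the regex work entirely.
        return file_get_contents($cacheFile);
    }
    // Cache miss or stale: parse once and save the result.
    $html = $parse(file_get_contents($src));
    file_put_contents($cacheFile, $html);
    return $html;
}
```

With this, the regex pass runs once per edit of a content file rather than once per page view, so page size stops mattering much. One wrinkle: when a new glossary entry is added, the caches would need to be invalidated (e.g. deleted) so pages re-parse and pick up the new links.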