HTML in language strings

AlexC
Forum Commoner
Posts: 83
Joined: Mon May 22, 2006 10:03 am

Post by AlexC »

Hey,

Is HTML in language strings (which we're using Gettext for all our translation goodness) bad practice? I can't make up my mind as to whether it is or not, and the Gettext manual doesn't seem too sure, either (http://www.gnu.org/software/gettext/manual/gettext.html: "HTML markup, however, is common enough that it's probably ok to use in translatable strings.").

A few examples of what we've run into:

Code: Select all

<?php
printf( _('<span id="foobar">Hello,</span> %1$s!'), $foo );
echo _('foobar car <a href="http://example.com">zomg</a>');
?>

How can we avoid this, or do we not need to?

Regards,
(I had no idea where to put this, btw)
alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Re: HTML in language strings

Post by alex.barylski »

AlexC wrote: Is HTML in language strings (which we're using Gettext for all our translation goodness) bad practice?
Yes. What if you port to Windows desktop applications? HTML tags won't likely render.

Not only that, but they are separate resources. What if you fail to boldify the text in a French translation?

There "might" be instances where it's absolutely required but I don't think I have ever encountered such a situation.

I would do something like:

Code: Select all

<strong>#LANG_CODE_1#</strong>#LANG_CODE_2#

Cheers,
Alex

Post by AlexC »

alex.barylski wrote: Yes. What if you port to Windows desktop applications? HTML tags won't likely render.
That's kind of a weird argument. If a few language strings currently have HTML in them, then surely I am already using vast amounts of HTML (as it is a web application), so why would it ever be ported to that? :D

Post by alex.barylski »

Perhaps a weird reason but it is a reason. This is where you as a developer need to make an executive decision I guess.

If you know absolutely, without a doubt, that it'll never be ported, include HTML.

Are you going to let your users edit the language code? Will you filter the language codes to prevent XSS? Are you prepared to include HTML in all of your translations? What if (instead of porting) you decide sometime in the future to use XXXHTML instead of XHTML? Are you prepared to modify your language tables AND the HTML templates?

Personally I keep HTML out of anything but HTML template files -- which are alternative syntax PHP scripts if anything.

Post by AlexC »

alex.barylski wrote: Perhaps a weird reason but it is a reason
It's not a reason; it makes no logical sense to convert an entire dedicated web application into something that is not one. Even if I *was*, I think the fact that there is HTML in language strings would be the least of my worries, since all the other code would need to be ported and would most likely include different language strings. I know 100% that this application will always be a web application.

alex.barylski wrote: Are you going to let your users edit the language code? Will you filter the language codes to prevent XSS? Are you prepared to include HTML in all of your translations? What if (instead of porting) you decide sometime in the future to use XXXHTML instead of XHTML? Are you prepared to modify your language tables AND the HTML templates?
The language strings would be edited by translators and released by us as language packs. HTML is allowed in them regardless; however, they will of course be reviewed by us, so any malicious content would be found. (Just to make things clear, these would be translators brought in to translate the application; the strings are not edited by your average Joe through the application.)

HTML within the language strings would not amount to much (I'm not talking about full-blown tables or complex HTML); it would just be small amounts, such as in the examples given.

alex.barylski wrote: Personally I keep HTML out of anything but HTML template files -- which are alternative syntax PHP scripts if anything.
As do I, however these language strings are causing an issue with that. How would you get around this?

Post by alex.barylski »

AlexC wrote: As do I, however these language strings are causing an issue with that. How would you get around this?
By keeping HTML out of the language strings. :P

If you rely on HTML you have introduced a concrete dependency on HTML. If you used an intermediate format like wiki/bbcode markup you would make the dependency a little more abstract/generic or whatever, for lack of a better term.

Instead of using <strong> use *bold* and instead of <u> use _underline_

Then have a filter go over the language strings and replace those with the HTML, XHTML or XXXHTML or whatever your desired output may be.
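
A rough sketch of such a filter, just to illustrate the idea. The function name render_markup() and the two markup rules are made up for this example, and the regexes are deliberately naive (a word like snake_case_name would get underlined by mistake):

```php
<?php
// Hypothetical helper (not part of gettext): turns a tiny, made-up markup
// (*bold*, _underline_) in an already-translated string into HTML.
function render_markup(string $text): string
{
    // Escape first, so any raw HTML typed into a translation is shown literally.
    $html = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // *bold* -> <strong>bold</strong>
    $html = preg_replace('/\*([^*]+)\*/', '<strong>$1</strong>', $html);

    // _underline_ -> <u>underline</u>
    $html = preg_replace('/_([^_]+)_/', '<u>$1</u>', $html);

    return $html;
}

echo render_markup('This is *important* and _underlined_ text'), "\n";
```

If the output format ever changes, only the two replacement strings in the filter need to change; the language files stay as they are.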

Alternatively you could partition strings into distinct parts when there is markup involved:

Code: Select all

This is a <strong>string</strong> which relied on <h3>HTML</h3>

Would become:

CODE_1 = This is a
CODE_2 = string
CODE_3 = which relied on
CODE_4 = HTML
CODE_4 = HTML

And the HTML template would look like

Code: Select all

CODE_1 <strong>CODE_2</strong> CODE_3 <h3>CODE_4</h3>

Codes would then be replaced in a post-intercepting filter. Personally I hate using codes, though; I'd rather just use gettext inline.
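
As a minimal sketch of that replacement step (the $strings table, the $template variable and replace_codes() are all hypothetical names, and a real application would load the table per language rather than hard-coding it):

```php
<?php
// Hypothetical language table: code => translated fragment.
$strings = [
    'CODE_1' => 'This is a',
    'CODE_2' => 'string',
    'CODE_3' => 'which relied on',
    'CODE_4' => 'HTML',
];

// Template with all markup kept out of the language table.
$template = 'CODE_1 <strong>CODE_2</strong> CODE_3 <h3>CODE_4</h3>';

function replace_codes(string $template, array $strings): string
{
    // Escape each fragment so a translation cannot inject markup.
    $escaped = array_map(
        fn (string $s): string => htmlspecialchars($s, ENT_QUOTES, 'UTF-8'),
        $strings
    );

    // strtr() swaps every code in one pass and never re-scans its own output.
    return strtr($template, $escaped);
}

echo replace_codes($template, $strings), "\n";
```

To run this over the whole page after rendering, one option would be to pass a callback that calls replace_codes() to ob_start().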

Cheers,
Alex

Post by AlexC »

Splitting the string up is what we currently do (or did) to get around it, it just seems 'hackish' and makes life harder for the translators. Some languages, iirc, use different words depending on the context and on what comes before or after them, so splitting sentences up may result in broken translations, if you see what I'm getting at.

alex.barylski wrote: Personally I hate using codes, though; I'd rather just use gettext inline.
Same (though I do see advantages to them), and as we're using Gettext + Launchpad for our translations it's best to keep the language strings in English so that translators can just download the .po/.pot files and start translating the English version.

What I did try (and sort of works) is something like this:

Code: Select all

<?php printf( t('foobar car %1$szomg%2$s'), '<a href="http://example.com">', '</a>' ); ?>

Again, though, it seems to have the potential to confuse the translator: is the string 'zomg' or 'szomg'?
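
One possible way around the 'szomg' ambiguity, as a rough sketch only: use named tokens such as {link} and {/link} instead of numbered printf placeholders. The t_markup() helper and the token syntax are invented for this example, and t() here is just a stand-in wrapper around gettext:

```php
<?php
// Stand-in for the application's t() wrapper (an assumption): uses gettext
// when the extension is loaded, otherwise returns the string untranslated.
function t(string $msgid): string
{
    return function_exists('gettext') ? gettext($msgid) : $msgid;
}

// Hypothetical helper: fills {name} tokens in a translated string with markup.
function t_markup(string $msgid, array $tokens): string
{
    $pairs = [];
    foreach ($tokens as $name => $html) {
        $pairs['{' . $name . '}'] = $html;
    }

    // Escape the translated text, then drop in the markup supplied by the code.
    $text = htmlspecialchars(t($msgid), ENT_QUOTES, 'UTF-8');

    return strtr($text, $pairs);
}

echo t_markup('foobar car {link}zomg{/link}', [
    'link'  => '<a href="http://example.com">',
    '/link' => '</a>',
]), "\n";
```

The translator then sees 'foobar car {link}zomg{/link}', where the word boundaries are obvious, and the URL stays in the code rather than in the .po file.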

Post by alex.barylski »

If you have a dedicated server, use intl; its functions are better than sprintf.

Post by AlexC »

We do, however this application is distributed for anyone to download and use.

Post by alex.barylski »

Then you have to wait until PHP 5.3 becomes commonplace... I think that is the version which supports it without having to install from PECL.

Emulating it would be a daunting task... Maybe look into Zend_Locale and its language classes; they might have emulated it, or something close to it.
allspiritseve
DevNet Resident
Posts: 1174
Joined: Thu Mar 06, 2008 8:23 am
Location: Ann Arbor, MI (USA)

Post by allspiritseve »

AlexC wrote: Splitting the string up is what we currently do (or did) to get around it, it just seems 'hackish' and makes life harder for the translators.
My vote is to keep the HTML in the strings. I'm not worried about any so-called "dependency on html". I don't think we're getting rid of HTML any time soon, just look at how long it takes for browsers to adopt HTML & CSS standards. If for some reason in the future you need to convert to another format, it seems like it'd be just as easy to translate HTML as BBCode. The most important thing for me though, as a bilingual student, is that you make the job easiest for your translators :)

Post by alex.barylski »

allspiritseve wrote: My vote is to keep the HTML in the strings. I'm not worried about any so-called "dependency on html". I don't think we're getting rid of HTML any time soon, just look at how long it takes for browsers to adopt HTML & CSS standards. If for some reason in the future you need to convert to another format, it seems like it'd be just as easy to translate HTML as BBCode.
Look at Smarty, for instance... It used HTML tags for its modules, then XHTML compliance and such became important, and the module developers had to go and update their modules.

If you used <b> instead of the SEO-friendly <strong>, you'd need to change not just the templates but the language files too.

Separation of concerns...I'm pednatic about...some others aren't.

Post by allspiritseve »

PCSpectra wrote: pednatic
Oh, the irony... :lol:

In all seriousness though... is it really that big of a deal to change your template files AND your language files? It seems like a simple search and replace would do the trick. If that saves him jumping through hoops now, why optimize prematurely?

Post by alex.barylski »

allspiritseve wrote: In all seriousness though... is it really that big of a deal to change your template files AND your language files? It seems like a simple search and replace would do the trick. If that saves him jumping through hoops now, why optimize prematurely?
Nothing is a big deal when you're talking small scale.

As long as the controller-model-view code is minimal, keeping it in a single function isn't bad either... :lol:

Post by allspiritseve »

PCSpectra wrote: As long as the controller-model-view code is minimal, keeping it in a single function isn't bad either... :lol:
I don't think it's quite the same thing... With everything in a single function you're almost definitely going to run into maintenance problems, and soon. With HTML, maybe he'll sorta run into minor problems in 10 years, MAYBE, if something better comes along and actually gets widespread adoption, but what's the worst he'd have to do to fix it? Search and replace across a bunch of files. With MVC, you can't automate your way to well-separated code. It's gotta be done right early on.