SimpleTest: unable to match umlauts with assertWantedText()

jmut
Forum Regular
Posts: 945
Joined: Tue Jul 05, 2005 3:54 am
Location: Sofia, Bulgaria

Post by jmut »

Hi,
I have a problem matching umlauts with SimpleTest.
The page under test declares its encoding as

Code: Select all

<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
Do I have to configure anything in SimpleTest regarding encoding?

I tried writing the umlauts both as literal characters and as HTML entities. In both cases SimpleTest fails to match the text.
Can someone confirm this, or am I doing something wrong? :(
sweatje
Forum Contributor
Posts: 277
Joined: Wed Jun 29, 2005 10:04 pm
Location: Iowa, USA

Post by sweatje »

How do regular preg_match() expressions work with your UTF-8 characters? IIRC, isn't this supposed to be a major focus of the PHP6 effort?
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Pass 'UTF-8' to HTMLReporter and SimpleTest will output UTF-8.
jmut
Forum Regular
Posts: 945
Joined: Tue Jul 05, 2005 3:54 am
Location: Sofia, Bulgaria

Post by jmut »

sweatje wrote: How do regular preg_match() expressions work with your UTF-8 characters? IIRC, isn't this supposed to be a major focus of the PHP6 effort?
Regular preg_match() works...

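The HTML code is

Code: Select all

Ihre pers&ouml;nlichen Daten
This catches the umlaut:

Code: Select all

// Fetch the raw page source; entities such as &ouml; are left undecoded.
$txt = file_get_contents('http://localhost/umlautTest.html');

// Quote the literal text, passing '#' so the pattern delimiter is escaped too.
$pr = preg_quote('pers&ouml;nlichen', '#');
preg_match("#$pr#", $txt, $matches);
var_export($matches);

// outputs:
//
// array (
//   0 => 'pers&ouml;nlichen',
// )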
The thing is, I cannot do any of that in SimpleTest using assertText() or similar.

So I am going to find out how to fetch the page source from the browser object and run a regular preg_match() on it.
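
The fallback described above (working on the raw page source instead of assertText()) could look roughly like the sketch below. The helper name rawHtmlContainsText() is made up for illustration and is not part of SimpleTest. The sketch assumes the page and the test file are both UTF-8, and that the raw source has already been obtained somehow (here it is a hard-coded sample string).

```php
<?php
// Hypothetical helper, NOT part of SimpleTest: checks whether the visible
// text of a raw HTML string contains $needle (given as UTF-8).
function rawHtmlContainsText($rawHtml, $needle)
{
    // Drop the tags first, then decode entities such as &ouml; into UTF-8.
    $text = html_entity_decode(strip_tags($rawHtml), ENT_QUOTES, 'UTF-8');
    return strpos($text, $needle) !== false;
}

// Sample markup modelled on the page quoted in this thread (assumption).
$rawHtml = '<h1>Ihre pers&ouml;nlichen Daten</h1>';

// "\xC3\xB6" is the UTF-8 byte sequence for the o-umlaut.
var_dump(rawHtmlContainsText($rawHtml, "pers\xC3\xB6nlichen"));
```

Decoding explicitly to UTF-8 means the same check works whether the page writes the umlaut as an entity or as a literal character.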

The key code in SimpleTest is this:

Code: Select all

// parser.php, around line 699
        function decodeHtml($html) {
            static $translations;
            if (! isset($translations)) {
                $translations = array_flip(get_html_translation_table(HTML_ENTITIES));
            }
            return strtr($html, $translations);
        }
// I guess that if something here were changed, a valid comparison using assertText() would be possible.
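
One possible explanation, offered as an assumption rather than something confirmed in this thread: on PHP versions of that era, get_html_translation_table(HTML_ENTITIES) mapped entities to ISO-8859-1 characters, so &ouml; would be decoded to a single byte (0xF6), while a UTF-8 page or test file stores the same letter as two bytes (0xC3 0xB6). The sketch below reproduces that mismatch with html_entity_decode() and an explicit charset; the sample string is taken from the page quoted above.

```php
<?php
// Sample text from the page under test.
$html = 'Ihre pers&ouml;nlichen Daten';

// Decode the entity into a single ISO-8859-1 byte (0xF6) ...
$latin1 = html_entity_decode($html, ENT_QUOTES, 'ISO-8859-1');
// ... and into the two-byte UTF-8 sequence (0xC3 0xB6).
$utf8 = html_entity_decode($html, ENT_QUOTES, 'UTF-8');

// The expected string as it would be stored in a UTF-8 encoded test file.
$expected = "Ihre pers\xC3\xB6nlichen Daten";

var_dump($latin1 === $expected); // bool(false): the byte sequences differ
var_dump($utf8 === $expected);   // bool(true)
```

If this is indeed the cause, decoding with the page's charset inside decodeHtml() would be the kind of change hinted at in the comment above.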
Last edited by jmut on Fri Aug 25, 2006 5:12 am, edited 1 time in total.
jmut
Forum Regular
Posts: 945
Joined: Tue Jul 05, 2005 3:54 am
Location: Sofia, Bulgaria

Post by jmut »

Ambush Commander wrote:Pass 'UTF-8' to HTMLReporter and SimpleTest will output UTF-8.
This is only used for output. As far as I can tell, it has no effect on the comparisons.
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Encode your files in UTF-8 and put the actual umlauts in source.
jmut
Forum Regular
Posts: 945
Joined: Tue Jul 05, 2005 3:54 am
Location: Sofia, Bulgaria

Post by jmut »

Ambush Commander wrote:Encode your files in UTF-8 and put the actual umlauts in source.
Yes, that works... it is just that I don't have control over the HTML :)
Thanks anyway. Using preg_match() seems reasonable enough, so I consider this problem closed.