Prettying XML output


Post by TJ » Sat Nov 05, 2005 6:36 pm

I just needed to pretty-format the output of DOMDocument->saveXML() for ease of reading; it comes back as a single string with no linefeeds or indentation between the XML tags.

I thought I'd share the function in case others have need of something similar in the future; it took a while to get it settled, and it might need tweaking for your circumstances.
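For comparison, here is a minimal sketch of the built-in route, assuming you still have (or can rebuild) the DOMDocument object and that the DOM extension is loaded. It sets the `preserveWhiteSpace` and `formatOutput` properties before loading; the variable names `$dom` and `$pretty` are just for illustration. It may not reformat mixed content the way you want, which is one reason a string-based function can still be handy.

```php
<?php
// Assumption: ext-dom is available and the input is well-formed XML.
$xml = '<root><this><is>a</is><test /></this></root>';

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // must be set before loading: drops whitespace-only text nodes
$dom->formatOutput = true;        // ask libxml to add linefeeds and indentation on save
$dom->loadXML($xml);

$pretty = $dom->saveXML();        // includes the XML declaration on the first line
echo $pretty;
```

The function that follows works on the string alone, so it does not need the document to be parsed again.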

/**
 * Pretty an XML string typically returned from DOMDocument->saveXML()
 * Ignores ?xml !DOCTYPE !-- tags (adjust regular expressions and pad/indent logic to change this)
 * @param   string $xml the xml text to format
 * @param   boolean $debug set to get debug-prints of RegExp matches
 * @return  string formatted XML
 * @copyright TJ 2005
 * @license GNU Lesser General Public Licence version 2
 * @link
 */

function prettyXML($xml, $debug=false) {
  // add marker linefeeds to aid the pretty-tokeniser
  // adds a linefeed between all tag-end boundaries
  $xml = preg_replace('/(>)(<)(\/*)/', "$1\n$2$3", $xml);

  // now pretty it up (indent the tags)
  $tok = strtok($xml, "\n");
  $formatted = ''; // holds pretty version as it is built
  $pad = 0; // initial indent
  $indent = 0; // change applied to $pad after the current line is written
  $matches = array(); // returns from preg_match()

  /* pre- and post- adjustments to the padding indent are made, so changes can be applied to
   * the current line or subsequent lines, or both
   */
  while($tok !== false) { // scan each line and adjust indent based on opening/closing tags

    // test for the various tag states
    if (preg_match('/.+<\/\w[^>]*>$/', $tok, $matches)) { // open and closing tags on same line
      if($debug) echo " =$tok= ";
      $indent = 0; // no change
    } else if (preg_match('/^<\/\w/', $tok, $matches)) { // closing tag
      if($debug) echo " -$tok- ";
      $pad--; // outdent now
      $indent = 0; // reset, otherwise a stale +1 from an earlier opening tag is re-applied
    } else if (preg_match('/^<\w(?:[^>]*[^\/])?>.*$/', $tok, $matches)) { // opening tag (also matches one-letter names such as <a>)
      if($debug) echo " +$tok+ ";
      $indent = 1; // don't pad this one, only subsequent tags
    } else { // self-closing tags, ?xml, !DOCTYPE, comments, plain text
      if($debug) echo " !$tok! ";
      $indent = 0; // no indentation needed
    }

    // pad the line with the required number of leading spaces
    $prettyLine = str_pad($tok, strlen($tok)+$pad, ' ', STR_PAD_LEFT);
    $formatted .= $prettyLine . "\n"; // add to the cumulative result, with linefeed
    $tok = strtok("\n"); // get the next token
    $pad += $indent; // update the pad size for subsequent lines
  }
  return $formatted; // pretty format
}

echo "\r\n" . prettyXML("<root><this><is>a</is><test /></this></root>", true);
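
The first step of the function above can be tried on its own. This is only a sketch: it reuses the same regular expression and the sample string from the usage line, and `$marked` and `$lines` are names introduced here (the function itself walks the lines with strtok() rather than explode()).

```php
<?php
// Sample string taken from the usage line in the post.
$xml = '<root><this><is>a</is><test /></this></root>';

// Same pattern as in prettyXML(): put a linefeed wherever one tag ends
// and the next begins (a ">" immediately followed by a "<").
$marked = preg_replace('/(>)(<)(\/*)/', "$1\n$2$3", $xml);

// Split on the marker linefeeds so each tag boundary becomes its own line.
$lines = explode("\n", $marked);

foreach ($lines as $n => $line) {
    echo $n, ': ', $line, "\n";
}
```

This gives six lines. `<is>a</is>` stays together because the text `a` sits between the `>` and the `<`, which is why the loop tests for "open and closing tags on same line" before anything else.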
Last edited by TJ on Fri Nov 25, 2011 4:01 pm, edited 2 times in total.

Post by » Sat Nov 05, 2005 6:55 pm

seems to be more of a snippet


Post by John Cartwright » Sat Nov 05, 2005 7:04 pm

Agreed. Moved to Code Snippets.

Post by TJ » Sun Nov 06, 2005 12:41 am

Ahhh... ta very muchly... after 36 hours of non-stop coding, all the forums tend to look alike :lol:
