interminglingly xml AND php

freephile
Forum Newbie
Posts: 5
Joined: Mon Sep 16, 2002 9:26 pm
Location: Newburyport, MA

Post by freephile »

I have created a system that uses PHP's XML parsing engine just fine. My system is a template based web-publishing system where the content file for each page can be a simple include (where PHP and HTML can be mixed together), the content can also be an external resource, or the generated output of some class file, or (drum roll please) the content can be marked up in XML tags.

In the last case, the system (aka the man behind the curtain) fires up the Expat XML parser, and reacts to the content of the tags. In case the content includes PHP code, that code is eval()'d, so that I can add server-side logic and content inside my XML document.

The problem is that the XML parser seems to lose all scope of any previously created functions, but also doesn't allow functions to be redeclared. For example, let's assume that my XML content file is

Code: Select all

<content>
  <story>
     Blah, blah blah We have great stuff, see the table below for a list of products
     <?php display_table($productSheets) ?>
  </story>
  <related>
  </related>
  <quote>
  </quote>
</content>
Let's also assume that in the head of my template, I include a library of PHP code that contains the definition for the display_table function. The XML parser complains that it knows nothing of display_table. However, if I redeclare display_table inside the content file (where all PHP code is eventually eval()'d by the parser), the parser complains that I can't redeclare it.

I'm wondering if JavaServlets and/or beans etc. can do this. I do not know JSP at all. I'm wondering if PHP can do this. Any idea?

If you go to http://test.freephile.com/company/index.php, then click on the "Open Source" icon, you can view some of the source. The whole system is GPL'd code, but I haven't polished it up enough to release the whole thing yet. Any advice is welcome.

Thanks,
Greg Rundlett
hob_goblin
Forum Regular
Posts: 978
Joined: Sun Apr 28, 2002 9:53 pm

Post by hob_goblin »

When I want to do stuff like that I usually use the eval() function...

Code: Select all

if ("%eval:" == substr($format, 0, 6)) {
    // eval() returns NULL unless the evaluated code itself uses return
    $this->html .= eval(substr($format, 7));
} else {
    $this->html .= $format;
}
You know, you'd do something like

<!-- eval: display_table($productSheets); //-->

and you'd just tell the script to use a regular expression to find and evaluate the code in the comment.

Does this make sense?
Look at http://www.php.net/eval, and if you need more help I will try.
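One detail about the snippet above: eval() returns NULL unless the evaluated code executes a return, so appending its result to a string only works when the embedded code returns its markup. If the embedded code echoes instead, output buffering can capture what it prints. A minimal sketch, where display_table_demo and evalToString are hypothetical names that do not come from the original code:

```php
<?php
// Hypothetical stand-in for a function that prints markup.
function display_table_demo(array $rows): void
{
    echo '<table>';
    foreach ($rows as $row) {
        echo '<tr><td>' . htmlspecialchars($row) . '</td></tr>';
    }
    echo '</table>';
}

// Run a string of PHP code and return whatever it echoed.
// The code runs in this function's local scope, not the caller's.
function evalToString(string $code): string
{
    ob_start();
    try {
        eval($code);
    } finally {
        // Always close the buffer, even if the evaluated code throws.
        $output = ob_get_clean();
    }
    return $output;
}

$viaReturn = eval('return strtoupper("abc");');  // the code returns a value
$viaEcho   = eval('echo "";');                   // no return statement
$html      = evalToString('display_table_demo(["Alpha", "Beta"]);');
```

Since eval() executes whatever it is given, this only makes sense for content files you control.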
freephile
Forum Newbie
Posts: 5
Joined: Mon Sep 16, 2002 9:26 pm
Location: Newburyport, MA

Post by freephile »

hob_goblin wrote:when i want to do stuff like that i usually use the eval() function...
[snip]

does this make sense?

Thanks for looking at my question, and the actual code, but you got thrown off the path of my real question by another instance where I use eval(). The instance that you were looking at is a little kludge to allow a page to have multiple sources of page-specific script (js, php). For example, a form page might need a form processing class file included, which would be wasteful if included in the general site library of functions. But we digress.

My original question is basically can PHP do XML parsing, and at the same time switch back into PHP mode with access to all currently defined objects, methods, functions, and variables. The answer from my experience is mostly no. I have only successfully used limited PHP functionality within the XML parsing engine. What I'm trying to do is essentially the same thing that PHP does with respect to HTML. PHP will allow you to write oodles of HTML, and any time that you want, you can switch context and write PHP. The parsing engine handles it flawlessly, and keeps track of everything.

To use PHP's XML parsing ability, you must declare a character data handler. The character data handler is a function that does something with plain text (for example, echo() it). To get the XML parser to act differently when it encounters processing instructions within the XML, such as <?php ... ?>, you use a processing instruction handler. My processing instruction handler is eval(). You can see the code for my XML parser here: http://test.freephile.com/inc/_showSour ... Parser.php and it is based on the example provided in the php.net reference manual.
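The handler setup described here can be sketched as follows. This is not the BlueMantis parser; every name prefixed with demo is a hypothetical stand-in, and the sample XML is made up for illustration.

```php
<?php
// Hypothetical library function that the embedded PHP will call.
function demoGreeting(string $name): string
{
    return "Hello, $name!";
}

// Character data handler: receives the plain text between tags.
function demoCharacterData($parser, string $data): void
{
    echo $data;
}

// Processing instruction handler: receives the target ("php") and the code.
function demoProcessingInstruction($parser, string $target, string $code): void
{
    if (strtolower($target) === 'php') {
        // eval() runs in this function's local scope. Functions and classes
        // are global, but global variables need "global" or $GLOBALS here.
        eval($code);
    }
}

function demoParse(string $xml): string
{
    $parser = xml_parser_create();
    xml_set_character_data_handler($parser, 'demoCharacterData');
    xml_set_processing_instruction_handler($parser, 'demoProcessingInstruction');

    ob_start();                              // collect everything the handlers echo
    $ok = xml_parse($parser, $xml, true);    // true: this is the final chunk
    $output = ob_get_clean();

    if (!$ok) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    return $output;
}

$xml = '<content><story>Greeting: <?php echo demoGreeting("world"); ?></story></content>';
$result = demoParse($xml);
```

Because eval() runs inside the handler function, a call such as display_table($productSheets) would see an undefined $productSheets unless the variable is imported into that scope, which is a separate issue from whether the function itself is defined.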

In my experience, if my PHP code is simple, the eval works. However if I try to access a function that *is* already declared, and included in the chain of execution, the eval() does not work. The XML parser seems to be aware that the function exists, because I can't redeclare it, but it seems to have no access to it so it throws an error. If anyone has used PHP and XML in the way I've described, I'd like to hear about it.

Thanks for any additional help.
jason
Site Admin
Posts: 1767
Joined: Thu Apr 18, 2002 3:14 pm
Location: Montreal, CA

Post by jason »

Code: Select all

<?php
$title = "Internet-enabled Systems Based on Free softare";
$metaTags = <<<HERE
<meta name="description" content="To foster the adoption of Open Source and Free Software worldwide">
<meta name="keywords" content="Internet enabled systems, based on Free softare, Mission, Developers, Business Owners">
HERE;
$scriptTags = '';
$contentFile = $_SERVER['DOCUMENT_ROOT'] . '/company/_index.php';
if (empty($templateFile)) $templateFile = "/templates/default.php";
$printerLink_On = true;
$showSource_On = true;
include($_SERVER['DOCUMENT_ROOT'] . $templateFile);
?>
Where do you handle the include there? I am a bit confused by your question. You say you are parsing XML, and yet from that code I don't see any parsing of XML going on. What would be helpful is the actual code you are having a problem with.

If it's the code above, or the code I see here, http://test.freephile.com/inc/_prettyPrint.php, then just say so, but it doesn't look like it.
freephile
Forum Newbie
Posts: 5
Joined: Mon Sep 16, 2002 9:26 pm
Location: Newburyport, MA

Post by freephile »

Hi Jason,

(OT: I love your site, and have followed phpclasses for a long time.)

This is how my system works: There is a "title page" which is the addressable URL of the webpage. The title page sets up page-specific parameters, such as title, keywords, page-specific scripts, the content file, and booleans like whether or not to show a printer-friendly link, and show source. The code that you highlighted is the code for a title page. Once the title page has set the 'environment', it defines and includes the template file.

As execution is handed over to the template file via the 'include' function, the real work gets done. (You can see the source code of the default template file at http://test.freephile.com/inc/_showSour ... efault.php) The template file includes the "header file" (http://test.freephile.com/inc/_showSour ... header.php), which is a set of core library functions for the system. It is this header file that figures out what kind of "content file" is used (xml, or regular content) and responds by selecting either the Expat parser, or else my custom parser (which is just a series of regex).

Code: Select all

<?php
// if the content file is an xml file, $contents is created by bmLoadXML 
// otherwise, $contents is created by the custom BM parser 
if (bmIsXML_Story($contentFile)) $contents = bmLoadXML($contentFile); 
else $contents = getContents ($contentFile);
?>
If the content file is 'true' xml, then I use the Expat xml parser in conjunction with output buffering to stuff the content into a variable named $contents. If the content file is not true xml, then I use my own function called getContents to read the content file. With both methods of parsing the contents, a single variable called $contents is created. Then a set of regex calls is made to find the pieces of the content within $contents.

What is in the content file: Each content file has a main story, plus related links, plus a sidebar story, plus quotes. This is all maintained in a single file, rather than using a database. Using simple xml tags, I can easily see what is what when editing the content file. Using a simple xml parser, I can put the content into buckets in my template, for layout onscreen. Changing the template allows me infinite combinations for layout of the content.
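The "buckets" idea can be illustrated with a small regex helper. This is only a sketch under assumptions: demoSection is a hypothetical function, not one of the BlueMantis functions, and the tag names are taken from the sample content file earlier in the thread.

```php
<?php
// Return the trimmed inner text of the first <$tag>...</$tag> pair,
// or an empty string if the tag is absent.
function demoSection(string $contents, string $tag): string
{
    $name = preg_quote($tag, '#');
    $pattern = '#<' . $name . '>(.*?)</' . $name . '>#s';  // s: dot matches newlines
    return preg_match($pattern, $contents, $m) === 1 ? trim($m[1]) : '';
}

$contents = <<<XML
<content>
  <story>We have great stuff.</story>
  <related>See also: products</related>
  <quote></quote>
</content>
XML;

// One bucket per section, ready to be placed in a template.
$buckets = [];
foreach (['story', 'related', 'quote'] as $tag) {
    $buckets[$tag] = demoSection($contents, $tag);
}
```

A regex like this does not handle nested or attribute-bearing tags, which is the usual trade-off against a real XML parser.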

Back to the problem: See the jobs page (http://test.freephile.com/company/jobs.php). It errors out, saying that I'm calling an undefined function. In my jobs content page, in the main story section, I include a class file to parse RSS data feeds (http://test.freephile.com/inc/showSourc ... /_jobs.php). Since this content file is 'true' xml starting with the

Code: Select all

<?xml version="1.0" encoding="ISO-8859-1"?>
line, it ends up being parsed by the Expat parser. That parser uses a Processing Instruction handler to eval any php code. By the time the php parser hands things off to the xml parser, the getFile function has already been included in the "Title file" by way of the template which includes /inc/_buildIndex.php at the very top. Here is the getFile function in case you're curious:

Code: Select all

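<?php
// Read the resource at $url and return its contents as a string.
function getFile ($url) {
  $fp = @fopen($url, 'r')          // get the page contents
    or die("Cannot Open $url");
  $contents = '';                  // initialize before appending
  while (($line = @fgets($fp, 1024)) !== false) {
    $contents .= $line;
  }
  fclose($fp);
  return $contents;
}
?>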
Now one thing I just noticed is that the error message says 'getfile' while the function is getFile. I don't think that case folding has anything to do with this discrepancy, since case folding acts on the tags, not the data being parsed.
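On the 'getfile' versus getFile point: function names in PHP are case-insensitive, so the lower-cased name in the message is most likely just how the name was reported rather than a sign of a second, different function. A quick check, using a hypothetical getFileDemo function:

```php
<?php
function getFileDemo(string $url): string
{
    return "contents of $url";
}

// Lookups and calls ignore the case of the function name.
$existsLower = function_exists('getfiledemo');
$viaUpper    = GETFILEDEMO('x');
```

Variable names, by contrast, are case-sensitive.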

If I simply remove the first line (the <?xml ... ?> declaration) and leave all other xml tags intact, the file will be parsed by my simple 'getContents' and 'getXML' functions, and php doesn't choke on the call to 'getFile' at all.

Maybe the problem is that I'm using output buffering, but I don't think that should create a scope problem. My guess is that the XML parsing ability of PHP was not meant to work within the context created by a parent script.

Again, I am using XML in the content file for my BlueMantis system to simplify editing of the content, and also to open up possibilities for displaying the content. I need to create a data structure, and have tried to use the DOM parser, without success. When using the event-based parser, I seem to be able to do a better job parsing and keeping everything in scope by just using regex.

I know this is a complex situation, and thank anyone in advance for helping me tackle it. If you are interested by this system, it will be released under the GPL as soon as possible. You can join me on the project at sourceforge. The name of the project is BlueMantis. BlueMantis is supposed to be a system for quickly generating websites that are easy to maintain, and the system allows advanced 3rd party Open Source projects to be easily plugged into the solution.
jason
Site Admin
Posts: 1767
Joined: Thu Apr 18, 2002 3:14 pm
Location: Montreal, CA

Post by jason »

Do you have short tags enabled, btw? Essentially, is this allowed:

Code: Select all

<?
// short tags enabled
?>
If it is, that is why the XML tag is messing up. That is my first guess; I will look into it more. I just thought I should mention this tidbit at the start.
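The concern here is that with short_open_tag enabled, a literal XML declaration inside a file that PHP itself executes is read as the start of a PHP block. A sketch of how one might check the setting and emit the declaration safely, assuming a hypothetical helper named demoXmlDeclaration:

```php
<?php
// Build the declaration from pieces so the short open tag sequence
// never appears literally in this source file.
function demoXmlDeclaration(string $encoding = 'ISO-8859-1'): string
{
    return '<' . '?xml version="1.0" encoding="' . $encoding . '"?' . '>';
}

// ini_get() returns the raw setting; normalize it to a boolean.
$shortTags = filter_var(ini_get('short_open_tag'), FILTER_VALIDATE_BOOLEAN);

echo $shortTags
    ? "short_open_tag is on: a literal XML declaration in a PHP file would be parsed as PHP\n"
    : "short_open_tag is off: a literal XML declaration passes through as text\n";
echo demoXmlDeclaration(), "\n";
```

This only matters for files that are include()d or otherwise run through the PHP interpreter; a file that is read as data and handed to the XML parser is not affected by the setting.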

P.S. I don't run PHPClasses; that would be Manuel Lemos. I run PHPComplete.
freephile
Forum Newbie
Posts: 5
Joined: Mon Sep 16, 2002 9:26 pm
Location: Newburyport, MA

parsing XML, with PHP embedded in it

Post by freephile »

A quick look at the source code for the jobs page reveals the heart of the matter:
  • Simple PHP works.
  • Advanced PHP that relies on previously included files does NOT work.

For example, the XML parser quite easily gets past the line

Code: Select all

<?php print ('<a href="' . $PHP_SELF . '?rssURL=http://mojolin.com/xml/mojolin.rss">Open Source Employment</a>');
?>
But, when it gets to

Code: Select all

<?php
$data = getFile($rssURL);
?>
where getFile() is a function defined in a previously included php file, the xml parser complains that the function is not defined. If I use my own parser rather than the xml parser to parse the same exact file, everything works fine. The really odd part is that if I try to placate the xml parser and redeclare the getFile function within the scope of the content file, then the xml parser complains that I can't redeclare a previously defined function.

I don't think there is any way to get a PI Handler to invoke the PHP parser in the same (parent) context that the XML parser is running in. I wish there were, because I would like to use PHP's own XML parser rather than a series of homegrown regex functions.
freephile
Forum Newbie
Posts: 5
Joined: Mon Sep 16, 2002 9:26 pm
Location: Newburyport, MA

Post by freephile »

DOOOOOOOOOOOOHHHHHHHHHHHHHHHHHHHHHH
(Greg giving himself dope slap on the forehead)
I figured it out. :oops:

I was obviously having a scope problem. I went through the chain of execution a hundred times, and each time I verified that all the proper files were included. Then when I was totally frustrated that it wasn't working, I would try to redeclare a function, and it wouldn't let me. How can something not exist at one moment, and then exist when you try to re-declare it? SIMPLE: because it didn't exist at the moment when you first tried to use it, but it did exist when you tried to redeclare it.

The problem was that I actually created and ran the XML parser in my first library file (include). So at the moment the XML parser did its work, it did not have access to any functions that were defined in later files (actually, the very next line in my code includes a second library of functions).
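The timing can be reproduced in a few lines. In this sketch, calling loadSecondLibraryDemo() stands in for the include of the second library; both it and getFileLate are hypothetical names used only for illustration.

```php
<?php
// A function declared inside another function only comes into existence
// when the outer function runs, much like code in an included file.
function loadSecondLibraryDemo(): void
{
    function getFileLate(string $url): string
    {
        return "contents of $url";
    }
}

// "First library" doing real work at include time:
$availableEarly = function_exists('getFileLate');  // second library not loaded yet

loadSecondLibraryDemo();                           // the "very next line" include

// From here on the function exists, so declaring it again would be a fatal error.
$availableLate = function_exists('getFileLate');
```

This matches both symptoms in the thread: an undefined-function error at the point of early use, and a cannot-redeclare error for any definition that runs after the second library has loaded.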

The lesson learned here (the hard way): NEVER do anything in your library files. Only define functions or classes. Wait until all library files are included before you do any work that might actually need to make use of different libraries.
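A rough sketch of that structure, collapsed into one file for brevity. In a real project each section would live in its own file loaded with require_once; the function names are hypothetical.

```php
<?php
// --- library one: definitions only, nothing runs at include time ---
function renderContentDemo(string $source): string
{
    // Safe to call helpers from any library here, because this body only
    // executes after the page script has loaded everything.
    return '<div>' . fetchSourceDemo($source) . '</div>';
}

// --- library two: more definitions ---
function fetchSourceDemo(string $source): string
{
    return strtoupper($source);
}

// --- page script: all libraries are loaded, now do the work ---
$page = renderContentDemo('jobs');
```

The order in which the libraries are included no longer matters, as long as the call that starts the work comes after the last include.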

Thanks for helping me through this one. - Greg