The Phrasebook pattern

nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

I've been thinking a lot about the Phrasebook pattern since Jason mentioned it here, primarily for pulling my SQL queries out of the code rather than for XHTML blocks.

I've read the document that defines the purpose and use of the phrasebook pattern, but still have a few questions.

1. "Good" formats for the phrase file:
One of the main reasons I would like to use the PB pattern is to get my long (5-10 line HEREDOC) queries out of the code stream. PHP-mode in Emacs doesn't handle HEREDOCs very nicely, so they break a lot of the syntax highlighting and indenting. Regular double-quoting doesn't work nicely for very long queries, and neither does single-quoting. So putting the queries elsewhere would be nice. However, if I put the queries elsewhere, I'd want to be able to use SQL-mode on that file for its syntax highlighting. This rules out an XML format for the phrase file, which I think is what Jason was suggesting. (Besides, the purpose of PB is to separate the job of writing different languages, possibly among different developers... why make the SQL guy learn XML...)

However, this means that phrases would have to be labelled with their handles via comments, à la phpdoc/javadoc. I'm not sure I really like this idea, but it's what I've come up with so far.

2. How much "introspection" into the phrases is needed? How much metadata should be stored?
Is it safe to expect the PHP developer (the PB user) to know the number and order of attributes returned in each tuple generated by the query? Or should the PB pattern include the metadata the developer needs to "unpack" the returned data?

It seems like SQL PBs need a standard documentation model that maps familiar function concepts onto queries:
Function name --> query handle
Function arguments --> query parameters
Return structure --> relation header

Most PBs I've seen explicitly support only the first item, plus a way to "compile" arguments in, but often without a safe namespace; i.e., the developer has to read the phrase to see the names of the variables and then set up those variables in advance. (Some are safer and have the developer pass in an array, thus creating a new, safe namespace.)

So it seems like ideally we should have two levels of abstraction:
Some sort of database class/library that wraps the individual PB phrases and handles the namespace issues. This also gives a logical place for the required developer documentation.

The PB file, whose contents are reached only through the database class/library, thereby avoiding the need for extra metadata. The SQL coder and the library developer might need to coordinate when the SQL guy optimizes or rearranges queries, but ideally the library interface remains constant for the application programmers.
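The two levels described above could look roughly like the following. This is only a sketch under stated assumptions: the class name UserQueries, the handle find_user, the table and column names, and the {user_id} placeholder are all invented for illustration, and a class constant stands in for the real PB file.

```php
<?php
// Lower level: the phrase store (a constant array standing in for the PB file).
// Upper level: a library class that is the only code allowed to touch the phrases.
final class UserQueries
{
    private const PHRASES = [
        'find_user' => 'SELECT name, email FROM users WHERE user_id = {user_id}',
    ];

    /**
     * Build the query that looks up one user.
     *
     * Function name     -> query handle (find_user)
     * Function argument -> query parameter ({user_id})
     * Documented return -> relation header (name, email)
     *
     * @param int $userId value for the {user_id} placeholder
     * @return string SQL whose result rows carry the columns (name, email)
     */
    public function findUser(int $userId): string
    {
        // The int type hint keeps arbitrary text out of the query string.
        return strtr(self::PHRASES['find_user'], ['{user_id}' => (string) $userId]);
    }
}

$sql = (new UserQueries())->findUser(7);
echo $sql, "\n";
```

Application code only ever calls findUser(), so the SQL author can rearrange the phrase as long as the documented columns stay the same.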

However, in doing so it seems like I've written away a lot of the benefits of the PB pattern and am only using it to hide queries... Or is this the normal way it's used?
jason
Site Admin
Posts: 1767
Joined: Thu Apr 18, 2002 3:14 pm
Location: Montreal, CA

Post by jason »

Heh, seems you have been putting a lot of thought into this. Which is cool, because I really don't have a good PB class to use. However, when I was working on one, I came up with a solution to number 1, and I have my opinion on number 2.

1. INI files!

select_user_query = "SELECT whatever FROM tables WHERE some_key = {var}"

Of course, you could take this a step further and allow multi-line queries. One thing I would suggest with this is caching, which would be better handled by something like a Cache object. For my template stuff, I have a separate Cache object, and a FileCache object for caching to files, obviously.

But basically, your cache object would read in this INI file, and parse it, and store it as a PHP array (or whatever you wanted) that could be read instead of the INI file on every use.

This was my idea.
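A minimal sketch of the INI-plus-cache idea, assuming the phrase file holds one quoted value per handle as in the example line above. The function name loadPhrases() and the cache file layout are assumptions made for illustration; parse_ini_string() and var_export() are standard PHP functions. Cache invalidation when the INI file changes is deliberately left out.

```php
<?php
// Hypothetical phrase file contents; the key and {var} placeholder follow the post.
$ini = <<<'INI'
select_user_query = "SELECT whatever FROM tables WHERE some_key = {var}"
INI;

/**
 * Load phrases from INI text, caching the parsed array as a PHP file.
 * (Illustrative helper, not part of any existing library.)
 */
function loadPhrases(string $iniText, string $cacheFile): array
{
    if (is_file($cacheFile)) {
        // Cache hit: include the exported array instead of re-parsing the INI text.
        return include $cacheFile;
    }
    $phrases = parse_ini_string($iniText);
    if ($phrases === false) {
        throw new RuntimeException('Could not parse phrase file');
    }
    // Store the parsed phrases as native PHP code for later requests.
    file_put_contents($cacheFile, '<?php return ' . var_export($phrases, true) . ';');
    return $phrases;
}

$cacheFile = sys_get_temp_dir() . '/phrasebook_cache_' . uniqid() . '.php';
$first  = loadPhrases($ini, $cacheFile); // parses the INI text and writes the cache
$second = loadPhrases($ini, $cacheFile); // served from the cached PHP array
echo $second['select_user_query'], "\n";
unlink($cacheFile);
```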

2. I see the PB pattern as being something that allows you to abstract strings out of the program. Basically, you just say 'I want this string'. It has placeholders for variables, and should be 'compiled' into native PHP code for quicker access (essentially, a mini-templating system). What you get in return is merely a string.

In fact, the PB class shouldn't return anything BUT strings (or maybe an array of strings if that is deemed important, but I doubt it). As far as the abstraction is concerned, I believe this isn't the responsibility of the PB pattern. Its job is to return the result of a string request. It should also provide a simple variable/value replacement scheme. That's about it. Whatever is doing the calling should handle the abstraction.
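A strings-only phrasebook with a simple replacement scheme might be sketched like this. The class name StringPhrasebook and the method get() are hypothetical; the {name} placeholder syntax is taken from the INI example earlier in the thread. The "compile to native PHP" step is not shown.

```php
<?php
// Minimal sketch: a phrasebook that only ever hands back strings.
final class StringPhrasebook
{
    /** @var array<string, string> handle => phrase */
    private array $phrases;

    public function __construct(array $phrases)
    {
        $this->phrases = $phrases;
    }

    /** Return the named phrase with each {name} placeholder replaced. */
    public function get(string $handle, array $vars = []): string
    {
        if (!isset($this->phrases[$handle])) {
            throw new InvalidArgumentException("Unknown phrase: $handle");
        }
        // Passing values in an array gives the phrase its own namespace.
        $map = [];
        foreach ($vars as $name => $value) {
            $map['{' . $name . '}'] = (string) $value;
        }
        return strtr($this->phrases[$handle], $map);
    }
}

$pb = new StringPhrasebook([
    'select_user_query' => 'SELECT whatever FROM tables WHERE some_key = {var}',
]);
$query = $pb->get('select_user_query', ['var' => 42]);
echo $query, "\n";
// Values are substituted verbatim; escaping or bound parameters stay the caller's job.
```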

Of course, the PB pattern will require a contract between the "String Guy" and the "Programming Guy".

Those are my thoughts, at least. Excuse any poor thoughts, English, or grammar; it's 3 AM... I should be sleeping, but instead I was working on 2 cups of cawfee (New Jerseian for coffee).

Post by nielsene »

jason wrote: Heh, seems you have been putting a lot of thought into this. Which is cool, because I really don't have a good PB class to use. However, when I was working on one, I came up with a solution to number 1, and I have my opinion on number 2.

1. INI files!

select_user_query = "SELECT whatever FROM tables WHERE some_key = {var}"

Of course, you could take this a step further and allow multi-line queries. One thing I would suggest with this is caching, which would be better handled by something like a Cache object. For my template stuff, I have a separate Cache object, and a FileCache object for caching to files, obviously.

But basically, your cache object would read in this INI file, and parse it, and store it as a PHP array (or whatever you wanted) that could be read instead of the INI file on every use.

This was my idea.

Hmm, that is a simple solution, but I'm afraid it doesn't hold up to the syntax highlighter test. Any SQL highlighter/indenter that I know of will get confused by the assignment and quoting. I think I'm going to have to go with a comment-based approach:

Code:

--@label query-one
SELECT foo, bar, baz, qux, quux 
  FROM a NATURAL JOIN
       b JOIN c ON (b.var1 = c.var2)
 WHERE foo+bar=baz
 ORDER BY baz, foo, bar;
--@label query-two
SELECT 1;
...
Anything that is not a comment between two labels (or between a label and EOF) is stored under the label's name in the dictionary. My requirement for proper highlighting of the query is similar to your first rule of templating: that a visual editor can display it properly.
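A reader for this comment-labelled format could be sketched as follows. The function name parseSqlPhrasebook() is hypothetical, and the sketch assumes every label sits alone on a line starting with --@label. Note that it drops all SQL comment lines, including any inside a query.

```php
<?php
/**
 * Split a label-commented SQL file into [label => query text].
 * (Illustrative helper, not an existing library call.)
 */
function parseSqlPhrasebook(string $text): array
{
    $phrases = [];
    $label = null;
    foreach (preg_split('/\R/', $text) as $line) {
        if (preg_match('/^--@label\s+(\S+)\s*$/', $line, $m)) {
            // A new label starts a new dictionary entry.
            $label = $m[1];
            $phrases[$label] = '';
        } elseif ($label !== null && !preg_match('/^\s*--/', $line)) {
            // Non-comment lines belong to the most recent label.
            $phrases[$label] .= $line . "\n";
        }
    }
    return array_map('trim', $phrases);
}

$source = <<<'SQL'
--@label query-one
SELECT foo, bar, baz, qux, quux
  FROM a NATURAL JOIN
       b JOIN c ON (b.var1 = c.var2)
 WHERE foo+bar=baz
 ORDER BY baz, foo, bar;
--@label query-two
SELECT 1;
SQL;

$phrases = parseSqlPhrasebook($source);
echo $phrases['query-two'], "\n";
```

Because the labels are ordinary SQL comments, the same file still opens cleanly in SQL-mode or any other SQL editor.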

Of course, it means I would probably need to subclass PB into SQL_PB, HTML_PB, XML_PB, so it knows what type of comments to use.

jason wrote: 2. I see the PB pattern as being something that allows you to abstract strings out of the program. Basically, you just say 'I want this string'. It has placeholders for variables, and should be 'compiled' into native PHP code for quicker access (essentially, a mini-templating system). What you get in return is merely a string.

In fact, the PB class shouldn't return anything BUT strings (or maybe an array of strings if that is deemed important, but I doubt it). As far as the abstraction is concerned, I believe this isn't the responsibility of the PB pattern. Its job is to return the result of a string request. It should also provide a simple variable/value replacement scheme. That's about it. Whatever is doing the calling should handle the abstraction.

Of course, the PB pattern will require a contract between the "String Guy" and the "Programming Guy".

OK, I think I was trying to make it too complicated. I'll try coding up a version when I get past my upcoming deadline and see if it's about right.

Thanks.

Post by jason »

nielsene wrote: Hmm, that is a simple solution, but I'm afraid it doesn't hold up to the syntax highlighter test. Any SQL highlighter/indenter that I know of will get confused by the assignment and quoting. I think I'm going to have to go with a comment-based approach:

You're right, I didn't think of that, and it is important. However, the reason for using XML is that a single PhrasebookReader can read any phrasebook.

Syntax highlighting is good, however, you could easily create the SQL in an SQL client, and then simply copy/paste the SQL into the XML file. You may not always want to use SQL for running queries. You may want to create Documentation that displays these queries in a nice format.

Having a different format for each PhraseBook also decreases portability. The benefit to the PhraseBook object presented in the PDF (for the Perl version) was that I could take a single PhraseBook, and use it in any language that supported PhraseBooks.

Let's say I have a String PhraseBook file that contains strings for an application. Now, my application may be web based. But let's say there are parts of the application written in PHP, some parts in C, and other parts in Java. If I had one standard file format (a .pbml, if you will), then any client could read it without having to worry about what it actually contains.

In fact, I retract my original statement about using the INI file approach. XML as the file format is a superior method. Your reasoning about color coding is fine. However, let's promote a common PhraseBook Markup Language instead that allows this form of color coding.

"Why make the SQL guy learn XML?" The PBML is not difficult to learn. He doesn't have to learn it, either: you could create an interface for him to input SQL into. And, by promoting one common standard, we can get SQL editor makers to support the PBML standard. So a person could easily code PBML documents for SQL. This would be easier to promote and get accepted than many different competing formats.

I was wrong with my INI statement, and I believe the XML format is the correct format.

Post by nielsene »

jason wrote:
nielsene wrote: Hmm, that is a simple solution, but I'm afraid it doesn't hold up to the syntax highlighter test. Any SQL highlighter/indenter that I know of will get confused by the assignment and quoting. I think I'm going to have to go with a comment-based approach:

You're right, I didn't think of that, and it is important. However, the reason for using XML is that a single PhrasebookReader can read any phrasebook.

Syntax highlighting is good, however, you could easily create the SQL in an SQL client, and then simply copy/paste the SQL into the XML file. You may not always want to use SQL for running queries. You may want to create Documentation that displays these queries in a nice format.

I'm not sure I understand what you're getting at here. Requiring developers to cut and paste between windows in order to use their normal tools seems like a poor option. It makes life difficult for the SQL developer, who is the primary person who should be editing the PB data file. What were you trying to get at with "You may not always want to use SQL for running queries"? The number one reason I want to move away from having my queries embedded in the code is the syntax highlighting. The number two reason is to localize the queries so I can more easily update the set of queries when the schema changes or perform optimizations. Thus having a source file that is SQL friendly is important. The same would be true of HTML PBs, etc.; you always want to use the secondary language's highlighting and authoring tools.

I think this means that PB should be viewed more as an interface or an abstract base class.

jason wrote: Having a different format for each PhraseBook also decreases portability. The benefit to the PhraseBook object presented in the PDF (for the Perl version) was that I could take a single PhraseBook, and use it in any language that supported PhraseBooks.

Let's say I have a String PhraseBook file that contains strings for an application. Now, my application may be web based. But let's say there are parts of the application written in PHP, some parts in C, and other parts in Java. If I had one standard file format (a .pbml, if you will), then any client could read it without having to worry about what it actually contains.

I understand, and I can see why that would be nice. However, I don't see it being that much extra work to define the needed subclasses, even in the other languages.

jason wrote: In fact, I retract my original statement about using the INI file approach. XML as the file format is a superior method. Your reasoning about color coding is fine. However, let's promote a common PhraseBook Markup Language instead that allows this form of color coding.

"Why make the SQL guy learn XML?" The PBML is not difficult to learn. He doesn't have to learn it, either: you could create an interface for him to input SQL into. And, by promoting one common standard, we can get SQL editor makers to support the PBML standard. So a person could easily code PBML documents for SQL. This would be easier to promote and get accepted than many different competing formats.

The purpose of PB is to avoid writing strings of one language inside another language. By choosing XML as the file medium you've again mixed languages, thereby negating part of the reason for using PB. Furthermore, in the specific case of SQL and XML, there are some hard feelings among SQL and/or relational advocates about misuse of XML that I believe would cause an XML-based standard to fail.

jason wrote: I was wrong with my INI statement, and I believe the XML format is the correct format.

I see where you're coming from, but I don't agree. I think, though, that we'll continue to disagree. We each have different priorities for what PB should accomplish, which dictate our views. Still, I appreciate the discussion (but it's only made me more set against XML or var=value systems).

Going with the comment-based system also allows adding

Code:

--@var ... 
--@returnVar ...

metadata to the query, for use by an auto-documentation tool in a standard fashion.
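How such annotations might be collected for a documentation tool is sketched below. The post leaves the tag contents elided, so everything after each tag here (the find-user label, the user_id parameter, the name and email columns) is invented purely for illustration, as is the function name parsePhraseMeta().

```php
<?php
/**
 * Collect "--@tag value" annotations per label for a documentation tool.
 * (Illustrative helper; the tag values used below are hypothetical.)
 */
function parsePhraseMeta(string $text): array
{
    $meta = [];
    $label = null;
    foreach (preg_split('/\R/', $text) as $line) {
        if (!preg_match('/^--@(\w+)\s+(.+?)\s*$/', $line, $m)) {
            continue; // not an annotation line
        }
        [, $tag, $value] = $m;
        if ($tag === 'label') {
            $label = $value;
            $meta[$label] = ['var' => [], 'returnVar' => []];
        } elseif ($label !== null && isset($meta[$label][$tag])) {
            // Only the known tags (var, returnVar) are recorded.
            $meta[$label][$tag][] = $value;
        }
    }
    return $meta;
}

$source = <<<'SQL'
--@label find-user
--@var user_id
--@returnVar name
--@returnVar email
SELECT name, email FROM users WHERE user_id = {user_id};
SQL;

$meta = parsePhraseMeta($source);
print_r($meta['find-user']);
```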

Post by jason »

nielsene wrote: The purpose of PB is to avoid writing strings of one language inside another language.

No, that's not the purpose. The point is the same as the reason you separate design, data, and logic. As a programmer, you don't need to know what the query does to get you the data, just that it gets you the data. If the SQL coder needs to update the query, he can do that without having to bother the programmer. He can implement new features when a database gets an upgrade, for example. It abstracts that part out.

It also allows you to create better Database abstraction. You can have a phrasebook for each type of database, that way no matter which database you use, it uses optimized SQL for that database.
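The one-phrasebook-per-database idea might look like this in miniature. The driver keys, the recent_users handle, the table and column names, and the helper phraseFor() are all assumptions made for illustration; in-memory arrays stand in for separate phrasebook files.

```php
<?php
// Hypothetical per-database phrasebooks: same handle, SQL tuned for each dialect.
$phrasebooks = [
    'mysql'  => ['recent_users' => 'SELECT name FROM users ORDER BY joined DESC LIMIT {n}'],
    'sqlsrv' => ['recent_users' => 'SELECT TOP {n} name FROM users ORDER BY joined DESC'],
];

/** Look up a phrase in the phrasebook for the given driver and fill in {name} placeholders. */
function phraseFor(array $phrasebooks, string $driver, string $handle, array $vars = []): string
{
    if (!isset($phrasebooks[$driver][$handle])) {
        throw new InvalidArgumentException("No phrase '$handle' for driver '$driver'");
    }
    $map = [];
    foreach ($vars as $name => $value) {
        $map['{' . $name . '}'] = (string) $value;
    }
    return strtr($phrasebooks[$driver][$handle], $map);
}

// The calling code is identical whichever database is configured.
$mysql  = phraseFor($phrasebooks, 'mysql', 'recent_users', ['n' => 5]);
$sqlsrv = phraseFor($phrasebooks, 'sqlsrv', 'recent_users', ['n' => 5]);
echo $mysql, "\n", $sqlsrv, "\n";
```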
nielsene wrote: The number one reason I want to move away from having my queries embedded in the code is the syntax highlighting. The number two reason is to localize the queries so I can more easily update the set of queries when the schema changes or perform optimizations. Thus having a source file that is SQL friendly is important.

Syntax highlighting can still be accomplished. I have an SQL application open right now that I enter queries into and test. Once I am done testing a query, I can put it into my code (right now it still goes into PHP directly). If I had one big file with dozens of queries (the PhraseBook), it would be impossible for me to execute just one query.

Designing the pattern around an editor's limitations is not good practice.

nielsene wrote: However, I don't see it being that much extra work to define the needed subclasses, even in the other languages.

But the point is, it is extra work, which is something that patterns are intended to help you avoid. They are supposed to solve problems, not create new ones. Implementing a separate reader for each of many different formats creates a larger burden on development. Rather than simply creating a new PhraseBook and using it, you are then forced to define the syntax of the PhraseBook file format, implement an object to handle it, and publish the PhraseBook standard.

This is a lot of work simply to be able to use syntax highlighting, which is not always needed, or even used, for PhraseBooks.

nielsene wrote: Furthermore, in the specific case of SQL and XML, there are some hard feelings among SQL and/or relational advocates about misuse of XML that I believe would cause an XML-based standard to fail.

XML is used to describe data. That's what it's used for. That's what it's being proposed to do in the PhraseBook pattern. It's not being used for anything else.

It's a common standard, it's easy to use, and it's very much cross-platform.

By creating many competing standards, you will have problems.

Consider this: A person wants to use the PhraseBook pattern for something else we haven't thought of (maybe abstracting RegEx's because the RegEx guru doesn't know PHP that well and will fudge things up, bear with me on this example).

Rather than just sitting down and using the PhraseBook pattern, the programmer now has to invest time in setting up a standard, rather than using an existing one. Now, a lot of work is going into duplicated effort.

What if someone's editor doesn't like your method of using an SQL PhraseBook? Now suddenly, they won't use it. So they define their own standard. Now it becomes much more difficult to use it on different systems.

However, I am willing to concede this point: Your way can make it easier for you to use, and that is important.

Therefore, my recommendation is as follows: you implement the XML PhraseBook standard. This is the default standard. Anyone can use it. If they need a special setup, or a special structure for whatever reason, they can implement their own.

This is actually employing another pattern. I know what it is, I just can't think of the name, and I don't feel like opening the GoF right now (though it's right beside me, heh).

Anyway, by implementing the default XML standard, you ensure the pattern is still usable by default for most people, and extendable by those who have special cases and needs.
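One possible reading of this recommendation is a single reader interface with an XML default and room for custom formats. Everything named here is an assumption: the interface PhrasebookReader, the two classes, and especially the XML layout (a phrasebook element holding phrase elements with a name attribute), since no PBML format is actually defined in this thread. The post does not name the pattern it has in mind, so this sketch does not claim to be it.

```php
<?php
// Sketch: one reader interface, an XML default, and room for special cases.
interface PhrasebookReader
{
    /** @return array<string, string> handle => phrase */
    public function read(string $source): array;
}

// Default reader for a hypothetical XML phrasebook layout.
final class XmlPhrasebookReader implements PhrasebookReader
{
    public function read(string $source): array
    {
        $xml = simplexml_load_string($source);
        if ($xml === false) {
            throw new RuntimeException('Invalid phrasebook XML');
        }
        $phrases = [];
        foreach ($xml->phrase as $phrase) {
            $phrases[(string) $phrase['name']] = trim((string) $phrase);
        }
        return $phrases;
    }
}

// A project with special needs supplies its own reader instead.
final class LabelCommentReader implements PhrasebookReader
{
    public function read(string $source): array
    {
        $phrases = [];
        // Each chunk after "--@label" is "<name>\n<query text>".
        foreach (preg_split('/^--@label\s+/m', $source, -1, PREG_SPLIT_NO_EMPTY) as $chunk) {
            [$name, $body] = explode("\n", $chunk, 2) + [1 => ''];
            $phrases[trim($name)] = trim($body);
        }
        return $phrases;
    }
}

$xml = '<phrasebook><phrase name="one">SELECT 1;</phrase></phrasebook>';
$sql = "--@label one\nSELECT 1;\n";
$fromXml = (new XmlPhrasebookReader())->read($xml);
$fromSql = (new LabelCommentReader())->read($sql);
var_dump($fromXml === $fromSql);
```

Both readers yield the same handle-to-phrase array, so calling code does not care which storage format was used. SQL containing characters such as < or & would need escaping or a CDATA section in the XML variant.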

Post by nielsene »

jason wrote: Designing the pattern around an editor's limitations is not good practice.

I think we are both mixing up patterns and implementations of patterns rather badly.

The PB pattern says nothing about how the data is stored. It covers only what uses the pattern has, its interface, the pros and cons, and so on.

Nothing I've said deals with that.

An implementation of a pattern has to target the planned use. You don't write a generic "decorator" or "factory" class that you use every time you need one. You recognize when your design calls for a "Factory" pattern and then implement a Factory class for your needs. Patterns are basically a meta-interface. I don't go out and say "Hey, I need to download a Command pattern"; I say "This looks like a Command pattern emerging, let's see what the GoF say about the Command pattern, maybe it will help me avoid some common gotchas in my implementation."

What you are describing I would call an XML-storage based, implementation of the PB Pattern.

A class is only useful when it's used. Ditto for a pattern. Making a generic implementation of a pattern seems useful for illustrating the pattern in action, but it seems to produce a compromise class that will not be commonly used.

Post by nielsene »

jason wrote: Designing the pattern around an editor's limitations is not good practice.

Whoops, skipped my other point.

I'm not proposing any changes to the pattern. I'm proposing an implementation. Designing an implementation around limitations of its expected use IS good practice.

Just as there can be said to be a Template pattern, you designed your implementation of said pattern around your requirements, for instance your visual design requirement. I see this as an exact parallel to what I'm discussing.