Incremental migration from legacy to new code base...

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Incremental migration from legacy to new code base...

Post by nielsene »

My attempts to create a FrontController for my new architecture that falls back to the legacy scripts if there is no handler registered are hitting multiple snags -- horrendous complications in the FrontController, problems hooking up the FrontController to Apache, etc.

I'm tempted to completely "separate" the two code bases: give the new code a custom extension and use AddHandler, etc. to register the FrontController for the new code, while allowing regular Apache/filesystem dispatch to the legacy code. However, that still has problems -- needing to change legacy code to point to the new extensions as those actions are written, etc.

Plus I currently make somewhat extensive use of virtual directories (ForceType handlers) and plan to move more of the application to a virtual directory (i.e. decoupling URLs from the file system). However, that breaks the AddHandler method of registering the FrontController, as all the directories up to the script.ext must exist.

Has anyone had any experience migrating a legacy (transaction script/page controller architecture) to a front controller based architecture, in place and incrementally?
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

I have read your various posts and I get the impression that you are over-implementing or over-designing or something. AddHandler or ForceType will not solve basic architectural problems. I don't even see where, at this point, clean URLs are of any benefit to your application unless there is a REST interface somewhere. I'd say simplify.
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

I have a large existing application with poor architecture. I'm trying to migrate it to a new architecture. I need the two to overlap seamlessly, so I can incrementally switch pages over when adding new functionality while bug fixing in the old as needed. Given my amount of time available for this volunteer project, it will probably take over a year to finish the transition and I can't go that long without user-visible features appearing.

Clean URLs have been one of the requirements from day one; major sections of the site are "replicated" using URL-parsing:
http://sitename/register/CompName/scriptName
where CompName is any of about 10-20 different open events at a time and scriptName is a directory with subdirectories containing upwards of 50 scripts.
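
For illustration only, here is a minimal PHP sketch of how a path of that shape could be picked apart. The function name parseRegisterPath and the array keys are hypothetical and not taken from the application discussed here; the script part is allowed to span several segments because the post says it is a directory with subdirectories.

```php
<?php
// Hypothetical helper (not from the thread): split a clean URL such as
// /register/CompName/scriptName into its parts.
function parseRegisterPath(string $requestUri): ?array
{
    // Keep only the path; any query string is ignored here.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }

    // Drop empty segments produced by leading, trailing, or doubled slashes.
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));
    if (count($segments) < 3 || $segments[0] !== 'register') {
        return null;
    }

    return [
        'section' => $segments[0],
        'comp'    => rawurldecode($segments[1]),
        // Everything after the event name identifies the script.
        'script'  => implode('/', array_map('rawurldecode', array_slice($segments, 2))),
    ];
}
```

Called with '/register/SpringOpen/entries/add' (a made-up event and script), this would yield comp "SpringOpen" and script "entries/add"; any path outside /register/ yields null.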

However, my main comments about AddHandler/ForceType were concerning how to even configure Apache/PHP to invoke the front controller, as all of them were causing problems... I've found two possible solutions now: one using an auto_prepend_file with a custom 404 to catch the virtual directories, and a mod_rewrite based approach.

However I would still like to hear lessons learned, etc about migrating from one architecture to another, in place.
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Post by Roja »

nielsene wrote:However I would still like to hear lessons learned, etc about migrating from one architecture to another, in place.
I can't speak directly to migrating to a true FrontController, but I can speak *a lot* about migrating from legacy to new code (while maintaining user expectations).

The project I have the most passion for ("Blacknova Traders") is the perfect example. It's primarily code written over five years ago. Worse, three years ago, development pretty much hit a standstill in the main tree.

So there are things that will make your skin crawl in our tree. Unfiltered input? All over the place. Output? Non-standard. Mix output and processing? You betcha.

It's a mess. However, three years ago, I started a fork of it ("The Kabal Invasion"). One of the first things I did was to add templating (Smarty), which allowed me to migrate pages to a better input/processor/output model (slightly different from true MVC).

That process takes a VERY long time. There are over 40,000 lines of code, and over 10,000 lines of it are related to processor-driven output. That's challenging to migrate, and it takes a long time to 'do it right'.

Then we have the 'simple' things like filtering input and output. Just migrating the game to not use undefined variables took months.

So, you get the idea: A massively legacy codebase, and I've been hacking away, migrating it to not-so-legacy development.

What did I learn?

To be reasonable, and less idealist. I have a passion for standards and consistency. I love to say "Everything in the game does X", where X is some highly ideal goal like 'outputs html compliant code'.

Unfortunately, when you are trying to improve things AND keep the codebase usable, you have to accept that it will be slow progress, it won't be ideal, and there will be compromises along the way.

For example, I started by cleaning up the "include" mess in every page. There were roughly a half-dozen include calls in every file. Unfortunately, there was no consistency: some files included 4 others, while others included 6. You really couldn't rely on any file having a set of functions available, or having run cleanup code already.

So, I organized the files, rewrote them to be more consistent, and made "global_includes" that each file would include. It helped tremendously, because I was able to say with certainty that any given file had what I needed already done.

However, now, some three years later, I'm doing the opposite. Now that most files do the right thing, I need better performance. That means decreasing the number of files included, and the size of those files. That means going file-by-file and ensuring what the file needs included, and also making sure that it does include it - and not by using a global include with a dozen files.

It's a simple example of the tradeoffs you will have to make. If I started from the beginning and said "The *ideal* solution would be..", then we would have lost progress in hundreds of other files. By being reasonable, and embracing the middle ground, we made solid progress.

You asked what my lessons learned were, and in a nutshell, it's "Perfect is the enemy of Good". Keep in mind a long term goal (which can and should be hopefully idealistic), but don't freak out if you can't get 100% of the way there, in every file, right now.

Oh, and good luck. Three years later, I'm rewriting half the files in the game again. :P
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

Thanks Roja. In your case, you mentioned forking the code. Did you switch to running off the forked code base from day 1 (or an early day) or did you spend some time getting the forked code into some state where the legacy and new co-existed better? Do most installs of the game use BNT or Kabal Invasion code base now?
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Post by Roja »

nielsene wrote:Thanks Roja. In your case, you mentioned forking the code. Did you switch to running off the forked code base from day 1 (or an early day) or did you spend some time getting the forked code into some state where the legacy and new co-existed better?
Personally, I switched to the new codebase on day 1. Because I took an incremental approach, it remained usable, and within 3 months, I did my first release for the public. Some of that time did require getting the forked code into a 'median' between new and old that allowed the game to run without noticeable problems affecting the user.
nielsene wrote:Do most installs of the game use BNT or Kabal Invasion code base now?
That's the funny part. Now, we are doing a reverse fork - the development for the next version of BNT ("Proton Pack") is based on the last TKI release.

TKI unfortunately suffered from serious trademark issues, so we've discontinued it, and have been actively working to remove it from the 'net.

However, the gameplay style and changes in TKI will all be available in Proton, as a configuration option when installing the game. ("Classic, or Advanced").
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

Interesting, and that does sound extremely similar to where I am. (Except I haven't forked it as I'm the only developer...) I've spent about a month trying to get the new framework up that can coexist. I hope to release the next public version/update my beta server by the end of the long weekend and then the long haul of gradual changeover can commence.
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

nielsene wrote:Clean URLs have been one of the requirements from day one; major sections of the site are "replicated" using URL-parsing:
http://sitename/register/CompName/scriptName
where CompName is any of about 10-20 different open events at a time and scriptName is a directory with subdirectories containing upwards of 50 scripts.
Again, I think you are focused on things like clean URLs that have very little to do with the internal architecture. A controller should be able to handle different types of URLs without much difficulty.
nielsene wrote:However, my main comments about AddHandler/ForceType were concerning how to even configure Apache/PHP to invoke the front controller, as all of them were causing problems... I've found two possible solutions now: one using an auto_prepend_file with a custom 404 to catch the virtual directories, and a mod_rewrite based approach.
I really don't understand the "configure Apache/PHP to invoke the front controller" comments you make. A Front Controller is just a PHP script like any other. It is commonly index.php in a basic setup, but you can have multiple FCs and name them anything you want. No Apache configuration required. In fact, most frameworks (e.g. Ruby on Rails) are getting away from needing any special server configuration and doing everything internally.
nielsene wrote:However I would still like to hear lessons learned, etc about migrating from one architecture to another, in place.
I usually do it piecemeal, moving one section to the new architecture at a time so I can at least manage testing the changes.
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

arborint wrote:
nielsene wrote:Clean URLs have been one of the requirements from day one; major sections of the site are "replicated" using URL-parsing:
http://sitename/register/CompName/scriptName
where CompName is any of about 10-20 different open events at a time and scriptName is a directory with subdirectories containing upwards of 50 scripts.
Again, I think you are focused on things like clean URLs that have very little to do with the internal architecture. A controller should be able to handle different types of URLs without much difficulty.
Agreed that the controller can handle different types of URLs without difficulty. However, the structure of the site's URLs does influence how the controller can be invoked from the webserver.
nielsene wrote:However, my main comments about AddHandler/ForceType were concerning how to even configure Apache/PHP to invoke the front controller, as all of them were causing problems... I've found two possible solutions now: one using an auto_prepend_file with a custom 404 to catch the virtual directories, and a mod_rewrite based approach.
I really don't understand the "configure Apache/PHP to invoke the front controller" comments you make. A Front Controller is just a PHP script like any other. It is commonly index.php in a basic setup, but you can have multiple FCs and name them anything you want. No Apache configuration required. In fact, most frameworks (e.g. Ruby on Rails) are getting away from needing any special server configuration and doing everything internally.
I am not going to do a "run the whole site through index.php?action=foo" method of dynamic dispatching. Therefore I need some sort of "trick" to hook up Apache to the FC. This is especially true given the legacy support requirements. The FC tries to handle every request; if it can't, it has a set method for attempting a legacy page before failing with an InvalidRequestView.
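
As a rough illustration of that fallback chain (registered handler first, then a legacy page, then failure), a minimal PHP sketch might look like the following. The class and method names, the legacy directory layout, and the plain "Invalid request" string standing in for the InvalidRequestView are all assumptions made for the example, not the actual code discussed in this thread.

```php
<?php
// Hypothetical sketch; class, method, and directory names are assumptions.
final class FrontController
{
    /** @var array<string, callable> action name => new-style handler */
    private $handlers = [];
    /** @var string directory holding the untouched legacy scripts */
    private $legacyRoot;

    public function __construct(string $legacyRoot)
    {
        $this->legacyRoot = rtrim($legacyRoot, '/');
    }

    public function register(string $action, callable $handler): void
    {
        $this->handlers[$action] = $handler;
    }

    // Decide who serves the action without running anything yet.
    public function resolve(string $action): array
    {
        if (isset($this->handlers[$action])) {
            return ['new', $this->handlers[$action]];
        }
        // Only plain path characters are accepted, so "../" cannot
        // escape the legacy directory.
        if (preg_match('#^[A-Za-z0-9_/-]+$#', $action) === 1) {
            $script = $this->legacyRoot . '/' . $action . '.php';
            if (is_file($script)) {
                return ['legacy', $script];
            }
        }
        return ['invalid', null];
    }

    public function dispatch(string $action): string
    {
        [$kind, $target] = $this->resolve($action);
        if ($kind === 'new') {
            return (string) $target();
        }
        if ($kind === 'legacy') {
            ob_start();
            include $target; // the legacy page echoes its own output
            return (string) ob_get_clean();
        }
        return 'Invalid request'; // stand-in for the InvalidRequestView
    }
}
```

With this shape, migrating a page amounts to registering a handler under the same action name; until then the legacy script keeps answering.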
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

nielsene wrote:I am not going to do a "run the whole site through index.php?action=foo" method of dynamic dispatching. Therefore I need some sort of "trick" to hook up Apache to the FC. This is especially true given the legacy support requirements. The FC tries to handle every request; if it can't, it has a set method for attempting a legacy page before failing with an InvalidRequestView.
If you are looking for a "trick" you will probably implement a poor design. Again, the URL style does not matter. Whether it is "index.php?action=foo" or "/action/foo/" or "/foo/" the controller just needs to normalize it internally. Once you do that part then the mapping and routing is straightforward. If you get "foo" as your action and it has a mapping or route then run the new code; if it doesn't then include an old page.
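
A minimal sketch of that normalization step, assuming the three URL styles named above and a hypothetical function name normalizeAction (neither the name nor the "action" prefix convention comes from a real framework):

```php
<?php
// Hypothetical normalizer: reduce "index.php?action=foo", "/action/foo/"
// and "/foo/" to the same action name.
function normalizeAction(string $requestUri): ?string
{
    $parts = parse_url($requestUri);
    if ($parts === false) {
        return null;
    }

    // Style 1: index.php?action=foo
    parse_str($parts['query'] ?? '', $query);
    if (isset($query['action']) && is_string($query['action']) && $query['action'] !== '') {
        return $query['action'];
    }

    // Styles 2 and 3: /action/foo/ or /foo/
    $segments = array_values(array_filter(explode('/', $parts['path'] ?? ''), 'strlen'));
    if ($segments !== [] && $segments[0] === 'action') {
        array_shift($segments); // drop the literal "action" prefix
    }

    return $segments === [] ? null : implode('/', $segments);
}
```

Once every style collapses to the same string, the lookup described above (mapping found: run the new code; no mapping: include the old page) only has to deal with one form.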

I might also ask why you are not going to do a "run the whole site through index.php?action=foo" method of dynamic dispatching. Is there a technical reason or just a matter of taste?
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

arborint wrote: I might also ask why you are not going to do a "run the whole site through index.php?action=foo" method of dynamic dispatching. Is there a technical reason or just a matter of taste?
Primarily REST-ful type considerations, some taste. Non-idempotent requests are handled via POST, etc.
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

arborint wrote:
nielsene wrote:I am not going to do a "run the whole site through index.php?action=foo" method of dynamic dispatching. Therefore I need some sort of "trick" to hook up Apache to the FC. This is especially true given the legacy support requirements. The FC tries to handle every request; if it can't, it has a set method for attempting a legacy page before failing with an InvalidRequestView.
If you are looking for a "trick" you will probably implement a poor design. Again, the URL style does not matter. Whether it is "index.php?action=foo" or "/action/foo/" or "/foo/" the controller just needs to normalize it internally. Once you do that part then the mapping and routing is straightforward. If you get "foo" as your action and it has a mapping or route then run the new code; if it doesn't then include an old page.
The URL style DOES matter. If you're using "/action/foo" or "/foo/" or "foo.do", or any of the various other ways of representing the action in the URL without doing "index.php?action=foo", you have to provide some mechanism for the front controller to be invoked. If the URL is:
"http://somesite/action", there is NO .php file to be invoked to start the process. You need to use something, whether it's mod_rewrite, AddHandler, ForceType, or auto_prepend_file, to get the controller to execute and start to pick apart the request_uri/etc. You cannot rely on the "simple" Apache/file-system dispatch method.