Gimme a name : URI and friends
Moderator: General Moderators
- RobertGonzalez
- Site Administrator
- Posts: 14293
- Joined: Tue Sep 09, 2003 6:04 pm
- Location: Fremont, CA, USA
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
Okay, after reading all your comments, and doing some reading of the URI RFC, I've come to some conclusions:
Separating validation, normalization and plain-old fixing would negatively affect both performance and readability.
A URI is a complex data format governed by many different specifications, one per scheme. However, the URI RFC defines a generic syntax applicable to all schemes, on top of which each scheme can place further constraints.
The first logical step, then, would be to parse the URI according to the generic syntax and then pass on the components to the correct scheme for further parsing. If the URI turns out to be completely invalid, we can abort real fast. However, notice how parsing and validation are intertwined: you must parse before you can validate.
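As a rough illustration of that first step, here is a minimal Python sketch of a generic-syntax parse. The regular expression is the one given in Appendix B of RFC 3986; the names `Components` and `parse_generic` are made up for this example and do not come from any existing library. It only splits the URI into components and performs no scheme-specific checks.

```python
import re
from typing import NamedTuple, Optional

# Generic-syntax regular expression from RFC 3986, Appendix B.
_GENERIC = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


class Components(NamedTuple):
    """Hypothetical container for the five generic URI components."""
    scheme: Optional[str]
    authority: Optional[str]
    path: str
    query: Optional[str]
    fragment: Optional[str]


def parse_generic(uri: str) -> Components:
    """Split a URI into its generic components (no validation yet)."""
    # Every part of the pattern is optional, so match() always succeeds.
    m = _GENERIC.match(uri)
    return Components(
        scheme=m.group(2),
        authority=m.group(4),
        path=m.group(5),
        query=m.group(7),
        fragment=m.group(9),
    )


if __name__ == "__main__":
    print(parse_generic("http://example.com/a/b?x=1#top"))
    print(parse_generic("mailto:user@example.com"))
```

The result of `parse_generic` is what would then be handed to the scheme-specific code, so the string only gets parsed once.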
You also need to parse before you can fix or normalize.
If we separate the validator logic, we end up parsing the URI twice, and, without a common function set, the code gets duplicated too.
Further complicating the situation is the fact that we have two levels of parsing, validation and fixing: the generic URI level and the specific scheme (e.g. HTTP) level.
So... to go full out OOP, we need a ParsedURI, ParsedURI_HTTP and friends, URIValidator, URIValidator_HTTP and friends, URINormalizer, URINormalizer_HTTP and friends, URIFixer for full extensibility and modularity. All this for a spec that really won't change. Oh, and throw in a ResolveRelativeURI.
I'd much rather think of URI validation as a black box that takes a random URI and spits out a good one. Split out the schemes, because those change a lot. But that's it. Top down in a weird convoluted way.
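To make the black-box idea concrete, here is a minimal Python sketch, not a proposal for the real class layout. It leans on the standard library's `urllib.parse` for the generic parse; `fix_uri`, `handle_http` and the `SCHEMES` registry are hypothetical names invented for this example. The one rule in the HTTP handler (require a host, default the path to "/") is a stand-in for real scheme-level validation, and relative URIs are simply rejected here because they carry no scheme.

```python
from typing import Callable, Dict, Optional
from urllib.parse import SplitResult, urlsplit, urlunsplit

# A scheme handler takes the already-parsed parts and returns fixed parts,
# or None if the URI cannot be salvaged. (Hypothetical interface.)
SchemeHandler = Callable[[SplitResult], Optional[SplitResult]]


def handle_http(parts: SplitResult) -> Optional[SplitResult]:
    """Illustrative scheme-level check: require a host, default the path."""
    if not parts.netloc:
        return None
    return parts._replace(path=parts.path or "/")


# Only this registry varies from scheme to scheme; the rest is the black box.
SCHEMES: Dict[str, SchemeHandler] = {
    "http": handle_http,
    "https": handle_http,
}


def fix_uri(uri: str) -> Optional[str]:
    """Take an arbitrary URI string, return a cleaned-up one or None."""
    try:
        parts = urlsplit(uri.strip())  # the single generic parse
    except ValueError:
        return None
    handler = SCHEMES.get(parts.scheme)  # urlsplit lowercases the scheme
    if handler is None:
        return None  # unknown or missing scheme: reject in this sketch
    fixed = handler(parts)
    return urlunsplit(fixed) if fixed is not None else None


if __name__ == "__main__":
    print(fix_uri("HTTP://example.com"))
    print(fix_uri("javascript:alert(1)"))
```

The caller only ever sees `fix_uri`; parsing, validation and fixing share one pass over the string, and supporting a new scheme means adding one entry to the registry.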
I've got a feeling that splitting them out is the knee-jerk reaction. Does it make less sense after I've explained it this way, or did I completely miss the point?
Hmm... the definition makes sense. Would AttrHandler and ChildHandler make sense too? What about URIHandler? Does that break any conventions?
- Christopher
- Site Administrator
- Posts: 13596
- Joined: Wed Aug 25, 2004 7:54 pm
- Location: New York, NY, US
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
- RobertGonzalez
- Site Administrator
- Posts: 14293
- Joined: Tue Sep 09, 2003 6:04 pm
- Location: Fremont, CA, USA
Ambush Commander wrote: Hmm... the definition makes sense. Would AttrHandler and ChildHandler make sense too? What about URIHandler? Does that break any conventions?
To me they do, but I made the URIHandler recommendation, so I am biased. Although, when I step back and look at AttrHandler and ChildHandler, they don't make as much sense as URIHandler does.
Re: Gimme a name : URI and friends
Luke wrote: I would definitely go with URIne
What's URIne?
- John Cartwright
- Site Admin
- Posts: 11470
- Joined: Tue Dec 23, 2003 2:10 am
- Location: Toronto
Re: Gimme a name : URI and friends
remshad wrote:
Luke wrote: I would definitely go with URIne
What's URIne?
Please stop resurrecting old threads.
Topic Locked.