Time

Celauran
Moderator
Posts: 6427
Joined: Tue Nov 09, 2010 2:39 pm
Location: Montreal, Canada

Post by Celauran »

I've been trying to come up with a suitable expression to identify valid times. This works, but I'm wondering if some regex whiz can improve it.

Code:

/^(([01]?[\d]{1})|(2[0-3]))[:h][0-5]{1}[\d]{1}(\s)?(AM|PM)?/i
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Re: Time

Post by Weirdan »

Uhm, how about DateTime::createFromFormat()?
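
To make the suggestion concrete: instead of matching the string against a pattern, you hand it to a date/time parser and let the parser reject anything invalid. Here is a rough sketch of that approach, with Python's datetime.strptime standing in for the PHP call. The helper name parse_time and the two accepted formats are assumptions made up for illustration.

```python
from datetime import datetime, time
from typing import Optional

# Formats accepted by this sketch (an assumption, adjust to taste):
# 24-hour "18:55" and 12-hour "6:55 PM". Note that %p follows the
# current locale's AM/PM strings.
FORMATS = ("%H:%M", "%I:%M %p")

def parse_time(text: str) -> Optional[time]:
    """Return a time object if text fits one of FORMATS, else None."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).time()
        except ValueError:
            continue  # this format did not fit, try the next one
    return None

for candidate in ("18:55", "6:55 PM", "25:00", "13:00 PM"):
    print(candidate, "->", parse_time(candidate))
```

The parser takes care of the range checks (hours, minutes, 12-hour vs 24-hour), so there is no hand-written pattern to debug.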
ragax
Forum Commoner
Posts: 85
Joined: Thu Dec 15, 2011 1:40 pm
Location: Nelson, NZ

Re: Time

Post by ragax »

Hi Celauran!

Looks like you've written a nice long expression... Cool!
Let's see if we can tweak it as you were asking.

First thing that jumps out at me: you have three places with a 1 in curly braces: {1}
You can remove those. A character class, like any single token, already matches exactly one character unless a quantifier follows it.

Second tiny optimization: you can factor the "M" out of (AM|PM), yielding something like:

Code:

(?:(?:A|P)M)
Third thought: on the line above, you can see some non-capturing groups. Your expression has five capturing groups. If you don't plan to capture anything, you can get rid of that overhead by using non-capturing groups. Or, in the case of the lone whitespace token (\s), forget the group altogether.
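
If you want to check the tweaks so far rather than take my word for it, here is a small sketch in Python (its re module handles these patterns the same way for our purposes). It counts the capturing groups in the original and in a simplified version, then confirms both match the same text on a few samples. The sample strings are my own picks, not anything from your data.

```python
import re

# The original expression, minus the delimiters and the trailing i flag.
original = re.compile(
    r"^(([01]?[\d]{1})|(2[0-3]))[:h][0-5]{1}[\d]{1}(\s)?(AM|PM)?",
    re.IGNORECASE,
)
# Same logic with {1} dropped, \d unwrapped, and non-capturing groups.
simplified = re.compile(
    r"^(?:[01]?\d|2[0-3])[:h][0-5]\d\s?(?:(?:A|P)M)?",
    re.IGNORECASE,
)

print("capturing groups:", original.groups, "vs", simplified.groups)

def matched(pattern, text):
    """Return the matched text, or None when there is no match."""
    m = pattern.match(text)
    return m.group(0) if m else None

samples = ["9:05", "09:05 am", "23h59", "12:00PM", "24:00", "7:60", "noon"]
for s in samples:
    print(repr(s), matched(original, s) == matched(simplified, s))
```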

One more idea: some places use different dividers between the hours and minutes, or none at all. For instance, 1855 or 12.44. So you could add characters to that class, or make it optional.

Also: no need to wrap a lone \d in a character class, as in the two instances of [\d].

If we make all these tweaks, here's where we are so far (without the delimiters):

Code:

(?i)^(?:[01]?\d|2[0-3])[.:h]?[0-5]\d\s?(?:(?:A|P)M)?
But oops, if there is no AM or PM, this will still match a space character after the time! So you need to make the space part of the AM/PM option. Let's fix this:

Code:

(?i)^(?:[01]?\d|2[0-3])[.:h]?[0-5]\d(?:\s?(?:(?:A|P)M))?
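
Here is a quick way to see the trailing-space difference, sketched in Python. The names loose and tight are just labels I picked; tight is the version with the whitespace written as \s inside the optional AM/PM group. The case-insensitive flag is passed as an argument instead of the inline (?i).

```python
import re

# First consolidated version: the optional whitespace sits outside the AM/PM group.
loose = re.compile(
    r"^(?:[01]?\d|2[0-3])[.:h]?[0-5]\d\s?(?:(?:A|P)M)?", re.IGNORECASE
)
# Corrected version: whitespace is only consumed when AM/PM follows it.
tight = re.compile(
    r"^(?:[01]?\d|2[0-3])[.:h]?[0-5]\d(?:\s?(?:(?:A|P)M))?", re.IGNORECASE
)

def matched(pattern, text):
    """Return the matched text, or None when there is no match."""
    m = pattern.match(text)
    return m.group(0) if m else None

for text in ("18:55 sharp", "6:05 pm", "1855"):
    print(repr(text), repr(matched(loose, text)), repr(matched(tight, text)))
```

On "18:55 sharp" the loose version swallows the space after the minutes while the tight one stops at "18:55". The other two samples come out the same either way.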
But wait a minute... As it is, the expression allows 18:55 pm and 22:22am. You need to decide whether you want to validate the am/pm combination as well. If so, you'll have to capture the hour and control the am/pm part with a conditional. And we can no longer split the hour at 00-19 vs 20-23. Instead, we need to test for 00 to 12 vs everything else (if you think 12:33pm is legit). This is where it leads us:

Code:

(?i)^(?:([0]?\d|1[012])|(?:1[3-9]|2[0-3]))[.:h]?[0-5]\d(?:\s?(?(1)(?:(?:A|P)M)))?
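
To poke at the conditional version, here is a sketch in Python, whose re module also supports the (?(1)...) syntax. The names TIME_RE and is_valid are made up for the example. One thing it shows: the pattern has no end anchor, so a plain match still accepts the "22:22" prefix of "22:22am". I use fullmatch here to force the whole string to match, which is an assumption about what you want.

```python
import re

TIME_RE = re.compile(
    r"^(?:([0]?\d|1[012])|(?:1[3-9]|2[0-3]))"  # group 1 is set only for hours 0-12
    r"[.:h]?[0-5]\d"                            # optional separator, then minutes
    r"(?:\s?(?(1)(?:(?:A|P)M)))?",              # AM/PM allowed only if group 1 matched
    re.IGNORECASE,
)

def is_valid(text):
    """True when the whole string is an acceptable time."""
    # fullmatch acts like an end anchor, which the pattern itself lacks.
    return TIME_RE.fullmatch(text) is not None

for text in ("12:33pm", "08:55 pm", "22:22", "22:22am", "13:00 PM"):
    print(repr(text), is_valid(text))

# Without an end anchor, the prefix of an invalid string still matches.
print(TIME_RE.match("22:22am").group(0))
```

If you keep the expression as a PCRE pattern, adding $ at the end should have the same effect as fullmatch does here.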
There might still be a bug or two in there, but without spending an hour on it, that's how I would start debugging your expression.

Hope this is the kind of info you were looking for!
:)

Wishing you a fun day.
ragax
Forum Commoner
Posts: 85
Joined: Thu Dec 15, 2011 1:40 pm
Location: Nelson, NZ

Re: Time

Post by ragax »

One more thought:
At the moment, Am and pM are legal. It's up to you whether to accept that. If not, we need to remove the case-insensitive matching and spell out the allowed strings: am|AM|pm|PM
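
A sketch of that stricter variant, in Python and without the am/pm conditional to keep it short. STRICT_RE and accepts are names I made up. One side effect to be aware of: once the case-insensitive flag is gone, the h separator becomes case-sensitive too, so 18H30 stops matching unless you add H to the class.

```python
import re

# No IGNORECASE flag: only the four spellings listed are accepted.
STRICT_RE = re.compile(
    r"^(?:[01]?\d|2[0-3])[.:h]?[0-5]\d(?:\s?(?:am|AM|pm|PM))?"
)

def accepts(text):
    """True when the whole string matches the strict pattern."""
    return STRICT_RE.fullmatch(text) is not None

for text in ("6:05 PM", "6:05 pm", "6:05 Pm", "18h30", "18H30"):
    print(repr(text), accepts(text))
```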

Also:
Checking Jeffrey Friedl's book, I see that Mastering Regular Expressions has this (after correcting a typo):

Code:

(1[012]|[1-9]):[0-5][0-9] (am|pm)
It doesn't go as far as what we've done. But its hour range (1 to 12, no leading zero) may be the logic to use for the am/pm test in our expression (the hour captured in group 1), as I haven't fully thought through or debugged that part (which times are legal with am/pm).
I looked through Jan Goyvaerts's Regular Expressions Cookbook and didn't see a time pattern.
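
For reference, here is how the book's pattern quoted above behaves on a few strings, sketched in Python. BOOK_RE and book_accepts are names I picked, and I use fullmatch because the pattern as quoted carries no anchors.

```python
import re

# The pattern as quoted from the book: 12-hour clock, lowercase am/pm,
# exactly one space before the suffix.
BOOK_RE = re.compile(r"(1[012]|[1-9]):[0-5][0-9] (am|pm)")

def book_accepts(text):
    """True when the whole string matches the book's pattern."""
    return BOOK_RE.fullmatch(text) is not None

for text in ("12:30 pm", "9:59 am", "0:15 am", "13:00 pm", "09:15 am", "9:59 AM"):
    print(repr(text), book_accepts(text))
```

So hour 0, hours past 12, leading zeros, and uppercase AM all get rejected, which is a stricter idea of a legal am/pm time than group 1 in our expression currently has.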
Celauran
Moderator
Posts: 6427
Joined: Tue Nov 09, 2010 2:39 pm
Location: Montreal, Canada

Re: Time

Post by Celauran »

Weirdan wrote: Uhm, how about DateTime::createFromFormat()?
Sadly, the host has PHP 5.2.17, and DateTime::createFromFormat() needs PHP 5.3 or later.

playful, thanks a lot! I'll take some time to poke through that.