Conditional and Lookbehind

veleshanas
Forum Newbie
Posts: 10
Joined: Sat May 24, 2008 10:58 pm

Conditional and Lookbehind

Post by veleshanas »

Hello Forum,

I am writing a regex that concatenates several words into one character string. I am using conditional and lookbehind constructions but I keep getting unexpected results. Could anyone help me understand what is going on?

The input for the regex is a few words:
yom sase rare imas ita

The regex should concatenate them with the following rules:
1. The first consonant of a word should be deleted if the word immediately before ends with a consonant. E.g., yom sase —> yomase.
2. The first vowel of a word should be deleted if the word immediately before ends with a vowel. E.g., tabe imasu —> tabemasu.

Since vowels and consonants are mutually exclusive, I thought I could define them as [aeiou] and [^aeiou], and use a regex conditional to express the two rules in one regex. Unfortunately, my regex below matches more than it should.

For yom sase rare imas ita,

Code: Select all

/(?(?<=\B[^aeiou]\b ))\b[^aeiou]|\b[aeiou]/
matches
[!MATCH!]om[!MATCH!][!MATCH!]ase[!MATCH!][!MATCH!]are[!MATCH!][!MATCH!]mas[!MATCH!][!MATCH!]ta
where the matched portions are replaced by [!MATCH!].

What I want to match in the same convention would be:
yom [!MATCH!]ase rare [!MATCH!]mas ita


THANK YOU in advance for comments!
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: Conditional and Lookbehind

Post by prometheuzz »

Something like this?

Code: Select all

<?php
$test = "yom sase rare imas ita";
echo $test . "\n";
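// Capture the last character of a word ($1 or $2); the whitespace and the first
// character of the next word are consumed and dropped when both are in the same class.
echo preg_replace('/([^aeiou])\s+[^aeiou]|([aeiou])\s+[aeiou]/', '$1$2', $test) . "\n";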
/* output:
yom sase rare imas ita
yomase raremas ita
*/
?>
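
For comparison, here is a rough sketch of how the same two rules might be written with plain lookbehinds, which is closer to what the original post was attempting. As far as I can tell, the conditional in the original pattern closes before any yes/no branch, so it always succeeds, and [^aeiou] also matches a space, which would explain the extra matches. The constant and function names below are made up for illustration, and the sketch assumes lowercase ASCII words separated by single spaces.

```php
<?php
// Hypothetical helpers, not from the thread. Assumes lowercase ASCII words
// separated by exactly one space; "y" is treated as a consonant.
const CONSONANT = '[b-df-hj-np-tv-z]';
const VOWEL = '[aeiou]';

// Mark the first letter of a word when the previous word ends in the same class.
function mark_deletions(string $text): string
{
    $pattern = '/(?<=' . CONSONANT . ' )' . CONSONANT
             . '|(?<=' . VOWEL . ' )' . VOWEL . '/';
    return preg_replace($pattern, '[!MATCH!]', $text);
}

// Delete the space together with that first letter, joining the two words.
function join_words(string $text): string
{
    $pattern = '/(?<=' . CONSONANT . ') ' . CONSONANT
             . '|(?<=' . VOWEL . ') ' . VOWEL . '/';
    return preg_replace($pattern, '', $text);
}

echo mark_deletions('yom sase rare imas ita'), "\n";
echo join_words('yom sase rare imas ita'), "\n";
```

Spelling out the consonants instead of using [^aeiou] keeps spaces and digits out of the consonant class. The lookbehinds are fixed-length, which PCRE requires.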
veleshanas
Forum Newbie
Posts: 10
Joined: Sat May 24, 2008 10:58 pm

Re: Conditional and Lookbehind

Post by veleshanas »

Hello prometheuzz,

It works!! Thank you!

I hope I am not asking too much but how would you write a regex to limit the text input for this concatenation? The input should be:
A. alphabets only OR
B. alphabets and spaces BUT NOT
C. spaces only


Can I use conditional to write something like this?
IF the input is spaces, THEN select nothing, ELSE select alphabets and spaces.
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: Conditional and Lookbehind

Post by prometheuzz »

You're welcome.

I don't know exactly what you mean by "alphabet", but rejecting input strings that consist entirely of whitespace characters (or are empty) can be done like this:

Code: Select all

if(!preg_match('/^\s*$/', $input)) {
  # your code here
}
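
If "alphabets" is taken to mean the ASCII letters a-z and A-Z (an assumption, the thread does not settle this), the three cases A, B and C could be checked with a single pattern and no conditional, roughly as sketched here. The function name is_valid_input() is hypothetical.

```php
<?php
// Assumption: "alphabets" means ASCII letters only; is_valid_input() is a
// hypothetical name used for illustration.
function is_valid_input(string $input): bool
{
    // \A and \z anchor the whole string. At least one letter is required,
    // so empty input and spaces-only input are both rejected.
    return preg_match('/\A *[a-z][a-z ]*\z/i', $input) === 1;
}

foreach (['yomsase', 'yom sase', '   ', '', 'yom1 sase'] as $candidate) {
    printf("%-12s %s\n", "'$candidate'", is_valid_input($candidate) ? 'accepted' : 'rejected');
}
```

Requiring one letter somewhere in the string covers the "not spaces only" rule, so an IF/THEN/ELSE construct is not needed.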
GeertDD
Forum Contributor
Posts: 274
Joined: Sun Oct 22, 2006 1:47 am
Location: Belgium

Re: Conditional and Lookbehind

Post by GeertDD »

prometheuzz wrote:

Code: Select all

if(!preg_match('/^\s*$/', $input)) {
  # your code here
}
For what it is worth, this can be achieved (a bit quicker?) without regular expressions as well.

Code: Select all

if (trim($input) !== '') { /* your code */ }
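
One caveat, offered as a small sketch rather than a definitive comparison: the two checks are not identical at the edges. By default trim() strips space, tab, newline, carriage return, NUL and vertical tab, while \s in PCRE also matches a form feed and does not match NUL. The helper names below are made up for illustration.

```php
<?php
// Hypothetical helpers wrapping the two checks from the posts above.
function blank_by_regex(string $s): bool
{
    return preg_match('/^\s*$/', $s) === 1;
}

function blank_by_trim(string $s): bool
{
    return trim($s) === '';
}

// Ordinary input behaves the same; form feed and NUL are where they differ.
foreach (['', " \t\n", 'yom sase', "\f", "\0"] as $s) {
    printf(
        "%-10s regex=%s trim=%s\n",
        json_encode($s),
        var_export(blank_by_regex($s), true),
        var_export(blank_by_trim($s), true)
    );
}
```

For ordinary keyboard input (spaces, tabs, newlines) the two give the same answer, so the choice mostly comes down to taste.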