Hi.
I'm new here.
My knowledge of regex is very limited.
The case:
The user should enter a mathematical expression like this:
2.3+exp(.4*VAR)-3
(The function is f(VAR).)
I have written a very simple regex that checks for mistakes: it only allows numbers, operators (+-*/) and the "accepted" functions.
The functions that are allowed are: pow(x,y), exp(x), pi()
My regex is:
/^((exp\(|pow\(|\()?((\d*\.?\d*)|VAR|PI\(\))\)*([,\+\-\*\/]|$))+$/
If the expression isn't valid, the eval() function gives me a warning. (It works... but it isn't nice.)
The mistake cases (the regex says they are fine):
* Different number of '(' than ')': f=2*(3-4))
* Functions without the right parameters: pow(5)
* Expression ends with an operator or an unclosed function call: 3+4* or 3+exp(
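A quick way to confirm these cases is to run them through preg_match(). This is only a sketch: passesRegex() is a made-up helper name, the regex is written with `\)*` for the closing parentheses (assumed to be the intent), and the first case is given without the leading "f=".

```php
<?php
// The regex from the post; "\)*" for closing parentheses is assumed to be the intent.
$regex = '/^((exp\(|pow\(|\()?((\d*\.?\d*)|VAR|PI\(\))\)*([,\+\-\*\/]|$))+$/';

// Hypothetical helper, used only for this illustration.
function passesRegex(string $regex, string $expr): bool
{
    return preg_match($regex, $expr) === 1;
}

// One valid expression followed by the problem cases from the list.
$cases = ['2.3+exp(.4*VAR)-3', '2*(3-4))', 'pow(5)', '3+4*', '3+exp('];
foreach ($cases as $expr) {
    echo str_pad($expr, 20), passesRegex($regex, $expr) ? 'accepted' : 'rejected', "\n";
}
```

All five inputs should come back as accepted, which is exactly the problem: the pattern checks each operand/operator pair in isolation and never looks at the expression as a whole.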
I hope you can help me.
Thanks.
PS: Sorry for my bad English.
Check if it's a mathematical function
- superdezign
Re: Check if it's a mathematical function
Your issue is that you are attempting to do this purely through regex. The drawbacks include not being able to recognize matching parentheses or any sort of nesting.
I recommend you look into tokenization and lexical parsing. Tokenization is when you convert that string into a series of meaningful tokens. A token is a string with meaning, such as "(" being an opening parenthesis or "pow" being a function name. Lexical parsing is when you take those tokens and apply contextual meaning to them. So, when you see a function name (e.g. "pow") followed by an opening parenthesis ("("), you know you have a function call. It now expects parameters (a series of values/variables separated by delimiting characters such as commas) and ends that expectation with a closing parenthesis.
Be aware that what you are doing is essentially developing a programming language since you plan to get dynamic results. It will not be particularly simple, but it will be rewarding IMO.
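To make the idea concrete, here is a minimal sketch of a tokenizer plus a small recursive-descent check, not a definitive implementation. Everything in it is an assumption for illustration: the names tokenize() and ExpressionValidator, the grammar in the comments, the variable being spelled VAR, no whitespace in the input, and the arities pow=2, exp=1, pi=0. It only validates the expression; it does not evaluate it.

```php
<?php
// Split the input into [type, text] tokens; return null on an unknown character.
function tokenize(string $input): ?array
{
    $pattern = '/\G(?:(?<num>\d+\.?\d*|\.\d+)|(?<name>[A-Za-z]+)|(?<sym>[-+*\/(),]))/';
    $tokens = [];
    $pos = 0;
    $len = strlen($input);
    while ($pos < $len) {
        if (preg_match($pattern, $input, $m, 0, $pos) !== 1) {
            return null; // this character belongs to no token
        }
        if (($m['num'] ?? '') !== '') {
            $tokens[] = ['num', $m['num']];
        } elseif (($m['name'] ?? '') !== '') {
            $tokens[] = ['name', $m['name']];
        } else {
            $tokens[] = ['sym', $m['sym']];
        }
        $pos += strlen($m[0]);
    }
    return $tokens;
}

final class ExpressionValidator
{
    // Assumed number of arguments for each allowed function.
    private const ARITY = ['pow' => 2, 'exp' => 1, 'pi' => 0];
    private array $tokens = [];
    private int $pos = 0;

    public function isValid(string $input): bool
    {
        $tokens = tokenize($input);
        if ($tokens === null || $tokens === []) {
            return false;
        }
        $this->tokens = $tokens;
        $this->pos = 0;
        // Valid only if one expression parses and consumes every token.
        return $this->expr() && $this->pos === count($tokens);
    }

    // Consume the next token if it is the given symbol.
    private function accept(string $symbol): bool
    {
        $token = $this->tokens[$this->pos] ?? null;
        if ($token !== null && $token[0] === 'sym' && $token[1] === $symbol) {
            $this->pos++;
            return true;
        }
        return false;
    }

    // expr := term (('+' | '-') term)*
    private function expr(): bool
    {
        if (!$this->term()) return false;
        while ($this->accept('+') || $this->accept('-')) {
            if (!$this->term()) return false;
        }
        return true;
    }

    // term := factor (('*' | '/') factor)*
    private function term(): bool
    {
        if (!$this->factor()) return false;
        while ($this->accept('*') || $this->accept('/')) {
            if (!$this->factor()) return false;
        }
        return true;
    }

    // factor := ('+' | '-') factor | number | VAR | name '(' [expr (',' expr)*] ')' | '(' expr ')'
    private function factor(): bool
    {
        if ($this->accept('+') || $this->accept('-')) return $this->factor();
        if ($this->accept('(')) return $this->expr() && $this->accept(')');
        [$type, $text] = $this->tokens[$this->pos] ?? [null, null];
        if ($type === 'num') { $this->pos++; return true; }
        if ($type !== 'name') return false;
        $this->pos++;
        if ($text === 'VAR') return true;
        if (!array_key_exists($text, self::ARITY) || !$this->accept('(')) return false;
        $argc = 0;
        if (!$this->accept(')')) {
            do {
                if (!$this->expr()) return false;
                $argc++;
            } while ($this->accept(','));
            if (!$this->accept(')')) return false;
        }
        return $argc === self::ARITY[$text];
    }
}
```

With this sketch, `(new ExpressionValidator())->isValid('2.3+exp(.4*VAR)-3')` returns true, while `2*(3-4))`, `pow(5)`, `3+4*` and `3+exp(` are all rejected, because nesting and argument counts fall out of the recursive structure instead of being bolted onto a pattern.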
Re: Check if it's a mathematical function
Thanks for your fast answer....
I looked into tokenization and, as you say, it is the only option that can really work, but it is quite difficult to implement.
In my case, this part of the code will be executed quite rarely, and if eval() raises a warning, the input is rejected (error_get_last()).
That's why I decided to keep my original idea, with some improvements, to avoid some of the warnings.
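For anyone following the same route, here is a rough sketch of the "evaluate and reject on failure" step. It swaps error_get_last() for a try/catch, assuming PHP 7 or later, where eval() throws a ParseError on invalid code. The function name evaluateExpression() and the literal placeholder VARIABLE are assumptions for illustration. Since eval() executes arbitrary PHP, it should only ever see input that has already passed the whitelist check.

```php
<?php
// Hypothetical helper: evaluate an already-validated expression for one value
// of the variable. Returns null when PHP cannot evaluate it.
function evaluateExpression(string $expr, float $value): ?float
{
    // Assumption: the placeholder in the expression is the literal word VARIABLE.
    $code = 'return ' . str_replace('VARIABLE', '$value', $expr) . ';';
    try {
        // eval() runs in this function's scope, so $value is visible to the code.
        $result = @eval($code);
    } catch (\Throwable $e) {
        // For example a ParseError for "3+4*", or an ArgumentCountError for "pow(5)" on PHP 8.
        return null;
    }
    // Anything that is not a number (for example null after a suppressed warning) is rejected.
    return (is_int($result) || is_float($result)) ? (float) $result : null;
}
```

For example, `evaluateExpression('2+3*VARIABLE', 2.0)` gives 8.0, and `evaluateExpression('3+4*', 2.0)` gives null.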
My idea (maybe someone can reuse it...):
The new regex is:
/^((((exp\(|pow\(|\())(?!$))?((([\+\-])?\d*\.?\d*)|(VARIABLE)|(pi\(\)))\)*([\+\-](?!$)|(?<!^)[,\*\/](?!$)|$))+$/
It fixes the issue that the expression could end with a symbol other than ')' or a number.
For the number (and order) of the parentheses, a simple for loop checks it:
$x = 0; // number of '(' that are still open
for ($i = 0; $i < strlen($tmp); $i++) {
    if ($tmp[$i] == '(') $x++;
    else if ($tmp[$i] == ')') $x--;
    if ($x < 0) exit(); // a ')' appeared before its matching '('
}
if ($x != 0) exit(); // at least one '(' was never closed
With both checks, the biggest mistakes are detected. Checking the arguments of the functions is still missing, but I won't do that... at least for this project xDDD
Thanks.
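For reuse, the regex and the parenthesis loop can be wrapped into one function that returns a boolean instead of calling exit(). This is only a sketch, and the function name looksLikeValidExpression() is made up for illustration. The regex is the one from the post, unchanged.

```php
<?php
// Hypothetical wrapper around the two checks from the post.
function looksLikeValidExpression(string $tmp): bool
{
    $regex = '/^((((exp\(|pow\(|\())(?!$))?((([\+\-])?\d*\.?\d*)|(VARIABLE)|(pi\(\)))\)*([\+\-](?!$)|(?<!^)[,\*\/](?!$)|$))+$/';
    if (preg_match($regex, $tmp) !== 1) {
        return false; // unexpected character, or the expression ends with an operator
    }
    $depth = 0; // number of '(' that are still open
    for ($i = 0, $n = strlen($tmp); $i < $n; $i++) {
        if ($tmp[$i] === '(') $depth++;
        elseif ($tmp[$i] === ')') $depth--;
        if ($depth < 0) return false; // a ')' appeared before its matching '('
    }
    return $depth === 0; // every '(' must be closed
}
```

With this, `2.3+exp(.4*VARIABLE)-3` passes, while `3+4*`, `3+exp(` and `2*(3-4))` are rejected. `pow(5)` still passes, since neither check counts function arguments.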