A that's not between B and C.

Posted: Tue Sep 29, 2009 2:43 pm
by JellyFish
How do I match a string that is NOT between two other strings? In other words, how do I match A when it is not between B and C? For example, let's say A is "foo", B is "{" and C is "}". In "food {foo bar}", my regular expression should match the "foo" in the word "food" at the beginning, but not the "foo" before "bar".

How could this be done in the JavaScript implementation of RegExp?

Re: A that's not between B and C.

Posted: Tue Sep 29, 2009 2:59 pm
by prometheuzz
Do it in two steps:
1 - split on '{'
2 - for each element that split(...) returned, check if it matches: foo(?![^}]*})
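
A rough sketch of those two steps, assuming the braces are never nested. The function names testPieces and indexesOutsideBraces are made up for illustration. The second function is a one-pass variant that is not from the steps above: it also excludes "{" inside the lookahead, so the split is no longer needed.

```javascript
// Two-step version: split on "{", then test each piece with the lookahead.
function testPieces(input) {
  var pieces = input.split("{");
  // "foo" that is NOT followed by a "}" later in the same piece
  var pattern = /foo(?![^}]*\})/;
  var flags = [];
  for (var i = 0; i < pieces.length; i++) {
    flags.push(pattern.test(pieces[i]));
  }
  return flags;
}

// One-pass variant (an assumption, not part of the steps above):
// the lookahead may not cross a "{", so an opening brace ahead of
// the match no longer counts as being "inside" a block.
function indexesOutsideBraces(input) {
  var pattern = /foo(?![^{}]*\})/g;
  var found = [];
  var match;
  while ((match = pattern.exec(input)) !== null) {
    found.push(match.index);
  }
  return found;
}

console.log(testPieces("food {foo bar}"));           // [true, false]
console.log(indexesOutsideBraces("food {foo bar}")); // [0]
```

Without the split, the original pattern rejects the "foo" in "food" as well, because a "}" still follows it further along in the string.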

Re: A that's not between B and C.

Posted: Tue Sep 29, 2009 3:30 pm
by JellyFish
Hmm, but what I'm doing is a split on /;(?:\s{2,}|\s*\n+\s*|$)/g, so how would I put it all back together?

Code:

 
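var parts = "foo;  { foo;  } foo;".split(/{/g);
// Indexed loop instead of for...in, which would also visit inherited enumerable properties.
for (var i = 0; i < parts.length; i++)
{
  parts[i] = parts[i].split(/;(?:\s{2,}|\s*\n+\s*|$)/g);
}
// How would I put parts back together now that each part is an array? I couldn't use join because it converts the inner arrays to strings.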
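
One possible way around the reassembly problem is to skip the first split and fold the lookahead into the separator itself. This is only a sketch and assumes braces are never nested. splitOutsideBraces is a hypothetical name.

```javascript
// Split on the statement separator, but only where the ";" is not
// followed by a "}" before any other brace (i.e. not inside a block).
function splitOutsideBraces(text) {
  var separator = /;(?![^{}]*\})(?:\s{2,}|\s*\n+\s*|$)/;
  var pieces = text.split(separator);
  var kept = [];
  for (var i = 0; i < pieces.length; i++) {
    // A separator at the very end leaves an empty trailing piece; drop it.
    if (pieces[i] !== "") {
      kept.push(pieces[i]);
    }
  }
  return kept;
}

console.log(splitOutsideBraces("foo;  { foo;  } foo;"));
// ["foo", "{ foo;  } foo"]
```

The result is a flat array, so nothing has to be joined afterwards. A block nested inside another block would defeat the lookahead.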
 

Re: A that's not between B and C.

Posted: Tue Sep 29, 2009 3:39 pm
by prometheuzz
Err, you were only talking about matching some substring. Apparently you want something else (transforming some string into another string). Could you provide some examples of what you want? I urge you to post examples that look like your real data: people tend to over-simplify their examples and miss many of the small corner cases.

Re: A that's not between B and C.

Posted: Fri Oct 02, 2009 3:43 pm
by JellyFish
I'm writing a JavaScript interpreter for this language I'm "inventing". It's not really a serious project right now; it's more experimental than anything. I hope that doesn't make it any less valid a thing to help me with.

So what I'm doing right now is "tokenizing" my language's code. I'm creating an array of all the statements, then I'll make each statement an array of tokens, or parts of the statement. Once I've done this I'll pass the tokenized array into the parsing engine, which will then read the tokens and perform specific actions for specific tokens, etc. I'm not really sure how this parser is going to work yet, but I'll get to it.

Mainly right now I'm focusing on the tokenizer--the part that turns the code into special arrays. What I'm trying to do with the tokenizer is separate each statement of the language into its own array element. In my language the { and } define a separate scope, so I don't want the statements within these to be in separate array elements. Illustrations are better than words, so here you go.

Code:

 
<textarea id="code">
method argument, {
  method arg, arg;
};
method arg;
method arg;
</textarea>
 
I want this to turn into an array like:

Code:

 
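var code = document.getElementById("code").value;
var parts = code.split(/{/g);
// Indexed loop instead of for...in, which would also visit inherited enumerable properties.
for (var i = 0; i < parts.length; i++)
{
  parts[i] = parts[i].split(/;(?:\s{2,}|\s*\n+\s*|$)/g);
}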
 
//What I want is an array like this: ["method argument, {\n  method arg, arg;\n}", "method arg", "method arg"]
 
This is basically it. I'm trying to get the result as shown in the comment above.

Basically, I want to separate the string by something but ignore that something when it's within curly braces. I'll then have the tokenizer separate all the parts of each statement. In my language a "block" is considered a data type, so it is treated the same as any other data type in a statement. Since every block has its own scope in my language, I don't want the statements within the block hanging out in the array; rather, I want the block to be parsed later on at the statement level.
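
A minimal sketch of that idea without regular expressions, tracking brace depth character by character so that nested blocks also stay intact. It makes a few assumptions: every ";" outside braces ends a statement (simpler than the separator pattern used above), braces are balanced, and the language has no string literals or comments that could contain "{", "}" or ";". splitStatements is a hypothetical name.

```javascript
// Split code into top-level statements, ignoring ";" inside { ... }.
function splitStatements(code) {
  var raw = [];
  var depth = 0; // how many "{" are currently open
  var start = 0; // index where the current statement begins
  for (var i = 0; i < code.length; i++) {
    var ch = code.charAt(i);
    if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
    } else if (ch === ";" && depth === 0) {
      raw.push(code.slice(start, i));
      start = i + 1;
    }
  }
  raw.push(code.slice(start)); // whatever follows the last ";"

  // Trim surrounding whitespace and drop empty statements.
  var statements = [];
  for (var j = 0; j < raw.length; j++) {
    var trimmed = raw[j].replace(/^\s+|\s+$/g, "");
    if (trimmed !== "") {
      statements.push(trimmed);
    }
  }
  return statements;
}

var sample = "method argument, {\n  method arg, arg;\n};\nmethod arg;\nmethod arg;\n";
console.log(splitStatements(sample));
// ["method argument, {\n  method arg, arg;\n}", "method arg", "method arg"]
```

Each block comes back as one string inside its statement, so the same function could be run again on the block's contents at the statement level.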

Re: A that's not between B and C.

Posted: Sat Oct 03, 2009 2:38 am
by prometheuzz
I see. I don't want to put you down, but I can't recommend a regex split/match solution to tokenize a language. What you could do is write a grammar and let some sort of tool like ANTLR or Yacc (there are many more) generate a lexer+parser for you.

Best of luck.

Re: A that's not between B and C.

Posted: Mon Oct 05, 2009 4:36 pm
by JellyFish
Does ANTLR or Yacc create an interpreter in JavaScript? I want my language to be able to run in a web browser with optional compilation. If compiled, it'll compile to JavaScript, but if not compiled it could be interpreted on the fly by a JavaScript interpreter.

Re: A that's not between B and C.

Posted: Tue Oct 06, 2009 12:54 am
by prometheuzz
No, ...

Edit: I mean yes, ANTLR targets JavaScript as well. From the ANTLR website: Currently available with C, C#, ActionScript, JavaScript, and Java targets.