Regex Length: {3} *OR* {6}?

Any questions involving matching text strings to patterns - the pattern is called a "regular expression."

prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: Regex Length: {3} *OR* {6}?

Post by prometheuzz »

*non-EDIT Edit* Forums weren't responsive for about an hour and a friend suggested the following...

Code: Select all

/[0-9a-z]((.|,|:)[0-9a-z]){0,10}/
This won't match empty strings <input value="" /> being empty. I can also get :: or ::: or ::::: to match.
Not quite sure whether you want to match an empty string or not and what you mean by those multiple :'s.
Also, as I mentioned in my previous reply, you realise that you are NOT matching "a dot OR a comma OR a colon" with the regex "(.|,|:)" ? That part of your regex will match any character because of the dot.
The regex (.|,|:) can be read as: "any character OR a comma OR a colon".
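The difference between the alternation and a character class can be checked directly. This is a minimal sketch; the sample characters are arbitrary and the variable names are only for illustration.

```php
<?php
// Compare the alternation from the thread with a character class.
$alternation = '/^(.|,|:)$/'; // the unescaped dot matches any character except a newline
$charClass   = '/^[.,:]$/';   // matches exactly one dot, comma, or colon

$results = array();
foreach (array('.', ',', ':', 'x', '!') as $ch) {
    $results[$ch] = array(preg_match($alternation, $ch), preg_match($charClass, $ch));
    printf("%s  alternation: %d  class: %d\n", $ch, $results[$ch][0], $results[$ch][1]);
}
```

Both patterns accept the dot, comma, and colon, but only the alternation also accepts "x" and "!".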
It's closer than where I was earlier though.

Ok on to my hour ago reply...

Thanks for your suggestion though none of the valid CSS selectors validate with your suggested regex filter.
Oh, that is of no surprise to me. I merely pointed out the probable errors in your regex.

I have posted a fully functional regex filter tester with text inputs with existing CSS selectors in posts.

But if you're just better with a list here are some examples of valid CSS selectors (keeping in mind I am not going to need to filter {} curly brackets).

Code: Select all

body
h1
h1, h2, h3, h4
div
div.border
a:hover
a:link
a:focus
Ok, matching those strings above can be done using the following regex:

Code: Select all

'/^[a-z0-9]+([.,:](?!$))?(\s*[a-z0-9]+\1?)*?$/'
And here's a test-harness for it:

Code: Select all

#!/usr/bin/php
<?php
$tests = array("body", "h1", "h1, h2, h3, h4", "div", "div.border", "a:hover", "a:link", "a:focus", // your correct values
               "bOdy", "h!", "h1, h2. h3: h4", "div.", "?div.border", "a::hover"); // some wrong values?
$regex = '/^[a-z0-9]+([.,:](?!$))?(\s*[a-z0-9]+\1?)*?$/';
foreach($tests as $t) {
    if(preg_match($regex, $t)) {
        print "Yes, a match : $t\n";
    } else {
        print "No match     : $t\n";  
    }
}
/* output:
    Yes, a match : body
    Yes, a match : h1
    Yes, a match : h1, h2, h3, h4
    Yes, a match : div
    Yes, a match : div.border
    Yes, a match : a:hover
    Yes, a match : a:link
    Yes, a match : a:focus
    No match     : bOdy
    No match     : h!
    No match     : h1, h2. h3: h4
    No match     : div.
    No match     : ?div.border
    No match     : a::hover
*/
?>
Note that I still don't know exactly what you are trying to match and what not!
So I made a more or less educated guess as to what you don't want to match. Extending the above regex to accept empty strings should be trivial.
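One possible way to accept empty strings as well, sketched under the assumption that a blank value should simply pass: put `^$` in front of the existing pattern as an extra alternative. The helper name `accepts` is made up for this example; the core pattern is the one from the post, unchanged.

```php
<?php
// The regex from the post above, without its delimiters.
$core = '^[a-z0-9]+([.,:](?!$))?(\s*[a-z0-9]+\1?)*?$';
// Assumption: an empty value is acceptable, so "^$" is added as a first alternative.
$regex = '/^$|' . $core . '/';

// Hypothetical helper: true when the value passes the filter.
function accepts($regex, $value) {
    return preg_match($regex, $value) === 1;
}

foreach (array('', 'body', 'h1, h2, h3, h4', 'a:hover', 'div.', 'a::hover') as $t) {
    printf("%-18s %s\n", '"' . $t . '"', accepts($regex, $t) ? 'match' : 'no match');
}
```

Because the alternative is added at the top level and contains no capturing group, the `\1` backreference in the original pattern still points at the same group.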
JAB Creations
DevNet Resident
Posts: 2341
Joined: Thu Jan 13, 2005 6:44 pm
Location: Sarasota Florida

Re: Regex Length: {3} *OR* {6}?

Post by JAB Creations »

I'm very grateful that even though you haven't worked with CSS that you're trying to help me out here. I have learned so much. I want to try something that I hope may make a resolution easier to achieve so here is my following idea step-by-step using much more solid patterns that should be easier to work with I hope...

CSS can be a little tricky though I'm trying to slim things down by suggesting the usage of static information (very short arrays) which I think will simplify the logic (I really hope) as much as possible by creating simpler rules to define...at least that is what is going through my head! :mrgreen:

Oh and if you think you might find a simpler way to do this please I'm open to whatever suggestions you have and don't want to drive any one crazy(ier). :|

Earlier I learned that you can use the solid pipe as "full" options...
none|transparent
...would match exactly none or transparent.
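A small sketch of the "exactly" part: the pipe only gives an exact match when the alternatives are grouped and anchored at both ends. The test values below are invented for illustration.

```php
<?php
// Anchored alternation: the group keeps ^ and $ applied to both options.
$exact = '/^(?:none|transparent)$/';
// Without the group, ^ binds only to "none" and $ only to "transparent".
$loose = '/^none|transparent$/';

$values = array('none', 'transparent', 'nonsense', 'none; x', 'not transparent');
foreach ($values as $v) {
    printf("%-16s exact: %d  loose: %d\n", $v, preg_match($exact, $v), preg_match($loose, $v));
}
```

With the loose pattern, "none; x" and "not transparent" both slip through; the grouped version rejects them.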

So with that in mind here is my current idea...

1.) Blank Match - The value of the string matches if it's blank.

2.) Element **OR** ID Match - If the string is not blank, the first match must be an element or an ID.

To simplify the elements part I think it would just be easier if a static array was defined in the regex. Here is a short list of elements (and with constant solid pipes it should be very easy to figure out how to add "one more element" sort of support)...
a|code|div|fieldset|form|p|span|table

So where regex expects an element if it does not match the small array above then the string should not match.

***OR***
#ID selector, an ID selector is an element with an id that is selected...
In HTML...
<div id="content"></div>
Corresponding CSS...
#content

I think to "keep it simple, stupid" I'd be more than happy with the same sort of logic by having a static list of ID selectors...
#body|#content|#menua1|#menua2|#menua3|#menua4

3.) Element with Pseudo-Element **OR** Class Match - Either the element is directly styled (without any pseudo-elements) or the element has a pseudo-element. A class match, I know, is a simple a-z0-9 match (no other limitations)...and ITSELF may be followed by a pseudo-element.

Step 2's exact same examples only these are with pseudo-elements...
a:link, code:hover, div:hover, fieldset:active, form:hover, input:focus, p:hover, span:active, table:hover

#body:hover, #content:active, #menua1:focus, #menua2:focus, #menua3:hover, #menua4:hover

So we're either looking for a colon : for a pseudo-element **or** a period for a class selector.
HTML Class Selector: <div class="content"></div>
Corresponding CSS Class Selector: div.content

Clarification: W3Schools makes some sort of distinction between pseudo-classes and pseudo-elements. I'm just going to combine them under an umbrella term because frankly it's not really that important. So I'll just have a list specified, and so long as there are two or more in your example I'll be able to change it as I need without having to make guesses.

4.1) If : follows the element *OR* element-class *OR* ID selector then the pseudo-elements should repeat the same sort of static logic pattern of how the elements are defined maybe?
active|focus|hover|link|visited

Step 5: Comma separators If there are multiple CSS selectors then in CSS they are separated by commas...
a, div, #content, form:hover, #content a:focus

*Takes a really deep breath*

...so let me share some of the strings of selectors that should match and I'll list some static rules that they would match in the regex? I think that might help re-establish clearer patterns to maybe make this easier? Oh and don't worry about creating full/complete lists of things, two things separated by a solid pipe is all I need to comprehend what I'm looking at.

Pattern:
#content a:link, #side a:link, #promptsajax div.scroll a:link, #promptsajax div.noscroll a:link

Matches
#content|#promptsajax|#side
a|div
link


Pattern:
#content a:link:hover, #side a:link:hover, #promptsajax div.scroll a:link:hover, #promptsajax div.noscroll a:hover, #side div.inset div.scroll a:hover, #content a:link:focus, #side a:link:focus, #promptsajax div.scroll a:link:focus, #promptsajax div.noscroll a:focus, #side div.inset div.scroll a:focus

Matches
#content|#promptsajax|#side
a|div
hover|focus

Pattern:
form fieldset input.button:hover, form fieldset input.text:hover, form fieldset input.url:hover, select:hover, form fieldset textarea:hover, form fieldset input.button:focus, form fieldset input.text:focus, form fieldset input.url:focus, select:focus, form fieldset textarea

Matches
fieldset|form|input|select|textarea
hover|focus
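The steps above could be assembled from the static lists roughly like this. It is only a sketch under several assumptions: the lists are the short ones from this post, a class is allowed only after an element, chains are separated by a single space, and list items by a comma with an optional space. The function names `alternation` and `build_selector_regex` are invented for the example.

```php
<?php
// Hypothetical whitelists, taken from the short lists in this post.
$elements = array('a', 'div', 'fieldset', 'form', 'input', 'select', 'textarea');
$ids      = array('#content', '#promptsajax', '#side');
$pseudos  = array('focus', 'hover', 'link', 'visited');

// Turn a list of literal words into a non-capturing alternation.
function alternation(array $words) {
    $quoted = array();
    foreach ($words as $w) {
        $quoted[] = preg_quote($w, '/');
    }
    return '(?:' . implode('|', $quoted) . ')';
}

function build_selector_regex(array $elements, array $ids, array $pseudos) {
    $class  = '\.[a-z0-9]+';                         // .scroll, .button, ...
    $pseudo = '(?::' . alternation($pseudos) . ')*'; // :link, :link:hover, ...
    // One simple selector: an element with an optional class, OR an ID,
    // followed by zero or more pseudo parts.
    $simple = '(?:' . alternation($elements) . '(?:' . $class . ')?|'
            . alternation($ids) . ')' . $pseudo;
    // Parent/child chain separated by single spaces.
    $chain = $simple . '(?: ' . $simple . ')*';
    // Comma-separated list of chains; the outer "?" lets a blank value pass (step 1).
    return '/^(?:' . $chain . '(?:, ?' . $chain . ')*)?\z/';
}

$regex = build_selector_regex($elements, $ids, $pseudos);

$samples = array(
    '#content a:link, #side a:link, #promptsajax div.scroll a:link',
    'form fieldset input.button:hover, select:focus, form fieldset textarea',
    '',
    'a::hover',
    'blink',
);
foreach ($samples as $s) {
    printf("%-72s %s\n", '"' . $s . '"', preg_match($regex, $s) ? 'match' : 'no match');
}
```

Adding "one more element" then means adding one more entry to the relevant array; the pattern itself does not change.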


Ok this is perhaps the most complicated thread/explanation I've given in uh, years. I really appreciate your help! :bow:
JAB Creations
DevNet Resident
Posts: 2341
Joined: Thu Jan 13, 2005 6:44 pm
Location: Sarasota Florida

Re: Regex Length: {3} *OR* {6}?

Post by JAB Creations »

I had to post in a slight rush and only have a little time to start trying to create pieces of the regex string because with the page I am able to comment out the best working regex filter and test out new attempts. Any way this will also be tomorrow's priority 1 goal for me so here are some strings I'm going to try


Right now I'm testing just the static selectors which seem to be working fine (who would guess I'd make a desirable amount of progress before I crash?)

Any way here is an example I just conjured up that matches my current selectors (without any vigorous testing)...

Code: Select all

$regex_0_selectors = '/^(body|div|h1|h2|h3|h4|span|#body|#bottom|#content|#prompts|#promptsajax|#side|#top|#welcome|#menua[1-8])/';
A good and I would imagine simple question is how can I create a range specific to the headers and menus?
#menua1, #menua2, #menua3, #menua4, #menua5, #menua6, #menua7, #menua8, #menua8
h1, h2, h3, h4

I was thinking something like...
#menua[1-9]
h[1-4]

...however when I changed the menu to #menua[1-4] menua5 would still match (which for that specific test shouldn't).
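One possible explanation, offered as a guess since the exact test input is not shown: the pattern is anchored only at the start, so any string that begins with an allowed selector passes no matter what follows. The sketch below compares an unanchored and a fully anchored version of the range idea; the test strings are made up.

```php
<?php
$unanchored = '/^(h[1-4]|#menua[1-4])/';  // checks only how the string starts
$anchored   = '/^(h[1-4]|#menua[1-4])$/'; // the whole string must be one selector

$tests = array('h4', 'h5', '#menua4', '#menua5', '#menua1, #menua5', 'h1junk');
foreach ($tests as $t) {
    printf("%-18s unanchored: %d  anchored: %d\n",
        $t, preg_match($unanchored, $t), preg_match($anchored, $t));
}
```

On its own, "#menua5" is rejected by both patterns, but "#menua1, #menua5" passes the unanchored one because only the leading "#menua1" is examined.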

Any way I'm posting this before I start to make (less) sense as I'm going to crash soon. Thanks for reading my ramblings! :mrgreen:
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: Regex Length: {3} *OR* {6}?

Post by prometheuzz »

Thanks for the extra info, although it kind of went over my head. I've read a simple tutorial about CSS selectors, but with my newly acquired knowledge, I am still unsure what it is you want exactly.
I would like to see what your input is. Is it a CSS file, or something else? And what exactly do you want to extract from that input file?
If it is a separate CSS file, couldn't you just match everything after a '}' (or the beginning of the string/file) and before a '{'? Just a quick example: if this is your input file:

Code: Select all

body {
  padding-left: 11em;
  font-family: Georgia, "Times New Roman", Times, serif;
  color: purple;
  background-color: #d8da3d 
}
ul.navbar {
  list-style-type: none;
  padding: 0;
  margin: 0;
}
h1, h2, h3 {
  font-family: Helvetica, Geneva, Arial, SunSans-Regular, sans-serif 
}
what do you want to extract from it? Is it the following: body, ul.navbar and h1, h2, h3?
Or is the CSS stuff tangled in an HTML file?
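The "everything after a '}' (or the beginning) and before a '{'" idea can be sketched as follows. It assumes a plain stylesheet without comments, nested blocks such as @media, or braces inside strings; the function name `extract_selectors` and the shortened sample CSS are only for illustration.

```php
<?php
// Capture whatever sits between the start of the input (or a "}") and the next "{".
function extract_selectors($css) {
    preg_match_all('/(?:^|\})\s*([^{}]+?)\s*\{/', $css, $matches);
    return $matches[1]; // contents of the first capturing group, one per rule
}

$css = "body {\n  color: purple;\n}\n"
     . "ul.navbar {\n  margin: 0;\n}\n"
     . "h1, h2, h3 {\n  font-family: Helvetica, sans-serif\n}\n";

print_r(extract_selectors($css));
```

For the sample above this yields three entries: body, ul.navbar, and h1, h2, h3.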
JAB Creations
DevNet Resident
Posts: 2341
Joined: Thu Jan 13, 2005 6:44 pm
Location: Sarasota Florida

Re: Regex Length: {3} *OR* {6}?

Post by JAB Creations »

Yes, those are the selectors. A good visual that I think would help is a comparison of the terminology with the actual usage...

Code: Select all

selector { property: value;}
selector, selector, selector { property: value;}
selector:pseudo-element { property: value;}
selector, selector:pseudo-element, selector { property: value;}
#id_selector { property: value;}
element.selector { property: value;}
parent_selector child_selector { property: value;}
Obviously a comma separates one selector from another so if you change the background-color of two or more selectors to the same value you can group them accordingly.

The last example is a slightly more advanced version of the selectors I use. For example if you have a two column page (content with a small sidebar on the right) I would use #content p to change the styling of the paragraphs in the content without affecting the paragraphs in the sidebar and #sidebar p as vice versa. You'll notice in my selectors (the way I keep things organized in my own personal coding ethics) I use "blood" parent/child selectors such as form fieldset label often.

Currently I'm not absolutely set on the following sets of selectors. Though to help you visualize the process and understand exactly what I'm doing I'll try to connect all the dots, so to speak...

1.) The visitor interacts with a GUI editor that allows 25 selectors to be manipulated, each with four properties. That makes (without other form information) about 100 post values (give or take if the number of selectors is off)...

2.) The properties are background-color, background-image, border-color, and color. Visually...

Code: Select all

selector {
  background-color: value;
  background-image: value;
  border-color: value;
  color: value;
}
3.) Visitor hits the save button and the server receives the POST data.

4.) Server calls a function four times, each time with two parameters: the regex filter to be used and a number. The number corresponds to the last digit of each POST key (not its value) in the $_POST array.
0 = selector
1 = background-color
2 = background-image
3 = border-color
4 = color

...this way, like in the other threads I've made, I can apply various regex filters to each type by simply selecting it as a group by its corresponding end number.
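A rough sketch of such a function, with several assumptions: the function name `validate_group`, the POST key names, and both filters are invented here. The only thing taken from the description is that the last character of each key selects the group.

```php
<?php
// Returns true only if every value whose key ends in $digit matches $regex.
function validate_group(array $post, $regex, $digit) {
    foreach ($post as $key => $value) {
        if (substr((string) $key, -1) !== (string) $digit) {
            continue; // this key belongs to another group
        }
        if (!is_string($value) || preg_match($regex, $value) !== 1) {
            return false;
        }
    }
    return true;
}

// Hypothetical POST data: the key names are invented for illustration.
$post = array(
    'row1_0' => 'body, html',
    'row1_1' => '#d8da3d',
    'row2_0' => 'h1',
    'row2_1' => 'javascript:alert(1)',
);

$selectorRegex = '/^[a-z0-9#.,: ]*\z/';               // placeholder selector filter
$colorRegex    = '/^#[0-9a-f]{3}(?:[0-9a-f]{3})?\z/'; // "#" plus 3 or 6 hex digits

var_dump(validate_group($post, $selectorRegex, 0)); // both selector values pass
var_dump(validate_group($post, $colorRegex, 1));    // the second color value fails
```

In real use the first argument would be `$_POST`; a plain array is used here so the sketch runs on its own.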

Which brings us to the regex filter and the post array.

Below is the post array and the reason I wish to do it this way is so that when (not if) I change/tweak the selectors I won't have to worry about someone's theme getting screwed up (there will also probably be a "version" post value if I need to force an update on the visitor's cookie)...and yes I've got a security thread going on about all the cookie stuff.

Any way here is a list of selectors. Each selector is simply on its own line. Keep in mind that if all the post data validates then I implode everything and later on add the {} curly brackets (so don't worry that they are missing if that thought pops into your head)...
#welcome, #content div.border, #content div.bordernorc, #prompts div.scroll, #prompts #promptstabs div.current
#body, #prompts

body, html

#content div.border div.border div.border, #head, #prompts #promptsbuttons, #prompts #promptstabs, #prompts #promptstabs div, #side div.inset, #side h2, #side form, div.footer, div.header, table tfoot, table thead, table tbody tr

#bottom, #top

h1

#prompts h2, h2, h3, h4, h5, h6

#content a:link, #side a:link, #promptsajax div.scroll a:link, #promptsajax div.noscroll a:link

#content a:link:hover, #side a:link:hover, #promptsajax div.scroll a:link:hover, #promptsajax div.noscroll a:hover, #side div.inset div.scroll a:hover, #content a:link:focus, #side a:link:focus, #promptsajax div.scroll a:link:focus, #promptsajax div.noscroll a:focus, #side div.inset div.scroll a:focus

#content a:link:visited, #side a:link:visited, #promptsajax div.scroll a:link:visited, #promptsajax div.noscroll a:visited, #side div.inset div.scroll a:visited

#menua1, #menua2, #menua3, #menua4, #menua5, #menua6, #menua7, #menua8, #menua9, a.menuaa, a.tools, #toola1, #toola2, div.list div, #sidetoggle a, #sidetoggle div, #prompts #promptsbuttons div a, #prompts #promptstabs div a

#menua1 a:hover, #menua2 a:hover, #menua3 a:hover, #menua4 a:hover, #menua5 a:hover, #menua6 a:hover, #menua7 a:hover, #menua8 a:hover, #menua9 a:hover, #menua1 a:focus, #menua2 a:focus, #menua3 a:focus, #menua4 a:focus, #menua5 a:focus, #menua6 a:focus, #menua7 a:focus, #menua8 a:focus, #menua9 a:focus, a.menuaa:hover, a.tools:hover, #toola1:hover, #toola2:hover, div.list div:hover, #sidetoggle a:hover, #sidetoggle div:hover, a.menuaa:focus, a.tools:focus, #toola1:focus, #toola2:focus, div.list div:focus, #sidetoggle a:focus, #sidetoggle div:focus, #prompts #promptsbuttons a:focus, #prompts #promptsbuttons a:hover

#menua1:visited, #menua2:visited, #menua3:visited, #menua4:visited, #menua5:visited, #menua6:visited, #menua7:visited, #menua8:visited, #menua9:visited, a.menuaa:visited, a.tools:visited, #toola1:visited, #toola2:visited, div.list div:visited, #sidetoggle a:visited, #sidetoggle div:visited, #prompts #promptsbuttons div a:visited, #prompts #promptstabs div a:visited

#bottom div a

#bottom div a:hover, #bottom div a:focus

form

form fieldset input.button, form fieldset input.text, form fieldset input.url, select, form fieldset textarea

form fieldset input.button:hover, form fieldset input.text:hover, form fieldset input.url:hover, select:hover, form fieldset textarea:hover, form fieldset input.button:focus, form fieldset input.text:focus, form fieldset input.url:focus, select:focus, form fieldset textarea

form fieldset label

ol, ul

code

#head #location a:link, #head #location a:visited, .color1

#content span.subtitle, .color2

p, p.noindent, ol li span, ul li span, .color3

.color4

.color5

I might have to add span elements to the last .color selectors. If there is no element before the period it means you can apply the class to any element...
<div class="color4"> or <span class="color4">
...whereas if you specify the element before the period such as span.color4 then unless I have .color4 or div.color4 a div with the class of color4 won't be selected, it will just so happen to have a class that isn't selected.

So uh, are my CSS selectors scary or what? :mrgreen: Though seriously I see myself tweaking them in the future.

Once I get all this sorted out it will be much easier to re-implement theme support to my site, here is a preview...
http://www.jabcreations.com/blog/?promp ... s-official

Any way I hope this helps because it doesn't get much more complicated than this (as far as the structure is concerned at least).
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: Regex Length: {3} *OR* {6}?

Post by prometheuzz »

Thank you for the extra information... But I can't see where you mention what data you get in and what you want to extract from that input data.
Regex is (as you know) all about strings. So, can you please explain it in terms of input- and output strings?

Here is a simplified example of how I'd like you to formulate your problem:

<EXAMPLE>

I get the following string as input:

Code: Select all

line1 23 some text1
line2 some more 44 text2
line3 and even more 123 text3
The number of lines is variable. Now, I want to remove all the numbers from those lines except when preceded by the string "line", resulting in the desired output:

Code: Select all

line1 some text
line2 some more text
line3 and even more text
</EXAMPLE>
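For this simplified example, one regex that produces the desired output might look like the sketch below. It assumes that the single space in front of a removed number should go as well, which is what the desired output suggests.

```php
<?php
$input = "line1 23 some text1\n"
       . "line2 some more 44 text2\n"
       . "line3 and even more 123 text3\n";

// Remove each run of digits (plus one preceding space, if any) unless the run
// directly follows "line". (?<!\d) stops a match from starting mid-number.
$output = preg_replace('/ ?(?<!line)(?<!\d)\d+/', '', $input);

echo $output;
```

Both lookbehinds are checked at the position of the first digit, after the optional space has been consumed.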

As you have noticed, I know next to nothing about CSS and other web related stuff. That's why I only post in the regex forum. So, if you can explain your problem in strings/regexes, I might (or probably) be able to help you. If not, no hard feelings, and I wish you the best of luck with your website!
JAB Creations
DevNet Resident
Posts: 2341
Joined: Thu Jan 13, 2005 6:44 pm
Location: Sarasota Florida

Re: Regex Length: {3} *OR* {6}?

Post by JAB Creations »

Ok I'm beginning to understand your point of view better. I'll try wrapping the selectors as strings per line in the BB [ c o d e ] ...err code...

As I do 99.9% of the time I preview my posts before I actually post them, so I know that each selector as a string has its own number below.

Code: Select all

#welcome, #content div.border, #content div.bordernorc, #prompts div.scroll, #prompts #promptstabs div.current
#body, #prompts
body, html
#content div.border div.border div.border, #head, #prompts #promptsbuttons, #prompts #promptstabs, #prompts #promptstabs div, #side div.inset, #side h2, #side form, div.footer, div.header, table tfoot, table thead, table tbody tr
#bottom, #top
h1
#prompts h2, h2, h3, h4, h5, h6
#content a:link, #side a:link, #promptsajax div.scroll a:link, #promptsajax div.noscroll a:link
#content a:link:hover, #side a:link:hover, #promptsajax div.scroll a:link:hover, #promptsajax div.noscroll a:hover, #side div.inset div.scroll a:hover, #content a:link:focus, #side a:link:focus, #promptsajax div.scroll a:link:focus, #promptsajax div.noscroll a:focus, #side div.inset div.scroll a:focus
#content a:link:visited, #side a:link:visited, #promptsajax div.scroll a:link:visited, #promptsajax div.noscroll a:visited, #side div.inset div.scroll a:visited
#menua1, #menua2, #menua3, #menua4, #menua5, #menua6, #menua7, #menua8, #menua9, a.menuaa, a.tools, #toola1, #toola2, div.list div, #sidetoggle a, #sidetoggle div, #prompts #promptsbuttons div a, #prompts #promptstabs div a
#menua1 a:hover, #menua2 a:hover, #menua3 a:hover, #menua4 a:hover, #menua5 a:hover, #menua6 a:hover, #menua7 a:hover, #menua8 a:hover, #menua9 a:hover, #menua1 a:focus, #menua2 a:focus, #menua3 a:focus, #menua4 a:focus, #menua5 a:focus, #menua6 a:focus, #menua7 a:focus, #menua8 a:focus, #menua9 a:focus, a.menuaa:hover, a.tools:hover, #toola1:hover, #toola2:hover, div.list div:hover, #sidetoggle a:hover, #sidetoggle div:hover, a.menuaa:focus, a.tools:focus, #toola1:focus, #toola2:focus, div.list div:focus, #sidetoggle a:focus, #sidetoggle div:focus, #prompts #promptsbuttons a:focus, #prompts #promptsbuttons a:hover
#menua1:visited, #menua2:visited, #menua3:visited, #menua4:visited, #menua5:visited, #menua6:visited, #menua7:visited, #menua8:visited, #menua9:visited, a.menuaa:visited, a.tools:visited, #toola1:visited, #toola2:visited, div.list div:visited, #sidetoggle a:visited, #sidetoggle div:visited, #prompts #promptsbuttons div a:visited, #prompts #promptstabs div a:visited
#bottom div a
#bottom div a:hover, #bottom div a:focus
form
form fieldset input.button, form fieldset input.text, form fieldset input.url, select, form fieldset textarea
form fieldset input.button:hover, form fieldset input.text:hover, form fieldset input.url:hover, select:hover, form fieldset textarea:hover, form fieldset input.button:focus, form fieldset input.text:focus, form fieldset input.url:focus, select:focus, form fieldset textarea
form fieldset label
ol, ul
code
#head #location a:link, #head #location a:visited, .color1
#content span.subtitle, .color2
p, p.noindent, ol li span, ul li span, .color3
.color4
.color5
I think perhaps another thought that may help is what types of security attacks would we want to use the regex to filter out which would not block these strings? Would that be a more effective use of regex in this instance that may also reduce server load maybe?
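Two ways this question could be approached, sketched under the assumption that the server already knows the exact selector strings it accepts. The first function is a plain list comparison rather than a regex; the second is a regex that only restricts the character set. Function names and the shortened list are invented for the example.

```php
<?php
// Assumption: the accepted selector strings are fixed and known on the server.
$allowed = array(
    '#body, #prompts',
    'body, html',
    '#bottom, #top',
    'h1',
);

// Strict comparison against the known list: no regex involved.
function is_allowed_selector($value, array $allowed) {
    return in_array($value, $allowed, true);
}

// Looser alternative: only restrict which characters may appear at all.
// Braces, angle brackets, quotes, semicolons and parentheses are all rejected.
function has_safe_characters($value) {
    return preg_match('/^[a-z0-9#.,: ]*\z/', $value) === 1;
}

var_dump(is_allowed_selector('body, html', $allowed));
var_dump(is_allowed_selector('body, html { }', $allowed));
var_dump(has_safe_characters('#content a:link:hover'));
var_dump(has_safe_characters('body{}</style><script>'));
```

The list comparison is the stricter of the two; the character whitelist would still accept selectors that are not in the list, as long as they use only the permitted characters.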
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: Regex Length: {3} *OR* {6}?

Post by prometheuzz »

John, FYI: I will leave this thread for someone else now since either I am not able to make clear to you what it is I exactly need in order to help you, or you are not able to communicate clearly in a string-and-regex-only manner what it is you're after (or both...).
In your last reply you still only provided one example of which I believe it is your input as a string, while I specifically asked for example in- and output. You get some string as input and then you either match or replace parts (or nothing) from that string by using some regex. I still can't see what you're trying to do.
Perhaps someone else (with more knowledge about CSS etc) can help you.
Good luck anyway.

Regards,

Bart.