Parsing Source Code

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Post by Ollie Saunders »

I'm planning on building a CSS abstraction language and I need to be able to parse code. I want a decent way of parsing and processing source code that is readable and flexible. A look on Wikipedia stated that "most parsers are generated" and this seems like a good technique. I've had a look at some rather complicated but very clever-looking code from SimpleTest's browser and also some very basic wiki syntax parsing stuff that mainly consisted of one giant switch. I had a look at http://pear.php.net/package/PHP_ParserGenerator and http://freshmeat.net/articles/view/1270/

Can anyone offer any suggestions on how I should go about doing this? What should I read up on? Is Lemon-based generation any good? Also, what's a context-free grammar?
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

Have you had a look at HTML Purifier?
(#10850)
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Post by Ollie Saunders »

Yes
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

You don't want to use HTML Purifier's parsing code. It's ad hoc and was the very first thing I wrote for the library.

One of the biggest concerns is that PHP isn't exactly ideal material for parsing. However, LALR seems like a good and rigorous way to go about building the language if you have no prior constraints.
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

I will be heretical and remind you that parsers are the one place where gotos can improve the code (I don't recall if goto ever made it into the language).
(#10850)
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Post by Kieran Huggins »

I'd recommend checking out SASS by some cat named Hampton. It's HAML's sister.

Speaking of which, you're now in a book. I'll send photos.
Mordred
DevNet Resident
Posts: 1579
Joined: Sun Sep 03, 2006 5:19 am
Location: Sofia, Bulgaria

Post by Mordred »

CSS abstraction language
Why?
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Post by Ollie Saunders »

I'd recommend checking out SASS
Yes. I wanted to do things differently and provide a PHP implementation. Anyway, I read through the notes on SASS again and decided Hampton has probably made all the right decisions, so I would probably copy his almost exactly.

The only significant new feature I would add would be @browser. Here were my ideas:

Code:

@browser * {
    // all browsers
}

@browser .gecko {
    // any browser of gecko class
}

@browser #firefox {
    // specific brand of browser in this case firefox
}

@browser .khtml, #firefox {
    // any khtml class browser OR firefox
}

@browser [javascript] {
    // any browser capable of interpreting javascript
}

@browser .gecko {
    @browser [javascript] {
        // any gecko class browser with javascript interpreting ability
    }
}


@browser #firefox[version = 2] {
    // firefox at any version from 2.0 to 3.0 not including 3.0
}

@browser #firefox[version = 2.0] {
    // firefox at any version from 2.0 to 2.1 not including 2.1
}

@browser #firefox[version >= 2.0] {
    // firefox at any version after 2.0 including 2.0
}

@browser #firefox[version < 2.0] {
    // firefox at any version before 2.0 not including 2.0
}

@browser :not(.gecko) {
    // any non-gecko browser
}

@browser [os*=mac] {
    // any browser running on an os that contains the substring mac
}

// string property matches (all case insensitive)

[string=string]  // exact
[string*=substring] // contains
[string$=substring] // ends with
[string^=substring] // begins with

// number or version property matches

[num=1]   // exact
[prop>1]  // greater than
[prop>=1] // greater than or equal to
[prop<=1] // less than or equal to
[prop<1]  // less than

// truth matches

[prop]       // is true
:not([prop]) // is false

@browser #safari[os*=win][version >= 1.29][version <= 2.0] {
    // safari running on a windows os, at version 1.29 or later and 2.0 or earlier
}

// # matches against a short-name, to match against the full User-Agent header use id property:

@browser [id*=mozilla] {

}
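
As a rough illustration of the matching rules listed above, here is a minimal Python sketch of how a single bracketed condition might be evaluated. Everything in it is an assumption made for illustration: the `matches` function, the dictionary describing a browser, and the choice to pad versions with zeros before comparing them are not part of SASS or of any existing library. It covers only the property conditions, not the `.class`, `#name`, or `:not()` parts.

```python
import operator
import re

# Hypothetical condition syntax: property, optional operator, optional value.
CONDITION = re.compile(r"^\s*(\w+)\s*(\*=|\$=|\^=|>=|<=|=|>|<)?\s*(.*?)\s*$")
VERSION = re.compile(r"\d+(\.\d+)*")
ORDERING = {">": operator.gt, ">=": operator.ge, "<=": operator.le, "<": operator.lt}


def parse_version(text):
    """Turn '2.0.3' into (2, 0, 3)."""
    return tuple(int(part) for part in text.split("."))


def pad(version, length):
    """Right-pad a version tuple with zeros so tuples compare component-wise."""
    return version + (0,) * (length - len(version))


def matches(browser, condition):
    """Evaluate one bracketed condition such as 'version >= 2.0' or 'os*=mac'."""
    found = CONDITION.match(condition)
    if found is None:
        raise ValueError(f"cannot parse condition: {condition!r}")
    prop, op, wanted = found.groups()
    actual = browser.get(prop)
    if op is None:  # truth match: [prop]
        return bool(actual)
    if actual is None:
        return False
    actual = str(actual)
    if VERSION.fullmatch(actual) and VERSION.fullmatch(wanted):
        have, want = parse_version(actual), parse_version(wanted)
        if op == "=":  # [version = 2] matches 2.x; [version = 2.0] matches 2.0.x
            return pad(have, len(want))[:len(want)] == want
        if op in ORDERING:
            size = max(len(have), len(want))
            return ORDERING[op](pad(have, size), pad(want, size))
    have, want = actual.lower(), wanted.lower()  # string matches ignore case
    if op == "=":
        return have == want
    if op == "*=":
        return want in have
    if op == "^=":
        return have.startswith(want)
    if op == "$=":
        return have.endswith(want)
    return False  # ordering operators are left undefined for plain strings


# Hypothetical browser description, for illustration only.
firefox = {"name": "firefox", "version": "2.0.3", "os": "Mac OS X", "javascript": True}
conditions = ["version = 2", "version >= 2.0", "os*=mac", "javascript"]
print(all(matches(firefox, c) for c in conditions))  # prints True
```

Several conditions on one selector, as in the safari example, would simply be ANDed together, which is what the `all(...)` call does here.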
I also really wanted to make it possible to style in terms of CSS classes. Here the nested selectors are descendants of the ones they are nested within, BUT crucially the % character is used in the selectors to provide an exception to this rule and position the enclosing selector at any point:

Code:

.item { /* .item */
    dl% { /* dl.item */
        #comments % { /* #comments dl.item */
            dd { /* #comments dl.item dd */
                p:last-child { /* #comments dl.item dd p:last-child */
                    display:inline;
                }
            }
            dt { /* #comments dl.item dt */
                %:after { /* #comments dl.item dt:after */
                    content:':';
                }
            }
        }
    }
}
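
To make the % rule concrete, here is a minimal Python sketch of how the nested rules could be flattened into ordinary selectors. It assumes the source has already been parsed into nested dictionaries; the `flatten` function and that tree representation are hypothetical, chosen only for illustration, and say nothing about how SASS does it internally.

```python
def flatten(rules, parent=""):
    """Yield (full_selector, declarations) pairs for a nested rule tree."""
    for selector, body in rules.items():
        if "%" in selector:
            # % marks where the enclosing selector is placed.
            full = selector.replace("%", parent)
        else:
            # Default rule: the nested selector is a descendant of its parent.
            full = f"{parent} {selector}".strip()
        declarations = {k: v for k, v in body.items() if not isinstance(v, dict)}
        nested = {k: v for k, v in body.items() if isinstance(v, dict)}
        if declarations:
            yield full, declarations
        yield from flatten(nested, full)


# The example above, written as a nested dictionary (assumed representation).
tree = {
    ".item": {
        "dl%": {
            "#comments %": {
                "dd": {"p:last-child": {"display": "inline"}},
                "dt": {"%:after": {"content": "':'"}},
            }
        }
    }
}

for selector, declarations in flatten(tree):
    body = " ".join(f"{prop}: {value};" for prop, value in declarations.items())
    print(f"{selector} {{ {body} }}")
# prints:
# #comments dl.item dd p:last-child { display: inline; }
# #comments dl.item dt:after { content: ':'; }
```

Rules with no declarations of their own produce no output, so only the two innermost selectors appear.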
Buuuttt, then I realised this feature was also in SASS. The % part is particularly understated there; he uses a different symbol.

Anyway, since I've decided I'm not going to start anything new in PHP and Hampton has already done most of it, the project is at an end before it has even started.
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Post by Kieran Huggins »

How can you not trust this man?

Image