RSS v Site Parsing
Posted: Fri May 13, 2005 4:54 am
I'm currently building a parser for websites to extract the data I want...
However, I'm pretty much doing exactly what an RSS reader does, except with file_get_contents and then the preg functions to extract the exact segments I want directly from the HTML code, not in the RSS way.
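To make the approach concrete, here is a minimal sketch of the kind of extraction I mean. The function name extract_segments, the sample HTML and the pattern are made up for illustration; in real use the HTML would come from file_get_contents on the page URL.

```php
<?php
// Hypothetical helper: return the first capture group of every match
// of a caller-supplied regex pattern in an HTML string.
function extract_segments(string $html, string $pattern): array
{
    // preg_match_all returns false if the pattern itself is invalid.
    if (preg_match_all($pattern, $html, $matches) === false) {
        return [];
    }
    // $matches[1] holds capture group 1 for each match (absent if no group).
    return $matches[1] ?? [];
}

// Inline sample instead of a live page, so the sketch runs anywhere.
// Real use would be something like: $html = file_get_contents($url);
$html = '<div class="headline">First story</div>'
      . '<div class="headline">Second story</div>';

// The "s" modifier lets "." match newlines; ".*?" keeps the match non-greedy.
$headlines = extract_segments($html, '#<div class="headline">(.*?)</div>#s');
print_r($headlines);
```

The pattern is tied to the site's exact markup, so it has to be updated whenever the page layout changes.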
From my first thoughts, RSS has the advantage of having pre-defined "headers" such as <item>item here</item><description>desc here</description>, which makes it much more "standardised" to parse.
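For comparison, a rough sketch of the RSS side, assuming an RSS 2.0 feed where each <item> sits inside <channel> and carries its own <title> and <description>. The function name parse_rss_items and the sample feed are made up; it uses PHP's SimpleXML extension rather than regex.

```php
<?php
// Hypothetical helper: turn an RSS 2.0 string into an array of items.
function parse_rss_items(string $xml): array
{
    // Collect XML errors internally instead of emitting warnings.
    libxml_use_internal_errors(true);
    $feed = simplexml_load_string($xml);
    if ($feed === false || !isset($feed->channel->item)) {
        return [];
    }

    $items = [];
    foreach ($feed->channel->item as $item) {
        // Cast to string to get the text content of each element.
        $items[] = [
            'title'       => (string) $item->title,
            'description' => (string) $item->description,
        ];
    }
    return $items;
}

// Inline sample feed; a real one would be fetched with file_get_contents($feedUrl).
$xml = '<rss version="2.0"><channel><title>Example feed</title>'
     . '<item><title>First story</title><description>Short summary</description></item>'
     . '<item><title>Second story</title><description>Another summary</description></item>'
     . '</channel></rss>';

$items = parse_rss_items($xml);
print_r($items);
```

Because the element names are fixed by the format, the same function should work for any feed that follows it, with no per-site pattern.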
Is the way I'm doing it an acceptable method? I prefer my way, as a lot of sites still don't have RSS, and my class can parse anything based on passed arguments of what to parse...
Your thoughts, please.