[RESOLVED] Stumped on some filtering logic
Posted: Fri Aug 07, 2009 8:56 pm
Hello, everyone.
I'm having a hell of a time with some logic in a script I'm writing (URL scraper).
The pertinent section is in the code block below.
I can't figure out how to apply the filter, since the inner loop runs each individual filter over the URL. I need a way to PASS or FAIL a URL, break out of the filter loop, and move on to the next URL.
$data = array of parsed HTML content
$ignore = array of filter items (that I don't want), e.g. '.ico', '.css'
The previous version used a long chain of "if(strpos($url, '.css') === FALSE AND strpos($url, ..." checks.
As you can imagine, ugly.
In all my years of development, I've never delved into logic of this form.
Code:
public function get_links($data, $ignore)
{
    preg_match_all('/(href)\=(\"|\')[^\"\'\>]+/i', $data, $media);
    unset($data);
    $data = preg_replace('/(href)(\"|\'|\=\"|\=\')(.*)/i', "$3", $media[0]);
    foreach($data as $url)
    {
        foreach($ignore as $filter)
        {
            // And this is where I'm stumped
        }
    }
}
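For what it's worth, one possible way to fill in the inner loop is PHP's "continue 2", which jumps straight to the next iteration of the outer foreach. The sketch below is only an illustration under a few assumptions: filter_links() is a hypothetical standalone helper (not part of the class above), the filters are non-empty substrings, and matching is case-insensitive via stripos(). Note that it is a plain substring match, so '.css' would also reject a URL like 'foo.cssx'.

```php
<?php
// Hypothetical helper, not from the original script: returns the URLs
// that contain none of the substrings listed in $ignore.
function filter_links(array $urls, array $ignore)
{
    $kept = array();
    foreach ($urls as $url) {
        foreach ($ignore as $filter) {
            // Compare with !== false: a match at position 0 returns 0,
            // which a loose comparison would mistake for "not found".
            if (stripos($url, $filter) !== false) {
                // FAIL: leave the filter loop and move on to the next URL.
                continue 2;
            }
        }
        // PASS: no filter matched this URL.
        $kept[] = $url;
    }
    return $kept;
}

// Example usage with made-up URLs.
$links = filter_links(
    array('index.html', 'style.css', 'favicon.ico', 'about.php'),
    array('.ico', '.css')
);
// $links is now array('index.html', 'about.php')
```

Inside get_links() the same "continue 2" could go directly in the nested loop, with the code that handles a passing URL placed after the inner foreach.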