Help needed: running a regex on a large HTML file
Posted: Sun Apr 12, 2009 5:26 am
Hi all,
First, a little background on who I am and what I want to do:
I've been unemployed for the last few months. Here in Israel we have a really good website for job hunting, but for every job we want to apply to we have to send our CV by e-mail manually, copying the e-mail address and the subject (which includes the job ID) by hand.
** All of the jobs appear on one HTML page, without frames. **
So what I want to do is write a regex for the specific tags that contain the e-mail address, job title, and job ID, and pass them to a script that will filter them and send an e-mail if the job is something that interests me.
So far my regex attempts return no results. Below is an example of the regex I'm running:
preg_match_all('/\Wh\d class=PositionTitle\W[\w\s\\/]*\W\W\w\d\W/', $content, $match);
1st note: the HTML file size is around 513 KB.
2nd note: the HTML code includes Hebrew characters (it can be viewed with Windows-1255 or UTF-8 encoding).
3rd note: when I run the regex against a sample file like:
<h1 class=PositionTitle>Cde</h1>
<h2 class=PositionTitle>EFG</h2>
and similar lines that include Hebrew characters, the result is fine, so I'm thinking the file size may be breaking the regex.
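To make the third note concrete, here is a minimal sketch of that sample test. It assumes the two sample lines are the whole input. The `preg_last_error()` check is an extra step that was not part of the original attempt; it is included only on the assumption that it may show whether PCRE hit an internal limit (such as the backtrack limit) on the large file.

```php
<?php
// Sample input from the third note (assumed here to be the entire test file).
$content = "<h1 class=PositionTitle>Cde</h1>\n<h2 class=PositionTitle>EFG</h2>";

// Same pattern as in the preg_match_all() call above.
$pattern = '/\Wh\d class=PositionTitle\W[\w\s\\/]*\W\W\w\d\W/';

$count = preg_match_all($pattern, $content, $match);

// preg_last_error() reports whether the last PCRE call failed internally,
// for example with PREG_BACKTRACK_LIMIT_ERROR on a very large input.
if (preg_last_error() !== PREG_NO_ERROR) {
    echo 'PCRE error code: ' . preg_last_error() . "\n";
} else {
    echo "Matches: $count\n";
    print_r($match[0]); // the full text of each matched tag
}
```

Running the same check with `$content` loaded from the real 513 KB page should show whether the empty result comes from an internal error or simply from the pattern not matching the real markup.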
Hope that you'll be able to give me a hand here.
Thanks in advance!!
Truly Yours,
Yoni D.