One of the sites I'm building would greatly benefit from a specific RSS newsfeed. Within this newsfeed, there are links to specific sections of the source site's main page.
href="http://www.RSSsourceSite.com/#newsItem"
For some reason, these links produce a page error when followed from the site I'm developing. Rather than trying to figure out why they break, I would like to just trim off the "#newsItem" part. That way, users may not get directly to a specific area of a specific page, but at least they get to the right page.
So:
http://www.RSSsourceSite.com/#newsItem
The "http://www.RSSsourceSite.com/" part will always be the same. The "newsItem" part after the "#" could be any combination of letters, numbers, or URL-friendly characters. With the correct regex, I can preg_replace so that:
http://www.RSSsourceSite.com/#newsItem
- becomes -
http://www.RSSsourceSite.com/
I suck at REGEX, and it gives me a headache. Can anyone help?
Thanks in advance
Siv
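The transformation described above amounts to dropping everything from the "#" onward. As a minimal sketch for a single URL string, and not what the thread ends up using, plain string functions can do this without a regex. The helper name strip_fragment is hypothetical.

```php
<?php
// Hypothetical helper (not from the thread): drop everything from the first '#' onward.
function strip_fragment($url)
{
    // explode() with a limit of 2 splits on the first '#' only;
    // element 0 is the URL without its fragment.
    $parts = explode('#', $url, 2);
    return $parts[0];
}

echo strip_fragment('http://www.RSSsourceSite.com/#newsItem'), "\n";
// prints http://www.RSSsourceSite.com/
```

This only handles one URL at a time; rewriting links embedded in the feed markup still calls for something like preg_replace.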
REGEX and URL help please
- RobertGonzalez
- Site Administrator
- Posts: 14293
- Joined: Tue Sep 09, 2003 6:04 pm
- Location: Fremont, CA, USA
Code:
$stripped_string = preg_replace('%<a\s+href\s*=\s*"http://www.RSSsourceSite.com/(#(*A-Za-z0-9]\s*%', '<a href=http://www.RSSsourceSite.com/', $string_to_search);
preg_replace(): Compilation failed: nothing to repeat at offset 49 in ... [file] ... on line 11
and line 11 is
Code:
$rss = preg_replace('%<a\s+href\s*=\s*"http://www.RSSsourceSite.com/(#(*A-Za-z0-9]\s*%', '<a href=http://www.RSSsourceSite.com/', $rss);
- RobertGonzalez
Try this. I googled your error message and something came up about having '+' and '*' signs in the string. Try this code and see if it helps. If not, then I am at a loss too. I am no regexpert, so I am trying to offer what I can with what I know. (Now if we were talking about food, or wine, then we'd be in business.)
Code:
<?php
// These next two lines come from
// http://drupal.org/node/32370
$rss = str_replace('+', '\\+', $rss);
$rss = str_replace('*', '\\*', $rss);
$rss = preg_replace('%<a\s+href\s*=\s*"http://www.RSSsourceSite.com/(#(*A-Za-z0-9]\s*%', '<a href=http://www.RSSsourceSite.com/', $rss);
?>
Based on your original suggestion, the following seems to work:
Code:
$rss = preg_replace('%<a\s+href\s*=\s*"http://www.RSSsourceSite.com#[A-Za-z0-9]*\s*%', '<a href="http://www.RSSsourceSite.com/', $rss);
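For reference, offset 49 in the error message is the position of the "*" in "(#(*A-Za-z0-9]" in the first pattern. It directly follows an opening parenthesis, so there is nothing for it to repeat, and the character class is never opened with "[". Escaping '+' and '*' in $rss does not change that, because the problem is in the pattern rather than in the subject string. Below is a minimal sketch of a slightly more forgiving variant of the working pattern, assuming the links look like the example in the first post. The function name strip_news_anchors and the sample $rss value are hypothetical. It escapes the literal URL with preg_quote(), accepts an optional slash before the "#", and lets the fragment contain anything up to the closing quote.

```php
<?php
// Hypothetical helper; the name and the sample input below are illustrative only.
function strip_news_anchors($rss)
{
    // preg_quote() escapes regex metacharacters (such as '.') in the literal URL.
    // '%' is passed as the delimiter so it would be escaped too if present.
    $base = preg_quote('http://www.RSSsourceSite.com', '%');

    // /?      optional slash before the fragment
    // #[^"]*  the '#' and everything up to (not including) the closing quote
    $pattern = '%(<a\s+href\s*=\s*")' . $base . '/?#[^"]*%';

    // ${1} restores the matched '<a href="' prefix; the fragment is dropped.
    return preg_replace($pattern, '${1}http://www.RSSsourceSite.com/', $rss);
}

$rss = '<a href="http://www.RSSsourceSite.com/#newsItem">News</a>';
echo strip_news_anchors($rss), "\n";
// prints <a href="http://www.RSSsourceSite.com/">News</a>
```

Using [^"]* instead of [A-Za-z0-9]* means fragments containing hyphens, underscores, or other URL-friendly characters are trimmed as well, and links to other sites are left untouched.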