
Search from "x" to "y" in a specified st

Posted: Tue Jul 19, 2005 12:30 pm
by nutkenz
Hi,

I'm working on my first script ever but I'm already having problems. On this page:
http://php.belnet.be/manual/nl/function.preg-match.php

Someone there says you can do this with the following code:
email at albert-martin dot com
23-Oct-2004 11:39
Here is a faster way of extracting a special phrase from a HTML page:

Instead of using preg_match, e.g. like this:
preg_match("/<title>(.*)<\/title>/i", $html_content, $match);

use the following:
<?php
function ExtractString($str, $start, $end) {
$str_low = strtolower($str);
if (strpos($str_low, $start) !== false && strpos($str_lower, $end) !== false) {
$pos1 = strpos($str_low, $start) + strlen($start);
$pos2 = strpos($str_low, $end) - $pos1;
return substr($str, $pos1, $pos2);
}
}
$match = ExtractString($html_content, "<title>", "</title>");
?>
However, this did not seem to work, so I changed what looked like a typo to me ($str_lower was used where $str_low was meant).

Making my own code:

Code: Select all

<?
$input = "<ElTank> Silver Nariyid Boots, (6) craft (Silver), AL 350 (6Tinks) : Major Coordination, Impenetrability VI, Minor Focus, Bludgeon Bane VI. Dif 286, [1.3/1.0/1.0/1.1/0.4/0.6/0.4], Value 6,161, 400BU";

function ExtractString($str, $start, $end) {
  $str_low = strtolower($str);
  if (strpos($str_low, $start) !== false && strpos($str_low, $end) !== false) {
   $pos1 = strpos($str_low, $start) + strlen($start);
   $pos2 = strpos($str_low, $end) - $pos1;
   return substr($str, $pos1, $pos2);
  }
}

$name = ExtractString($input, "<ElTank>", ",");
echo $name;

?>
I tried this out on http://nutkenz.net/test/add.php but nothing is echoed...

The script is small but rather complicated (for a beginner), so I thought I'd ask here.

Posted: Tue Jul 19, 2005 12:38 pm
by djot
Perhaps you should write down what you really want to search for ...
Just text or html tags or what kind of data?

Posted: Tue Jul 19, 2005 12:55 pm
by nutkenz
djot wrote:Perhaps you should write down, what you really want to search for ...
Just text or html tags or what kind of data?
The example should output "Silver Nariyid Boots"

Some other examples:

<ELTank> Silver Nariyid Boots, (6) craft (Silver), AL 350 (6Tinks) : Major Coordination, Impenetrability VI, Minor Focus, Bludgeon Bane VI. Dif 286, [1.3/1.0/1.0/1.1/0.4/0.6/0.4], Value 6,161, 400BU

<ELTank> Gold Scalemail Girth, (8) craft (Gold), AL 203 : Lightning Bane VI, Impenetrability V, Major Focus, Flame Bane V, Frost Bane V. Dif 63, Gharu'ndim Only, (activate) Melee Defense of 247, [1.0/1.1/1.0/0.4/0.4/0.6/0.4], Value 12,819, 472BU

<ELTank> Long Leather Gauntlets, (6) craft (Leather), AL 138 (1Tinks) : Impenetrability V, Major Focus, Bludgeon Bane V. Dif 114, (activate) Missile Defense of 161, [1.0/0.8/1.0/0.5/0.5/0.3/0.6], Value 5,940, 247BU

<ELTank> Leather Amuli Leggings, (8) craft (Leather), AL 212 : Impenetrability V, Major Strength, Acid Bane V, Quickness Self VI. Dif 11, Rank 8, (activate) Missile Defense of 187, [1.0/0.8/1.0/0.5/1.0/1.0/0.6], Value 10,390, 3,188BU

<ELTank> Silver Covenant Pauldrons, (6) craft (Silver), AL 284 : Impenetrability VI, Major Focus, Strength Self VI. Dif 262, (wield) Magic Defense of 185, [1.3/1.4/1.4/0.8/0.6/0.6/0.8], Value 6,749, 301BU

I want it to search for the string between "<ELTank>" and ",".
This will sometimes be text and sometimes be numbers.

Posted: Tue Jul 19, 2005 12:58 pm
by wwwapu
The problem is that you strtolower() only $str, but not $start or $end.

Code: Select all

$input = "<ElTank> Silver Nariyid Boots, (6) craft (Silver), AL 350 (6Tinks) : Major Coordination, Impenetrability VI, Minor Focus, Bludgeon Bane VI. Dif 286, [1.3/1.0/1.0/1.1/0.4/0.6/0.4], Value 6,161, 400BU";

function ExtractString($str, $start, $end) {
  $str_low = strtolower($str);
// at this point $str_low (the lower-cased copy of $input) is: <eltank> silver nariyid...
// $start, however, is still <ElTank>, so strpos() cannot find it
// let's lower-case the search strings as well
  $start=strtolower($start);
  $end=strtolower($end);
  if (strpos($str_low, $start) !== false && strpos($str_low, $end) !== false) {
   $pos1 = strpos($str_low, $start) + strlen($start);
   $pos2 = strpos($str_low, $end) - $pos1;
   return substr($str, $pos1, $pos2);
  }
}

$name = ExtractString($input, "<ElTank>", ",");

echo $name;
If you had PHP 5 you could use stripos()
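
A minimal sketch of what that could look like, assuming PHP 5 or later is available. The function name extractBetweenCI is made up for this example, and the trim() call is an addition to drop the space after the tag:

```php
<?php
// Hypothetical helper (name invented for this sketch); needs PHP 5+ for stripos().
function extractBetweenCI($str, $start, $end) {
    // Case-insensitive search, so no lower-cased copy is needed
    $posStart = stripos($str, $start);
    if ($posStart === false) {
        return null; // start marker not found
    }
    $from = $posStart + strlen($start);
    // Look for the end marker only after the start marker
    $posEnd = stripos($str, $end, $from);
    if ($posEnd === false) {
        return null; // end marker not found
    }
    return trim(substr($str, $from, $posEnd - $from));
}

$input = "<ElTank> Silver Nariyid Boots, (6) craft (Silver), AL 350 (6Tinks)";
echo extractBetweenCI($input, "<eltank>", ","); // prints: Silver Nariyid Boots
```

Because both searches are case-insensitive, the caller can pass "<ElTank>", "<ELTank>" or "<eltank>" and get the same result.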

Posted: Tue Jul 19, 2005 1:00 pm
by djot
-
If data is always like this, you may split up the string into an array at the comma, then remove <ELTank> from the first array field.

Perhaps someone can post a regex for this since I hate them :)
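
The split-at-the-comma idea could be sketched roughly like this. It assumes every line starts with the tag, and the variable names are only illustrative:

```php
<?php
$input = "<ElTank> Silver Nariyid Boots, (6) craft (Silver), AL 350 (6Tinks) : Major Coordination";

// Split the line into fields at every comma
$fields = explode(',', $input);

// Assumption: the first field always begins with the tag, so cut off
// as many characters as the tag is long (this also ignores its case)
$name = trim(substr($fields[0], strlen('<ElTank>')));

echo $name; // prints: Silver Nariyid Boots
```

One caveat: values such as "Value 6,161" contain a comma themselves, so they end up spread over two fields and would need to be glued back together.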

djot
-

Posted: Tue Jul 19, 2005 1:02 pm
by Chris Corbyn
Sorry to tear that code apart but you really don't need all that stuff.
Try this ;)

Code: Select all

function grabWhatever($input) { //Please rename this to something suitable

    preg_match('#<ElTank>\s*([^,]+),#is', $input, $matches);
    return $matches[1];

}
So to use it...

Code: Select all

$input = "<ElTank> Silver Nariyid Boots, (6) craft (Silver), AL 350 (6Tinks) : Major Coordination, Impenetrability VI, Minor Focus, Bludgeon Bane VI. Dif 286, [1.3/1.0/1.0/1.1/0.4/0.6/0.4], Value 6,161, 400BU";

echo grabWhatever($input);
EDIT | Corrected index 0 to 1 :)

Posted: Tue Jul 19, 2005 1:09 pm
by nutkenz
d11wtq wrote:Sorry to tear that code apart but you really don't need all that stuff.
Try this ;)

Code: Select all

function grabWhatever($input) { //Please rename this to something suitable

    preg_match('#<ElTank>\s*([^,]+),#is', $input, $matches);
    return $matches[1];

}
So to use it...

Code: Select all

$input = "<ElTank> Silver Nariyid Boots, (6) craft (Silver), AL 350 (6Tinks) : Major Coordination, Impenetrability VI, Minor Focus, Bludgeon Bane VI. Dif 286, [1.3/1.0/1.0/1.1/0.4/0.6/0.4], Value 6,161, 400BU";

echo grabWhatever($input);
EDIT | Corrected index 0 to 1 :)
That would work perfectly if it only ever had to search from "<ElTank>" to ",", but these should be variables:

Code: Select all

<?
$input = "<ElTank> Silver Nariyid Boots, (6) craft (Silver), AL 350 (6Tinks) : Major Coordination, Impenetrability VI, Minor Focus, Bludgeon Bane VI. Dif 286, [1.3/1.0/1.0/1.1/0.4/0.6/0.4], Value 6,161, 400BU";

function ExtractString($input,$from,$to) {

    preg_match('#<ElTank>\s*([^,]+),#is', $input, $matches);
    return $matches[0];

}

$name = ExtractString($input, "<ElTank>", ",");
echo $name;

?>
How would I adjust #<ElTank>\s*([^,]+),#is for this purpose or can't you use variables like that?


Thank you

Posted: Tue Jul 19, 2005 1:45 pm
by Chris Corbyn
Now it *might* start getting complicated... there are certain characters you need to escape in regex - the dot "." being a big pitfall...

Also, will $to always be a single character? If not and you know anything about regex you'd see how this will be difficult to put into a generic function but I'll have a stab for you ;)

EDIT | Maybe the other function suits better after all. Not sure what I'm gonna come up with :?

Posted: Tue Jul 19, 2005 1:59 pm
by nutkenz
From ", (" to ")"
From ", AL " to " ("
From "AL ??? (" to " Tinks)" >>> I don't know how many characters or which characters that ??? will be
From "Value " to ", "
From "(wield) Melee Defense of " to ","
From "(wield) Missile Defense of " to ","
From "(wield) Magic Defense of " to ","
From "(activate) Melee Defense of " to ","
From "(activate) Missile Defense of " to ","
From "(activate) Magic Defense of " to ","
From "Rank " to ","
From "Dif " to ","
From " : " to ". "
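
A list of pairs like this could be driven from a table. The sketch below is only one possible way: the helper name extractBetween and the field names are invented for illustration, and it relies on PHP's built-in preg_quote() to escape the markers:

```php
<?php
// Hypothetical helper: text between $from and $to (case-insensitive), or null.
function extractBetween($string, $from, $to) {
    // preg_quote() escapes regex metacharacters and the '#' delimiter
    $re = '#' . preg_quote($from, '#') . '(.*?)' . preg_quote($to, '#') . '#is';
    return preg_match($re, $string, $m) ? $m[1] : null;
}

$input = "<ElTank> Silver Nariyid Boots, (6) craft (Silver), AL 350 (6Tinks) : Major Coordination, Impenetrability VI, Minor Focus, Bludgeon Bane VI. Dif 286, [1.3/1.0/1.0/1.1/0.4/0.6/0.4], Value 6,161, 400BU";

// Field names are made up; the marker pairs are taken from the list above.
$markers = array(
    'armor_level' => array(', AL ', ' ('),
    'value'       => array('Value ', ', '),
    'rank'        => array('Rank ', ','),
    'difficulty'  => array('Dif ', ','),
    'properties'  => array(' : ', '. '),
);

$fields = array();
foreach ($markers as $name => $pair) {
    // Markers that do not occur in this line simply yield null
    $fields[$name] = extractBetween($input, $pair[0], $pair[1]);
}

print_r($fields);
```

The "AL ??? (" pair is left out on purpose: it needs a wildcard in the start marker, which a plain from/to lookup like this cannot express.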

Posted: Tue Jul 19, 2005 1:59 pm
by Chris Corbyn
I really haven't tested this and I can't say it'll be any better than the original function. Sorry, I guess I misunderstood what you were trying to achieve originally ;)

Code: Select all

function ExtractString($string, $from, $to) {

	/*
	 Lifted directly from the regex_escape_string()
	 -- function I posted in Snippets
	 */
	 
	$nasties = array ( //Stuff to always escape
		'.',
		'?',
		'*',
		'+',
		'-',
		'^',
		'$',
		'(',
		')',
		'[',
		']',
		'{',
		'}',
		'\\',
		'|',
		'#'
	);
	
	$goodies = array();
	foreach ($nasties as $v) {
	
		$goodies[] = '\\'.$v; //Escaped versions
		
	}
	
	$from = str_replace($nasties, $goodies, $from); //Make it regex-proof
	$to = str_replace($nasties, $goodies, $to); //

	$re = '#'.$from.'(.*?)'.$to.'#is';
	preg_match($re, $string, $matches);
	return $matches[1];

}
Here's one thing to note: be as unambiguous as you can when giving $from and $to as arguments - i.e. the longer the better.

Posted: Tue Jul 19, 2005 2:22 pm
by nutkenz
It seems to work, thanks! I'll finish the first part soon, and I'll let you know if I have any problems :)

Posted: Tue Jul 19, 2005 2:28 pm
by nutkenz

Code: Select all

$armor_level = ExtractFromString($input, ", AL ", " (" );
echo $armor_level;
Gives me:

Warning: preg_match() [function.preg-match]: Compilation failed: missing ) at offset 14 in D:\Webs\AC Auction\_debug_tmp.php on line 38

Notice: Undefined offset: 1 in D:\Webs\AC Auction\_debug_tmp.php on line 3

I know it's one of the nasties, but how else am I supposed to do this? :(
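
The "missing )" message looks consistent with how str_replace() handles array arguments: it works through the list from left to right, so the backslash put in front of "(" is escaped a second time when the later '\\' entry is reached, which leaves the "(" unescaped. A short sketch, with the list trimmed to three entries for brevity and PHP's built-in preg_quote() shown for comparison:

```php
<?php
// Trimmed version of the $nasties list; the backslash entry comes after '('.
$nasties = array('.', '(', '\\');
$goodies = array('\\.', '\\(', '\\\\');

$to = ' (';

// Escaping by hand: '(' gains a backslash, then that backslash is doubled
$escapedByHand = str_replace($nasties, $goodies, $to);

// preg_quote() escapes each character once; '#' is the pattern delimiter
$escapedByPhp = preg_quote($to, '#');

echo $escapedByHand, "\n"; // space, two backslashes, then (
echo $escapedByPhp, "\n";  // space, one backslash, then (

$input = "<ElTank> Silver Nariyid Boots, (6) craft (Silver), AL 350 (6Tinks) : Major Coordination";
$re = '#' . preg_quote(', AL ', '#') . '(.*?)' . $escapedByPhp . '#is';
if (preg_match($re, $input, $matches)) {
    echo $matches[1], "\n"; // prints: 350
}
```

With the hand-escaped version the pattern would contain an escaped backslash followed by a bare "(", which opens a group that is never closed.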

Posted: Wed Jul 20, 2005 3:29 am
by nutkenz
Sigh... The solution was so easy: I just wasn't using lowercase for the search strings. This works:

Code: Select all

// Remove from actual version
$input = "<ElTank> Silver Nariyid Boots, (6) craft (Silver), AL 350 (6Tinks) : Major Coordination, Impenetrability VI, Minor Focus, Bludgeon Bane VI. Dif 286, [1.3/1.0/1.0/1.1/0.4/0.6/0.4], Value 6,161, 400BU";

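function ExtractFromString($str, $start, $end) {
   // Search a lower-cased copy so the match ignores case;
   // $start and $end must therefore be passed in lower case.
   $str_low = strtolower($str);
   $pos_start = strpos($str_low, $start);
   if ($pos_start === false) {
       return null; // start marker not found
   }
   // Look for $end only after the end of $start
   $pos1 = $pos_start + strlen($start);
   $pos_end = strpos($str_low, $end, $pos1);
   if ($pos_end === false) {
       return null; // end marker not found
   }
   // Cut the result out of the original string to keep its case
   return substr($str, $pos1, $pos_end - $pos1);
}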

$armor_name = ExtractFromString($input, '<eltank>', ',');
echo $armor_name;

$armor_level = ExtractFromString($input, ', al ', ' (' );
echo $armor_level;
Thanks a lot for your help though!