how to extract data from my files

jasongr
Forum Contributor
Posts: 206
Joined: Tue Jul 27, 2004 6:19 am

Post by jasongr »

Hello people

I would like to run the following check on my PHP files.
I have two PHP functions, iterate and display.
Here are their signatures:

Code: Select all

public function iterate($param1, $param2, $param3 = '');
function display($param1, $param2, $param3 = '');
$param1, $param2 and $param3 are all strings
I would like to know exactly where I am using these functions in the code and what parameters are being used.

I would like to write a script that iterates over all the PHP files (.php or .inc.php extension) in my project and uses a regular expression to find calls to these functions.
Whenever a usage is encountered, it should be written to a log file like so:
<$param1>\t<$param2>\t<path to file>\t<line #>\n

For example, if file a.php contains the code:

Code: Select all

echo display('test', 'Hello');
and file b.php contains the line:

Code: Select all

if ($obj->iterate('run', $val) == true) {
The log file should read:

Code: Select all

test   Hello   a.php   14
run    $val    b.php   20
where 14 is the line in a.php where the function was found and 20 is the line in b.php where the second function was found

I was wondering if someone could help me with this as my main problem is how to formulate the regular expression

regards
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Code: Select all

<?php

	// Three sample lines; \' is a literal single quote inside the test input.
	$text = 'if ($obj->iterate(\'run\', $val) == true)) {
	if ($obj->iterate(\'run\' . \'test\', $val) == true)) {
	if ($obj->iterate(\'run\' . $not_work, $val) == true)) {';
	
	// \\5 and \\8 reach the regex engine as the backreferences \5 and \8.
	preg_match_all('#(iterate|display)\s*\(((\s*(([\'"]).*?\\5)|\$.*?)\s*,){1,2}\s*((([\'"]).*?\\8)|\$.*?)\s*\)#s', $text, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);
	
	var_export($matches);

?>

Code: Select all

array (
  0 =>
  array (
    0 =>
    array (
      0 => 'iterate(\'run\', $val)',
      1 => 10,
    ),
    1 =>
    array (
      0 => 'iterate',
      1 => 10,
    ),
    2 =>
    array (
      0 => '\'run\',',
      1 => 18,
    ),
    3 =>
    array (
      0 => '\'run\'',
      1 => 18,
    ),
    4 =>
    array (
      0 => '\'run\'',
      1 => 18,
    ),
    5 =>
    array (
      0 => '\'',
      1 => 18,
    ),
    6 =>
    array (
      0 => '$val',
      1 => 25,
    ),
  ),
  1 =>
  array (
    0 =>
    array (
      0 => 'iterate(\'run\' . \'test\', $val)',
      1 => 55,
    ),
    1 =>
    array (
      0 => 'iterate',
      1 => 55,
    ),
    2 =>
    array (
      0 => '\'run\' . \'test\',',
      1 => 63,
    ),
    3 =>
    array (
      0 => '\'run\' . \'test\'',
      1 => 63,
    ),
    4 =>
    array (
      0 => '\'run\' . \'test\'',
      1 => 63,
    ),
    5 =>
    array (
      0 => '\'',
      1 => 63,
    ),
    6 =>
    array (
      0 => '$val',
      1 => 79,
    ),
  ),
)
note how the third line isn't found, due to the more complex nature of the expression involved and my lack of time to fiddle with it more.
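
For the other half of the question (walking the project and logging the file and line number), here is a minimal sketch under a few assumptions. scanCalls() and CALL_PATTERN are hypothetical names, not from the thread. The pattern is deliberately simpler than the one above: it only captures the first two arguments when each is a plain quoted string or a plain $variable, and it will also match the function definitions themselves.

```php
<?php
// Sketch only: CALL_PATTERN and scanCalls() are illustrative names.
// Captures argument 1 and 2 when each is 'text', "text" or $name.
const CALL_PATTERN = '#\b(?:iterate|display)\s*\(\s*(\'[^\']*\'|"[^"]*"|\$\w+)\s*,\s*(\'[^\']*\'|"[^"]*"|\$\w+)#';

/**
 * Returns one tab-separated row per call found under $root:
 * <param1>\t<param2>\t<path to file>\t<line #>
 */
function scanCalls(string $root): array
{
    $rows  = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($files as $file) {
        // 'x.inc.php' also has the extension 'php', so one check covers both.
        if (!$file->isFile() || $file->getExtension() !== 'php') {
            continue;
        }

        // Reading line by line gives the line number for free.
        $lines = file($file->getPathname(), FILE_IGNORE_NEW_LINES) ?: [];
        foreach ($lines as $i => $line) {
            if (!preg_match_all(CALL_PATTERN, $line, $found, PREG_SET_ORDER)) {
                continue;
            }
            foreach ($found as $m) {
                $rows[] = implode("\t", [
                    trim($m[1], '\'"'),   // strip surrounding quotes, keep $variables as-is
                    trim($m[2], '\'"'),
                    $file->getPathname(),
                    $i + 1,               // file() is 0-based, log lines are 1-based
                ]);
            }
        }
    }

    return $rows;
}
```

Something like file_put_contents('usage.log', implode("\n", scanCalls(__DIR__)) . "\n"); would then write the log. Matching one line at a time means a call whose arguments are split across several lines is missed; that is the trade-off for getting line numbers without offset arithmetic.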
timvw
DevNet Master
Posts: 4897
Joined: Mon Jan 19, 2004 11:11 pm
Location: Leuven, Belgium

Post by timvw »

your question has already been answered, but you might want to check out this project too: http://ctags.sourceforge.net

as far as I understand it, it does (much) more than what you asked for :)
jasongr
Forum Contributor
Posts: 206
Joined: Tue Jul 27, 2004 6:19 am

Post by jasongr »

thanks feyd

could you give me a short explanation of this complex regular expression?
I am very new to regular expressions and I tried to understand how you solved it, but I am lost
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

basics:
(iterate|display) = look for iterate or display
\s*\( = any amount of whitespace followed by an opening paren.
((['"]).*?\\5)|\$.*? = look for a quoted string (closed by the same quote character that opened it) or a rough approximation of a variable.
{1,2} = expect 1 to 2 of them, each followed by a comma.
((['"]).*?\\8)|\$.*? = the same again for the last argument: a quoted string or a rough approximation of a variable.
\s*\) = any amount of whitespace followed by the closing paren of the function call.
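
The same building blocks can be laid out one per line with the x modifier, which makes PCRE ignore whitespace and "#" comments inside the pattern. This is only a sketch: $pattern, $line and $m are illustrative names, it uses ~ as the delimiter because # starts a comment in x mode, it expects exactly two arguments, and it uses \$\w+ for variables instead of the looser \$.*? above, so the group numbers differ from the original (3 and 5 instead of 5 and 8).

```php
<?php
// Sketch only: a simplified, commented variant of the pattern, not the original.
// A nowdoc keeps PHP from touching the backslashes, so \3 is written once.
$pattern = <<<'REGEX'
~
(iterate|display)      # group 1: either function name
\s* \(                 # any whitespace, then the opening paren
\s*
( (['"]) .*? \3        # group 2: quoted string, closed by the quote saved in group 3
  | \$ \w+ )           #          or a simple variable
\s* , \s*              # the comma between the two arguments
( (['"]) .*? \5        # group 4: second argument, quote saved in group 5
  | \$ \w+ )
\s* \)                 # any whitespace, then the closing paren
~xs
REGEX;

$line = 'if ($obj->iterate(\'run\', $val) == true) {';
preg_match($pattern, $line, $m);

echo $m[1], ' | ', $m[2], ' | ', $m[4], "\n";   // prints: iterate | 'run' | $val
```

Groups 3 and 5 exist only so the backreference can require the matching closing quote; the values of interest are in groups 2 and 4.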
jasongr
Forum Contributor
Posts: 206
Joined: Tue Jul 27, 2004 6:19 am

Post by jasongr »

I have 2 questions:
what do the numbers 5 and 8 mean in ((['"]).*?\\5)|\$.*? and ((['"]).*?\\8)|\$.*?

and why do you precede them with \\

I notice that you surround your expression with #. Is that a convention?
I have also seen expressions that are surrounded by '
Is there a difference?
Must I surround my regular expressions with # ?

thanks
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

you have to surround your pattern with a delimiter character... any character that isn't a letter, digit, backslash or whitespace will work. However, some characters are metacharacters for patterns. Although you can use them as delimiters, it's usually better to pick a symbol that isn't one of them. ^+-*()[]{}=?|!$ are all metacharacters in some form or another. / @ # are the ones I've seen used most often as pattern start and end markers (surrounding a pattern).

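the 5 and 8 are back references to marked segments of the pattern. In both cases they refer to ['"], but from differing positions. \\5 references the 5th subpattern that's marked for remembering, so the string has to be closed by the same quote character that opened it. The double backslash is PHP string escaping: inside a single-quoted string, \\ produces one backslash, so the regex engine receives \5.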
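
A small sketch of both points, with made-up names ($body, $subject, $d and $results are not from the thread): the same pattern body gives the same result under three different delimiters, and the backreference only accepts the quote character that opened the string.

```php
<?php
// Sketch only: all variable names here are illustrative.

// In a single-quoted PHP string, '\\1' and '\1' are the same two characters
// (backslash, 1). In a double-quoted string "\1" would be an octal escape
// instead, so writing the doubled form is the safe habit.
var_dump('\\1' === '\1');

// Group 1 remembers which quote opened the string; \1 demands the same one to close it.
$body    = '([\'"])(.*?)\\1';
$subject = 'a "double" and a \'single\' and a "mixed\' one';

// Same pattern body, three different delimiters.
$results = [];
foreach (['#', '/', '~'] as $d) {
    preg_match_all($d . $body . $d, $subject, $m);
    $results[$d] = $m[2];   // group 2 is the text between the quotes
    echo $d, ' => ', implode(', ', $m[2]), "\n";
}
```

Each delimiter finds "double" and "single" but not the "mixed' fragment, because its opening and closing quotes differ. The delimiter only has to be a character that does not appear unescaped inside the pattern body.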