Regex parsing

mathewvp
Forum Commoner
Posts: 28
Joined: Wed Apr 23, 2003 10:28 am

Post by mathewvp »

I have a text database file which has entries like

//--First Record

|:010 Keyword1:itsvalue
:020 Keyword2:anothervalue
:030 Keyword3:thirdvalue
.......
......
....
:500 Keyword500:somevalue

//Second Record
|:0610 Keyword1:value
......
Records are separated by the "|" (OR) symbol.

How do I parse it so that I can separate each record and extract the value for each keyword?

Keywords and values look like "LOCATION:someplace", "EMPLOYEE:contract", etc.

Can anybody help please?
patrikG
DevNet Master
Posts: 4235
Joined: Thu Aug 15, 2002 5:53 am
Location: Sussex, UK

Post by patrikG »

I wouldn't necessarily use RegEx for that - here's an alternative:

I would read the file using file(), then use foreach and explode the lines along the pipe, i.e. |

e.g.:

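<?php
// Read the file into an array of lines, without trailing newlines.
$content = file("myfile.txt", FILE_IGNORE_NEW_LINES);
$resultArray = array();
foreach ($content as $line)
   {
   // Split each line into items along the pipe.
   $record = explode("|", $line);
   foreach ($record as $item)
       {
        // Skip items that have no key:value separator.
        if (strpos($item, ":") === false) continue;
        // A limit of 2 keeps any further colons inside the value.
        list($key, $value) = explode(":", $item, 2);
        $resultArray[][$key] = $value;
       }
   }
?>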
This is a quick script from the hip. I don't have time to test it thoroughly, but I reckon you'll get what I intend to do even if it doesn't run.
mathewvp
Forum Commoner
Posts: 28
Joined: Wed Apr 23, 2003 10:28 am

Post by mathewvp »

Thanks, patrik, but that won't work for me. Each line starts with a :linenumber, and the data will also contain ":" characters, as in website addresses (http://). I cannot use explode because I need the values :010, :020, :030, etc. These are actually line numbers, and some of those lines can be missing in the next record, so it's important to keep them. Another problem I face is that the keyword values can be multi-line, with line breaks in between.
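
For what it's worth, the requirements above (keep the line numbers, allow ":" inside values, allow multi-line values) could be sketched roughly as below. This is only a sketch under assumptions, not a tested solution for the real file: parseRecords() is a made-up name, "|" is assumed to appear only as the record separator, lines starting with "//" are assumed to be comments, blank lines inside a value are dropped, and a line counts as a new entry only when it looks like ":digits keyword:value".

```php
<?php
// Hypothetical helper (not from the thread): split the text database into
// records, keeping each entry's line number, keyword and value.
function parseRecords(string $text): array
{
    $records = [];
    // Assumption: "|" only ever appears as the record separator.
    foreach (explode('|', $text) as $chunk) {
        $record = [];
        $last = null; // index of the entry currently being built
        foreach (preg_split('/\R/', $chunk) as $line) {
            // A new entry looks like ":010 Keyword1:itsvalue". The keyword
            // stops at the first colon, so later colons (http://) stay in the value.
            if (preg_match('/^:(\d+)[ \t]+([^:]+):(.*)$/', $line, $m)) {
                $record[] = [
                    'line'    => $m[1],       // kept as a string, e.g. "010"
                    'keyword' => trim($m[2]),
                    'value'   => $m[3],
                ];
                $last = count($record) - 1;
            } elseif ($last === null || trim($line) === '' || strncmp($line, '//', 2) === 0) {
                // Nothing to continue, a blank line, or an assumed "//" comment.
                continue;
            } else {
                // Any other line is treated as a continuation of the previous value.
                $record[$last]['value'] .= "\n" . $line;
            }
        }
        if ($record !== []) {
            $records[] = $record;
        }
    }
    return $records;
}
```

Something like parseRecords(file_get_contents("myfile.txt")) would then give one array per record, and a missing line number in a record simply means there is no entry with that 'line' value. The line number is stored as a field rather than used as the array key because PHP would turn a key such as "500" into an integer while leaving "010" as a string.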