trying to extract special words from a binary file...

Fredix
Forum Contributor
Posts: 101
Joined: Fri Jul 18, 2003 2:16 pm
Location: Wehr (Eifel) Germany

Post by Fredix »

Hi folks,
I think I need your help really quickly.

I have the following problem:
I'm trying to extract all words ending in, e.g., UNG from a binary file that looks like this:

[quote]ABSPEISENN9ABSPENSTIGú/ABSPERRENÈ;ABSPERRGITTERý;ABSPERRHAHNý;ABSPERRHAHNEý;ABSPERRHAHNEEN‚<ABSPERRKETTEú2ABSPERRUNG6=ABSPERRVENTILú7ABSPIELú9ABSPIELENï@ABSPLITTERNú@ABSPRACHEúBABSPRACHEGEMASSHCABSPRECHEN´EABSPREIZENSFABSPRENGENúDABSPRINGEN¹JABSPRITZENúGABSPRUNGõMABSPRUNGBALKENúIABSPULENúMABSTAINúQABSTAMMENúVABSTAMMUNGQABSTAMMUNGEN¸QABSTAMMUNGSLEHREúYABSTANDúmABSTANDENÄVABSTANDSSUMMEúoABSTATTEN:XABSTAUBEN$ZABSTAUBERÀZABSTAUBERINÀZABSTAUBERINEN$ZABSTAUBERSúrABSTAUBERTORútABSTECHENü]ABSTECHERü]ABSTECHERSw^ABSTECKENúxABSTEHEN£_ABSTEHENDEÙÂABSTEHENDENúzABSTEIGEú|ABSTEIGENNGENCNCVME:0( ÷íàÕÌú­¤
DuFF
Forum Contributor
Posts: 495
Joined: Tue Jun 24, 2003 7:49 pm
Location: USA

Post by DuFF »

preg_match() can probably do what you are looking for.
Fredix
Forum Contributor
Posts: 101
Joined: Fri Jul 18, 2003 2:16 pm
Location: Wehr (Eifel) Germany

Post by Fredix »

Well, I know there is preg_match for searching strings, but I want to know how I could use it in this situation!

If I just preg_match for ung, I get the WHOLE file, but I only need the word, i.e. the string in front of it!
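
For reference, a minimal sketch of how preg_match_all() could be applied to this kind of data. It assumes the words consist only of the uppercase letters A-Z and that everything between them is arbitrary binary data. The helper name words_ending_in() and the sample string are made up for illustration. Two caveats: a binary byte next to a word that happens to be an uppercase letter cannot be told apart from the word itself, so a hit may carry a stray leading letter, and longer words that merely contain the suffix (e.g. ABSTAMMUNGEN) are reported up to the suffix as well.

```php
<?php
// Hypothetical helper: collect every run of A-Z that ends in $suffix.
function words_ending_in($data, $suffix = 'UNG')
{
    // [A-Z]+? is lazy, so each match stops at the first occurrence of the
    // suffix instead of swallowing several adjacent words at once.
    $pattern = '/[A-Z]+?' . preg_quote($suffix, '/') . '/';
    preg_match_all($pattern, $data, $matches);
    return $matches[0];
}

// Made-up sample in the style of the file: words separated by binary bytes.
$sample = "ABSPERRKETTE\xFA2ABSPERRUNG6=ABSPERRVENTIL\xFA7ABSPIEL";
print_r(words_ending_in($sample));
```

preg_match() alone stops after the first hit, which is why preg_match_all() is used here; the pattern runs without the /u modifier so that the non-text bytes in the file do not cause the match to fail.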
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Post by Weirdan »

Code: Select all

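// Shell approach: `strings` pulls the printable runs out of the file,
// grep keeps the lines that end in "ung" (-i ignores case, the sample is uppercase).
$res = shell_exec('strings ' . escapeshellarg($filename) . " | grep -i 'ung$'");
echo $res;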
JAM
DevNet Resident
Posts: 2101
Joined: Fri Aug 08, 2003 6:53 pm
Location: Sweden

Post by JAM »

Or if you want to script it:

Code: Select all

<pre>
<?php
    $filename = "the.file";
    // Read the whole file in binary mode.
    $handle = fopen($filename, "rb");
    $s = fread($handle, filesize($filename));
    fclose($handle);
    echo 'This is the file: '.$s.'<br>';

    print_r(thewords($s));

    // Returns every run of uppercase letters in $s that ends in one of the
    // $needles (several needles can be separated by any non-word character).
    function thewords($s, $needles = 'UNG') {
        $charsallowed = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
        $pp = array();
        $i = 0;
        foreach (preg_split('/[^\w]/', $needles) as $needle) {
            if ($needle === '') { continue; }
            $pcur = 0;
            // strpos() with an offset returns the absolute position of the next hit.
            while (($index = strpos($s, $needle, $pcur)) !== false) {
                $pcur = $index + strlen($needle);
                // Walk backwards from the needle while the characters are allowed,
                // without running past the start of the string.
                $e = 0;
                while ($e < $index && strspn($s[$index - $e - 1], $charsallowed)) {
                    $e++;
                }
                $i++;
                $pp[$i] = substr($s, $index - $e, $e).$needle;
            }
        }
        return $pp ? $pp : false;
    }
?>
</pre>
Results:

Code: Select all

This is the file: BSPEISEN...cut...EGRAFIEREN
Array
(
    [1] => ABSPERRUNG
    [2] => GABSPRUNG
    [3] => ABSPRUNG
    [4] => VABSTAMMUNG
    [5] => ABSTAMMUNG
    [6] => ABSTAMMUNG
    [7] => ABSTIMMUNG
    [8] => BABSTIMMUNG
    [9] => ABSTIMMUNG
    [10] => ABSTIMMUNG
    [11] => ABSTIMMUNG
    [12] => ABSTOSSUNG
    [13] => ABSTOSSUNG
    [14] => ABSTRAFUNG
    [15] => ABSTRAFUNG
    [16] => ABSTRAHLUNG
    [17] => ABSTUFUNG
    [18] => ABSTUFUNG
    [19] => ABSTUTZUNG
    [20] => STRAFUNG
    [21] => ABTEILUNG
    [22] => ABTEILUNG
    [23] => YABTEILUNG
    [24] => YABTEILUNG
    [25] => ABTEILUNG
    [26] => ABTEILUNG
)
Not sure that fits your exact needs (hit 25 also had a small b in it), but change it around to fit your needs. Need comments?
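
The list above contains repeated hits. If only distinct words are wanted, one possible post-processing step is sketched below. The $words array is a made-up excerpt in the shape of the result array, not the real output, and this step does nothing about the stray leading letters.

```php
<?php
// Hypothetical excerpt in the shape of the result array above.
$words = array(1 => 'ABSPERRUNG', 2 => 'ABSTAMMUNG', 3 => 'ABSTAMMUNG', 4 => 'ABTEILUNG');

// array_unique() keeps the first occurrence of each value;
// array_values() renumbers the keys from 0.
$unique = array_values(array_unique($words));
sort($unique);
print_r($unique);
```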