get count from each word

theclay7
Forum Commoner
Posts: 50
Joined: Wed Feb 19, 2003 3:17 am

Post by theclay7 »

If I have a file (result.txt) that contains:

egg
apple
egg
orange
banana
"chinese word"
"italian word"
"chinese word"

and I would like to count the frequency of each term, how would I do that? I have a site that provides search. Every keyword searched for is stored in a table, and now I would like to see which words are requested most often.

Do I also need to use trim() or other functions to remove the whitespace and double quotes?

Thank you.
JAM
DevNet Resident
Posts: 2101
Joined: Fri Aug 08, 2003 6:53 pm
Location: Sweden

Post by JAM »

Use file() to read a file into an array. In the script further down, I created the array directly, but you should get the point. I hope this gives you some ideas.
Use explode() and/or regular expressions or string functions to strip spaces and quotes before counting.
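
For the stripping step, a minimal sketch, assuming each line of the file holds one term that may be wrapped in double quotes. The helper name normalize_term() is made up for illustration and is not part of the original post.

```php
<?php
// Hypothetical helper: strip surrounding whitespace and double quotes from one line.
function normalize_term(string $line): string
{
    // trim() takes an optional character list; this one is the default
    // whitespace set plus the double quote.
    return trim($line, " \t\n\r\0\x0B\"");
}

// Stand-in for what file('result.txt') would return (lines keep their newline).
$lines = array("egg\n", "  apple \n", "\"chinese word\"\n");
$terms = array_map('normalize_term', $lines);
print_r($terms);
```

Only characters at the start and end of each line are removed, so the space inside a two-word term is kept.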

Output first, then the code that produced it:

Array
(
    [0] => egg
    [1] => apple
    [2] => egg
    [3] => orange
    [4] => banana
    [5] => "chinese word"
    [6] => "italian word"
    [7] => "chinese word"
)
Array
(
    [egg] => 2
    [apple] => 1
    [orange] => 1
    [banana] => 1
    ["chinese word"] => 2
    ["italian word"] => 1
)

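<pre>
<?php
    // Sample data; in practice, load the lines with file('result.txt').
    $array = array('egg','apple','egg','orange','banana','"chinese word"','"italian word"','"chinese word"');
    $result = array();
    // For every term, rescan the whole list and count the matches.
    // This is quadratic in the number of terms, so it is slow on large inputs.
    foreach ($array as $value) {
        $i = 0;
        foreach ($array as $counter) {
            // Strict comparison, so numeric-looking strings are not compared as numbers.
            if ($counter === $value) {
                $i++;
            }
        }
        $result[$value] = $i;
    }
    print_r($array);
    print_r($result);
?>
</pre>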
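
The nested loops compare every term with every other term. As a possible alternative, PHP's built-in array_count_values() builds the same term-to-count map in a single pass. A minimal sketch using the same sample array:

```php
<?php
$array = array('egg', 'apple', 'egg', 'orange', 'banana',
               '"chinese word"', '"italian word"', '"chinese word"');

// array_count_values() returns an array keyed by value, with the count as the value.
$result = array_count_values($array);

// Sort by count, highest first, keeping the terms as keys.
arsort($result);
print_r($result);
```

The quotes are still part of the keys here, so any stripping has to happen before the count.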
theclay7
Forum Commoner
Posts: 50
Joined: Wed Feb 19, 2003 3:17 am

Post by theclay7 »

Thanks. I tried it, but it always times out.

1) I have more than 100,000 records, which is a headache.

2) How can I group similar terms like the following:

hithway 200
hithway 300
hithway special
hithway

Any clue? Thanks for your help again.
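
One way to read "similar" in the list above is "shares the same first word". A rough sketch under that assumption; the function name group_by_first_word() is hypothetical, and other definitions of similarity would need different logic.

```php
<?php
// Hypothetical helper: count terms grouped by their first whitespace-separated word.
function group_by_first_word(array $terms): array
{
    $groups = array();
    foreach ($terms as $term) {
        $term = trim($term);
        if ($term === '') {
            continue; // skip blank entries
        }
        // Split on the first run of whitespace; the limit of 2 leaves the rest untouched.
        $parts = preg_split('/\s+/', $term, 2);
        $key = strtolower($parts[0]);
        $groups[$key] = ($groups[$key] ?? 0) + 1;
    }
    arsort($groups); // highest count first
    return $groups;
}

$terms = array('hithway 200', 'hithway 300', 'hithway special', 'hithway', 'egg');
$groups = group_by_first_word($terms);
print_r($groups);
```

With the sample data, all four "hithway" variants end up under one key.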

:roll:
theclay7
Forum Commoner
Posts: 50
Joined: Wed Feb 19, 2003 3:17 am

Post by theclay7 »

Actually, I am trying to output the data from an Oracle database.

Each day there may be 10,000+ entries in the search field, with different words in different patterns.

I would like to count the top 10 most frequent words and also list the rest in descending order. So I tried to spool all the output into a file, but it took hours and the text file is huge, which is not efficient.

Does anyone know an SQL command that will do this? Or, if I use PHP, how can I make it more efficient?

Thanks.
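
If the data does end up in a large text file, one option is to read it line by line instead of loading it all at once, and to count in an associative array. A sketch under the assumption of one term per line; count_terms_in_file() is a made-up name, and the temporary file only stands in for the real spool file.

```php
<?php
// Hypothetical helper: count terms line by line so the whole file is never held in memory.
function count_terms_in_file(string $path): array
{
    $counts = array();
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return $counts; // file could not be opened
    }
    while (($line = fgets($handle)) !== false) {
        // Strip surrounding whitespace and double quotes.
        $term = trim($line, " \t\n\r\0\x0B\"");
        if ($term === '') {
            continue;
        }
        $counts[$term] = ($counts[$term] ?? 0) + 1;
    }
    fclose($handle);
    arsort($counts); // highest count first
    return $counts;
}

// Demo data written to a temporary file.
$path = tempnam(sys_get_temp_dir(), 'terms');
file_put_contents($path, "egg\napple\negg\norange\n\"chinese word\"\n\"chinese word\"\negg\n");

$counts = count_terms_in_file($path);
$top10 = array_slice($counts, 0, 10, true); // true keeps the terms as keys
print_r($top10);
unlink($path);
```

Each line costs one array lookup, so the running time grows linearly with the number of lines.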
JAM
DevNet Resident
Posts: 2101
Joined: Fri Aug 08, 2003 6:53 pm
Location: Sweden

Post by JAM »

Ugh, not asking for much, are you? ;)

What I can think of is...

http://se.php.net/manual/en/function.set-time-limit.php
To change the execution time limit in PHP (you might also want to read the first user comment about Apache).

http://se.php.net/manual/en/function.soundex.php
For something that might be worth using to match similar strings. Long shot...
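
A small sketch of how soundex() could be used for this, assuming English-like terms. soundex() only looks at ASCII letters, so it would not help with the Chinese keywords. The misspelled sample terms are invented for illustration.

```php
<?php
$terms = array('orange', 'orenge', 'banana', 'bananna', 'egg');

// Group terms that share a soundex code.
$groups = array();
foreach ($terms as $term) {
    $groups[soundex($term)][] = $term;
}
print_r($groups);

// Terms that sound alike get the same code.
var_dump(soundex('orange') === soundex('orenge'));
```

Counting the size of each group then gives a frequency per sound-alike cluster rather than per exact spelling.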

But if you have, or can get, access to a database, that would be preferred. All of the above (counting, grouping similar terms, no limit on the amount of data) is much easier if the data is stored in a database.

Edit: I was writing this while you posted your last post. Let us think.
JAM
DevNet Resident
Posts: 2101
Joined: Fri Aug 08, 2003 6:53 pm
Location: Sweden

Post by JAM »

theclay7 wrote:
Actually, I am trying to output the data from an Oracle database.

Each day there may be 10,000+ entries in the search field, with different words in different patterns.

I would like to count the top 10 most frequent words and also list the rest in descending order. So I tried to spool all the output into a file, but it took hours and the text file is huge, which is not efficient.

Does anyone know an SQL command that will do this? Or, if I use PHP, how can I make it more efficient?

Thanks.

I'm not too familiar with Oracle, but COUNT() and GROUP BY exist there, so you should have no trouble getting the results you want from database queries.
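
A sketch of what such a query could look like. The table name search_log and the column keyword are assumptions, since the real schema has not been posted, and an in-memory SQLite database (through PDO, which needs the pdo_sqlite extension) stands in for Oracle so the example can run on its own.

```php
<?php
// Stand-in database: in-memory SQLite. The real data lives in Oracle.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE search_log (keyword TEXT)'); // hypothetical table

$insert = $db->prepare('INSERT INTO search_log (keyword) VALUES (?)');
foreach (array('egg', 'apple', 'egg', 'orange', 'egg', 'apple') as $keyword) {
    $insert->execute(array($keyword));
}

// COUNT(*) with GROUP BY lets the database do the counting and the sorting.
$sql = 'SELECT keyword, COUNT(*) AS hits
        FROM search_log
        GROUP BY keyword
        ORDER BY hits DESC, keyword';
$rows = $db->query($sql)->fetchAll(PDO::FETCH_ASSOC);

// Row-limiting syntax differs between databases, so the top 10 is cut in PHP here.
$top10 = array_slice($rows, 0, 10);
print_r($top10);
```

Only the aggregated rows leave the database this way, so there is no need to spool every search entry to a file first.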

Could you post some example data from the table holding the searched-for data?