get count from each word
Posted: Tue Oct 28, 2003 12:35 am
by theclay7
if I have a file (result.txt) that contains :
egg
apple
egg
orange
banana
"chinese word"
"italian word"
"chinese word"
and I would like to count the frequency of each term, do you know how? I have a site that provides search; every keyword searched is stored in a table, and now I would like to see which words are requested most frequently.
Do I also need to use trim() or other functions to remove the whitespace and double quotes?
Thank you.
Posted: Tue Oct 28, 2003 4:01 am
by JAM
Use file() to read a file into an array. In the code below I created the array directly, but you should get the point. I hope this gives you some ideas.
Use explode() and/or regular expressions/string functions to strip spaces and quotes before this.
Code:
Array
(
[0] => egg
[1] => apple
[2] => egg
[3] => orange
[4] => banana
[5] => "chinese word"
[6] => "italian word"
[7] => "chinese word"
)
Array
(
[egg] => 2
[apple] => 1
[orange] => 1
[banana] => 1
["chinese word"] => 2
["italian word"] => 1
)
<pre>
<?php
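// Sample data; in practice, build this array with file('result.txt').
$array = array('egg','apple','egg','orange','banana','"chinese word"','"italian word"','"chinese word"');
$result = array();
foreach ($array as $value) {
    // Count how many entries are identical to the current value.
    $i = 0;
    foreach ($array as $counter) {
        // Strict comparison, so numeric-looking strings such as '1e1' and '10' are not treated as equal.
        if ($counter === $value) { $i++; }
    }
    $result[$value] = $i;
}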
print_r($array);
print_r($result);
?>
</pre>
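The nested loops above compare every entry with every other entry, so the work grows with the square of the number of lines. A minimal single-pass sketch follows, assuming the lines come from file('result.txt') and that only surrounding whitespace and double quotes need stripping. The sample $lines array is a stand-in for the real file, and the variable names are only illustrative.

```php
<?php
// Stand-in for file('result.txt'): each entry keeps its trailing newline.
$lines = array("egg\n", "apple\n", "egg\n", "orange\n", "banana\n",
               "\"chinese word\"\n", "\"italian word\"\n", "\"chinese word\"\n");

$terms = array();
foreach ($lines as $line) {
    // Strip surrounding whitespace and double quotes in one call.
    $term = trim($line, " \t\r\n\"");
    if ($term !== '') {
        $terms[] = $term;
    }
}

// Built-in single-pass counter: returns term => number of occurrences.
$counts = array_count_values($terms);

// Sort by count, highest first, keeping the term as the key.
arsort($counts);

print_r($counts);
```

array_slice($counts, 0, 10, true) would then give the ten most frequent terms.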
Posted: Tue Oct 28, 2003 7:47 pm
by theclay7
Thanks. However, I tried it and it always times out, because:
1) I have more than 100,000 records, which is a headache.
2) How can I group similar terms, such as the following:
hithway 200
hithway 300
hithway special
hithway
Any clue? Thanks for your help again.

Posted: Tue Oct 28, 2003 7:56 pm
by theclay7
Actually, I am trying to output the data from an Oracle database.
Each day there may be 10,000+ entries in the search field, with different words in different patterns.
I would like to count the top 10 most frequent words and also count the rest in descending order. Therefore I tried to spool all the output into a file, but it took hours and the text file is huge, which is not efficient.
Does anyone know an SQL command that will do my task? Or, if using PHP, how can I make it more efficient?
Thanks.
Posted: Tue Oct 28, 2003 8:00 pm
by JAM
Ugh, not asking for much, are you?
What I can think of is...
http://se.php.net/manual/en/function.set-time-limit.php
To change the execution time setting in PHP (you might also want to read the first user comment about Apache).
http://se.php.net/manual/en/function.soundex.php
For something that might be worth using to match similar strings. Long shot...
But if you have, or can get, access to a database, that would be preferred. All of the above (counting, grouping similar terms, no limit on the amount of data) can be done much more easily if you have the data stored in a database.
Edit: I was writing this while you posted your last post. Let us think.
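As a rough illustration of the soundex() idea, here is a minimal sketch that groups terms by the soundex key of their first word. Treating the first word as the group identifier is an assumption, not something stated in the thread, and 'hithwey 200' is a made-up misspelling added to show that near-identical spellings land in the same group.

```php
<?php
// Sample terms from the thread, plus one hypothetical misspelling.
$terms = array('hithway 200', 'hithway 300', 'hithway special', 'hithway', 'hithwey 200');

$groups = array();
foreach ($terms as $term) {
    // Assumption: the first word decides which group a term belongs to.
    $words = explode(' ', trim($term));

    // soundex() maps similar-sounding words to the same four-character key.
    $key = soundex($words[0]);

    $groups[$key][] = $term;
}

print_r($groups);
```

Soundex is tuned for English pronunciation, so it can both merge unrelated words and split related ones; it is a long shot, as noted above.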
Posted: Tue Oct 28, 2003 8:03 pm
by JAM
theclay7 wrote:Actually, I am trying to output the data from an Oracle database.
Each day there may be 10,000+ entries in the search field, with different words in different patterns.
I would like to count the top 10 most frequent words and also count the rest in descending order. Therefore I tried to spool all the output into a file, but it took hours and the text file is huge, which is not efficient.
Does anyone know an SQL command that will do my task? Or, if using PHP, how can I make it more efficient?
Thanks.
Not too familiar with Oracle, but COUNT() and GROUP BY exist there, so you should have no trouble getting the results you want with database queries.
Could you post some example data from the table holding the searched-for data?
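To make the COUNT() and GROUP BY suggestion concrete, here is a minimal sketch that uses an in-memory SQLite database through PDO as a stand-in for Oracle (it needs the pdo_sqlite extension). The table name search_log and the column name keyword are hypothetical, since the real schema has not been posted. The LIMIT clause is SQLite syntax; Oracle restricts rows differently, for example by filtering on ROWNUM around an ordered subquery.

```php
<?php
// Assumption: in-memory SQLite stands in for the Oracle database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table and column names, filled with sample keywords.
$db->exec('CREATE TABLE search_log (keyword TEXT)');
$insert = $db->prepare('INSERT INTO search_log (keyword) VALUES (?)');
foreach (array('egg', 'apple', 'egg', 'orange', 'banana', 'egg', 'apple') as $keyword) {
    $insert->execute(array($keyword));
}

// COUNT(*) with GROUP BY lets the database do the counting and the sorting.
$sql = 'SELECT keyword, COUNT(*) AS hits
        FROM search_log
        GROUP BY keyword
        ORDER BY hits DESC, keyword ASC
        LIMIT 10';

$top = array();
foreach ($db->query($sql) as $row) {
    $top[$row['keyword']] = (int) $row['hits'];
}

print_r($top);
```

Because the counting happens inside the database, only the aggregated rows travel to PHP, which avoids spooling every search entry into a file first.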