Type of data to store

keenlearner
Forum Commoner
Posts: 50
Joined: Sun Dec 03, 2006 7:19 am

Post by keenlearner »

I am building an English part-of-speech tagger using a Hidden Markov Model. One of the values it requires is P(Word|Tag), which is the probability of seeing a particular word given a part-of-speech tag. E.g. for the word "race":
Total number of nouns = 12345
Total number of occurrences of "race" as a noun = 20

P(race|noun) = 20 / 12345 = 0.0016200891

In my program I will need the probability value 0.0016200891, so my question is: which is better?
1. Store the calculated probability value directly in the database, or
2. Store the "total number of nouns" and the "total number of race as a noun", then calculate the probability at run time
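For comparison, here is a minimal sketch of what option 2 could look like, written in Python with an in-memory SQLite database so it runs on its own. The table names (`tag_counts`, `word_tag_counts`) and the function `emission_probability` are hypothetical and only illustrate the idea; the same queries would work from PHP.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag_counts (
        tag   TEXT PRIMARY KEY,
        total INTEGER NOT NULL
    );
    CREATE TABLE word_tag_counts (
        word  TEXT NOT NULL,
        tag   TEXT NOT NULL,
        total INTEGER NOT NULL,
        PRIMARY KEY (word, tag)
    );
""")

# The two counts from the example above.
conn.execute("INSERT INTO tag_counts VALUES ('noun', 12345)")
conn.execute("INSERT INTO word_tag_counts VALUES ('race', 'noun', 20)")


def emission_probability(conn, word, tag):
    """Return P(word | tag) computed from the stored counts at query time."""
    row = conn.execute(
        """
        SELECT 1.0 * w.total / t.total
        FROM word_tag_counts AS w
        JOIN tag_counts AS t ON t.tag = w.tag
        WHERE w.word = ? AND w.tag = ?
        """,
        (word, tag),
    ).fetchone()
    # A word never seen with this tag has no row; treat it as probability 0.
    return row[0] if row else 0.0


p_race_noun = emission_probability(conn, "race", "noun")
print(p_race_noun)
```

The division happens inside the query, so only the raw counts are stored and the probability is always consistent with them.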


Thank you.
aaronhall
DevNet Resident
Posts: 1040
Joined: Tue Aug 13, 2002 5:10 pm
Location: Back in Phoenix, missing the microbrews

Re: Type of data to store

Post by aaronhall »

Your schema really shouldn't hold aggregate data alongside the data you're aggregating; there's almost always a better way. If the queries get too expensive, you can probably cache the calculated probability per word, though whether that is necessary is subjective.
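One way the caching idea might look, as a rough sketch only: the counts stay the source of truth and the computed probability is memoized in memory. The dictionaries `TAG_TOTALS` and `WORD_TAG_TOTALS` stand in for the database tables and are assumptions, as is the function name.

```python
from functools import lru_cache

# Stand-ins for the stored counts (hypothetical; normally read from the database).
TAG_TOTALS = {"noun": 12345}
WORD_TAG_TOTALS = {("race", "noun"): 20}


@lru_cache(maxsize=None)
def cached_emission_probability(word, tag):
    """Compute P(word | tag) once per (word, tag) pair and reuse the result."""
    tag_total = TAG_TOTALS.get(tag, 0)
    if tag_total == 0:
        return 0.0
    return WORD_TAG_TOTALS.get((word, tag), 0) / tag_total


first = cached_emission_probability("race", "noun")   # computed
second = cached_emission_probability("race", "noun")  # served from the cache
hits = cached_emission_probability.cache_info().hits
```

If the counts change, the cache has to be invalidated, for example with `cached_emission_probability.cache_clear()`, which is the usual cost of caching a derived value.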
dhampson
Forum Newbie
Posts: 19
Joined: Mon Mar 24, 2008 8:01 pm

Re: Type of data to store

Post by dhampson »

Option 2. It will be easier to modify, update data, and spot mistakes. It may take a little more time now, but it could save you a lot in the future.

--Dave
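To illustrate the point about updates, a small sketch under the assumption that new tagged text keeps arriving: with raw counts, adding an observation is two increments and the probability follows automatically. The names `add_observation` and `probability` are made up for this example.

```python
from collections import Counter

# Raw counts, starting from the numbers in the original question.
tag_totals = Counter({"noun": 12345})
word_tag_totals = Counter({("race", "noun"): 20})


def add_observation(word, tag):
    """Record one more occurrence of `word` tagged as `tag`."""
    tag_totals[tag] += 1
    word_tag_totals[(word, tag)] += 1


def probability(word, tag):
    """P(word | tag) derived from the current counts."""
    total = tag_totals[tag]
    return word_tag_totals[(word, tag)] / total if total else 0.0


before = probability("race", "noun")   # 20 / 12345
add_observation("race", "noun")
after = probability("race", "noun")    # 21 / 12346
```

With option 1, the same update would also change the denominator for every other noun, so every stored P(word|noun) would need to be recalculated.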