Type of data to store
Posted: Tue Mar 11, 2008 8:28 pm
I am building an English part-of-speech tagger using a Hidden Markov Model. One of the values it requires is P(Word|Tag), the probability of a word given a particular part-of-speech tag. For example, for the word "race":
Total number of nouns = 12345
Total number of occurrences of "race" as a noun = 20
P(race|noun) = 20 / 12345 ≈ 0.0016200891
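As a minimal sketch of that calculation, assuming the counts are held in plain Python dictionaries (the names `tag_counts`, `word_tag_counts` and `emission_probability` are only placeholders for illustration):

```python
from collections import Counter

# Hypothetical in-memory counts, using the numbers from the example above.
tag_counts = Counter({"noun": 12345})
word_tag_counts = Counter({("race", "noun"): 20})

def emission_probability(word, tag):
    """Return P(word | tag) as count(word, tag) / count(tag)."""
    total = tag_counts[tag]
    if total == 0:
        # The tag was never seen, so there is nothing to divide by.
        return 0.0
    return word_tag_counts[(word, tag)] / total

p = emission_probability("race", "noun")
print(p)
```

A word/tag pair that was never counted simply comes back as 0.0 here; a real tagger would probably want some smoothing instead.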
My program needs the probability value 0.0016200891, so my question is: which is better?
1. Store the calculated probability value directly in the database, or
2. Store the two counts (the total number of nouns and the number of times "race" appears as a noun) and calculate the probability at run time
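To make option 2 concrete, here is a rough sketch of what it could look like, assuming SQLite through Python's built-in `sqlite3` module. The table and column names are made up for illustration and are not from an existing schema.

```python
import sqlite3

# In-memory database for illustration; the schema below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag_counts (tag TEXT PRIMARY KEY, total INTEGER)")
conn.execute(
    "CREATE TABLE word_tag_counts ("
    "word TEXT, tag TEXT, total INTEGER, PRIMARY KEY (word, tag))"
)
conn.execute("INSERT INTO tag_counts VALUES ('noun', 12345)")
conn.execute("INSERT INTO word_tag_counts VALUES ('race', 'noun', 20)")

def lookup_probability(conn, word, tag):
    """Option 2: read both counts and divide them at query time."""
    row = conn.execute(
        "SELECT CAST(w.total AS REAL) / t.total "
        "FROM word_tag_counts AS w "
        "JOIN tag_counts AS t ON t.tag = w.tag "
        "WHERE w.word = ? AND w.tag = ?",
        (word, tag),
    ).fetchone()
    # No matching row means the word was never seen with this tag.
    return row[0] if row else 0.0

p_option2 = lookup_probability(conn, "race", "noun")
print(p_option2)
```

With option 1 the same lookup would be a single-column read from one table, but every stored probability for a tag would have to be recomputed whenever that tag's total count changes.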
Thank you.