Postgres BETWEEN two timestamps
Posted: Tue Nov 30, 2010 6:07 pm
Im having problems currently with a postgres database where i am trying to select between datetime1 and datetime2.
The query itself is a tad more complicated than that (will provide full query in a sec), however its important to understand the context of the situation.
I have a database that currently is storing data in the following format every 15 minutes of every day, 24 hours a day:
node_id, poll_stamp, value
where:
node_id - the id of a sensor
poll_stamp - the current timestamp at the time of the sensor polling. format: Y-m-d H:i:s+00
value - you guessed it, a value.
This has led a very large number of records, aprox 850k so far, which also presents the problem of having an efficient query.
There are two requirements of the query which i am trying to meet.
1. Get a cumulative value of all values vetween #2
2. Select all values between 9am day 1 and 8:59am day 2, (this is a requirement, days MUST be measured between 9am day 1 and 9am day 2)
This query will be effective for a single day's worth of data.
This is the query i currently have:
[text]SELECT t.node_id, t.poll_stamp, (SELECT sum(x.value) FROM sensor_data x WHERE x.poll_stamp <= t.poll_stamp AND extract(DOY from poll_stamp) = '2010-12-01 09:00:00') AS value FROM sensor_data t WHERE t.poll_stamp >= '2010-12-01 09:00:00' AND t.poll_stamp <= '2010-12-02 08:59:00' ORDER BY t.poll_stamp[/text]
I have also tried
[text]SELECT t.node_id, t.poll_stamp, (SELECT sum(x.value) FROM sensor_data x WHERE x.poll_stamp <= t.poll_stamp AND extract(DOY from poll_stamp) = '2010-12-01 09:00:00') AS value FROM sensor_data t WHERE t.poll_stamp BETWEEN '2010-12-01 09:00:00' AND '2010-12-02 08:59:00' ORDER BY t.poll_stamp[/text]
The error im receiving with this (using PHP to execute) is:
[text]Warning: pg_query() [function.pg-query]: Query failed: ERROR: invalid input syntax for type double precision: "2010-12-01 09:00:00" in /web/sites/default/dashboard/class.database.php on line 55[/text]
There is also another issue with the above query, it seems to be somewhat inefficient when selecting a large amount of data, due to the cumulative values. This problem will be multiplied by a huge amount, because there is a need to show a cumulative value as described above, for every day of an entire year in a single graph.
So in other words, there is a need to have multiple graphs which can show data from 1 day, 1 week, 1 month, 1 year.
The yearly would need to be averaged out obviously, however with the graph, the cumulative value MUST be reset after each grouping.
So for a 1 week view, on the graph, each day at 9am, the cumulative value would be reset to 0.
Example: if the first cumulative value at the END of day 1 (8:59am) were to be 85.9, then at 9:00am day 2, would be reset to 0, so that it then counts from 9am-9am the next day.
This would mean that with the current query, i would need to run it 7 times in order to generate a weeks worth of data, and if that original query is already slow, then you can see where the efficiency problem is.
I hope i have explained this enough, if you need more info about anything, just let me know.
cheers
Weiry
The query itself is a tad more complicated than that (will provide full query in a sec), however its important to understand the context of the situation.
I have a database that currently is storing data in the following format every 15 minutes of every day, 24 hours a day:
node_id, poll_stamp, value
where:
node_id - the id of a sensor
poll_stamp - the current timestamp at the time of the sensor polling. format: Y-m-d H:i:s+00
value - you guessed it, a value.
This has led a very large number of records, aprox 850k so far, which also presents the problem of having an efficient query.
There are two requirements of the query which i am trying to meet.
1. Get a cumulative value of all values vetween #2
2. Select all values between 9am day 1 and 8:59am day 2, (this is a requirement, days MUST be measured between 9am day 1 and 9am day 2)
This query will be effective for a single day's worth of data.
This is the query i currently have:
[text]SELECT t.node_id, t.poll_stamp, (SELECT sum(x.value) FROM sensor_data x WHERE x.poll_stamp <= t.poll_stamp AND extract(DOY from poll_stamp) = '2010-12-01 09:00:00') AS value FROM sensor_data t WHERE t.poll_stamp >= '2010-12-01 09:00:00' AND t.poll_stamp <= '2010-12-02 08:59:00' ORDER BY t.poll_stamp[/text]
I have also tried
[text]SELECT t.node_id, t.poll_stamp, (SELECT sum(x.value) FROM sensor_data x WHERE x.poll_stamp <= t.poll_stamp AND extract(DOY from poll_stamp) = '2010-12-01 09:00:00') AS value FROM sensor_data t WHERE t.poll_stamp BETWEEN '2010-12-01 09:00:00' AND '2010-12-02 08:59:00' ORDER BY t.poll_stamp[/text]
The error im receiving with this (using PHP to execute) is:
[text]Warning: pg_query() [function.pg-query]: Query failed: ERROR: invalid input syntax for type double precision: "2010-12-01 09:00:00" in /web/sites/default/dashboard/class.database.php on line 55[/text]
There is also another issue with the above query, it seems to be somewhat inefficient when selecting a large amount of data, due to the cumulative values. This problem will be multiplied by a huge amount, because there is a need to show a cumulative value as described above, for every day of an entire year in a single graph.
So in other words, there is a need to have multiple graphs which can show data from 1 day, 1 week, 1 month, 1 year.
The yearly would need to be averaged out obviously, however with the graph, the cumulative value MUST be reset after each grouping.
So for a 1 week view, on the graph, each day at 9am, the cumulative value would be reset to 0.
Example: if the first cumulative value at the END of day 1 (8:59am) were to be 85.9, then at 9:00am day 2, would be reset to 0, so that it then counts from 9am-9am the next day.
This would mean that with the current query, i would need to run it 7 times in order to generate a weeks worth of data, and if that original query is already slow, then you can see where the efficiency problem is.
I hope i have explained this enough, if you need more info about anything, just let me know.
cheers
Weiry