Postgres BETWEEN two timestamps

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

Post Reply
User avatar
Weiry
Forum Contributor
Posts: 323
Joined: Wed Sep 09, 2009 5:55 am
Location: Australia

Postgres BETWEEN two timestamps

Post by Weiry »

Im having problems currently with a postgres database where i am trying to select between datetime1 and datetime2.

The query itself is a tad more complicated than that (will provide full query in a sec), however its important to understand the context of the situation.

I have a database that currently is storing data in the following format every 15 minutes of every day, 24 hours a day:
node_id, poll_stamp, value

where:
node_id - the id of a sensor
poll_stamp - the current timestamp at the time of the sensor polling. format: Y-m-d H:i:s+00
value - you guessed it, a value.

This has led a very large number of records, aprox 850k so far, which also presents the problem of having an efficient query.

There are two requirements of the query which i am trying to meet.
1. Get a cumulative value of all values vetween #2
2. Select all values between 9am day 1 and 8:59am day 2, (this is a requirement, days MUST be measured between 9am day 1 and 9am day 2)

This query will be effective for a single day's worth of data.
This is the query i currently have:
[text]SELECT t.node_id, t.poll_stamp, (SELECT sum(x.value) FROM sensor_data x WHERE x.poll_stamp <= t.poll_stamp AND extract(DOY from poll_stamp) = '2010-12-01 09:00:00') AS value FROM sensor_data t WHERE t.poll_stamp >= '2010-12-01 09:00:00' AND t.poll_stamp <= '2010-12-02 08:59:00' ORDER BY t.poll_stamp[/text]
I have also tried
[text]SELECT t.node_id, t.poll_stamp, (SELECT sum(x.value) FROM sensor_data x WHERE x.poll_stamp <= t.poll_stamp AND extract(DOY from poll_stamp) = '2010-12-01 09:00:00') AS value FROM sensor_data t WHERE t.poll_stamp BETWEEN '2010-12-01 09:00:00' AND '2010-12-02 08:59:00' ORDER BY t.poll_stamp[/text]

The error im receiving with this (using PHP to execute) is:
[text]Warning: pg_query() [function.pg-query]: Query failed: ERROR: invalid input syntax for type double precision: "2010-12-01 09:00:00" in /web/sites/default/dashboard/class.database.php on line 55[/text]


There is also another issue with the above query, it seems to be somewhat inefficient when selecting a large amount of data, due to the cumulative values. This problem will be multiplied by a huge amount, because there is a need to show a cumulative value as described above, for every day of an entire year in a single graph.

So in other words, there is a need to have multiple graphs which can show data from 1 day, 1 week, 1 month, 1 year.
The yearly would need to be averaged out obviously, however with the graph, the cumulative value MUST be reset after each grouping.
So for a 1 week view, on the graph, each day at 9am, the cumulative value would be reset to 0.

Example: if the first cumulative value at the END of day 1 (8:59am) were to be 85.9, then at 9:00am day 2, would be reset to 0, so that it then counts from 9am-9am the next day.

This would mean that with the current query, i would need to run it 7 times in order to generate a weeks worth of data, and if that original query is already slow, then you can see where the efficiency problem is.

I hope i have explained this enough, if you need more info about anything, just let me know.

cheers
Weiry
User avatar
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Postgres BETWEEN two timestamps

Post by Christopher »

The problem is that your second date should be the next day (the 13th in your examples):

Code: Select all

WHERE t.poll_stamp BETWEEN '2010-12-01 09:00:00' AND '2010-13-02 08:59:00'
// OR
WHERE t.poll_stamp >= '2010-12-01 09:00:00' AND t.poll_stamp <= '2010-13-02 08:59:00'
And it probably should be:

Code: Select all

WHERE t.poll_stamp >= '2010-12-01 09:00:00' AND t.poll_stamp < '2010-13-02 09:00:00'
(#10850)
User avatar
Weiry
Forum Contributor
Posts: 323
Joined: Wed Sep 09, 2009 5:55 am
Location: Australia

Re: Postgres BETWEEN two timestamps

Post by Weiry »

Christopher wrote:The problem is that your second date should be the next day (the 13th in your examples):
[text]WHERE t.poll_stamp BETWEEN '2010-12-01 09:00:00' AND '2010-13-02 08:59:00'[/text]
Actually the date format you presented there is not a valid date format... Y-m-d H:i:s, not Y-d-m H:i:s, and i am using the Y-m-d format, i am selecting the 1st and the 2nd of the month.
Christopher wrote:And it probably should be:

Code: Select all

WHERE t.poll_stamp >= '2010-12-01 09:00:00' AND t.poll_stamp < '2010-13-02 09:00:00'
Your example wants me to select 1 month and 1 day? Also refer to my previous point.



EDIT:: I forgot to post this originally, this is the error i receive.
Warning: pg_query() [function.pg-query]: Query failed: ERROR: invalid input syntax for type double precision: "2010-12-01 09:00:00"
Last edited by Weiry on Wed Dec 01, 2010 6:27 pm, edited 1 time in total.
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: Postgres BETWEEN two timestamps

Post by VladSun »

Seems like a double post ...
viewtopic.php?f=1&t=124927

You need type casting:

Code: Select all

WHERE t.poll_stamp BETWEEN timestamp '2010-12-01 09:00:00' AND timestamp '2010-13-02 08:59:00'
Please, use [ syntax=sql ] BB tags.
There are 10 types of people in this world, those who understand binary and those who don't
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: Postgres BETWEEN two timestamps

Post by VladSun »

You may calculate cumulative values by using JOINs or subselects instead of user variables (though I'm not sure what will be the performance issues).

Probably the best will be a windowing function (I love PGSQL ;) )

Code: Select all

select 
	test.cdate,
	test.id,
	sum(test.id) over (rows unbounded preceding)
from 
	test 
group by 
	test.cdate,
	test.id
order by
	test.id
http://www.postgresql.org/docs/current/ ... indow.html
There are 10 types of people in this world, those who understand binary and those who don't
User avatar
Weiry
Forum Contributor
Posts: 323
Joined: Wed Sep 09, 2009 5:55 am
Location: Australia

Re: Postgres BETWEEN two timestamps

Post by Weiry »

VladSun wrote: You need type casting:

Code: Select all

WHERE t.poll_stamp BETWEEN timestamp '2010-12-01 09:00:00' AND timestamp '2010-13-02 08:59:00'
Updated the query i have to the following:

Code: Select all

SELECT t.node_id, t.poll_stamp, 
(SELECT sum(x.value) 
     FROM sensor_data x 
     WHERE x.poll_stamp <= t.poll_stamp AND extract(DOY from poll_stamp) = '2010-12-02 09:00:00' AND x.node_id = '65') AS value 
FROM sensor_data t 
WHERE t.node_id = '65' AND t.poll_stamp BETWEEN timestamp '2010-12-02 09:00:00' AND timestamp '2010-12-03 08:59:00'
ORDER BY t.poll_stamp
And i still receive the same error:
[text]Warning: pg_query() [function.pg-query]: Query failed: ERROR: invalid input syntax for type double precision: "2010-12-02 09:00:00" [/text]

I will have a look at that windowing function now, still pretty new to PGSQL
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: Postgres BETWEEN two timestamps

Post by VladSun »

Weiry wrote:And i still receive the same error...
Again ...
VladSun wrote:You need type casting...
every where you use strings instead of timestamps...
There are 10 types of people in this world, those who understand binary and those who don't
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: Postgres BETWEEN two timestamps

Post by VladSun »

Code: Select all

extract(DOY FROM poll_stamp) = '2010-12-02 09:00:00'
That's an interesting comparison :twisted: Rethink it ...
There are 10 types of people in this world, those who understand binary and those who don't
User avatar
Weiry
Forum Contributor
Posts: 323
Joined: Wed Sep 09, 2009 5:55 am
Location: Australia

Re: Postgres BETWEEN two timestamps

Post by Weiry »

VladSun wrote:

Code: Select all

extract(DOY FROM poll_stamp) = '2010-12-02 09:00:00'
That's an interesting comparison :twisted: Rethink it ...
And i think thats a classic example of staring at code for too long. 8O :roll:

I couldn't exactly remove that, however i did replace it.
Full query now:

Code: Select all

SELECT t.fk_sensor_node_id, t.poll_stamp, 
   (SELECT sum(x.value) 
    FROM sensor_data x 
    WHERE x.poll_stamp <= t.poll_stamp 
        AND x.poll_stamp BETWEEN timestamp '2010-11-13 09:00:00+00' AND timestamp '2010-11-14 09:00:00+00' 
        AND x.node_id = '65') AS value 
FROM sensor_data t 
WHERE t.node_id = '65' 
    AND t.poll_stamp BETWEEN timestamp '2010-11-13 09:00:00+00' AND timestamp '2010-11-14 09:00:00+00' 
ORDER BY t.poll_stamp DESC
This generates exactly what i need and works reasonably well.

Now i have to scale this out to be able to include weeks,months and year.
The only way i know of to do this is to use date_truc(), although i am slightly concerned with the large number of data and the processing that its going to be doing.

Since we have an application entering this data into the database, i think i will ask if there can be an additional table created which dumps an average value per day, this should reduce the amount of processing required when a whole year's worth of data is queried rather than try to get a query to do this all in one.
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: Postgres BETWEEN two timestamps

Post by VladSun »

Did you try

Code: Select all

sum(tableFieldHere) over (rows unbounded preceding) as aliasName
instead fo subselecting?
There are 10 types of people in this world, those who understand binary and those who don't
User avatar
Weiry
Forum Contributor
Posts: 323
Joined: Wed Sep 09, 2009 5:55 am
Location: Australia

Re: Postgres BETWEEN two timestamps

Post by Weiry »

I did, it immediately gave me the error:

[text]Warning: pg_query() [function.pg-query]: Query failed: ERROR: syntax error at or near "over" LINE 1: ...T t.node_id, t.poll_stamp, sum(t.value) over (rows... ^[/text]
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: Postgres BETWEEN two timestamps

Post by VladSun »

PostgreSQL 8.4.5 here. Works fine.
There are 10 types of people in this world, those who understand binary and those who don't
User avatar
Weiry
Forum Contributor
Posts: 323
Joined: Wed Sep 09, 2009 5:55 am
Location: Australia

Re: Postgres BETWEEN two timestamps

Post by Weiry »

Running PostgreSQL 8.3.11 and causes the error.

What version did 'over' appear in pgsql?
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: Postgres BETWEEN two timestamps

Post by VladSun »

I don't know but I'm using a "stable" version ...
There are 10 types of people in this world, those who understand binary and those who don't
Post Reply