The data table currently holds 870,969,236 records, and this is an example query I'm running:
SELECT
    data.*,
    DATE_FORMAT(data.reading_datetime, '%Y-%m') AS chart_date,
    units_of_measurement.unit AS unitname,
    dataset.name,
    dataset.parameter,
    AVG(data.value) AS average,
    STD(data.value) AS standard_deviation,
    COUNT(data.value) AS num_of_results
FROM
    data
LEFT JOIN
    dataset ON data.dataset_id = dataset.id
LEFT JOIN
    units_of_measurement ON dataset.uom_id = units_of_measurement.id
WHERE
    reading_datetime > '2000-07-16' AND
    reading_datetime < '2009-07-25' AND
    (
        dataset_id = 4 OR
        dataset_id = 8 OR
        dataset_id = 13 OR
        dataset_id = 24
    )
GROUP BY
    data.longitude,
    data.latitude,
    data.dataset_id,
    DATE_FORMAT(data.reading_datetime, '%Y'),
    DATE_FORMAT(data.reading_datetime, '%m')
ORDER BY
    data.reading_datetime ASC

I've added one index to the data table on (dataset_id, reading_datetime), which has reduced the rows being examined from 870,969,236 to 90,060,802, but I'd like to get this down further if possible.
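For reference, this is roughly how I created that index (the index name is just my own label, not anything significant):

```sql
-- Composite index: equality column (dataset_id) first,
-- then the range column (reading_datetime).
ALTER TABLE data
    ADD INDEX idx_dataset_datetime (dataset_id, reading_datetime);
```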
I have also tried adding another index covering the GROUP BY columns, (longitude, latitude, dataset_id, reading_datetime), but this didn't reduce the rows being examined at all.
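The rows figures above come from EXPLAIN. In case it helps anyone spot the problem, this is a sketch of the check (simplified to the WHERE clause, since that's what drives the index choice):

```sql
-- Shows which index the optimizer picks (key) and the estimated
-- number of rows it expects to scan (rows).
EXPLAIN
SELECT data.dataset_id, COUNT(*)
FROM data
WHERE reading_datetime > '2000-07-16'
  AND reading_datetime < '2009-07-25'
  AND dataset_id IN (4, 8, 13, 24)
GROUP BY data.dataset_id;
```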
If anyone has any suggestions or methods for how I could optimize the table, I'd love to hear them.
Thanks in advance