New air quality statistics

The air quality data we recently began to measure has been expanded to include statistics for this month, this year and all-time.  This data is now available on the this month and year and all-time record pages.

In the short period that we have been measuring air quality the air is exceptionally clean. For the vast majority of the time the particulates and air quality index values have been extremely low (good) and have been elevated to a slight degree only for fairly short periods of time.

To include these new statistics changes were required to be made to our database. Since the recording of air quality began from 26 December 2020, a routine is ran at the start of each day to fetch the data from the previous day to insert a new row into a database table for that day. However generally only the averages and maximums were shown, without the earliest time of the occurrence. With these improvements now the averages, minimum and maximums are measured for each day along with the time of minimums and maximums.

However, it should noted that these additions began from 7 February 2021, because we only retain the real-time data for 7 days (for data storage reasons). Therefore the time of occurrence will not be shown for statistics before this date. Also we collecting a vast range of daily parameters, most of which haven’t been published to a great deal on the web site, but is being collected for completeness and for future analysis.

To provide a bit more information about how this works a MySQL database procedure (as shown below) is scheduled to run at the start of each day to insert the previous day data into a table of daily air quality averages and extremes. This uses the ROW_NUMBER window function to calculate the earliest time for each individual minimum and maximum for each day. A query for each of one of these calculations are then joined to a query that essentially merges these calculations into one table. Because these calculations are rather expensive a variable retrieves the latest date of data already inserted into the data to limit the quantity of data that is calculated to only the data not already captured in the daily table.

These queries are both performed on the real-time table (every 10 seconds) and an 5 minute interval logged data table, with the results inserted into temporary tables. These two temporary tables are then used to determine the greatest value of each of these data sources to insert the data for each column of data. This is so that the data is suitably captured due any possible network outages that may affect the currency of data. The SQL COALESCE function is used to return the first non- null value from these data sources, with preference provided to the Real-time table.

A PHP script contains SQL queries that retrieve the necessary information from this table and are assigns to variables when the web page is visited. This allows for these statistics to be shown on the web page and is the same principle as several other statistics on these and other pages. The SQL queries are fairly simple because the daily air quality data already shows all the required summarised information for each day.