New air quality statistics

The air quality data we recently began measuring has been expanded to include statistics for this month, this year and all-time. These statistics are now available on the this month, this year and all-time record pages.

In the short period that we have been measuring air quality, the air has been exceptionally clean. For the vast majority of the time the particulate and air quality index values have been extremely low (good), and have been elevated only slightly and only for fairly short periods.

Including these new statistics required changes to our database. Since air quality recording began on 26 December 2020, a routine has run at the start of each day to fetch the previous day's data and insert a new row into a database table for that day. However, generally only the averages and maximums were stored, without the earliest time of occurrence. With these improvements the averages, minimums and maximums are now recorded for each day, along with the times of the minimums and maximums.

However, it should be noted that these additions begin from 7 February 2021, because we only retain the real-time data for 7 days (for data storage reasons). The time of occurrence will therefore not be shown for statistics before this date. We are also collecting a vast range of daily parameters, most of which have not been published on the web site to any great extent, but which are being collected for completeness and for future analysis.

To provide a bit more information about how this works, a MySQL database procedure (as shown below) is scheduled to run at the start of each day to insert the previous day's data into a table of daily air quality averages and extremes. This uses the ROW_NUMBER window function to calculate the earliest time of each individual minimum and maximum for each day. A query for each of these calculations is then joined to a query that essentially merges them into one table. Because these calculations are rather expensive, a variable retrieves the latest date already inserted into the daily table, limiting the calculation to only the data not already captured.
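To make the window-function step concrete, here is a minimal sketch in Python with SQLite (not the site's actual MySQL procedure, and with illustrative table and column names): ROW_NUMBER ranks the readings within each day so that the earliest occurrence of the daily maximum can be selected.

```python
import sqlite3

# Illustrative only: rank readings within each day so the earliest
# occurrence of the daily maximum can be picked out. Table and column
# names are hypothetical, not the site's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE realtime (obs_time TEXT, pm25 REAL)")
conn.executemany(
    "INSERT INTO realtime VALUES (?, ?)",
    [
        ("2021-02-07 01:00:00", 3.0),
        ("2021-02-07 09:30:00", 8.5),   # first time the daily max occurs
        ("2021-02-07 14:00:00", 8.5),   # same maximum, later in the day
        ("2021-02-08 02:10:00", 5.1),
        ("2021-02-08 11:45:00", 4.2),
    ],
)

rows = conn.execute("""
    SELECT obs_day, obs_time, pm25 FROM (
        SELECT date(obs_time) AS obs_day,
               obs_time,
               pm25,
               ROW_NUMBER() OVER (
                   PARTITION BY date(obs_time)
                   ORDER BY pm25 DESC, obs_time ASC
               ) AS rn
        FROM realtime
    ) AS ranked
    WHERE rn = 1
    ORDER BY obs_day
""").fetchall()

for day, when, value in rows:
    print(day, when, value)
```

Ordering by value descending and then by time ascending means row number 1 within each day is always the earliest reading that equals the daily maximum; the same idea with `pm25 ASC` gives the earliest minimum.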

These queries are performed on both the real-time table (every 10 seconds) and a 5-minute interval logged data table, with the results inserted into temporary tables. These two temporary tables are then used to determine the greatest value from each of these data sources and insert the data for each column. This ensures the data is suitably captured despite any network outages that may affect the currency of either source. The SQL COALESCE function is used to return the first non-null value from these data sources, with preference given to the real-time table.
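The COALESCE preference can be illustrated with a small sketch (again SQLite in Python, with hypothetical table and column names): the real-time value is used where present, and the logged value fills any gap.

```python
import sqlite3

# Illustrative only: COALESCE returns the first non-null argument, so
# listing the real-time column first gives it preference, while the
# 5-minute logged column fills gaps left by outages.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rt_daily  (obs_day TEXT PRIMARY KEY, pm25_max REAL);
    CREATE TABLE log_daily (obs_day TEXT PRIMARY KEY, pm25_max REAL);
    INSERT INTO rt_daily  VALUES ('2021-02-07', 8.5), ('2021-02-08', NULL);
    INSERT INTO log_daily VALUES ('2021-02-07', 8.3), ('2021-02-08', 5.0);
""")

rows = conn.execute("""
    SELECT rt.obs_day,
           COALESCE(rt.pm25_max, lg.pm25_max) AS pm25_max
    FROM rt_daily rt
    JOIN log_daily lg ON lg.obs_day = rt.obs_day
    ORDER BY rt.obs_day
""").fetchall()
print(rows)  # real-time value kept on the 7th; logged value fills the 8th
```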

A PHP script contains SQL queries that retrieve the necessary information from this table and assign it to variables when the web page is visited. This allows these statistics to be shown on the web page, and is the same principle used for several other statistics on these and other pages. The SQL queries are fairly simple because the daily air quality table already contains all the required summarised information for each day.

Solar and air quality data

New data is now available on multiple pages incorporating solar radiation, sunshine and air quality, following the start of recording these parameters on 26 December 2020. These sensors were installed along with the change to a Weatherlink Live receiving unit between 19 and 22 December 2020, and were then tested for compatibility with our processes and existing data.

The following pages have been updated with additional data:

  • The home page: solar and air quality data have been added to the real-time updates, sunshine/night-time bands have been added to the real-time graph, the station forecast has been removed and a Bureau of Meteorology forecast added
  • Gauges: a gauge for solar radiation was added
  • Today and Yesterday: averages, totals and extremes for solar, insolation, sunshine and air quality added
  • System Status: reworked to align with the system changes and to provide additional system information now available with this change
  • Glossary: updated with the additional terms Air Quality, Insolation, Solar Radiation and Sunshine.

A routine now runs at the start of each day to collate the air quality data into an additional daily database table used in the Historical graphs. The Realtime database table was altered to include air quality data. A monthly air quality database table is also populated with data every 5 minutes.
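As a rough sketch of what a daily collation routine of this kind does (illustrative names only, not the actual schema), the 5-minute records are grouped into one summary row per day:

```python
import sqlite3

# Illustrative only: aggregate 5-minute air quality records into one
# row per day, as a daily collation routine would.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly_aq (obs_time TEXT, aqi REAL)")
conn.executemany("INSERT INTO monthly_aq VALUES (?, ?)", [
    ("2020-12-26 00:05:00", 10.0),
    ("2020-12-26 00:10:00", 14.0),
    ("2020-12-27 00:05:00", 20.0),
])
conn.execute("""
    CREATE TABLE daily_aq AS
    SELECT date(obs_time) AS obs_day,
           AVG(aqi) AS aqi_avg,
           MIN(aqi) AS aqi_min,
           MAX(aqi) AS aqi_max
    FROM monthly_aq
    GROUP BY date(obs_time)
""")
rows = conn.execute("SELECT * FROM daily_aq ORDER BY obs_day").fetchall()
print(rows)
```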

A combination of our regular data updating on the website and these Realtime and Monthly database tables is used to update the tables and graphs on the website. The Realtime and Monthly database tables are also used as the input to update the daily air quality database table.

Coming soon will be the addition of UV data. Air quality data for the this month, this year and all-time records is also planned. Additional data will need to be added to the daily database table in order to show that data along with the date and time of the occurrences.

In progress is the calculation of daily potential evapotranspiration, and possibly evaporation, now that we are measuring solar radiation. Due to limitations with the system I am using, it appears these will have to be calculated rather than supplied by the weather station directly. These are rather complex multi-step calculations, so consideration is being given to the practicalities and to what data from them will be shown on the website.
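As one possible approach only (not a confirmed choice for this site), the Priestley-Taylor method is among the simpler radiation-based estimates of potential evapotranspiration; the constants below are standard FAO-56 values.

```python
import math

# Illustrative sketch only: Priestley-Taylor potential evapotranspiration.
# This is one candidate method, not necessarily the one the site will use.

def priestley_taylor_pet(t_mean_c, net_radiation_mj, alpha=1.26,
                         gamma=0.0665, soil_heat_flux_mj=0.0):
    """Daily PET in mm, from mean temperature (deg C) and net radiation
    (MJ/m^2/day). alpha and gamma are standard FAO-56 constants."""
    # Slope of the saturation vapour pressure curve (kPa per deg C)
    es = 0.6108 * math.exp(17.27 * t_mean_c / (t_mean_c + 237.3))
    delta = 4098.0 * es / (t_mean_c + 237.3) ** 2
    # Latent heat of vaporisation, MJ per kg (~2.45 near 20 deg C)
    lam = 2.45
    energy = alpha * delta / (delta + gamma) * (net_radiation_mj - soil_heat_flux_mj)
    return energy / lam  # mm/day, since 1 mm of water = 1 kg/m^2

# e.g. a warm sunny day: ~25 deg C mean, 15 MJ/m^2 net radiation
print(round(priestley_taylor_pet(25.0, 15.0), 2))
```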

September 2020 weather review

The review of the weather at Ferny Grove during September 2020 is now available.

During September 2020 overnight temperatures were above average and were the highest in 10 years. As in August 2020, there were very few cool nights and many warm nights throughout the month. From the 18th to the 25th overnight temperatures peaked at 9 ºC above average. Daytime temperatures were close to average, with a mixture of above and below average days.

Rainfall in September was below average but close to the long-term median, with a total of 14.2 mm (42.7% of the long-term average). Long-term rainfall for September is generally quite low, hence the near-median result. While the rainfall total was quite low, continuing the trend of low monthly totals, the year-to-date rainfall remains above average, at 9.8% above the long-term average and the highest since 2015. Daily rainfall totals were low, with light showers on some days during the month. A thunderstorm passed to the south on the 25th, and another to the north brought a brief light shower.

Long-period rainfall totals remained well below average during most periods and deteriorated further. 6-monthly rainfall fell significantly to 161.7 mm below average, as more than 6 months have now passed since the significant rainfall at the start of the year. 9-monthly rainfall was 79.4 mm above average, while 12 to 48-month rainfall ranged from 72.4 mm below average (12 months) to 245.3 mm below average (24 months).

The summary containing the key information can be found here.

The full report of more detailed analysis is available here.

Website analytics

A new page, recently in development, is now available that is of a different nature to the existing content on the website. We use Matomo, a privacy-friendly open source website analytics platform that is hosted on-premises, so we fully own the data. This allows us to understand the usage of, and interest in, this service we provide.

This website analytics data has been used to determine whether there is a correlation between visits made to the website and the amount of rainfall at the time, and to test whether visitor activity is stronger during wetter weather. From this it has been discovered that, while a lot of visits are made when there is no rainfall, when expressed in percentage terms there is a general trend of more interest in the website during wetter weather, whilst visits across the range of rainfall totals are very consistent.

This information on visits by rainfall has been published on the new page as four charts, calculated from our databases. In addition, more standard summarised data is shown over time: the number of visits, the types of devices used and the visits made by returning and frequently returning visitors. The page includes explanations of the definitions relating to this data, along with key metrics for various periods to provide a quick summary.

Thank you to all those who have shown an interest in this website, and to the many who have come back many times. I have been quite surprised at the strong visitor activity, and there is clearly an interest in local real-time weather data. This website is in continued development, with completed new works announced on this blog in addition to the website updates mentioned on our Website Info page.

The below information is provided to share how this page is put together, for anyone who wants a deeper understanding. Both the analytics data and the weather data are stored in MySQL databases, which are used to create four database tables containing the data for the four visits-by-rainfall graphs. These calculations use table joins and appropriate aggregations to produce data that allows a fair comparison. Because these are quite complex calculations, the database queries are not executed when a graph is viewed, as retrieving the current data would add significant time to loading the graph.

So, given that this data doesn't change too quickly, a SQL procedure is executed once a month to update the data shown when someone visits the page. The SQL code used in that procedure is similar to the example below. The methodology is that a table is created to calculate the amount of rainfall in the previous 24 hours for each hour of the past year, as this data is not readily available in the database. This is derived by first calculating the rain for each hour, then the cumulative rainfall over the past year for each hour, and then the difference between each hour and the same time on the previous day.
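The three steps just described can be sketched in plain Python (the site uses SQL, but the logic is the same): hourly totals, a running cumulative total, then the difference against the value 24 hours earlier.

```python
from itertools import accumulate

# Illustrative sketch of the three-step rolling-rainfall calculation:
# step 1 (hourly totals) is assumed as the input list here.
def rain_last_24h(hourly_rain):
    cum = list(accumulate(hourly_rain))      # step 2: cumulative total
    out = []
    for i, total in enumerate(cum):          # step 3: 24-hour difference
        earlier = cum[i - 24] if i >= 24 else 0.0
        out.append(round(total - earlier, 1))
    return out

# 30 hours of data: 2 mm fell in hour 0, 5 mm in hour 26
hourly = [0.0] * 30
hourly[0], hourly[26] = 2.0, 5.0
rolling = rain_last_24h(hourly)
print(rolling[23], rolling[24], rolling[26])
```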

Once we have that data, table joins are used to merge the analytics data with this rainfall data for each of the four tables. Finally the rainfall data is deleted, as it is no longer needed.
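The join step might look something like this sketch (SQLite in Python, with hypothetical table and column names): hourly visit counts merged with the rolling 24-hour rainfall for the same hour.

```python
import sqlite3

# Illustrative only: join hourly visit counts to the rolling 24-hour
# rainfall for the same hour, the shape of data the four tables need.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits   (obs_hour TEXT PRIMARY KEY, n_visits INTEGER);
    CREATE TABLE rain_24h (obs_hour TEXT PRIMARY KEY, rain_mm REAL);
    INSERT INTO visits   VALUES ('2021-02-07 09:00', 12), ('2021-02-07 10:00', 30);
    INSERT INTO rain_24h VALUES ('2021-02-07 09:00', 0.0), ('2021-02-07 10:00', 18.4);
""")

rows = conn.execute("""
    SELECT v.obs_hour, v.n_visits, r.rain_mm
    FROM visits v
    JOIN rain_24h r ON r.obs_hour = v.obs_hour
    ORDER BY v.obs_hour
""").fetchall()
print(rows)
```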

The other graphs, which don't compare visits by rainfall, use PHP scripts to run a query that retrieves the necessary information from the database and inserts it into arrays in a format the Highcharts graphs can use. This is quite similar to the other graphs on the website. The JavaScript code for the Highcharts graphs is heavily based on the graphs elsewhere on this website, and is accessible by viewing the page source code. From the JavaScript code, the PHP scripts used as the data input can be accessed by appending ?view=sce to the URL.
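The array-building step is language-agnostic; as an illustration (in Python rather than the site's PHP, with made-up example rows), query results become an array of [x, y] pairs of the kind Highcharts accepts as series data.

```python
import json

# Illustrative only: convert query rows into a JSON array of [x, y]
# pairs suitable for a Highcharts series. The rows are hypothetical.
rows = [("2021-02-01", 412), ("2021-02-02", 389), ("2021-02-03", 455)]
series = [[day, count] for day, count in rows]
print(json.dumps(series))
```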

The summary statistics at the top of the page use data from a PHP script containing a series of variables whose values are assigned from the results of SQL queries. That PHP script is used as a PHP include in the web page, like this: include './utils/SQL-queries/analyticsData.php'; and the variables in that script are used within the table, such as: <?php echo $visits7day ?>

Similarly, a PHP script is used to return a variable populating the update time for the visits-by-rainfall data. This is quite simple: it returns the distinct updated_time field of one of the database tables containing this data, and the result is formatted in PHP to show the date in the required format.

These are the SQL statements to calculate the summary statistics in the table: