Recent developments in the openair suite of data analysis packages

Measuring Air Quality 2023


David Carslaw

Department of Chemistry, University of York

Ricardo Energy & Environment

28th March 2023

Overview



  1. Accessing UK Air Quality Data

  2. Examples of openair Functions

  3. ‘De-weathering’ Air Quality Data

openair


A package of tools written in R, an open-source programming language, and designed specifically for air quality data analysis

  • Used extensively worldwide by academia, the public and private sectors

  • Many functions covering a wide range of issues

    • Data import from extensive UK networks
    • Directional analysis
    • Source attribution
    • Back trajectories
    • Model evaluation
    • Utility functions
  • Over 400,000 downloads

  • Sister packages: openairmaps, worldmet and deweather


Accessing UK Air Quality Data

Air quality sites available in the UK



openair gives access to a large number of air quality sites across many networks

  • AURN, national and local networks

  • ‘Local’ networks include the London Air Quality Network, and networks from Sussex, Essex, Kent, North Lincolnshire, …

  • For example, in 2023 there are:

    • 373 Urban Traffic sites measuring NO2

    • 129 operational ozone sites


Lots of pre-calculated statistics available



Annual, monthly, daily, daily maximum 8-hour, DAQI, … with data capture rates


library(openair)
library(dplyr)  # for glimpse()

annual_aurn <- importAURN(year = 2022, data_type = "annual")
glimpse(annual_aurn)
Rows: 170
Columns: 30
$ uka_code           <fct> UKA00513, UKA00615, UKA00933, UKA00451, UKA00137, U…
$ code               <chr> "ABD7", "ABD8", "ABD9", "ACTH", "AH", "ARM6", "BAAR…
$ site               <chr> "Aberdeen Union Street Roadside", "Aberdeen Welling…
$ date               <dttm> 2022-01-01, 2022-01-01, 2022-01-01, 2022-01-01, 20…
$ o3                 <dbl> NA, NA, 53.98682, 61.57285, 65.75499, NA, NA, NA, 5…
$ o3_capture         <dbl> NA, NA, 0.952511, 0.939155, 0.990753, NA, NA, NA, 0…
$ o3.summer_capture  <dbl> NA, NA, 0.9975, 0.9419, 0.9898, NA, NA, NA, 0.9957,…
$ o3.daily.max.8hour <dbl> NA, NA, 67.24956, 71.01198, 74.95352, NA, NA, NA, 6…
$ o3.aot40v          <int> NA, NA, 924, NA, 3839, NA, NA, NA, 3459, NA, NA, NA…
$ o3.aot40f          <int> NA, NA, 2529, 4369, 10067, NA, NA, NA, 7543, NA, NA…
$ somo35             <dbl> NA, NA, 1487.9531, 2248.1085, 3226.5075, NA, NA, NA…
$ somo35_capture     <dbl> NA, NA, 0.9507, 0.9260, 0.9863, NA, NA, NA, 0.9918,…
$ no                 <dbl> 19.504377, 19.595931, 6.346165, NA, 0.285401, 20.66…
$ no_capture         <dbl> 0.993721, 0.997489, 0.995548, NA, 0.970662, 0.98710…
$ no2                <dbl> 27.019278, 24.222175, 16.259698, NA, 3.197492, 22.1…
$ no2_capture        <dbl> 0.993721, 0.997489, 0.995548, NA, 0.970662, 0.98710…
$ nox                <dbl> 56.925572, 54.268851, 25.990348, NA, 3.635102, 53.8…
$ nox_capture        <dbl> 0.993721, 0.997489, 0.995548, NA, 0.970662, 0.98710…
$ so2                <dbl> NA, NA, NA, NA, NA, NA, NA, 0.814916, 1.010338, NA,…
$ so2_capture        <dbl> NA, NA, NA, NA, NA, NA, NA, 0.903767, 0.990297, NA,…
$ co                 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ co_capture         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ pm10               <dbl> NA, NA, 11.25077, 6.22880, NA, 16.43729, NA, NA, NA…
$ pm10_capture       <dbl> NA, NA, 0.999543, 0.997489, NA, 0.964384, NA, NA, N…
$ pm2.5              <dbl> NA, NA, 5.952630, 3.847239, NA, NA, NA, NA, NA, NA,…
$ pm2.5_capture      <dbl> NA, NA, 0.999543, 0.997489, NA, NA, NA, NA, NA, NA,…
$ gr10               <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ gr10_capture       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ gr2.5              <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ gr2.5_capture      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
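
The capture columns make it straightforward to screen the statistics before using them. The following is a minimal sketch that rebuilds the first five NO2 values from the output above as a small table, so it runs without downloading anything. The 75% capture threshold and the names `annual_no2`, `capture_threshold` and `valid_no2` are assumptions for illustration, not part of openair.

```r
library(dplyr)

# First five rows of the 2022 annual statistics shown above (NO2 only)
annual_no2 <- tibble(
  code        = c("ABD7", "ABD8", "ABD9", "ACTH", "AH"),
  no2         = c(27.019278, 24.222175, 16.259698, NA, 3.197492),
  no2_capture = c(0.993721, 0.997489, 0.995548, NA, 0.970662)
)

# Assumed screening threshold: keep annual means with >= 75% data capture
capture_threshold <- 0.75

valid_no2 <- annual_no2 %>%
  filter(!is.na(no2), no2_capture >= capture_threshold) %>%
  arrange(desc(no2))

valid_no2
```

The same `filter()` call should work unchanged on the full `annual_aurn` table, since it uses the column names shown in the `glimpse()` output.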

openairmaps for air quality network information



Look at sites in the AURN, Air Quality England and local networks…

library(openairmaps)

networkMap(source = c("aurn", "aqe", "local"), 
           cluster = FALSE)

openairmaps for air quality network information



Look at sites in the AURN, Air Quality England and local networks… that were open between 2015 and 2023 (useful for trend analysis)

networkMap(source = c("aurn", "aqe", "local"), cluster = FALSE,
           year = 2015:2023)

Polar plots


  • Useful for local source attribution

  • Considering concentration by wind direction alone tells us only the direction of major sources

  • Considering the joint wind speed-direction variation says much more about the nature of the emission sources
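
As a minimal sketch of the joint wind speed-direction view, the example below draws a bivariate polar plot from `mydata`, the example dataset bundled with openair. The dataset and the choice of SO2 are assumptions for illustration and are not the data behind the figures in this talk.

```r
library(openair)

# `mydata` ships with openair and contains wind speed (ws),
# wind direction (wd) and several pollutant columns.
# polarPlot() shows concentration as a joint function of ws and wd.
so2_polar <- polarPlot(mydata, pollutant = "so2")

# The returned object also carries the data behind the plotted surface
head(so2_polar$data)
```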

Source apportionment with polar plots


If we have a whole network of sites, we can triangulate potential sources. For example, where do the highest 10% of concentrations of PM10 come from?

  • Import data and manipulate to access meteorological data
  • Plot the probability of exceeding the 90th percentile concentration with satellite base map

library(openair)
library(openairmaps)
library(dplyr)

# get data
nlincs_local <- importLocal(
  c("SCN6", "SC12", "SC10", "AMVL"), 
  year = 2021, meta = TRUE
)
nlincs_aurn <- importAURN(
  c("SCN2"), 
  year = 2021, meta = TRUE
)
# combine
nlincs_all <- bind_rows(
  nlincs_local, nlincs_aurn
)
# replace each site's met columns with the modelled met from the AURN site
nlincs_all <-
  nlincs_all %>%
  select(-ws, -wd, -air_temp) %>%
  left_join(
    select(nlincs_aurn, date, ws:air_temp), 
    by = join_by(date)
  )
# polar plot map!
polarMap(
  nlincs_all, "pm10", alpha = 3 / 4,
  statistic = "cpf", percentile = 90,
  cols = "YlGnBu",
  provider = "Esri.WorldImagery"
)

  • Hourly variation in PM10 by wind direction
  • Different base maps

annulusMap(
    nlincs_all, "pm10", 
    alpha = 3 / 4,
    # use multiple providers
    provider = c("Stamen.Toner", "Esri.WorldImagery")
)
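
The `statistic = "cpf"` option used with `polarMap()` above refers to the conditional probability function: within each wind bin, the fraction of measurements at or above a chosen percentile of the whole series. The following is a simplified sketch of that idea using wind direction sectors only and synthetic data. The data, the 30 degree sector width and all variable names are assumptions; openair itself also bins by wind speed and smooths the surface.

```r
set.seed(42)

# Synthetic hourly data: PM10 is elevated for winds between 60 and 120 degrees
n    <- 5000
wd   <- runif(n, min = 0, max = 360)
pm10 <- rgamma(n, shape = 2, scale = 8) + ifelse(wd >= 60 & wd < 120, 25, 0)

# Threshold: 90th percentile of the whole series
threshold <- quantile(pm10, probs = 0.90)

# Assign each hour to a 30 degree wind sector
sector <- cut(wd, breaks = seq(0, 360, by = 30),
              include.lowest = TRUE, right = FALSE)

# CPF per sector = (hours at or above threshold) / (all hours in sector)
cpf <- tapply(pm10 >= threshold, sector, mean)

round(cpf, 2)
```

Sectors with a CPF well above 0.1 are those that contribute disproportionately to the highest 10% of concentrations, which is what the map highlights.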

Tracking changes in sources


  • Polar plots can be very useful for tracking changes in different source contributions

  • In this case the differences in concentrations of SO2 and PM10 between the periods 2010/2011 and 2020/2021 are considered

SO2

PM10
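
One way to produce this kind of before/after comparison is the `polarDiff()` function available in recent versions of openair. The sketch below is an assumption about how such a plot could be made, not a record of how the figures in this talk were produced; it uses the bundled `mydata` dataset and two arbitrary years.

```r
library(openair)

# Split the bundled example data into two periods (years chosen arbitrarily)
before_data <- selectByDate(mydata, year = 2002)
after_data  <- selectByDate(mydata, year = 2003)

# Polar plot of the change in concentration (after minus before)
# as a joint function of wind speed and direction
polarDiff(before_data, after_data, pollutant = "no2")
```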

Source attribution with back trajectories


  • Interest is in knowing source origins and contributions
  • Most relevant for regional pollutants such as PM2.5 and PM10
  • openair has a range of back trajectory modelling techniques available
  • Use of the NOAA HYSPLIT model to pre-calculate back trajectories

Example for PM2.5 at North Kensington in London

Deweathering Air Quality Data

Dominant meteorology



Things would be much easier if the weather was the same every day!


  • Meteorology can mask or emphasise trends
    • Trends often affected more by the weather than by changes in emissions
  • Detecting change: is it due to an intervention that affects emissions or to the weather?
  • Many sophisticated change-point methods exist, but the elephant in the room is often meteorology
  • Extensions to openair and the rmweather package can ‘deweather’ time series of air quality data; these methods are now widely used globally, spurred by interest in the effects of Covid-19

A good vintage!


  • We tend to describe air quality in terms of “good” or “bad” years, like wine!

  • Important to know whether things have improved (or got worse) because of the weather or emissions

  • A more quantitative approach is required …

Example at Camden roadside site



Question: What would the 2019 NO2 concentration at the Camden Roadside site have been under the weather conditions of other years?

  • Build model for 2019 to explain NO2 concentrations using measured meteorological data
  • Predict NO2 concentrations based on other meteorological years
  • Rank the predictions
  • NO2 varies from 39 to 45 μg m⁻³ (2019 was 43 μg m⁻³)
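
The steps above can be sketched with synthetic data. This is a simplified illustration only: it swaps in an ordinary linear model (`lm()`) for the more flexible models used in deweathering practice, and the meteorological values, the NO2 relationship and all names (`make_met`, `met`, `obs_2019`, `ranked`) are made up for the example.

```r
set.seed(1)

# Synthetic daily meteorology for one year (hypothetical values)
make_met <- function(year, ws_mean) {
  n <- 365
  data.frame(
    year     = year,
    ws       = pmax(rnorm(n, mean = ws_mean, sd = 1.5), 0.2),
    air_temp = rnorm(n, mean = 11, sd = 6)
  )
}

met <- rbind(
  make_met(2017, 4.5), make_met(2018, 3.8),
  make_met(2019, 4.2), make_met(2020, 5.0)
)

# Synthetic NO2 "measurements" for 2019 only
obs_2019 <- subset(met, year == 2019)
obs_2019$no2 <- 60 - 4 * obs_2019$ws - 0.3 * obs_2019$air_temp +
  rnorm(nrow(obs_2019), sd = 5)

# 1. Build a model for 2019 explaining NO2 from meteorology
fit <- lm(no2 ~ ws + air_temp, data = obs_2019)

# 2. Predict NO2 using the meteorology of every year
met$no2_pred <- predict(fit, newdata = met)

# 3. Rank the annual mean predictions, highest first
ranked <- sort(tapply(met$no2_pred, met$year, mean), decreasing = TRUE)
round(ranked, 1)
```

Because emissions are held at their 2019 state by the fitted model, the spread of the ranked means reflects the effect of year-to-year weather alone.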

Example at the Port of Dover



Fuel quality changes detected in SO2 concentrations


  • Difficult to see changes in SO2 in raw data

  • Deweathered data much clearer and can be linked to known changes in fuel sulphur content

Thank you for your attention


David Carslaw (david.carslaw@york.ac.uk, david.carslaw@ricardo.com)

Further information