Kernel density plot for daily mean exceedance statistics
This function is used to explore the conditions leading to exceedances of air quality limits. Currently the focus is on understanding the conditions under which daily mean PM10 concentrations exceed a specified threshold. Kernel density estimates are calculated and plotted to highlight those conditions.
kernelExceed(
  polar,
  x = "wd",
  y = "ws",
  pollutant = "pm10",
  type = "default",
  by = c("day", "dayhour", "all"),
  limit = 50,
  data.thresh = 0,
  more.than = TRUE,
  cols = "default",
  nbin = 256,
  auto.text = TRUE,
  ...
)
polar
A data frame minimally containing date and at least three other numeric variables, typically wind direction (wd), wind speed (ws) and a pollutant.
x
x-axis variable. Mandatory.
y
y-axis variable. Mandatory.
pollutant
Mandatory. A pollutant name corresponding to a variable in the data frame should be supplied, e.g. pollutant = "nox".
type
The type of analysis to be done. The default will produce a single plot using the entire data. Other types include "hour" (for hour of the day), "weekday" (for day of the week), "month" (for month of the year) and "year" (for a plot for each year). It is also possible to choose type as another variable in the data frame. For example, type = "o3" will plot four kernel exceedance plots for different levels of ozone, split into four quantiles (approximately equal numbers of counts in each of the four splits). This offers great flexibility for understanding how one variable varies with another. See function cutData for further details.
by
by determines how data above the limit are selected. by = "day" will select all data (typically hours) on days where the daily mean value is above limit. by = "dayhour" will select only those data above limit on days where the daily mean value is above limit. by = "all" will select all data above limit.
limit
The threshold above which the pollutant concentration will be considered.
data.thresh
The data capture threshold to use (%) when aggregating the data using timeAverage to daily means. A value of zero means that all available data will be used in a particular period regardless of the number of values available. Conversely, a value of 100 means that all data must be present for the average to be calculated, else it is recorded as NA.
more.than
If TRUE, data will be selected that are greater than limit. If FALSE, data will be selected that are less than limit.
cols
Colours to be used for plotting. Options include "default", "increment", "heat", "spectral", "hue", "brewer1" and user defined (see manual for more details). The same line colour can be set for all pollutants, e.g. cols = "black".
nbin
Number of bins to be used for the kernel density estimate.
auto.text
If TRUE (the default), titles and axis labels will automatically try to format pollutant names and units properly, e.g. by subscripting the `2' in NO2.
...
Other graphical parameters, some of which are passed on to cutData. For example, kernelExceed passes the option hemisphere = "southern" on to cutData to provide southern (rather than default northern) hemisphere handling of type = "season". Similarly, common axis and title labelling options (such as main) are passed to quickText to handle routine formatting.
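The effect of data.thresh on daily means can be illustrated with a small base R sketch. This is not the timeAverage implementation: the helper daily_mean_thresh and the sample values are hypothetical, and the sketch assumes every day already has one row per hour, with missing hours recorded as NA.

```r
# Hypothetical helper: daily mean that is set to NA when data capture
# (percentage of non-missing values in the day) falls below data.thresh.
daily_mean_thresh <- function(values, day, data.thresh = 0) {
  tapply(values, day, function(v) {
    capture <- 100 * mean(!is.na(v))   # data capture in percent
    if (capture == 0 || capture < data.thresh) NA_real_ else mean(v, na.rm = TRUE)
  })
}

# Made-up data: day 1 is complete, day 2 has 6 of 24 hours missing (75% capture)
day  <- rep(c("2024-01-01", "2024-01-02"), each = 24)
pm10 <- c(rep(60, 24), rep(40, 18), rep(NA, 6))

dm_all    <- daily_mean_thresh(pm10, day, data.thresh = 0)   # both days averaged
dm_strict <- daily_mean_thresh(pm10, day, data.thresh = 90)  # day 2 becomes NA
```

With data.thresh = 0 the incomplete day still yields a mean from the 18 available hours, whereas a threshold of 90 discards it.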
The kernelExceed function is for exploring the conditions under
which exceedances of air pollution limits occur. Currently it is focused on
the daily mean (European) Limit Value for PM10 of 50 μg/m3 not to be
exceeded on more than 35 days. However, the function is sufficiently
flexible to consider other limits, e.g. it could be used to explore days where
the daily mean is greater than 100 μg/m3.
By default the function will plot the kernel density estimate of wind speed
and wind direction for all days where the concentration of pollutant is
greater than limit. Understanding the conditions where exceedances occur
can help with source identification.
The function offers different ways of selecting the data on days where the
pollutant is greater than limit through setting by. With the default
setting, by = "day", it will select all data on those days where the daily
mean of pollutant is greater than limit, even if individual data (e.g.
hours) are less than limit. Setting by = "dayhour" will additionally ensure
that the selected data on those days are themselves greater than limit.
Finally, by = "all" will select all values of pollutant above limit,
regardless of when they occur.
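The three selection modes can be mimicked in base R to see how they differ. The sketch below is an illustration only, not the code used inside kernelExceed; the data frame dat and its values are made up.

```r
# Made-up hourly PM10 for three days:
#   day 1: every hour at 60           (daily mean 60, above the limit)
#   day 2: hours alternate 80 and 30  (daily mean 55, above the limit)
#   day 3: 23 hours at 20, one at 90  (daily mean below the limit)
dat <- data.frame(
  date = seq(as.POSIXct("2024-01-01 00:00", tz = "UTC"),
             by = "hour", length.out = 72),
  pm10 = c(rep(60, 24), rep(c(80, 30), 12), rep(20, 23), 90)
)
limit <- 50

# Daily mean, repeated for every hour of the corresponding day
day_id     <- format(dat$date, "%Y-%m-%d", tz = "UTC")
daily_mean <- ave(dat$pm10, day_id, FUN = mean)

sel_day     <- dat[daily_mean > limit, ]                     # like by = "day"
sel_dayhour <- dat[daily_mean > limit & dat$pm10 > limit, ]  # like by = "dayhour"
sel_all     <- dat[dat$pm10 > limit, ]                       # like by = "all"

c(day = nrow(sel_day), dayhour = nrow(sel_dayhour), all = nrow(sel_all))
```

Here the "day" rule keeps all 48 hours of the two exceedance days, "dayhour" keeps only the 36 hours on those days that are themselves above the limit, and "all" additionally picks up the single high hour on the third day, giving 37.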
The usefulness of the function is greatly enhanced through using
type, which conditions the data according to the level of another
variable. For example, type = "season" will show the kernel density
estimate by spring, summer, autumn and winter, and type = "so2" will
attempt to show the kernel density estimates by quantiles of SO2
concentration. By considering different values of type it is
possible to develop a good understanding of the conditions under which
exceedances occur.
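Conditioning on a numeric variable amounts to cutting it into four quantile bands. The sketch below shows the idea in base R using quantile and cut; it is an assumption about the general approach rather than a copy of what cutData does internally, and the so2 values are simulated.

```r
set.seed(1)
so2 <- rexp(200, rate = 0.2)   # simulated SO2 concentrations (illustrative only)

# Quartile break points, then assign each value to one of four bands
breaks   <- quantile(so2, probs = seq(0, 1, by = 0.25))
so2_band <- cut(so2, breaks = breaks, include.lowest = TRUE)

table(so2_band)   # roughly equal counts in each band
```

Each band would then get its own kernel density panel, so differences between panels reflect how the exceedance conditions change with the level of the conditioning variable.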
To aid interpretation the plot will also show the estimated number
of days or hours where exceedances occur. For type = "default" the
number of days should exactly correspond to the actual number of exceedance
days. However, with different values of type the number of days is
an estimate. It is an estimate because conditioning breaks up individual
days and the estimate is based on the proportion of data calculated for
each level of type.
This function automatically chooses the bandwidth for the kernel density estimate. We generally find that most data sets are not overly sensitive to the choice of bandwidth. One important reason for this insensitivity is likely to be the characteristics of air pollution itself. Due to atmospheric dispersion processes, pollutant plumes generally mix rapidly in the atmosphere. This means that pollutant concentrations are ‘smeared-out’ and the extra fidelity brought about by narrower bandwidths does not recover any more detail.
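To get a feel for bandwidth sensitivity, a two-dimensional kernel density estimate can be computed by hand and compared at two bandwidths. The sketch below does not reproduce how kernelExceed chooses its bandwidth: it swaps in bw.nrd0, the rule-of-thumb selector from base R's stats package, uses a hypothetical helper kde2d_sketch, runs on simulated data, and treats wind direction as an ordinary linear variable for simplicity.

```r
# Hypothetical helper: product-Gaussian 2D kernel density on a grid (base R only)
kde2d_sketch <- function(x, y, hx, hy, gx, gy) {
  # Rows index grid points, columns index observations
  kx <- outer(gx, x, function(g, xi) dnorm((g - xi) / hx)) / hx
  ky <- outer(gy, y, function(g, yi) dnorm((g - yi) / hy)) / hy
  # Sum the kernel products over observations and normalise by sample size
  (kx %*% t(ky)) / length(x)
}

set.seed(42)
wd <- rnorm(200, mean = 180, sd = 30)    # simulated wind directions (degrees)
ws <- rgamma(200, shape = 4, rate = 1)   # simulated wind speeds (m/s)

gx <- seq(60, 300, length.out = 40)      # evaluation grid for wind direction
gy <- seq(0, 15, length.out = 40)        # evaluation grid for wind speed

hx <- bw.nrd0(wd)                        # rule-of-thumb bandwidths (an assumption)
hy <- bw.nrd0(ws)

dens_auto <- kde2d_sketch(wd, ws, hx, hy, gx, gy)
dens_wide <- kde2d_sketch(wd, ws, 2 * hx, 2 * hy, gx, gy)

# A wider bandwidth flattens the surface, so the peak density drops
c(auto = max(dens_auto), wide = max(dens_wide))
```

Plotting both surfaces, for example with image(gx, gy, dens_auto), gives a quick visual check of whether the broad pattern changes with bandwidth for a given data set.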