Function for importing common 1 hour average (hourly) UK Automatic Urban and Rural Network (AURN) Air Quality Archive data files previously downloaded in ".csv" format for use with the openair package. The function uses read.table (in utils) and rbind (in reshape).

importAURNCsv(file = file.choose(), header.at = 5, data.at = 7,
  na.strings = c("No data", "", "NA"), date.name = "Date",
  date.break = "-", time.name = "time", misc.info = c(1, 2, 3, 4),
  is.site = 4, bad.24 = TRUE, correct.time = -3600,
  output = "final", data.order = c("value", "status", "unit"),
  simplify.names = TRUE, ...)

Arguments

file

The name of the AURN file to be imported. By default, file.choose() opens a file browser. Because the function uses read.table (in utils), this can also be a readable text-mode connection or URL (although these options are currently not fully tested).

header.at

The file row holding header information. This is used to set names for the resulting imported data frame, but may be subject to further modifications depending on following argument settings.

data.at

The first file row holding actual data. When generating the data frame, the function will ignore all information before this row, and attempt to include all data from this row onwards.
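The roles of header.at and data.at can be illustrated with a minimal base R sketch. This is not the function's actual implementation; the file contents below are invented for illustration, and the two-step read.table approach is an assumption about how such a layout could be read.

```r
# Hypothetical file contents mimicking an AURN-style layout (not real data).
lines <- c(
  "Hourly measurement data",                 # row 1: miscellaneous information
  "Supplied by an example archive",          # row 2
  "",                                        # row 3
  "Example Site",                            # row 4
  "Date,time,Carbon monoxide,status,unit",   # row 5: header row
  "",                                        # row 6
  "01-01-2009,01:00,0.2,V,mgm-3",            # row 7: first data row
  "01-01-2009,02:00,No data,,"               # row 8
)
tmp <- tempfile(fileext = ".csv")
writeLines(lines, tmp)

header.at <- 5
data.at <- 7

# Read only the header row to obtain the column names.
hdr <- read.table(tmp, sep = ",", skip = header.at - 1, nrows = 1,
                  colClasses = "character", stringsAsFactors = FALSE)

# Read everything from the first data row onwards, ignoring earlier rows.
dat <- read.table(tmp, sep = ",", skip = data.at - 1, header = FALSE,
                  na.strings = c("No data", "", "NA"),
                  col.names = make.names(unlist(hdr), unique = TRUE),
                  stringsAsFactors = FALSE)

str(dat)
```

Note that make.names() replaces characters that are not allowed in data frame names, so "Carbon monoxide" becomes "Carbon.monoxide" in this sketch.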

na.strings

Strings of any terms that are to be interpreted as NA values within the file.

date.name

Header name of the column holding date information. This is combined with the time information into a single date column in the generated data frame.

date.break

The break character separating days, months and years in date information. For example, "-" in "01-01-2009".

time.name

Header name of the column holding time information. This is combined with the date information into a single date column in the generated data frame.

misc.info

Row numbers of any additional information that may be required from the original file. Each line is retained as an element of a character vector in the comment of the generated data frame.
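The data frame comment mentioned here is the standard base R comment attribute. A minimal sketch of storing and retrieving such information follows; the data frame and the text lines are hypothetical.

```r
# Hypothetical data frame and retained file lines (illustration only).
mydata <- data.frame(co = c(0.2, 0.3, 0.1))
misc.lines <- c("Hourly measurement data", "Example Site")

# Attach the lines as the data frame comment, then read them back.
comment(mydata) <- misc.lines
comment(mydata)
```

The comment is not printed with the data frame itself, so it has to be requested explicitly with comment().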

is.site

Header name of column holding site information. Setting to NULL turns this option off.

bad.24

Reset the AURN 24:00:00 time stamp. AURN time series are logged as 00:00:01 to 24:00:00 as opposed to the more conventional 00:00:00 to 23:59:59. bad.24 = TRUE resets the 24:00:00 time stamp, which is not allowed in some time series classes or functions.
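One way a 24:00:00 reset could be carried out in base R is sketched below. This is an assumption for illustration, not necessarily the approach the function takes; the date and time strings are invented and the GMT time zone is assumed.

```r
# Hypothetical date and time strings in an AURN-like layout.
date.txt <- c("01-01-2009", "01-01-2009", "01-01-2009")
time.txt <- c("22:00:00", "23:00:00", "24:00:00")

# Flag the 24:00:00 stamps and temporarily rewrite them as 00:00:00.
is.24 <- time.txt == "24:00:00"
time.fixed <- ifelse(is.24, "00:00:00", time.txt)

# Combine date and time into a single POSIXct vector (GMT assumed here).
stamp <- as.POSIXct(paste(date.txt, time.fixed),
                    format = "%d-%m-%Y %H:%M:%S", tz = "GMT")

# Move the rewritten stamps forward one day, so that 24:00:00 on
# 1 January becomes 00:00:00 on 2 January.
stamp[is.24] <- stamp[is.24] + 24 * 3600

format(stamp, "%Y-%m-%d %H:%M:%S", tz = "GMT")
```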

correct.time

Numerical correction (in seconds) for imported date. AURN data is logged retrospectively. For 1 hour average data, correct.time = -3600 resets this to the start of the sampling period.
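Because POSIXct values are stored in seconds, a correction of this kind amounts to simple addition. A minimal sketch, using an invented time stamp and assuming GMT:

```r
# Hypothetical end-of-period time stamp for a 1 hour average (GMT assumed).
stamp <- as.POSIXct("2009-01-01 01:00:00", tz = "GMT")

# Shift to the start of the sampling period.
hourly.start <- stamp + (-3600)   # 1 hour averages
quarter.start <- stamp + (-900)   # 15 minute averages

format(hourly.start, "%Y-%m-%d %H:%M:%S", tz = "GMT")
format(quarter.start, "%Y-%m-%d %H:%M:%S", tz = "GMT")
```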

output

Output style. Default "final" using import().

data.order

A vector of names defining the order of data types. AURN files typically include three data types, actual data and associated data quality and measurement unit reports. Here, these are defined as "value", "status" and "unit", respectively.

simplify.names

A logical (default TRUE) prompting the function to try to simplify data frame names using common chemical shorthand. FALSE retains names from the original file, although these may be modified if they contain unallowed characters or non-unique names.

...

Other parameters. Passed onto and handled by import().

Value

The function returns a data frame for use in openair. By comparison to the original file, the resulting data frame is modified as follows:

Time and date information will be combined in a single column "date", formatted as a conventional timeseries (as.POSIXct). Time adjustments may also be made, subject to "bad.24" and "correct.time" argument settings. Using default settings, the argument correct.time = -3600 resets the time stamp to the start of the measurement period.

If name simplification was requested (simplify.names = TRUE), common chemical names will be simplified. For example, "carbon monoxide" will be reset to "co". Currently, this option only applies to inorganics and particulates, not organics.

Non-value information will be rationalised according to data.order. For example, for the default, data.order = c("value", "status", "unit"), the status and unit columns following the "co" column will be automatically renamed "status.co" and "unit.co", respectively.

An additional "site" column will be generated. Multiple "site" files are allowed.

Additional information (as defined in "misc.info") and data adjustments (as defined by "bad.24" and "correct.time") are retained in the data frame comment.
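The renaming of status and unit columns according to data.order can be sketched as follows. This is an illustrative assumption about the logic, not the function's own code; the column names are hypothetical and are assumed to occur in complete, consecutive groups in the order given by data.order.

```r
# Hypothetical column names (after the date column), in file order.
raw.names <- c("co", "status", "unit", "no2", "status", "unit")
data.order <- c("value", "status", "unit")

# Assign each column to a consecutive group of length(data.order).
n <- length(data.order)
n.groups <- length(raw.names) / n
group <- rep(seq_len(n.groups), each = n)

# Find the name of the "value" column for the group each column belongs to.
value.pos <- which(data.order == "value")
value.names <- raw.names[(group - 1) * n + value.pos]

# Role of each column within its group: value, status or unit.
role <- rep(data.order, times = n.groups)

# Value columns keep their name; others are prefixed with their role.
new.names <- ifelse(role == "value", value.names,
                    paste(role, value.names, sep = "."))
new.names
```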

Details

The importAURNCsv() function was developed for use with air quality monitoring site data files downloaded in standard hourly (or 1 hour average) format using the Air Quality Archive email service. Argument defaults are set to common values to simplify both the import operation and use with openair.

Similar file structures can also be imported using this function with argument modification.

See also

Generic import function import, or direct (on-line) data import function importAURN. Other dedicated import functions are available for other file types, e.g. importKCL, importADMS.

Examples

##########
#example 1
##########
#data obtained from email service:
#http://www.airquality.co.uk/archive/data_selector.php
#or
#http://www.airquality.co.uk/archive/data_and_statistics.php?action=step_pre_1
#example file "AirQualityDataHourly.csv" Brighton Roadside and Brighton Preston Park 2008.

#import data as mydata
## mydata <- importAURNCsv("AirQualityDataHourly.csv")

#read additional information retained by importAURNCsv
## comment(mydata)

#analyse data by site
## boxplot(nox ~ site, data = mydata)

##########
#example 2
##########
#example using data from url

#import data as otherdata
## otherdata <- importAURNCsv(
##   "http://www.airquality.co.uk/archive/data_files/site_data/HG1_2007.csv")

#use openair function
## summarise(otherdata)

##########
#example 3
##########
#example of importing other similar data formats

#import 15 min average so2 data from Bexley using url
## so2.15min.data <- importAURNCsv(
##   "http://www.airquality.co.uk/archive/data_files/15min_site_data/BEX_2008.csv",
##   correct.time = -900)
#note: correct.time amended for 15 min offset/correction.

#additional comments
## comment(so2.15min.data)

#analysis
## diurnal.error(so2.15min.data, pollutant="so2")

#wrapper for above operation
#(e.g. if you have to do this, or similar, a lot of the time)
## my.import.wrapper <- function(file, correct.time = -900, ...)
## { importAURNCsv(file = file, correct.time = correct.time, ...) }

#same as above
## so2.15min.data.again <- my.import.wrapper(
##   "http://www.airquality.co.uk/archive/data_files/15min_site_data/BEX_2008.csv")

#analysis
## timeVariation(so2.15min.data.again, pollutant="so2")