Renaming the columns of data frames which are stored in lists of lists
OK, so the scenario is as follows:
- we have a list of 2 elements, each of which is again a list with 2 elements (each of which is a data frame).
- none of the list entries carry names, and the data frames only have uninformative default column names.
- we only want to set the column names of the data frames that are buried 2 levels down the main list.
First we create some mock data that resembles the scenario (mimicking temperature and relative humidity observations during January and February 2010):
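## create 2 mock months
## note: the numeric offsets are counted in days from `origin`,
## so each series starts on the 2nd of the month (see str() output below)
date_jan <- as.Date(seq(1, 31, 1), origin = "2010-01-01")
date_feb <- as.Date(seq(1, 28, 1), origin = "2010-02-01")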
## create mock observations for the months
Ta_200_jan <- rnorm(31, 10, 3)
Ta_200_feb <- rnorm(28, 11, 3)
rH_200_jan <- rnorm(31, 75, 10)
rH_200_feb <- rnorm(28, 70, 10)
df1 <- data.frame(V1 = date_jan, V2 = Ta_200_jan)
df2 <- data.frame(V1 = date_jan, V2 = rH_200_jan)
df3 <- data.frame(V1 = date_feb, V2 = Ta_200_feb)
df4 <- data.frame(V1 = date_feb, V2 = rH_200_feb)
lst <- list(list(df1, df2), list(df3, df4))
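A small aside on the mock dates: when as.Date() is given numbers, it counts days from origin, so an offset of 1 lands on the day after the origin. If the series should instead start on the first day of each month, one possible sketch is to let seq() step through the days directly. The names date_jan_alt and date_feb_alt are made up for this illustration and are not used in the rest of the post.

```r
## hypothetical alternative: build each month starting from its first day
date_jan_alt <- seq(as.Date("2010-01-01"), by = "day", length.out = 31)
date_feb_alt <- seq(as.Date("2010-02-01"), by = "day", length.out = 28)

## first and last day of each series
range(date_jan_alt)
range(date_feb_alt)
```

For the renaming exercise itself the exact dates do not matter, so the original vectors are kept in what follows.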
So now we have a list of two elements, each of which is itself a list of 2 data frames.
None of the list entries are named, and the columns of the data frames are only called V1 and V2, which is not very informative.
This is what the list structure looks like:
str(lst)
## List of 2
## $ :List of 2
## ..$ :'data.frame': 31 obs. of 2 variables:
## .. ..$ V1: Date[1:31], format: "2010-01-02" ...
## .. ..$ V2: num [1:31] 8.79 4.55 8.11 16.58 9.71 ...
## ..$ :'data.frame': 31 obs. of 2 variables:
## .. ..$ V1: Date[1:31], format: "2010-01-02" ...
## .. ..$ V2: num [1:31] 81.1 67.4 75.8 64.7 73.9 ...
## $ :List of 2
## ..$ :'data.frame': 28 obs. of 2 variables:
## .. ..$ V1: Date[1:28], format: "2010-02-02" ...
## .. ..$ V2: num [1:28] 16.5 15.47 13.38 18.46 8.96 ...
## ..$ :'data.frame': 28 obs. of 2 variables:
## .. ..$ V1: Date[1:28], format: "2010-02-02" ...
## .. ..$ V2: num [1:28] 67.3 71.8 87.3 75.3 80.7 ...
Now we define the column names to set (one shared name for the date column and one name per parameter):
name.x <- c("Date")
name.y <- c("Ta_200", "rH_200")
And finally, we use two nested lapply() calls to set the column names of the data frames within the list of lists.
The crux is to build a data frame (y) inside the inner lapply() call and return it from there (and as lapply() always returns a list, we again get a list of lists):
lst <- lapply(seq_along(lst), function(i) {
  ## outer loop: one list entry per month
  lapply(seq_along(name.y), function(j) {
    ## inner loop: one data frame per parameter
    y <- data.frame(lst[[i]][[j]])
    names(y) <- c(name.x, name.y[j])
    return(y)
  })
})
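The same idea can also be written without index variables. The following is only a sketch under assumptions, not the method used above: it relies on base R's Map() to pair each data frame with its parameter name, and it runs on a small made-up list called toy so that it can be tried on its own.

```r
## small stand-in for the nested list (hypothetical toy data)
toy <- list(
  list(data.frame(V1 = 1:3, V2 = c(9.5, 10.1, 8.7)),
       data.frame(V1 = 1:3, V2 = c(71, 80, 65))),
  list(data.frame(V1 = 1:2, V2 = c(11.2, 12.4)),
       data.frame(V1 = 1:2, V2 = c(68, 74)))
)
name.x <- "Date"
name.y <- c("Ta_200", "rH_200")

toy <- lapply(toy, function(month) {
  ## Map() walks over the data frames and the parameter names in parallel
  Map(function(df, nm) {
    names(df) <- c(name.x, nm)
    df
  }, month, name.y)
})

## column names of every data frame, month by month
lapply(toy, function(month) lapply(month, names))
```

Because the unnamed list is passed to Map() first, the inner lists stay unnamed, just as with the nested lapply() version.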
And this is what we end up with:
str(lst)
## List of 2
## $ :List of 2
## ..$ :'data.frame': 31 obs. of 2 variables:
## .. ..$ Date : Date[1:31], format: "2010-01-02" ...
## .. ..$ Ta_200: num [1:31] 8.79 4.55 8.11 16.58 9.71 ...
## ..$ :'data.frame': 31 obs. of 2 variables:
## .. ..$ Date : Date[1:31], format: "2010-01-02" ...
## .. ..$ rH_200: num [1:31] 81.1 67.4 75.8 64.7 73.9 ...
## $ :List of 2
## ..$ :'data.frame': 28 obs. of 2 variables:
## .. ..$ Date : Date[1:28], format: "2010-02-02" ...
## .. ..$ Ta_200: num [1:28] 16.5 15.47 13.38 18.46 8.96 ...
## ..$ :'data.frame': 28 obs. of 2 variables:
## .. ..$ Date : Date[1:28], format: "2010-02-02" ...
## .. ..$ rH_200: num [1:28] 67.3 71.8 87.3 75.3 80.7 ...
Problem solved!
We now have a list of lists in which every data frame carries informative column names for the date and the observed parameter!
PS: if you wanted to name the first level entries of the list according to the month of observation, this would do the job:
names(lst) <- c("January", "February")
str(lst)
## List of 2
## $ January :List of 2
## ..$ :'data.frame': 31 obs. of 2 variables:
## .. ..$ Date : Date[1:31], format: "2010-01-02" ...
## .. ..$ Ta_200: num [1:31] 8.79 4.55 8.11 16.58 9.71 ...
## ..$ :'data.frame': 31 obs. of 2 variables:
## .. ..$ Date : Date[1:31], format: "2010-01-02" ...
## .. ..$ rH_200: num [1:31] 81.1 67.4 75.8 64.7 73.9 ...
## $ February:List of 2
## ..$ :'data.frame': 28 obs. of 2 variables:
## .. ..$ Date : Date[1:28], format: "2010-02-02" ...
## .. ..$ Ta_200: num [1:28] 16.5 15.47 13.38 18.46 8.96 ...
## ..$ :'data.frame': 28 obs. of 2 variables:
## .. ..$ Date : Date[1:28], format: "2010-02-02" ...
## .. ..$ rH_200: num [1:28] 67.3 71.8 87.3 75.3 80.7 ...
I leave it up to your imagination how to set the names of the second level list entries…
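For readers who would rather not guess, here is one possible way to do it (a sketch, not necessarily what was intended above): apply setNames() from the stats package to every first-level entry. The list toy below is made-up stand-in data, and naming the entries after the parameters is simply an assumption about what useful names would be.

```r
## hypothetical toy list with named months and unnamed second-level entries
toy <- list(
  January  = list(data.frame(Date = 1:3, Ta_200 = c(9.5, 10.1, 8.7)),
                  data.frame(Date = 1:3, rH_200 = c(71, 80, 65))),
  February = list(data.frame(Date = 1:2, Ta_200 = c(11.2, 12.4)),
                  data.frame(Date = 1:2, rH_200 = c(68, 74)))
)

## setNames() receives each month's list plus the names to attach
toy <- lapply(toy, setNames, c("Ta_200", "rH_200"))

names(toy$January)
toy$February$rH_200
```

With both levels named, a single data frame can be picked out by month and parameter instead of by position.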
sessionInfo()
## R version 2.15.1 (2012-06-22)
## Platform: x86_64-pc-mingw32/x64 (64-bit)
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats grDevices utils datasets grid graphics methods
## [8] base
##
## other attached packages:
## [1] knitr_0.6.3 raster_1.9-92 sp_0.9-99
## [4] reshape_0.8.4 plyr_1.7.1 latticeExtra_0.6-19
## [7] lattice_0.20-6 RColorBrewer_1.0-5
##
## loaded via a namespace (and not attached):
## [1] digest_0.5.2 evaluate_0.4.2 formatR_0.5 parser_0.0-16
## [5] Rcpp_0.9.13 stringr_0.6 tools_2.15.1