What I used for this example is an open data about “Recycling places”, you can find it on the web page of Portal de datos Públicos.
The data, is an xlsx file
The file has 8 columns, one of them is town. So, now, the questions is:
How do I generate multiple files, one for each town?. The answer is simple: R
Why R? Because, you can automatize it. It avoid you to make different filters, and save the new file each time.
Let’s start:
To read the file, you can use the XLConnect package, and to split the data: the plyr package.
You can load the file with the function readWorksheet, but, as you can see, on the first row, six of the eight cells, are merged. So, when you load the file, that will disappear. Thus, we will read the file with headers and we’ll rename the columns with troubles; and then we will remove the row without data.
After that, we will use the d_ply function, which lets us to split the data. On this function, we put the field on which the split should be based. Then, we use the sdf function, which allows us to write the csv files; and as a final step, we extract the names from the field chosen, to paste them on the name of the new files.
library(XLConnect) #Functions to read excel library(plyr) #Functions to split data wb = loadWorkbook("Formato_Puntos_de_almacenamiento_Muni_consolidado.xlsx") df = readWorksheet(wb, sheet = "Hoja1", header = TRUE) colnames(df)[c(5, 6)] <- c("Este UTM", "Norte UTM") #change col names df2 <- df[-1,] #remove first row d_ply(df2, .(Comuna), function(sdf) write.csv(sdf, file=paste(sdf$Comuna[[1]],".csv",sep=""))) #write multiple csv