A DBI for PostgreSQL on R

Among the capabilities of R is the possibility of querying databases from within R. The DBMS I know best is PostgreSQL. What I like about it is that it is an open source object-relational DBMS. It’s simple to use, and it also has an extension for spatial and geographic objects called PostGIS.

So, the DBI (Database Interface) package I’ve chosen for querying PostgreSQL is RPostgreSQL. To work with it, I just have to install the package from the repository and use the following code:

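library(RPostgreSQL)

# Load the PostgreSQL driver and open a connection
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, host = "localhost", user = "***", password = "***", dbname = "Procesos_UChile")

# Inspect the driver and the open connection
dbListConnections(drv)
dbGetInfo(drv)
summary(con)

# Read the whole table into a data frame (it must fit in RAM)
df <- dbReadTable(con, "etapas_sept_2013")
summary(df)
head(df, 3)

# Close the connection when done
dbDisconnect(con)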

This DBI is a nice product, but it’s limited by RAM: dbReadTable loads the whole table into memory, and the problem appeared when I tried to read a table of over 10 GB. So, I’m stuck here. I know that a library called PivotalR was released this year, which lets you manage large amounts of data through the MADlib library.
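One possible way around the RAM limit, without leaving RPostgreSQL, is to fetch the result in chunks and keep only a running summary in memory. The following is a minimal sketch, not something I have run against the 10 GB table: the function names update_totals and chunked_mean are my own, the column name in the usage line is a placeholder, and it assumes the column is numeric and that fetch() on your driver version honours the n argument.

```r
# Hypothetical helper: fold one chunk of rows into a running count and sum.
update_totals <- function(totals, chunk, column) {
  values <- chunk[[column]]
  totals$n <- totals$n + sum(!is.na(values))
  totals$sum <- totals$sum + sum(values, na.rm = TRUE)
  totals
}

# Hypothetical helper: compute the mean of one numeric column by streaming
# the table through R in chunks instead of loading it all at once.
# `con` is an open DBI connection, e.g. the one created with dbConnect().
chunked_mean <- function(con, table, column, chunk_size = 100000) {
  # Identifiers are double-quoted so mixed-case names survive in PostgreSQL.
  sql <- sprintf('SELECT "%s" FROM "%s"', column, table)
  res <- DBI::dbSendQuery(con, sql)
  on.exit(DBI::dbClearResult(res))  # always release the result set

  totals <- list(n = 0, sum = 0)
  while (!DBI::dbHasCompleted(res)) {
    chunk <- DBI::fetch(res, n = chunk_size)  # at most chunk_size rows
    if (nrow(chunk) == 0) break
    totals <- update_totals(totals, chunk, column)
  }
  totals$sum / totals$n
}
```

With the connection from above still open, a call would look like chunked_mean(con, "etapas_sept_2013", "some_numeric_column"). For a simple mean it is of course easier to let PostgreSQL do the work with dbGetQuery and an AVG() query; the chunked loop is only worth it when the per-chunk processing has to happen in R.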

Pivotal is a software company that provides software and services for the development of custom applications for data and analytics based on cloud computing technology.

They also made MADlib, an open-source library for scalable in-database analytics that provides data-parallel implementations of mathematical, statistical and machine-learning methods for structured and unstructured data.

The next step is to try installing this library on Ubuntu to see how it works. The instructions are at this URL:

https://gist.github.com/thinkerbot/8699369

You can also watch a presentation of PivotalR with a demo in the following video:
https://www.youtube.com/watch?v=6cmyRCMY6j0

Some Reflections about Open Source Software

During my time at university, I learned some MapInfo, and ArcView 3.x as the main GIS package, including Avenue as its programming language. And MapObjects too.

In my last years there, they acquired a couple of licences for the brand new ArcGIS. When I got my hands on it, I was quite excited by the kinds of things you can do with it.

Obviously, the problem with commercial software comes when bugs appear and new versions come out, or when you need it on your own laptop. So, first of all, I found ArcExplorer, but I didn’t like it, because you can’t do much with just a viewer. If you are looking for a viewer, just convert your shapefiles to KML.

Just by accident, I came across a GIS called gvSIG. Its problem was that it used to crash a lot. I’ve heard that it is now in very good shape and stable. Even at that time, it had some interesting features, and for free. I went back to ArcGIS, but some time later I found Quantum GIS, now renamed QGIS. I was astonished: for me it was better than ArcGIS, it’s improving every day, and new plugins appear often. I’ve been using it ever since.

While using QGIS, I discovered GRASS GIS. It’s a bit hard to understand, because its logic isn’t the same as in other GIS software, but the amazing thing is that you can run it from the command console and, as with QGIS, you can use Python with it.

Besides its great algorithms, GRASS can connect to R. Although I haven’t tried it, I know you can also use SAGA with R. There is another tool for geospatial work, PostGIS (it allows you to deal with geometry), which is an extension of the PostgreSQL DBMS; another great extension for this DBMS is pgRouting.

As I come from the GIS world, it isn’t very easy for me, but I’m learning R to do spatial analysis and data analysis. It’s amazing what people have done with R, and it’s very fast. What I love about open source is that it is not a black box, as tends to happen with commercial software. With open source software, you can check which algorithms are applied, and it is free.

What I don’t know is whether anyone is developing an open source implementation of the four-step model for transportation planning. If not, we still have to pay for the black box that is TransCAD.

If you’d like to check the spatial packages you can use in R, I recommend the following link: http://cran.r-project.org/web/views/Spatial.html

In fact, there are more open source projects along these lines. You can check the web page of the GeoDa Center.

My First Post

The motivation for this blog is that I'm in the process of learning R. I studied Geographical Engineering; we always used software, but we never saw R, not even for doing some stats, so I didn't know what it was.
When I had to do my thesis, a teacher told me to do everything in R, and that is how I found out how amazing it is. Still, I'm a beginner.
A couple of months ago, I watched some videos from R-bloggers, where an R guru said that the best way to learn is to write a blog. So, here I am. It also serves a double purpose, because it will help me improve my writing skills.

For posting this, I'm using R; I just followed the excellent code from William K. Morris’s Blog.