Example to connect to a SQLite database using R Project for Statistical Computing

Getting source data into your R environment is the first step in data or statistical analysis, whether this be in R or R-Studio. Often this data resides in relational data stores. For those new to R, this is how to access the SQLite database from R.

Edit highlighted lines as appropriate.

# Load the SQLite library
# Assign the sqlite datbase and full path to a variable
  dbfile = "/Library/WebServer/Documents/Dash2.5/data/livedb.sqlite";
# Instantiate the dbDriver to a convenient object
  sqlite = dbDriver("SQLite");
# Assign the connection string to a connection object
  mydb = dbConnect(sqlite, dbfile);
# Request a list of tables using the connection object
# Assign the results of a SQL query to an object
  results = dbSendQuery(mydb, "SELECT * FROM DailyCensus");
# Return results from a custom object to a data.frame
  data = fetch(results);
# Print data frame to console
# Clear the results and close the connection

Using R-Studio with Meditech Data Repository

I’m often asked about my tool set for working with the Meditech Data Repository (DR) to mine clinical data. In most cases I use the tools that come bundled with the underlying MS SQL Server. They cost nothing extra, so there’s no excuse for not using them.

SQL Server Management Studio (SSMS) and SQL Server Reporting Services (SSRS) are simple to use once setup and configured properly and they’re great for basic observational analysis (reports and dashboards). If management need more eye catching eye-candy in their dashboards, there’s always d3.js (which I absolutely love!).

Lately, I’ve been spending quality time with the free and open source R-Studio writing R code. This is widely used in non-healthcare settings but for some reason it hasn’t made in-roads in Healthcare (with the exception of academic and research environments). While the learning curve is initially steep it is proving very worth while.R-StudioScreenGrab

R-Studio can connect directly to the DR with the RODBC package and gives you access to a huge armory of data processing, math and statistics tools along with advanced charting and illustration.

RplotExampleIf you really want to do “predictive analytics” with very large multivariate datasets, this free and open source package offers machine learning, modeling and more statistical packages than I know what to do with.

Installation of R and then R-Studio couldn’t be simpler with both Mac and Windows installers.

To get you started this code snippet will pull the latest data from [livedb.[dbo].[AdmVisits], select a column and throw a chart on the page. From here on, the possibilities are limited by your imagination and persistence.

     # Load an ODBC database handler
     # Load plotting library
     # Create a connection handle
     dbhandle <- odbcDriverConnect('driver={SQL Server};
     # Fire the SQL query and assign results to a data.frame
     results <- as.data.frame(sqlQuery(dbhandle, '
     SELECT TOP 500 * FROM [livedb].[dbo].[AdmVisits]
     # Close the database connection
     # Summarize the returned data to the console
     # Subset your results for a desired column
     race <- results [ , "RaceName"]
     # Illustrate it