I'm John, a Data Analysis Librarian/Consultant working in the Data & Visualization Services Department. I help Duke University students and researchers navigate practical data science challenges. To that end, I provide a series of workshops on R and OpenRefine, offer walk-in and by-appointment consultations, and host the "R we having fun yet" learning series (Rfun).
My workshops and presentations cover Twitter stream gathering, web scraping, data parsing, and data cleaning. Each workshop is designed as a hands-on experience, with practice data, presentation slides, and workbook guides available for download. Video recordings are often available.
MSLS Library & Information Science, 1992
UNC Chapel Hill
BA Sociology, 1990
Ciphers of Sybaritic Sophistry, 1900
Novel School of Fiction
Workshop Materials cover Web Scraping, Data Cleaning, Reconciliation, Twitter data, JSON/HTML parsing
The Data Science & Visualization Institute invited a presentation on web scraping and creating customized data sources
Use R to gather, analyze, and visualize tweets. This presentation includes information about text mining, sentiment analysis, network graph analysis, term-document matrices, and word clouds
ggvis is an R visualization library designed to work with the tidyverse. This presentation demonstrates basic ggvis commands and syntax, and considers which to choose if you can learn just one: ggvis or ggplot2
Using the flexdashboard package, you can easily create attractive dashboard summaries
Explore two methods of gathering real-time Twitter stream data, with hands-on exercises in applying for Twitter API keys and configuring a Twitter stream data gathering tool. Investigate and discuss historical Twitter data gathering, along with considerations for analysis.
In this hands-on presentation given at the Research Computing Symposium (2017), participants use R to gather movie data from the OMDb API. In part 2, participants access An API of Ice and Fire.
I’m teaching at DSVIL 2018 in June: Web Scraping, Data Cleaning, and HTML/JSON Parsing.
R Markdown is the backbone of R’s dynamic documents, literate data science, and reproducibility.
Announcing the Fall 2017 R Learning Series: dates, topics, registration, and access to learning materials.