Data Science isn't Magic
Friday February 20, 2015
This is the flow for a DC Open Data Day 2015 workshop. It may not make sense out of context. For a fun summary in tweets (with photos!) you might check out a storification of it.
View this page as slides to make this site's base URL appear quite large, so that people can find it easily.
planspace.org
Workshop Outline
- Intro / Disclaimer
- 80% Definitions (slides)
- Data Science is OSEMN (slides)
- What's the Problem? (slides)
- Find NYC attendance data
- Data Science Tools (slides)
- Connect to RStudio in the cloud
  - Backup plan:
    - Install the appropriate R distribution for your system from this mirror.
    - Install the RStudio IDE for R. The RStudio site should suggest an appropriate package for your system.
    - Download and unzip the files we're using. (They're on GitHub, so you can clone if you prefer.)
- Working with R
  - Working with one day of data (01-day_attendance.R)
  - Selecting usable data points (02-select_totals.R)
  - The relationship between temperature and attendance (03-merge_and_plot.R)
- Bonus: Introducing the DC voter file
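
For a rough feel of the merge-and-plot step in the outline, here is a minimal sketch in base R. It is not the workshop's 03-merge_and_plot.R script. The data frames, the column names (date, pct_attendance, temp_f), and every value are made up for illustration; the real files may be shaped quite differently.

```r
# Hypothetical toy data: these values are invented for illustration and are
# NOT taken from the NYC attendance files used in the workshop.
attendance <- data.frame(
  date = as.Date(c("2015-01-05", "2015-01-06", "2015-01-07", "2015-01-08")),
  pct_attendance = c(91.2, 89.5, 84.0, 80.3)
)

# Hypothetical temperatures, with one extra day that has no attendance record.
temperature <- data.frame(
  date = as.Date(c("2015-01-05", "2015-01-06", "2015-01-07",
                   "2015-01-08", "2015-01-09")),
  temp_f = c(38, 30, 21, 15, 25)
)

# Join the two tables on their shared date column. merge() does an inner
# join by default, so the day without attendance data is dropped.
merged <- merge(attendance, temperature, by = "date")

# Scatter plot of attendance against temperature.
plot(merged$temp_f, merged$pct_attendance,
     xlab = "Temperature (F)", ylab = "Attendance (%)")

# Fit a simple linear model and overlay the fitted line on the plot.
fit <- lm(pct_attendance ~ temp_f, data = merged)
abline(fit)
```

Run in RStudio, this should draw the points with a straight line through them. Whether a linear fit is a sensible description of the real data is a separate question that the plot itself can help answer.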
Additional Resources
- For learning R:
  - Try R is an interactive web site that guides you through R functionality in your web browser.
  - The walking introduction to R is an R script that you can open and work through in RStudio.
- For more fun data: