How to get kids’ hands dirty with data? (Part 1)

In my last post, the question was what kind of programming projects students should be working on, if not video games. One answer I gave, without much detail, was helping kids work with real data. This post offers some thoughts on how to make that possible.

Real data is information that’s been collected in a scientific process and that reflects things that are actually going on. Ideally I’m thinking of information that is already in some kind of tabular form (whether an Excel spreadsheet, a database table of some sort, or even a comma- or tab-separated text file). But the data might instead be in a more programmatic form, such as JSON or XML files.
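
To make the difference between tabular and programmatic forms concrete, here is a minimal sketch that reads the same kind of records from CSV text and from JSON text using only Python's standard library. The station readings are made up for illustration, and the function names are my own.

```python
import csv
import io
import json

# Made-up sample readings; a real data set would come from a downloaded file.
CSV_TEXT = "station,date,temp_c\nA1,2016-09-01,21.5\nA1,2016-09-02,19.0\n"
JSON_TEXT = '[{"station": "B7", "date": "2016-09-01", "temp_c": 17.2}]'


def rows_from_csv(text):
    """Return one dict per CSV row, keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_from_json(text):
    """Return the list of records stored in a JSON array."""
    return json.loads(text)


rows = rows_from_csv(CSV_TEXT) + rows_from_json(JSON_TEXT)
for row in rows:
    # CSV values arrive as strings, so convert before doing arithmetic.
    print(row["station"], row["date"], float(row["temp_c"]))
```

Either way the students end up with a list of records, which is a reasonable starting point for everything that follows.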

The good news is that there is a huge amount of open data out there, and a lot of it is extremely useful and relevant. One of the best places to start is data.gov, a portal to data sets generated by the US government. These include atmospheric data, health data on US citizens, locations of American military bases, crime statistics, and countless other things that kids could use to address real-world problems.

Data.gov has a developers section with an API, but this is less useful than it sounds: the API only returns metadata and links to the data sets, and can’t access the data itself. Still, it’s a good place to start.
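
Here is a rough sketch of what "metadata and links" means in practice. It assumes the catalog follows the CKAN `package_search` convention, and the URL, the response shape and the sample response below are all assumptions for illustration, so check the developers section before relying on any of it. The sketch parses a canned response rather than making a network call.

```python
import json
from urllib.parse import urlencode

# Assumption: the catalog exposes a CKAN-style "package_search" action here.
BASE_URL = "https://catalog.data.gov/api/3/action/package_search"


def search_url(keyword, rows=5):
    """Build a search URL for a keyword (the data itself is not fetched)."""
    return BASE_URL + "?" + urlencode({"q": keyword, "rows": rows})


# Made-up response in the usual CKAN shape; titles and links are placeholders.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [{
            "title": "Example Tide Gauge Readings",
            "resources": [
                {"format": "CSV", "url": "https://example.org/tides.csv"},
                {"format": "JSON", "url": "https://example.org/tides.json"},
            ],
        }],
    },
})


def download_links(response_text):
    """Pull (title, format, url) out of the metadata for each resource."""
    payload = json.loads(response_text)
    links = []
    for dataset in payload["result"]["results"]:
        for resource in dataset.get("resources", []):
            links.append(
                (dataset["title"], resource.get("format", ""), resource["url"])
            )
    return links


print(search_url("tides"))
for title, fmt, url in download_links(SAMPLE_RESPONSE):
    print(title, fmt, url)
```

The point is that the response only tells you where the files live. Someone still has to download them and load them into something the students can query.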

Individual agencies like NASA, the FDA, NOAA and countless others each have their own APIs as well, which might allow a person to do more with the data.

In the short term, what a teacher would do is decide what kind of data students should work with, download it, and get it into a format the kids can use. Ideally this should be a database of some sort. Working with databases sounds big and scary. But the basic statements of SQL are not that hard, and nearly every language has packages for working with an SQL database. This may be harder than I think, because I haven’t actually attempted to teach it yet. But I believe it can be done.
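
As a minimal sketch of that teacher workflow, the following loads a small CSV into an in-memory SQLite database with Python's built-in `sqlite3` module and runs one basic query. The table name `crime_stats`, its columns and the numbers are all made up for illustration; a real data set would need its own column types and some cleanup.

```python
import csv
import io
import sqlite3

# Made-up numbers standing in for a downloaded CSV file.
CSV_TEXT = """state,year,incidents
Ohio,2014,120
Ohio,2015,135
Utah,2014,40
Utah,2015,38
"""


def load_table(conn, text):
    """Create the crime_stats table and fill it from CSV text."""
    conn.execute(
        "CREATE TABLE crime_stats (state TEXT, year INTEGER, incidents INTEGER)"
    )
    reader = csv.DictReader(io.StringIO(text))
    # Convert the numeric columns from strings before inserting.
    rows = [(r["state"], int(r["year"]), int(r["incidents"])) for r in reader]
    conn.executemany("INSERT INTO crime_stats VALUES (?, ?, ?)", rows)
    conn.commit()


def totals_by_state(conn):
    """Add up incidents per state with a basic GROUP BY query."""
    query = (
        "SELECT state, SUM(incidents) FROM crime_stats "
        "GROUP BY state ORDER BY state"
    )
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
load_table(conn, CSV_TEXT)
for state, total in totals_by_state(conn):
    print(state, total)
```

SQLite is a handy choice for a classroom because the whole database is a single file (or, as here, lives in memory), so there is no server to install.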

Why do we want to teach kids to work with databases? Because this is the skill most organizations, be they corporations, the government or nonprofits, are going to need their programmers to have. Of the students who end up doing some kind of coding, whether as full-time programmers or para-coders doing some code on the side of another job, very few will be making video games, but most will be working with some kind of data.

So what could the students do? Here are some of the kinds of projects they could attempt, with my estimation of their level of difficulty:

  • A search program for database keywords (easier)
  • Joining different tables to make another table (easy-medium)
  • Programmatically created graphs, such as bar graphs (medium)
  • “Dynamic” graphs that change over another dimension like time (medium-hard)
  • Heatmaps or colored maps of states, districts or countries (harder)
  • Interactive data-tools that respond to user input (hard)
  • Animations showing motion of objects like planes or ships (really hard)
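
The first two projects on the list might look something like the sketch below, which uses Python's `sqlite3` module. The `stations` and `readings` tables and their contents are invented for illustration, and a real project would load them from downloaded files instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two small made-up tables that share a station_id column.
conn.executescript("""
CREATE TABLE stations (station_id TEXT, name TEXT, state TEXT);
CREATE TABLE readings (station_id TEXT, day TEXT, temp_c REAL);
INSERT INTO stations VALUES
    ('A1', 'Harbor Point', 'MA'),
    ('B7', 'Desert Ridge', 'AZ');
INSERT INTO readings VALUES
    ('A1', '2016-09-01', 21.5),
    ('A1', '2016-09-02', 19.0),
    ('B7', '2016-09-01', 38.4);
""")


def search_stations(conn, keyword):
    """Keyword search: find stations whose name contains the keyword."""
    pattern = "%" + keyword + "%"
    query = (
        "SELECT station_id, name FROM stations "
        "WHERE name LIKE ? ORDER BY station_id"
    )
    return conn.execute(query, (pattern,)).fetchall()


def readings_with_names(conn):
    """Join the two tables so each reading carries its station name."""
    query = (
        "SELECT s.name, r.day, r.temp_c "
        "FROM readings AS r JOIN stations AS s "
        "ON s.station_id = r.station_id "
        "ORDER BY s.name, r.day"
    )
    return conn.execute(query).fetchall()


print(search_stations(conn, "harbor"))
for name, day, temp_c in readings_with_names(conn):
    print(name, day, temp_c)
```

Passing the keyword as a `?` parameter rather than pasting it into the SQL string is a good habit to teach from the start.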

Ideally you’d like to be doing something that Excel can’t do easily. This might mean generating graphs based on different queries a user could enter.
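
As a sketch of a graph that depends on a query, the following turns a user-chosen state into a small text bar chart. Everything here is an assumption for illustration: the `crime_stats` table, the numbers, and the text-only chart, which stands in for a real plotting library to keep the example dependency-free.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up yearly counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE crime_stats (state TEXT, year INTEGER, incidents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO crime_stats VALUES (?, ?, ?)",
        [("Ohio", 2014, 120), ("Ohio", 2015, 135),
         ("Utah", 2014, 40), ("Utah", 2015, 38)],
    )
    return conn


def yearly_counts(conn, state):
    """Run the query for whichever state the user asked about."""
    query = (
        "SELECT year, incidents FROM crime_stats "
        "WHERE state = ? ORDER BY year"
    )
    return conn.execute(query, (state,)).fetchall()


def text_bar_chart(rows, width=30):
    """Return one line per (label, value) row, scaled to the largest value."""
    if not rows:
        return ["(no matching rows)"]
    largest = max(value for _, value in rows) or 1
    lines = []
    for label, value in rows:
        bar = "#" * round(width * value / largest)
        lines.append(f"{label} | {bar} {value}")
    return lines


conn = build_demo_db()
state = "Utah"  # in a real program this could come from input()
for line in text_bar_chart(yearly_counts(conn, state)):
    print(line)
```

Changing `state` changes the chart, which is the kind of thing that takes more clicking than it should in a spreadsheet.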

Obviously this is going to take a lot of guidance at first. The emphasis here should be on the students solving a real-life problem. As much as possible students should make decisions about what kind of data they’d like to work with and what kind of problems they’d like to solve.

I’m going to be working with a small group of students to experiment with exploring data this year and I’ll post about how it’s working.

But I’m thinking bigger. I’m thinking of making a tool or IDE that helps students to do more of this independently. But that deserves a post of its own, which will be part 2 of this topic.
