Social Media Data Mining with Raspberry Pi: 9 Videos for the Complete Beginner

Since the start of this year, I’ve been working on a project to turn a $30 Raspberry Pi 2 computer into a social media data mining machine using the programming language Python. The words “programming language” may be off-putting, but my goal is to work through the process step by step so that even a complete beginner can follow along and accomplish the feat.

The inexpensive, adaptable $30 Raspberry Pi 2

I’m motivated by two impulses. My first impulse is to help people gain control over and ownership of the information regarding interaction that surrounds us. My second impulse is to demonstrate that mastery of social media information is not limited to the corporate, the government, or the otherwise well-funded sphere. This is not a video series for those who already are technologically wealthy and adept. It’s for anyone who has $30 to spare and a willingness to tinker, but who feels left out of the social media data race. I hope to make the point that anyone can use social media data mining to find out who’s talking to whom. The powers that be are already watching down on us: my hope is that we little folks can start to watch up.

I’m starting the project by shooting videos. The video series has further potential, but has proceeded far enough along to represent a fairly good arc of skill development. Eventually I’d like to transcribe the videos and create a written and illustrated how-to pamphlet; these videos are just the start.

Throughout the videos, I’ve tried not to cover up the temporary mistakes, detours and puzzling bugs that are typical of programming. No one I know of hooks up the perfect computer system or writes a perfect program on the first try. Reading error messages and sleuthing out their causes is part of the process, and you’ll see that occasionally in these videos.

Please feel free to share the videos if you find them useful. I’d also appreciate any feedback you might have to offer.

Video 1: Hardware Setup for the Raspberry Pi

Video 2: Setting up the Raspberry Pi’s Raspbian Operating System

Video 3: Using the Raspberry Pi’s Text and Graphical Operating Systems

Video 4: Installing R

Video 5: Twitter, Tweepy and Python

Video 6: Debugging

Video 7: Saving Twitter Posts in a CSV File

Video 8: Extracting and Saving Data on Twitter URLs, Hashtags, and Mentions

Video 9: Custom Input
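
To give a flavor of the kind of task Videos 7 and 8 deal with, here is a minimal Python sketch that pulls URLs, hashtags, and mentions out of post text and saves them as CSV rows. It is not the code from the videos: it uses only Python’s standard library rather than Tweepy, the sample posts are made up, and the names extract_entities and write_posts_csv are hypothetical ones chosen for illustration. The patterns are deliberately simple and will miss some edge cases.

```python
import csv
import io
import re

# Simple patterns; real posts can contain edge cases these miss.
URL_PATTERN = re.compile(r"https?://\S+")
HASHTAG_PATTERN = re.compile(r"#(\w+)")
MENTION_PATTERN = re.compile(r"@(\w+)")


def extract_entities(text):
    """Return the URLs, hashtags, and mentions found in one post's text."""
    urls = URL_PATTERN.findall(text)
    # Remove URLs first so a "#fragment" inside a link is not counted as a hashtag.
    text_without_urls = URL_PATTERN.sub(" ", text)
    return {
        "urls": urls,
        "hashtags": HASHTAG_PATTERN.findall(text_without_urls),
        "mentions": MENTION_PATTERN.findall(text_without_urls),
    }


def write_posts_csv(posts, out_file):
    """Write one CSV row per post, joining multiple entities with spaces."""
    writer = csv.writer(out_file)
    writer.writerow(["text", "urls", "hashtags", "mentions"])
    for text in posts:
        found = extract_entities(text)
        writer.writerow([
            text,
            " ".join(found["urls"]),
            " ".join(found["hashtags"]),
            " ".join(found["mentions"]),
        ])


# Made-up example posts, not real tweets.
sample_posts = [
    "Thanks @example_user for the link http://example.com/page #RaspberryPi",
    "Learning #Python on a #RaspberryPi",
]

# An in-memory buffer stands in for a file opened with open(..., "w", newline="").
buffer = io.StringIO()
write_posts_csv(sample_posts, buffer)
print(buffer.getvalue())
```

To write to a real file on the Raspberry Pi instead, pass a file object opened with newline="" in place of the in-memory buffer.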

Remembering Pete Seeger 1-28-14: Collective Memory, Shared on Twitter

Activist folksinger Pete Seeger died at the age of 94 on January 27, 2014. As word of Seeger’s death spread on January 28, Twitter was flooded with tributes, including 28,226 posts made to the social media outlet’s #PeteSeeger hashtag channel by 9 PM. Of those posts, 21,617 (some 76.6%) were “re-tweets” of others’ posts. Pete Seeger wouldn’t have minded: he was a staunch believer in people forming publics to sing together, hearing a call and issuing a response, finding a tune and amplifying it not by microphones but in sheer numbers.

What did the world sing today about Pete Seeger? To answer that question, I tuned the Tweet Archivist Desktop (a handy $10 tool) to the #PeteSeeger hashtag, where it archived users’ public posts silently and efficiently in a background window on my computer. I used NodeXL (free and open-source) to find the most common word pairs in posts and to visualize them in the graphic you see below. When pairs are connected into chains and webs, the result is a semantic network that captures the spirit of the day.
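
The idea of counting word pairs and linking them into a network can be sketched in a few lines of Python. This is only an illustration of the general technique, not a description of how NodeXL works internally: the tokenizing rule, the function names, the threshold value, and the sample posts are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a post and split it into words, keeping # and @ prefixes."""
    return re.findall(r"[#@]?\w+(?:'\w+)?", text.lower())


def count_word_pairs(posts):
    """Count how often each pair of adjacent words appears across all posts."""
    pairs = Counter()
    for text in posts:
        words = tokenize(text)
        # zip pairs each word with the word that follows it.
        pairs.update(zip(words, words[1:]))
    return pairs


def build_network(pair_counts, threshold):
    """Keep pairs seen at least `threshold` times as weighted edges."""
    return {pair: n for pair, n in pair_counts.items() if n >= threshold}


# Made-up posts for illustration, not drawn from the archive.
sample_posts = [
    "Thank you Pete Seeger",
    "Thank you for the songs, Pete Seeger",
    "Rest in peace Pete Seeger",
]

edges = build_network(count_word_pairs(sample_posts), threshold=2)
for (first, second), n in sorted(edges.items()):
    print(first, "->", second, n)
```

Each surviving pair is an edge between two words, and words that share an edge chain together into the web. Raising the threshold thins the network down to the most common connections, which is why rarely used words drop out of the picture.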

Remembering Pete Seeger: a data visualization of a semantic network of the most common words and their connections in the 28,226 #PeteSeeger Twitter contributions from midnight to 9 PM on January 28, 2014

In case you’re wondering, the word “communist” only appears 29 times in all those posts, far too rarely to reach the threshold required to appear in the image. “Thank” or “thanks” appears over 2,000 times.