Social Media Data Mining with Raspberry Pi: 9 Videos for the Complete Beginner

Since the start of this year, I’ve been working on a project to take a $30 Raspberry Pi 2 computer and turn it into a social media data mining machine using the programming language Python. The words “programming language” may be off-putting, but my goal is to work through the process step-by-step so that even a complete beginner can follow along and accomplish the feat.

The inexpensive, adaptable $30 Raspberry Pi 2

I’m motivated by two impulses. My first impulse is to help people gain control over and ownership of the information regarding interaction that surrounds us. My second impulse is to demonstrate that mastery of social media information is not limited to the corporate, the government, or the otherwise well-funded sphere. This is not a video series for those who already are technologically wealthy and adept. It’s for anyone who has $30 to spare and a willingness to tinker, but who feels that they’ve been left out of the social media data race. I hope to make the point that anyone can use social media data mining to find out who’s talking to whom. The powers that be are already watching down at us: my hope is that we little folks can start to watch up.

I’m starting the project by shooting videos. The video series has further potential, but has proceeded far enough along to represent a fairly good arc of skill development. Eventually I’d like to transcribe the videos and create a written and illustrated how-to pamphlet; these videos are just the start.

Throughout the videos, I’ve tried not to cover up the temporary mistakes, detours and puzzling bugs that are typical of programming. No one I know of hooks up the perfect computer system or writes a perfect program on the first try. Working through error messages and sleuthing through them is part of the process, and you’ll see that occasionally in these videos.

Please feel free to share the videos if you find them useful. I’d also appreciate any feedback you might have to offer.

Video 1: Hardware Setup for the Raspberry Pi

Video 2: Setting up the Raspberry Pi’s Raspbian Operating System

Video 3: Using the Raspberry Pi’s Text and Graphical Operating Systems

Video 4: Installing R

Video 5: Twitter, Tweepy and Python

Video 6: Debugging

Video 7: Saving Twitter Posts in a CSV File

Video 8: Extracting and Saving Data on Twitter URLs, Hashtags, and Mentions

Video 9: Custom Input
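
For readers who want a feel for what Videos 7 and 8 are about before watching, here is a minimal sketch. It is not the code from the videos: it uses only Python's standard library, it assumes each post is already available as an (author, text) pair, and it finds hashtags, mentions and URLs with simplified patterns instead of the structured data that Twitter and Tweepy can supply. The function names and the sample post are hypothetical.

```python
import csv
import io
import re

# Simplified patterns (an assumption); real posts have edge cases,
# such as e-mail addresses, that these patterns do not handle.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")
URL = re.compile(r"https?://\S+")


def extract_entities(text):
    """Return (hashtags, mentions, urls) found in one post's text."""
    urls = URL.findall(text)
    # Remove URLs first so a '#' or '@' inside a link is not counted.
    stripped = URL.sub(" ", text)
    hashtags = [tag.lower() for tag in HASHTAG.findall(stripped)]
    mentions = [name.lower() for name in MENTION.findall(stripped)]
    return hashtags, mentions, urls


def write_posts_csv(posts, handle):
    """Write one CSV row per post; multiple entities are joined with spaces."""
    writer = csv.writer(handle)
    writer.writerow(["author", "text", "hashtags", "mentions", "urls"])
    for author, text in posts:
        hashtags, mentions, urls = extract_entities(text)
        writer.writerow([author, text, " ".join(hashtags),
                         " ".join(mentions), " ".join(urls)])


# Hypothetical post, invented for illustration only.
sample = [("example_user",
           "Hello @another_user, see https://example.com/a #Python #RaspberryPi")]
buffer = io.StringIO()  # stands in for a file opened with newline=""
write_posts_csv(sample, buffer)
print(buffer.getvalue())
```

To save to disk instead of printing, pass a file opened with `open("posts.csv", "w", newline="")` in place of the in-memory buffer.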

The U.S. Senate on Twitter: Week One

Over the last five years, the social media platform Twitter has become a standard part of the communications package of U.S. senators. An analysis of Twitter activity by senators in the first four days of the 115th Congress (Tuesday January 3 to Friday January 6) reveals a large amount of communication with a great deal of variety between members of the Senate. During this period, the 100 members of the Senate posted 1,792 Twitter posts (“Tweets”). Many of these posts were accomplished impersonally (as with Senators’ speeches, statements and letters) through work delegated to hired communications staff.

The distribution of these Tweets is uneven. The office of New York Senator Charles Schumer posted the largest number of Tweets during the four days at 79, with Texas Senator John Cornyn not far behind at 69 Tweets. These two most voluminous Twitter users directed their posts in different ways: three out of five of Sen. Cornyn’s Tweets mentioned or replied to another Twitter user, while slightly more than half of Sen. Schumer’s Tweets were broadcast without reference to any other Twitter user. While both senators had much to say, Sen. Schumer acted as more of a broadcaster and Sen. Cornyn acted as more of a communicator. The least communicative Senator on Twitter was Thad Cochran of Mississippi, who only posted one Tweet during the first four days of the 115th Congress:

Thad Cochran's one and only Twitter post during the first four days of the 115th Congress was directed toward Vice President-Elect Mike Pence

For a member of the U.S. Senate, Sen. Cochran’s single Tweet went relatively unnoticed, with only 5 retweets, 24 likes, and 14 replies. Vice President-Elect Mike Pence did not respond to Sen. Cochran’s outreach.

Those senators who do not communicate tend not to be the recipients of communication. Sen. Cochran, for instance, was not mentioned by any other senator during the new Senate’s first week. Sens. Schumer and Cornyn, on the other hand, received multiple mentions from other senators during the period. The most mentioned senator during the first four days of the 115th Congress was Catherine Cortez Masto, the new Senator for Nevada. Most of these mentions by other senators were messages of welcome, although some noted her work, as in this retweeting message from Sen. Schumer regarding the new Senate’s plans to dismantle the existing health care structure:

Senator Chuck Schumer Retweets Senator Catherine Cortez Masto on the repeal of Health Care for millions of Americans

Patterns of communication between members of the Senate via Twitter tended to be partisan, as the following social network graph of mentions and replies indicates. This network graph uses a “spring embedded” visualization technique so that ties (indicated via curved lines) draw connected nodes closer to one another:

Twitter network of United States Senators. Lines indicate mentions or replies. January 3-6, 2017. Red nodes are Republicans, Blue are Democrats, Green are Independents, and gray are non-senate accounts.

Most Democratic senators’ accounts tend to cluster close to one another (although Senator Michael Bennet of Colorado is the network’s only “isolate,” not mentioning or referring to any other Twitter account during the period), and most Republican senators’ accounts tend to cluster close to one another as well. Interestingly, the two Independents of the Senate, Senators Angus King of Maine and Bernie Sanders of Vermont, are clustered closely to Democrats’ accounts, Sen. Sanders most markedly so. Sen. King clusters with Democrats in this period because he mentions the same non-Senate Twitter account that they do (as indicated in gray).
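
To make the ideas of a tie and an isolate concrete, here is a minimal Python sketch. It is not how the graph above was produced (that was done with NodeXL). It assumes posts are available as (author, text) pairs, finds mentions with a simplified pattern, and treats an isolate as an account with no ties in either direction. The account names are hypothetical.

```python
import re

MENTION = re.compile(r"@(\w+)")  # simplified pattern (an assumption)


def mention_edges(tweets):
    """Return a set of directed (author, mentioned_account) ties."""
    edges = set()
    for author, text in tweets:
        source = author.lower()
        for name in MENTION.findall(text):
            target = name.lower()
            if target != source:  # ignore self-mentions
                edges.add((source, target))
    return edges


def isolates(accounts, edges):
    """Accounts that neither mention anyone nor are mentioned by anyone."""
    connected = {account for edge in edges for account in edge}
    return sorted(a.lower() for a in accounts if a.lower() not in connected)


# Hypothetical accounts and posts, invented for illustration only.
tweets = [
    ("sen_a", "Welcome to the Senate, @sen_b!"),
    ("sen_b", "Thanks @sen_a. Talking with @news_outlet at noon."),
    ("sen_c", "Back at work today."),
]
edges = mention_edges(tweets)
print(sorted(edges))
print(isolates(["sen_a", "sen_b", "sen_c"], edges))
```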

There are exceptions to strict partisanship. Many members of the Senate refer to the same non-Senate accounts across party lines, as in the case of Senator Heidi Heitkamp of North Dakota, represented as the blue dot in the upper left of the network graph. While Sen. Heitkamp does not directly converse on Twitter with Republican senators, neither does she converse with her Democratic colleagues. Because Sen. Heitkamp mentions an account that is also mentioned by Senator Joni Ernst of Iowa, she is clustered with Republicans. Senator Tim Scott (represented as the red dot toward the bottom of the network graph) follows the same pattern, not directly mentioning Democratic senators’ accounts but mentioning a number of the same non-Senate accounts that they do. Senator Joe Manchin takes this pattern of cross-partisanship through indirect contact to its fullest extent, referring to the Twitter accounts of Vice President-Elect Mike Pence, the news outlet Fox News, and the news shows Fox & Friends and Morning Joe, all of which are popular targets of communication for a number of Republican senators. In the network of Twitter communication, this places Sen. Manchin squarely in the midst of the Republican upper half of the Senatorial network.

As the large number of gray-shaded accounts in the network indicates, members of the Senate spend considerable energy communicating to accounts outside the Senate. Some 471 accounts outside the Senate were targets of communication during the first 4 days of the 115th Congress. The most common target of communication during these days was President-Elect Donald Trump, who was mentioned in 17 senators’ Tweets. The next most common target was the account of Planned Parenthood, whose federal funding for poor women’s pap smears and contraception is under threat from Senate budget cutters. Rounding out the top ten most-referred-to Twitter accounts by senators are four Democratic senators, the collective account of Senate Democrats, one Trump cabinet pick (Governor Rick Perry of Texas), and two national news outlets.

Patterns of reference to particular media outlets are highlighted in the network graph below, which is identical to the graph above but which features the accounts of national news outlets with graphic icons. The three most commonly referred-to media outlets during the period were MSNBC (10 references), Fox News (8 references), and C-SPAN (7 references). The location of some of these news outlets is unsurprising. Right-leaning Fox News, Politico, and The Hill are referred to most commonly by Republicans, and left-leaning NPR is referred to exclusively by Democratic senators. However, the centrist CNN is surprisingly only referred to by Republican senators, and the right-leaning Washington Times is referred to by both Republicans and Democrats in the Senate.

Twitter Network of the U.S. Senate from Jan 3-6 2017 with national media outlets highlighted as graphic icons

 

Data collection and visualization for this post was carried out with NodeXL software.

Social Media Accounts of Candidates for the Maine State Senate

Deciding who to vote for in state legislative campaigns can sometimes be tricky because thorough coverage of local candidates can be hard to find. In Maine, state legislators are known for their accessibility. This may be because Maine’s legislative districts tend to be small; it may also be due to the friendly nature of Maine folk in general. Whatever the reason, getting in touch with candidates for Maine political office is both important and possible.

In this day and age, the quickest way to learn about state legislative candidates and to find their contact information is through online platforms like individual web pages, Facebook and Twitter. To help you in that process, I’ve put together a spreadsheet with information about the social media presence of the 70 candidates for the Maine Senate in 2016, along with some additional contextual information. To download this information for personal use, click here for a Microsoft Excel file.

This sort of information changes all the time — if you have updated information about new accounts, please share a comment below to let me know, or write to james.m.cook@maine.edu.

Presentation Materials for Twitter Adoption in U.S. Legislatures at #SMSociety 2016 Conference

The following are links to supporting materials for the presentation “Twitter Adoption in U.S. Legislatures: A Fifty-State Study” made to the 2016 International Conference on Social Media & Society on Wednesday, July 13 at Goldsmiths, University of London.

1. Free full-text access:

Twitter Adoption in U.S. Legislatures: A Fifty-State Study (via the ACM DL Author-ize service)

James M. Cook
SMSociety ’16 Proceedings of the 7th 2016 International Conference on Social Media & Society, 2016

2. Download PowerPoint slides from the presentation

3. Abstract: This study draws theoretical inspiration from the literature on Twitter adoption and Twitter activity in United States legislatures, applying predictions from those limited studies to all 7,378 politicians serving across 50 American state legislatures in the fall of 2015. Tests of bivariate association carried out for individual states lead to widely varying results, indicating an underlying diversity of legislative environments. However, a pooled multivariate analysis for all 50 states indicates that the number of constituents per legislator, district youth, district level of educational attainment, legislative professionalism, being a woman, sitting in the upper chamber, holding a leadership position, and legislative inexperience are all significantly and positively associated with Twitter adoption and Twitter activity. Controlling for these factors, legislator party, majority status, partisan instability, district income, and the percent of households in a state with an Internet connection are not significantly related to either Twitter adoption or recent Twitter use. A significant share of variation in social media adoption by legislators remains unexplained, leaving considerable room for further theoretical development and the development of contingent historical accounts.

Please feel free to review these materials before or after my presentation. I look forward to your comments.

A Hashtag Crash: CCS2016, Meet CCS2016, CCS2016, CCS2016 and CCS2016

Visit the Twitter hashtag channel #CCS2016 for information on the 2016 Conference on Complex Systems taking place in Amsterdam this upcoming September. Well actually, isn’t #CCS2016 the hashtag covering the 2016 Canadian Crowdfunding Summit? Or, wait, does #CCS2016 refer to the 2016 Content and Commerce Summit meeting in Orlando, Florida? Or is #CCS2016 the hashtag for announcements regarding 2016 Comic Con Spain? Could #CCS2016 be a hashtag for a cinematography conference in Caracas, Venezuela?

The answer is yes, yes, yes, yes, and yes. The Twitter hashtag channel #CCS2016 has been used to promote all of these, a simultaneous indication of the popularity, bottom-up flexibility, and strategic difficulty involved in using the social media platform. #CCS2016 has been used beyond this, within the past year referring to events as diverse as a Brazilian country music festival, a high school spirit effort, a “Corporate Community Summit” and an academic conference on “Cities as Community Spaces.”

The graph below features all participants in the #CCS2016 hashtag channel from April 1 to 11, 2016. Every dot (called a node) represents a Twitter account that has made a post including the #CCS2016 hashtag. Every line (called a tie) represents an instance in which one Twitter account has mentioned or replied to another Twitter account. Together, these nodes and ties make up a springtime social network for CCS2016.

Social Network for Twitter accounts using the #CCS2016 hashtag from April 1-11, 2016

As you can see, this is a disconnected network.  The large dark blue network at the top of the graph consists of Twitter users discussing Comic Con Spain.  The smaller light blue network below it is beginning to grow after the announcement of an annual Conference on Complex Systems.  To its left are participants in the upcoming Canadian Crowdfunding Summit.  To the lower right are a handful of remaining nodes discussing less popular or timely representations of the title “CCS2016.”
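
What makes a network "disconnected" can be checked mechanically: group the nodes into connected components, ignoring the direction of ties. The sketch below is a standard breadth-first search in plain Python, offered as an illustration and not as the method used to draw the graph above. The node names are hypothetical.

```python
from collections import defaultdict, deque


def connected_components(nodes, ties):
    """Group nodes into components, treating every tie as undirected."""
    neighbors = defaultdict(set)
    for a, b in ties:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            node = queue.popleft()
            members.append(node)
            for other in neighbors[node]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        components.append(sorted(members))
    # Largest component first
    return sorted(components, key=len, reverse=True)


# Hypothetical accounts, invented for illustration only.
nodes = ["a", "b", "c", "d", "e", "f"]
ties = [("a", "b"), ("b", "c"), ("d", "e")]
for component in connected_components(nodes, ties):
    print(len(component), component)
```

A network is disconnected whenever this function returns more than one component; an account with no ties at all, like "f" here, forms a component of size one.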

Separate conversations are put in separate areas of this two-dimensional graph.  On Twitter itself, no such separation is afforded. The purpose of a hashtag is to provide a space that community members can visit when they have something to say, or have a desire to listen.  In this busy, muddied virtual room called “#CCS2016,” multiple conversations are taking place on top of one another.

Why don’t all these different groups use a different hashtag? A social media marketer would advise always reviewing past use of a phrase before adopting it as a hashtag of one’s own, lest one be accused of acting as a hashtag crasher. But regardless, lines of distinction in Twitter can help keep the conversation coherent. Different users are speaking different languages: English, Portuguese, Spanish. This is a sorting mechanism. A second sorting mechanism comes from the ties charted in the graph above: people with a particular interest in a hashtag are most likely to find out about that hashtag because they follow other people with the same particular interest. This means relevant hashtag posts are most likely to appear in a user’s Twitter timeline. Finally, popularity of particular uses for a hashtag may shift over time, as one event comes to a head and another recedes into the past.

At times, it’s a mess, but this is what civic democracy looks like.

Stages of Teaching and Learning Social Media Analytics (Presentation Notes)

This afternoon, I’ll be making a short presentation of thoughts on teaching social media analytics at the 2015 conference of the International Communication Association as part of its BlueSky Workshop on Tools for Teaching and Learning of Social Media Analytics. While the workshop is focused on the experience of teaching using a series of particular tools, I am interested in rejecting the question, “Which tools are best for teaching?,” and supplanting it with the idea of building capability in students in a progressive strategy. At different stages in students’ development as social media researchers, different analytic platforms may be more or less appropriate as teaching tools.

Below is a copy of notes for my presentation; notes can also be downloaded as a PDF here.


Objective: To introduce inexperienced undergraduate students to the process of analyzing social media with sufficient breadth that they may continue to learn independently.

Teaching Challenges Provoking Implementation:

  • As the mandate for higher education continues to widen, undergraduate students tend more and more to be non-traditional, to lack preparation, to lack confidence, and to be fascinated by but intimidated by math, research and technology.
  • Social media platforms are in a state of constant change.
  • Social media analytics packages and methods are rapidly evolving now and are likely to experience significant change in the next decade.

Learning Outcomes: Students who complete a course in social media analytics will be able to:

  1. Find and navigate social media platforms
  2. Recognize the common elements of social media:
    1. Individuals
    2. Actions
    3. Memberships
    4. Relationships
  3. Extract observations of these elements into datasets:
    1. Individual-level
    2. 1-mode network
    3. 2-mode network
  4. Analyze data and report data visualizations, qualitative categorizations and quantitative statistics

Strategy: A gentle, stepwise series of stages taking students from where they are to where they need to be, introducing students to a variety of analytic platforms, and focusing on the social research skills that will remain constant despite changes in social media and social media analytic platforms.

Stages of learning social media analytics, from Consumer to Manager to Secondhand Gatherer to Primary Gatherer to Analyst

Teaching Challenges in Implementation:

  • Universal access for students who no longer share a common campus, common hardware and common software
  • Reasonable yet challenging entry for students who come to class with a variety of previous experience and capabilities
  • A variety of reasonable endpoints for students who vary in their level of progression and accomplishment

Before #MillionsMarchNYC, a Protest Movement Takes to Facebook and Twitter

December 12 2014 is the day before the Millions March in New York City, an organized reaction to the deaths of unarmed black men at the hands of the police and more broadly to structural forms of racial discrimination. Tomorrow, a variety of professional journalists will hopefully describe the messages and activities of the protest and reactions to this protest. Today, we can study the run-up to the Millions March by watching people talk about it on Facebook and Twitter.

Facebook, the social media website that most people know best, lets users create personal accounts and pages that they control. Administrators of a page for a group or event can allow posts by others, but they can also purge them if they find the content disagreeable. For announcing activist events, Facebook is a top-down affair. If you want to know what movement organizers think of their protest event, look at their Facebook page. The following is a word cloud taken from administrative posts to the MillionsMarchNYC Facebook page. Words are larger in this graphic if they occur more frequently:

A Wordle Word Cloud representing the frequency of various words used by organizers of the MillionsMarch in New York City on December 13 2014

We see a lot of practical information here, many references to locations and plans and logistical concerns.  This is what’s on the mind of movement leaders. What’s on the mind of the many thousands who are thinking about going?

Twitter is a different kind of social media website. On Facebook, certain people own Pages and control those Pages’ content. On Twitter, subjects are organized by hashtags, which no one owns and no one can purge, and which therefore tend to be driven from the bottom up. A corporation with an image problem on Facebook can simply delete comments. Woe betide the corporation that offends on Twitter; it may entirely lose control of the public conversation about itself. If you want to know what people are thinking about a social movement inside and outside its leadership, look at Twitter.

To do just that, I’ve gathered up all Twitter posts (“Tweets”) using the hashtag #MillionsMarchNYC. Perhaps the simplest way of characterizing #MillionsMarchNYC tweets is over time; as of 12 Noon on December 12, here’s the trend in posting volume:

Graph: Volume of Twitter Posts using the hashtag #MillionsMarchNYC through December 12, 12 Noon Eastern Time

5,415 tweets using the newly-created hashtag were posted from November 26 to December 12, but the dates November 26-30 are not even included in this graph because the number of tweets during that initial period (just 6) is minuscule in comparison to the conversation two weeks later. The trend clearly indicates a spike in use of the #MillionsMarchNYC hashtag, especially over the last few days before the march, but what ideas are associated with the spiking hashtag?

A useful feature of Twitter for answering that question is that a single post may contain more than one hashtag. The co-occurrence of #MillionsMarchNYC with other hashtags in the set of Nov. 30 – Dec. 12 tweets is indicated in the following frequency table:

hashtag frequency
#blacklivesmatter 2020
#icantbreathe 1360
#dec1314 1232
#nyc2palestine 832
#shutitdown 317
#ericgarner 316
#nyc 170
#stolenlives 168
#ferguson 136
#dayofanger 128
#thisstopstoday 116
#mikebrown 103
#justiceleaguenyc 91
#fromtherivertothesea 90
#washingtonsquarepark 72
#indictthesystem 71
#nyc2ferguson 66
#anonymous 62
#wecantbreathe 62
#expectus 61
#nojusticenopeace 55
#intersectionality 49
#palestine 47
#d1314 44
#12/13/14 35
#41986 35
#12/13/2014 35
#freepalestine 27
#millionsmarchsf 25
#akaigurley 22
#weekofoutrage 22
#justiceforericgarner 21
#directaction 19
#justiceformikebrown 19
#alllivesmatter 18
#jailsupport 18
#millionsmarch 17
#handsupdontshoot 16
#equalrights4all 15
#humanrightsday 15
#michaelbrown 15
#opbelgium 14
#santacon 14
#ftp 12
#icantbreath 12
#nycprotest 12
#dayofresistance 11
#dec1213 11
#love 10
#millionsmarchoakland 10
#nojusticenopeacenoracistpolice 10

In the interest of brevity, I’ve only included hashtags used at least ten times in this list.  Just three hashtags co-occur with #MillionsMarchNYC more than 1,000 times: #blacklivesmatter, #icantbreathe and #dec1314.  The tail of the distribution is long, however, with many hashtags occurring a handful of times or just once:

Frequency Distribution of Hashtag CoOccurrence with #MillionsMarchNYC from November 26 to December 12 2014
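
A frequency table like the one above can be tallied in a few lines of Python. The following is a minimal sketch and not the processing actually used for this post (that was done with UCINET). It assumes the posts are plain text strings, it counts each hashtag at most once per post, and its simplified pattern would cut a tag such as #12/13/14 short at the first slash. The three sample posts are invented.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")  # simplified pattern (an assumption)


def cooccurrence_counts(texts, focal):
    """Count posts in which each other hashtag appears alongside `focal`."""
    focal = focal.lower().lstrip("#")
    counts = Counter()
    for text in texts:
        tags = {tag.lower() for tag in HASHTAG.findall(text)}
        if focal in tags:
            counts.update(tags - {focal})
    return counts


def at_least(counts, minimum):
    """Keep only hashtags seen at least `minimum` times, most frequent first."""
    return [(tag, n) for tag, n in counts.most_common() if n >= minimum]


# Invented posts for illustration; these are not posts from the data set.
texts = [
    "See you there #MillionsMarchNYC #BlackLivesMatter",
    "#millionsmarchnyc #ICantBreathe #blacklivesmatter",
    "Traffic report #nyc",
]
counts = cooccurrence_counts(texts, "#MillionsMarchNYC")
print(at_least(counts, 1))
print(at_least(counts, 2))
```

Raising the `minimum` argument is the same kind of cutoff as the ten-use threshold applied to the table.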

These many hashtags do not simply co-occur with #MillionsMarchNYC in these Twitter posts, however.  They also sometimes co-occur with one another, forming a co-occurrence network that tells us something about the symbolic landscape of the leadup to this protest.

Sometimes the truth is messy; the following is a graph showing the complete co-occurrence network of hashtags used with #MillionsMarchNYC (the #MillionsMarchNYC hashtag itself is removed from the network to highlight connections between other tags).  Every hashtag is a node in this network and every co-occurrence between two hashtags appears as a tie between the two nodes.  A tie is drawn more darkly if the co-occurrence happens more often, and a node is drawn in greater size if the hashtag it represents co-occurs with a greater number of other hashtags.  Nodes are given different colors to highlight sets of nodes that are more strongly connected with one another:

Complete Co-Occurrence Network of Hashtags in #MillionsMarchNYC Twitter Posts, Nov. 26 to Dec 12

That’s pretty hard to read, isn’t it?  A few tags are evident, but there are so many that they overlap with one another, blending into a blurry mess.  The culture of a social movement can actually be a lot like that, with a large number of voices saying so many things.  But if we start to filter out the least common hashtag utterances, clearer patterns begin to emerge.

Here’s the same Twitter hashtag network, but this time just showing the hashtags for which co-occurrences happen at least 5 times:

Network of Hashtags Co-Occurring at least 5 times in the #millionsmarchnyc network on Twitter, Nov. 26-Dec. 12 2014

Here’s the same Twitter hashtag network, but this time just showing the hashtags that co-occur with some other hashtag at least 20 times:

Network of Hashtags Co-Occurring at Least 20 Times in the #millionsmarchnyc hashtag on Twitter, Nov. 26 - Dec. 12 2014

And here’s the same Twitter hashtag network, but this time just showing the hashtags that co-occur with some other hashtag at least 100 times:

Cultural Network of Hashtag Co-occurrence in the Tweets mentioning #MillionsMarchNYC from November 26 to December 12 2014

If we filter for frequency, we lose detail, but at the same time the core of this movement’s culture becomes apparent.
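
The logic behind these graphs (ties weighted by how often two hashtags share a post, node size by the number of distinct partners, and a minimum-frequency filter) can be sketched in plain Python. This is an illustration under stated assumptions, not the UCINET and NodeXL workflow used for this post: it assumes each post has already been reduced to a list of hashtags, and the sample posts are invented.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(tag_sets, exclude=()):
    """Count how many posts contain each unordered pair of hashtags."""
    excluded = {tag.lower() for tag in exclude}
    weights = Counter()
    for tags in tag_sets:
        kept = sorted({tag.lower() for tag in tags} - excluded)
        for pair in combinations(kept, 2):
            weights[pair] += 1
    return weights


def degrees(weights):
    """Number of distinct other hashtags each hashtag co-occurs with."""
    degree = Counter()
    for first, second in weights:
        degree[first] += 1
        degree[second] += 1
    return degree


def filter_network(weights, minimum):
    """Keep ties seen at least `minimum` times, plus the nodes they connect."""
    ties = {pair: n for pair, n in weights.items() if n >= minimum}
    nodes = {tag for pair in ties for tag in pair}
    return nodes, ties


# The tags below appear in the frequency table, but these combinations
# are invented for illustration and are not real posts.
posts = [
    ["millionsmarchnyc", "blacklivesmatter", "icantbreathe"],
    ["millionsmarchnyc", "blacklivesmatter", "icantbreathe", "ericgarner"],
    ["millionsmarchnyc", "nyc"],
]
weights = cooccurrence_network(posts, exclude=["millionsmarchnyc"])
print(degrees(weights).most_common())
for minimum in (1, 2):
    nodes, ties = filter_network(weights, minimum)
    print(minimum, sorted(nodes), ties)
```

Excluding the focal hashtag mirrors the choice made for the graphs, since it would otherwise be tied to every other node. A hashtag that only ever appears with the focal tag, like "nyc" here, ends up with no ties at all.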

Although I stand by my claim that this hashtag network indicates something about social movement culture, I should note a few important limitations. First, the use of a hashtag involves a person deciding how they would like others to categorize their declarations. These are professions of manifest culture; latent culture remains hidden. Second, Twitter is not a form of social media that is used by everyone; according to the Pew Internet Project, young adults, urbanites and African-Americans are disproportionately likely to post to Twitter. However, it’s important to note that this is exactly the population that forms the strongest constituency for the Millions March in New York City. In addition, even with the limitations I’ve just noted, the conversation on Twitter is much more expansive and inclusive than the conversation within the movement’s core organizing cadre. If we’re interested in distinctions between leaders and potential participants in a social movement, Twitter is a pretty good place to look.

Tool Postscript. For data gathering, I used the Twitter API.  For data processing, I used UCINET.  For data visualization, I used NodeXL.

A Map of Popular Connotations for 12 Social Media Sites, Winter 2014

If I say “Facebook is…,” how would you complete the sentence?

The response of any individual person to that question may be idiosyncratic, but when we look at the aggregate patterns that build up across the responses of many people, trends emerge that reflect our cultural beliefs and values regarding social media.  One convenient way to track trends is through Google Autocomplete.  When you enter a term in the Google search bar, have you ever noticed that certain suggestions appear to complete your thought automatically?

Google Autocomplete suggestions in November of 2014 for Facebook Is...

These are not random suggestions.  Rather, they reflect a weighted combination of how often different phrases appear in other Google “users’ searches and content on the web.”  Speaking in sociological terms, they are an indication of the most salient cultural associations with the phrase you’ve started typing.

In the autocompletion of “Facebook is…” that you see above, results are presented as a simple list of items, but it’s possible to obtain richer information than this. First, I’ve nabbed Google’s autocompletion lists for 12 of the most popular English-language social media platforms: Facebook, Twitter, Tumblr, LinkedIn, Vine, Flickr, MySpace, Ello, Instagram, Pinterest, Google+, and YouTube. To each platform’s name I’ve added the prompting word “is” and found up to 10 most-popular search suggestions (Some new platforms like Ello have low enough search volume to generate few results. Some other platforms have repetitive results I’ve combined — “Flickr is slow” and “Flickr is too slow” are just counted as “Flickr is slow.”). An interesting feature of these lists is commonality. Despite the rich variety and nearly endless possibility of the English language, many words to complete the phrase “_______ is…” appear on Google’s top 10 list for more than one social media platform. For instance, the phrase “______ is slow” is among the top 10 results for Facebook, Tumblr, Flickr, Pinterest and YouTube. The phrase “_______ is dead” is among the top 10 results for a full 9 out of the 12 social media platforms studied here.

To graph commonalities, I’ve created the 2-mode semantic network graph you see below. A 2-mode (or “bimodal”) graph is one in which there are two kinds of nodes indicating two different kinds of objects. In this graph, social media platforms are the first kind of node, and they are indicated in yellow. The second kind of node is a top-10 ending of the phrase “________ is” by Google autocomplete. These are color-coded pink if the phrase completions indicate negative sentiment, green if the phrase completions indicate positive sentiment, and white if there is no clear sentiment expressed with the phrase completion. For some ambiguous phrases such as “YouTube is on fire” and “Pinterest is ruining my life,” a quick browse through Google search results helps to make sentiment more clear (both of these phrases turn out to be complimentary). Finally, a line is drawn from a social media platform to a phrase if that phrase is listed in the top 10 Google autocomplete results for that social media platform.

Social Media Is... Most Common Associations of Popular Social Media Sites as Identified through Google Autocomplete
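
The structure of a 2-mode network is easy to see in code: every tie joins a node of one kind (a platform) to a node of the other kind (a phrase completion), and commonalities are simply phrases tied to more than one platform. The sketch below is a minimal Python illustration. It lists only the ties named in this post, not the full data set, and the function names are hypothetical.

```python
from collections import defaultdict

# (platform, phrase) ties mentioned in the text; the full data set
# behind the graph is not reproduced here.
ties = [
    ("Facebook", "slow"), ("Tumblr", "slow"), ("Flickr", "slow"),
    ("Pinterest", "slow"), ("YouTube", "slow"),
    ("YouTube", "on fire"), ("Pinterest", "ruining my life"),
]


def platforms_by_phrase(ties):
    """Map each phrase completion to the set of platforms it is tied to."""
    grouped = defaultdict(set)
    for platform, phrase in ties:
        grouped[phrase].add(platform)
    return grouped


def shared_phrases(ties, minimum=2):
    """Phrases that appear for at least `minimum` different platforms."""
    return {phrase: sorted(platforms)
            for phrase, platforms in platforms_by_phrase(ties).items()
            if len(platforms) >= minimum}


print(shared_phrases(ties))
```

Because no tie ever joins two platforms or two phrases directly, platforms are connected to one another only through the phrases they share.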

For the 12 social media platforms, there are 68 distinct phrase completions listed in the Google autocomplete top 10. A large majority of these phrase completions communicate clear sentiment, and a large majority of those sentiments are criticisms. Mentions of slow speed, crashes and unavailability appear common. With the exception of YouTube and Pinterest, all of the 12 social media platforms are popularly depicted as “dead” or “dying.” Predictions of doom for social media platforms appear to be a cultural universal, at least among the socially-distinct set of participants in social media and web searches. Facebook, LinkedIn, Vine, Flickr, Ello and Instagram have no positive phrases listed in their autocompletions. A strikingly positive deviation from the negative trend appears for MySpace. This finding is unintuitive, considering how far interest in MySpace has fallen since 2008. Consider the trend in Google search volume for “MySpace” from 2004-2014:

Relative Search Volume for MySpace in Google, via Google Trends, 2004 to 2014

The letters on that graph indicate influential mainstream news articles mentioning MySpace; does the lack of any articles whatsoever since 2010 hint at an explanation? Without newspaper or magazine articles promoting the MySpace network, and with hardly anyone searching for MySpace anymore, who is left but a small group of true believers in the once-great social network? The strongly positive sentiment toward MySpace in its top-10 rankings may be due to positivity in the small set of people who are still paying attention.

What other patterns do you notice in this graph of popular search completions for social media platforms? Do the autocompletions distinguish between different social media platforms, or do they unify?

Graphing #MEPolitics, the Maine Politics Twitter Network

On the social media platform Twitter, users post messages of 140 characters or fewer. Those messages can include links to web pages or communications to other Twitter accounts using the @ (“at”) sign. When a # sign is placed in front of a word in a Twitter post, the word becomes a “hashtag” and that post is added to a stream of all other posts using the same hashtag. Direct mentions and replies build pair bonds in the Twitter environment; hashtags build community.

For years, people interested in discussing Maine politics have used the #MEPolitics hashtag to broadcast, to speak and to listen. As Election Day 2014 approaches, volume of chatter on the #MEPolitics hashtag has increased. Who’s speaking most? Who is speaking to whom (and who isn’t)? What’s being talked about? To find out, I’ve gathered all posts (popularly called “Tweets”) using the #MEPolitics hashtag over the last weekend: October 24-26, 2014. The following is a graph of the resulting social network, in which each unique contributor to #MEPolitics is represented by a dot, each tie indicates that one contributor has mentioned or replied to another contributor in a Tweet, and contributors are placed closest to those in the network with whom they tend to communicate most:

Network of Twitter Posts using the #MEPolitics Hashtag from 10-24 to 10-26 2014. Ties indicate mentions or replies.

A few features of the #MEPolitics network are immediately apparent. First, nearly every one of the 603 participants in the #MEPolitics hashtag over the weekend is a communicator and not just a broadcaster; only 23 individuals posted Tweets during the period without referring to or being referred to in some way by another Twitter user (these are the loners colored light green in the lower-left of the graph). Second, most participants (565 out of 603 participants) are connected to one another either directly or indirectly in one giant conversation; the few unconnected conversations graphed in the lower-right corner are happening in small groups of 2 or 3. Third, the large conversation in which most Tweeters are participating is itself divided up into smaller clusters, in-groups whose members more frequently communicate with one another than with outsiders. These smaller clusters of conversation are color-coded in the graph above.
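
The first two features (loners versus communicators, and one giant conversation versus small side conversations) can be checked with a few lines of Python. What follows is a rough sketch under stated assumptions, not the procedure used to produce the graph above: the `tweets` list is invented sample data, the function names are hypothetical, and I assume each Tweet is available as an (author, text) pair with mentions written as @name in the text.

```python
import re

MENTION = re.compile(r"@(\w+)")

# Invented sample data as (author, text) pairs; not real #MEPolitics Tweets.
tweets = [
    ("alice", "Sample post for @bob #mepolitics"),
    ("bob", "@alice sample reply, copying @carol #mepolitics"),
    ("dave", "Sample broadcast with no mentions #mepolitics"),
    ("erin", "@frank another sample post #mepolitics"),
]


def build_mention_network(tweets):
    """Map each account to the set of accounts it is tied to by a mention or reply."""
    neighbors = {}
    for author, text in tweets:
        author = author.lower()
        neighbors.setdefault(author, set())  # keep authors who mention no one
        for mentioned in MENTION.findall(text):
            mentioned = mentioned.lower()
            if mentioned != author:  # ignore self-mentions
                neighbors[author].add(mentioned)
                neighbors.setdefault(mentioned, set()).add(author)
    return neighbors


def connected_components(neighbors):
    """Group accounts into conversations, largest first, using a depth-first search."""
    seen = set()
    components = []
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for other in neighbors[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        components.append(component)
    return sorted(components, key=len, reverse=True)


network = build_mention_network(tweets)
components = connected_components(network)
isolates = sorted(name for name, ties in network.items() if not ties)

print("participants:", len(network))
print("loners:", isolates)
print("largest conversation:", sorted(components[0]))
```

One design choice to flag: this sketch treats a mentioned account as a participant even if it never posted, and treats ties as undirected. A stricter version would keep only accounts that posted with the hashtag.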

What’s going on inside those clusters of communication? To help clarify, I’ve depicted each Maine candidate for governor or federal office not as a simple dot, but rather using their profile picture. Also rendered by their profile images are the Twitter accounts of the Democratic Party and Republican Party of Maine. We can see from the graph that independent gubernatorial candidate Eliot Cutler and independent congressional candidate Richard Murphy are, not surprisingly, located in their own unique sub-community separated from the communities of discussion surrounding the major-party candidates. Perhaps more surprisingly, conversation involving Republican candidates is not embedded in a single Twitter community, but rather split among four sets. Indeed, both Senator Susan Collins and Governor Paul LePage have two Twitter accounts each, and each of their accounts is placed in its own community. The Democratic Party and Democratic Party candidates, in contrast, are all located in the same sub-group of accounts. It is fair to say, at least in the context of Twitter communication and at least for this time period, that Maine Democrats have a more cohesive social media community than Maine Republicans.

A careful observer may notice the absence of one candidate and one party from this graph. Where is Republican congressional candidate Isaac Misiuk, for instance? Where is the Maine Green Independent Party, which is fielding a slate of 13 candidates in this cycle? The answer is that neither Misiuk nor the MGIP is included in the graph because neither participated in the #MEPolitics discussion, at least over the weekend.

Finally, there are some notable clusters of communication with non-party, non-candidate accounts at the center; these are indicated with a text label identifying the most central account of a cluster. M.E. McRider (BikinInMaine) is a conservative citizen (“Fighting the spread of the disease which is liberalism!”) who posted 130 provocative Tweets over the period, attracting 48 responses:

M.E. McRider bikinInMaine Twitter user declares Harry Reid officially a domestic enemy of the United States

On the left, blogger Bruce Bourgoine posted 46 Tweets over the weekend, a smaller number than McRider, attracting 36 responses:

Bruce Bourgoine posts a criticism of Rand Paul as a user of misinformation

The Kennebec Journal (KJ_Online) and Bangor Daily News (bangordailynews) are two Maine newspapers sitting at the center of their own circles of conversation. The Portland Press Herald, another prominent Maine newspaper, isn’t in its own independent Tweeting group; rather, its Tweets are referred to predominantly by Democratic candidates and their followers.

Of course, it’s not just the structure of the #MEPolitics network that matters; the content of discussion this weekend matters too. With Election Day just a week and a half away, what subjects in Maine politics are being talked about the most? The ten most-used hashtags in last weekend’s #MEPolitics discussion were:

Top Ten Hashtags
1. #mepolitics: 2542 uses
2. #michaud2014: 386 uses
3. #michaud: 354 uses
4. #lepage: 320 uses
5. #hillaryclinton: 302 uses
6. #mike: 288 uses
7. #eliotcutler: 278 uses
8. #cutler: 246 uses
9. #maine: 224 uses
10. #poll: 206 uses
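
A tally like the one above can be sketched in Python with a regular expression and a counter. This is an illustration under assumptions, not the tool that produced the list: `top_hashtags` is a hypothetical helper, the `sample` texts are invented, and I assume hashtags should be counted case-insensitively, with every use counted even when a tag appears twice in one post.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")


def top_hashtags(texts, n=10):
    """Count every hashtag use, ignoring case, and return the n most used."""
    counts = Counter()
    for text in texts:
        counts.update("#" + tag.lower() for tag in HASHTAG.findall(text))
    return counts.most_common(n)


# Invented sample posts; not real #MEPolitics Tweets.
sample = [
    "First sample post #MEPolitics #HillaryClinton",
    "Second sample post #mepolitics #poll",
    "Third sample post #mepolitics",
]

for rank, (tag, uses) in enumerate(top_hashtags(sample), start=1):
    print(f"{rank}. {tag}: {uses} uses")
```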

The weekend visit by Hillary Clinton on behalf of Democratic candidates and the race for Governor appear to have garnered the highest volume of attention. This pattern is borne out in a listing of the ten most linked-to web pages in #MEPolitics Tweets:

Top Ten Page Links
1. Story: Paul LePage leads polls: 45 links
2. Story: Michaud does best one-on-one: 24 links
3. Story: Hillary Clinton endorses Mike Michaud: 22 links
4. Editorial: the Governor’s race will determine health outcomes of sick Mainers: 21 links
5. Story: A retrospective on Mike Michaud’s record in the U.S. Congress: 14 links
6. Story: poll on bear baiting: 13 links
7. Video: Eliot Cutler asks Mainers to vote for someone else if he can’t win: 11 links
8. Story: Eliot Cutler benefits from out-of-state money: 11 links
9. Another Story: Eliot Cutler benefits from out-of-state money: 10 links
10. Michaud Campaign TV Ad: Cutler supporters who will vote for Mike Michaud: 10 links

Remember bear baiting? Although there are many letters to the editor being published about this controversial referendum, relatively few Twitter users are discussing the possible ban over social media. The subject of a bear baiting ban garnered only one link in the top ten links of the weekend. All other stories have to do with the race for the Blaine House.

You may notice a trend toward citing newspaper articles in the top ten link list. Let’s look at the ten most linked-to domains for a deeper look:

Top Ten Domains
1. pressherald.com: 181 links
2. bangordailynews.com: 109 links
3. youtube.com: 93 links
4. michaud2014.com: 37 links
5. centralmaine.com (Kennebec Journal): 31 links
6. conventionofstates.com: 22 links
7. blogspot.com: 16 links
8. sunjournal.com: 16 links
9. huffingtonpost.com: 14 links
10. lepage2014.com: 14 links
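
Grouping links by domain could be sketched in Python as follows. Treat this as a hedged example rather than the procedure behind the list above: `top_domains` is a hypothetical helper, the `sample_urls` are placeholders, and I assume the links have already been expanded from Twitter's shortened form into their full destination addresses. The sketch only strips a leading "www."; folding a blog's own subdomain into blogspot.com, as the list above does, would take an extra step.

```python
from collections import Counter
from urllib.parse import urlparse


def top_domains(urls, n=10):
    """Count links per host name, treating www.example.com and example.com alike."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).hostname or ""  # hostname is lowercased, port removed
        if host.startswith("www."):
            host = host[4:]
        if host:  # skip anything that did not parse as a web address
            counts[host] += 1
    return counts.most_common(n)


# Placeholder links; the paths are invented for illustration.
sample_urls = [
    "http://www.pressherald.com/placeholder-story-1",
    "http://pressherald.com/placeholder-story-2",
    "https://www.youtube.com/watch?v=PLACEHOLDER",
    "not a link",
]

for rank, (domain, links) in enumerate(top_domains(sample_urls), start=1):
    print(f"{rank}. {domain}: {links} links")
```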

Newspaper links are indeed the most popular, with the Portland Press Herald, the Bangor Daily News, the Kennebec Journal and the Lewiston Sun-Journal gaining spots in the top 10. Social media sites are also quite popular, with YouTube, Huffington Post and the blogging platform Blogspot representing the form. Campaign websites for Paul LePage and Mike Michaud make the list (notably, Eliot Cutler’s page does not). The final entrant in the top ten list of linked sources is the website conventionofstates.com, which proposes a new Constitutional Convention to amend the U.S. Constitution. Tweets mentioning this website consist almost entirely of posts made by M.E. McRider (handle @BikinInMaine) and responses to these posts.

McRider has made an impact this weekend in an otherwise election-centric week, and that impact is felt in discussion as well. Some Twitter users might elevate the salience of their favorite websites by simply posting a link again and again, a kind of anti-social behavior that some say borders on spamming. Yet McRider elicited responses as well, as evidenced by this last list of the ten most mentioned or replied-to accounts:

Top Ten Mentioned or Replied-To Accounts
1. Mike Michaud (Democratic candidate for Governor)
2. Hillary Clinton
3. Eliot Cutler (Independent candidate for Governor)
4. Maine Democratic Party
5. Amy S. Fried, University of Maine political science professor and political columnist
6. Shenna Bellows (Democratic candidate for Senate)
7. M.E. McRider
8. Paul LePage (Republican candidate for Governor)
9. Bangor Daily News
10. Randy Billings, reporter for the Portland Press Herald

Last weekend, these were the speakers closest to the center of Maine political discussion on Twitter.


Methodological note: analysis and visualization were performed using NodeXL, a free and open-source plugin for Microsoft Excel that makes social media analysis accessible to almost anyone with a computer.

Building Offline Community to study Online Community: the Social Media & Society Conference

Attending academic conferences can feel a bit like living in a retelling of Goldilocks and the Three Bears. A conference that’s too small can leave you feeling underfed. On the other hand, a conference that’s too large can be overwhelming, intimidating and even alienating. A conference on a highly particular subject may be quite useful if you select just the right one, but may be completely useless if you’re even slightly off the mark. The presentations at an overly general conference may lack those crucial connections that stimulate career-changing “aha!” insights. If you’ve been to enough conferences, you probably know what I mean.

How rare, and therefore how precious, is the conference that hits the Goldilocks sweet spot in between these distasteful extremes. The 2013 Social Media & Society International Conference was that conference for me. Gathering and connecting presentations on the causes, kinds and consequences of online social connection, #SMSociety13 managed to be more than simply the sum of its individual presentations. Researchers across diverse fields of social science, humanities, business and computer science shared distinctive approaches and concerns regarding the same substantive subject, which meant that we all had some basis for understanding but also had something to learn:

Topics of discussion at #SMSociety13, the 2013 Social Media and Society Conference

Attendance numbered in the sweetly moderate middle between a hundred and two hundred, providing a critical but collegial mass of thinkers who began conversations during one set of presentations and continued them across others. How do we bridge (or barricade) the quantitative-qualitative divide? How do we know who is “really” speaking in an online environment, and how do participants manage the online presentation of self? What are the ways in which online interaction leads to offline action? As we ran into one another again and again in various combinations, these questions carried over into the late night at a pub and over danishes in the morning, with an aggregate from far-flung places becoming a quirky community.

Photos from the 2013 Social Media and Society Conference at Dalhousie University in Halifax, Nova Scotia

The Social Media & Society International Conference meets again at Ryerson University in Toronto on September 27-28, 2014. Got a paper or panel in mind? Submit through this link: I’d love to see you there. Abstracts are due April 18. Poster proposals are due May 23.
