Call for Applications: Maine Policy Scholar Program

Are you a University of Maine at Augusta student taking classes in the 2015-2016 academic year? Are you interested in politics and/or policy?  Are you looking for a way to take your work to the next level?

The University of Maine at Augusta, continuing its association with the Maine Community Foundation, has the opportunity to nominate a Maine Policy Scholar for the 2015-2016 academic year.  The successful applicant to the Maine Policy Scholar program receives a $1,500 scholarship with a budget of $1,000 for research expenses, and is expected to carry out applied research on a real Maine policy issue.

As the UMA advisor for the program, I’ll be working throughout the coming academic year with the selected Maine Policy Scholar to help her or him develop and carry out an applied research program.  The selected student will also participate in three or four statewide meetings with faculty and scholars from across the University of Maine system for rigorous review of progress.  The year culminates in the presentation of a research memo to a board of state political leaders convened by the Maine Community Foundation. This memo has historically also been directed to a Maine political leader relevant to the subject of the student’s research, such as the Governor or the head of a state executive agency.  This is a good chance to gain valuable experience while you make a difference in Maine policy.

Applicants must be matriculated UMA students with a GPA of at least 3.00, and must have completed 60 or more credits of coursework by September 2015.  Previous work in applied research or previous study of research methods is ideal.

Are you interested?  Applications must be received by March 7.   Applications should consist of a current resume describing academic and professional experience and a letter of intent including a description of a proposed research topic. Send applications as an e-mail attachment to or by mail to James Cook, Assistant Professor of Social Science, University of Maine at Augusta, 46 University Drive, Augusta, ME 04330.

For more information on the application process or the Maine Policy Scholars program, please feel free to contact me at 621-3190 or  Additional information is also available at on the web.

Recent Maine Policy Scholars, with links to their final policy memos, are:

The University Without Walls: Splash Brings Teaching and Learning Together

The traditional model of university learning truncates students’ vision on both ends. High school students may be told that they should aspire to higher education, but unless family members are part of that tradition they may not know why.  Once admitted, undergraduates study academic subjects and are tested for signs of accomplishment, but have limited opportunities to take the next step of sharing their knowledge and skills with others.

MIT Splash 2014 scene at the registration desk in the Infinite Corridor, Memorial Lobby

In my role as a parent, I recently accompanied my ninth-grader to the Massachusetts Institute of Technology’s Splash program. Taking place every year on the weekend before Thanksgiving, Splash is a 2-day, 20-hour marathon of 1-hour classes taught by MIT undergraduates on subjects tending heavily toward the natural sciences, computer science and mathematics but also including the social sciences, humanities, arts and popular culture. Some classes are designed for attendees who are already advanced in mathematics, computer programming or other technical skills, but most classes require no prior knowledge at all. To get a full idea of the breadth of Splash, see 2014’s course catalog and its list of 618 unique classes. Any high school student may attend, and the cost of attendance is a relatively low $40 (with financial aid available).

MIT Splash Session on Systems of Voting and their functional outcomes -- November 2014

To keep Splash focused on high-schoolers and to let those high-schoolers spread their wings, parents are prohibited from attending sessions, but a separate session for parents led by Jordan Moldow ’14 was informative. Moldow’s “Behind the Scenes” presentation gave me a sense of the scale of this effort:

MIT Splash 2014 Statistics: 2500 students, 457 teachers, 618 unique classes, 30 student administrators

Put together 2500 students, 457 teachers, 30 administrators and many more volunteers and you’ve got a takeover of the MIT campus for a weekend. MIT donates space, which is helpful considering that this is a non-profit student effort. It’s also a smart move by MIT, considering that 2500 geek-minded young people every year have a chance to fall in love with the campus; you couldn’t dream of a better effort to recruit future applicants.

MIT’s Splash is not an overnight success; rather, it is the result of long, cumulative investment. MIT inaugurated its ESP (Educational Studies Program) for teens in 1957 with its High School Summer Project, and ESP launched its first Splash weekend in 1988. Splash is now in its 26th year, and has become such a phenomenon at MIT that according to Moldow, “a large portion of MIT students will at some point during their time here do something for Splash.”

The organizational effort to keep Splash running is considerable. No Splash leaders or organizers are paid; all are volunteers. Chairs, treasurers, secretaries, administrative organizers, art directors, publicity directors, website administrators and directors of teacher development form a core group that meets twice a week during the school year, once to make group decisions and once more in a work session to carry out those decisions.

New Splash teachers are cultivated every year, months before the event itself. Veteran teachers act as directors of teacher development, Moldow explained to parents. Their role is to “communicate with teachers and critique their syllabi or class descriptions. We run 3-6 teacher trainings each year to talk about what are effective methods of presenting materials to a class, to make sure students absorb information, to make sure it is entertaining and to make sure that students are engaged.” A few members of the Cambridge, Massachusetts community volunteer to teach Splash classes, but most teachers are undergraduate MIT students. By learning to teach, MIT students improve their command of the subjects they study while practicing the important skill of communicating advanced knowledge in a concise and comprehensible way. “Just as we are trying to serve students by teaching them, so we are also trying to serve teachers by helping them to become better teachers,” Moldow said.

As Splash has become more and more popular at MIT, the organizers of Splash have sought to expand the program beyond that university’s walls to involve other campuses. In 2009, Splash alumni formed Learning Unlimited, a non-profit organization that organizes an annual “SplashCon” and supports over 20 universities that have started running their own local Splashes. If students at your university are interested in starting a Splash, Learning Unlimited will bring them into a nurturing network of advisors and supply them with the software they need to make a Splash run, at no charge. Get in touch with the leadership of Learning Unlimited here to spread this model of education that so spectacularly brings down the Ivory Tower’s walls.

Before #MillionsMarchNYC, a Protest Movement Takes to Facebook and Twitter

December 12 2014 is the day before the Millions March in New York City, an organized reaction to the deaths of unarmed black men at the hands of the police and more broadly to structural forms of racial discrimination. Tomorrow, a variety of professional journalists will hopefully describe the messages and activities of the protest and reactions to this protest. Today, we can study the run-up to the Millions March by watching people talk about it on Facebook and Twitter.

Facebook, the social media website that most people know best, lets users create personal accounts and pages that they control. Administrators of a page for a group or event can allow posts by others, but they can also purge them if they find the content disagreeable. For announcing activist events, Facebook is a top-down affair. If you want to know what movement organizers think of their protest event, look at their Facebook page. The following is a word cloud taken from administrative posts to the MillionsMarchNYC Facebook page. Words are larger in this graphic if they occur more frequently:

A Wordle Word Cloud representing the frequency of various words used by organizers of the MillionsMarch in New York City on December 13 2014

We see a lot of practical information here, many references to locations and plans and logistical concerns.  This is what’s on the mind of movement leaders. What’s on the mind of the many thousands who are thinking about going?

Twitter works differently from Facebook, a website on which certain people own Pages and control those Pages’ content. On Twitter, subjects are organized by hashtags, which no one owns and no one can purge, and discussion there therefore tends to be driven from the bottom up. A corporation with an image problem on Facebook can simply delete comments. Woe betide the corporation that offends on Twitter; it may entirely lose control of the public conversation about itself. If you want to know what people are thinking about a social movement inside and outside its leadership, look at Twitter.

To do just that, I’ve gathered up all Twitter posts (“Tweets”) using the hashtag #MillionsMarchNYC. Perhaps the simplest way of characterizing #MillionsMarchNYC tweets is over time; as of 12 Noon on December 12, here’s the trend in posting volume:

Graph: Volume of Twitter Posts using the hashtag #MillionsMarchNYC through December 12, 12 Noon Eastern Time

5,415 tweets using the newly-created hashtag were posted from November 26 to December 12, but the dates November 26-30 are not even included in this graph because the number of tweets during that initial period (just 6) is minuscule in comparison to the conversation two weeks later. The trend clearly indicates a spike in use of the #MillionsMarchNYC hashtag, especially over the last few days before the march, but what ideas are associated with the spiking hashtag?
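A daily volume series like the one graphed above can be tallied with a short script. The following is only a sketch: the timestamp format and the sample values are hypothetical, and the real data came from the Twitter API.

```python
from collections import Counter
from datetime import datetime

def tweets_per_day(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Count tweets per calendar day, returned in date order.

    The timestamp format is an assumption made for this sketch.
    """
    days = Counter(datetime.strptime(ts, fmt).date() for ts in timestamps)
    return sorted(days.items())

# Hypothetical timestamps, invented for illustration only.
sample_timestamps = [
    "2014-12-11 09:15:00",
    "2014-12-11 18:40:00",
    "2014-12-12 08:05:00",
]

daily = tweets_per_day(sample_timestamps)
for day, n in daily:
    print(day, n)
```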

A useful feature of Twitter for answering that question is that a single post may contain more than one hashtag. The co-occurrence of #MillionsMarchNYC with other hashtags in the set of Nov. 30 – Dec. 12 tweets is indicated in the following frequency table:

hashtag frequency
#blacklivesmatter 2020
#icantbreathe 1360
#dec1314 1232
#nyc2palestine 832
#shutitdown 317
#ericgarner 316
#nyc 170
#stolenlives 168
#ferguson 136
#dayofanger 128
#thisstopstoday 116
#mikebrown 103
#justiceleaguenyc 91
#fromtherivertothesea 90
#washingtonsquarepark 72
#indictthesystem 71
#nyc2ferguson 66
#anonymous 62
#wecantbreathe 62
#expectus 61
#nojusticenopeace 55
#intersectionality 49
#palestine 47
#d1314 44
#12/13/14 35
#41986 35
#12/13/2014 35
#freepalestine 27
#millionsmarchsf 25
#akaigurley 22
#weekofoutrage 22
#justiceforericgarner 21
#directaction 19
#justiceformikebrown 19
#alllivesmatter 18
#jailsupport 18
#millionsmarch 17
#handsupdontshoot 16
#equalrights4all 15
#humanrightsday 15
#michaelbrown 15
#opbelgium 14
#santacon 14
#ftp 12
#icantbreath 12
#nycprotest 12
#dayofresistance 11
#dec1213 11
#love 10
#millionsmarchoakland 10
#nojusticenopeacenoracistpolice 10

In the interest of brevity, I’ve only included hashtags used at least ten times in this list.  Just three hashtags co-occur with #MillionsMarchNYC more than 1,000 times: #blacklivesmatter, #icantbreathe and #dec1314.  The tail of the distribution is long, however, with many hashtags occurring a handful of times or just once:

Frequency Distribution of Hashtag CoOccurrence with #MillionsMarchNYC from November 26 to December 12 2014
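For readers who want to try this kind of count themselves, here is a minimal Python sketch of tallying the hashtags that co-occur with a target hashtag, along with the distribution of those tallies. It is not the workflow used for this post (that relied on UCINET and NodeXL); the sample tweets are invented, and the hashtag pattern is a simplification that will miss tags containing characters such as slashes.

```python
import re
from collections import Counter

# Simplified pattern (an assumption): a hashtag is "#" plus word characters.
HASHTAG_PATTERN = re.compile(r"#(\w+)")

def cooccurring_hashtags(tweets, target):
    """Count the tweets in which each other hashtag appears with the target."""
    target = target.lower().lstrip("#")
    counts = Counter()
    for text in tweets:
        # Lowercase and de-duplicate so a tag counts once per tweet.
        tags = {tag.lower() for tag in HASHTAG_PATTERN.findall(text)}
        if target in tags:
            counts.update(tags - {target})
    return counts

# Hypothetical sample tweets, invented for illustration only.
sample_tweets = [
    "See you tomorrow #MillionsMarchNYC #BlackLivesMatter #Dec1314",
    "#millionsmarchnyc #ICantBreathe #blacklivesmatter",
    "Unrelated post #nyc",
]

counts = cooccurring_hashtags(sample_tweets, "#MillionsMarchNYC")
for tag, n in counts.most_common():
    print(f"#{tag} {n}")

# How many hashtags co-occur exactly once, twice, and so on (the long tail).
distribution = Counter(counts.values())
```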

These many hashtags do not simply co-occur with #MillionsMarchNYC in these Twitter posts, however.  They also sometimes co-occur with one another, forming a co-occurrence network that tells us something about the symbolic landscape of the leadup to this protest.

Sometimes the truth is messy; the following is a graph showing the complete co-occurrence network of hashtags used with #MillionsMarchNYC (the #MillionsMarchNYC hashtag itself is removed from the network to highlight connections between other tags).  Every hashtag is a node in this network and every co-occurrence between two hashtags appears as a tie between the two nodes.  A tie is drawn more darkly if the co-occurrence happens more often, and a node is drawn in greater size if the hashtag it represents co-occurs with a greater number of other hashtags.  Nodes are given different colors to highlight sets of nodes that are more strongly connected with one another:

Complete Co-Occurrent Network of Hashtags in #MillionsMarchNYC Twitter Posts, Nov. 26 to Dec 12

That’s pretty hard to read, isn’t it?  A few tags are evident, but there are so many that they overlap with one another, blending into a blurry mess.  The culture of a social movement can actually be a lot like that, with a large number of voices saying so many things.  But if we start to filter out the least common hashtag utterances, clearer patterns begin to emerge.

Here’s the same Twitter hashtag network, but this time just showing the hashtags for which co-occurrences happen at least 5 times:

Network of Hashtags Co-Occurring at least 5 times in the #millionsmarchnyc network on Twitter, Nov. 26-Dec. 12 2014

Here’s the same Twitter hashtag network, but this time just showing the hashtags that co-occur with some other hashtag at least 20 times:

Network of Hashtags Co-Occurring at Least 20 Times in the #millionsmarchnyc hashtag on Twitter, Nov. 26 - Dec. 12 2014

And here’s the same Twitter hashtag network, but this time just showing the hashtags that co-occur with some other hashtag at least 100 times:

Cultural Network of Hashtag Co-occurrence in the Tweets mentioning #MillionsMarchNYC from November 26 to December 12 2014

If we filter for frequency, we lose detail, but at the same time the core of this movement’s culture becomes apparent.
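The filtering step can be illustrated in plain Python. This sketch assumes each tweet has already been reduced to a set of lowercase hashtags; the function names and the sample sets are hypothetical, and the graphs in this post were actually produced with UCINET and NodeXL.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(tag_sets, exclude=frozenset()):
    """Count how often each unordered pair of hashtags shares a tweet."""
    edges = Counter()
    for tags in tag_sets:
        kept = sorted(set(tags) - set(exclude))
        for pair in combinations(kept, 2):
            edges[pair] += 1
    return edges

def filter_edges(edges, min_count):
    """Keep only the ties that occur at least min_count times."""
    return {pair: n for pair, n in edges.items() if n >= min_count}

def node_degrees(edges):
    """Number of distinct other hashtags each hashtag is tied to."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees

# Hypothetical per-tweet hashtag sets, invented for illustration only.
tag_sets = [
    {"millionsmarchnyc", "blacklivesmatter", "icantbreathe"},
    {"millionsmarchnyc", "blacklivesmatter", "icantbreathe"},
    {"millionsmarchnyc", "blacklivesmatter", "dec1314"},
]

# Drop the focal hashtag itself, as in the graphs above.
edges = cooccurrence_edges(tag_sets, exclude={"millionsmarchnyc"})
strong = filter_edges(edges, min_count=2)
degrees = node_degrees(edges)
```

Raising `min_count` plays the role of the 5, 20 and 100 thresholds used above: fewer ties survive, and the hashtags left with no ties drop out of the picture.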

Although I stand by my claim that this hashtag network indicates something about social movement culture, I should note a few important limitations.  First, the use of a hashtag involves a person deciding how they would like others to categorize their declarations.  These are professions of manifest culture; latent culture remains hidden.  Second, Twitter is not a form of social media that is used by everyone; according to the Pew Internet Project, young adults, urbanites and African-Americans are disproportionately likely to post to Twitter.  However, it’s important to note that this is exactly the population that forms the strongest constituency for the Millions March in New York City. In addition, even with the limitations I’ve just noted, the conversation on Twitter is much more expansive and inclusive than the conversation within the movement’s core organizing cadre.  If we’re interested in distinctions between leaders and potential participants in a social movement, Twitter is a pretty good place to look.

Tool Postscript. For data gathering, I used the Twitter API.  For data processing, I used UCINET.  For data visualization, I used NodeXL.

A Map of Popular Connotations for 12 Social Media Sites, Winter 2014

If I say “Facebook is…,” how would you complete the sentence?

The response of any individual person to that question may be idiosyncratic, but when we look at the aggregate patterns that build up across the responses of many people, trends emerge that reflect our cultural beliefs and values regarding social media.  One convenient way to track trends is through Google Autocomplete.  When you enter a term in the Google search bar, have you ever noticed that certain suggestions appear to complete your thought automatically?

Google Autocomplete suggestions in November of 2014 for Facebook Is...

These are not random suggestions.  Rather, they reflect a weighted combination of how often different phrases appear in other Google “users’ searches and content on the web.”  Speaking in sociological terms, they are an indication of the most salient cultural associations with the phrase you’ve started typing.

In the autocompletion of “Facebook is…” that you see above, results are presented as a simple list of items, but it’s possible to obtain richer information than this. First, I’ve nabbed Google’s autocompletion lists for 12 of the most popular English-language social media platforms: Facebook, Twitter, Tumblr, LinkedIn, Vine, Flickr, MySpace, Ello, Instagram, Pinterest, Google+, and YouTube. To each platform’s name I’ve added the prompting word “is” and found up to 10 of the most popular search suggestions. (Some new platforms like Ello have low enough search volume to generate few results. For some other platforms I’ve combined repetitive results: “Flickr is slow” and “Flickr is too slow” are just counted as “Flickr is slow.”) An interesting feature of these lists is commonality. Despite the rich variety and nearly endless possibility of the English language, many words to complete the phrase “_______ is…” appear on Google’s top 10 list for more than one social media platform. For instance, the phrase “______ is slow” is among the top 10 results for Facebook, Tumblr, Flickr, Pinterest and YouTube. The phrase “_______ is dead” is among the top 10 results for a full 9 out of the 12 social media platforms studied here.

To graph commonalities, I’ve created the 2-mode semantic network graph you see below. A 2-mode (or “bimodal”) graph is one in which there are two kinds of nodes indicating two different kinds of objects. In this graph, social media platforms are the first kind of node, and they are indicated in yellow. The second kind of node is a top-10 ending of the phrase “________ is” by Google autocomplete. These are color-coded pink if the phrase completions indicate negative sentiment, green if the phrase completions indicate positive sentiment, and white if there is no clear sentiment expressed with the phrase completion. For some ambiguous phrases such as “YouTube is on fire” and “Pinterest is ruining my life,” a quick browse through Google search results helps to make sentiment clearer (both of these phrases turn out to be complimentary). Finally, a line is drawn from a social media platform to a phrase if that phrase is listed in the top 10 Google autocomplete results for that social media platform.

Social Media Is... Most Common Associations of Popular Social Media Sites as Identified through Google Autocomplete
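The structure of a 2-mode network is easy to see in code. In this minimal sketch, each tie links a platform (first kind of node) to a phrase completion (second kind of node), and a completion counts as shared when more than one platform points to it. The handful of ties below are taken from examples mentioned in this post; they are nowhere near the full data set, and the function name is hypothetical.

```python
from collections import defaultdict

def shared_completions(ties):
    """Group platforms by completion, keeping completions with 2+ platforms."""
    platforms_by_completion = defaultdict(set)
    for platform, completion in ties:
        platforms_by_completion[completion].add(platform)
    return {
        completion: platforms
        for completion, platforms in platforms_by_completion.items()
        if len(platforms) > 1
    }

# A small subset of (platform, completion) ties mentioned in the text.
ties = [
    ("Facebook", "slow"),
    ("Tumblr", "slow"),
    ("Flickr", "slow"),
    ("Pinterest", "slow"),
    ("YouTube", "slow"),
    ("YouTube", "on fire"),
    ("Pinterest", "ruining my life"),
]

shared = shared_completions(ties)
for completion, platforms in shared.items():
    print(completion, sorted(platforms))
```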

For the 12 social media platforms, there are 68 distinct phrase completions listed in the Google autocomplete top 10. A large majority of these phrase completions communicate clear sentiment, and a large majority of those sentiments are criticisms. Mentions of slow speed, crashes and unavailability appear common. With the exception of YouTube and Pinterest, all of the 12 social media platforms are popularly depicted as “dead” or “dying.” Predictions of doom for social media platforms appear to be a cultural universal, at least among the socially-distinct set of participants in social media and web searches. Facebook, LinkedIn, Vine, Flickr, Ello and Instagram have no positive phrases listed in their autocompletions. A strikingly positive deviation from the negative trend appears for MySpace. This finding is unintuitive, considering how far interest in MySpace has fallen since 2008. Consider the trend in Google search volume for “MySpace” from 2004-2014:

Relative Search Volume for MySpace in Google, via Google Trends, 2004 to 2014

The letters on that graph indicate influential mainstream news articles mentioning MySpace; does the lack of any articles whatsoever since 2010 hint at an explanation? Without newspaper or magazine articles promoting the MySpace network, and with hardly anyone searching for MySpace anymore, who is left but a small group of true believers in the once-great social network? The strongly positive sentiment toward MySpace in its top-10 rankings may be due to positivity in the small set of people who are still paying attention.

What other patterns do you notice in this graph of popular search completions for social media platforms? Do the autocompletions distinguish between different social media platforms, or do they unify?

Gas Prices in and out of Context: Hi and Lois need a Fact Check

On October 18 2014, the comic strip Hi and Lois looked back with fondness on a time when gas prices were just 35.9 cents a gallon.  In the present day, the middle-class character Hi grimaces as he pumps gas costing $3.99 a gallon.  In a meta-analysis of existing research, social scientist Michael R. Hagerty found that people tend to view their own lives as getting better but at the same time tend to look backward in time and conclude that the lot of the average person is getting worse.  In other words, we use rose-colored glasses to view our own lives, but gray-tinted glasses to view trends in the world in general.

Hi’s view of the world is certainly tinted gray in the strip you see below, but is this pessimist funk merited?  I don’t think so; the way out of the trap of our psychological biases is to check for sociological context.  Doing that, I’d alter the Hi and Lois strip from the original into a more realistic new version:

Put Hi and Lois in Context -- are gas prices in 2014 really that bad?

Correction 1: Gas hasn’t had a price of $3.99 per gallon in the United States since July of 2008. The average price per gallon of gas in the United States was down to about $3.10 in the middle of October 2014, and prices are even lower a month later. Source: St. Louis Federal Reserve Bank Economic Research Database.

Correction 2: The last time gas cost 35.9 cents a gallon in the United States was the year 1969, but that literal price doesn’t tell the whole story; those 35.9 cents were worth a whole lot more in 1969 than they are worth today. If we adjust for inflation, paying 35.9 cents in 1969 had the same punch to our wallets as paying $2.32 today. Sources: Bureau of Labor Statistics and

Correction 3: Why do we put gasoline in cars? To go somewhere. Chance Browne forgets that the fuel efficiency of cars was far different in 1969 from the fuel efficiency we experience nowadays. In 1969, passenger cars traveled 13.6 miles on a gallon of gas, on average. In 2013, the last full year for which data is available, passenger cars traveled 36.0 miles on a gallon of gas, on average. Sources: U.S. Department of Transportation and Federal Highway Administration.

If we put all these pieces of information together, it turns out that on average and adjusting for inflation, it took 17 cents to travel a mile in a car in 1969. In contrast, it only takes 8.6 cents to travel a mile in a car today.  The depiction of gas prices as a rising social problem doesn’t match the cheaper cost of transportation today.  There may be other social problems associated with fossil fuel transportation, but economy is not one of them.  Unless Hi is driving an extra-large SUV and driving his fuel efficiency far below average, he should be smiling, not frowning.  Even and especially when trends seem obvious, it’s important to put them in context.
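The arithmetic behind those per-mile figures is simple enough to check in a few lines of Python. The inputs are the numbers quoted in the corrections above (an inflation-adjusted $2.32 per gallon and 13.6 miles per gallon for 1969, about $3.10 per gallon and 36.0 miles per gallon for today); the function name is just for illustration.

```python
def cents_per_mile(price_per_gallon_dollars, miles_per_gallon):
    """Fuel cost of driving one mile, in cents."""
    return 100 * price_per_gallon_dollars / miles_per_gallon

# 1969: inflation-adjusted price of $2.32 per gallon, 13.6 miles per gallon.
cost_1969 = cents_per_mile(2.32, 13.6)

# 2014: about $3.10 per gallon, 36.0 miles per gallon.
cost_2014 = cents_per_mile(3.10, 36.0)

print(f"1969: {cost_1969:.1f} cents per mile")  # 17.1
print(f"2014: {cost_2014:.1f} cents per mile")  # 8.6
```

By this calculation, a mile of driving today costs roughly half of what it cost in 1969.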

Talking Around The University of Maine at Augusta: A Twitter Mention Graph

Like many institutions of higher education these days, the University of Maine at Augusta communicates about its accomplishments and keeps track of the work of others using the social media service Twitter. In its communications, UMA traces the paths of the community that surrounds it.

Unlike the social media platform Facebook (oriented toward friend and family relationships) or Pinterest (devoted to the sharing of images), Twitter acts like a news clipping service of sorts. Limited to 140 characters of text, Twitter posts are like headlines in a newspaper, with links to web pages containing more information. Making headlines social, Twitter posts can mention other Twitter accounts that are relevant to the story. By tracking those mentions, we can find communities of posters who find one another’s work relevant.

To generate the social network graph you see below, I’ve searched through all Twitter posts made this year by the university’s official account, @UMAugusta, and identified all of the other Twitter accounts that @UMAugusta has mentioned. In a second step, I looked at the records of each of the Twitter accounts @UMAugusta mentioned and found out whether and how often they referred to one another. The result, formally speaking, is a level 1.5 ego network. In the graph below, Twitter accounts are indicated with labeled dots; in the parlance of social network analysis, these are called “nodes” or “vertices.” The larger a dot is in the graph, the more often it is mentioned by other Twitter accounts. Mentions between Twitter accounts are indicated with curved lines, which network analysts refer to variously as “lines,” “arcs,” “edges” or “ties.” The darker a line is, the more often mentioning occurred between two Twitter accounts.

Who Mentions Whom? A social network of mentions over Twitter surrounding @UMAugusta from January to October 2014
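The two-step procedure behind a level 1.5 ego network can be sketched in Python. This is an illustration only, not the NodeXL workflow used for the graph: the alter and outsider account names are hypothetical, and the sketch assumes mentions are already available as (source, target) pairs.

```python
from collections import Counter

def ego_network_1_5(mentions, ego):
    """Build a level 1.5 ego network from (source, target) mention pairs.

    Step 1: the alters are the accounts the ego mentions.
    Step 2: keep only mentions among the ego and its alters.
    """
    alters = {target for source, target in mentions
              if source == ego and target != ego}
    members = alters | {ego}
    edges = Counter()
    for source, target in mentions:
        if source in members and target in members and source != target:
            edges[(source, target)] += 1
    return alters, edges

# Hypothetical mention records, invented for illustration only.
mentions = [
    ("@UMAugusta", "@alter_a"),
    ("@UMAugusta", "@alter_b"),
    ("@alter_a", "@alter_b"),
    ("@alter_a", "@alter_b"),
    ("@alter_b", "@outsider"),   # dropped: @outsider is not an alter
    ("@outsider", "@alter_a"),   # dropped for the same reason
]

alters, edges = ego_network_1_5(mentions, "@UMAugusta")

# Node size in the graph corresponds to how often an account is mentioned.
times_mentioned = Counter()
for (source, target), n in edges.items():
    times_mentioned[target] += n
```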

To highlight structure in the network of mentions surrounding @UMAugusta, I identified five clusters of Twitter accounts who mentioned one another especially often. These clusters are color-coded in the network graph above. Because the identification of clusters of conversants was driven by data, not by pre-conceived notions about which accounts might “naturally” be grouped together, it is curious to see how particular clusters focus on particular domains. Some patterns:

  • The dark green cluster in the lower-right of the graph consists strongly of offices and officers connected to student life and services at the University of Maine at Augusta.
  • The dark blue cluster in the upper-left of the graph is anchored around newspapers and newspaper reporters of central and southern Maine — the Portland Press Herald, the Kennebec Journal of Augusta and the Morning Sentinel of Waterville. These three newspapers are not simply tied by geography, but are also published under the aegis of the MaineToday Media company; @centralmesports is a joint outlet of the Kennebec Journal and Morning Sentinel. Other central Maine institutions — Colby College and the Holocaust & Human Rights Center — are also featured in this cluster.
  • The light green cluster in the lower-left of the graph features strong representation in the arts, with the 5 Rivers Arts Alliance, Harlow Gallery, photographer Jill Guthrie, and The Band Apollo included.
  • Immediate substantive commonalities are elusive in the red upper-right cluster, which includes my own account, the Maine State Library, the Maine Humanities Council and the edu-metrics website NerdScholar. We are tied to one another because of our mutual communications across disciplinary boundaries.
  • The light-blue cluster at the bottom of the graph is a remainder category, consisting mostly of Twitter accounts that UMA has mentioned but that do not mention other accounts often.
  • Finally, although these clusters identify groups of accounts that communicate more often internally, connections between clusters are frequent, indicating that most of the accounts mentioned by the University of Maine at Augusta are part of a broader community.

Data mining and visualization for this graph of the @UMAugusta network were carried out using free and open source NodeXL software.

Graphing #MEPolitics, the Maine Politics Twitter Network

On the social media platform Twitter, users post messages of 140 characters or less. Those messages can include links to web pages or communications to other Twitter accounts using the @ (“at”) sign. When a # sign is placed in front of a word in a Twitter post, the word becomes a “hashtag” and that post is added to a stream of all other posts using the same hashtag. Direct mentions and replies build pair bonds in the Twitter environment; hashtags build community.

For years, people interested in discussing Maine politics have used the #MEPolitics hashtag to broadcast, to speak and to listen. As Election Day 2014 approaches, volume of chatter on the #MEPolitics hashtag has increased. Who’s speaking most? Who is speaking to whom (and who isn’t)? What’s being talked about? To find out, I’ve gathered all posts (popularly called “Tweets”) using the #MEPolitics hashtag over the last weekend: October 24-26, 2014. The following is a graph of the resulting social network, in which each unique contributor to #MEPolitics is represented by a dot, each tie indicates that one contributor has mentioned or replied to another contributor in a Tweet, and contributors are placed closest to those in the network with whom they tend to communicate most:

Network of Twitter Posts using the #MEPolitics Hashtag from 10-24 to 10-26 2014. Ties indicate mentions or replies.
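The ties in a graph like this one come from mentions and replies inside the Tweets themselves. As a rough sketch of how such ties might be extracted, the Python below pulls @-mentions out of Tweet text with a simplified pattern. The sample Tweets and account names are invented, and this is not the tool chain used to draw the graph.

```python
import re

# Simplified pattern (an assumption): "@" followed by word characters.
MENTION_PATTERN = re.compile(r"@(\w+)")

def mention_ties(tweets):
    """Turn (author, text) pairs into directed (author, mentioned) ties."""
    ties = []
    for author, text in tweets:
        for name in MENTION_PATTERN.findall(text):
            # Ignore self-mentions; compare names case-insensitively.
            if name.lower() != author.lower():
                ties.append((author.lower(), name.lower()))
    return ties

# Hypothetical Tweets, invented for illustration only.
sample = [
    ("alice", "@Bob what do you think? #mepolitics"),
    ("bob", "Agreed, @alice and @carol #mepolitics"),
    ("dave", "Just broadcasting #mepolitics"),
]

ties = mention_ties(sample)
print(ties)
```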

A few features of the #MEPolitics network are immediately apparent. First, nearly every one of the 603 participants in the #MEPolitics hashtag over the weekend is a communicator and not just a broadcaster; only 23 individuals posted Tweets during the period without referring to or being referred to in some way by another Twitter user (these are the loners colored light green in the lower-left of the graph). Second, most participants (565 out of 603 participants) are connected to one another either directly or indirectly in one giant conversation; the few unconnected conversations graphed in the lower-right corner are happening in small groups of 2 or 3. Third, the large conversation in which most Tweeters are participating is itself divided up into smaller clusters, in-groups whose members more frequently communicate with one another than with outsiders. These smaller clusters of conversation are color-coded in the graph above.
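Counts such as "565 out of 603" and "23 loners" come from splitting the network into connected components. Here is a minimal sketch of that idea using a breadth-first search; the six-node example is invented, and the actual analysis was done with network software rather than this code.

```python
from collections import defaultdict, deque

def connected_components(nodes, ties):
    """Group nodes into components, treating ties as undirected."""
    neighbors = defaultdict(set)
    for a, b in ties:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for other in neighbors[current]:
                if other not in seen:
                    seen.add(other)
                    component.add(other)
                    queue.append(other)
        components.append(component)
    # Largest component first: the "giant conversation".
    return sorted(components, key=len, reverse=True)

# Hypothetical participants and ties, invented for illustration only.
nodes = ["a", "b", "c", "d", "e", "f"]
ties = [("a", "b"), ("b", "c"), ("d", "e")]

components = connected_components(nodes, ties)
giant = components[0]
loners = [c for c in components if len(c) == 1]
```

In this toy example the giant component holds three participants, one small side conversation holds two, and one participant is a loner.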

What’s going on inside those clusters of communication? To help clarify, I’ve depicted each Maine candidate for governor or federal office not as a simple dot, but rather using their profile picture. Also rendered by their profile images are the Twitter accounts of the Democratic Party and Republican Party of Maine. We can see from the graph that independent gubernatorial candidate Eliot Cutler and independent congressional candidate Richard Murphy are, not surprisingly, located in their own unique sub-community separated from the communities of discussion surrounding the major-party candidates. Perhaps more surprisingly, conversation involving Republican candidates is not embedded in a single Twitter community, but rather split among four sets. Indeed, both Senator Susan Collins and Governor Paul LePage have two Twitter accounts each, and each of their accounts is placed in its own community. The Democratic Party and Democratic Party candidates, in contrast, are all located in the same sub-group of accounts. It is fair to say, at least in the context of Twitter communication and at least for this time period, that Maine Democrats have a more cohesive social media community than Maine Republicans.

A careful observer may notice the absence of one candidate and one party from this graph. Where is Republican congressional candidate Isaac Misiuk, for instance? Where is the Maine Green Independent Party, which is fielding a slate of 13 candidates in this cycle? The answer is that neither Misiuk nor the MGIP is included in the graph because neither participated in the #MEPolitics discussion, at least over the weekend.

Finally, there are some notable clusters of communication with non-party, non-candidate accounts at the center; these are indicated with a text label identifying the most central account of a cluster. M.E. McRider (BikinInMaine) is a conservative citizen (“Fighting the spread of the disease which is liberalism!“) who posted 130 provocative Tweets over the period, attracting 48 responses:

[Image: Tweet by M.E. McRider (BikinInMaine) declaring Harry Reid officially a domestic enemy of the United States]

On the left, blogger Bruce Bourgoine posted 46 Tweets over the weekend, a smaller number than McRider, attracting 36 responses:

[Image: Tweet by Bruce Bourgoine criticizing Rand Paul as a user of misinformation]

The Kennebec Journal (KJ_Online) and Bangor Daily News (bangordailynews) are two Maine newspapers sitting at the center of their own circles of conversation. The Portland Press Herald, another prominent Maine newspaper, isn’t in its own independent Tweeting group; rather, its Tweets are referred to predominantly by Democratic candidates and their followers.

Of course, it’s not just the structure of the #MEPolitics network that matters; the content of discussion this weekend matters too. With Election Day just a week and a half away, what subjects in Maine politics are being talked about the most? The ten most-used hashtags in last weekend’s #MEPolitics discussion were:

Top Ten Hashtags
1. #mepolitics: 2542 uses
2. #michaud2014: 386 uses
3. #michaud: 354 uses
4. #lepage: 320 uses
5. #hillaryclinton: 302 uses
6. #mike: 288 uses
7. #eliotcutler: 278 uses
8. #cutler: 246 uses
9. #maine: 224 uses
10. #poll: 206 uses

The weekend visit by Hillary Clinton on behalf of Democratic candidates and the race for Governor appear to have garnered the highest volume of attention. This pattern is borne out in a listing of the ten most linked-to web pages in #MEPolitics Tweets:

Top Ten Page Links
1. Story: Paul LePage leads polls: 45 links
2. Story: Michaud does best one-on-one: 24 links
3. Story: Hillary Clinton endorses Mike Michaud: 22 links
4. Editorial: the Governor’s race will determine health outcomes of sick Mainers: 21 links
5. Story: A retrospective on Mike Michaud’s record in the U.S. Congress: 14 links
6. Story: poll on bear baiting: 13 links
7. Video: Eliot Cutler asks Mainers to vote for someone else if he can’t win: 11 links
8. Story: Eliot Cutler benefits from out-of-state money: 11 links
9. Another Story: Eliot Cutler benefits from out-of-state money: 10 links
10. Michaud Campaign TV Ad: Cutler supporters who will vote for Mike Michaud: 10 links

Remember bear baiting? Although there are many letters to the editor being published about this controversial referendum, relatively few Twitter users are discussing the possible ban over social media. The subject of a bear baiting ban garnered only one link in the top ten links of the weekend. All other stories have to do with the race for the Blaine House.

You may notice a trend toward citing newspaper articles in the top ten link list. Let’s look at the ten most linked-to domains for a deeper look:

Top Ten Domains
1. 181 links
2. 109 links
3. 93 links
4. 37 links
5. (Kennebec Journal): 31 links
6. 22 links
7. 16 links
8. 16 links
9. 14 links
10. 14 links

Newspaper links are indeed the most popular, with the Portland Press Herald, the Bangor Daily News, the Kennebec Journal and the Lewiston Sun-Journal gaining spots in the top 10. Social media sites are also quite popular, with YouTube, Huffington Post and the blogging platform Blogspot representing the form. Campaign websites for Paul LePage and Mike Michaud make the list (notably, Eliot Cutler’s page does not). The final entrant in the top ten list of linked sources is the website, which proposes a new Constitutional Convention to amend the U.S. Constitution. Tweets mentioning this website consist almost entirely of posts made by M.E. McRider (handle @BikinInMaine) and responses to these posts.

McRider has made an impact this weekend in an otherwise election-centric week, and that impact is felt in discussion as well. Some Twitter users might elevate the salience of their favorite websites by simply posting a link again and again, a kind of anti-social behavior that some say borders on spamming. Yet McRider elicited responses as well, as evidenced by this last list of the ten most mentioned or replied-to accounts:

1. Mike Michaud (Democratic candidate for Governor)
2. Hillary Clinton
3. Eliot Cutler (Independent candidate for Governor)
4. Maine Democratic Party
5. Amy S. Fried, University of Maine political science professor and political columnist
6. Shenna Bellows (Democratic candidate for Senate)
7. M.E. McRider
8. Paul LePage (Republican candidate for Governor)
9. Bangor Daily News
10. Randy Billings, reporter for the Portland Press Herald

Last weekend, these were the speakers closest to the center of Maine political discussion on Twitter.

Methodological note: analysis and visualization were performed using NodeXL, a free and open-source plugin for Microsoft Excel that makes social media analysis accessible to almost anyone with a computer.

Convocation Remarks on the University of Maine at Augusta theme for 2014: “Innovation”

Convocation at the University of Maine at Augusta, September 19 2014

UMA Convocation Fall 2014
Framing the Theme – “Innovation”

Good afternoon.  Last spring, the UMA Faculty Colloquium Committee identified a special theme of innovation to reflect the University’s 50th anniversary. The committee asks that every member of the faculty, staff and student body read and reflect upon a book about innovation, Outliers by Malcolm Gladwell.  Look for activities throughout the year celebrating UMA’s 50 years of innovation.  As we kick off the year today, I’ve been asked to frame the theme of innovation in a few remarks.

When most of us hear the word “innovation,” we focus on the creation of something new.  But there is more to innovation than newness.  The word “innovation” comes from the Latin innovare, to renew or to make new.  What do we renew?  What do we make new?  Something that was already there.  To innovate is to make something new out of what came before.

To write a “novel” means literally to create a story that is new.  But in the introduction to her novel Frankenstein, a novel of ghastly innovation, author Mary Shelley admits stitching together her story from the science, philosophy and mythology of the day before adding her own animating spark.  “Everything must have a beginning,” Shelley writes, but “that beginning must be linked to something that went before…. Invention does not consist in creating out of void… the materials must, in the first place, be afforded.[i]”  The innovative stories we tell are based on what came before.

Every human being on Earth is a unique innovation, a Frankenstein experiment of sorts, with a genome ripped from our parents and stitched together in a brand new way.  Thanks to mutation, even identical twins don’t have exactly the same set of genes.  But neither is any human being entirely new.  We are variations on the genetic themes set by our parents, and as social scientists know we draw heavily from our environment in fashioning our public selves.  The new, innovative you is based on what came before.

The University of Maine at Augusta is itself an innovation.  Our history tells us that 50 years ago, there was no college or university in Augusta – and when UMA held its first classes on September 12 1965, it had no campus of its own.  Our first classrooms were in Cony High School, set aside for use after school hours; that’s innovative.  Our bookstore was fit into a Cony High School coat closet; that’s innovative[ii].  Even these humble beginnings were not completely new, but based on what came before: an existing school, repurposed and reimagined. In its next 50 years, UMA will rely on already existing strengths as it finds innovative new ways to fulfill its purpose.

And what is that purpose?  What is a university for?  At first glance, it may appear to some that a university is a business selling a product called a diploma to customers called students.  Once purchased, the diploma product can be redeemed by the customer for future economic profit.  Well, it certainly takes money for a person to live and for a university to run.  But is an education just another consumer purchase?  Is a university an assembly-line factory?  Are faculty here to sell?  Are students here to shop?

I think not.  We are here because we share a dream.  We dream of becoming more than we are.  We dream of remaking ourselves, putting parts of our lives that came before together with something new and adding an animating spark.  We know this dream of innovation can come true because we see it happen here every day, for some sooner, for some a bit later.  The poet Adelaide Anne Procter shares a truth we at UMA know well: if we miss our first shot at remaking ourselves, a second chance, a third chance will come.  It is never too late.  Procter writes:

“Have we not all, amid life’s petty strife,

Some pure ideal of a noble life

That once seemed possible? Did we not hear

The flutter of its wings, and feel it near,

And just within our reach? It was. And yet

We lost it in this daily jar and fret,

And now live idle in a vague regret;

But still our place is kept, and it will wait,

Ready for us to fill it, soon or late.

No star is ever lost we once have seen,

We always may be what we might have been[iii].”


This is the heart of innovation: to draw from what came before, to honor those who inspire your work today, to dream of being more than you are.

[i] Shelley, Mary. 1818.  Frankenstein, or, the Modern Prometheus.  London: Lackington, Hughes, Harding, Mavor and Jones.

[ii] Brookes, Kenneth. 1977.  The Story of the University of Maine at Augusta: The Jewett Years.  University of Maine at Augusta publication.

[iii] Procter, Adelaide Anne. 1864. “A Legend of Provence” (excerpt).  P. 191 in The Poems of Adelaide A. Procter.  Boston: Ticknor and Fields.

Finding and Extracting Variables from Web Pages with PHP: A How-to for Social Scientists in the Rough

“Data Mining”: Just Another Way for Social Scientists to Ask Questions

If social science is the study of the structure of interactions, groups and classes, and if interactions, groups and classes are increasingly tied to the online environment, then it is increasingly important for social scientists to learn how to collect data online. Fortunately, the approach to “data mining” online interaction is fundamentally the same as the approach to studying offline social interaction:

  1. We approach the subject,
  2. We query the subject, and
  3. We obtain variables based on the responses we’re given.

Because the online environment and our online subjects are different, the way we make online queries must be different from the way we make offline queries. In data mining we don’t question human beings who can flexibly interpret a question; instead, we question computers responsible for the architecture of the online social system, and they will only respond if questioned in precisely the right way.


Learning to Mine the Web for Social Data — Without a Computer Science Degree

I’ve been trying to learn how to mine social information from websites on my own, without the benefit of any formal education in computer science.  This is kind of fun even when it’s frustrating, as long as I remember that getting information from the online environment is like solving a puzzle.  On most websites, social information (relations, communications, and group memberships) is stored in a database (like XML, SQL or JSON); some content management software (like WordPress, Joomla or Drupal) takes the information stored in a database and posts it on web pages, surrounded by code that makes the information comprehensible to humans like you and me.  If websites are researcher-friendly, they allow databases to be queried directly through an Application Programming Interface (API).

Many websites don’t let a person query their databases, even when all the information published on those websites is public.  What’s a social scientist to do?  Well, we could literally read each single web page, find the information about relations, communications and group memberships we’re interested in, write down that information, and enter it into our own database for analysis.  We could do this, hypothetically, but at the practical scale of the Internet it’s often impossible.  Manually collecting interactions on a website with 10,000 participants could take years — and by the time we were done, there would be a whole new set of interactions to observe!

Fortunately, because web pages on social websites are written by computers, there are inevitably patterns in the way they’re written.  Visit a typical page on a social media website and use your browser’s “View source” command to look at the raw HTML language creating that page.  You’ll find sections that look like this:

<div class="post" postid="32"><div class="comments"><a name="comments"></a><h3>3 Comments on "Lucille's First Blog Post"</h3><div class="commentblock">
<div class="comment" id="444"><a href="/member.php?memberid="201" usertitle="Tim – click here to go to my blog"> Tim</a>: Greetings! How are you, Lucille?</div>
<div class="comment" id="445"><a href="/member.php?memberid="1181" usertitle="Lucille – click here to go to my blog"> Lucille</a>: Hey, Tim. I'm new here. How do I respond to your comment?</div>
<div class="comment" id="446"><a href="/member.php?memberid="201" usertitle="Tim – click here to go to my blog"> Tim</a>: Congratulations, Lucille, you just did!  Welcome to the community.</div>

That may look like a cluttered mess, but if you look carefully you can find important information.  Some of that information is the content that users write.  Other pieces of information track posts, comments and users by number or name. These names and numbers (such as 32, 444, 201 and Tim above) can be thought of as social science variables, and encouragingly they’re placed in predictable locations in a web page:

| variable | preceded by | followed by |
|---|---|---|
| post id | <div class="post" postid=" | "><div class="comments"> |
| comment id | <div class="comment" id=" | "><a href="/member.php? |
| member id | member.php?memberid=" | " usertitle=" |
| member name | usertitle=" |  – click here to go to my blog |

There should be a set of rules for finding these predictable locations, and my goal in data mining is to explain those rules in a computer program that automatically reads many pages on a website, much faster than I can read them.  In English, the rules would look like this:

“Find text that is preceded by [preceding text] and is followed by [following text].  This text is an instance of [variable name].”

Unfortunately, computers don’t understand English.  I am familiar with a language called PHP that can read lines of a web page.  I didn’t know of a command in PHP that would let me carry out the rule described above.  What to do?  Ask a friend.  I asked a friend of mine with a PhD in Computer Science if he could identify such a command in PHP. His answer: “Well, you don’t want to use PHP. The first thing to do is teach yourself Perl.” The Perl programming language, he went on to explain, has a much more efficient and flexible approach to handling strings as variables, and if I was going to be serious about data mining efficiently, I should use Perl.

I can’t tell you how many times some computer science expert has told me I shouldn’t follow a path because it was “inelegant” or “inefficient.”  Well, that may be wonderful advice for professional computer programmers who have to design and maintain huge information edifices, or to those who have a few extra semesters to spare in their learning quest, but in my case I say a hearty “Baloney!” to that.  Research does not need to and often cannot wait for the most efficient or elegant or masterful technique to be mastered.  Sometimes the most important thing to do is to get the darned research done.

In my case, this means that I’m going to use PHP, even though it may not be elegant or efficient or flexible or have objects to orient or [insert computer science tech phrase here].  I’m going to use PHP because I know it and it will — clumsily or not — get the darned job done.  Good enough may not be perfect but it is, by definition, good enough.  As long as the result is accurate, I can live with that.


A Rough but Ready Method for Extracting Variables from Web Pages with PHP — Explode!

It took a bit of reading through PHP’s online manual, but eventually I found a method that works for me — the “explode” command.  In what follows, I’m going to assume that you are familiar with PHP.  If you aren’t, that’s OK — you’ll just have to find another way to extract information out of a web page.

The PHP command “Explode” takes a string — a line of text in a web page — and splits it into parts.  “Explode” splits your line of text wherever a certain delimiter is found.  A delimiter is nothing more than a piece of text you want to use as a splitting point.  Let’s use an example, the web page snippet listed above:

<div class="post" postid="32"><div class="comments"><a name="comments"></a><h3>3 Comments on "Lucille's First Blog Post"</h3><div class="commentblock">

<div class="comment" id="444"><a href="/member.php?memberid="201" usertitle="Tim – click here to go to my blog"> Tim</a>: Greetings! How are you, Lucille?</div>

<div class="comment" id="445"><a href="/member.php?memberid="1181" usertitle="Lucille – click here to go to my blog"> Lucille</a>: Hey, Tim. I'm new here. How do I respond to your comment?</div>

<div class="comment" id="446"><a href="/member.php?memberid="201" usertitle="Tim – click here to go to my blog"> Tim</a>: Congratulations, Lucille, you just did! Welcome to the community.</div>
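Before pointing explode at that snippet, it may help to see what the command does to a much simpler string. What follows is only a minimal sketch: the sample string and the variable names are made up for illustration and are not part of the web page above.

```php
<?php
// A made-up sample string, used only to illustrate explode().
$sentence = 'red,green,blue';

// Split the string wherever the delimiter (a comma) appears.
// The delimiter itself is dropped from the resulting pieces.
$pieces = explode(',', $sentence);
print_r($pieces);   // three pieces: red, green, blue

// If the string begins with the delimiter, the first piece is empty.
$leading = explode(',', ',red');
print_r($leading);  // two pieces: an empty string, then red
```

The second call is worth remembering: whenever the delimiter sits at the very start of the text, the first piece returned by explode is an empty string.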


Let’s say I’d like to look through 5,000 web pages like this, representing 5,000 individual blog posts.  In each of these 5,000 web pages, the particular post id and comment ids and member ids may change, but the places where they can be found and the code surrounding them remain the same.  We’ll use the code surrounding our desired information as delimiters.

To get really specific, let’s say I’d like to extract a member id number from the above web page every place it occurs.

The first step is to find a line of the web page on which a member id number exists.  To do this, I’ll use the stristr command, which searches for text without regard to upper or lower case. The command if (stristr($line, '?memberid=')) {…} takes a look at a line of a web page ($line) and asks if it contains a certain piece of text (in this case, ?memberid=).  If the piece of text is found, then whatever commands are inside the brackets { } are executed.  If the piece of text is not found, then your computer won’t do anything.
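Here is a small sketch of that check in action, assuming the lines of the page have already been loaded into an array. The array $lines and the counter $matches are names I made up for this illustration; the two lines are adapted from the sample page.

```php
<?php
// Two lines adapted from the sample page; $lines is a hypothetical array.
$lines = [
    '<div class="post" postid="32"><div class="comments">',
    '<div class="comment" id="444"><a href="/member.php?memberid="201" usertitle="Tim – click here to go to my blog"> Tim</a>: Greetings!</div>',
];

$matches = [];
foreach ($lines as $line) {
    // stristr() returns false when the text is absent; otherwise it
    // returns the rest of the line starting at the match.
    // The comparison ignores upper and lower case.
    if (stristr($line, '?memberid=') !== false) {
        $matches[] = $line;
    }
}

echo count($matches), " line(s) contain a member id\n";
```

Comparing against false explicitly is a cautious habit rather than a requirement here; it guards against the rare case where the text returned by stristr would itself be treated as false.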

So far, we have:

if (stristr($line, '?memberid=')) { }


What goes inside the brackets?  Some exploding!  Our first line of code inside the brackets tells the computer to split a line of website code using the text memberid=" (including the double quotation mark that follows the equals sign) as the delimiter.

$cutstart = explode('memberid="', $line);

This leaves a line of website code in two pieces, with the delimiter memberid=" removed.  Those two pieces are set by the explode command to be $cutstart[0] and $cutstart[1]:

Original line of text: <div class="comment" id="444"><a href="/member.php?memberid="201" usertitle="Tim – click here to go to my blog"> Tim</a>: Greetings! How are you, Lucille?</div>

$cutstart[0]: <div class="comment" id="444"><a href="/member.php?

$cutstart[1]: 201" usertitle="Tim – click here to go to my blog"> Tim</a>: Greetings! How are you, Lucille?</div>

Where’s the member id number we want?  It’s the number right at the start of $cutstart[1], sitting just before the next double quotation mark.  (Had we split on memberid= without the quotation mark, $cutstart[1] would begin with a quotation mark and the next step would produce an empty first piece.)  To get at the number, let’s add another line of code which tells the computer to split $cutstart[1] into pieces at the spots where there are double quotation marks.  The command in the second line of code inside the brackets is:

$cutend = explode('"', $cutstart[1]);

and takes $cutstart[1] apart, turning it into the pieces $cutend[0], $cutend[1], $cutend[2] and $cutend[3] like so:

original $cutstart[1]: 201" usertitle="Tim – click here to go to my blog"> Tim</a>: Greetings! How are you, Lucille?</div>

$cutend[0]: 201

$cutend[1]:  usertitle=

$cutend[2]: Tim – click here to go to my blog

$cutend[3]: > Tim</a>: Greetings! How are you, Lucille?</div>

Which part am I interested in?  Only the member id number, and finally that’s what I’ve got in $cutend[0].  If I want, I can rename it to help me remember what I’ve got:

$memberid = $cutend[0];

Taken all together, the code looks like this.

if (stristr($line, '?memberid=')) {
    // Split the line at the text that comes right before the member id
    $cutstart = explode('memberid="', $line);
    // Split what follows at every double quotation mark
    $cutend = explode('"', $cutstart[1]);
    $memberid = $cutend[0];
}
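To see these steps run from start to finish, here is a rough sketch that applies them to every line of the sample page. It rests on a few assumptions of mine: the page is held in a string called $page instead of being downloaded, the ids are gathered in an array called $memberids, and the first delimiter is memberid=" (with the quotation mark included) so that the id lands in the first piece of the second split.

```php
<?php
// The sample page from above, stored as a string for illustration.
// On a real site this text would come from the downloaded page.
$page = <<<'HTML'
<div class="post" postid="32"><div class="comments"><a name="comments"></a><h3>3 Comments on "Lucille's First Blog Post"</h3><div class="commentblock">
<div class="comment" id="444"><a href="/member.php?memberid="201" usertitle="Tim – click here to go to my blog"> Tim</a>: Greetings! How are you, Lucille?</div>
<div class="comment" id="445"><a href="/member.php?memberid="1181" usertitle="Lucille – click here to go to my blog"> Lucille</a>: Hey, Tim. I'm new here. How do I respond to your comment?</div>
<div class="comment" id="446"><a href="/member.php?memberid="201" usertitle="Tim – click here to go to my blog"> Tim</a>: Congratulations, Lucille, you just did! Welcome to the community.</div>
HTML;

$memberids = [];

// Break the page into lines, then examine each line in turn.
foreach (explode("\n", $page) as $line) {
    if (stristr($line, '?memberid=') !== false) {
        // Everything after memberid=" starts with the id itself.
        $cutstart = explode('memberid="', $line);
        // The id ends at the next double quotation mark.
        $cutend = explode('"', $cutstart[1]);
        $memberids[] = $cutend[0];
    }
}

print_r($memberids);
```

To cover many pages, the same loop could be wrapped in an outer loop that loads one page at a time into $page.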

This may not be the most elegant or efficient solution, but it’s pretty simple — and most importantly, gosh darn it, it works.  A novice data miner like me will never get hired away by Google for basic programming like this, and if you’re a social scientist with mad programming skills you may scoff at the elementary nature of this step.  That’s OK; this isn’t written for the Google corporation or wicked-fast coders.  I wrote all this out because the code was a big step for me in becoming a better, more complete social scientist.  If you’re looking to take the same step, I hope this post helps you along.
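If you’d like to go one step further, the same two explodes can be wrapped in a small reusable function that carries out the “preceded by / followed by” rule for any variable. This is only a sketch under my own assumptions: extract_between is a hypothetical helper I’m naming here, not a built-in PHP command, and the delimiters passed to it are taken from the sample page.

```php
<?php
// Hypothetical helper (not a built-in PHP function): returns the text
// found between $before and $after in $line, or null if either is missing.
function extract_between($line, $before, $after)
{
    // Split once at the preceding text; the limit of 2 keeps the rest whole.
    $cutstart = explode($before, $line, 2);
    if (count($cutstart) < 2) {
        return null; // the preceding text was not found
    }
    // Split the remainder once at the following text.
    $cutend = explode($after, $cutstart[1], 2);
    if (count($cutend) < 2) {
        return null; // the following text was not found
    }
    return $cutend[0];
}

$line = '<div class="comment" id="444"><a href="/member.php?memberid="201" usertitle="Tim – click here to go to my blog"> Tim</a>: Greetings! How are you, Lucille?</div>';

$commentid  = extract_between($line, '<div class="comment" id="', '">');
$memberid   = extract_between($line, 'memberid="', '" usertitle="');
$membername = extract_between($line, 'usertitle="', ' – click here');

echo "$commentid $memberid $membername\n";
```

Returning null when a delimiter is missing lets the calling code skip lines that don’t contain the variable, much as the stristr check does in the step-by-step version.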

Credit goes to Tizag for helping me to understand the “explode” command a bit better. In turn, if you can think of a way for me to explain this more clearly or fully, please let me know by sharing a comment.
