Blue for Boys, Pink for Girls? (Paoletti in Search Context)

I’ve been reading and enjoying Pink and Blue: Telling the Boys from the Girls in America. In the book, University of Maryland American Studies Associate Professor Jo B. Paoletti uses catalogs, sewing patterns, historical portraits, newspaper advertisements and similar media to document the emergence of pink as a color for girls’ clothing and blue as a color for boys’ clothing. Paoletti traces color preferences to changes in textile and cleansing technology, connection of local media outlets into national media networks, feedback between consumers and marketers, social mobility, changes in psychological theories of development and the reaction of new generations against the generations before.

Paoletti not only makes a case that the emergence of pink for girls and blue for boys in clothing is relatively new, but more strongly asserts that pink was often seen as a boy’s color and blue as a girl’s color in many areas of the United States early in the 20th Century. Paoletti’s “favorite primary source” for this claim in the book comes from a 1918 issue of the Chicago clothing trade magazine The Infants’ Department:

“Pink or Blue? Which is intended for boys and which for girls? This question comes from one of our readers this month, and the discussion may be of interest to others. There has been a great diversity of opinion on this subject, but the generally accepted rule is pink for the boy and blue for the girl. The reason is that pink being a more decided and stronger color, is more suitable for the boy; while blue, which is more delicate and dainty is prettier for the girl. In later years the shade of pink has been much improved. Perhaps if we had the delicate flesh tints when baby layettes were first sold, the rule might have been reversed.

“The nursery rhyme of ‘Little Boy Blue’ is responsible for the thought that blue is for boys. Stationers, too, reverse the colors, but as they sell only announcement cards and baby books, they can not be considered authorities.

“If a customer is too fussy on this subject, suggest that she blend the two colors, an effective and pretty custom which originated on the other side, and which after all is the only way of getting the laugh on the stork.”

Paoletti finds additional text sources that seem to display an inconsistent set of color choices around the turn of the century: for instance, Paoletti pairs a pink-for-girls and blue-for-boys quote from the novel Little Women with a blue-for-girls and pink-for-boys recommendation by the 1890 Ladies Home Journal (Paoletti, p. 87).

In a July 2012 letter to the Archives of Sexual Behavior, Assistant Professor of Psychology Marco Del Guidice counters Paoletti’s historical claim by searching the millions of books scanned by Google for phrases like “blue for boys,” “pink for girls,” “blue for girls” and “pink for boys.” You can perform just such a search for yourself right here, showing these results:

Pink for Girls, Blue for Boys, Blue for Girls, Pink for Boys Google Ngrams Book Search from 1820-2008

Del Guidice interprets these results as a systematic refutation of Paoletti’s “anecdotal” claims, even going so far as to term them “A Scientific Urban Legend”:

“Gender-coded references to pink and blue begin to appear around 1890 and intensify after World War II. However, all the gender-color associations found in the database conform to the familiar convention of pink for girls and blue for boys. An equivalent search of the British English corpus revealed exactly the same pattern. In other words, this massive book database contains no trace of the alleged pink-blue reversal; on the contrary, the results show remarkable consistency in gender coding over time in both the U.S. and the UK, starting from the late nineteenth century and continuing throughout the twentieth century.

“If one considers the totality of the evidence, the most parsimonious conclusion is that the Pink-Blue Reversal (PBR) as usually described never happened, and that the magazine excerpts cited in support of the PBR are anomalous or unrepresentative of the broader cultural context. Not only do the present findings run counter to the standard PBR account; they also fail to support Paoletti’s claim that pink and blue were inconsistently associated with gender until the 1950s.”

The replication of Del Guidice’s Google Books search shown above shows that his contention may be slightly too strong: the phrase “blue for girls” is not absent entirely from the corpus of books Google has scanned, but rather is present at a low level throughout the 20th Century. Indeed, around the turn of the 20th Century, there appears to have been a period in which the appearance of “blue for girls” rivaled the appearance of “pink for girls” in books. One such source is a 1920 skit in the Chicago-published journal “Public Libraries,” performed by the staff of the Pomona Public Library for the Pomona, California City Council and library board of trustees to illustrate librarians’ daily activities:

Pink for Boys and Blue for Girls in 1920 Library Skit

“Dear me! Someone wants to know what colored ribbons to use for a boy baby and what for a girl. I can never remember! But thank goodness it’s catalogued here somewhere … (Paws through catalog drawer) Oh, here it is: “Infants: Colors for boy and girl” (Returns to phone) Hello, Pink is used for boys and blue for girls … Yes … That’s right. You’re welcome.”

But while Paoletti’s claim that blue for boys and pink for girls is not dominant before World War II appears to be supported, and while some mention of “blue for girls” appears in early scanned books, Del Guidice’s more general point appears to be borne out: there is no point at which pink for girls and blue for boys is swamped by contrary mentions.

Track Social Networks… to Find the People Tracking You

As the course designer and instructor for an undergraduate social networks course at the University of Maine at Augusta, I am often asked why students should take the course. I think there are many answers to this question. One answer comes from a humanities standpoint: learning how to represent patterns in relationships with attention to meaningful visual cues can deepen understanding of design and lead to innovation in art. Culturally speaking, networks have geek appeal as sparkling and colorful objects lending panache to infographics. If critical thinking is important to you, you might be interested in network analysis for the challenge of mastering multidimensionality and matrix mathematics; as you work at network puzzles you’ll develop your logical and quantitative reasoning ability. But these appeals aren’t all: the study of social networks can be practically useful, too.

One practical use of social network analysis is highlighted by the Disconnect extension you can add to your Chrome, Firefox, Safari, or Opera internet browser…

I should break in here. Whenever you read "extension you can add to your internet browser," you should begin to get nervous. Many add-ins, add-ons, and add-arounds to your internet browsing or Facebook or Twitter experience are so colorful and fun to play with. But they have a second purpose lurking behind the colorful and fun one: to track your movement across websites so someone can sell data about where you go and what you do. But when consulting Disconnect's privacy policy, I was pleasantly surprised to discover that the Disconnect extension collects information about you only minimally and doesn't sell information to advertisers: "Disconnect never sells your personal info.... Our browser extensions don't collect any of your personal info. Unlike most websites, our site doesn’t collect your IP address."

… so as I was saying, the Disconnect extension available for most internet browsers makes use of social network analysis to share useful information about websites that let your data leak out to third parties:

If you install the Disconnect extension in your browser, then visit a website, it will create a network graph (or “sociogram”) with that website at the center, visually linked to other websites that are given data whenever you visit that site. By bringing those network graphs together for different websites, you can figure out how your personal information might be combined and how that combination might be harmful to you.

That might sound a little abstract, so let me make it concrete. Consider the mini-industry on the internet of “Print-On-Demand” apparel. On websites like CafePress, Zazzle and Skreened, you can browse through thousands of t-shirt designs made up by people like you. If you find a design you like, you can put it on a t-shirt that fits your style, order that shirt, and have it printed up and sent specially to you. The printer gets a cut of the profits, the designer gets a cut of the profits, and you get just the shirt you want.

While these print-on-demand services are offering you a service that makes them a little money, are they harvesting your data on the sly? To find out, I activated the Disconnect extension in my browser and visited the CafePress, Zazzle and Skreened websites. Disconnect produced three sociograms, which I combine to form the network graph you see below:

How the Skreened, CafePress and Zazzle websites track your visits: February 2014

The above image is current as of February 2014, and represents a change in tracking since the last time I looked at these websites in December of 2012:

Skreened, CafePress and Zazzle website tracking technology habits: December 2012

There are a number of patterns to notice. Consistently and by a wide margin, CafePress has been sending information about you to the largest number of third-party websites. Over time, on the other hand, Skreened and Zazzle (to a lesser extent) have started to catch up, sending more information about you to other companies. Those companies include Lucky Orange (“We don’t just tell you who is on your site, we show you what they are doing”), Monetate (“helping you understand your customers’ situations, behaviors and preferences”), Retention Science (“analyze & predict customer behaviors”), and Tell Apart (“If you’ve ever clicked on an ad for a pair of shoes that seem like they were made for you, Tell Apart may very well have been responsible”).

When the practices of individual websites such as CafePress, Skreened and Zazzle are combined into a network, we can find points of overlap. CafePress and Skreened send their information to three websites in common:,, and Each of these services tracks users by IP address, so that your behavior at CafePress and your behavior at Skreened can be combined: these data mining companies can bring together your behavior at CafePress and your behavior at Skreened to figure out aspects of your identity and preferences that might not be apparent if they had access to only one of the websites. All three websites send data to, leading to even more detailed insights about you. Would you be surprised to find out that also receives information about visitors from, and Would it surprise you to know that is owned by Google, bringing this overlap into even sharper focus?

Looking at simple lists of the third-party recipients of your information on a website can give you a rough sense of how leaky an individual website is. Looking at the network overlap in recipients tells you which of those recipients are likely to be learning the most about you, constructing an increasingly accurate virtual you for sale.
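To make the overlap idea concrete, here is a minimal sketch in Python of combining per-site sociograms and finding the recipients that hear from more than one site. The tracker names are hypothetical placeholders, not the recipients observed in the graphs above, and the function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sociograms: each site maps to the third parties it sends data to.
# These tracker names are placeholders, not the recipients observed in the post.
sociograms = {
    "cafepress": {"tracker_a", "tracker_b", "tracker_c", "tracker_d"},
    "skreened": {"tracker_a", "tracker_b", "tracker_e"},
    "zazzle": {"tracker_a", "tracker_f"},
}


def recipients_by_reach(graphs):
    """Map each third party to the set of sites that send it data."""
    reach = defaultdict(set)
    for site, third_parties in graphs.items():
        for third_party in third_parties:
            reach[third_party].add(site)
    return dict(reach)


def shared_recipients(graphs, min_sites=2):
    """Keep only third parties that receive data from at least min_sites sites."""
    reach = recipients_by_reach(graphs)
    return {tp: sites for tp, sites in reach.items() if len(sites) >= min_sites}


overlap = shared_recipients(sociograms)
# List the best-connected recipients first.
for third_party, sites in sorted(overlap.items(), key=lambda kv: (-len(kv[1]), kv[0])):
    print(third_party, "hears from", sorted(sites))
```

A simple list of recipients corresponds to the values of `sociograms`; the network view corresponds to `overlap`, which singles out the recipients in a position to combine what they learn from several sites.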

Don’t-Miss UMA Colloquium: Laura Rodas on Academic Integrity, 2/12/2014

The UMA RaP Colloquium Series presents

“The State of Academic Integrity at UMA”

Laura Rodas, Coordinator of Community Standards and Mediation
Wednesday, February 12, 12 noon
University of Maine at Augusta Katz Library

As part of its continuing commitment to building intellectual community, the University of Maine at Augusta holds a regular Research and Pedagogy (RaP) colloquium series at which UMA faculty and staff present works in progress to their peers. Ensuing discussion promotes collaboration through the exchange of ideas and the development of relationships across colleges, programs, departments and disciplines. When we meet to present and to learn, we discover that amidst the accumulated knowledge of the centuries, there are still new thoughts to be spoken out loud.

Academic honesty in higher education is of the utmost importance. During February 12th’s RaP session, Laura Rodas will lead discussion focused on UMA’s Academic Integrity Code and procedures, the responsibilities of faculty members, students, and the Office of the Dean of Students and the logistics of making a complaint.  Special attention will be paid to delineation of academic sanctions vs. disciplinary sanctions, repeat violations, and examples of challenging academic integrity matters.  A question and answer period with refreshments will follow.

The Research And Pedagogy program is made possible by the support of the Faculty Senate and the Office of the Provost.  If you are interested in giving a presentation at a future RAP session, please contact:

Remembering Pete Seeger 1-28-14: Collective Memory, Shared on Twitter

Activist folksinger Pete Seeger died at the age of 94 on January 27, 2014. As word of Seeger’s death spread on January 28, Twitter was flooded with tributes, including 28,226 posts made to the social media outlet’s #PeteSeeger hashtag channel by 9 PM. Of those posts, 21,617 (roughly 77%) were “re-tweets” of others’ posts. Pete Seeger wouldn’t have minded: he was a staunch believer in people forming publics to sing together, hearing a call and issuing a response, finding a tune and amplifying it not by microphones but in sheer numbers.

What did the world sing today about Pete Seeger? To answer that question, I tuned the Tweet Archivist Desktop (a handy $10 tool) to the #PeteSeeger hashtag, where it archived users’ public posts silently and efficiently in a background window on my computer. I used NodeXL (free and open-source) to find the most common word pairs in posts and to visualize them in the graphic you see below. When pairs are connected into chains and webs, the result is a semantic network that captures the spirit of the day.

Remembering Pete Seeger: a data visualization of a semantic network of the most common words and their connections in the 28,226 #PeteSeeger Twitter contributions from midnight to 9 PM on January 28 2014

In case you’re wondering, the word “communist” only appears 29 times in all those posts, far too rarely to reach the threshold required to appear in the image. “Thank” or “thanks” appears over 2,000 times.
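For readers curious about how word pairs become a semantic network, here is a minimal sketch in Python. It is not NodeXL's own procedure, which may tokenize and count differently; it simply counts adjacent word pairs and keeps those that meet a threshold. The sample posts are invented placeholders, not tweets from the archive.

```python
import re
from collections import Counter


def word_pairs(post):
    """Return the adjacent lowercase word pairs found in one post."""
    words = re.findall(r"[a-z']+", post.lower())
    return list(zip(words, words[1:]))


def common_pairs(posts, threshold):
    """Count word pairs across all posts and keep those seen at least threshold times."""
    counts = Counter()
    for post in posts:
        counts.update(word_pairs(post))
    return {pair: n for pair, n in counts.items() if n >= threshold}


# Invented placeholder posts, for illustration only.
posts = [
    "Thank you Pete Seeger",
    "RIP Pete Seeger",
    "Thank you for the songs",
]

# Each surviving pair is an edge in the network; its count is the edge weight.
edges = common_pairs(posts, threshold=2)
for (first, second), weight in sorted(edges.items()):
    print(first, "->", second, weight)
```

Pairs that share a word link up into chains, which is how a rarely used word falls below the threshold and drops out of the picture while frequent ones anchor the web.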

A Hashtag Contested: Positive and Negative Social Media Reaction to the RSA-NSA Scandal

For some time now, public relations professionals have been worrying about “the bashtag problem.” Corporations may spend years cultivating positive conversations about their products over social media by developing and promoting a hashtag, only to see “their” hashtag fall into bashtag status when negative social media posts about that organization swamp the positive posts the organization seeks. Upset that public criticism may “ruin their brand,” some corporations have developed intimidation strategies to shut up and shut down isolated critics. But when large numbers of people join in the bashtagging, there’s no easy way to stop the dissent.

Through the fall of 2013, cybersecurity corporation RSA enjoyed positive references on its #RSAC hashtag on Twitter that it had developed to advertise its annual professional conference. In late December, however, it emerged that RSA’s data encryption products had a “back door” built into them that allowed the National Security Agency (NSA) to break users’ encryption and (possibly without a warrant) snoop on private communications. On December 23, RSA issued a “non-denial” that seemed to implicitly acknowledge the arrangement. On that day, the positive flavor of the #RSAC hashtag changed.

After collecting the Twitter posts (or “tweets”) of the #RSAC hashtag using the Tweet Archivist Desktop, I’ve looked at the content of each one, determining whether its attitude toward RSA or the RSA Conference (RSAC) was positive, negative or neutral. The following graph tracks the volume of positivity, negativity and neutrality in the #RSAC hashtag from December 21 2013 through January 14 2014 (today):

Volume of Tweets Positive, Negative and Neutral Toward RSA in the #RSAC hashtag, 12/21/2013 to 1/14/2014

After an initial burst in which some prominent conference speakers canceled their participation in protest, it appeared that negative tweets regarding the RSA Conference might abate over the end-of-year holidays, and RSA began to use the channel to promote its conference again. Then, on January 7, RSA let out a teaser of a Tweet about the identity of its keynote speaker:

RSA Tweets on January 7 2014: Click here to find out who has been announced as #RSAC closing keynote speaker for 2014

That speaker is Stephen Colbert. With a celebrity drawn into the story, public attention returned, generating a new peak of critical #RSAC tweets that seems to be continuing. Some of those tweets are original, but the bulk of them constitute just a few messages, tweeted and retweeted over and over again over the #RSAC hashtag channel. Anti-surveillance social movement organization Fight For the Future has deployed a special web page

Fight for the Future asks its followers to send out automated tweets to overwhelm the #RSAC hashtag

… on which it asks its followers to share this message on Twitter: "Surveillance is no joke! Tell @StephenAtHome to cancel his keynote at this NSA tainted conference. #RSAC"

15.4% of all Tweets on the #RSAC hashtag from December 21 2013 to January 14 2014 are this one Tweet, posted over and over. Another Fight for the Future mass tweet, "Does Stephen Colbert secretly love the NSA? There's only one way to find out: #RSAC," accounts for another 2.1% of #RSAC Tweets during the period.
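A share like this can be computed by treating retweets and near-identical copies as the same message. The following is a minimal sketch in Python under that assumption; the normalization rule and the sample tweets are hypothetical, not the procedure or data used for the figures above.

```python
import re
from collections import Counter


def normalize(text):
    """Strip a leading retweet prefix, then collapse case and whitespace."""
    text = re.sub(r"^RT @\w+:\s*", "", text.strip())
    return " ".join(text.lower().split())


def message_shares(tweets):
    """Return (message, share of all tweets) pairs, most common first."""
    total = len(tweets)
    if total == 0:
        return []
    counts = Counter(normalize(t) for t in tweets)
    return [(message, n / total) for message, n in counts.most_common()]


# Invented placeholder tweets, for illustration only.
tweets = [
    "Surveillance is no joke! #RSAC",
    "RT @someone: Surveillance is no joke! #RSAC",
    "Looking forward to the conference #RSAC",
    "surveillance is no joke!  #RSAC",
]

shares = message_shares(tweets)
top_message, top_share = shares[0]
print(f"{top_share:.1%} of tweets are: {top_message}")
```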

Fight for the Future is part of a coalition of anti-surveillance groups who have announced a national day of protest on February 11. It’s called “The Day We Fight Back.” Where will the fight be? On the streets? Will there be a march? A picket? A rally in some square?

Apparently not. According to press materials, all activities will be taking place on the internet, where followers will be encouraged to share graphics on their blogs, to change their profile photos on Facebook, and to chant pre-written slogans over Twitter.

In American social movements, web banners are replacing cloth banners. Marches are giving way to orchestrated internet bashtagging. Yesterday’s gone, yesterday’s gone.

The Paradigm that isn’t in your Introduction to Sociology Text

The first chapter of the Introduction to Sociology textbook I teach with today is not very different from the first chapter of the Introduction to Sociology textbook I read as an undergraduate student in the 1980s. In text after text, there’s a nod to Marx, Weber and Durkheim (followed by a sniff at Comte). An identification of historically unrecognized founders such as Jane Addams and W.E.B. Du Bois follows. Then there’s a reference to C. Wright Mills and the “sociological imagination” before the big finish: an identification of the “big three” paradigms of sociology. These are without variation identified as functionalism, symbolic interactionism, and conflict theory.

When I made the transition to graduate school and started reading and listening to professional sociologists, I noticed immediately that the phrases “functionalism,” “symbolic interactionism” and “conflict theory” were not being used in journal articles, conferences, colloquia or seminars. When I asked my graduate advisors whether they considered themselves to be functionalists, symbolic interactionists or conflict theorists, they’d raise their eyebrows and say, “well, really I’m not any of those things.” It’s not as though functionalists, symbolic interactionists or conflict theorists never existed. Rather, these divisions were identified in the middle of the 20th Century as a handy way of summarizing the then-current fault lines of the discipline. Despite the fact that sociologists have largely moved on from these conceptual categories in their work, there seems to be a reluctance upon the part of textbook publishers to let go of the “big three.”

Some change has been creeping in. Perhaps the largest innovation over the last quarter century has been to occasionally add reference to postmodernism as an alternative fourth paradigm. Unlike the other three terms, the term “postmodernism” does make a major appearance in modern scholarship, as the following graph showing the occurrence of the paradigmatic phrases in the Google Scholar database of publications shows:

Occurrence of the Phrases Postmodernism, Conflict Theory, Functionalism and Symbolic Interactionism in the Google Scholar database from 2000 to 2013

The presence of “postmodernism” in Google Scholar search results should perhaps not be taken as an indication of the presence of “postmodernism” in the sociological literature, since postmodernism is an intellectual movement reaching far into the humanities. Similarly, the relative presence of “functionalism” may be overstated in this graph since functionalism also describes an intellectual movement in architecture and linguistics. Still, the presence of postmodernism appears considerable, and possibly explains the movement’s new inclusion in sociology texts.

Bringing the Networks In

I’ve brought this up before, but I’d like to make a current case for bringing the study of social networks into the mix of paradigms in an Introduction to Sociology course. Social network analysis is a field centered in sociology that doesn’t fit neatly into any of the three classic 20th Century paradigms identified in introductory textbooks. It isn’t macrosocial like conflict theory or functionalism (although work related to it has macrosocial implications), and while it deals with the nature of interaction, social network analysis largely eschews the study of symbols, expectations and meanings that is of central importance to symbolic interactionism. Instead, social network analysis draws from graph theory, matrix algebra and theories about groups to focus on the structure of communication and affiliation outside the individual, primarily at a micro- to meso-social level. Although some pounce on the word “analysis” to suggest that the study of social networks is only methodology, the contention that the structure of social relations represented by networks has consequences for individuals, groups and societies involves a strong and distinct image of society that creates a basis for the creation of social theory. That’s what a paradigm is. The distinctiveness and conceptual clarity of network analysis give it the potential to stand alongside symbolic interactionism, conflict theory, functionalism and postmodernism in an introduction to sociology text.

The case for social network analysis as a paradigm worth inclusion is bolstered by pure volume. Let’s add Google Scholar counts for “social network analysis,” a movement in sociological study that is largely left out of introductory sociology textbooks. In contrast to “postmodernism” and “functionalism,” the phrase “social network analysis” leads to a restrictive search, leaving out “social networks” references that don’t contain analysis and “network analysis” references that don’t feature the modifier “social.” The phrase “social network analysis” pretty much guarantees that results will fall within the social sciences and probably underestimates the actual volume of scholarship on the subject. This creates what’s called a conservative test of the presence of social network sociology. Here are the results with “social network analysis” added in:

Occurrence of Paradigmatic Phrases, including "Social Network Analysis," in Google Scholar Database from 2000 to 2013

At the turn of the 21st Century the relative presence of “social network analysis” was nothing remarkable, but for the past six years “social network analysis” has outperformed the three classic sociological paradigmatic phrases by an increasingly large margin, even when restrictively phrased. In the year 2013, “social network analysis” outperformed “postmodernism” for the first time.

Google Scholar is a very handy (and widely replicable) way of assessing the volume of scholarship on a subject, but the tool cannot easily filter by discipline. The University of Maine at Augusta Library’s physical and online collection of books and journals, on the other hand, is narrower than Google Scholar’s database but allows results to be filtered by discipline.

Social Science Publications in the UMA Library Collection Published since 2000 Featuring these Phrases...

As you can see, these results indicate the same pattern: since the year 2000, new publications in the social sciences mentioning social network analysis have strongly surpassed publications mentioning the three classic paradigms, approaching the number of publications in the social sciences for “postmodernism.” Last year, the number of new publications for “social network analysis” in the university collection surpassed those for “postmodernism” as well.

Introductory textbook authors, pick up those pens. There are a number of audacious social facts in the network paradigm worth sharing.

YouTube, Socially HalfBaked

In undergraduate courses, I often exhort students to express their ideas in measurable terms and to make sure that what they think they’re measuring and what they’re actually measuring have a reasonable connection.  This could be seen as the worry of a fussy academic, but there are real consequences to fuzzy thinking and fuzzy measurement in what some people call “the real world.”  I recently came across a “real-world” example of fuzzy research in the field of social media analytics that I’d like to share with you.   As this example shows, the use of trendy and colorful infographics can’t always bridge an information gap.

Thinking about YouTube: All Views? Views Per Video? Average Video Length?

On November 27 2013, the social media analytics company SocialBakers released a report in which it confidently declared that “Videos Under Two Minutes Generate the Most YouTube Views.” This is an ambiguous claim with at least two possible meanings:

Possible Meaning #1: If we count all YouTube views, most of the views will be of videos under two minutes long.
Possible Meaning #2: A video of less than two minutes in duration will tend to obtain more views than a longer video.

These possible meanings may sound similar, but they are substantively quite different. Meaning #1 brings to mind the saying that “most car crashes happen within a mile of home.” This may be true, but that fact doesn’t imply that driving close to home is more dangerous, because we also do most of our driving within a mile of home. In the same vein, it might be that most video views are for videos that are under two minutes long, but if most videos are under two minutes long, that’s not at all surprising.

What we really want to know if we’re driving is what locations are more risky. For every mile we drive closer to home, are we more or less likely to crash? If we’re posting YouTube videos with the hope of obtaining views, what we really want to know is whether a single short video tends to snag more views than a single medium-length video or a single extended-length video. That question is expressed in Meaning #2.

It appears from the following text that SocialBakers is interested in testing the question expressed in Meaning #2:

“Using YouTube to reach your Fan’s can be a tricky proposition. Done right, and you can create something that your audience will remember for a long time after, and will want to share with their friends. Videos have the potential to really go viral. But how long should a video be? Make it too long, and people will be yawning and looking for something more interesting to occupy their time. Make it too short, and you might risk your content being easily forgettable and your message undelivered. We did some data investigation to get to the bottom of what video length, on YouTube, will makes the biggest impact….”

Sounds straightforward, doesn’t it? But watch as SocialBakers nimbly shifts back to Meaning #1:

“To do this, we looked at the 300 most viewed channels among different industries. The first thing we noticed is that videos between 16 seconds to 120 seconds generate almost 50% of all views on YouTube. The most successful videos are almost unanimously below 2 minutes in length.”

Did you notice the shift? In the second sentence from that passage, they’re measuring the number of views for all videos and comparing it to the number of views for all videos between 16 and 120 seconds. The problem is that there may just be a whole lot of videos between 16 and 120 seconds long — if so, it’s no wonder that they account for all those views. What we need to know to figure out whether this information is useful is another piece of information: what percent of YouTube videos are between 16 seconds and 120 seconds long. If such videos make up 70% of YouTube videos, then it’s not at all impressive that they generate 50% of all views. In fact, that result would be underwhelming. If, on the other hand, such videos make up just 20% of YouTube videos, then it would be quite impressive for them to garner 50% of all views.
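The comparison in that last step can be written as a single ratio: the share of views a group of videos earns divided by the share of videos that group makes up. Here is a minimal sketch in Python using the hypothetical 70% and 20% figures from the paragraph above; the function name is made up for illustration.

```python
def representation_ratio(share_of_views, share_of_videos):
    """Divide a group's share of views by its share of videos.

    A ratio above 1 means the group earns more views than its numbers
    alone would predict; a ratio below 1 means it earns fewer.
    """
    if not 0 < share_of_videos <= 1:
        raise ValueError("share_of_videos must be in (0, 1]")
    return share_of_views / share_of_videos


# The two hypothetical scenarios: short videos earn 50% of views
# while making up either 70% or 20% of all videos.
for share_of_videos in (0.70, 0.20):
    ratio = representation_ratio(0.50, share_of_videos)
    verdict = "over-performing" if ratio > 1 else "under-performing"
    print(f"{share_of_videos:.0%} of videos, 50% of views: ratio {ratio:.2f} ({verdict})")
```

The same ratio equals the group's average views per video divided by the average views per video overall, which is why Meaning #2 cannot be judged without knowing how common videos of each length are.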

Well, what does SocialBakers actually measure? To figure this out, let’s look at the company’s slickly-produced infographics from its brief report:

SocialBakers: Videos under two minutes generate the most YouTube views

This infographic doesn’t clarify matters at all. The numbers reported are percentages, but what are they percentages of? If you look closely, you’ll notice the large-text title implies that the percentages in the graphic are percentages of views (“generate the most YouTube views”). On the other hand, the tiny text underneath the graphic tells us that what SocialBakers has calculated is the “average length of YouTube videos,” not the share of views generated by YouTube videos.

SocialBakers’ second infographic makes it clear what’s going on. Take a close look at the numbers listed below, which are labeled “Lengths of YouTube Videos”:

SocialBakers: Common Lengths of YouTube Videos

All of the counts at the top of each bar add up to 579,112 videos. Those must be counts of videos, not counts of views, because just one recent video from the top channel, PewDiePie, has gained nearly 2 million views. The number of videos of 0-15 seconds (50,505) is 8.8% of 579,112. The number of videos of 16-30 seconds (90,619) is 15.6% of 579,112. The second infographic confirms for us that the first infographic is measuring the commonality of videos of different lengths, not the share of views obtained by videos of various lengths. Those two different-looking infographics are really just sharing the same information in different layouts.

SocialBakers’ infographics don’t tell us whether a long video tends to obtain more views than a short video, because the infographics don’t measure the number of views per video. Those infographics don’t describe views at all (and there is no more data described in SocialBakers’ report to make up for this lack). Regardless, SocialBakers concludes that “Everyone Loves Short and Sweet Videos,” that “it is often far more effective to take up a small amount of viewing bandwidth in order to keep your audience entertained,” and that “you usually can’t go wrong by making sure your video is short and sweet.” Let’s not forget the title of SocialBakers’ report: “Videos Under Two Minutes Generate the Most YouTube Views.”

Check That Data… If You Can

SocialBakers’ conclusions in the headline and text of its report don’t follow at all from the information SocialBakers has presented, but the uncomfortable truth is that most people will nod their heads and accept those conclusions anyway. If video producers follow SocialBakers’ recommendations on the basis of this report alone, they do so at their peril. If you are a consumer of social media advice, it is wise for you to be in the minority who check out claims.

A more thorough way to check out claims would be to replicate SocialBakers’ study. In order to carry out a replication, however, we would need to know what SocialBakers actually did in its study. SocialBakers shares some information in its infographics: we know from those graphics, for instance, that SocialBakers studied videos in the date range of July 1 to September 23, 2013. But did it study all new videos introduced during that period? All existing videos introduced during that period? Some other quantity entirely? We don’t know. We’re also unclear about how many videos SocialBakers measured; was it “videos from the top 300 most viewed brand channels across different industries” (infographic #2) or “videos from a sample of the top 300 most viewed brand channels” (infographic #1)? What kind of sample? What industries were selected and by what standard? Since we don’t know these details, we can’t replicate SocialBakers’ study to directly test its claims. This is probably not a mistake. If SocialBakers told you exactly how to replicate its work, after all, it would be releasing a proprietary business secret. Social media consulting as a business thrives on some secrecy, unlike social research as an academic pursuit, which thrives on the sharing of technique.

What we’ll have to settle for is a more indirect replication. This indirect replication starts with SocialBakers’ central claim for video producers: that a short video will gather more views than a long video. SocialBakers can muster a stable of 230 employees. As a single busy individual, I’ll have to look at YouTube videos on a more modest scale. I can take a fairly good look nonetheless: to follow the spirit of SocialBakers’ notion, I looked at the 10 YouTube channels with the most subscribers on November 30, 2013:

1. Spotlight
2. PewDiePie
3. Smosh
4. HolaSoyGerman
5. JennaMarbles
6. RihannaVEVO
7. nigahiga
8. RayWilliamJohnson
9. OneDirectionVEVO
10. Machinima

I’ve gathered information on the length of, and number of views of, the ten most recent videos from each channel, resulting in 100 videos. This is an admittedly small set compared to that obtained by SocialBakers, but it has two advantages. First, these are the most recent successful videos by the most successful channels on YouTube, so if we are interested in emulating success, this is where we ought to look. Second, the procedure by which I obtained these measurements is “transparent,” meaning that I’ve told you exactly how it’s done. If you don’t believe my results, you can replicate my work to show me I’m wrong.

Let’s look at the results I obtained in three ways. First, we’ll look at the simple number of videos of various lengths. Because there are 100 total videos, these counts can also be read as percentages:

Number of Videos of Various Lengths (source: 10 most recent videos from each of the 10 most-subscribed YouTube video channels)

The results here are quite striking: the most common video length is not between 31 seconds and a minute, as reported in SocialBakers’ chart, but rather between 5 and 10 minutes. The ten most successful YouTube channels produce relatively lengthy videos, not short ones: only 5 out of their most recent 100 videos are of a minute or less in length, and only 9 out of the most recent 100 videos run for two minutes or less.

Second, let’s look at the raw number of views of these 100 videos:

Number of Video Views in Ranges of Different Video Lengths for the 10 most recent videos of the 10 most popular YouTube Channels

With over 1.1 billion video views, the videos between 3 minutes and 10 minutes in length clearly have the most views. However, from our first chart above we also know that videos between 3 minutes and 10 minutes in length account for the largest number of videos (72 out of 100 of them). Is the dominant presence of video views in this range due simply to the number of videos in the range? To find out, we can divide the total number of views in each length category by the total number of videos in a category. The result is the average number of views per video in a category, graphed below:

Average Number of Views per Video, by Length of Video, YouTube November 2013

Finally we can arrive at an answer to the question posed by SocialBakers: if we believe that the ten most popular video channels provide a model to emulate, and if we believe the length of a video is what drives people to view a video or not, then video producers seeking viewers would be well advised to upload videos of between 3 and 5 minutes in length. The next most advisable length for a video would be somewhere in the range of 5 to 10 minutes. Compared to the longer videos from these popular producers, videos of two minutes or less appear to be among the least popular on YouTube, not the most popular.
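If you'd like to repeat this kind of tally yourself, the three steps above (count the videos in each length range, total their views, then divide views by videos) can be sketched in a few lines of Python. This is only a rough sketch, not the spreadsheet I actually used: the bucket boundaries, the function names `bucket_for` and `summarize`, and the four sample videos are all assumptions made up for illustration, not my measurements.

```python
from collections import defaultdict

# Assumed length buckets: (upper bound in seconds, inclusive; label).
# These boundaries are illustrative, not the exact ranges in the charts.
BUCKETS = [
    (60, "0-1 min"),
    (120, "1-2 min"),
    (180, "2-3 min"),
    (300, "3-5 min"),
    (600, "5-10 min"),
    (float("inf"), "10+ min"),
]


def bucket_for(seconds):
    """Return the label of the first bucket whose upper bound covers `seconds`."""
    for upper, label in BUCKETS:
        if seconds <= upper:
            return label
    raise ValueError("length must be a number of seconds")


def summarize(videos):
    """Summarize (length_in_seconds, view_count) pairs by length bucket.

    Returns {label: {"videos": n, "views": total, "avg_views": total / n}},
    leaving out any bucket that contains no videos.
    """
    counts = defaultdict(int)
    views = defaultdict(int)
    for length, view_count in videos:
        label = bucket_for(length)
        counts[label] += 1
        views[label] += view_count
    return {
        label: {
            "videos": counts[label],
            "views": views[label],
            "avg_views": views[label] / counts[label],
        }
        for _, label in BUCKETS
        if counts[label] > 0
    }


# Made-up sample data: (length in seconds, views). Not real measurements.
sample = [(45, 1000), (200, 5000), (240, 7000), (400, 3000)]
summary = summarize(sample)
for label, row in summary.items():
    print(label, row["videos"], row["views"], row["avg_views"])
```

The point of the last column is the same as in the chart: a length range with many videos will pile up views simply because it has many videos, so only the per-video average lets you compare ranges fairly.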

Keep Asking Questions

At this point, you may have more questions than answers. For instance, are the ten most popular video channels really the model to emulate? Could they have advantages that middle-range producers can’t touch? And is it possible that the length of a video isn’t what leads people to watch, but some other feature of a video that might itself be associated with length? To answer these questions, we’d need (yes) more research. But in order to get to this second tier of questions, we need to answer our first question — and that in turn means our measurements must be able to answer our question, and that we need to be specific in describing how our measurements are made.

Congress Collects Comments on Adjunct Working Conditions: Add Yours by December 20

With every passing year, universities rely more heavily on part-time adjunct faculty to teach their courses. These faculty are not tenured, have no avenue to be tenured, and live complicated lives due to their complicated situation. The House Committee on Education and the Workforce is asking adjunct instructors to share their experience so that it may better understand the working conditions of adjunct faculty and areas for possible policy improvement. Among the questions Congress poses to adjuncts:

“For how long have you worked as a contingent faculty or instructor?

How would you describe the working conditions of contingent faculty and instructors at your college or university, including matters like compensation, benefits, opportunities for growth and advancement, job stability, and administrative and professional support?

How do those conditions help or hinder your ability to earn a living and have a stable and successful career in higher education? What impact, if any, do those working conditions have on students or higher education generally?

How do those working conditions help or hinder your ability to do your job, or how do they otherwise affect students in achieving their educational goals?”

Comments close on December 20, 2013. If you have something to say, don’t wait: make your comment through this web page.
