This week’s lecture provides some context for last week’s assignment, in which you collected information on corporate interlocks. Why collect it? What does such information tell us? And what leads to more enduring interlocks? That last question leads us to the question of difference. It turns out that a different kind of two-mode matrix can be transformed to describe the opposite of a relationship: the existence of difference among nodes in a network.
Reconsidering Corporate Boards
One of your readings for this week is by G. William Domhoff, a political sociologist who wrote the classic Who Rules America? books. In your reading, Domhoff explains why social researchers might want to look at the overlapping memberships on different corporate boards in the United States of America. The networks created from affiliation matrices of corporate boards are called “interlocking directorates,” and for Domhoff they reveal structures of power:
“Why do people agree to become directors of two or more corporations? For most of them, it’s too easy for us to say that they make good money, anywhere from tens of thousands to a few hundred thousand per board, because those high levels of compensation are fairly recent. It’s more likely that they receive various intangibles out of it, such as prestige, information, and new connections. Based on his interviews with corporate directors, sociologist Michael Useem (1984) concluded that service on two or three corporate boards widened the horizons of top executives, which he called “business scan.” He also showed that those who sit on two or more corporate boards are more likely to be in policy-discussion groups and to receive appointments to government advisory committees (Useem, 1980). Thus, becoming a director, and then an interlocking director, can help move a person to the heart of the power structure. This finding recently has been supported by a sophisticated “small world” analysis of the interlocks among corporations, foundations, policy-discussion groups, and cultural organizations for the years 1962 to 1995 (Barnes, 2005).”
In a second reading for the week, researchers W. Gary Wagner, Jeffrey Pfeffer, and Charles A. O’Reilly III, writing in the economic sociology journal Administrative Science Quarterly, seek to understand why corporate leadership of the type you’ve just studied changes over time. In particular, they looked at 31 of the biggest U.S. corporations (drawn from the Fortune 500) from 1976 to 1980, and they discovered that individual corporate leaders who differed from their fellow leaders in age and in the number of years worked at the corporation were more likely to leave the corporation. Corporations that had greater amounts of difference in age and longevity experienced greater leadership turnover than corporations that had greater amounts of similarity.
Wagner, Pfeffer, and O’Reilly’s work (“Organizational Demography and Turnover in Top-Management Groups”) has been quite influential, cited by over 900 other research studies in organizational behavior. Why? Because it indicates that difference affects relations between individuals and the cohesiveness of organizations. In other words, it makes a strong case that differences between nodes affect the way ties form in networks.
Working with 2-mode Matrices for Social Comparison
In the following video, we consider a number of ways in which 2-mode matrices can be converted to 1-mode matrices as a way of comparing node characteristics. How are nodes similar? How are nodes different? To provide a concrete example of using 2-mode data for comparison, real-life 2-mode information from the Maine State Legislature is included. Why would we want to work with 2-mode information? One good reason is to explain variation in an important outcome. This video starts with 2-mode comparisons with which you should feel comfortable after last week (groups), and continues on to think about similarities and differences in actions and other characteristics:
In the above video, individual attributes are described in binary terms; that is, they take on a value of either “0” or “1.” But some attributes move beyond the binary. Consider age, for instance, in the following imaginary world of five people:
A 1-mode matrix describing the relations between these individuals could describe their difference in years of age:
In another comparison that goes beyond binary, consider the following 2-mode matrix of cities in Maine, including information about each city’s location in latitude and longitude:
How do these cities relate to one another in terms of location? BlueMM kindly shares a formula for converting two cities’ latitudes and longitudes into distance in miles:
Distance between City A and City B in miles = ArcCosine(Cosine(Radians(90-Latitude of City A))*Cosine(Radians(90-Latitude of City B))+Sine(Radians(90-Latitude of City A))*Sine(Radians(90-Latitude of City B))*Cosine(Radians(Longitude of City A-Longitude of City B)))*3958.756
Do I expect you to remember or use that long formula? Absolutely not! But what I would like you to remember is what formulas like this can find you: a way to describe relations between communities. In this case, you get a matrix of distances in miles between cities:
|Auburn||Augusta||Bangor||Bar Harbor||Calais||Fort Kent||Houlton||Kittery||Lewiston||Portland||Rockland|
The mathematics used above, whether simple or complicated, is nothing more than a tool like a screwdriver, a wrench or a lathe: it turns raw information into something else, something useful, something that could have real-life meaning for you. In this case, we get a picture of possible social relationships between cities: if the cities that are closest to one another have more interaction with one another, then we have an idea of who will find marriage partners where, who will make sales where, and possibly even the paths by which epidemics will spread.
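If you are curious how that tool looks in R, here is a minimal sketch of the distance formula quoted above. The function names (to_radians, distance_miles) are my own inventions for illustration, and the three sets of coordinates are approximate values I have typed in by hand as an assumption, not the figures from the table in this lecture.

```r
# Convert degrees to radians
to_radians <- function(degrees) degrees * pi / 180

# Distance in miles between two points, following the formula quoted above
# (hypothetical helper; 3958.756 is the Earth radius in miles used by that formula)
distance_miles <- function(lat_a, lon_a, lat_b, lon_b) {
  cos_angle <- cos(to_radians(90 - lat_a)) * cos(to_radians(90 - lat_b)) +
    sin(to_radians(90 - lat_a)) * sin(to_radians(90 - lat_b)) *
    cos(to_radians(lon_a - lon_b))
  # Clamp to [-1, 1] so rounding error cannot push acos() outside its domain
  acos(pmin(pmax(cos_angle, -1), 1)) * 3958.756
}

# Approximate coordinates for three Maine cities (illustrative assumption)
cities <- data.frame(
  lat = c(44.31, 44.80, 43.66),
  lon = c(-69.78, -68.77, -70.26),
  row.names = c("Augusta", "Bangor", "Portland")
)

# Apply the formula to every pair of cities to get a 1-mode distance matrix
index <- seq_len(nrow(cities))
distance_matrix <- outer(index, index, function(i, j) {
  distance_miles(cities$lat[i], cities$lon[i], cities$lat[j], cities$lon[j])
})
dimnames(distance_matrix) <- list(rownames(cities), rownames(cities))
round(distance_matrix, 1)
```

The result is a small cities-by-cities matrix with zeros (or very nearly zeros) on the diagonal, the same kind of object as the larger distance matrix shown above.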
Now You Can Do It: Calculating a Difference Matrix in R
If difference between nodes matters to networks, how can difference be measured? The answer depends on whether the difference is categorical or numerical. Let’s look at an example of each using the statistical programming language R.
Calculating Categorical Differences
Let’s imagine a network containing the following nodes, and let’s imagine that those nodes have the following characteristics:
There are two varieties of characteristics here — gender and age — and each is of a different kind. Gender (“M” for masculine-identified and “F” for feminine-identified) is a categorical characteristic in which no numbers are used and there is no scaling or ordering to differences. In other words, it doesn’t make sense to say that feminine is twice the value of masculine or that masculine is greater than feminine. Gender, at least as measured this way (and there are many ways to measure gender), simply involves categories of difference.
To create a network of similarity and difference, we can simply think of each possible category as if it were a group. A persons-by-groups affiliation matrix, in which the number 1 indicates group membership and the number 0 indicates no such membership, can then be created:
If this matrix is saved in comma-delimited format as a .csv file, which looks like this…
Node,Feminine,Masculine
Lisa,1,0
Mona,1,0
Ned,0,1
Oswald,0,1
Patricia,1,0
Quentin,0,1
Rhea,1,0
… then the following script in R (created for last week’s lecture) can import the 2-mode matrix…
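library(igraph)

# Import the 2-mode (persons-by-groups) matrix; the first column becomes the row names
affiliation_data <- read.csv(file.choose(), header = TRUE, row.names = 1)
affiliation_matrix <- as.matrix(affiliation_data)

# Build the 2-mode network, then project it into two 1-mode networks
# (newer igraph releases rename these functions to graph_from_biadjacency_matrix(),
# bipartite_projection(), and as_adjacency_matrix())
two_mode_network <- graph.incidence(affiliation_matrix)
one_mode_networks <- bipartite.projection(two_mode_network)

# proj1 is the persons-by-persons network; print it as a weighted adjacency matrix
get.adjacency(one_mode_networks$proj1, sparse = FALSE, attr = "weight")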
… and generate a 1-mode matrix in which a 1 indicates gender similarity and a 0 indicates gender difference:
This technique is not limited to gender categories; it can be used for any categorical way of describing node characteristics.
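For those who want to see what the projection is doing, here is a rough sketch in base R, without igraph. It assumes the same seven nodes as the .csv above, typed in directly so that it runs without choosing a file. The idea, as I understand it, is that multiplying the persons-by-groups matrix by its own transpose counts the groups each pair of persons shares; the variable name similarity_matrix is my own.

```r
# The persons-by-groups affiliation matrix from the .csv above, entered by hand
affiliation_matrix <- matrix(
  c(1, 0,
    1, 0,
    0, 1,
    0, 1,
    1, 0,
    0, 1,
    1, 0),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("Lisa", "Mona", "Ned", "Oswald", "Patricia", "Quentin", "Rhea"),
    c("Feminine", "Masculine")
  )
)

# Persons-by-persons matrix: each cell counts the categories two nodes share
similarity_matrix <- affiliation_matrix %*% t(affiliation_matrix)

# A node's similarity to itself is not a tie, so blank out the diagonal
diag(similarity_matrix) <- 0
similarity_matrix
```

Because each node belongs to exactly one gender category here, every cell is either 1 (same category) or 0 (different category). With a characteristic that allows several memberships at once, the cells could climb higher than 1.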
Calculating Numerical Differences
Not all characteristics are categorical. Some are numerical, like latitude, longitude, income, SAT score, and years of age. Let’s return to the seven nodes used as an example in this lecture, listing only the nodes’ names and their ages:
If we save this matrix as a .csv file, it looks like this…
Node,Age
Lisa,18
Mona,32
Ned,21
Oswald,72
Patricia,44
Quentin,83
Rhea,51
… but we can’t use the same script as above to indicate similarity and difference, because differences will be measured differently. Rather than the dichotomy of similarity (1) and difference (0), age difference can be measured in terms of the number of years by which one node is older or younger than another.
To work with a .csv containing numerical characteristics and create a matrix of differences, we can use this script:
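library(igraph)

# Import the node-by-age table; the first column (Node) becomes the row names
age_data <- read.csv(file.choose(), header = TRUE, row.names = 1)
age_matrix <- as.matrix(age_data)

# Subtract every node's age from every node's age (row node minus column node)
age_differences <- outer(age_matrix, age_matrix, FUN = "-")

# Write the result out and read it back in to recover a plain square matrix
write.csv(age_differences, file = "agedifferences.csv")
age_csv <- read.csv("agedifferences.csv", header = TRUE, row.names = 1)
age_difference_matrix <- as.matrix(age_csv)

# Label the columns with the same node names as the rows
colnames(age_difference_matrix) <- rownames(age_difference_matrix)
age_difference_matrix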
The writing and re-reading of a csv file is a bit bulky, but necessary to work around column and row name recognition issues (if you can write a more efficient script, please share it in the comments section below!). The script produces an easily readable matrix in which each cell refers to the amount by which a row node is older (positive number) or younger (negative number) than a column node.
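Here is one possible shortcut, offered as a sketch rather than the last word. It assumes the ages are pulled out as a named vector before the subtraction, in which case outer() should carry the node names into both the rows and the columns and no trip through a .csv file is needed. The ages are typed in directly so that the sketch runs without choosing a file.

```r
# Inline stand-in for the .csv above, so the sketch runs without file.choose()
age_data <- data.frame(
  Age = c(18, 32, 21, 72, 44, 83, 51),
  row.names = c("Lisa", "Mona", "Ned", "Oswald", "Patricia", "Quentin", "Rhea")
)

# Pull the ages out as a named vector (the names come from the row names)
ages <- setNames(age_data$Age, rownames(age_data))

# Row node's age minus column node's age, for every pair of nodes
age_difference_matrix <- outer(ages, ages, FUN = "-")
age_difference_matrix
```

As before, a positive cell means the row node is older than the column node, and a negative cell means the row node is younger.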
We could, hypothetically, create a network graph of these differences in which the plotted tie strength would be equal to the age difference between nodes. That is a kind of relation. To do so, we’d add the following commands to the end of our R script:
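# Build a directed, weighted graph from the difference matrix, ignoring the diagonal
# (newer igraph releases call this function graph_from_adjacency_matrix())
age_difference_graph <- graph.adjacency(age_difference_matrix, weighted = TRUE, diag = FALSE)

# Label each tie with its age difference and curve the edges so both directions show
plot(
  age_difference_graph,
  edge.label = round(E(age_difference_graph)$weight, 3),
  edge.curved = 0.1
)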
But as you can see, even with curved edges to allow direction of difference to become apparent, the graph is very busy, too busy:
Why? Because every node has age, and because the relation is difference with direction, every pair of nodes has two ties to plot representing two versions of the age difference between them. Even if we simplified the network to represent the absolute value of the difference, the network would still be absolutely full of ties. A matrix, in this case, is much easier to read and interpret.
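To check that claim for yourself, here is a small sketch in base R that counts the ties in both versions. It assumes the seven ages from the .csv above, typed in directly, and the variable names are my own.

```r
# The seven nodes and their ages, as a named vector
ages <- c(Lisa = 18, Mona = 32, Ned = 21, Oswald = 72,
          Patricia = 44, Quentin = 83, Rhea = 51)

# Directed differences (row minus column) and their absolute values
directed_differences <- outer(ages, ages, FUN = "-")
absolute_differences <- abs(directed_differences)

# Directed version: every ordered pair with a nonzero difference is a tie
directed_ties <- sum(directed_differences != 0)

# Undirected version: count each pair once, using the upper triangle only
undirected_ties <- sum(absolute_differences[upper.tri(absolute_differences)] != 0)

# The number of distinct pairs that could possibly be tied
possible_pairs <- choose(length(ages), 2)

c(directed = directed_ties, undirected = undirected_ties, possible = possible_pairs)
```

Because no two of these nodes share an age, every possible pair ends up tied: taking absolute values halves the number of ties to draw, but the network is still completely full.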
So What? Stay Tuned…
Why would we bother calculating how different two people in a network might be? Wagner and his colleagues already have provided one answer: differences can split groups apart. Another line of research focuses on the opposite trend, called homophily: in networks, similarity tends to bring people together. We’ll look at homophily, and other general patterns in social networks, next week.
Accessing Articles via UMA Libraries
One of this week’s readings is not presented to you in the syllabus as either a direct link or a passage that you could find in one of your textbooks. Instead, it is a reading that can only be accessed by using the password-protected databases of the University of Maine online library system:
Using “the password-protected databases of the University of Maine online library system” might sound complicated and daunting, but I promise that it isn’t. This video was produced for the Spring 2014 version of this course, but the techniques for obtaining one of our readings, demonstrated step by step, still work:
After you’ve gone through the process once, it’s much easier the second time. Some day when you have an extra moment, I encourage you to poke around in the University library system. It’s amazing how much information is at your fingertips — and available to you for free if you are a registered University student.
Class Participation: In The Midst of the Storm, Breathe
Even if we hadn’t just had a major storm, we’d be at the time of the semester that presents perhaps the greatest strain for undergraduate students: the semester seems like it has always been going on, like it will never end, and like it presents an insurmountable hill of work. On top of that, many of you are without power or heat in the wake of our powerful storm here in Maine and understandably need to take care of life necessities. For that reason, during this hardest week of the semester, while you’re being asked to do so much in so many classes, your class participation exercise for the week is to breathe. Find your favorite hillside, the most comfortable nook in your apartment, whatever your favorite place is to simply be. Then close your eyes, count to three, and… breathe.
Review this lecture as needed, nail down this week’s readings, and take care of yourself and your loved ones. I’ll see you next week.