SPECIAL NOTE: THE FOLLOWING INSTRUCTIONS ASSUME THAT YOU WORK WITH A WINDOWS COMPUTER, AS MOST STUDENTS DO. IF YOU WORK WITH AN APPLE MACINTOSH COMPUTER, PLEASE SEE THE SPECIAL INSTRUCTIONS IN THIS LECTURE.
Two weeks ago, we described a number of patterns that can be discerned in network structure, including those of paths, distance, geodesics, degree, betweenness centrality, closeness centrality, and density. Most of these ways of measuring a network are quite simple in the abstract, involving nothing more than counting and careful arithmetic. Nevertheless, human calculations are prone to error, and even the simple act of counting ties becomes difficult, if not practically impossible, when a social network is large.
For that reason, having introduced the concepts of network structure, we’re going to begin learning a computer program called R to carry out network measurements for us. This week, I’m asking you to install R on your computer (any Linux, Windows or Apple computer should do; a smartphone will not suffice) and go through the process of writing your own R program to create a network graph from an edge list. In your homework for this week, you’ll be using R to represent the family network that you’ve been working on in past weeks.
We have a reading for the week by John M. Quick entitled “R Tutorial Series: Introduction to the R Project for Statistical Computing,” available online at http://rtutorialseries.blogspot.com/2009/10/r-tutorial-series-introduction-to-r_11.html. Usually, I encourage students to complete their readings before reading a lecture. This week, however, I’d encourage you to read this lecture before reading Quick’s introduction. Quick’s writing is meant as a reference guide for you if you get stuck on the practical steps of R installation and use. This lecture, in contrast, is a step-by-step guide to everything you need to do to finish the week’s homework.
This week is a technical week with technical work. For that reason, I’d like you to focus on the completion of the technical task represented in this week’s homework assignment, Homework #4. To give you the time and space to do that, we’ll hold off on lecture participation for the week. You WILL be expected to complete Homework Assignment #4, and that homework will be graded, but there will be NO assigned lecture participation for you, and we’ll simply skip a participation grade for this week.
Installing R on Your Computer (Windows)
What is R?
R (with an online headquarters at http://r-project.org) is a piece of software that is absolutely free of charge and is nevertheless capable of a wide variety of research tasks. Many of the statisticians, methodologists and programmers who designed R were reacting to closed-off, expensive, for-profit statistical packages such as Stata, SAS and SPSS that could cost thousands of dollars per copy to purchase and that were slow to incorporate new ways of researching and describing the world. From the beginning, R has been designed to be both flexible and open to innovation; as social and natural scientists develop new techniques, they are free to write packages, which are extensions to the software that add new capabilities. This week, you’ll be installing R, installing a package called igraph (website at igraph.org), and using R and igraph to create a network graph from an edge list.
Installing R: A Walkthrough
The video walks through the process of installing R. To install R, you’ll need to have a computer available (if you do not have one, the University of Maine at Augusta’s student lab computers will do, although you’ll have to freshly install R every time you use one, a process that takes a few minutes). To find the right download file, you have to navigate through the potentially confusing text overload of the R Project website at http://r-project.org, but once you’ve made it through the selection of a site and the appropriate “base” download, the rest of installation is a breeze. This video shows the installation procedure for a Windows computer, but it’s also possible to install R for a Linux or Apple computer using much the same technique:
Running R on Your Computer
The R Console Window and the R Script Window
When you start working with R, you’ll find that you most often open two windows: the Console window and the Script window. The Console window is usually the one you’ll find already open when you start the R program:
You can type in commands (pieces of text that tell R to know or do something) inside the Console window, and your commands should appear next to the red “>” sign. In order to type commands in the Console window, it has to be selected. All you have to do to select the Console window is hover your computer’s cursor over the Console window and click.
Console commands can be easily entered one by one, but they’re not saved when you exit the program, and if you need to run your commands multiple times, you’ll need to type them in over and over again. At first, that’s not a big problem, but over time it can get to be quite a hassle. To reuse our commands to R again and again, we’ll save them in “scripts.” A script is nothing more than a sequence of commands with one command written per line. Scripts appear in the Script window, which you can open by selecting the “File” menu option at the top of the R screen, then selecting “New script.”
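To make the idea of a script concrete, here is a minimal sketch of what one might contain. The object names (siblings, cousins, total) and the numbers are invented purely for illustration; they are not part of the homework.

```r
# A tiny example script: one command per line.
# The object names and values below are made up for illustration.
siblings <- 3                   # store the number 3 under the name "siblings"
cousins <- 7                    # store the number 7 under the name "cousins"
total <- siblings + cousins     # add the two stored numbers together
print(total)                    # show the result in the Console window
```

Typed into the Console window, these four commands would each have to be re-entered every time. Saved as a script, they can be run again whenever you like.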
If that seems a bit abstract, why don’t we make the difference between the Console window and the Script window clearer by working through an example? Play the following video, and as you do so, run the R program on your computer and type in commands to follow along:
When scripts are saved, it’s very important to know where you’ve saved those scripts! As the video points out, you should be sure to pay attention to where your “working directory” is. Your “working directory” is the folder on your computer where R saves your scripts by default. To choose your working directory on your computer, be sure the Console window is selected, then click “File” and “Change dir…” in the drop-down menu at the top of the R program.
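The working directory can also be checked and changed with typed commands rather than the menu. The sketch below uses R’s built-in getwd() and setwd() functions. The folder it switches to, tempdir(), is only a stand-in that exists on every computer; in practice you would substitute the path to a folder of your own.

```r
# Ask R which folder it is currently using as the working directory.
original_dir <- getwd()
print(original_dir)

# Switch to a different folder. tempdir() is a placeholder used only
# because it exists on every computer; replace it with your own folder.
setwd(tempdir())
print(getwd())

# Switch back to the folder we started in.
setwd(original_dir)
```

On a Windows computer, folder paths typed into setwd() should use forward slashes (/) or doubled backslashes (\\) rather than single backslashes.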
Installing and Running R and igraph on a Macintosh Computer
The research program R and the R package for social network analysis called igraph don’t install and run in exactly the same way on an Apple Macintosh computer as they do on a Windows computer. The following video shows you how to install R and run a script that uses the igraph package when you’re working on a Mac. There are a few options you’ll need to click when installing igraph, and there’s a very simple step that differs from the typical Windows way of running a script. First, select all the text in your script. THEN, and only then, use the “Execute” command. That’ll make it work, in a slightly different way than on a standard Windows computer. Got a Mac? Watch:
Installing igraph on a Mac is a bit different than in Windows. You would follow these steps on a Mac:
1. From the Packages & Data menu, select Package Installer.
2. Click the Get List button to display the available set of packages.
3. After clicking the Get List button, use the search box to show only packages that match the name you are looking for: igraph.
4. Select igraph and press the Install Selected button.
In the Mac OS X version of R, in the Packages & Data menu there should be a Package Manager that shows each package’s current status. If you can’t get igraph to work, troubleshoot: what does your Package Manager say igraph’s current status is? Is igraph listed among the packages? If not, try installing it again and see what happens.
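On either a Mac or a Windows computer, you can also ask R directly whether igraph is installed. This is a minimal sketch using R’s built-in requireNamespace() function; the object name igraph_installed is made up for illustration. The install.packages() line is left commented out so that nothing is downloaded unless you choose to run it.

```r
# TRUE if the igraph package is installed on this computer, FALSE if not.
igraph_installed <- requireNamespace("igraph", quietly = TRUE)
print(igraph_installed)

# If the answer is FALSE, igraph can be installed by typed command
# instead of through the menus. Remove the "#" at the start of the
# next line to run it (an internet connection is required).
# install.packages("igraph")
```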
For both the Mac and Windows environments, if you end up having trouble, post a comment here and I’ll do what I can to lend a helping hand!
Creating Social Network Graphs in R with igraph
You’ll be doing amazing network analysis with R by the end of the semester. But for now, let’s just focus on getting to the simplest of network graphs. The video below is a very basic introduction to using R together with the package igraph to take a social network, describe that network in the form of an edge list, and generate an image of the network as a graph.
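In script form, the process of going from an edge list to a graph might look something like the sketch below. It assumes igraph is already installed. The names in the edge list are invented, and graph_from_edgelist() is only one of several igraph functions that can read an edge list, so the commands shown in the video may differ.

```r
library(igraph)   # load the igraph package (it must already be installed)

# A made-up family edge list: each row is one tie between two people.
family_edges <- matrix(
  c("Ann",  "Bob",
    "Ann",  "Cara",
    "Bob",  "Cara",
    "Cara", "Dev"),
  ncol = 2, byrow = TRUE
)

# Turn the edge list into an undirected network object.
family_net <- graph_from_edgelist(family_edges, directed = FALSE)

# Draw the network, labeling each node with the person's name.
plot(family_net, vertex.label = V(family_net)$name)
```

Each row of the edge list names the two people joined by one tie, so a network with four ties needs four rows.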
If you follow the steps indicated in the video above, you’ll be ready for Homework #4, which is due on October 8. The steps of Homework #4 are:
- Use R to create an accurate network image of the family network you created for Homework #3. Include node labels for each family member.
- Save an image file of the network you created in R to your computer.
- Upload the image to the appropriate Blackboard Homework Assignments section labeled “Homework #4: R Family.” To upload that image, click on the title “Homework #4: R Family,” then look for the area to upload your work (subtitled “Attach File”). Be sure to click “Submit!”
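One possible way to save the image by typed command, rather than through the menus, is sketched below. It assumes igraph is installed. The tiny network and the file name family_network.png are hypothetical, and this may not be the saving method shown in the video. The png() and dev.off() functions are part of R itself.

```r
library(igraph)   # load the igraph package (it must already be installed)

# A hypothetical two-tie network, used here only to have something to draw.
small_net <- graph_from_edgelist(
  matrix(c("Ann", "Bob",
           "Bob", "Cara"), ncol = 2, byrow = TRUE),
  directed = FALSE
)

# Open a PNG image file in the working directory, draw the network
# into it, then close the file so that the image is written to disk.
png("family_network.png", width = 800, height = 800)
plot(small_net, vertex.label = V(small_net)$name)
dev.off()

# Show the full location of the saved image file.
print(file.path(getwd(), "family_network.png"))
```

The image lands in your working directory, which is one more reason to know where that folder is before you go looking for the file to upload.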
Good luck on your homework this week, a small but crucial step toward the remainder of the semester, which will be based in network analysis through programming. If you have any questions as we move forward, please let me know; share a comment in the comments box below and I’ll respond as promptly and completely as I can.