Extracting text with R in a Tidy fashion

Gathering data to compare PhD programs in Statistics

Entacmaea quadricolor

When looking at PhD programs, it can be difficult to find reliable resources that let you compare and contrast programs en masse. For such an important decision, it seems strange that the expectation is to hunt through university websites one by one for program data, and stranger still that most programs do not publicly share much of their information on admissions and graduation.

There were a few resources I wanted to pool together to get a better picture of the programs, and thus began this minor project.

By the end, the goal is to have two data frames, one for Statistics PhD programs and another for Biostatistics PhD programs, each with scraped program size, USNews ranking, and presence of a consulting center.

library(pdftools)
library(tidyverse)
library(magrittr)
# OR load all of Tidyverse at once:
# tidyverse_packages() %>% lapply(library, character.only = TRUE)

To start off, let's pull some information from the ASA Statistics and Biostatistics Degree Data: specifically, the list of universities granting PhDs in Statistics and Biostatistics and how many PhDs each granted between 2003 and 2018.

# URL locations
stat.phdnums.loc <-
  "https://ww2.amstat.org/misc/StatsPhD2003-MostRecent.pdf"

biostat.phdnums.loc <-
  "https://ww2.amstat.org/misc/BiostatsPhD2003-MostRecent.pdf"

# Reading in the PDFs (pdf_text() returns one string per page)
stat.phdnums <-
  pdf_text(stat.phdnums.loc) %>%
  read_lines()

biostat.phdnums <-
  pdf_text(biostat.phdnums.loc) %>%
  read_lines()

Here we can see the raw format the data is in.

stat.phdnums
##  [1] "      All data from National Center for Education Statistics and retrieved by Steve Pierson (pierson@amstat.org). The data here includes biostatistics degrees as"
##  [2] "                     categorized by the CIP Code 27.05. For more information on degrees included, see http://community.amstat.org/blogs/steve-"                   
##  [3] "                                                       pierson/2014/07/28/categorization-of-statistics-degrees."                                                  
##  [4] "                    Statistics PhD's               2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2003-2018"                     
##  [5] " 1 North Carolina State University at Raleigh         15       9    12    13     22    15    32    19   14    20    12     17    19    20     19     25     283"  
##  [6] " 2 Iowa State University                                9      8     5    11     14     4    11    13   14      9   15     28    14      9    18     27     209"  
##  [7] " 3 University of Wisconsin-Madison                      5      6    14     9     20    13    15    12   13      9   13     10    13    24     15     17     208"  
##  [8] " 4 Stanford University*                               10      12    14     6      9    10     5     9   13      7   13     10     4      9    23       6    160"  
##  [9] " 5 Texas A & M University-College Station               6     12    18    10      7     4     5    11   16    18    10      7    15      4     8       8    159"  
## [10] " 6 Pennsylvania State University                        6      9     8     1      8     9    10    10   14      9   12      9    17    14      6       9    151"  
## [11] " 7 Ohio State University-Main Campus                    3      5     3    12     12     7    11    11    6    11      8    13    14    12      6     12     146"  
## [12] " 8 Purdue University-Main Campus                        6      5    12     8      8     7    10    12    9      9     5    10    14      8     9     13     145"  
## [13] " 9 University of Michigan-Ann Arbor                     8      5     3     3      8     9    12    11    7    11    13      9     8    11     13       9    140"  
## [14] "10 University of California-Los Angeles                 2      2     2     6      9     7     9     8    7    11    11      8     6    12     13     10     123"  
## [15] "11 University of North Carolina at Chapel Hill          5      5     3     6      1    12     4     5    8    14      9     8    10    14     11       8    123"  
## [16] "12 Duke University                                      7      1     3     7      5     7     9     5    8      4     9    12     9      8    12     14     120"  
## [17] "13 University of California-Berkeley                    5      5     8     7      7     5    11     6   10      4   13      7     6      5     7       7    113"  
## [18] "14 Carnegie Mellon University                           6      8     3    12      6     1     5     6    8    10      7    10    12      8     2       9    113"  
## [19] "15 Florida State University                             3      4     3     3      6     7     5     5   13      7     3     8     7      9     9     18     110"  
## [20] "16 University of Minnesota-Twin Cities                  7      3     7     8      8     9     8     2    8      5     9    10     9      5     8       3    109"  
## [21] "17 Virginia Polytechnic Institute                       5      2     7     8      7     5     5     5    7      5     7     6    13    10      7       6    105"  
## [22] "18 University of Florida                                1      4     5     9      7    11     4     5    3    11      7     6     5      7    10       8    103"  
## [23] "19 Rice University                                      3      3     5    11      4     7     7    11    2      6     4     6     3      8     7       7     94"  
## [24] "20 Columbia University B55                              3      4     6     9      3     3     4     5    9      2     9     8     5      6     7     11      94"  
## [25] "21 Harvard University                                   4      5     3     4      2     2     8     4    6      7     8     7     9      8     6     11      94"  
## [26] "22 University of Chicago                                2      7     4     6      4     5     9     4    4      7   10      2     8      7     5     10      94"  
## [27] "23 University of California-Davis                       4      3    10     4      8     4     8     3    3      6     5     5     8      3     8     11      93"  
## [28] "24 University of Georgia                                0      4     6     6      2     4    10     6    3      0     8     9    12      9     7       7     93"  
## [29] "25 University of Connecticut                            2      3     5     2      8     5     3     5    5      5   10      9     3      7     7     10      89"  
## [30] "26 University of Pennsylvania                           3      5     4     3      4     3     6     9    3      8     5    10     5      6    10       3     87"  
## [31] "27 University of Missouri-Columbia                      2      4     5     5      6     3     6     8    3      2   12      8     7      6     4       6     87"  
## [32] "28 University of Washington                             4      3     1     4      5     5     9     7    7      5     5     6     6      6     8       5     86"  
## [33] "29 Rutgers University-New Brunswick                     2      3     6     3      4     5     6     7    4      3     7     7     6      5     4       8     80"  
## [34] "30 University of California-Riverside                   1      1     5     2      6     5     7     0   10    12      7     3     3      5     3       9     79"  
## [35] ""                                                                                                                                                                 
## [36] "31 Baylor University                          3 5 1 3 6 2 7 5 7 2  6 5 4  5  7  7 75"                                                                             
## [37] "32 Southern Methodist University              2 6 7 1 4 3 5 3 3 3  6 6 6  5  8  3 71"                                                                             
## [38] "33 Michigan State University                  6 1 3 6 5 1 2 3 6 7  2 4 3  8  6  8 71"                                                                             
## [39] "34 University of Illinois at Urbana-Champaign 2 1 3 5 6 1 2 3 8 5  2 6 5  2  7 12 70"                                                                             
## [40] "35 University of South Carolina-Columbia      2 3 3 2 4 7 5 3 1 5  7 4 4  5  8  5 68"                                                                             
## [41] "36 University of California-Santa Barbara     4 2 2 6 3 4 5 2 5 2  6 2 7  7  2  9 68"                                                                             
## [42] "37 University of Kentucky                     0 5 6 4 4 5 2 5 2 3  1 3 4  6 10  6 66"                                                                             
## [43] "38 University of Maryland-College Park        0 0 5 3 1 7 4 4 6 4 10 2 8  2  3  5 64"                                                                             
## [44] "39 Cornell University                         3 5 4 2 6 0 3 3 4 4  7 3 6  5  6  2 63"                                                                             
## [45] "40 Colorado State University-Fort Collins     4 4 2 3 4 4 7 4 1 2  4 5 2 11  3  2 62"                                                                             
## [46] "41 University of Maryland-Baltimore County    3 0 5 6 2 2 2 2 4 1  6 5 5  7  4  5 59"                                                                             
## [47] "42 University of Pittsburgh                   2 4 5 2 4 3 5 2 7 5  3 2 4  5  2  4 59"                                                                             
## [48] "43 Temple University                          0 0 0 0 0 8 3 2 7 5  5 2 5 11  3  7 58"                                                                             
## [49] "44 Western Michigan University                6 5 1 2 1 2 6 3 3 3  2 2 2  6  4  6 54"                                                                             
## [50] "45 George Washington University               2 2 1 3 0 4 1 3 1 7  0 6 0  5  8  9 52"                                                                             
## [51] "46 Kansas State University                    2 5 2 4 1 3 2 5 2 5  3 4 5  1  0  7 51"                                                                             
## [52] "47 University of Rochester                    4 0 0 2 2 2 3 3 3 4  1 5 4  4  4  4 45"                                                                             
## [53] "48 University of Iowa                         4 2 1 0 7 5 3 4 1 6  2 1 3  1  4  1 45"                                                                             
## [54] "49 The University of Texas at Dallas          0 2 0 3 4 1 0 3 4 2  4 1 4  5  6  3 42"                                                                             
## [55] "50 University of Nebraska-Lincoln             0 0 1 2 0 3 1 6 3 1  3 5 3  5  5  4 42"                                                                             
## [56] "51 Yale University                            4 5 1 1 2 1 4 1 3 3  3 2 4  2  1  4 41"                                                                             
## [57] "52 Bowling Green State University             0 0 0 0 0 1 5 2 2 2  4 7 3  4  4  3 37"                                                                             
## [58] "53 Northwestern University                    1 0 1 1 3 2 0 2 3 1  4 6 4  3  3  3 37"                                                                             
## [59] "54 Oregon State University                    0 0 2 3 2 2 3 1 1 3  2 1 3  2  2  9 36"                                                                             
## [60] "55 University of New Mexico                   4 2 3 1 3 0 5 3 2 1  1 2 3  2  3  0 35"                                                                             
## [61] "56 University of Virginia-Main Campus         1 3 3 0 0 0 2 4 1 2  3 3 2  4  2  3 33"                                                                             
## [62] "57 The University of Alabama                  0 0 4 4 1 2 4 1 3 2  4 1 0  2  1  3 32"                                                                             
## [63] "58 George Mason University                    0 0 0 0 0 0 1 2 1 3  2 5 4  4  1  3 26"                                                                             
## [64] "59 North Dakota State University              1 2 1 0 0 1 2 2 2 1  0 3 3  3  1  3 25"                                                                             
## [65] "60 Case Western Reserve University            2 3 3 2 1 3 2 2 5 1  1 0 0  0  0  0 25"                                                                             
## [66] "61 Arizona State University-Tempe             0 0 0 0 0 0 0 6 2 2  2 3 1  1  1  6 24"                                                                             
## [67] "62 University of California-Irvine            0 0 0 0 0 0 0 0 1 1  0 6 7  1  4  3 23"                                                                             
## [68] "63 Oklahoma State University                  0 0 1 0 2 1 2 1 0 2  2 7 1  0  3  1 23"                                                                             
## [69] "64 SUNY at Albany                             0 1 2 0 0 1 2 2 0 0  1 3 3  0  3  2 20"                                                                             
## [70] ""                                                                                                                                                                 
## [71] "65 The University of Texas at San Antonio          0   0   0   0   0   0   0   1   0   5   2   3   0   0   6   2  19"                                             
## [72] "66 University of Wyoming                           0   0   2   0   0   0   1   2   2   1   2   2   1   0   1   5  19"                                             
## [73] "67 University of Arizona                           0   0   0   1   0   0   0   0   0   0   0   1   2   1   4   4  13"                                             
## [74] "68 American University                             3   0   3   1   2   1   0   0   1   2   0   0   0   0   0   0  13"                                             
## [75] "69 University of Cincinnati-Main Campus            0   0   0   0   0   0   0   0   0   0   0   0   1   2   0   4   7"                                             
## [76] "70 Montana State University                        3   0   1   0   0   1   0   0   0   0   0   0   0   0   0   0   5"                                             
## [77] "71 West Virginia University                        0   0   0   0   0   0   0   0   0   0   0   0   0   0   3   1   4"                                             
## [78] "72 The University of Texas at Austin               0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   3   4"                                             
## [79] "73 University of New Hampshire                     0   0   0   0   0   0   0   0   0   0   0   1   0   0   2   0   3"                                             
## [80] "74 Boston University                               0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   1   2"                                             
## [81] "75 Central Michigan University                     0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   1"                                             
## [82] "76 Indiana University-Bloomington                  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   1"                                             
## [83] "77 Washington University in St Louis               0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   1"                                             
## [84] "78 University at Buffalo                           0   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1"                                             
## [85] "                                          Total 207 219 269 271 300 276 355 324 344 345 379 397 396 402 419 482  5385"                                            
## [86] "   *Self-report 2015 number"

Let's drop the first three lines (the introductory note) and the last two lines (the totals row and the footnote).

stat.phdnums <- stat.phdnums %>% head(-2) %>% tail(-3)
biostat.phdnums <- biostat.phdnums %>% head(-2) %>% tail(-3)
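Negative arguments to head() and tail() drop elements instead of keeping them, which is what the trimming above relies on. A minimal sketch on a made-up vector (the toy values are placeholders, not lines from the PDF):

```r
# Toy stand-in for the lines read from the PDF (hypothetical values)
lines <- c("note 1", "note 2", "note 3",
           "header", "row 1", "row 2",
           "total", "footnote")

# head(x, -2) drops the last two elements (totals and footnote)
without.tail <- head(lines, -2)

# tail(x, -3) drops the first three elements (the introductory note)
trimmed <- tail(without.tail, -3)

print(trimmed)
```

What remains starts at the header line, which is why the later steps treat element 1 as the column names.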

Then let's drop the empty strings that mark page breaks.

# Keep only the non-empty lines
stat.phdnums <- stat.phdnums[stat.phdnums != ""]

biostat.phdnums <- biostat.phdnums[biostat.phdnums != ""]

From here we’re good to start utilizing stringr to tidy things up.

stat.phd <-
  stat.phdnums[-1] %>%
  str_squish() %>%
  str_remove("^\\s*\\d*\\s*") %>%
  str_split("\\s+(?=\\d)") %>%
  unlist() %>%
  matrix(ncol = 18, byrow = TRUE) %>%
  set_colnames(value =
                 stat.phdnums[1] %>%
                 str_squish() %>%
                 str_split("\\s+(?=\\d)") %>%
                 unlist) %>%
  as_tibble() %>%
  type_convert()

The same pipeline is repeated below for the biostatistics data, with a short comment explaining each step. The site Regex101 is helpful for writing regex.

biostat.phd <-
  # Ignores column names
  biostat.phdnums[-1] %>%
  # Removing excess whitespace
  str_squish() %>%
  # Removes index
  str_remove("^\\s*\\d*\\s*") %>%
  # Split on all whitespace occurring before digits
  str_split("\\s+(?=\\d)") %>%
  # Turn list into a matrix
  unlist() %>%
  matrix(ncol = 18, byrow = TRUE) %>%
  # Handling variable names
  set_colnames(value =
                 biostat.phdnums[1] %>%
                 str_squish() %>%
                 str_split("\\s+(?=\\d)") %>%
                 unlist) %>%
  as_tibble() %>%
  # Converting variables to numeric
  type_convert()
## Parsed with column specification:
## cols(
##   `Biotatistics PhD's` = col_character(),
##   `2003` = col_double(),
##   `2004` = col_double(),
##   `2005` = col_double(),
##   `2006` = col_double(),
##   `2007` = col_double(),
##   `2008` = col_double(),
##   `2009` = col_double(),
##   `2010` = col_double(),
##   `2011` = col_double(),
##   `2012` = col_double(),
##   `2013` = col_double(),
##   `2014` = col_double(),
##   `2015` = col_double(),
##   `2016` = col_double(),
##   `2017` = col_double(),
##   `2018` = col_double(),
##   `2003-2018` = col_double()
## )
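To see why the lookahead split keeps university names in one piece, here is a sketch of the string steps applied to a single line. The line is an abbreviated, illustrative version of one row shown earlier (only a few of the year columns are kept), not the full record.

```r
library(stringr)

# One raw line in the same layout as the PDF text (abbreviated for illustration)
raw.line <- " 5 Texas A & M University-College Station               6     12    18    159"

# Collapse runs of whitespace and trim both ends
squished <- str_squish(raw.line)

# Drop the leading row index
no.index <- str_remove(squished, "^\\s*\\d*\\s*")

# Split only where whitespace is followed by a digit, so the
# spaces inside the university name are left alone
pieces <- str_split(no.index, "\\s+(?=\\d)")[[1]]

print(pieces)
```

One caveat with this pattern: a name containing a space followed by a digit would be split as well, so it may be worth checking that every row yields 18 pieces before building the matrix.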

Here is how the finished tibbles look:

stat.phd %>% (function(x)rbind(head(x),tail(x)))
## # A tibble: 12 x 18
##    `Statistics PhD's`                         `2003` `2004` `2005` `2006` `2007` `2008` `2009` `2010` `2011` `2012` `2013` `2014` `2015` `2016` `2017` `2018` `2003-2018`
##    <chr>                                       <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>       <dbl>
##  1 North Carolina State University at Raleigh     15      9     12     13     22     15     32     19     14     20     12     17     19     20     19     25         283
##  2 Iowa State University                           9      8      5     11     14      4     11     13     14      9     15     28     14      9     18     27         209
##  3 University of Wisconsin-Madison                 5      6     14      9     20     13     15     12     13      9     13     10     13     24     15     17         208
##  4 Stanford University*                           10     12     14      6      9     10      5      9     13      7     13     10      4      9     23      6         160
##  5 Texas A & M University-College Station          6     12     18     10      7      4      5     11     16     18     10      7     15      4      8      8         159
##  6 Pennsylvania State University                   6      9      8      1      8      9     10     10     14      9     12      9     17     14      6      9         151
##  7 University of New Hampshire                     0      0      0      0      0      0      0      0      0      0      0      1      0      0      2      0           3
##  8 Boston University                               0      0      0      0      0      0      0      0      0      0      0      0      0      0      1      1           2
##  9 Central Michigan University                     0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      1           1
## 10 Indiana University-Bloomington                  0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      1           1
## 11 Washington University in St Louis               0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      1           1
## 12 University at Buffalo                           0      1      0      0      0      0      0      0      0      0      0      0      0      0      0      0           1
biostat.phd %>% (function(x)rbind(head(x),tail(x)))
## # A tibble: 12 x 18
##    `Biotatistics PhD's`                        `2003` `2004` `2005` `2006` `2007` `2008` `2009` `2010` `2011` `2012` `2013` `2014` `2015` `2016` `2017` `2018` `2003-2018`
##    <chr>                                        <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>       <dbl>
##  1 University of North Carolina at Chapel Hill      9      9     11      8      5     11      9     13     12     16      9     22     24     16     14     17         205
##  2 Harvard University                               1      9     17     14     13     12      9      6      9     17     10     13     12      9     14     16         181
##  3 University of Michigan-Ann Arbor                 8     12      6      7      8     10      6      9     10     15     15     12     11     11     13     15         168
##  4 University of Pittsburgh-Pittsburgh Campus       5      8      4      4      9      5     12     10     15      9     12     14     18      5     10      8         148
##  5 University of Washington-Seattle Campus          6      9      7      6      8     11      8      8      5      7     12      9     12     11     14      5         138
##  6 University of Texas Health Science Center        0      1      9      0      5      5      6      5      7      9     13      7     19     15     13     19         133
##  7 University of Miami                              0      0      0      0      0      0      0      0      0      0      0      0      0      0      1      3           4
##  8 University of Arizona                            0      0      0      0      0      0      0      0      0      0      0      0      0      1      0      2           3
##  9 University of Massachusetts-Amherst              0      0      0      0      0      0      0      0      0      0      0      0      1      0      1      0           2
## 10 University of Georgia                            0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      2           2
## 11 Southern Methodist University                    0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      1           1
## 12 University of Hawaii at Manoa                    0      0      1      0      0      0      0      0      0      0      0      0      0      0      0      0           1
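Since the PDF already reports a total per university, one possible sanity check on the parsing is to recompute that total from the year columns. The sketch below uses a made-up tibble with three years (toy.phd and its values are hypothetical); the same idea should carry over to stat.phd and biostat.phd with their `2003-2018` column.

```r
library(tibble)

# Toy tibble in the same shape as stat.phd (made-up numbers, three years only)
toy.phd <- tibble(
  `Statistics PhD's` = c("University A", "University B"),
  `2003` = c(3, 0),
  `2004` = c(5, 2),
  `2005` = c(4, 1),
  `2003-2005` = c(12, 3)
)

# Single-year columns are the ones whose names are exactly four digits
year.cols <- grepl("^[0-9]{4}$", names(toy.phd))

# Recompute each row's total from the year columns
year.sums <- unname(rowSums(toy.phd[, year.cols]))

# TRUE when every recomputed total matches the reported one
totals.match <- all(year.sums == toy.phd[["2003-2005"]])

print(totals.match)
```

A mismatch would suggest that a row was split into the wrong number of pieces somewhere upstream.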

Now, regarding USNews Statistics PhD Rankings…

USNews Rankings

I tried a few tools like SelectorGadget to simplify scraping the relevant portions of the CSS, but USNews' format was more unwieldy than the tools seemed capable of handling. There is also this interesting, but old, rvest issue describing a method for handling a different area of the USNews rankings.

My solution was to simply copy and paste the rankings from the page, which took about 30 seconds and avoided digging into HTML, XML, and CSS entirely.

The small bit of code to read my clipboard into R:

usnewsrankings <- read.table("clipboard",sep = "\n")
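Reading from "clipboard" behaves differently across platforms (on macOS, for instance, R's documentation points to pipe("pbpaste") instead). A more portable option, sketched here as an assumption and not what I actually ran, is to paste the copied text into a plain text file and read it with readLines(). The temporary file below stands in for that pasted file.

```r
# Stand-in for the text copied from the rankings page (first entry only)
copied <- c(
  "Stanford University (Department of Statistics)",
  "Stanford, CA",
  "\t5.0"
)

# In practice this would be the file the text was pasted into;
# a temporary file keeps the sketch self-contained
path <- tempfile(fileext = ".txt")
writeLines(copied, path)

# readLines() returns one character element per line
usnews.lines <- readLines(path)

print(usnews.lines)
```

This yields a character vector and not a one-column data frame, but the logical subsetting used in the next step works on a character vector too.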

Taking a look at what this prints as:

usnewsrankings %>% (function(x)rbind(head(x),tail(x)))
##                                                                V1
## 1                  Stanford University (Department of Statistics)
## 2                                                    Stanford, CA
## 3                                                            \t5.0
## 4   University of California--Berkeley (Department of Statistics)
## 5                                                    Berkeley, CA
## 6                                                            \t4.7
## 336                                             RNP in Statistics
## 337                                                          \tN/A
## 338        Western Michigan University (Department of Statistics)
## 339                                                 Kalamazoo, MI
## 340                                             RNP in Statistics
## 341                                                          \tN/A

Should this process change, the originally copied data can be found here. Note that downloading it might throw an error unless you use download.file(path, destfile, mode = "wb"), as the default mode = "w" seems to adjust the headers.

Let's clean this by first removing the “RNP in Statistics” entries and then dropping the location lines, which leaves only each school's name and peer ranking.

usnewsrankings <- usnewsrankings[usnewsrankings != "RNP in Statistics"][c(TRUE, FALSE, TRUE)]
usnewsrankings %>% (function(x)rbind(head(x),tail(x)))
##      [,1]                                                                                           [,2]   [,3]                                                            [,4]   [,5]                                                     [,6]  
## [1,] "Stanford University (Department of Statistics)"                                               "\t5.0" "University of California--Berkeley (Department of Statistics)" "\t4.7" "Harvard University (Department of Biostatistics)"       "\t4.6"
## [2,] "University of Oklahoma Health Sciences Center (Department of Biostatistics and Epidemiology)" "\tN/A" "University of Wyoming (Department of Statistics)"              "\tN/A" "Western Michigan University (Department of Statistics)" "\tN/A"
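The c(TRUE, FALSE, TRUE) index works because R recycles a short logical index along the whole vector. A small sketch with placeholder entries (the program names and scores are made up):

```r
# Toy version of the copied text: program, location, score, repeated
entries <- c(
  "Program A (Department of Statistics)", "City A, ST", "\t4.0",
  "Program B (Department of Statistics)", "City B, ST", "\t3.5"
)

# The three-element logical index is recycled along the vector,
# keeping elements 1 and 3 of every group of three
kept <- entries[c(TRUE, FALSE, TRUE)]

print(kept)
```

This only lines up if every program occupies exactly three lines at this point, which is why the “RNP in Statistics” lines have to be removed first.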

Now for cleaning this up:

usnewsrankings %<>% 
  # Drop tabs
  str_replace("\t","") %>%
  # Form a matrix
  matrix(ncol = 2, byrow = TRUE) %>%
  # Column names
  set_colnames(c("Program","Rank")) %>%
  # Into a tibble
  as_tibble() %>%
  # Making Rank numeric
  # Coercing NA is intended here!
  mutate(Rank = as.numeric(Rank))
## Warning in mask$eval_all_mutate(dots[[i]]): NAs introduced by coercion
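The coercion warning above is harmless here, but if you would rather not see it, one option is to turn "N/A" into a real NA before converting. A sketch on made-up entries (the program names are placeholders, and this is a variation on the step above, not the code that produced the output in this post):

```r
library(stringr)
library(tibble)

# Toy input: program name followed by its tab-prefixed score
raw <- c(
  "Program A (Department of Statistics)", "\t4.0",
  "Program B (Department of Statistics)", "\tN/A"
)

# Drop the tab, then lay the vector out as program/score pairs
cleaned <- str_replace(raw, "\t", "")
pairs <- matrix(cleaned, ncol = 2, byrow = TRUE)

ranks <- tibble(Program = pairs[, 1], Rank = pairs[, 2])

# Mark unranked programs as missing before converting,
# so as.numeric() has nothing left to coerce
ranks$Rank[ranks$Rank == "N/A"] <- NA
ranks$Rank <- as.numeric(ranks$Rank)

print(ranks)
```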

And here we have the USNews ranking cleaned up:

usnewsrankings %>% (function(x)rbind(head(x),tail(x)))
## # A tibble: 12 x 2
##    Program                                                                                       Rank
##    <chr>                                                                                        <dbl>
##  1 Stanford University (Department of Statistics)                                                 5  
##  2 University of California--Berkeley (Department of Statistics)                                  4.7
##  3 Harvard University (Department of Biostatistics)                                               4.6
##  4 Johns Hopkins University (Department of Biostatistics)                                         4.6
##  5 University of Washington (Department of Biostatistics)                                         4.6
##  6 Harvard University (Department of Statistics)                                                  4.4
##  7 Oklahoma State University (Department of Statistics)                                          NA  
##  8 Old Dominion University (Department of Mathematics and Statistics)                            NA  
##  9 Tulane University (Department of Biostatistics)                                               NA  
## 10 University of Oklahoma Health Sciences Center (Department of Biostatistics and Epidemiology)  NA  
## 11 University of Wyoming (Department of Statistics)                                              NA  
## 12 Western Michigan University (Department of Statistics)                                        NA

Splitting these into Statistics and Biostatistics programs first requires manually editing the Rutgers University entry so that it appears as two rows, one per department.

usnewsrankings[40,1] <-
  c("Rutgers University--New Brunswick (Department of Statistics)")
usnewsrankings %<>%
  add_row(Program = "Rutgers University--New Brunswick (Department of Biostatistics)",
          Rank = 3.3,
          .after = 40)
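Hard-coding row 40 works for this snapshot of the rankings, but it would silently point at the wrong program if the list changed. A sketch of looking the row up by name instead, using a made-up tibble (the "Beta University" entry and its combined department name are hypothetical):

```r
library(tibble)
library(stringr)

# Toy rankings with one entry that covers two departments
rankings <- tibble(
  Program = c(
    "Alpha University (Department of Statistics)",
    "Beta University (Department of Statistics and Biostatistics)",
    "Gamma University (Department of Statistics)"
  ),
  Rank = c(4.0, 3.3, 3.0)
)

# Find the row by name and make sure there is exactly one match
idx <- which(str_detect(rankings$Program, fixed("Beta University")))
stopifnot(length(idx) == 1)

# Rewrite the entry for one department, then add the other right after it
rankings$Program[idx] <- "Beta University (Department of Statistics)"
rankings <- add_row(
  rankings,
  Program = "Beta University (Department of Biostatistics)",
  Rank = rankings$Rank[idx],
  .after = idx
)

print(rankings)
```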

Then we can split based on a few specific keywords.

biostat.keywords <- c("biostatistic",
                      "health",
                      "medicine")
stat.phd.rank <-
  usnewsrankings %>%
  filter(!str_detect(Program,
                     regex(
                       paste(biostat.keywords,
                             collapse = '|'),
                       ignore_case = TRUE
                     )))

biostat.phd.rank <-
  usnewsrankings %>%
  filter(str_detect(Program,
                    regex(
                      paste(biostat.keywords,
                            collapse = '|'),
                      ignore_case = TRUE
                    )))
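Since both filters use the same pattern, it can be built once and reused. A sketch on a made-up tibble (the program names are placeholders) showing how the keyword match divides the rows:

```r
library(stringr)
library(tibble)

# Toy rankings covering the kinds of names the keywords should catch
rankings <- tibble(
  Program = c(
    "Alpha University (Department of Statistics)",
    "Beta University (Department of Biostatistics)",
    "Gamma University (School of Public Health)"
  ),
  Rank = c(4.0, 3.5, 3.0)
)

biostat.keywords <- c("biostatistic", "health", "medicine")

# One case-insensitive pattern: biostatistic|health|medicine
biostat.pattern <- regex(paste(biostat.keywords, collapse = "|"),
                         ignore_case = TRUE)

# TRUE for rows whose program name matches any keyword
biostat.rows <- str_detect(rankings$Program, biostat.pattern)

stat.rank <- rankings[!biostat.rows, ]
biostat.rank <- rankings[biostat.rows, ]

print(biostat.rows)
```

Because the two subsets come from one logical vector and its negation, every program lands in exactly one of them.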

And here we are with the USNews rankings per program. Given this information is only updated every four years, with the next to be done in 2022, this should remain relevant for some time.

stat.phd.rank %>% print(n = Inf)
## # A tibble: 70 x 2
##    Program                                                                                     Rank
##    <chr>                                                                                      <dbl>
##  1 Stanford University (Department of Statistics)                                               5  
##  2 University of California--Berkeley (Department of Statistics)                                4.7
##  3 Harvard University (Department of Statistics)                                                4.4
##  4 University of Chicago (Department of Statistics)                                             4.4
##  5 Carnegie Mellon University (Department of Statistics)                                        4.3
##  6 University of Washington (Department of Statistics)                                          4.3
##  7 Duke University (Department of Statistical Science)                                          4.1
##  8 University of Michigan--Ann Arbor (Department of Statistics)                                 4.1
##  9 University of Pennsylvania (Department of Statistics)                                        4.1
## 10 Columbia University (Department of Statistics)                                               4  
## 11 North Carolina State University (Department of Statistics)                                   4  
## 12 University of Wisconsin--Madison (Department of Statistics)                                  4  
## 13 University of North Carolina--Chapel Hill (Department of Statistics & Operations Research)   3.9
## 14 Cornell University (Department of Statistical Science)                                       3.8
## 15 Iowa State University (Department of Statistics)                                             3.8
## 16 Pennsylvania State University (Department of Statistics)                                     3.8
## 17 Texas A&M University--College Station (Department of Statistics)                             3.8
## 18 University of Minnesota--Twin Cities (School of Statistics)                                  3.7
## 19 Purdue University--West Lafayette (Department of Statistics)                                 3.6
## 20 Johns Hopkins University (Department of Applied Mathematics and Statistics)                  3.5
## 21 University of California--Davis (Department of Statistics)                                   3.5
## 22 University of California--Los Angeles (Department of Statistics)                             3.5
## 23 Yale University (Department of Statistics)                                                   3.5
## 24 Ohio State University (Department of Statistics)                                             3.4
## 25 University of Illinois--Urbana-Champaign (Department of Statistics)                          3.4
## 26 Rutgers University--New Brunswick (Department of Statistics)                                 3.3
## 27 University of Florida (Department of Statistics)                                             3.3
## 28 University of Iowa (Department of Statistics and Actuarial Science)                          3.3
## 29 Rice University (Department of Statistics)                                                   3.2
## 30 Colorado State University (Department of Statistics)                                         3.1
## 31 Florida State University (Department of Statistics)                                          3.1
## 32 University of Connecticut (Department of Statistics)                                         3.1
## 33 Michigan State University (Department of Statistics and Probability)                         3  
## 34 University of California--Irvine (Department of Statistics)                                  3  
## 35 University of Texas--Austin (Department of Statistics and Data Science)                      3  
## 36 Northwestern University (Department of Statistics)                                           2.9
## 37 University of Pittsburgh (Department of Statistics)                                          2.9
## 38 George Washington University (Department of Statistics)                                      2.8
## 39 New York University (Department of Information, Operations, and Management Sciences)         2.8
## 40 University of Georgia (Department of Statistics)                                             2.8
## 41 University of Missouri--Columbia (Department of Statistics)                                  2.8
## 42 Virginia Tech (Department of Statistics)                                                     2.8
## 43 University of California--Santa Barbara (Department of Statistics and Applied Probability)   2.7
## 44 Indiana University--Bloomington (Department of Statistics)                                   2.6
## 45 Southern Methodist University (Department of Statistical Science)                            2.6
## 46 University of Maryland--Baltimore County (Department of Mathematics and Statistics)          2.6
## 47 University of Virginia (Department of Statistics)                                            2.6
## 48 Oregon State University (Department of Statistics)                                           2.5
## 49 University of California--Riverside (Department of Statistics)                               2.5
## 50 University of Massachusetts--Amherst (Department of Mathematics and Statistics)              2.5
## 51 University of South Carolina (Department of Statistics)                                      2.5
## 52 Arizona State University (School of Mathematical & Statistical Sciences)                     2.4
## 53 Case Western Reserve University (Department of Statistics)                                   2.4
## 54 Temple University (Department of Statistics)                                                 2.4
## 55 Baylor University (Department of Statistical Science)                                        2.3
## 56 George Mason University (Department of Statistics)                                           2.3
## 57 Kansas State University (Department of Statistics)                                           2.3
## 58 University of Colorado--Denver (Department of Mathematical and Statistical Sciences)         2.3
## 59 University of Kentucky (Department of Statistics)                                            2.2
## 60 Virginia Commonwealth University (Department of Statistics)                                  2.2
## 61 San Diego State University (Department of Mathematics and Statistics)                        2.1
## 62 University of North Carolina--Charlotte (Department of Mathematics and Statistics)           2  
## 63 University of Texas--San Antonio (Department of Management Science and Statistics)           2  
## 64 Auburn University (Department of Mathematics and Statistics)                                NA  
## 65 Montana State University (Department of Mathematical Sciences)                              NA  
## 66 North Dakota State University (Department of Statistics)                                    NA  
## 67 Oklahoma State University (Department of Statistics)                                        NA  
## 68 Old Dominion University (Department of Mathematics and Statistics)                          NA  
## 69 University of Wyoming (Department of Statistics)                                            NA  
## 70 Western Michigan University (Department of Statistics)                                      NA

biostat.phd.rank %>% print(n = Inf)
## # A tibble: 41 x 2
##    Program                                                                                                      Rank
##    <chr>                                                                                                       <dbl>
##  1 Harvard University (Department of Biostatistics)                                                              4.6
##  2 Johns Hopkins University (Department of Biostatistics)                                                        4.6
##  3 University of Washington (Department of Biostatistics)                                                        4.6
##  4 University of North Carolina--Chapel Hill (Department of Biostatistics)                                       4.3
##  5 University of Michigan--Ann Arbor (Department of Biostatistics)                                               4.2
##  6 University of California--Berkeley (Group in Biostatistics)                                                   4.1
##  7 University of Minnesota--Twin Cities (School of Public Health)                                                3.7
##  8 University of Wisconsin--Madison (School of Medicine and Public Health)                                       3.7
##  9 Columbia University (Department of Biostatistics)                                                             3.6
## 10 University of California--Los Angeles (Department of Biostatistics)                                           3.6
## 11 University of Texas MD Anderson (Department of Biostatistics)                                                 3.6
## 12 University of Pennsylvania (Department of Biostatistics & Epidemiology)                                       3.5
## 13 Yale University (Department of Biostatistics)                                                                 3.5
## 14 Emory University (Department of Biostatistics and Bioinformatics)                                             3.4
## 15 Rutgers University--New Brunswick (Department of Biostatistics)                                               3.3
## 16 Brown University (Department of Biostatistics)                                                                3.1
## 17 Duke University (Department of Biostatistics and Bioinformatics)                                              3.1
## 18 Vanderbilt University (Department of Biostatistics)                                                           3.1
## 19 Boston University (School of Public Health)                                                                   3  
## 20 University of California--Davis (Graduate Group in Biostatistics)                                             3  
## 21 University of Pittsburgh (Department of Biostatistics)                                                        3  
## 22 University of Florida (Department of Biostatistics)                                                           2.9
## 23 University of Iowa (Department of Biostatistics)                                                              2.9
## 24 University of Rochester (Department of Biostatistics and Computational Biology)                               2.9
## 25 University of Texas Health Science Center--Houston (University of Texas MD Anderson Cancer Center UTHealth)   2.9
## 26 Medical College of Wisconsin (Division of Biostatistics)                                                      2.8
## 27 University of Illinois--Chicago (Epidemiology and Biostatistics Division)                                     2.7
## 28 Case Western Reserve University (Department of Epidemiology and Biostatistics)                                2.6
## 29 University of Colorado--Denver (Department of Biostatistics and Informatics)                                  2.5
## 30 University of Massachusetts--Amherst (Division of Biostatistics and Epidemiology)                             2.4
## 31 University at Buffalo--SUNY (Department of Biostatistics)                                                     2.3
## 32 University of South Carolina (Epidemiology and Biostatistics)                                                 2.3
## 33 University of Alabama--Birmingham (Department of Biostatistics)                                               2.2
## 34 University of Kansas Medical Center (Department of Biostatistics)                                             2.2
## 35 University at Albany--SUNY (Department of Epidemiology & Biostatistics)                                       2.1
## 36 University of Cincinnati (Division of Epidemiology & Biostatistics)                                           2.1
## 37 Virginia Commonwealth University (Department of Biostatistics)                                                2.1
## 38 Augusta University (Department of Biostatistics and Epidemiology)                                            NA  
## 39 Louisiana State University Health Sciences Center--New Orleans (Biostatistics Program)                       NA  
## 40 Tulane University (Department of Biostatistics)                                                              NA  
## 41 University of Oklahoma Health Sciences Center (Department of Biostatistics and Epidemiology)                 NA
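
Both ranking tables keep the university and its department together in one Program column, which would make it awkward to match against the ASA degree counts by university name. Below is a minimal sketch of one way to split the two apart. It is not necessarily the approach used elsewhere in this post: rank.example and rank.split are hypothetical names, and the three rows are copied from the Statistics table above purely for illustration. The call is written as tidyr::extract() because magrittr is attached after tidyverse here, and magrittr exports its own extract().

```r
library(tidyverse)

# Hypothetical three-row stand-in for stat.phd.rank (rows copied from the table above)
rank.example <- tibble(
  Program = c(
    "Stanford University (Department of Statistics)",
    "University of North Carolina--Chapel Hill (Department of Statistics & Operations Research)",
    "Auburn University (Department of Mathematics and Statistics)"
  ),
  Rank = c(5, 3.9, NA)
)

# Split "University (Department)" into two columns.
# Group 1: everything before the first " (".
# Group 2: everything between that "(" and the final ")".
rank.split <-
  rank.example %>%
  tidyr::extract(
    Program,
    into = c("University", "Department"),
    regex = "^([^(]+) \\((.*)\\)$"
  )

rank.split
```

The pattern assumes every entry has exactly the "Name (Unit)" shape seen in the printed tables; a row that does not match would come back as NA in both new columns, which makes such rows easy to spot.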
