IPEDS

The IPEDS package contains statistics on postsecondary institutions, primarily for 2021 (the financial aid data covers 2019-2020). Some datasets have been filtered to exclude imputation variables, while other datasets are included in full. Details are given below.

Inspiration

We wanted to create a package that prospective undergraduate and graduate students can use with only a basic understanding of R. The IPEDS package provides easy access to a wide variety of information about postsecondary institutions: their current students and faculty, demographics, financial aid, educational and recreational offerings, and completions. College search websites are sometimes vague in the statistics they report for an institution; this package aims to give users a closer idea of what their institution of interest is really like.

Other notes

  • We dropped the variables that indicate whether a value was imputed, in order to reduce the size of the data
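
As a rough illustration of this kind of clean-up (not necessarily how the package itself was built), the sketch below drops flag columns from a toy table. The table, its column names, and the assumption that flag columns can be recognised by an X prefix are all hypothetical.

```r
# Toy stand-in for a raw table; every column name here is made up.
raw <- data.frame(
  UNITID  = c(1, 2),
  XAPPLCN = c("R", "P"),   # hypothetical imputation flag for APPLCN
  APPLCN  = c(1200, 860),
  XADMSSN = c("R", "R"),   # hypothetical imputation flag for ADMSSN
  ADMSSN  = c(700, 410),
  stringsAsFactors = FALSE
)

# Keep only the columns whose names do not start with the assumed flag prefix.
is_flag <- startsWith(names(raw), "X")
slim <- raw[, !is_flag, drop = FALSE]

names(slim)
```

A prefix test like this would also drop any genuine variable whose name happens to start with X, so a real clean-up would need to check the data dictionary first.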

Datasets Included

All the datasets are taken from [IPEDS](https://nces.ed.gov/ipeds/use-the-data).

  • adm2021: dataset of Admissions and Test Scores for Fall 2021
  • complete2021: dataset of Completions in 2021
  • conference: dataset of Conferences for sports (from offerings2021)
  • dir_info2021: dataset of Directory Information for 2021
  • fall_enroll2021: dataset of Fall Enrollment for 2021
  • fin_aid1920: dataset of Financial Aid Statistics for 2019-2020
  • offerings2021: dataset of Institutional offerings for 2021
  • relig_aff: dataset of Religious Affiliations (from offerings2021)
  • staff2021: dataset of Fall Staff for 2021
  • staff_cat: dataset of Staff Categories based on staff2021$staff_cat

Who should use this package?

This package can be used by students interested in pursuing higher education, and by the college counselors and parents helping them consider their options and secure admission to their school of choice. Additionally, anyone interested in educational statistics can use this data for their research.

What does the data look like?

Here are the first six rows of the complete2021 dataset:

head(complete2021)
#>   INSTITUTION_ID AWARD_LVL TOTAL TOTAL_M TOTAL_W TOTAL_NATIVE TOTAL_ASIAN TOTAL_BLACK TOTAL_HISP TOTAL_NHPI TOTAL_WHITE TOTAL_MULT TOTAL_UNKNOWN TOTAL_NRA UND18 AGE18_24 AGE25_39 AGE40PLUS AGE_UNKNOWN
#> 1         100654         5   562     191     371            1           0         507          6          1          12          8            19         8     0      367      185         8           2
#> 2         100654         7   251      71     180            1           1         168          0          0          12          3            58         8     0       13      203        31           4
#> 3         100654         9     7       2       5            0           0           3          0          0           0          0             0         4     0        0        7         0           0
#> 4         100654        10     1       0       1            0           0           1          0          0           0          0             0         0     0        0        1         0           0
#> 5         100663         2    71      34      37            1           3          10          1          0          52          2             0         2     0       39       21        11           0
#> 6         100663         5  2870    1047    1823            7         194         688        131          0        1630        131            13        76     0     2074      666       130           0
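
To show how these columns can be combined, here is a small sketch that sums completions across award levels and computes the share awarded to women. It rebuilds only the six rows printed above rather than loading the package, and it assumes that TOTAL_W counts completions by women, so the figures are illustrative and incomplete for any institution with further rows.

```r
# First six rows of complete2021, copied from the output above (subset of columns).
completions <- data.frame(
  INSTITUTION_ID = c(100654, 100654, 100654, 100654, 100663, 100663),
  AWARD_LVL      = c(5, 7, 9, 10, 2, 5),
  TOTAL          = c(562, 251, 7, 1, 71, 2870),
  TOTAL_W        = c(371, 180, 5, 1, 37, 1823)
)

# Sum completions across award levels within each institution.
by_inst <- aggregate(cbind(TOTAL, TOTAL_W) ~ INSTITUTION_ID,
                     data = completions, FUN = sum)

# Share of completions awarded to women, as a percentage.
by_inst$PCT_W <- round(100 * by_inst$TOTAL_W / by_inst$TOTAL, 1)
by_inst
```

With the package loaded, the same two steps could be run on the full complete2021 data frame in place of the hand-built one.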

What can we do with this data?

We can use this package to address many questions such as:

  1. Which institutions have the qualities I’d like in an institution?
  2. What are the admission requirements for my preferred institution?
  3. What’s the relationship between the diversity of students and the diversity of staff?
  4. What are the main similarities and differences between my two top college choices?

To answer these questions, we can use the functions the package provides, along with data wrangling and data visualization techniques. Some examples that address these questions are below:

Example 1:

Which institutions have the qualities I’d like in an institution?

Let’s say Sophia, a high school senior, is interested in a relatively small private college in the New England area that will accept the AP credits she has earned, is also somewhat diverse, and helps its students afford college.

Using the school_preferences function, Sophia can find the schools that fit her preferences by setting the following arguments:

  • size: 2, selects schools with 1,000 - 4,999 students
  • region: “New England”, selects schools from a particular region
  • alt_credits: “Yes”, selects schools that accept AP, dual enrollment, or life experience credits
  • diversity_students: 36, selects schools where at least 36% of the students are non-white
  • financial_aid: 70, selects schools where at least 70% of the students receive financial aid
  • affiliation: 3, selects private (not-for-profit) institutions

school_preferences(size = 2, region = "New England", alt_credits = "Yes", diversity_students = 36, financial_aid = 70, affiliation = 3)
#>                         Institution Institution ID % of Students Recieved Aid Institution Size Student Diversity Staff Diversity % of Students Disabled      Region Type of Institution Religious Affiliation Calendar System Open Admissions Policy Years Required For Entering       Vet Programs     Alternative Credit   Alternative Tuition Payment                         Distance Education        Counseling Services        Employment Services           Daycare Services Live On-Campus Room Price Board Price Undergraduate Application Fee Graduate Application Fee
#> 1          University of Bridgeport         128744                         78                2                68              26                      1 New England                   3                    -2               1                      2                          -2 Programs Available Takes alternate credit Takes alternate tuition plans    Offers distance education opportunities Offers counseling services Offers employment services Offers no daycare services              2          .           .                             0                        0
#> 2                Goodwin University         129154                         88                2                51              21                      1 New England                   3                    -2               1                      1                          -2 Programs Available Takes alternate credit Takes alternate tuition plans    Offers distance education opportunities Offers counseling services Offers employment services Offers no daycare services              2       4500        1700                            50                       50
#> 3    American International College         164447                         97                2                54              18                      2 New England                   3                    -2               1                      2                          -2 Programs Available Takes alternate credit Takes alternate tuition plans    Offers distance education opportunities Offers counseling services Offers employment services Offers no daycare services              2       8044        7310                             0                       50
#> 4               Bay Path University         164632                         81                2                41              11                      1 New England                   3                    -2               1                      2                          -2 Programs Available Takes alternate credit Takes alternate tuition plans    Offers distance education opportunities Offers counseling services Offers employment services Offers no daycare services              2          .           .                            25                        0
#> 5                  Clark University         165334                         91                2                40              34                      2 New England                   3                    -2               1                      2                          -2 Programs Available Takes alternate credit Takes alternate tuition plans    Offers distance education opportunities Offers counseling services Offers employment services Offers no daycare services              2       6000        4150                            60                      100
#> 6             Mount Holyoke College         166939                         76                2                49              39                      2 New England                   3                    -2               1                      2                          -2 Programs Available Takes alternate credit Takes alternate tuition plans    Offers distance education opportunities Offers counseling services Offers employment services Offers no daycare services              2       8320        8260                            60                       50
#> 7                     Smith College         167835                         71                2                48              31                      2 New England                   3                    -2               1                      2                          -2 Programs Available Takes alternate credit Takes alternate tuition plans Offers no distance education opportunities Offers counseling services Offers employment services Offers no daycare services              2       9700        9720                             0                       60
#> 8 Wentworth Institute of Technology         168227                         84                2                37              28                      2 New England                   3                    -2               1                      2                          -2 Programs Available Takes alternate credit Takes alternate tuition plans    Offers distance education opportunities Offers counseling services Offers employment services Offers no daycare services              2      12120        3300                            50                       50

The output is a data frame that includes the institution name and ID, the percentage of students who receive aid, the size of the institution, the percentages of non-white students and staff, the percentage of disabled students, the region and type of the institution, and other relevant information.

We can select the columns Sophia is most interested in:

library(dplyr)  # provides %>% and select()

school_preferences(size = 2, region = "New England", alt_credits = "Yes", diversity_students = 36, financial_aid = 70, affiliation = 3) %>% 
  select(`Institution`, `Institution Size`, `Region`, `Alternative Credit`, `Student Diversity`, `% of Students Recieved Aid`, `Type of Institution`)
#>                         Institution Institution Size      Region     Alternative Credit Student Diversity % of Students Recieved Aid Type of Institution
#> 1          University of Bridgeport                2 New England Takes alternate credit                68                         78                   3
#> 2                Goodwin University                2 New England Takes alternate credit                51                         88                   3
#> 3    American International College                2 New England Takes alternate credit                54                         97                   3
#> 4               Bay Path University                2 New England Takes alternate credit                41                         81                   3
#> 5                  Clark University                2 New England Takes alternate credit                40                         91                   3
#> 6             Mount Holyoke College                2 New England Takes alternate credit                49                         76                   3
#> 7                     Smith College                2 New England Takes alternate credit                48                         71                   3
#> 8 Wentworth Institute of Technology                2 New England Takes alternate credit                37                         84                   3
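
Once the shortlist is this small, Sophia might want to rank it. The sketch below orders the eight schools by the share of students receiving aid, breaking ties by student diversity. It rebuilds the rows printed above by hand so that it runs without the package; the choice of ranking criteria is only an assumption about what Sophia cares about most.

```r
# Shortlist copied from the output above (three columns only).
shortlist <- data.frame(
  Institution = c("University of Bridgeport", "Goodwin University",
                  "American International College", "Bay Path University",
                  "Clark University", "Mount Holyoke College",
                  "Smith College", "Wentworth Institute of Technology"),
  `Student Diversity` = c(68, 51, 54, 41, 40, 49, 48, 37),
  `% of Students Recieved Aid` = c(78, 88, 97, 81, 91, 76, 71, 84),
  check.names = FALSE,       # keep the package's column names, spaces and all
  stringsAsFactors = FALSE
)

# Order by aid percentage, highest first, breaking ties by student diversity.
ranked <- shortlist[order(-shortlist$`% of Students Recieved Aid`,
                          -shortlist$`Student Diversity`), ]
head(ranked, 3)
```

The column name keeps the spelling used in the package output ("Recieved"), since that is the name the data frame actually carries.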

Example 2:

What are the admission requirements for my preferred institution?

If Sophia wants to know what it takes to apply to one of her preferred schools, she can use the admission_reqs function, which takes an institution ID and returns a list of that school’s application requirements.

admission_reqs(167835)
#> # A tibble: 9 × 2
#>   Requirements                            Priority                        
#>   <chr>                                   <chr>                           
#> 1 High School Record                      Required                        
#> 2 Recommendations                         Required                        
#> 3 High School GPA                         Recommended                     
#> 4 High School Rank                        Recommended                     
#> 5 Completion of College-Prepatory Program Recommended                     
#> 6 Test of English as a Foreign Language   Recommended                     
#> 7 Formal Demonstration of Competencies    Neither_required_nor_recommended
#> 8 Other Tests                             Neither_required_nor_recommended
#> 9 Admission Test Scores                   Considered_but_not_required

Now Sophia knows which application materials are required and recommended, and which ones are not necessary at all.
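
Because the result is an ordinary two-column table, it can be filtered or grouped like any other. The sketch below rebuilds the output above by hand (so it runs without the package) and pulls out the strictly required items, then groups every requirement by priority.

```r
# Requirements table copied from the output above.
reqs <- data.frame(
  Requirements = c("High School Record", "Recommendations", "High School GPA",
                   "High School Rank", "Completion of College-Prepatory Program",
                   "Test of English as a Foreign Language",
                   "Formal Demonstration of Competencies", "Other Tests",
                   "Admission Test Scores"),
  Priority = c("Required", "Required", "Recommended", "Recommended",
               "Recommended", "Recommended",
               "Neither_required_nor_recommended",
               "Neither_required_nor_recommended",
               "Considered_but_not_required"),
  stringsAsFactors = FALSE
)

# Items Sophia must submit.
required <- reqs$Requirements[reqs$Priority == "Required"]
required

# All requirements, grouped by priority level.
by_priority <- split(reqs$Requirements, reqs$Priority)
by_priority
```

With the package loaded, the hand-built table could be replaced by the result of admission_reqs(167835).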

Example 3:

What’s the relationship between the diversity of students and the diversity of staff?

In another scenario, an educational statistician is interested in the potential relationship between the diversity of a student body and the diversity of the staff. We’ll visualize the diversity percentages from the data frame returned by the school_preferences function.

library(ggplot2)

data <- school_preferences()

ggplot(data, aes(x = `Staff Diversity`, y = `Student Diversity`)) +
  geom_point() +
  geom_smooth(method = "lm") +
  labs(title = "Student Diversity vs. Staff Diversity",
       y = "Student Diversity (%)",
       x = "Staff Diversity (%)")
#> `geom_smooth()` using formula = 'y ~ x'

Because school_preferences accepts filters, the statistician could also limit the analysis to schools located in the New England area:

data <- school_preferences(region = "New England")

ggplot(data, aes(x = `Staff Diversity`, y = `Student Diversity`)) +
  geom_point() +
  geom_smooth(method = "lm") +
  labs(title = "Student Diversity vs. Staff Diversity in New England Institutions",
       y = "Student Diversity (%)",
       x = "Staff Diversity (%)")
#> `geom_smooth()` using formula = 'y ~ x'

In both cases, we can see a moderate to strong positive relationship between student and staff diversity. After noting this relationship, the statistician could go further and examine how the size of an institution might influence it.

data <- school_preferences(region = "New England") %>% 
  filter(`Institution Size` != -1 & `Institution Size` != -2)

data$`Institution Size` <- as.factor(data$`Institution Size`)

ggplot(data, aes(x = `Staff Diversity`, y = `Student Diversity`, color = `Institution Size`)) +
  geom_point() +
  scale_color_viridis_d(option = "magma") +
  geom_smooth(method = "lm", aes(color=`Institution Size`), se = FALSE) +
  labs(title = "Student Diversity vs. Staff Diversity in New England Institutions by Size",
       y = "Student Diversity (%)",
       x = "Staff Diversity (%)")
#> `geom_smooth()` using formula = 'y ~ x'

They can conclude that, among New England institutions, the relationship doesn’t seem to differ much by institution size.
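
A visual impression like this can be backed up with a number. The sketch below computes a Pearson correlation and the coefficients of the straight-line fit that geom_smooth(method = "lm") draws. To stay runnable without the package, it uses only the eight schools printed in Example 1; that subset was already filtered by Sophia's preferences, so its result says nothing reliable about institutions in general.

```r
# Eight rows copied from the Example 1 output; a pre-filtered, illustrative subset.
schools <- data.frame(
  `Student Diversity` = c(68, 51, 54, 41, 40, 49, 48, 37),
  `Staff Diversity`   = c(26, 21, 18, 11, 34, 39, 31, 28),
  check.names = FALSE  # keep the column names used by school_preferences
)

# Pearson correlation between the two percentages.
r <- cor(schools$`Staff Diversity`, schools$`Student Diversity`)
r

# Intercept and slope of the least-squares line of student on staff diversity.
fit <- lm(`Student Diversity` ~ `Staff Diversity`, data = schools)
coef(fit)
```

For the analysis described above, the hand-built data frame would be replaced by the output of school_preferences(), optionally split by Institution Size to compare the groups.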

Example 4:

What are the main similarities and differences between my two top college choices?

Amanda, a high school senior, will soon have to decide where to attend college, but is still debating between her top two choices.

Using the compare_int function, Amanda can take the two schools she is interested in and compare them side by side in a table that lists some of the major qualities of each institution.


compare_int(100654, 100663)
#>                                        Alabama A & M University University of Alabama at Birmingham
#> Size                                                          3                                   5
#> Full Time Students                                         1459                                2361
#> Part Time Students                                           75                                  54
#> Average Aid Awarded                                        9872                                9344
#> Average Award Size                                         9679                               10435
#> City                                                     Normal                          Birmingham
#> State                                                        AL                                  AL
#> Region                                                Southeast                           Southeast
#> Urbanization                                                 12                                  12
#> Calendar System                                               1                                   1
#> Admission Test Scores               Considered_but_not_required         Considered_but_not_required
#> Room & Board Cost                                             .                                   .
#> Degrees Offered                                             Yes                                 Yes
#> AP Credit Accepted                                          Yes                                 Yes
#> Dual Enrollment Credit Accepted                             Yes                                 Yes
#> Study Abroad Programs                                       Yes                                 Yes
#> Freshman Required to Live on Campus                          No                                  No
#> Meals per Week                                               19                                   .