Tidy Tuesday 2019 week 48: Student loan debt

2019/11/26

This week’s data is from the Department of Education courtesy of Alex Albright.

The idea for the data comes from Dignity and Debt, which is running a data visualization contest aimed at understanding and spreading awareness of student loan debt.

Data Dictionary

Provided by Tidy Tuesday

| variable | class | description |
|---|---|---|
| agency_name | character | Name of loan agency |
| year | integer | Two-digit year |
| quarter | integer | Quarter (3-month period) |
| starting | double | Inventory, total value in dollars at start of quarter |
| added | double | Inventory, total value added during quarter |
| total | double | Recoveries, total dollars repaid |
| consolidation | double | Recoveries, dollar value of loans consolidated |
| rehabilitation | double | Recoveries, dollar value of loans rehabilitated |
| voluntary_payments | double | Recoveries, total amount of payments received from borrowers |
| wage_garnishments | double | Recoveries, total amount of wage garnishment payments |

Preview of the data:

## # A tibble: 6 x 10
##   agency_name  year quarter starting   added  total consolidation rehabilitation
##   <chr>       <dbl>   <dbl>    <dbl>   <dbl>  <dbl>         <dbl>          <dbl>
## 1 Account Co…    15       4   5.81e9  1.04e9 1.23e8     20081894.      90952573.
## 2 Allied Int…    15       4   3.69e9 NA      1.13e8     11533809.      86967994.
## 3 CBE Group      15       4   2.36e9 NA      8.39e7      7377703.      64227391.
## 4 Coast Prof…    15       4   7.04e8 NA      9.96e7      3401361.      85960328.
## 5 Collection…    15       4   2.95e9 NA      7.57e7      8946976       58395653.
## 6 Collecto, …    15       4   2.30e9 NA      7.88e7      6952382.      56324366.
## # … with 2 more variables: voluntary_payments <dbl>, wage_garnishments <dbl>

I noticed that some names seemed to be duplicates, such as Action Financial Services and Action Financial Services*, so I combined them. I tried to find a programmatic way to find and replace them, but it didn’t go well, and at some point the document stopped knitting, so I ended up editing the names manually anyway. It also seems like Account Control Technology, Inc. might be the same as ACT. The red rows are those tagged as duplicates, before and after cleaning.
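For reference, here is a minimal sketch of how the duplicate names could be collapsed programmatically. It is written in Python rather than the R used for this post, and it is not the cleaning that was actually done here. The function name `normalize_agency_name` and the `ALIASES` table are hypothetical, and the ACT mapping is only a guess based on the suspicion above.

```python
import re

# Hypothetical alias table. The post only *suspects* that ACT is
# Account Control Technology, Inc., so this mapping is an assumption.
ALIASES = {
    "act": "Account Control Technology, Inc.",
}

def normalize_agency_name(name, aliases=None):
    """Collapse whitespace, drop trailing asterisks, then apply optional aliases."""
    cleaned = re.sub(r"\s+", " ", name).strip()   # collapse runs of whitespace
    cleaned = cleaned.rstrip("*").strip()         # "Name*" -> "Name"
    if aliases:
        # Alias lookup is case-insensitive; unknown names pass through unchanged.
        cleaned = aliases.get(cleaned.lower(), cleaned)
    return cleaned

names = ["Action Financial Services", "Action Financial Services*", "ACT"]
unique_names = sorted({normalize_agency_name(n) for n in names})
print(unique_names)
```

Passing `ALIASES` as the second argument would also fold ACT into Account Control Technology, Inc.; leaving it out keeps the two separate until that guess is confirmed.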

I’m not really sure what the terms mean, so I just made some exploratory plots.

Rehabilitation made up the largest share of recoveries, about 75% of the total each year. Voluntary payments were pretty low but got a slight bump in 2018.
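A share like the 75% figure can be computed by summing each recovery column per year and dividing by the sum of all four. The sketch below is in Python, not the R used for this post, and makes two assumptions: rows are plain dictionaries keyed by the column names from the data dictionary, and the denominator is the sum of the four component columns rather than the `total` column. The numbers in `toy_rows` are made up for illustration and are not from the dataset.

```python
from collections import defaultdict

RECOVERY_COLUMNS = [
    "consolidation",
    "rehabilitation",
    "voluntary_payments",
    "wage_garnishments",
]

def recovery_shares_by_year(rows):
    """Return {year: {column: share of that year's summed recoveries}}."""
    totals = defaultdict(lambda: dict.fromkeys(RECOVERY_COLUMNS, 0.0))
    for row in rows:
        for col in RECOVERY_COLUMNS:
            value = row.get(col)
            if value is not None:          # skip missing (NA) values
                totals[row["year"]][col] += value

    shares = {}
    for year, sums in totals.items():
        grand_total = sum(sums.values())
        shares[year] = {
            col: (amount / grand_total if grand_total else None)
            for col, amount in sums.items()
        }
    return shares

# Made-up values, for illustration only.
toy_rows = [
    {"year": 15, "consolidation": 10.0, "rehabilitation": 60.0,
     "voluntary_payments": 10.0, "wage_garnishments": 20.0},
    {"year": 15, "consolidation": 10.0, "rehabilitation": 90.0,
     "voluntary_payments": None, "wage_garnishments": 0.0},
]
shares = recovery_shares_by_year(toy_rows)
print(shares[15]["rehabilitation"])
```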

Among the inventory variables, the starting amount was the largest, which makes sense. There was a period in 2017 when no inventory was added. It looks like this corresponds to a slight increase in wage garnishments and a decrease in consolidation. There’s a very large bump in inventory added in 2018, which looks like it corresponds to an increase in rehabilitation and voluntary payments.

Below are spaghetti plots for each agency, animated using gganimate.

There seem to be a lot of NAs in the plots. It looks like some data is missing entirely for some years, or there are gaps in some agencies’ data. I’m not sure why, and it would probably take some digging into the companies and other data sources to figure it out. It looks like some of the agencies that go missing were trending down before that, so maybe they closed or went out of business.
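One way to make the gaps concrete is to list, for each agency, the year and quarter combinations that appear somewhere in the data but have no value for that agency. The sketch below is in Python, not the R used for this post. The helper name `missing_periods` and the agency names in `toy_rows` are hypothetical, and the values are made up for illustration.

```python
def missing_periods(rows, column="total"):
    """Return {agency: sorted list of (year, quarter) with no value in `column`}."""
    # Every period that occurs anywhere in the data.
    all_periods = sorted({(r["year"], r["quarter"]) for r in rows})

    # Periods where each agency has a non-missing value.
    observed = {}
    for r in rows:
        seen = observed.setdefault(r["agency_name"], set())
        if r.get(column) is not None:
            seen.add((r["year"], r["quarter"]))

    return {
        agency: [p for p in all_periods if p not in seen]
        for agency, seen in observed.items()
    }

# Made-up rows: "Agency B" has an explicit NA, "Agency C" has no row at all for 2016 Q1.
toy_rows = [
    {"agency_name": "Agency A", "year": 15, "quarter": 4, "total": 1.0},
    {"agency_name": "Agency A", "year": 16, "quarter": 1, "total": 2.0},
    {"agency_name": "Agency B", "year": 15, "quarter": 4, "total": 3.0},
    {"agency_name": "Agency B", "year": 16, "quarter": 1, "total": None},
    {"agency_name": "Agency C", "year": 15, "quarter": 4, "total": 4.0},
]
gaps = missing_periods(toy_rows)
print(gaps)
```

Treating an explicit NA and an absent row the same way is a design choice here; separating the two would help tell an agency that stopped reporting from one that dropped out of the data altogether.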