5 Minute Analysis of Data.gov Datasets with PivotBillions


Data.gov is an open source of data provided by the US government, offering a wealth of interesting information on topics such as agriculture, education, finance, health, and science. It currently lists over 200,000 datasets available to explore.

In this 5 Minute Analysis, we picked one dataset from the consumer data category: the Financial Services Consumer Complaint Database. This dataset is provided by the Consumer Financial Protection Bureau (CFPB) and is a record of the complaints the CFPB has received about financial products and services offered by a multitude of companies.

The data consists of over 1.2 million rows in CSV format and is about 700 MB in size. Loading it was simple: we used the drag-and-drop feature in PivotBillions to upload the file to our cloud-based demo portal.
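For readers who want to sanity-check the download before uploading it, here is a minimal Python sketch that streams a CSV once and reports its column names and row count without loading the whole file into memory. It is not part of PivotBillions. The file name `complaints.csv` and the three columns in the sample are assumptions for illustration, and the sample rows are synthetic.

```python
import csv
import io

def summarize_csv(handle):
    """Return (column_names, row_count) by streaming the file once."""
    reader = csv.reader(handle)
    header = next(reader, [])           # first record holds the column names
    row_count = sum(1 for _ in reader)  # count remaining records lazily
    return header, row_count

# Tiny synthetic stand-in for the real export (invented rows, not CFPB data).
sample = io.StringIO(
    "Date received,Product,State\n"
    "2015-03-01,Mortgage,CA\n"
    "2016-07-15,Debt collection,TX\n"
)
columns, rows = summarize_csv(sample)
print(columns, rows)

# For the real file (hypothetical path):
# with open("complaints.csv", newline="", encoding="utf-8") as f:
#     print(summarize_csv(f))
```

Using `csv.reader` rather than counting raw lines matters here, because free-text fields can contain embedded newlines that would inflate a simple line count.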




Some quick exploratory data analysis (EDA) reveals the following general trends in the data.

Since the launch of the CFPB back in 2011, the number of recorded complaints has gone up year over year.
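The same year-over-year count can be reproduced outside PivotBillions with a few lines of Python. This is a rough sketch, assuming the export has a `Date received` column in `YYYY-MM-DD` format. Both the column name and the date format are assumptions to check against your copy of the file, and the sample rows are synthetic.

```python
import csv
import io
from collections import Counter
from datetime import datetime

def complaints_per_year(handle, date_column="Date received",
                        date_format="%Y-%m-%d"):
    """Count complaints by the year in which they were received."""
    counts = Counter()
    for row in csv.DictReader(handle):
        received = datetime.strptime(row[date_column], date_format)
        counts[received.year] += 1
    return dict(sorted(counts.items()))  # chronological order

# Synthetic rows for illustration only.
sample = io.StringIO(
    "Date received,Product\n"
    "2012-01-05,Mortgage\n"
    "2013-02-11,Mortgage\n"
    "2013-09-30,Debt collection\n"
)
yearly = complaints_per_year(sample)
print(yearly)
```

If your export uses a different date layout, such as `MM/DD/YYYY`, pass the matching `date_format` string instead.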

In that time, mortgages, debt collection, and credit reporting have been by far the financial products and services most often associated with a complaint, outpacing every other type of product or service.
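A ranking like this one is a plain frequency count over a single column. Below is a small sketch using `collections.Counter`. The column name `Product` is an assumption about the export's header, and the rows are synthetic.

```python
import csv
import io
from collections import Counter

def top_values(handle, column, n=3):
    """Return the n most frequent values in `column` with their counts."""
    counts = Counter(row[column] for row in csv.DictReader(handle))
    return counts.most_common(n)

# Synthetic rows for illustration only.
sample = io.StringIO(
    "Product,State\n"
    "Mortgage,CA\n"
    "Mortgage,FL\n"
    "Mortgage,TX\n"
    "Debt collection,CA\n"
    "Debt collection,NY\n"
    "Student loan,GA\n"
)
top_products = top_values(sample, "Product", n=2)
print(top_products)
```

The same helper can rank any other categorical column, for example the company, the state, or the submission channel, by changing the `column` argument.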

The companies that receive the most complaints include the major credit reporting companies and banks.

The top five states for consumer complaints are California, Florida, Texas, New York and, surprisingly, Georgia. Georgia is a bit of an outlier: it is only the 8th largest US state by population, yet more complaints originated there than in Illinois, which has a larger population.
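One way to make the population comparison explicit is to normalize complaint counts per 100,000 residents. The sketch below assumes you supply your own population figures. The state codes `AA` and `BB` and all of the numbers are placeholders, not real data.

```python
def complaints_per_100k(complaints, population):
    """Rank states by complaints per 100,000 residents, highest first."""
    rates = {
        state: complaints[state] * 100_000 / population[state]
        for state in complaints
        if population.get(state, 0) > 0  # skip states with no population figure
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)

# Placeholder inputs: hypothetical states and made-up figures.
complaints = {"AA": 500, "BB": 300}
population = {"AA": 10_000_000, "BB": 2_000_000}
ranked = complaints_per_100k(complaints, population)
print(ranked)
```

In this made-up example, `BB` has fewer complaints in absolute terms but a higher rate per resident, which is the kind of pattern the raw state ranking hides.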

The overwhelmingly preferred method for reporting complaints is online, via the web.

Among the top three financial products and services, mortgage-related complaints have declined, while debt collection and credit reporting complaints have grown significantly over the same period.
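To follow each product's trend over time, the counts can be grouped by product and month. The sketch below assumes ISO-formatted dates (`YYYY-MM-DD`) in a `Date received` column and a `Product` column. The column names, the product labels, and the sample rows are all illustrative assumptions.

```python
import csv
import io
from collections import Counter, defaultdict

def monthly_counts_by_product(handle, products,
                              date_column="Date received",
                              product_column="Product"):
    """Return {product: {"YYYY-MM": count}} for the selected products."""
    counts = defaultdict(Counter)
    for row in csv.DictReader(handle):
        product = row[product_column]
        if product in products:
            month = row[date_column][:7]  # "YYYY-MM" prefix of an ISO date
            counts[product][month] += 1
    return {p: dict(sorted(c.items())) for p, c in counts.items()}

# Synthetic rows for illustration only.
sample = io.StringIO(
    "Date received,Product\n"
    "2017-08-03,Mortgage\n"
    "2017-08-10,Credit reporting\n"
    "2017-09-08,Credit reporting\n"
    "2017-09-09,Credit reporting\n"
    "2017-09-12,Student loan\n"
)
trend = monthly_counts_by_product(sample, {"Mortgage", "Credit reporting"})
print(trend)
```

Each product's monthly series can then be plotted or compared directly. Product labels in the real export may differ from the ones used here, so check the distinct values in the column first.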


The Wrap-up

What insights did we gain through our EDA? The decline in mortgage complaints seems to indicate a steady recovery from the subprime mortgage crisis of 2007-2008, although there is still friction with consumers trying to modify loans or avoid foreclosure. The slow rise in debt collection complaints might indicate that consumers are taking on more credit card or loan debt and that the rate of default is rising. The most drastic increase in complaints is with credit reporting and credit repair services. The huge spike in September 2017 is primarily a result of the Equifax data breach announcement, but even before then there appears to have been a steady rise in complaints about how the credit reporting agencies were handling information in consumer credit reports.

A lot more can be gleaned by delving deeper into the data, but that will have to wait for another post. This dataset is currently available in our public demo portal for anyone to play with. Try it for yourself and see what insights you find.