We’ve all heard about the many improvements and changes being made to help get women into Hi-Tech and the Sciences, but where are these improvements being seen? Is the entire nation reaching gender equality, or are there pockets of improvement and pockets of stagnation? These are the questions I set out to answer using the […] continue reading »
Category: Kaggle
San Francisco – A City Full of Psychopaths?
With a population of over 800,000 people within approximately 47 square miles, San Francisco is one of the most densely populated cities in the United States. Cities with this level of population density must contend with a high level of policing activity to ensure safety. I wanted to explore just how safe it actually is […] continue reading »
Data Twofer: Exploring LA Web Traffic and Google Appstore Sentiment
Overview It’s been a few weeks since we posted something in our 5 Minute Analysis series, so we decided to do two quick analyses on two different datasets we found on Kaggle. Instead of doing the analysis locally using PivotBillions on Docker, we opted to run the analysis in the free cloud version available on […] continue reading »
Health Data Analysis: CDC Behavioral Risk Factor data says eat your green veggies
Everyone wants to be healthy, but there are many competing claims as to how you can achieve this. With so many contradictory diets, exercise routines that take enormous amounts of time and dedication, and many other perceived paths to a healthy body and mind, tying these claims to actual data becomes very necessary and useful. […] continue reading »
Completing the Picture: Who is the Fantasy Football GOAT for Offense?
Fantasy football can be a relaxing pastime, but for anyone who takes the competition seriously, data immediately becomes very necessary. While many people track their favorite players from their favorite teams, to truly put together a winning team you need to be able to explore and understand large amounts of data. Moreover, the data […] continue reading »
Completing the Picture: Uncovering NHL MVPs in a pile of data
Data is rarely consistent. The most consistent attribute of data is that it is usually dispersed across many files and needs to be put back together again to truly understand it. Pivot Billions dramatically improves this process and makes it easy to merge your data and start to analyze it. As an example of this […] continue reading »
Finding Underutilized Kaggle Data
Overview In this 5 Minute Analysis we are exploring a Kaggle dataset about Kaggle datasets. This dataset lets us see a list of the datasets on Kaggle, and shows which ones have the most engagement and activity. Our goal is to explore and filter the data to find popular datasets with many downloads but very […] continue reading »
Simplifying Iowa Liquor Sales Data for Loading to Tableau
Overview In this 5 Minute Analysis, we pre-process, map, and explore complicated public sales data for liquor stores in Iowa to extract relevant latitude and longitude from a problematic column in the data. We want to filter the data for the city with the most sales and prepare it for easy loading into the […] continue reading »
Getting the Whole Picture of Retail Sales
Overview It should be apparent that we like to use datasets found on Kaggle. If you are not familiar with Kaggle, you should check out their website to peruse the thousands of great datasets available for free to play around with. In this 5 Minute Analysis of the Kaggle Retail Data Analytics we explore historical […] continue reading »
Olympics Data Reveals Rising Gender Equality
Overview In this 5 Minute Analysis of the Kaggle historical Olympic athletic data we filtered and pivoted the data in various ways to understand the trends in participation by gender over time. Steps 1. Load the Data and View its Structure The steps for loading data were covered in a previous post. 2. View and Explore the […] continue reading »