The holiday season brings a degree of cheer and joy that many claim makes people act friendlier towards each other. I wanted to see whether this effect translates into action, so I decided to look at tips for New York green taxis during the holiday season and during the rest of the year. To start, I streamed all of the green taxi data files from the public NYC Taxi and Limousine Commission Trip Record Data for 2017-07-01 to 2018-06-30 (the most recent year of green taxi data) into Pivot Billions and enhanced the data with two new columns: holidayseason and tip_percent.
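For readers who want to reproduce the two derived columns outside Pivot Billions, here is a minimal Python sketch. It rests on assumptions the analysis above does not spell out: the holiday window (HOLIDAY_START to HOLIDAY_END) is a guess at a Thanksgiving-to-New-Year's range, and tip_percent is assumed to be the tip as a percentage of fare_amount. The helper name enhance is hypothetical; lpep_pickup_datetime, fare_amount and tip_amount are the field names used in the TLC green taxi files.

```python
from datetime import date, datetime

# ASSUMPTION: the exact holiday window is not stated in the post.
# Thanksgiving 2017 through New Year's Day 2018 is used for illustration.
HOLIDAY_START = date(2017, 11, 23)
HOLIDAY_END = date(2018, 1, 1)


def enhance(trip):
    """Return a copy of one trip record with holidayseason and tip_percent added."""
    pickup = datetime.strptime(trip["lpep_pickup_datetime"], "%Y-%m-%d %H:%M:%S")
    fare = float(trip["fare_amount"])
    tip = float(trip["tip_amount"])

    enhanced = dict(trip)
    enhanced["holidayseason"] = HOLIDAY_START <= pickup.date() <= HOLIDAY_END
    # ASSUMPTION: tip_percent is the tip relative to the base fare.
    # It is left as None when the fare is zero or negative.
    enhanced["tip_percent"] = 100.0 * tip / fare if fare > 0 else None
    return enhanced
```

Applied to a record such as {"lpep_pickup_datetime": "2017-12-24 18:30:00", "fare_amount": "20.00", "tip_amount": "4.00"}, this marks the trip as a holiday-season trip with a 20 percent tip.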
Many rows weren’t relevant to this analysis because cash payments have no record of tips, so I filtered out cash payments using the payment_type column in Pivot Billions, bringing the total to ~5 million rows.
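The same filter can be sketched in a few lines of Python. This assumes the TLC data dictionary coding, in which payment_type 2 means cash, and assumes the value arrives as a string read from CSV. The function name drop_cash is hypothetical.

```python
CASH_PAYMENT_TYPE = 2  # TLC data dictionary: 1 = credit card, 2 = cash


def drop_cash(trips):
    """Keep only trips that were not paid in cash, since cash tips are not recorded."""
    kept = []
    for trip in trips:
        # int(float(...)) tolerates values such as "2" and "2.0".
        if int(float(trip["payment_type"])) != CASH_PAYMENT_TYPE:
            kept.append(trip)
    return kept
```

Rows with a blank payment_type would raise an error here; a real pipeline would need to decide whether to drop or keep them.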
To dive into the data, I used Pivot Billions’ pivot feature to quickly reorganize all of this filtered data by where the passenger(s) were dropped off (DOLocationID) and whether the trip occurred during the holiday season. What began as over 9 million rows of data was now reduced to a much more manageable 513-row summary. After downloading this new view of the data from Pivot Billions, I switched my focus to visualizing and analyzing the data in Tableau.
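The pivot described above amounts to grouping by drop-off zone and holiday flag, then averaging. The Python sketch below illustrates that idea only and is not Pivot Billions' own implementation. It assumes each record already carries DOLocationID, holidayseason and tip_percent. The function name pivot_tips and the output keys avg_tip_percent and trips are hypothetical.

```python
from collections import defaultdict


def pivot_tips(trips):
    """Average tip_percent for each (DOLocationID, holidayseason) pair."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of tip_percent, trip count]
    for trip in trips:
        if trip["tip_percent"] is None:
            continue  # skip trips where no tip percentage could be computed
        key = (trip["DOLocationID"], trip["holidayseason"])
        totals[key][0] += trip["tip_percent"]
        totals[key][1] += 1
    return {
        key: {"avg_tip_percent": total / count, "trips": count}
        for key, (total, count) in totals.items()
    }
```

Each drop-off zone yields at most two summary rows, one for holiday-season trips and one for the rest of the year, which is why millions of trips collapse into a few hundred rows.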
Now that the data had been reduced to a size Tableau can handle, I loaded the Taxi Zone Shapefile and my newly downloaded DoLocationID_holiday_tips.csv file into Tableau. This was a simple process of loading the shapefile as a data source and then joining it to the Pivot Billions-processed file by setting Location ID equal to DOLocationID.
I then defined a new metric called “Holiday Effect,” which tracks the percentage difference in average tips between the holiday season and the rest of the year, and added a few dynamic filters to make the data interactive and explorable. The result was a very clear and powerful visualization of the green taxi data.
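One plausible reading of the Holiday Effect metric is a relative difference in average tip percentage, shown in the Python sketch below. The exact formula used in Tableau is not given here, so treat this as an assumption: it could equally be a plain percentage-point difference. The function name holiday_effect is hypothetical.

```python
def holiday_effect(holiday_avg, rest_avg):
    """Relative change, in percent, of the holiday-season average tip
    versus the rest-of-year average tip for one drop-off zone.

    ASSUMPTION: the metric is (holiday - rest) / rest * 100.
    Returns None when the rest-of-year average is missing or zero.
    """
    if not rest_avg:
        return None
    return 100.0 * (holiday_avg - rest_avg) / rest_avg
```

Under this definition, a zone averaging 22 percent tips in the holiday season and 20 percent otherwise has a Holiday Effect of +10, while the reverse pattern gives a negative value.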
It is immediately clear that some regions show far more positive holiday effects (blue areas) than negative ones (orange areas), while others show the reverse. With Tableau’s dynamic filters, it's easy to narrow down the data by location and explore which areas of New York experience the effect the most. It appears that the Bronx and Brooklyn experience more negative effects, whereas Queens is evenly split between positive and negative. However, Manhattan and Newark Airport have a much higher proportion of positive effects during the holiday season.
Though most of New York is being affected by the holidays for better or worse, people going to Manhattan and Newark Airport seem to be feeling the holiday spirit the most.
To view and interact with this visualization or download the workbook to Tableau, see my Holiday Effect on Tips by Drop-Off Location workbook on Tableau Public. You can also explore or download my other workbook to see the Holiday Effect on Tips by Pick-Up Location.