Hanukkah of Data 5784 - Day 8 The Collector
The Task
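We’re given the clue that the Collector has an entire set of Noah’s collectibles. The goal is to find the Collector’s phone number, so we look for the customer who has bought the most distinct collectible products.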
Solution
```python
import pandas as pd


def solve():
    customers = pd.read_csv('noahs-customers.csv')
    orders = pd.read_csv('noahs-orders.csv')
    order_items = pd.read_csv('noahs-orders_items.csv')
    products = pd.read_csv('noahs-products.csv')

    # Each merge joins on the column names the two frames share.
    data = customers.merge(orders).merge(order_items).merge(products)

    # Boolean mask: True for rows whose SKU marks a collectible.
    is_collectible = data['sku'].str.startswith('COL')

    # One row per (customer, collectible), then count rows per customer.
    grouped = (data[is_collectible][['customerid','sku','phone']].
        drop_duplicates().
        groupby(['customerid','phone']).
        count())

    # Keep the customer(s) with the highest count and return the first phone.
    return (grouped[grouped['sku'] == grouped['sku'].max()].
        reset_index().
        iloc[0]['phone'])

# ---------------------------------------------------------------------------------------------
assert solve() == '212-547-3518'
```
Create a predicate (a boolean mask) for collectibles, whose SKUs start with COL:
```python
is_collectible = data['sku'].str.startswith('COL')
```
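To make the predicate concrete, here is a minimal sketch on a made-up frame. The rows and SKUs below are hypothetical and are not taken from Noah's data. The `na=False` argument is an optional extra that the solution above does not use; it treats a missing SKU as a non-match instead of producing NaN in the mask.

```python
import pandas as pd

# Hypothetical toy rows; these SKUs are invented for illustration.
toy = pd.DataFrame({
    'customerid': [1, 1, 2, 2],
    'sku': ['COL0001', 'TOY0002', 'COL0003', 'KIT0004'],
})

# Boolean Series: True where the SKU starts with 'COL'.
# na=False makes missing SKUs count as False rather than NaN.
mask = toy['sku'].str.startswith('COL', na=False)

# Indexing with the mask keeps only the collectible rows.
collectibles = toy[mask]
print(collectibles)
```

The mask is an ordinary boolean Series aligned with the frame's index, which is why it can be computed once and reused to filter `data` later.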
Group by customer and count the number of distinct collectibles per customer, dropping duplicate rows first so that repeat purchases of the same item count only once:
```python
grouped = (data[is_collectible][['customerid','sku','phone']].
    drop_duplicates().
    groupby(['customerid','phone']).
    count())
```
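The effect of `drop_duplicates` before counting is easiest to see on a small example. The sketch below uses invented customers and SKUs. It also shows `nunique` as one possible alternative that I believe gives the same counts without the explicit de-duplication step; that variant is my own addition, not part of the original solution.

```python
import pandas as pd

# Hypothetical purchases: customer 1 bought COL0001 twice and COL0002 once,
# customer 2 bought COL0001 twice.
toy = pd.DataFrame({
    'customerid': [1, 1, 1, 2, 2],
    'phone': ['111-111-1111'] * 3 + ['222-222-2222'] * 2,
    'sku': ['COL0001', 'COL0001', 'COL0002', 'COL0001', 'COL0001'],
})

# Same pattern as the post: de-duplicate, then count rows per customer.
counted = (toy[['customerid', 'sku', 'phone']]
           .drop_duplicates()
           .groupby(['customerid', 'phone'])
           .count())

# Alternative sketch: count distinct SKUs per customer directly.
distinct = toy.groupby(['customerid', 'phone'])['sku'].nunique()

print(counted)
print(distinct)
```

Without the `drop_duplicates` call, customer 1 would be credited with three collectibles rather than two, so a frequent buyer of a single item could outrank someone who owns the full set.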
Return the phone number of the customer with the most collectibles:
```python
return (grouped[grouped['sku'] == grouped['sku'].max()].
    reset_index().
    iloc[0]['phone'])
```
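As a rough illustration of this last step, the sketch below builds a small counts frame by hand (the customers and phone numbers are made up) and picks out the top row in two ways: the filter-on-max pattern used above, and `idxmax`, which is an alternative I am suggesting rather than something the solution uses. On a Series with a two-level index, `idxmax` returns the index label as a tuple.

```python
import pandas as pd

# Hypothetical per-customer counts, indexed like the grouped frame above.
counts = pd.DataFrame(
    {'sku': [2, 3, 1]},
    index=pd.MultiIndex.from_tuples(
        [(1, '111-111-1111'), (2, '222-222-2222'), (3, '333-333-3333')],
        names=['customerid', 'phone'],
    ),
)

# Pattern from the post: keep rows equal to the max, then read the phone.
top = counts[counts['sku'] == counts['sku'].max()].reset_index()
phone = top.iloc[0]['phone']

# Alternative sketch: idxmax returns the (customerid, phone) label directly.
customerid, phone_via_idxmax = counts['sku'].idxmax()

print(phone, phone_via_idxmax)
```

The filter-on-max version keeps every customer tied for first place, which makes it easy to check whether the answer is unique before taking `iloc[0]`; `idxmax` returns only the first of any tied rows.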
Conclusion
That’s the end of Hanukkah of Data 5784. I had a lot of fun, and learned quite a bit about pandas, but I have a long way to go to be proficient with Python’s data science ecosystem!