Hanukkah of Data 5784 - Day 1 The Investigator
The Task
The task is to find the customer whose last name can be spelled by their phone number, using the letters shown for each number on a dial pad, and report their phone number. We’re given a customer file as follows:
>>> import pandas as pd
>>> df = pd.read_csv('noahs-customers.csv')
>>> df.loc[:5,'customerid':'phone']
   customerid                   name              address          citystatezip   birthdate         phone
0        1001     Jacqueline Alvarez    105N Elizabeth St   Manhattan, NY 10013  1958-01-23  315-377-5031
1        1002           Julie Howell      185-1 Linden St    Brooklyn, NY 11221  1956-12-03  680-537-8725
2        1003        Christopher Ali  174-28 Baisley Blvd     Jamaica, NY 11434  2001-09-20  315-846-6054
3        1004  Christopher Rodriguez    102 Mount Hope Pl       Bronx, NY 10453  1959-07-10  516-275-2292
4        1005      Jeffrey Wilkinson       17 St Marks Pl   Manhattan, NY 10003  1988-09-08  838-830-6960
5        1006             Emma Wells       86-34 102nd Rd  Ozone Park, NY 11416  1984-02-05  315-236-2043
Solution
import re
import pandas as pd

# Letters printed on each dial pad digit (0 and 1 carry no letters).
letters = {
    '2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
    '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'
}

def phone_matches_last_name(phone, name):
    # Keep only the digits of the phone number.
    phone = re.sub(r'\D', '', phone)
    # The last whitespace-separated word is taken as the last name.
    last_name = str.split(name)[-1].lower()

    if len(last_name) < len(phone):
        return False

    # Each letter must be one of the letters on the corresponding digit.
    for d, c in zip(phone, last_name):
        if c not in letters.get(d, ''):
            return False

    return True

def solve():
    customers = pd.read_csv('noahs-customers.csv')
    predicate = lambda row: phone_matches_last_name(row['phone'], row['name'])
    return customers[customers.apply(predicate, axis=1)].iloc[0]['phone']

# ---------------------------------------------------------------------------------------------

assert solve() == '826-636-2286'
Commentary
First, map each digit of a phone dial pad to its letters using a dict:
# Digits 0 and 1 carry no letters, so they are left out.
letters = {
    '2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
    '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'
}
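The same table can also be read in the other direction, from letter to digit. The sketch below is not part of the solution; it assumes the letters dict shown above, and the names letter_to_digit, table and name_to_digits are made up for illustration. It turns a name into the digits that would spell it, which is handy for checking a candidate by hand.

```python
# Same dial pad table as in the article.
letters = {
    '2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
    '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'
}

# Invert the table: each letter points at its digit (hypothetical helper).
letter_to_digit = {c: d for d, chars in letters.items() for c in chars}

# str.maketrans accepts a dict of single characters to replacement strings.
table = str.maketrans(letter_to_digit)

def name_to_digits(name):
    """Return the dial pad digits for a name; other characters pass through."""
    return name.lower().translate(table)

print(name_to_digits('Alvarez'))  # prints 2582739
```

Comparing name_to_digits(last_name) with the cleaned phone number would be an alternative to the letter-by-letter check used in this post.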
Next, a helper predicate to indicate whether a phone number can spell a name. We’ll clean the phone number by removing every non-digit character, and we grab the last name by splitting on whitespace and taking the last element of the list. If the last name is shorter than the phone number, it’s not a match; otherwise, pair each digit with the letter at the same position in the last name and check that the letter is among the letters associated with that digit. If every pair passes, it’s a match. Note that zip stops at the shorter sequence, so for a last name longer than the phone number only as many letters as there are digits are checked.
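def phone_matches_last_name(phone, name):
    # Keep only the digits of the phone number.
    phone = re.sub(r'\D', '', phone)
    # The last whitespace-separated word is taken as the last name.
    last_name = str.split(name)[-1].lower()

    if len(last_name) < len(phone):
        return False

    # Each letter must be one of the letters on the corresponding digit.
    for d, c in zip(phone, last_name):
        if c not in letters.get(d, ''):
            return False

    return True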
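Because zip stops at the shorter input, the predicate above accepts a last name that is longer than the phone number as long as its leading letters match. That does not affect the answer here, but a stricter check is easy to sketch. The function name phone_spells_last_name_strict and the sample names and phone number below are made up for illustration; they are not taken from the customer file.

```python
import re

# Same dial pad table as in the article.
letters = {
    '2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
    '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'
}

def phone_spells_last_name_strict(phone, name):
    """Hypothetical stricter variant: lengths must be equal, not just long enough."""
    digits = re.sub(r'\D', '', phone)
    last_name = name.split()[-1].lower()
    if len(last_name) != len(digits):
        return False
    # all() stops at the first letter that is not on its digit.
    return all(c in letters.get(d, '') for d, c in zip(digits, last_name))

# Made-up inputs: "washington" maps to 9274464866 on the dial pad.
print(phone_spells_last_name_strict('927-446-4866', 'Pat Washington'))   # prints True
print(phone_spells_last_name_strict('927-446-4866', 'Pat Washingtons'))  # prints False
```

The second call is the case where the two versions differ: the eleven-letter name passes the original length test and matches on its first ten letters, while the strict variant rejects it.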
With that in place, we just need to keep the rows for which our predicate is true and return the phone number of the first match.
def solve():
    customers = pd.read_csv('noahs-customers.csv')
    # Apply the predicate row by row (axis=1) to build a boolean mask.
    predicate = lambda row: phone_matches_last_name(row['phone'], row['name'])
    return customers[customers.apply(predicate, axis=1)].iloc[0]['phone']
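For trying the filter without the real file, the same idea can be sketched with only the standard library. This is an alternative to the pandas version above, not the approach used in this post; it assumes the CSV has name and phone columns as in the excerpt earlier. The function matching_phones is a hypothetical helper, the first sample row is made up, and the second is copied from that excerpt.

```python
import csv
import io
import re

letters = {
    '2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
    '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'
}

def phone_matches_last_name(phone, name):
    # Same predicate as in the article.
    phone = re.sub(r'\D', '', phone)
    last_name = str.split(name)[-1].lower()
    if len(last_name) < len(phone):
        return False
    for d, c in zip(phone, last_name):
        if c not in letters.get(d, ''):
            return False
    return True

def matching_phones(csv_file):
    """Hypothetical helper: phone numbers of all rows that pass the predicate."""
    reader = csv.DictReader(csv_file)
    return [row['phone'] for row in reader
            if phone_matches_last_name(row['phone'], row['name'])]

# Tiny in-memory stand-in for the customer file (first row is made up).
sample = io.StringIO(
    "customerid,name,phone\n"
    "9999,Pat Washington,927-446-4866\n"
    "1006,Emma Wells,315-236-2043\n"
)
result = matching_phones(sample)
print(result)  # prints ['927-446-4866']
```

Returning every match instead of only the first makes it easy to confirm that the predicate singles out exactly one customer.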