Lojic Technologies

Archive for December 2007

Name Guessing in Ruby

leave a comment »

I came across a web site recently that prompted me to enter two names before I could view the site. So I wondered how difficult it would be to write a Ruby program that could pass the challenge. As you’d expect, it’s trivial in Ruby. If you encounter such a site, please don’t use the code in part two unless it’s permitted by the terms of service of the site in question.

Part One – Obtain a file of names

I found a government site that provided lists of the 1,000 most popular names for each decade, so I wrote a Ruby program to screen scrape the names into a list in order of popularity.

Make the open-uri library available.

require 'open-uri'

Create a simple Struct to contain the url for each decade along with an array for boy names and girl names (the site provides an ordered list for each gender). Then create an array of Decade instances for 80’s, 90’s and 00’s.

Decade = Struct.new(:url, :boy_names, :girl_names)
decades = [
  Decade.new('http://www.ssa.gov/OACT/babynames/decades/names1980s.html', [], []),
  Decade.new('http://www.ssa.gov/OACT/babynames/decades/names1990s.html', [], []),
  Decade.new('http://www.ssa.gov/OACT/babynames/decades/names2000s.html', [], [])
]

Iterate over each decade and screen scrape the list of boy names and girl names.

decades.each do |decade|
  str = open(decade.url).read
  str.scan(/<tr><td align="right">.*?<td width="15%">(w+)</td>.*?<td width="15%">(w+)</td>/m) do
    decade.boy_names << $1
    decade.girl_names << $2
  end
end

Now we have a pair of lists (boys & girls) for each decade with each list in order of popularity – 6 lists in total. We want to form a single list by selecting the most popular name from each of the 6 lists, followed by the second most popular name, etc. The zip function comes to mind, so we’ll first create an array of the 6 lists.

name_lists = decades.inject([]) do |result, decade|
  result << decade.boy_names
  result << decade.girl_names
  result
end

Now we have a list containing 6 lists. Here is where the “everything is an object” aspect of ruby is a bit of a pain. What I want to do is simply zip the 6 lists together, for example:

x = zip(one, two, three, four, five, six)

but I’m forced to pick one of the lists to be the instance for the zip method, for example:

x = one.zip(two, three, four, five, six)

I suppose my dissatisfaction is due in part to my recent research in functional programming.

The line below will do the following (I really like Ruby 🙂 ):

  1. pop one of the lists off of the main list (leaving 5) to be the instance for zip
  2. explode the list of 5 remaining lists into 5 individual arguments via the * operator
  3. zip the 6 lists together
  4. flatten the resulting list of lists produced by zip into one list
  5. remove duplicate entries
  6. write each name to standard output
(name_lists.shift).zip(*name_lists).flatten.uniq.each {|name| puts name }

Run the program as follows to create a file named names.txt which will be used in part two.

ruby scrape_names.rb > names.txt

Part Two – Brute force form submission

Now that we have a list of over 2,000 names, it’s a simple matter of repeatedly trying pairs of names in order of popularity. Here’s the program in its entirety with comments following:

 1  require 'net/http'

 2  names = ARGF.inject([]) {|result, line| result < 'application/x-www-form-urlencoded' }
 5    (1...names.length).each do |i|
 6      (0..i).each do |j|
 7        response = query.post('/guess/',
 8          "name1=#{names[i]}&name2=#{names[j]}", headers)
 9        if !response.body.include?('Failed')
10          puts "Answer: #{names[i]}, #{names[j]}"
11          exit
12        end
13      end
14    end
15  end

Line 1: makes the HTTP library accessible.

Line 2: reads a list of names from standard input into an array.

Line 3: opens a TCP connection to the specified web server (site withheld for privacy).

Line 4: through trial and error I discovered that contrary to the example in the Pick Axe p. 700, I need to add this header to get the form submission to work.

Lines 5-6: setup the iteration to compare each name (beginning with the second) with each of the previous names up to and including the current one (in case the answer is two identical names).

Lines 7-8: perform an HTTP post with the pair of names.

Lines 9-11: if the retrieved page doesn’t contain a failure string, print the solution to standard output and exit.

Run the program as follows:

ruby guess_names.rb < names.txt

There you have it – a pair of Ruby programs to pass a “guess two names to enter” challenge. The amount of time to find a solution will depend on the uncommonness of the names used. This is an O(n^2) algorithm, so if the names in the solution are near the bottom of the popularity scale, expect a long run time. The name list I produced had 2,697 names, so the worst case scenario is (2697 * 2698) /2 -1 = 3,638,252 requests. If only the first 100 names need to be compared there would be 5,049 requests.

This algorithm is trivial to parallelize – just change line 5 above from (1…names.length) to multiple non-overlapping ranges (one per process), and let her fly.

Ruby has many benefits and is currently my most productive programming language, but I think it particularly shines in this type of ad-hoc, web-enabled task because of the expressiveness of the language and the availability of useful libraries. I highly recommend putting it high on your list of languages to learn if you don’t already use it.

Written by Brian Adkins

December 20, 2007 at 9:04 pm

Posted in programming

Tagged with ,

Paul Graham on Procastination

leave a comment »

The most impressive people I know are all terrible procrastinators. So could it be that procrastination isn’t always bad?

Most people who write about procrastination write about how to cure it. But this is, strictly speaking, impossible. There are an infinite number of things you could be doing. No matter what you work on, you’re not working on everything else. So the question is not how to avoid procrastination, but how to procrastinate well.

It’s ironic that I read this essay while procastinating 🙂 To read the rest, click the previous link.

Written by Brian Adkins

December 17, 2007 at 9:46 pm

Posted in business, people

Tagged with

Bug Labs

leave a comment »

This is one of the coolest ideas I’ve seen in a while. Bug Labs is developing some technology that should be very interesting to any geek. Another great find by Robert Scoble. The video quality isn’t high because they were recorded on his cell phone, but I’m glad he had a video capable cell phone with him when he bumped into Peter.

Part One
Part Two
Part Three

bug_lab.gif

Written by Brian Adkins

December 1, 2007 at 1:16 pm

Posted in technology, video

Tagged with ,

Crayon Physics

with one comment

I found this video of a “crayon physics” game on Robert Scoble’s site – very cool!

Written by Brian Adkins

December 1, 2007 at 11:08 am

Posted in amazing, science, video

Tagged with ,