Lojic Technologies

Archive for the ‘people’ Category

How to Write a Spelling Corrector in Ruby

Update 10/16/2015: Please see the Racket Version also.

Peter Norvig wrote a simple spelling corrector in 20 lines of Python 2.5,
so I thought I’d see what it looks like in Ruby. Here are some areas I’m not pleased with:

  1. List comprehensions in Python made the edits1 function more elegant IMO.
  2. Python treats empty sets and lists as false in the boolean expression inside the correct function, but Ruby treats an empty array as true, so I had to add the “result.empty? ? nil : result” expression to several functions. I expect there’s a better way to handle this.

Otherwise, the translation was pretty straightforward.
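
To make the second point concrete, here is a minimal sketch of the truthiness difference in Ruby. The helper name nil_if_empty and the sample arrays are made up for illustration. The last line shows one possible alternative (picking the first non-empty candidate with find), which may or may not be the better way hoped for above.

```ruby
# In Ruby only nil and false are falsy, so an empty array is truthy
# and `or` never falls through to the next candidate list.
empty    = []
fallback = ["where"]

chosen_naive = (empty or fallback)                 # => []

# The workaround used in this post: turn an empty result into nil.
# (nil_if_empty is a hypothetical helper name.)
def nil_if_empty(list)
  list.empty? ? nil : list
end

chosen_fixed = (nil_if_empty(empty) or fallback)   # => ["where"]

# One possible alternative: take the first candidate list that is not empty.
chosen_find = [empty, fallback].find { |c| !c.empty? }   # => ["where"]
```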

Here’s a link to Norvig’s page:
http://www.norvig.com/spell-correct.html

That page includes a link to a text file that I saved locally as
holmes.txt: http://www.norvig.com/holmes.txt

def words text
  text.downcase.scan(/[a-z]+/)
end

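# Count word occurrences. Unseen words read back a default count of 1,
# and the first occurrence of a word bumps that default to 2.
def train features
  model = Hash.new(1)
  features.each {|f| model[f] += 1 }
  model
end

# File.read opens, reads and closes the file in one call.
NWORDS = train(words(File.read('holmes.txt')))
LETTERS = ("a".."z").to_a.join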

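# All strings one edit away from word: deletions, transpositions,
# alterations (one letter replaced) and insertions.
def edits1 word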
  n = word.length
  deletion = (0...n).collect {|i| word[0...i]+word[i+1..-1] }
  transposition = (0...n-1).collect {|i| word[0...i]+word[i+1,1]+word[i,1]+word[i+2..-1] }
  alteration = []
  n.times {|i| LETTERS.each_byte {|l| alteration << word[0...i]+l.chr+word[i+1..-1] } }
  insertion = []
  (n+1).times {|i| LETTERS.each_byte {|l| insertion << word[0...i]+l.chr+word[i..-1] } }
  result = deletion + transposition + alteration + insertion
  result.empty? ? nil : result
end

# Known words that are two edits away from word (nil if there are none).
def known_edits2 word
  result = []
  edits1(word).each {|e1| edits1(e1).each {|e2| result << e2 if NWORDS.has_key?(e2) }}
  result.empty? ? nil : result
end

# The subset of words that appear in the training text (nil if there are none).
def known words
  result = words.find_all {|w| NWORDS.has_key?(w) }
  result.empty? ? nil : result
end

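# Pick the most frequent candidate: the word itself if known, else known words
# one edit away, else known words two edits away, else the word unchanged.
def correct word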
  (known([word]) or known(edits1(word)) or known_edits2(word) or
    [word]).max {|a,b| NWORDS[a] <=> NWORDS[b] }
end
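
For comparison, here is a hedged sketch of how the two pain points above might be handled in more recent Ruby (it assumes Ruby 2.0 or later for lazy, and 1.9 or later for string indexing that returns strings). It builds the edits from (head, tail) splits with map and flat_map, which reads a little closer to Python's list comprehensions, and it picks the first non-empty candidate list with find instead of converting empty arrays to nil. The names SAMPLE_TEXT, ALPHABET, COUNTS, edits1_alt, known_alt and correct_alt are made up for this example, and the tiny inline corpus only stands in for holmes.txt.

```ruby
# Tiny stand-in corpus (an assumption for this sketch; the post trains on holmes.txt).
SAMPLE_TEXT = "where were we when the weather was there where the wheel was"
ALPHABET = ("a".."z").to_a

# Word counts, with the same default of 1 for unseen words as train above.
COUNTS = Hash.new(1)
SAMPLE_TEXT.downcase.scan(/[a-z]+/).each { |w| COUNTS[w] += 1 }

# All strings one edit away from word, built from every (head, tail) split.
def edits1_alt(word)
  splits    = (0..word.length).map { |i| [word[0...i], word[i..-1]] }
  non_empty = splits.reject { |_, tail| tail.empty? }

  deletes    = non_empty.map { |head, tail| head + tail[1..-1] }
  transposes = splits.select { |_, tail| tail.length > 1 }
                     .map { |head, tail| head + tail[1] + tail[0] + tail[2..-1] }
  replaces   = non_empty.flat_map { |head, tail| ALPHABET.map { |c| head + c + tail[1..-1] } }
  inserts    = splits.flat_map { |head, tail| ALPHABET.map { |c| head + c + tail } }

  (deletes + transposes + replaces + inserts).uniq
end

# Words from the list that occur in the corpus (possibly an empty array).
def known_alt(words)
  words.select { |w| COUNTS.key?(w) }
end

def correct_alt(word)
  # Each candidate set is wrapped in a lambda so the expensive ones are
  # only computed when the cheaper ones turn out to be empty.
  candidates = [
    -> { known_alt([word]) },
    -> { known_alt(edits1_alt(word)) },
    -> { known_alt(edits1_alt(word).flat_map { |e| edits1_alt(e) }.uniq) },
    -> { [word] }
  ]
  candidates.lazy.map(&:call).find { |c| !c.empty? }.max_by { |w| COUNTS[w] }
end

puts correct_alt("whree")   # prints "where" with the sample corpus above
```

The trade-off is that the two-edit candidates are materialized as one large array before filtering, so this version favors readability over memory use.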

After you’ve saved the holmes.txt file and saved the code above as spelling_corrector.rb in the same directory, load the code into irb and call the correct function with a string as follows:

badkins:~/sync/code/ruby$ irb
irb(main):001:0> require './spelling_corrector'
=> true
irb(main):002:0> correct "whree"
=> "where"

Written by Brian Adkins

September 4, 2008 at 3:58 pm

Posted in people, programming

Startup School 2008

http://omnisio.com/startupschool08

http://www.justin.tv/hackertv/97554/Startup_School

Peter Norvig, Paul Graham, Marc Andreessen, Mike Arrington, Jeff Bezos, David Heinemeier Hansson, etc.

Written by Brian Adkins

April 21, 2008 at 12:22 pm

Posted in business, people, video

Paul Graham on Procrastination

The most impressive people I know are all terrible procrastinators. So could it be that procrastination isn’t always bad?

Most people who write about procrastination write about how to cure it. But this is, strictly speaking, impossible. There are an infinite number of things you could be doing. No matter what you work on, you’re not working on everything else. So the question is not how to avoid procrastination, but how to procrastinate well.

It’s ironic that I read this essay while procrastinating 🙂 To read the rest, click the previous link.

Written by Brian Adkins

December 17, 2007 at 9:46 pm

Posted in business, people

BBC Richard Feynman Interview

Although I disagree with Richard Feynman’s conclusions on some of the more important questions we can ask, no one can deny he is an interesting and amazing person. I’ve read a few biographical books about him, and they were quite entertaining. Here is a video interview he did for the BBC in 1981.

Although I’ve read a fair amount about him, this was the first time I heard his voice. I noticed he sounds very much like Regis Philbin 🙂

UPDATE:
Here are a few more BBC video links:

Written by Brian Adkins

November 16, 2007 at 11:21 am

Posted in people, video

Peter Seibel’s “Practical Common Lisp” Google Talk

Here’s Peter Seibel’s “Practical Common Lisp” talk at Google (about an hour):

Google Video Link

Written by Brian Adkins

August 4, 2007 at 1:15 am

Posted in books, people, programming, video
