Lojic Technologies

Digest Tag Population in Ruby

with 6 comments

I saw a post on comp.lang.lisp demonstrating the suitability of Common Lisp for functional programming. The poster asked to see versions in other languages including Ruby, so I thought I’d whip something up. Here’s the original post with description of the problem:

This one was too much fun for words in re how cool it is programming
with Lisp. I would like to see this in Ruby, Clojure, Qi, and
Scheme. The precise fun part tho is typing it all in in the final form
versus dividing the thing up into steps to get intermediate results,
ie, a test of one's mastery of one's language. Non-functional
languages I guess have no choice but to stop and assign temporaries.

Given:

(defparameter *pets*
  '((dog ((blab 12)(glab 17)(cbret 82)(dober 42)(gshep 25)))
    (cat ((pers 22)(siam 7)(tibet 52)(russ 92)(meow 35)))
    (snake ((garter 10)(cobra 37)(python 77)(adder 24)(rattle 40)))
    (cow ((jersey 200)(heiffer 300)(moo 400)))))

Write:

(defun digest-tag-population (tag-population pick-tags count)...)

Such that:

(digest-tag-population *pets* '(dog cat snake) 5)

=> ((DOG CBRET 82) (DOG DOBER 42) (CAT RUSS 92) (CAT TIBET 52) (SNAKE
PYTHON 77))

...the rules being:

- consider only the populations of tags (the first symbol in each
sublist) found in the parameter pick-tags, a list

- take only the  most populous of the union of the populations

- return (tag name population) of the most populous in this order:

    firstly, by position of the tag in pick-tags
    second, ie within a tag, in descending order of population

(defun subseq-ex (st e s)
  (subseq s st (min e (length s))))

(defun digest-tag-population (tag-population pick-tags count)
  (flet ((tagpos (tag) (position tag pick-tags)))
    (stable-sort (subseq-ex 0 count
                   (sort (loop for (tag population) in tag-population
                             when (tagpos tag)
                             append (loop for pop in population
                                        collecting (list* tag pop)))
                     '> :key (lambda (x)
                               (caddr x))))
      '< :key (lambda (x) (tagpos (car x))))))

(defparameter *pets*
  '((dog ((blab 12)(glab 17)(cbret 82)(dober 42)(gshep 25)))
    (cat ((pers 22)(siam 7)(tibet 52)(russ 92)(meow 35)))
    (snake ((garter 10)(cobra 37)(python 77)(adder 24)(rattle 40)))
    (cow ((jersey 200)(heiffer 300)(moo 400)))))

#+test
(digest-tag-population *pets* '(dog cat snake) 5)

And here is my Ruby version:

PETS = [
  [:dog, [[:blab, 12], [:glab, 17], [:cbret, 82], [:dober, 42], [:gshep, 25]]],
  [:cat, [[:pers, 22], [:siam, 7], [:tibet, 52], [:russ, 92], [:meow, 35]]],
  [:snake, [[:garter, 10], [:cobra, 37], [:python, 77], [:adder, 24], [:rattle, 40]]],
  [:cow, [[:jersey, 200], [:heiffer, 300], [:moo, 400]]]
]

def digest_tag_population tag_population, pick_tags, count
  tag_population.select {|e| pick_tags.include?(e[0]) }.
    inject([]) {|memo,obj| obj[1].each {|e| memo << [obj[0], e[0], e[1]] }; memo }.
    sort {|a,b| b[2] <=> a[2] }[0,count].
    sort_by {|e| [ tag_population.map{|p| p[0]}.rindex(e[0]), e[2] * -1] }
end

digest_tag_population(PETS, [:dog, :cat, :snake], 5)

Within the function:
Line 1: select elements that match the pick tags
Line 2: map to a list of tuples of the form [:dog, :blab, 12]
Line 3: sort the list of tuples by population and select the first count of them
Line 4: sort by tag position, population

Output:

[[:dog, :cbret, 82],
[:dog, :dober, 42],
[:cat, :russ, 92],
[:cat, :tibet, 52],
[:snake, :python, 77]]

I think Ruby compares very favorably. What do you think? Feel free to submit a version in another language.

Written by Brian Adkins

March 1, 2009 at 12:47 am

Posted in programming

Tagged with , , , ,

6 Responses

Subscribe to comments with RSS.

  1. A Haskell version by Paul Rubin from comp.lang.haskell:

    import Data.List
    import Data.Ord
    import Control.Monad
    
    pets =
       [("dog", [("blab", 12),("glab", 17),("cbret", 82),
                 ("dober", 42),("gshep", 25)]),
         ("cat", [("pers", 22),("siam", 7),("tibet", 52),
                  ("russ", 92),("meow", 35)]),
         ("snake", [("garter", 10),("cobra", 37),("python", 77),
                    ("adder", 24),("rattle", 40)]),
         ("cow", [("jersey", 200),("heiffer", 300),("moo", 400)])]
    
    digest_tag tag_pop pick_tags count =
      let selected_pops =
            [(i,(-c,b)) | (i,(a,bs)) <- zip [0..] tag_pop, a `elem` pick_tags,
                                        (b,c) <- bs]
          top_pops = sort $ take count (sortBy (comparing snd) selected_pops)
      in [(fst (tag_pop !! i), b, -c) | (i, (c,b)) <- top_pops]
    
    main = print $ digest_tag pets ["dog","cat","snake"] 5
    

    Brian Adkins

    March 1, 2009 at 2:30 pm

  2. A Python version by Ron Garret from comp.lang.lisp:

    pets = {'dog' : 'blab 12, glab 17, cbret 82, dober 42, gshep 25',
            'cat' :   'pers 22, siam 7, tibet 52, russ 92, meow 35',
            'snake' : 'garter 10, cobra 37, python 77, adder 24, rattle 40',
            'cow' : 'jersey 200, heiffer 300, moo 400'}
    
    for k in pets: pets[k] = [(k,tag,int(n)) for (tag,n) in
                              [s.split() for s in pets[k].split(',')]]
    
    def keycmp(f): return lambda x,y: cmp(f(x), f(y))
    
    def dtp(tags, types, cnt):
      l = []
      for t in types: l.extend(pets[t])
      l.sort(keycmp(lambda x:-x[2]))
      l=l[:cnt]
      l.sort(keycmp(lambda x: types.index(x[0]))
      return l
    
    
    >>> dtp(pets, ['dog','cat','snake'], 5)
    [('dog', 'cbret', 82), ('dog', 'dober', 42), ('cat', 'russ', 92),
    ('cat', 'tibet', 52), ('snake', 'python', 77)]
    

    Brian Adkins

    March 1, 2009 at 2:47 pm

  3. Two Clojure versions by Raffael Cavallaro on comp.lang.lisp:

    Version 1:

    (defn digest-tag-population [tag-population pick-tags count]
      (sort-by #(first (for [[idx elt]
    			 (map vector (iterate inc 0) pick-tags)
    			 :when (= elt (first%))] idx))
    	   
    				  (mapcat (fn [a-lst] (map #(cons (first a-lst) %)
    							   (first (rest a-lst))))
    					  (filter #(contains? (set pick-tags) (first %)) tag-population))))))
    

    Version 2:

    (defn digest-tag-population [tag-population pick-tags count]
      (let [selected (filter #(contains? (set pick-tags) (first %)) tag-population)
    	union (mapcat (fn [a-lst]
    			(map (fn [lst]
    			       (cons (first a-lst) lst))
    			     (first (rest a-lst))))
    		      selected)
    	pet-position (fn [petsym]
    		       (first (for [[idx elt] (map vector (iterate inc 0) pick-tags) :when (= elt petsym)] idx)))
    	pop-sorted (sort-by #(nth % 2) > union)
    	top-n (take count pop-sorted)
    	pet-sorted (sort-by #(pet-position (first %)) < top-n)]
        pet-sorted))
    

    Brian Adkins

    March 1, 2009 at 2:50 pm

  4. Second version by Paul Rubin on comp.lang.haskell:

    import Data.List
    import Data.Ord
    
    pets = [("dog", [("blab", 12),("glab", 17),("cbret", 82),
                     ("dober", 42),("gshep", 25)]),
            ("cat", [("pers", 22),("siam", 7),("tibet", 52),
                     ("russ", 92),("meow", 35)]),
            ("snake", [("garter", 10),("cobra", 37),("python", 77),
                       ("adder", 24),("rattle", 40)]),
            ("cow", [("jersey", 200),("heiffer", 300),("moo", 400)])]
    
    digest_tag tag_pop pick_tags count =
        sortBy (comparing ((a,_)->findIndex (==a) (map fst tag_pop)))
                   (take count $ sortBy (flip (comparing (snd . snd)))
                             [(a,(b,c)) | (a,bs) <- tag_pop, a `elem` pick_tags,
                              (b,c) <- bs])
    
    main = print $ digest_tag pets ["dog","cat","snake"] 5
    

    Brian Adkins

    March 3, 2009 at 1:06 am

  5. A Haskell version by Florian Kreidler (modifying a version by Greg Bacon) on comp.lang.haskell:

     import Data.List (sortBy, elemIndex)
     import Data.Ord(comparing)
    
     type Tag       = String
     type Pet       = (String, Int)
     type PetTag    = (Tag, [Pet])
     type TaggedPet = (Tag, Pet)
    
     pets :: PetTags
     pets =
       [ ("dog",   [ ("blab",    12)
                   , ("glab",    17)
                   , ("cbret",   82)
                   , ("dober",   42)
                   , ("gshep",   25)
                   ])
       , ("cat",   [ ("pers",    22)
                   , ("siam",    7)
                   , ("tibet",   52)
                   , ("russ",    92)
                   , ("meow",    35)
                   ])
       , ("snake", [ ("garter",  10)
                   , ("cobra",   37)
                   , ("python",  77)
                   , ("adder",   24)
                   , ("rattle",  40)
                   ])
       , ("cow",   [ ("jersey",  200)
                   , ("heiffer", 300)
                   , ("moo",     400)
                   ])
       ]
    
     digestTagPopulation :: [PetTag] -> [Tag] -> Int -> [TaggedPet]
     digestTagPopulation tp tags c
         = sortBy (comparing $ flip elemIndex tags . fst) $
           take c $ sortBy (flip $ comparing $ snd . snd)
           [ (t, p) | (t, ps) <- tp, t `elem` tags, p <- ps ]
    

    Brian Adkins

    March 21, 2009 at 10:07 am

  6. I’ve posted two alternate Clojure solutions here: http://scott.andstuff.org/ClojureSpikes. They take a slightly different approach from Raffael’s above. The high-level structure of my second solution is inspired by Raffael’s second solution — this is a helpful LISP idiom that I was glad to learn.

    Scott Moonen

    August 10, 2009 at 6:25 am


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: