Lojic Technologies

Posts Tagged ‘sml

Programming Language Popularity – Part Two

with one comment

See Part Five

I compiled some programming language popularity statistics in April and mentioned I’d update the results in 6 months, so here they are:

I made a number of Google searches of the forms below and averaged the results:

"implemented in <language>"
"written in <language>"

Language # Results
Apr 09
# Results
Oct 09
C 1,905,500 16,975,000 0
C++ 699,000 6,270,000 +1
Java 850,000 5,118,000 -1
PHP 680,000 5,083,500 0
Lisp Family1 176,507 3,489,650 +3
Python 396,000 3,407,000 -1
Perl 365,500 3,132,500 -1
C# 349,700 2,125,000 -1
Scheme 86,450 2,100,000 +2
FORTRAN 1,621,000 N/A
JavaScript 102,700 1,163,000 -1
ML Family2 29,062 1,003,800 +3
(S)ML3 5,173 590,700 +12
Common Lisp 20,600 554,500 +5
Lisp 61,900 486,500 -2
Prolog 17,750 390,500 +4
Tcl 44,800 382,000 -3
OCaml 22,000 343,500 0
Arc 6,775 286,500 +4
Haskell 22,550 280,500 -4
COBOL 247,300 N/A
Ruby 99,650 227,000 -10
Io 1,760 198,500 +6
Smalltalk 9,105 187,500 -1
Erlang 22,285 161,700 -7
Forth 6,465 146,450 -1
Lua 13,065 131,800 -5
Caml 1,889 69,600 0
Scala 3,570 66,250 -2
Clojure 782 62,200 0

1 combines Lisp, Scheme, Common Lisp, Arc & Clojure
2 combines OCaml, (S)ML, Caml
3 summed separate searches for sml and ml

Written by Brian Adkins

October 24, 2009 at 10:23 am

Programming in Standard ML – Part 3

leave a comment »

Table of Contents

Chapter 6 – Case Analysis

Tuple types are homogeneous e.g. all values of type int*real are pairs containing an int and a real. Match failures occur at compile time. Types that have more than one form, such as int, are heterogeneous. Pattern matches fail at run time as a bind failure.

ML defines functions over heterogeneous types using clausal function expressions. For example:

fn pat1 => exp1
 | pat2 => exp2
 | ...
 | patn => expn

Each component pat => exp is called a clause, or a rule. The entire assembly of rules is a called a match. For example:

val recip : int -> int =
    fn 0 => 0 | n:int => 1 div n

The fun notation is generalized so we can be more concise:

fun recip 0 = 0
  | recip (n:int) = 1 div n

Case analysis on the values of a heterogeneous type is performed by application of a clausally-defined function. The notation:

case exp
  of pat1 => exp1
   | ...
   | patn => expn

is short for the application:

(fn pat1 => exp1
  | ...
  | patn => expn)

Matches are subject to two forms of “sanity check” – exhaustiveness checking and redundancy checking.

Chapter 7 – Recursive Functions

For a function to be able to call itself, it needs a name. For example:

val rec factorial : int->int =
    fn 0 => 1 | n:int => n * factorial (n-1)

or using fun notation:

fun factorial 0 = 1
  | factorial (n:int) = n * factorial (n-1)


If we define a helper function that accepts an accumulator we can reduce the storage needed:

fun helper (0,r:int) = r
  | helper (n:int,r:int) = helper (n-1,n*r)
fun factorial (n:int) = helper (n,1)

It’s better programming style to hide the helper function w/in a local declaration:

    fun helper (0,r:int) = r
      | helper (n:int,r:int) = helper (n-1,n*r)
    fun factorial (n:int) = helper (n,1)

Tail recursive functions are analogous to loops in imperative languages – they iterate the computation w/o needing auxiliary storage.

Mutual Recursion

Definitions which are mutually recursive can be joined together with the and keyword to indicate they are defined simultaneously by mutual recursion:

fun even 0 = true
  | even n = odd (n-1)
and odd 0 = false
  | odd n = even (n-1)

Chapter 8 – Type Inference and Polymorphism

Standard ML allows you to omit type information whenever it can be determined from context. Consider the following:

fn s:string => s ^ "n"

There is no need to declare the type of s since it can be inferred from the context, so we may write:

fn s => s ^ "n"

A type scheme is a type expression involving one or more type variables standing for an unknown, but arbitrary type expression. Type variables are written ‘a (pronounced alpha), ‘b (pronounced beta), etc. For example, the type scheme ‘a->’a has instances int->int, string->string, (int*int)->(int*int), and (‘b->’b)->(‘b->’b), etc. It does not have the type int->string.

We may bind an identity function to the variable I as follows:

val I : 'a->'a = fn x=>x

We may also write:

fun I(x:'a) : 'a = x

Standard ML eliminates the need to ascribe a type scheme to the variable:

val I = fn x=>x


fun I(x) = x

Chapter 9 – Programming with Lists

The values of type type list are the finite lists of values of type type:

1. nil is a value of type typ list.
2. if h is a value of type typ, and t is a value of type typ list,
   then h::t is a value of type typ list.
3. Nothing else is a value of type typ list.

The type expression typ list is a postfix notation for the application of the type constructor list to the type typ.

A value val of type typ list has the form:
val1 :: (val2 :: (… :: (valn :: nil) … ))

The :: operator is right-associative, so we may omit parentheses:
val1 :: val2 :: … :: valn :: nil

Or, we may use list notation:
[ val1, val2, …, valn ]

Computing With Lists

Some examples:

fun length nil = 0
  | length (_::t) = 1 + length t

Note we do not give a name to the head of the list, instead we use a wildcard _

fun append (nil, l) = l
  | append (h::t, l) = h :: append (t, l)

The latter is built into Standard ML and is written using infix as: exp1 @ exp2

fun rev nil = nil
  | rev (h::t) = rev t @ [h]

The running time of the latter is O(n2). The following definition makes use of an accumulator and has a running time of O(n):

    fun helper (nil, a) = a
      | helper (h::t, a) = helper (t, h::a)
    fun rev' l = helper (l, nil)

Chapter 10 – Concrete Data Types

Non-Recursive Datatypes

Example of nullary i.e. zero argument, constructors:

datatype suit = Spades | Hearts | Diamonds | Clubs

It is conventional to capitalize the names of value constructors, but this is not required by the language.

Datatypes may be parameterized by a type:

datatype 'a option = NONE | SOME of 'a

The values are NONE or Some val, where val is a value of type typ. The option type constructor is pre-defined in Standard ML.

Option types can also be used in aggregate data structures:

type entry = { name:string, spouse string option }

An entry for an unmarried person would have a spouse field with a value of NONE.

Recursive Datatypes

datatype 'a tree =
  Empty |
  Node of 'a tree * 'a * 'a tree

1. The empty tree Empty is a binary tree.
2. If tree_1 and tree_2 are binary trees, and val is a value of type type, then Node (tree_1, val, tree_2) is a binary tree.
3. Nothing else is a binary tree.
A function to compute the height of a binary tree, and one to compute the number of nodes:

fun height Empty = 0
  | height (Node (lft, _, rht)) = 1 + max (height lft, height rht)

fun size Empty = 0
  | size (Node (lft, _, rht)) = 1 + size lft + size rht

Heterogeneous Data Structures

The tree data type above requires that the type of the data items at the nodes must be the same for every node of the tree. To represent a heterogeneous tree, the data item must be labelled with enough info to determine the type at run-time.

datatype int_or_string =
  Int of int |
  String of string

type int_or_string =
  int_or_string tree

Datatype declarations and pattern matching can be useful for manipulating the abstract syntax of a language. Consider an example representing arithmetic expressions:

datatype expr =
  Numeral of int |
  Plus of expr * expr |
  Times of expr * expr

fun eval (Numeral n) = Numeral n
  | eval (Plus (e1, e2)) =
        val Numeral n1 = eval e1
        val Numeral n2 = eval e2
        Numeral (n1+n2)
  | eval (Times (e1, e2)) =
        val Numeral n1 = eval e1
        val Numeral n2 = eval e2
        Numeral (n1*n2)

If we extend the expr datatype as follows:

datatype expr =
  Numeral of int |
  Plus of expr * expr |
  Times of expr * expr
  Recip of expr

The compiler will complain about eval being incompatible with the new version of expr. Recompiling eval will produce an inexhaustive match warning since eval lacks a case for Recip. This is one of the benefits of static typing provided in Standard ML.

Written by Brian Adkins

August 15, 2009 at 2:43 pm

Posted in programming

Tagged with ,

Programming in Standard ML – Part 2

leave a comment »

Table of Contents

Chapter 4 – Functions

Lambda expressions are written as:

fn var : typ => exp

For example:

fn x : real => Math.sqrt (Math.sqrt x)
(fn x : real => Math.sqrt (Math.sqrt x)) (16.0)
val fourthroot : real -> real =
  fn x : real => Math.sqrt (Math.sqrt x)
fourthroot 16.0

ML provides a special syntax for function bindings that’s more concise:

fun fourthroot (x:real):real = Math.sqrt (Math.sqrt x)

By experimenting (I expect this is covered later), I found the following is sufficient due to type inference:

fun fourthroot x = Math.sqrt (Math.sqrt x)

Local val bindings may shadow parameters and other val bindings. For example, in the following, the last occurrence of x refers to the parameter of h, but the preceding two occurrences of x refer to the local binding with a value of 2.0:

fun h(x:real):real =
  let val x:real = 2.0 in x+x end * x

Chapter 5 – Products and Records

The n-tuple is the simplest form of aggregate data structure. They are of the form:
(val1, … ,valn)
An n-tuple is a value of a product type of the form:
typ1* … *typn
For example:

val pair : int * int = (2, 3)
val triple : int * real * string = (2, 2.0, "2")
val pair_of_pairs : (int * int) * (real * real) = ((2,3),(2.0,3.0))

A 0-tuple, also known as a null tuple, is the empty sequence of values, ( ). It is a value of type unit. 1-tuples are absent from the language.

Tuple Patterns

We can use pattern matching to extract portions of an n-tuple. For example, in the code snippet below, r will be equal to 3.14. Underscores indicate “don’t care” positions:

val foo : (int * string) * (real * char) = ((7,"hello"),(3.14,#"z"))
val ((_, _), (r:real, _)) = foo

We can give names to the first and second components of the pair – using the REPL:

- val (is:int*string,rc:real*char) = foo;
val is = (7,"hello") : int * string
val rc = (3.14,#"z") : real * char

A pattern is one of three forms:

  1. A variable pattern of the form var : typ
  2. A tuple pattern of the form (pat1, …, patn), where each pati is a pattern. This includes as a special case the null-tuple pattern, ()
  3. A wildcard pattern of the form _

Record Types

Tuples can become more difficult to use as the number of elements increases. Record types allow labeling each component. A record type has the form:
{ lab1:typ1, …, labn:typn}
A record value has the form:
{ lab1=val1, …, labn=valn}
A record pattern has the form:
{ lab1=pat1, …, labn=patn}

For example, the record type hyperlink is defined as follows:

type hyperlink =
  { protocol : string,
    address  : string,
    display  : string }

The following record binding defines a variable of type hyperlink:

val mailto : hyperlink =
  { protocol = "mailto",
    address = "foo@bar.com",
    display = "Brian Adkins" }

The following record binding:

val { protocol=prot, display=disp, address=addr } = mailto

decomposes into the three variable bindings:

val prot = "mailto"
val addr = "foo@bar.com"
val disp = "Brian Adkins"

We can use wildcard to extract selected fields:

val {protocol=prot, address=_, display=_ } = mailto

However, this isn’t very helpful with many fields, so we can use ellipsis patterns:

val {protocol=prot, ... } = mailto

ML provides an abbreviated form of record pattern {lab1,…labn} which stands for { lab1=var1, …, labn=varn} where the variables have the same name as the corresponding labels. For example, the following:

val { protocol, address, display } = mailto

decomposes into these bindings:

val protocol = "mailto"
val address = "foo@bar.com"
val display = "Brian Adkins"

Multiple Arguments and Multiple Results

fun dist (x:real, y:real):real = sqrt (x*x + y*y)

Keyword parameters are supported through record patterns:

fun dist’ {x=x:real, y=y:real} = sqrt (x*x + y*y)

Invoked as follows:

dist' {x=2.0,y=3.0}

Functions with multiple results may be thought of as functions yield tuples (or records).

fun dist2 (x:real, y:real):real*real
  = (sqrt (x*x+y*y), abs(x-y))

Sharp notation allows us to conveniently access the Nth item in a tuple.

- val foo = (3,7,5,2);
val foo = (3,7,5,2) : int * int * int * int
- #3 foo;
val it = 5 : int

A similar notation is used for record field selection:

 - val foo = { name = "Brian", phone = "555-1212" };
val foo = {name="Brian",phone="555-1212"} : {name:string, phone:string}
- #name foo;
val it = "Brian" : string

However, Harper states, “Use of the sharp notation is strongly discouraged!

Written by Brian Adkins

May 2, 2009 at 8:51 pm

Posted in programming

Tagged with

Programming in Standard ML – Part 1

leave a comment »

Table of Contents

The posts in this series will be notes & highlights from working through “Programming in Standard ML” by Robert Harper. See Getting Started with Standard ML for a link to the PDF. I may not always use quotes in this series, but it should be assumed that any factual information about Standard ML that I write in this series has been obtained from the PDF above – this series is, in effect, a summarization of Robert Harper’s PDF.


Attributes of Standard ML:

  • Type-safe programming language
  • Statically typed
  • Extensible type system
  • Polymorphic type inference
  • Automatic storage management
  • Encourages functional (effect-free) programming
  • Allows imperative (effect-ful) programming where appropriate
  • Pattern matching
  • Extensible exception mechanism for handling error conditions
  • Expressive and flexible module system
  • Portable due to its precise definition
  • Portable standard basis library

Chapter 1 – Programming in Standard ML

This chapter provides a brief overview of the language by considering an implementation of a regular expression package. It was slightly more familiar to me because of my brief studies of Haskell; otherwise, I think it would’ve appeared very foreign. I prefer more of a bottom up approach as much as possible instead of looking at language features that haven’t been described.


I do think that currying & partial application are cool features. Consider the following function declaration:

val match : RegExp.regexp -> string -> bool

Coming from a non-currying background, I would describe match as a function that accepts a regular expression and a string, and returns a boolean value indicating whether the string matches the regular expression, but Robert describes it this way:

It contains a function match that, when given a regular expression, returns a function that determines whether or not a given string matches that expression.

This was one of the features that made a strong impression on me when I first started looking at functional programming languages.

Chapter 2 – Types, Values, and Effects

Computation in ML is of a somewhat different nature. The emphasis in ML is on computation by evaluation of expressions, rather than execution of commands.

An expressions has three characteristics:

  • It may or may not have a type
  • It may or may not have a value
  • It may or may not create an effect

Effects will be ignored until chapter 13, so until then, it’s assumed that all expressions are effect-free, or pure. A type is defined by specifying three things:

  • a name for the type,
  • the values of the type, and
  • the operations that may be performed on values of the type


  • Negative numbers, as literals, are indicated with a tilde instead of a hypen e.g. ~7
  • div is used for integers, / for reals

A typing assertion has the form: exp : typ For example: 3 : int 4 div 3 : int ML does not perform any implicit conversions between types. Use real(3) + 4.3, or 3 + round(4.3) Conditional expression (exp1 and exp2 must have the same type): if exp then exp1 else exp2 if not exp then exp1 else exp2

Chapter 3 – Declarations

Once a variable is bound to a value, it is bound to it for life; there is no possibility of changing the binding of a variable once it has been bound. Examples of type bindings:

type float = real
type count = int and average = real

Examples of value bindings:

val m : int = 3+2
val pi : real = 3.14 and e : real = 2.17

One thing to keep in mind is that binding is not assignment. The binding of a variable never changes; once bound to a value, it is always bound to that value (within the scope of the binding). However, we may shadow a binding by introducing a second binding for a variable within the scope of the first binding.

Limiting scope with let expressions:

  val m : int = 3
  val n : int = m*m

The following expression evaluates to 54 because the binding of m is temporarily overridden during the evaluation of the let expression, then restored upon completion of this evaluation:

val m : int = 2
val r : int =
        val m : int = 3
        val n : int = m*m
    end * m

Written by Brian Adkins

May 2, 2009 at 7:45 pm

Posted in programming

Tagged with

Getting Started with Standard ML

with one comment

One of the two parallel tracks in my 2009 Programming Language Plan begins with the Standard ML programming language, so it’s time to get started.

Standard ML Resources



Since I was unfamiliar with the Standard ML programming language, I was surprised to find there are a number of good books about the language. Following are just some of them:

Elements of ML Programming

ML for the Working Programmer

Purely Functional Data Structures

The Definition of Standard ML

The Standard ML Basis Library

Introduction to Programming using SML

Concurrent Programming in ML

Other Educational Materials

Programming in Standard ML – excellent online book by Robert Harper of Carnegie Mellon University. Since I don’t know if Standard ML will simply be a stepping stone to Haskell (which in turn may not be a primary language for me) or a language I invest a lot of time in, I’m going to restrict myself from my normal method of purchasing a book or two when learning a new language. Instead, I’ll be going through Harper’s online book initially.

Tips for Computer Scientists on Standard ML (Revised)

Hello World

There are a number of great compilers for Standard ML (listed above), but I only need one to get started, so I chose Standard ML of New Jersey despite the funky name. It’s a popular version, and it has a REPL, so it’s good enough for me for now.

I develop software on Mac OSX and deploy on Ubuntu Linux. On my Ubuntu server, installing SML/NJ was as simple as:

sudo apt-get install smlnj

On Mac OSX, there are a couple of options listed on this page. I could use a pre-built system or the generic Unix install, so naturally I chose the generic Unix install which installed easily according to the simple directions.

# Download config.tgz
tar xzf config.tgz

# Wait for install to complete

~/software/smlnj$ rlwrap bin/sml
Standard ML of New Jersey v110.69 [built: Sat May  2 12:04:08 2009]
- print "hello, worldn";
hello, world
val it = () : unit

Great, looks like everything is working fine. Note, I use the rlwrap utility to provide a nicer REPL experience, but it’s not required.

I’ll continue with a series of posts with notes from working through “Programming in Standard ML”.

Written by Brian Adkins

May 2, 2009 at 12:26 pm

Posted in books, programming

Tagged with ,