Sunday, January 27, 2008

Walking the F# List namespace - Post #3

This will probably be my last series on examining the source behind the List namespace. Only because I was running out of interesting functions that have interesting implementations, or functions that I feel I could adequately explain. And frankly, writing that List.filter actually imlements List.filter from another namespace is not too interesting. So, I took a look inside list.fs to find some interesting implementations and found a few more functions to walk through. List.assoc: List.assoc is roughly equivalent to a hashtable or dictionary lookup in c#, except the list is a list of tuples. (Also, I’m not sure if there are any index optimizations either.) Throws a not_found() exception if the key is not found. The signature for List.assoc is ‘a -> (‘a * ‘b) list -> ‘b
‘a – the key value to find
(‘a * ‘b) list – the source tuple list
‘b – the value for the key.
Here’s an example, using FSI:
> let list = [for x in 1 to 50000 -> (x,x*x)];;
val list : (int * int) list
>let findSquare s = List.assoc s list;;
val findSquare : int -> int
> findSquare 14513;;
val it : int = 210627169
If we look at the source code, List.assoc is implemented as:
let rec assoc x l =
match l with
|[] -> not_found()
|((h,r)::t) -> if x = h then r else assoc x t
The interesting code block with this function is the ((h,r)::t) syntax. The syntax of (h,r) is breaking up the tuple in the first element on the list, with h getting the key and r getting the value. If the key matches the passed in value, then return r. Also, if the list is empty ([]), then raise the not_found() exception. (I didn’t go into any detail about the rec or match, I describe them in further detail in my previous posts.) The documentation for List.assoc suggests we use List.try_assoc. So, lets try. List.try_assoc: essentially the same as List.assoc, with two main differences. First, try_assoc, will not throw a not_found() exception if the key is not found. Second, try_assoc returns an option type. When using F# code from other .Net languages, the empty option type (None) is equalivant to the null value. To create a value of the option type, you need to use Some or None. The signature for List.try_assoc is: ‘a -> (‘a * ‘b) list -> ‘b option
‘a – the first tuple element to find
(‘a * ‘b) list – the soure list
‘b option – the returned value
Here’s an example, from FSI.
> let list = [for x in 1 to 50000 -> (x,x*x)];;
val list : (int * int) list
> findSquare 50001;;
val it : int option = None
>findSquare 23;;
val it : int option = Some 529
If we look at the source code, List.try_assoc is implemented as
let rec try_assoc x l =
match l with
|[] -> None
|((h,r)::t) -> if x = h then Some(r) else try_assoc x t
Wow, this looks surprisingly similar to List.assoc, with the exception of the return types of None and Some(r). List.assq: List.assq is almost like List.assoc, except is used the PhysicalEquality operator. PhysicalEquality is defined in as “Reference/physical equality. True if boxed versions of the inputs are reference-equal, OR if both are value types and the implementation of Object.Equals for the type of the first argument returns true on the boxed versions of the inputs.” as defined in: http://research.microsoft.com/fsharp/manual/FSharp.Core/Microsoft.FSharp.Core.LanguagePrimitives.html So, generally speaking, List.assq does reference equality on values on the stack, whereas value equality compares values on the heap. I posted a question on an implementation of PhysicalEquality here (Kudos to zakaluka!) Anyway here’s how List.assq is implemented:
let rec assq x l =
match l with
|[] -> not_found()
|((h,r)::t) -> if PhysicalEquality x h then r else assq x t
List.try_assq is exactly the same as List.try_assoc, except the return type is an option type. Here’s the implementation:
let rec try_assq x l =
match l with
|[] -> None
|((h,r)::t) -> if PhysicalEquality x h then Some(r) else try_assq x t
List.for_all: Returns true if all the elements in the list satisfy the given predicate, false if one fails. The list elements are anded together, visualized as p iO && p i1 && .. p iN. The signature for list.for_all is (‘a -> bool) -> ‘a list -> bool
(‘a -> bool) – A function that evaluates the list element and returns a bool
‘a list -> the list to process
Bool -> return value
Here’s an example:
[<Test>]
member t.For_allExample () =
let list1 = [0 .. +2 .. 20]
let list2 = [1 .. 10]
let evens = List.for_all (fun x -> x % 2 = 0) list1
let notevens = List.for_all (fun x -> x % 2 = 0) list2
Assert.IsTrue(evens)
Assert.IsFalse(notevens)
List.for_all implements
let rec for_all f l1 = Microsoft.FSharp.Primitives.Basics.List.for_all f l1
Hmm, not so much fun to describe.....so on we go. List.for_all2: Returns true if all the elements in two lists satisfy the given predicate, false if one fails. The lists must have the same length. The signature for list.for_all2 signature is (‘a -> ‘b -> bool) -> ‘a list -> ‘b list -> bool.
(‘a -> ‘b -> bool) – A function that takes an element from the first list and the second list and returns a bool
‘a list – the first list
‘b list – the second list
Bool -> the return value
Here’s an example:
[<Test>]
member t.For_all2Example () =
let list1 = [0 .. +2 .. 20]
let list2 = [20 .. +2 .. 40]
let list3 = [1 .. 10]
let evens = List.for_all2 (fun x y -> x % 2 = 0 && y % 2 = 0 ) list1 list2
let notevens = List.for_all2 (fun x y -> x % 2 = 0 && y % 2 = 0) list2 list3
Assert.IsTrue(evens)
Assert.IsFalse(notevens)
List.for_all2 implements:
let rec for_all2 f l1 l2 =
match l1,l2 with
|[],[] -> true
|(h1::t1),(h2::t2) -> f h1 h2 && for_all2 f t1 t2
|_ -> invalid_arg "for_all2"
This recursive function is interesting, in particular the line (h1::t1),(h2::t2) -> f h1 h2 && for_all2 f t1 t2. The function aggregates the results of the function and the list elements!

Friday, January 18, 2008

Dustin Campbell on F#

I caught Dustin Campbell's presentation @ Code Mash. It was a great experience for me to just watch someone talk and type F# code for 40 minutes. Dustin has started blogging about F# also. Here's his podcast with Scott Hanselman.

Thursday, January 17, 2008

A few F# basics

I was driving to the Code Mash conference with a colleague last week and I started talking about F# and how I was really getting into it. He asked generally, "What's the main difference between F# and C#?". After drawing a complete blank and a long pause, I stammered out something like, there's really no one difference between the two languages, and began to ramble some of the differences I could recall. A few days later I went back and nosed through my copy of Robert Pickering's book; I refreshed what some of the main differences are. So I thought I'd jot some down. Disclaimer: I am just beginning to learn this language, everything here is my interpretation and I am sure there are more differences than the few I list. Identifiers and expressions - Identifier is the keyword to define a value. Expressions are any chunk of code that returns something.
let myname = "Nate"
In F#, as with other functional languages, all values are immutable. Once a value is assigned to an identifier the value never changes. Of course, like all languages, there are ways to let identifiers be mutable using the keyword mutable.

let mutable myName = "Nate"

myName <- "Etan"

Functions are identifiers- Functions in F# are first class entities and are treated just like identifiers.
let squares x = x * x
let cubes x = squares x * x
In this example, squares is the name of the function and x is the parameter. The function cubes takes it's parameter passes it to the square function. The result of the squares function gets included in the rest of the expression for cubes. Functions can also be curried. Currying is the process of passing a function with some arguments that returns you a function with a single argument. Consider the following snippet:
let p = 20
let add x y = x + y
let add13 = add 13
let answer = add13 p

answer = 33
Huh?? Let's take a walk through that. The add function has the signature of int -> int -> int, which reads; this function takes two ints and returns an int. The add13 function has the signature (int -> int) which reads; this function takes a function that returns an int and returns an int. The expression answer calls the add13 function, which calls the add function giving 1 parameter which gets assigned to x then p gets assigned to y on the add function. Type inference - F# is very strongly typed, but you don't have to explicitly declare the variable types. The F# compiler handles all that for you. Using FSI, you can see what the complier infers
> let s = "S"
val s : string
> let double = 2.3
val double : float
> let x = 0
val x : int
> let c = 'c'
val c : char
> let whatisthis q = q
val what : 'a -> 'a
The 'a means the complier could not infer what the type is so it uses a generic type. (Not to be confused with generics in C#.) Pattern matching and recursion - These two traits deserve their own respective posts. Pattern matching is conducive to if..elseif..else and switch statements in procedural languages and recursion means a function calls itself. Anonymous functions - anonymous functions, also called lambdas, that don't get an identifier. These functions are used as arguments to other functions. The keyword fun identifies these.
let divide q = (fun x y -> x / y) q 2
That's all for now, comments and feedback are welcome!

Walking the F# List namespace - Post #1

This will be the first part of a many part posting as I learn F#. Still being a research language, it's immense. So I thought, why not just pick a namespace and write a small bit of code to learn all the new functions in the FSharp.Collections.List namespace. F# comes complete with the source code, which is written in F# and one of the best ways to learn a new language and to find the "best" or "right" ways of writing new code is to spend as much time reading code as you do writing code. (Scott Hanselman has a great blog entry on reading code.) I took a TDD approach to help make my way through, so each code snippet is actually a small NUnit test. To get a project configured for vs2008, take a look here. For some of the simpler syntax, I used F# Interactive (FSI). Lists, like everything in F#, are immutable, which means once one is created, you cannot change them. Also, lists can only contain members of the same type, you do have the ability to create a list of an obj type. Lists are a built in data structure that is basically a concatenation of the head and tail. The head being the first element and the tail the remainder of the list, as you'll see later on as we get into the source code. The syntax to create a list is use the :: followed by an empty list []. The :: syntax stands for cons.)
let list = "blue" :: "red" :: "green"
Like most programmers, I'm a lazy typer, so this shorthand also works.
let list = ["blue";"red";"green"]
To concatenate two lists, use the @ symbol
let list = ["blue";"red";]@["yellow";"green"]

To bring out some of the power of F# you can also create lists using comprehensions:

let numbers = [1 .. 10] = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]
let chars = ['a' .. 'f'] = ['a'; 'b'; 'c'; 'd'; 'e'; 'f']
let cubes = [ for x in 1 .. 5 -> x * x * x] = [1; 8; 27; 64; 125]
So, on to the first function! List.append: This function returns a new list from two lists that are passed to the function in the order they are passed. The method signature of append is

'a list -> 'a list -> 'a list

A little on the signature, List.append first and second arguments are of type 'a list and returns 'a list (remember 'a is how the compiler infers a generic type.)

let list1 = [2 .. +2 .. 10]
let list2 = [1 .. +2 ..9]
let appended= List.append list1 list2
combined = [2;4;6;8;10;1;3;5;7;9;] Lets take a look at how this function is implemented:
let append l1 l2 = l1 @ l2
List.exists: Tests if any element in the list statisy the given predicate. (A predicate is a statement that returns true or false.) The signature of exists is ('a -> bool) -> 'a list -> bool.
('a -> bool) is the function that applies the predicate to item 'a 'a -> list is your list bool is the return value
Here's an example:
[<Test>]
member t.Listexists() =
let list1 = ["The";"quick";"brown";"fox";"jumped";"over";"the";"lazy";"dog"]
let answer = List.exists (fun x -> x = "fox") list1
let wrong = List.exists (fun x -> x = "Fox") list1
Assert.IsTrue(answer)
Assert.IsFalse(wrong)
Lists.exists under the covers is:
let rec exists f l1 = Microsoft.FSharp.Primitives.Basics.List.exists f l1

The signature of Microsoft.FSharp.Primitives.Basics.List.exists is the same of the List.exists function

Here we the first use if the rec keyword. Rec stands for recursion, which means this function calls itself. F# recursion is a core concept, it works with lists by getting the head which is the first element and the tail which is the rest of the list, so until tail is empty [], the function will be called with each element in the list.

Lists.exists2: Same as List.exists, but works with two lists. If the lists don't match in length, it throws an Invalid_ArgumentException. Exists2 signature is: ('a -> 'b -> bool) -> 'a list -> 'b list -> bool.
('a -> 'b -> bool) is the function that applies the predicate to item 'a and item 'b
'a -> list is your first list
'b -> list is your second list
bool is the return value
Here's an example:
[<Test>]
member t.Listexists2() =
let list1 = ["The";"quick";"brown";"fox";"jumped";"over";"the";"lazy";"dog"]
let list2 = ["Some";"other";"thing";" ";"jumped";"around";"another";"restless";"mammal"]
let list3 = ["Some";"other";"thing";"Jumped";"winking";"around";"another";"tired";"mammal"]
let answer = List.exists2 (fun x y -> x = y) list1 list2
let wrong = List.exists2 (fun x y -> x = y) list1 list3
Assert.IsTrue(answer)
Assert.IsFalse(wrong)

[<Test;expectedexception(type Invalid_argumentException)>]
member t.Listexists2exception() =
let list1 = ["The";"quick";"brown";"fox";"jumped";"over";"the";"lazy";"dog"]
let list2 = ["Some";"other";"thing";"jumped";"around";"another";"restless";"mammal"]
let answer = List.exists2 (fun x y -> x = y) list1 list2
Assert.IsTrue(answer)
List.exists2 under the covers:
let rec exists2 f l1 l2 =
match l1,l2 with
| [],[] -> false
| (h1::t1),(h2::t2) -> f h1 h2 exists2 f t1 t2
| _ -> invalid_arg "exists2"
Ahh - the rec keyword again, but this time paired with match. Match is F# pattern matching. The match statement is working with each list, the first condition [],[] says if both lists are empty then return false. The (h1::t1),(h2::t2) -> f ht h2 exists2 f t1 t2 gets the head and tail for each list and applies the function to h1 and h2 then passes the function and the rest of each list back to exists. The _ character basically is a wildcard to catch everything else. List.filter: Returns a new list that only contains the elements that the given predicate returns true. The signature ('a -> bool) -> 'a list -> 'a list.
('a -> bool) = the function that each element is applied to the
predicate
'a list = the list supplied to List.filter
'a list = the new list
Here's an example:
[<Test>]
member t.Listfilter() =
let list1 = [1 .. 10]
let list2 = List.filter (fun x -> x % 2 = 0) list1
Assert.AreEqual([2 .. +2 .. 10],list2)
So what's happening internally?
let filter f x = Microsoft.FSharp.Primitives.Basics.List.filter f x
The signature for Microsoft.FSharp.Primitives.Basics.Lists.filter function is the same for List.filter. List.find: Returns the element in the list where the given predicate returns true. Throws a Not_Found exception if no element is found. The signature is: ('a -> bool) -> 'a list -> 'a.
('a -> bool) the function to evaluate 'a
'a list = the list to work with
'a = the return value

Here's an example:

[<Test>]
member t.ListFindTest() =
let list = [1 .. 9]
let a = List.find (fun x -> x = 2) list
Assert.AreEqual(a,2)
Internally, what is implemented is once again our rec and match expressions!
let rec find f l =
match l with [] -> not_found()
¦ h::t -> if f h then h else find f t
The find function takes a function and the list, checks for an empty list if it's empty then throw a not_found() exception, otherwise get the head and tail. Apply the value in h to the function, if it's true, it returns h, otherwise pass the function and the tail back to find. You again see here the power of functional languages core functions of rec, match and list processing. Being able to get the first element in a list with simply stating h::t; what a change from having to enumerate lists with for or foreach loops! Lind.find_all: Returns a new list of all the elements where the given predicate returns true. The signature is ('a -> bool) -> 'a list -> 'a list.
('a -> bool) the function to evaluate 'a
'a list = the list to work with
'a list = the new list with all the elements that met the predicate
Here's an example:
[<Test>]
member t.ListFind_All() =
let list = [1;2;3;5;7;11;13;16;17;19;23]
let evens = List.find_all (fun x -> x % 2 = 0) list
Assert.AreEqual([2;16;],evens)
Behind the scenes List.find_all implements:
let find_all f x = Microsoft.FSharp.Primitives.Basics.List.filter f x
Which is the same method that List.filter uses! List.partition: This is an interesting function, it splits a list based on the predicate, the first is contains all elements that return true, the second list is all the elements that return false. The signature is: ('a -> bool) -> 'a list -> 'a list * 'a list.
('a -> bool) the function to evaluate each element
'a list - the source list
'a list * 'a list - the returned list that contains a tuple of the
elements.
The 'a list * 'a list signature identifies a tuple, not multiplication. The returned list contains two lists. Here's an example:
let list = [1;2;3;5;7;11;13;16;17;19;23]
let newlist = List.partition (fun x -> x % 2) list
newlist = ([2; 16], [1; 3; 5; 7; 11; 13; 17; 19; 23])
With F# you can add on the identifiers like such:
let list1,list2 = List.partition (fun x -> x % 2 = 0) list
list1 = [2;,16;]
list2 = [1; 3; 5; 7; 11; 13; 17; 19; 23]
Wow! The partition function returns a list of tuples, which the let statement is asking for each identifier to get one. By separating the identifiers by commas, it tells the compiler to assign one of the tuple values to each of the identifiers. If I enter the following statement into fsi
let list1, list2,list3 = List.partition (fun x -> x % 2 = 0) list
-------------------------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

stdin(8,25): error: FS0001: Type mismatch. Expecting a
'a * 'b * 'c
but given a
'd list * 'd list.
The tuples have different lengths
Under the covers List.partiation uses:
let partition p x = Microsoft.FSharp.Primitives.Basics.List.partition p x
Enough for now, comments and feedback are welcome!

Tuesday, January 15, 2008

Walking the F# List namespace - Post #2

Continuing on with my walkthrough of the Microsoft.FSharp.Collections.List namespace, I thought I’d give the folding functions a shot. It took me some to time to understand what the functions were accomplishing, but after a few tests, they became clear. List.fold_left: This applies a function to each element in the list and starts an aggregate argument to apply through the rest of the elements. Additionally, the function is given a seed value. The seed value and the first element are passed to the function, the result and the second element are passed to the function, that result and the third element….you get the point. Fold_left can be described as
f(..f(f((f s i0 ) i1)i2)..) in
The signature for List.fold_left is (‘b -> ‘a -> ‘b) -> ‘a list -> ‘b
(‘b -> ‘a -> ‘b) - the function that takes the seed value (‘b) and the
first element (‘a). The function returns ‘b.
‘a list - the source list
‘b - the result

Here’s an example:

[<Test>]
member t.Listfold_left() =
let list = [1 .. 4]
let answer = List.fold_left (fun x y -> x*y) 2 list
Assert.AreEqual(48,answer)
Behind the scenes, List.fold_left is implemented as:
let rec fold_left f s l =
match l with
¦ [] -> s
¦ (h::t) -> fold_left f (f s h) t
Sweet, fold_left is simply a recursive match! The parameters to fold_left are the function (f), the seed (s) and the list (l). If the list is empty, simply return the seed value, because the seed value applied to nothing is the seed value. The (h::t) statement is a thing of beauty, it gets the first element in the list (h) and the rest of the list (t), then calls itself with the function (f), the result of function applied to the seed value and the first element, (f s h) and the tail. List.fold1_left: This applies a function to each element in the list and starts an aggregate argument to apply through the rest of the elements. It takes the first and second elements applies the function, the result and the third function are then given the function, until the end of the list returning the result.
f(..f(f((f i0 i1) i2)i3)..) in 
The signature for the function is (‘a -> ‘a -> ‘a) -> ‘a list -> ‘a
(‘a -> ‘a -> ‘a) - the function to apply the first element to the second
element and return the value
‘a list - the source list
‘a - the return value
Here’s an example:
let list = [1 .. 9]
let total = List.fold1_left (fun x y -> x * y) list
Total = 362880 You can also do a short hand operator with fold1_left, like so:
let total = List.fold1_left (+) list
Total = 45 List.fold1_left behind the scenes is implemented like so:
let fold1_left f l =
match l with
¦ [] -> invalid_arg "List.fold1_left"
¦ (h::t) -> fold_left f h t
Ahhh simplicity! Note the fold1_left function does not have the rec keyword. Given the function (f) and the list (l), the match statement checks for an empty list, if so, throw an invalid argument exception. Next split the list (h::t) and then call fold_left with the function (f), first element (h) and the rest of the list (t).

List.fold_left2: This applies a function to each element in two lists and starts an aggregate calculation based on a seed value and each corresponding list element. Then takes the result and applies it to the second element, and so on. Returning the result. The lists must be the same size. List.fold_left2 can be displayed as

f(..f(f((f s i0 j0 ) i1 j1)i2 j2)..) in jn
The signature for fold_left2 is (‘c -> ‘a -> ‘b –> ‘c) -> ‘c -> ‘a list -> ‘b list -> ‘c
(‘c -> ‘a -> ‘b –> ‘c) - This is a function that takes arguments of ‘c; the seed value, ‘a; an element from list1 and ‘b; an element from list2 and returns ‘c.
‘c – The seed value
‘a list – The first List
‘b list – The second list
‘c – The return value
Here’s an example:
let list1 = [1 .. +2 .. 9]
let list2 = [2 .. + 2 .. 10]
let answer = List.fold_left2 (fun c a b -> c + a + b) 10 list1 list2
Answer = 65 Here is how is this function implemented.
let rec fold_left2 f acc l1 l2 =
     match l1,l2 with
     | [],[] -> acc
     | (h1::t1),(h2::t2) -> fold_left2 f (f acc h1 h2) t1 t2
     | _ -> invalid_arg "List.fold_left2"
This looks very much like the fold_left function above. Fold_left2 takes the parameters of a function (f), the seed value (acc) and two lists (l1) and (l2). The first condition of the match statement checks if both the lists are empty, if so, it returns the seed value (acc). Next it splits the lists by using the (h::t) function and calls itself, by passing the function (f), the result of the function, the seed value and the heads of both lists, (f acc h1 h2) and both list tails.

Sample setup for Visual Studio 2008 for F# Unit Testing with NUnit

Here is an example of how I configured a vs2008 solution for building a small F# application using Test Driven Development (TDD) techniques and NUnit. I thought I’d share. Here’s the directory structure layout: C:\Projects\F#Projects\myApp · contains the sln file C:\Projects\F#Projects\myApp\myApp · contains the “production” fs files and fsharp project files C:\Projects\F#Projects\myApp\myApp_Tests · contains the test fs files and fsharp project files C:\Projects\F#Projects\myApp\Output · has a copy of nunit.framework.dll I started with a new F# solution and added two projects, myApp and myApp_Tests, both projects have their output paths set to the output directory. The myApp project I configured to be an exe and the myApp_Tests project I configured it to be a dll. (What the myApp project is compiled to really doesn’t matter, it all depends on your final use.) There’s a few additional configuration tasks I had to setup before I could start coding. In the myApp project I qualified it with module myApp so my code and tests could be in separate files. In the myApp_Tests project, I updated the DLL references section with these two entries: -r "C:\Projects\F#Projects\myApp\Output\Nunit.framework.dll" -r "C:\Projects\F#Projects\myApp\Output\myapp.exe" If you don’t want to update the dll references properties, you can add the references in your fs file as listed below. #r @"..\Output\nunit.framework.dll" #r @"..\Output\myapp.exe" I also had to set a project dependancy order to build myApp first then myApp_Tests second, via the solution properties menu. Whew, at last I can new write some code! (The formatting is a little off due to the encoding.) Here’s the myApp_Tests.fs file:
#light open NUnit.Framework [<TestFixture>] type myAppTests = class new() = {} [<Test>] member t.ShouldReturn2() = Assert.AreEqual(2,myApp.AddTwoNumbers(1,1)) [<test>] member t.ShouldReturn45() = Assert.AreEqual(45,myApp.AddTwoNumbers(13,32)) end
Here’s the myApp.fs file:
#light module myApp let AddTwoNumbers a b = a + b
Once you build your solution you can open Nunit Gui, find your test assembly and run your tests untill your in the green! (Make sure you enable NUNit’s shadow copy, so you can keep you project and Nunit open at the same time.) When you add a test, go back to NUnit and re-run, your new test will automatically show up. Feeback and comments are welcome! Enjoy!