Thursday, January 17, 2008

Walking the F# List namespace - Post #1

This will be the first part of a many part posting as I learn F#. Still being a research language, it's immense. So I thought, why not just pick a namespace and write a small bit of code to learn all the new functions in the FSharp.Collections.List namespace. F# comes complete with the source code, which is written in F# and one of the best ways to learn a new language and to find the "best" or "right" ways of writing new code is to spend as much time reading code as you do writing code. (Scott Hanselman has a great blog entry on reading code.) I took a TDD approach to help make my way through, so each code snippet is actually a small NUnit test. To get a project configured for vs2008, take a look here. For some of the simpler syntax, I used F# Interactive (FSI). Lists, like everything in F#, are immutable, which means once one is created, you cannot change them. Also, lists can only contain members of the same type, you do have the ability to create a list of an obj type. Lists are a built in data structure that is basically a concatenation of the head and tail. The head being the first element and the tail the remainder of the list, as you'll see later on as we get into the source code. The syntax to create a list is use the :: followed by an empty list []. The :: syntax stands for cons.)
let list = "blue" :: "red" :: "green"
Like most programmers, I'm a lazy typer, so this shorthand also works.
let list = ["blue";"red";"green"]
To concatenate two lists, use the @ symbol
let list = ["blue";"red";]@["yellow";"green"]

To bring out some of the power of F# you can also create lists using comprehensions:

let numbers = [1 .. 10] = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]
let chars = ['a' .. 'f'] = ['a'; 'b'; 'c'; 'd'; 'e'; 'f']
let cubes = [ for x in 1 .. 5 -> x * x * x] = [1; 8; 27; 64; 125]
So, on to the first function! List.append: This function returns a new list from two lists that are passed to the function in the order they are passed. The method signature of append is

'a list -> 'a list -> 'a list

A little on the signature, List.append first and second arguments are of type 'a list and returns 'a list (remember 'a is how the compiler infers a generic type.)

let list1 = [2 .. +2 .. 10]
let list2 = [1 .. +2 ..9]
let appended= List.append list1 list2
combined = [2;4;6;8;10;1;3;5;7;9;] Lets take a look at how this function is implemented:
let append l1 l2 = l1 @ l2
List.exists: Tests if any element in the list statisy the given predicate. (A predicate is a statement that returns true or false.) The signature of exists is ('a -> bool) -> 'a list -> bool.
('a -> bool) is the function that applies the predicate to item 'a 'a -> list is your list bool is the return value
Here's an example:
[<Test>]
member t.Listexists() =
let list1 = ["The";"quick";"brown";"fox";"jumped";"over";"the";"lazy";"dog"]
let answer = List.exists (fun x -> x = "fox") list1
let wrong = List.exists (fun x -> x = "Fox") list1
Assert.IsTrue(answer)
Assert.IsFalse(wrong)
Lists.exists under the covers is:
let rec exists f l1 = Microsoft.FSharp.Primitives.Basics.List.exists f l1

The signature of Microsoft.FSharp.Primitives.Basics.List.exists is the same of the List.exists function

Here we the first use if the rec keyword. Rec stands for recursion, which means this function calls itself. F# recursion is a core concept, it works with lists by getting the head which is the first element and the tail which is the rest of the list, so until tail is empty [], the function will be called with each element in the list.

Lists.exists2: Same as List.exists, but works with two lists. If the lists don't match in length, it throws an Invalid_ArgumentException. Exists2 signature is: ('a -> 'b -> bool) -> 'a list -> 'b list -> bool.
('a -> 'b -> bool) is the function that applies the predicate to item 'a and item 'b
'a -> list is your first list
'b -> list is your second list
bool is the return value
Here's an example:
[<Test>]
member t.Listexists2() =
let list1 = ["The";"quick";"brown";"fox";"jumped";"over";"the";"lazy";"dog"]
let list2 = ["Some";"other";"thing";" ";"jumped";"around";"another";"restless";"mammal"]
let list3 = ["Some";"other";"thing";"Jumped";"winking";"around";"another";"tired";"mammal"]
let answer = List.exists2 (fun x y -> x = y) list1 list2
let wrong = List.exists2 (fun x y -> x = y) list1 list3
Assert.IsTrue(answer)
Assert.IsFalse(wrong)

[<Test;expectedexception(type Invalid_argumentException)>]
member t.Listexists2exception() =
let list1 = ["The";"quick";"brown";"fox";"jumped";"over";"the";"lazy";"dog"]
let list2 = ["Some";"other";"thing";"jumped";"around";"another";"restless";"mammal"]
let answer = List.exists2 (fun x y -> x = y) list1 list2
Assert.IsTrue(answer)
List.exists2 under the covers:
let rec exists2 f l1 l2 =
match l1,l2 with
| [],[] -> false
| (h1::t1),(h2::t2) -> f h1 h2 exists2 f t1 t2
| _ -> invalid_arg "exists2"
Ahh - the rec keyword again, but this time paired with match. Match is F# pattern matching. The match statement is working with each list, the first condition [],[] says if both lists are empty then return false. The (h1::t1),(h2::t2) -> f ht h2 exists2 f t1 t2 gets the head and tail for each list and applies the function to h1 and h2 then passes the function and the rest of each list back to exists. The _ character basically is a wildcard to catch everything else. List.filter: Returns a new list that only contains the elements that the given predicate returns true. The signature ('a -> bool) -> 'a list -> 'a list.
('a -> bool) = the function that each element is applied to the
predicate
'a list = the list supplied to List.filter
'a list = the new list
Here's an example:
[<Test>]
member t.Listfilter() =
let list1 = [1 .. 10]
let list2 = List.filter (fun x -> x % 2 = 0) list1
Assert.AreEqual([2 .. +2 .. 10],list2)
So what's happening internally?
let filter f x = Microsoft.FSharp.Primitives.Basics.List.filter f x
The signature for Microsoft.FSharp.Primitives.Basics.Lists.filter function is the same for List.filter. List.find: Returns the element in the list where the given predicate returns true. Throws a Not_Found exception if no element is found. The signature is: ('a -> bool) -> 'a list -> 'a.
('a -> bool) the function to evaluate 'a
'a list = the list to work with
'a = the return value

Here's an example:

[<Test>]
member t.ListFindTest() =
let list = [1 .. 9]
let a = List.find (fun x -> x = 2) list
Assert.AreEqual(a,2)
Internally, what is implemented is once again our rec and match expressions!
let rec find f l =
match l with [] -> not_found()
¦ h::t -> if f h then h else find f t
The find function takes a function and the list, checks for an empty list if it's empty then throw a not_found() exception, otherwise get the head and tail. Apply the value in h to the function, if it's true, it returns h, otherwise pass the function and the tail back to find. You again see here the power of functional languages core functions of rec, match and list processing. Being able to get the first element in a list with simply stating h::t; what a change from having to enumerate lists with for or foreach loops! Lind.find_all: Returns a new list of all the elements where the given predicate returns true. The signature is ('a -> bool) -> 'a list -> 'a list.
('a -> bool) the function to evaluate 'a
'a list = the list to work with
'a list = the new list with all the elements that met the predicate
Here's an example:
[<Test>]
member t.ListFind_All() =
let list = [1;2;3;5;7;11;13;16;17;19;23]
let evens = List.find_all (fun x -> x % 2 = 0) list
Assert.AreEqual([2;16;],evens)
Behind the scenes List.find_all implements:
let find_all f x = Microsoft.FSharp.Primitives.Basics.List.filter f x
Which is the same method that List.filter uses! List.partition: This is an interesting function, it splits a list based on the predicate, the first is contains all elements that return true, the second list is all the elements that return false. The signature is: ('a -> bool) -> 'a list -> 'a list * 'a list.
('a -> bool) the function to evaluate each element
'a list - the source list
'a list * 'a list - the returned list that contains a tuple of the
elements.
The 'a list * 'a list signature identifies a tuple, not multiplication. The returned list contains two lists. Here's an example:
let list = [1;2;3;5;7;11;13;16;17;19;23]
let newlist = List.partition (fun x -> x % 2) list
newlist = ([2; 16], [1; 3; 5; 7; 11; 13; 17; 19; 23])
With F# you can add on the identifiers like such:
let list1,list2 = List.partition (fun x -> x % 2 = 0) list
list1 = [2;,16;]
list2 = [1; 3; 5; 7; 11; 13; 17; 19; 23]
Wow! The partition function returns a list of tuples, which the let statement is asking for each identifier to get one. By separating the identifiers by commas, it tells the compiler to assign one of the tuple values to each of the identifiers. If I enter the following statement into fsi
let list1, list2,list3 = List.partition (fun x -> x % 2 = 0) list
-------------------------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

stdin(8,25): error: FS0001: Type mismatch. Expecting a
'a * 'b * 'c
but given a
'd list * 'd list.
The tuples have different lengths
Under the covers List.partiation uses:
let partition p x = Microsoft.FSharp.Primitives.Basics.List.partition p x
Enough for now, comments and feedback are welcome!

2 comments:

Anonymous said...

Is this even valid F#..? The "Test" keyword isn't recognized by XUnit and member's need to be part of a type. Why not post the whole code?

[Test]
member t.Listfilter() =
let list1 = [1 .. 10]
let list2 = List.filter (fun x -> x % 2 = 0) list1
Assert.AreEqual([2 .. +2 .. 10],list2)

Anonymous said...

Hi, the [<Test>] is the NUnit attribute. Check this post to get it running in your code.

Testing with tools

(You have to watch for the formatting, sorry about that.)