Friday, February 15, 2008

Basic Type conversions with F#

Working my way through the Project Euler problems forced me to look into the different types in F#. F# is a very strongly typed language, but it has the same basic types as the other .NET languages. Here is a list of the basic types and how to tell the compiler you want to use them:
let int = 42
let string = "This is a string"
let char = 'c'
let bool = true
let bytearray = "This is a byte string"B
let hexint = 0x34
let octalint = 0o42
let binaryinteger = 0b101010
let signedbyte = 68y
let unsignedbyte = 102uy
let smallint = 16s
let smalluint = 16us
let integer = 345l
let unsignedint = 345ul
let nativeint = 765n
let unsignednativeint = 765un
let long = 12345678912345L
let unsignedlong = 12345678912345UL
let float32 = 42.8F
let float = 42.8
(Definitions for the types are listed here.) F# also has BigInt and BigNum types, which stand for arbitrarily large integers and arbitrarily large numbers respectively. (I don't know how big they are yet.)
let bigInt = 9876543219876I
let bigNum = 123456789987654N
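To get a feel for how big a BigInt can get, here is a minimal sketch. It assumes a current F# compiler, where the I suffix produces a System.Numerics.BigInteger; the names squared and twoTo100 are made up for illustration. The BigNum (N) literal is left out because I am not sure it is available without an extra library.

```fsharp
open System.Numerics

// The I suffix creates an arbitrary-precision integer (assumed: BigInteger).
let bigInt = 9876543219876I

// Squaring it already goes past what a 64-bit integer can hold.
let squared = bigInt * bigInt

// 2 to the power 100, far outside the int64 range.
let twoTo100 = BigInteger.Pow(2I, 100)

printfn "%s" (string squared)
printfn "%s" (string twoTo100)
```

The values keep growing as needed, so the practical limit is available memory rather than a fixed bit width.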
The F# compiler will determine the types you are working with, a feature called type inference. To see what types are inferred, compile your .fs files using the -i switch to create an FSI (F# interface) file, or hover the mouse over a value in Visual Studio. For most of the F# programming you will do, inference will just work. Now, I know some of you might be thinking: “Yeehaw! I don’t have to worry about declaring types! I’m free!!” Well… some of you might have; I did. If you want to force a type and not let inference handle it for you, you have to use the conventions above, like so:
> 3423456573476N;;
val it : bignum = 3423456573476N
> "This will be a string of bytes"B;;
val it : byte []
= [84uy; 104uy; 105uy; 115uy; 32uy; 119uy; 105uy; 108uy; 108uy; 32uy; 98uy;
101uy; 32uy; 97uy; 32uy; 115uy; 116uy; 114uy; 105uy; 110uy; 103uy; 32uy;
111uy; 102uy; 32uy; 98uy;121uy; 116uy; 101uy; 115uy]
> 0x06D;;
val it : int = 109
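The same idea carries over to functions: the suffix on a literal is enough to steer what the compiler infers. This is a small sketch with made-up function names (addOne, addOneL, addOneF, half), assuming a current F# compiler.

```fsharp
// The literal suffix alone decides the inferred type of each function.
let addOne x = x + 1        // inferred as int -> int
let addOneL x = x + 1L      // inferred as int64 -> int64
let addOneF x = x + 1.0     // inferred as float -> float

// An explicit annotation is the other way to pin a type down.
let half (x: float32) = x / 2.0f

printfn "%d %d %.1f %.1f" (addOne 41) (addOneL 41L) (addOneF 41.0) (half 85.0f)
```

Passing an int64 to addOne (or an int to addOneL) is a compile-time error; F# does not convert between numeric types implicitly in these positions.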
But what happens if you have to convert between types? Well, F# has conversion methods like so:
> let x = 42;;
val x : int
> let bigx = Int64.of_int x;;
val bigx : int64
> bigx;;
val it : int64 = 42L
The first statement, let x = 42, and the resulting line, val x : int, are an example of type inference: the F# compiler infers that 42 is of type Int32. OK, not too much here to write home about. The second statement, Int64.of_int, actually converts x to an Int64, as demonstrated by the output “42L”. Again, not too much to write home about. There are similar methods to convert between the other types; I just didn't write them all out. Type inference is great, but you have to be careful when you try things like this:
> let reallybignum = 123456789456123789;;
let reallybignum = 123456789456123789;;
-------------------^^^^^^^^^^^^^^^^^^^
stdin(4,19): error: error: This number is outside the allowable range for 32-bit signed integers
Oops, I tried to stuff in a number larger than a 32-bit integer can hold. To fix this, we need to specify a 64-bit integer:
> let reallybignum = 123456789456123789L;;
val reallybignum : int64
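For comparison, here is a rough sketch of the same conversions using the conversion functions named after the target type (int64, float, byte, int). It assumes a current F# compiler, where these functions are available without opening anything; the value names are only for illustration.

```fsharp
let x = 42                      // inferred as int (System.Int32)

// Conversion functions are named after the type you want to end up with.
let bigx = int64 x              // int -> int64
let asFloat = float x           // int -> float
let asByte = byte x             // int -> byte
let parsed = int "42"           // string -> int (the string is parsed)

// The L suffix is still needed for a literal that does not fit in 32 bits.
let reallybignum = 123456789456123789L

printfn "%d %f %d %d %d" bigx asFloat asByte parsed reallybignum
```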
OK, as exciting as writing about types and type inference is, there is another part to this post. I was poking around in the F# source code and came across the conversion code F# uses. Here’s the function for converting other data types to an int:
let inline int32 (x: ^a) = (^a : (static member ToInt32: ^a -> int32)(x))
when ^a : string = (System.Int32.Parse(castToString x,System.Globalization.CultureInfo.InvariantCulture))
when ^a : float = (# "conv.i4" x : int32 #)
when ^a : float32 = (# "conv.i4" x : int32 #)
when ^a : int64 = (# "conv.i4" x : int32 #)
when ^a : int32 = (# "conv.i4" x : int32 #)
when ^a : int16 = (# "conv.i4" x : int32 #)
when ^a : nativeint = (# "conv.i4" x : int32 #)
when ^a : sbyte = (# "conv.i4" x : int32 #)
when ^a : uint64 = (# "conv.i4" x : int32 #)
when ^a : uint32 = (# "conv.i4" x : int32 #)
when ^a : uint16 = (# "conv.i4" x : int32 #)
when ^a : unativeint = (# "conv.i4" x : int32 #)
when ^a : byte = (# "conv.i4" x : int32 #)
Wow, there’s a lot going on here, but overall it should look familiar: it's a function. The inline keyword marks the function for code expansion, which means the compiler will copy the function body inline at each call site. The ^a parameter designates a static head-type (a statically resolved type parameter), which means the type must be known at compile time. The : type part of each when clause is a type constraint that selects the implementation used for that type. The (# “conv.i4” x : int32 #) is special syntax for a feature of the F# language, inline IL. I know I went through that fast, but at this point a lot of this stuff is specific to the compiler and is more detail than I can explain.

You could read this line: when ^a : int64 = (# "conv.i4" x : int32 #) as "when ^a is of type int64, use the IL instruction conv.i4, passing in the value (x) to convert, and tell the compiler the result type is int32".

A little about the IL part of the line: (# “conv.i4” x : int32 #). The (# #) block tells the compiler "here comes an IL instruction". The conv.i4 is the IL opcode for convert to an int32, x is the value to convert, and the : int32 completes the IL instruction by enforcing the int32 result type.
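Inline IL is, as far as I can tell, meant for the F# library itself, but the first line of the function (the inline keyword plus a ^a member constraint) can be tried in ordinary code. The sketch below assumes a current F# compiler; toInt32 and the Celsius type are hypothetical names invented for the example and are not part of the F# source.

```fsharp
// Hypothetical helper: works for any type ^a that exposes a static
// member ToInt32 : ^a -> int32. The check happens at compile time.
let inline toInt32 (x: ^a) : int32 =
    (^a : (static member ToInt32 : ^a -> int32) x)

// Hypothetical type that satisfies the constraint.
type Celsius =
    { Degrees: float }
    static member ToInt32 (c: Celsius) : int32 = int32 c.Degrees

let rounded = toInt32 { Degrees = 21.7 }
printfn "%d" rounded
```

Calling toInt32 on a type without a matching ToInt32 member is rejected by the compiler, which is the "must be known at compile time" part in action.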

Even though this example is fairly self-explanatory, I ran and got my copy of Expert .NET 2.0 IL Assembler by Serge Lidin and found that conv is indeed the IL code for convert operations and i4 is the int32 type. Conv takes the value from the stack, converts it and puts it back. Type conversions are tricky: if you reduce the size of a value, e.g. int64 -> int32, the most significant bytes are thrown away. Likewise, if you increase the size of a signed value, e.g. int32 -> int64, the value is sign-extended (unsigned values are zero-extended).

> let reallybignum = 123456789456123789L;;
val reallybignum : int64
> let truncated = Int32.of_int64 reallybignum;;
val truncated : int32
> truncated;;
val it : int32 = -1062963315
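The narrowing and widening behaviour can be checked with a few lines. This is a sketch assuming a current F# compiler, using the int32, int64 and uint32 conversion functions; the value names are illustrative only.

```fsharp
let reallybignum = 123456789456123789L

// Narrowing keeps only the low 32 bits of the 64-bit value.
let truncated = int32 reallybignum

// The same low 32 bits, masked out and viewed as a non-negative int64.
let low32 = reallybignum &&& 0xFFFFFFFFL

// Widening a signed value copies the sign bit into the new high bits.
let minusOne = -1
let signExtended = int64 minusOne

// Widening an unsigned value fills the new high bits with zeros.
let zeroExtended = int64 (uint32 minusOne)

printfn "%d %d %d %d" truncated low32 signExtended zeroExtended
```

Here low32 is 3232003981; because bit 31 is set, the same bit pattern read as a signed int32 is 3232003981 - 4294967296 = -1062963315, which is the value shown in the session above.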
If we look at how F# handles these conversions, we find the code:
when ^a : int64 = (# "conv.ovf.i4" x : int32 #)
The opcode conv.ovf.i4 is the IL overflow-checking conversion operator. If the value does not fit in the target type, an OverflowException is thrown instead of the value being silently truncated. That’s all for now; comments, questions and corrections are welcome!
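A quick way to see the overflow-checking behaviour from ordinary code is the Checked module. This is a sketch that assumes Checked.int32 in a current F# core library is the conversion built on that overflow check; tryNarrow is a hypothetical helper name.

```fsharp
let reallybignum = 123456789456123789L

// Hypothetical helper: returns None instead of letting the exception escape.
let tryNarrow (v: int64) : int32 option =
    try Some (Checked.int32 v)
    with :? System.OverflowException -> None

printfn "%A" (tryNarrow 42L)           // fits in 32 bits
printfn "%A" (tryNarrow reallybignum)  // does not fit
```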
