Background
As I usually do every Sunday, I skim through Sergeys F# Weekly just to see if there are anything interesting happening in the F# Community.
This week I found Lucas Reis’ blog post really well written, educational and didactic, specially the visualization of final state machine representation.
What seem to tingle a bit
my OCD
was the implementation of the EventStore
:
1
2
3
4
5
6
7
8
9
type EventStore() =
let eventList =
new ResizeArray<String * ScoutEvent>()
member this.Save(name, events) =
events |> List.iter (fun e -> eventList.Add(name, e))
member this.Get() =
eventList
Problem by introducing OO data structures into F# (or OCaml)
As Lucas mention, you can just declare a type with ()
and define
it’s members, and then you have a new data structure in F#. As with Lucas
EventStore
, I will point out the main issue by taking this
approach. If we look into MSDN, we can see that ResizeArray
is just
a type abbreviation for a generic .NET list:
So my example will also made by using the built-in ResizeArray
data
structure:
1
2
3
4
let xs = new ResizeArray<int>()
Array.Parallel.init 1000 (fun i -> xs.Add i) |> ignore
xs |> Seq.reduce(fun x y -> x + y)
We can see that the final reduced sum is a non-deterministic as well as incorrect result:
So why is this happening? Well if you are used to work with the .NET platform,
you might as well (if you actually read the documentation on MSDN) have seen the
following text on the bottom of almost every Class
definition,
under the Thread Safety sections:
Public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
The main point here is that .NET collections are not immutable and therefore don’t fit well with the functional paradigm that F# is mainly built-on, even though it has support for other paradigms as imperative and OO.
Build your data structures the right way
Is there a way to solve this self inflicted problem? Yes, we can create constrained types in F#, see Scott Wlaschin Gist in the References below for more information, where you can avoid exposing types from a module. They are accessible from inside the module, but not from code importing the module.
With this in mind, I will create an immutable array like this:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
module Immutable =
type 'a iarray = private | T of 'a array with
override ia.ToString() =
ia |> function | T xs -> xs |> sprintf "%A"
module Array =
let init n f =
Array.Parallel.init n f |> T
let map f (T xs) =
xs |> Array.Parallel.map f |> T
let iter f (T xs) =
xs |> Array.iter f
let reduce f (T xs) =
xs |> Array.reduce f
let fold init f (T xs) =
xs |> Array.fold f init
let length (T xs) = xs |> Array.length
let at i (T xs as ixs) =
if i < 0 || i >= (length ixs) then
failwith (sprintf "index: %i is out of boundries." i)
else
xs.[i]
let append (T xs) (T ys) =
Array.append xs ys |> T
module Extra =
let add x (T xs) =
Array.append xs [| x |] |> T
let pop (T xs as ixs) = length ixs |> function
| 0 -> failwith "the array is empty."
| 1 -> [| |] |> T
| n -> xs.[0 .. n-2] |> T
where the 'a iarray
is visible from outside, while the single-case
union constructor T
is marked as private | T of 'a
array
therefore it can only be accessed from inside the module (and
sub modules).
As you can see in the sub (and sub sub) modules, I’m just extracting the standard (and mutable) array type from the single-case union constructor and then using the built-in functions to perform the desired logic.
If you look carefully, I’m never exposing the underlying and mutable array,
therefore, as I don’t allow any external piece of code to instantiate my type
iarray
unless it’s by using the init
function, I can
therefore argue that my data structure is sound to be used as an immutable F#
data structure as the native built-in would be used.
ResizeArray vs iarray
Snippets:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
let foobar =
Array.Parallel.init 1000 id
|> Array.reduce(fun x y -> x + y)
let foo =
let xs = new ResizeArray<int>()
Array.Parallel.init 1000 (fun i -> xs.Add i) |> ignore
xs |> Seq.reduce(fun x y -> x + y)
let bar =
let xs = Immutable.Array.init 0 id
Array.Parallel.init 1000 (fun i -> xs |> Immutable.Array.Extra.add i)
|> Array.reduce(fun x y -> Immutable.Array.append x y)
|> Immutable.Array.reduce (fun x y -> x + y)
Output:
Functor modules as in OCaml
In my current implementation of iarray
my additions to the array
are in linear time, as a new array +1 needs to be allocated on another spot in
memory, while my indexed access still is in constant time. So in the case that
I was using this data structure for a lot of reads but very few inserts, it
would be ideal, but what about if I had a lot of inserts but very few reads? Or
what if I had more or less fifty/fifty on reads and writes? Well, in the case
that I had a lot of writes and few reads, I would have used a standard built in
list as the underlying data structure due to constant addition and linear
reads while in the case where I had fifty/fifty reads and writes I would
probably go for a balanced tree, logarithmic reads and writes. In all these
cases, I would actually have to create new and separated modules for each of the
approaches I mention.
Therefore it would be really nice if F# could port the Functor modules from OCaml as it would allow us to change the underlying datastructures inside a module.
I’ve POC an approach where I used records as modules, as you can see in the References, but it’s very hackerish and doesn’t really gets the job done …
Conclusion:
I think it’s a change of the mindset that you need to do when your are coding with functional programming languages that are multi-paradigm, as you will be able to do things the way you are used to do, in an OO way, but that might not always be the appropriate approach.
References:
- Sergey Tihon’s Blog:
- Lucas Reis’ Blog:
- MSDN:
- Scott Wlaschin on GitHub Gist:
- Blog: Ramón Soto Mathiesen
- Part I - An introduction to OCaml: