F# is composed of expressions:
// expressions separated by `;`s
1; 2; 3
// expressions separated by new lines
4
5
6
6
If you wrap expressions with `[||]`, you get an array:
// This is an array.
[| 1; 2; 3 |]
// This is also an array.
[|
    1
    2
    3
|]
[ 1, 2, 3 ]
Every expression in F# has a type:
1 // int
2. // float
2.0 // float
"abc" // string
abc
All elements of an array must be of the same type, so `[| 1; "a"; true |]` is not valid.
In F#, the `,` separates tuple elements, not collection elements.
// two-ple
1, 2
(1, 2)
Item1 | 1 |
Item2 | 2 |
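One consequence is easy to trip over: writing commas inside `[| |]` still compiles, but it builds an array holding a single tuple. A minimal sketch (the names `threeInts` and `oneTuple` are just for illustration):

```fsharp
// Semicolons separate elements: an array of three ints.
let threeInts = [| 1; 2; 3 |]

// Commas build a tuple: an array containing one 3-tuple.
let oneTuple = [| 1, 2, 3 |]

printfn "%d" threeInts.Length // 3
printfn "%d" oneTuple.Length  // 1
```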
Tuples are useful for all kinds of things in F#, and the language comes with a terse syntax for representing them:
// 4-ary tuple
3, 4, 5, 6
// Unlike collections (lists and arrays), tuples can hold elements of different types.
"Erica", 34, false
// Sometimes parentheses are required
(6, 7)
(6, 7)
Item1 | 6 |
Item2 | 7 |
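Getting values back out of a tuple is just as terse. The sketch below shows two common ways, a tuple pattern on the left of `=` and the built-in `fst` / `snd` functions (which only work on pairs). The binding names are made up for this example:

```fsharp
let person = "Erica", 34, false

// A tuple pattern binds each element to its own name.
let name, age, isAdmin = person

// `fst` and `snd` read the first and second element of a 2-tuple.
let point = 6, 7
let x = fst point
let y = snd point

printfn "%s is %d (admin: %b)" name age isAdmin // Erica is 34 (admin: false)
printfn "x = %d, y = %d" x y                    // x = 6, y = 7
```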
Let's use tuples and arrays together to plot some points. We can install Plotly.NET from NuGet and use it all in one go:
// Use this syntax to install packages from **NuGet**.
// (Only necessary in interactive mode. Otherwise, packages can be installed from the command line with `dotnet add package <PackageName>`.)
#r "nuget: Plotly.NET"
#r "nuget: Plotly.NET.Interactive"
// Use this syntax to open a module or a namespace.
open Plotly.NET
open Plotly.NET.LayoutObjects
// set some default styling (ignore for now)
let margin = 30.
Defaults.DefaultHeight <- 400
Defaults.DefaultWidth <- 0
Defaults.DefaultTemplate <- Template.init(Layout.init(AutoSize = true, Margin = Margin.init(Top = margin, Left = margin, Right = margin, Bottom = margin)))
- Plotly.NET, 5.0.0
- Plotly.NET.Interactive, 5.0.0
Chart.Point([|
    1, 2
    2, 4
    3, 3
|])
You can also create arrays using the range operator `..`,
// start..end (both inclusive)
[| 1 .. 10 |]
[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]
// ..step..
[| 5 .. -1 .. -5 |]
[ 5, 4, 3, 2, 1, 0, -1, -2, -3, -4, -5 ]
or by using sequence expressions:
[| for i in 1..10 -> i * i |]
[ 1, 4, 9, 16, 25, 36, 49, 64, 81, 100 ]
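The `->` form is shorthand for `do yield`. Spelling it out leaves room for a condition, as in this small sketch (the names are illustrative only). The same syntax inside `[ ]` builds a list instead of an array:

```fsharp
// Long form of a sequence expression: only even numbers are yielded.
let evenSquares =
    [| for i in 1 .. 10 do
        if i % 2 = 0 then
            yield i * i |]

// A range with a negative step, this time as a list.
let countdown = [ 3 .. -1 .. 1 ]

printfn "%A" evenSquares // [|4; 16; 36; 64; 100|]
printfn "%A" countdown   // [3; 2; 1]
```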
Let's use them together to plot the array of integer squares up to 10:
open Plotly.NET
Chart.Line([| for x in 1 .. 10 -> x, x * x |])
In math, elements belong to one or more sets, and functions map elements between those sets.
Here is a function that maps an element $x$ to itself plus 2: $$f(x) = x + 2$$
We didn't write out the sets that $f$ maps elements between. We can just infer that information based on how $x$ is used in $f$.
That information is still there though. If we wanted to specifically say $f$ maps an integer to an integer, we could write a domain constraint: $$f : \mathbb{Z} \rightarrow \mathbb{Z}$$
We could write a function resembling $f$ in F# like so:
fun (x: int) -> (x + 2): int
Here, we take `x` (an `int`) and add `2` (an `int`) to it, which evaluates to `x + 2` (also an `int`).
`int` works kind of like $\mathbb{Z}$ here.
Similar to our first $f$, we can omit the `int` annotations:
fun x -> x + 2
It looks like the types just vanished 👻! But I assure you they're still there.
The F# compiler sees we added an `int` to `x`, and infers that `x` must be an `int`.
We can't see it yet, but the F# compiler also inferred that our function itself has a type `int -> int`, because it takes an `int` and evaluates to an `int`. We can demonstrate its type by applying an `int` to our function and seeing that we get an `int` back out:
(fun x -> x + 2) 3
5
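If you'd rather see the inference than take it on faith, one option is to write the annotated and unannotated versions side by side. This is only a sketch, and the names `annotated`, `inferred` and `inferredFloat` are made up here. Note how swapping the literal `2` for `2.0` changes what the compiler infers:

```fsharp
// Fully annotated: the function value, its parameter, and its result are all ints.
let annotated : int -> int = fun (x: int) -> x + 2

// No annotations: `2` is an int literal, so `x` is inferred to be an int.
let inferred = fun x -> x + 2

// `2.0` is a float literal, so this one is inferred as float -> float.
let inferredFloat = fun x -> x + 2.0

printfn "%d %d %.1f" (annotated 3) (inferred 3) (inferredFloat 3.0) // 5 5 5.0
```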
ℹ️ Note for C# developers
`int -> int` is an F# type, but what does this look like when compiled to an assembly? F# types are a superset of .NET types. All .NET types can be represented in F#, but F# function types compile to `FSharpFunc`. Reading on to Partial Application and taking a look at some F# decompiled to C# explains why.
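You can peek at this from F# itself without a decompiler. A minimal sketch, assuming nothing beyond FSharp.Core: ask .NET reflection for the runtime type behind `int -> int`.

```fsharp
// The .NET type that represents the F# function type `int -> int`.
let functionType = typeof<int -> int>

printfn "%s" functionType.Namespace // Microsoft.FSharp.Core
printfn "%s" functionType.Name      // FSharpFunc`2
```

The `` `2 `` suffix is how .NET marks a generic type with two type parameters, here the argument type and the result type.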
`let` Bindings
Functions allow us to create scopes (the `()`s) wherein we can assign names to values:
(fun x -> // name in (
    x + 2 // scope
) 3 // ) = value
5
We can rewrite this use of a function with a `let` binding:
let x = 3 in // name = value in (
    x + 2 // scope
// )
5
These two pieces of code effectively represent the same thing in F#.
The `in` and indentation are used to explicitly define the scope where `x` is defined. If you want a binding to be defined for as long as possible (up until the parent scope ends), you can leave out the `in` and indentation:
let y = 2 in
    let x = 3
    x + y // `x` is defined here
// `x` is not defined here
5
Assigning a name to an expression in F# is called a binding, because the value can't change once set. Using `=` without a `let` compares the equality of two objects.
let a = 3
a = 4
False
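A binding can't be changed, but a nested scope can introduce a new binding with the same name. That is called shadowing, and it leaves the outer value untouched. A small sketch (the name `inner` is only for illustration):

```fsharp
let a = 3

let inner =
    // A brand new binding named `a`; it only exists inside this scope.
    let a = 4
    a + 1

printfn "%d" inner   // 5
printfn "%d" a       // 3, the outer binding never changed
printfn "%b" (a = 4) // false
```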
Partial Application
We can move the body up into the same line with `in`:
let x = 3 in x + 2
5
And assign the whole `let` expression to another one:
let five = let x = 3 in x + 2
five
5
Rewriting our inner `let` binding back to a `fun` looks like this:
let five = (fun x -> x + 2) 3
five
5
We can remove the `3` to delay binding the parameter to our function:
let add2 = fun x -> x + 2
add2 3
5
We can rewrite the above function by moving `x` to the left of the `=`. This results in the same behavior.
let add2 x = x + 2
add2 3
5
We can replace `x + 2` with another `fun`, one that introduces a parameter `y` and uses it in tandem with `x` (this is called a closure):
let add x = fun y -> x + y
add 3 2
5
We can also move `y` to the left of `=`:
let add x y = x + y
add 3 2
5
Practically, we've come across a function `add` that can take not just one parameter, but two! It may not surprise you that we can keep repeating this process to allow for many parameters. However, under the hood, we can treat functions that take multiple parameters like `add` as if they were recursively enclosing `fun`s, which means we can bind them to names without applying every single one of their parameters:
let add2 = add 2
add2 3
5
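To make the "enclosing `fun`s" view concrete, here is a small sketch. It shows that supplying both arguments at once and supplying them one at a time give the same result, and that a function with only its first argument supplied is an ordinary value you can hand to something like `Array.map`. The names `direct`, `stepwise` and `shifted` are made up for this example:

```fsharp
let add x y = x + y

// Supplying both arguments at once...
let direct = add 3 2
// ...is the same as supplying them one at a time.
let stepwise = (add 3) 2

// `add 10` is a function value waiting for its second argument.
let shifted = [| 1; 2; 3 |] |> Array.map (add 10)

printfn "%d %d %A" direct stepwise shifted // 5 5 [|11; 12; 13|]
```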
This feature is called partial application. It can help you express complexity using simple, modular pieces:
Here we build `add` and `divide` from `combine` by passing `+` and `/` to it. This code doesn't do much of anything useful, though...
// y applied to x applied to f
let combine f x y = f x y
let add = combine (+) // a way to pass the + function
let divide = combine (/)
add 3 4, divide 9 4
(7, 2)
Item1 | 7 |
Item2 | 2 |
We can pass a `check` function to `combine` that can perform a check and decide whether we want to continue the computation or not.
let combine f check x y =
    check f x y
We can build all different kinds of "adders" from `combine`:
let normalize f x y = f (abs x) (abs y)
let normalizeThenAdd = combine (+) normalize
normalizeThenAdd -4 5 |> printfn "%d"
9
let printThenAdd = combine (+) (fun f x y -> printfn "Adding %d + %d..." x y; f x y)
printThenAdd 5 6 |> printfn "%d"
Adding 5 + 6...
11
let add = combine (+) id // id is a special function that means "do nothing" in this context
add 3 5 |> printfn "%d"
8
We can build "safe dividers" that check when the denominator = 0 and change behavior in response.
`safeDivide` replaces `y` with `NaN` when `y = 0`:
let ``convert divBy0 to NaN`` f x y =
    f x (if y = 0 then nan else y)
let safeDivide = combine (/) ``convert divBy0 to NaN``
safeDivide 4 0 |> printfn "%f"
NaN
`safeDivide` implicitly evaluates to a `float`, though, because `NaN` is not a valid value for `int`s. Sometimes you absolutely do want integer division, which evaluates to an `int` and ignores the remainder.
`tryDivide` checks if `y = 0`, and if it is, it avoids doing the division altogether (by not evaluating `cont`).
ℹ️ Note
I should move this example down further to when I explain Option types, perhaps referencing this example. It is not 100% clear what is going on here without explaining option types.
let ``convert divBy0 to None`` cont x y =
    if y = 0 then None else Some(cont x y)
// remove this example for now and reference it when teaching the Option type
let tryDivide = combine (/) ``convert divBy0 to None``
tryDivide 4 1 |> printfn "%O"
tryDivide 4 0 |> printfn "%O"
Some(4)
<null>
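Until Option types get a proper introduction, the sketch below may help in reading those two results. `Some` wraps a value that exists and `None` stands for "no value", and a `match` expression makes the caller handle both. The helper `describe` is hypothetical and only here for illustration. `combine` and `tryDivide` are repeated so the snippet runs on its own:

```fsharp
let combine f check x y = check f x y

let ``convert divBy0 to None`` cont x y =
    if y = 0 then None else Some(cont x y)

let tryDivide = combine (/) ``convert divBy0 to None``

// Hypothetical helper: turn the optional quotient into a message.
let describe x y =
    match tryDivide x y with
    | Some quotient -> sprintf "%d / %d = %d" x y quotient
    | None -> sprintf "%d / %d is undefined" x y

printfn "%s" (describe 4 1) // 4 / 1 = 4
printfn "%s" (describe 4 0) // 4 / 0 is undefined
```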
Let's start with a simple exercise comparing with Python:
#!connect jupyter --kernel-name pythonkernel --conda-env base --kernel-spec python3
The `#!connect jupyter` feature is in preview. Please report any feedback or issues at https://github.com/dotnet/interactive/issues/new/choose.
Kernel added: #!pythonkernel
We create a function that prints the combined age of two people:
def add_person_age(person1, person2):
    print(f"{person1.name} and {person2.name}'s combined age is {person1.age + person2.age}")
We construct two objects that each contain the attributes our function uses:
# using the type function
person1 = type("", (), {})
person1.name, person1.age = "Rebecca", 23
# using a class
class Person:
    # optionally write an __init__ function to instead pass attribute values to a constructor
    pass
person2 = Person()
person2.name, person2.age = "Eric", 27
And then call the function with our objects:
add_person_age(person1, person2)
Rebecca and Eric's combined age is 50
person3 = type("", (), {})
person3.name = "Eric"
# oops, we forgot to assign age...
# ❌ Python accepts this code, but the call ⬇️ fails because person3 has no `age` attribute
add_person_age(person1, person3)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[1], line 5
      2 person3.name = "Eric"
      3 # oops, we forgot to assign age...
      4 # ❌ Python accepts this code, but the call ⬇️ fails because person3 has no `age` attribute
----> 5 add_person_age(person1, person3)
Cell In[1], line 2, in add_person_age(person1, person2)
      1 def add_person_age(person1, person2):
----> 2     print(f"{person1.name} and {person2.name}'s combined age is {person1.age + person2.age}")
AttributeError: type object '' has no attribute 'age'
We can tell before executing anything that invoking `add_person_age(person1, person3)` will always fail. However, the error isn't raised until we actually run our code.
Let's write one possible F# alternative:
let inline addPersonAge person1 person2 =
    printfn "%s and %s's combined age is %d"
        (^T: (member Name : string) person1)
        (^U: (member Name : string) person2)
        ((^T: (member Age : int) person1) + (^U: (member Age : int) person2))
`^T: (member Name : string)` is called a type constraint. When applied to `person1`, it's the same as accessing `person1.Name` but also tells the F# compiler that `person1` has to have a `Name` of type `string`.
We can evaluate `addPersonAge` with anonymous records:
addPersonAge
    {| Name = "Rebecca"; Age = 23 |}
    {| Name = "Eric"; Age = 27 |}
Rebecca and Eric's combined age is 50
// ❌ this code does not compile
addPersonAge
    {| Name = "Rebecca"; Age = 23 |}
    {| Name = "Eric" |} // oops... we forgot to add `Age`
Stopped due to error
input.fsx (4,5)-(4,24) typecheck error The type '{| Name: string |}' does not support the operator 'get_Age'. See also input.fsx(2,0)-(2,12).
This feature is called Statically Resolved Type Parameters. It's pretty useful for when you want to make assertions about the data your function accepts, but you'd like to write your function in a way that accepts all kinds of data.
For example, Jonny's record has an additional property below, but we can still use it with `addPersonAge` because it satisfies the necessary type constraints (requires `member Name` and `member Age`):
addPersonAge
    {| Name = "Rebecca"; Age = 23 |}
    {| Name = "Jonny"; Age = 34; IsAdmin = true |}
Rebecca and Jonny's combined age is 57
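The constraints only care about which members exist, not which kind of type provides them. As a sketch, the version below returns the message instead of printing it (so it's easier to inspect) and mixes an anonymous record with a named record. `describeAges` and the `Employee` type are hypothetical, introduced only for this example:

```fsharp
// Same member constraints as `addPersonAge`, but returning the string.
let inline describeAges person1 person2 =
    sprintf "%s and %s's combined age is %d"
        (^T: (member Name : string) person1)
        (^U: (member Name : string) person2)
        ((^T: (member Age : int) person1) + (^U: (member Age : int) person2))

// Hypothetical named record with an extra field.
type Employee = { Name : string; Age : int; Title : string }

let message =
    describeAges
        {| Name = "Rebecca"; Age = 23 |}
        { Name = "Jonny"; Age = 34; Title = "Engineer" }

printfn "%s" message // Rebecca and Jonny's combined age is 57
```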
We can make the body of `addPersonAge` more concise by moving the type constraints to the function signature:
let inline addPersonAge<'T, 'U
    when 'T : (member Name : string)
    and 'T : (member Age : int)
    and 'U : (member Name : string)
    and 'U : (member Age : int)>(person1: 'T) (person2: 'U) =
    printfn "%s and %s's combined age is %d"
        person1.Name
        person2.Name
        (person1.Age + person2.Age)
However... in turn, it makes the function signature a little cluttered...
When we have control of our source data, it's often better to use explicit types:
type Person = {
    Name : string
    Age : int
}
The following code creates a `person1` of type `Person`, which denotes that `person1` has the exact shape of `Person`: no more members, no fewer.
let person1 : Person = { Name = "Rebecca"; Age = 23 }
The `: Person` is called a type annotation. It's often not needed, which we can demonstrate with our new `addPersonAge`:
let addPersonAge person1 person2 =
    printfn "%s and %s's combined age is %d"
        person1.Name
        person2.Name
        (person1.Age + person2.Age)
addPersonAge { Name = "Rebecca"; Age = 23 } { Name = "Rebecca"; Age = 23 }
Rebecca and Rebecca's combined age is 46
Just because we didn't have to write the type doesn't mean it's not there. The compiler looks at how we used the parameter and infers the type. This is - perhaps unsurprisingly - called type inference.
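Inference for records works from the field names, which matters once two record types share them. As a sketch of that caveat: when nothing else pins the type down, the compiler picks the most recently declared record with matching fields, and an annotation is how you choose the other one. The `Pet` type and the `describe...` functions are hypothetical, made up for this example:

```fsharp
type Person = { Name : string; Age : int }
// Hypothetical second record with the same field names.
type Pet = { Name : string; Age : int }

// No annotation: `p.Name` resolves to the latest matching record, so `p` is a Pet.
let describePet p = sprintf "%s is %d" p.Name p.Age

// An annotation selects Person explicitly.
let describePerson (p : Person) = sprintf "%s is %d" p.Name p.Age

let rebecca : Person = { Name = "Rebecca"; Age = 23 }
let rex = { Name = "Rex"; Age = 3 } // inferred as Pet

printfn "%s" (describePerson rebecca) // Rebecca is 23
printfn "%s" (describePet rex)        // Rex is 3
```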
Type Providers
We did still have to define the structure of `Person` up front though. This is called domain modeling, and it's useful when you are writing an application to model a business function and want to reduce the number of possible error states to a minimum.
Sometimes, however, we're not writing an application but instead writing a script that deals with large amounts of data, and our stupid mistakes might come from accidentally misinterpreting the structure of our data.
F# has a feature for this called type providers. Whenever we're dealing with external data imports, type providers create a type for us based on the structure of the data we're importing, meaning we don't have to manually create types for huge data sets and our data and types never get out of sync.
Here's a quick example using the FSharp.Data WorldBank type provider that I took from their documentation:
#r "nuget: FSharp.Data"
open FSharp.Data
let data = WorldBankData.GetDataContext()
data.Countries.``United Kingdom``.Indicators.``Gross capital formation (% of GDP)``
|> Seq.maxBy fst
- FSharp.Data, 6.4.0
(2023, 17.7531439952426)
Item1 | 2023 |
Item2 | 17.7531439952426 |
The WorldBank provider is a premade example where a type is specifically created and republished using a well-known data source. But we can also use "data type" (CSV, JSON, SQL, etc.) type providers to create types for our own data:
// define our data source
[<Literal>]
let uri = "http://query1.finance.yahoo.com/v7/finance/download/MSFT?period1=1678116713&period2=1709739113&interval=1d&events=history&includeAdjustedClose=true"
// create a type using it
// (normally, we'd pass a smaller and local file with the same structure here, but passing the same `uri` is fine for example / notebooks)
type Stocks = CsvProvider<uri>
// get a sample of the data from the new type using the default data source
let msft = Stocks.GetSample()
// plot the high vs low daily difference over time
Chart.Line([ for row in msft.Rows -> row.Date, row.High - row.Low ])
Working with type providers can sometimes be less convenient than dynamically typed data access libraries like `pandas`'s `DataFrame` when your data isn't well structured.
For example, if you had data with similar structure as above (`Date`, `Open`, `Adj Close`, etc., columns) but for multiple companies all in one CSV file, they might be addressed with the name of the company first, then the column name, such as `MSFT_Date`, for example.
You could access this data in an unsafe way using programmatic access, such as with f-strings in Python (`frame[f"{ticker}_{column_name}"]`), but F# type providers would have no knowledge of the implicit structure via column naming because CSV does not allow for multi-level indexing unlike other data types like JSON. In fact, iterating over all or a subset of columns from a `CsvProvider` type requires a hack, and you'd be better off importing into a semi-strongly-typed `DataFrame` using a package called Deedle than using type providers.
However, when you need to access a few named columns of homogeneous types, it's actually not too difficult to work with the data provided from type providers as pure collections and not use a data frame at all:
open System // TimeSpan and DateTime
// Group observations into fixed-width time buckets and average each column per bucket.
let resample (interval : TimeSpan) (observations : (DateTime * decimal array) seq) =
    observations
    |> Seq.groupBy (fun (dt, _) -> dt.Ticks / interval.Ticks)
    |> Seq.map (fun (_, group) ->
        group |> Seq.map fst |> Seq.head,
        group |> Seq.map snd |> Seq.transpose |> Seq.map Seq.average |> Array.ofSeq)
// Show (date, [| low; high |]) observations as a table.
let print observations =
    observations
    |> Seq.map (fun (date, (cols : 'a array)) -> {| Date = date; Low = cols[0]; High = cols[1] |})
    |> _.DisplayTable()
"__MSFT Stock Highs and Lows__" |> _.DisplayAs("text/markdown")
msft.Rows
|> Seq.map (fun row -> row.Date, [| row.Low; row.High |])
|> Seq.take 7
|> print
"__MSFT Stock Highs and Lows - Resample and Average__" |> _.DisplayAs("text/markdown")
msft.Rows
|> Seq.map (fun row -> row.Date, [| row.Low; row.High |])
|> resample (TimeSpan.FromDays 7)
|> Seq.take 3
|> print
MSFT Stock Highs and Lows
Date | High | Low |
2023-03-06 00:00:00Z | 260.119995 | 255.979996 |
2023-03-07 00:00:00Z | 257.690002 | 253.389999 |
2023-03-08 00:00:00Z | 254.539993 | 250.809998 |
2023-03-09 00:00:00Z | 259.559998 | 251.580002 |
2023-03-10 00:00:00Z | 252.789993 | 247.600006 |
2023-03-13 00:00:00Z | 257.910004 | 245.729996 |
2023-03-14 00:00:00Z | 261.070007 | 255.860001 |
MSFT Stock Highs and Lows - Resample and Average
Date | High | Low |
2023-03-06 00:00:00Z | 256.9399962 | 251.8720002 |
2023-03-13 00:00:00Z | 269.0700014 | 260.0799988 |
2023-03-20 00:00:00Z | 279.0420046 | 272.4059998 |
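To see what the grouping in `resample` does without downloading anything, here is a sketch that runs the same logic over three made-up observations (not real quotes). The function is repeated so the snippet is self-contained. Buckets are computed from ticks since 0001-01-01, which was a Monday, so 7-day buckets start on Mondays:

```fsharp
open System

// Same logic as `resample` above, repeated so this sketch runs on its own.
let resample (interval : TimeSpan) (observations : (DateTime * decimal array) seq) =
    observations
    |> Seq.groupBy (fun (dt, _) -> dt.Ticks / interval.Ticks)
    |> Seq.map (fun (_, group) ->
        group |> Seq.map fst |> Seq.head,
        group |> Seq.map snd |> Seq.transpose |> Seq.map Seq.average |> Array.ofSeq)

// Made-up data: date, [| low; high |]. 2024-01-01 is a Monday.
let observations =
    [ DateTime(2024, 1, 1), [| 10.0m; 12.0m |]
      DateTime(2024, 1, 2), [| 11.0m; 13.0m |]
      DateTime(2024, 1, 9), [| 20.0m; 22.0m |] ]

let weekly = observations |> resample (TimeSpan.FromDays 7.0) |> List.ofSeq

for date, cols in weekly do
    printfn "%s low=%M high=%M" (date.ToString "yyyy-MM-dd") cols[0] cols[1]
```

The first two observations land in the same week and are averaged (low 10.5, high 12.5), each bucket is labelled with the date of its first observation, and the third observation forms a bucket of its own.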
If you do need access to a data frame, it's pretty trivial to import data into a `DataFrame` using Deedle and perform operations on it. You won't get the helpful IntelliSense hints informing you of valid navigations, and you'll get most type errors at runtime instead of compile time, but the errors should be clearer than Python in most cases.
#r "nuget: Deedle"
- Deedle, 3.0.0
open System.Net
open System.Net.Http
open Deedle
[<Literal>]
let url = "http://query1.finance.yahoo.com/v7/finance/download/MSFT?period1=1678116713&period2=1709739113&interval=1d&events=history&includeAdjustedClose=true"
let frame =
    Frame.ReadCsv((new HttpClient()).GetStreamAsync(url).Result)
    |> Frame.indexRowsDate "Date"
frame.Print()
let df =
    frame?Low
    |> Series.sampleTime (TimeSpan.FromDays 7) Direction.Forward
    |> Series.mapValues (fun v -> Stats.mean v)
df.Print()
3/6/2023 12:00:00 AM -> 251.8720002
3/13/2023 12:00:00 AM -> 260.0799988
3/20/2023 12:00:00 AM -> 272.40599979999996
3/27/2023 12:00:00 AM -> 278.0919984
4/3/2023 12:00:00 AM -> 283.64250925
4/10/2023 12:00:00 AM -> 283.0340024
4/17/2023 12:00:00 AM -> 285.17000160000003
4/24/2023 12:00:00 AM -> 289.076001
5/1/2023 12:00:00 AM -> 304.1639953999999
5/8/2023 12:00:00 AM -> 306.58600459999997
5/15/2023 12:00:00 AM -> 311.6499938
5/22/2023 12:00:00 AM -> 317.95
5/29/2023 12:00:00 AM -> 328.77999124999997
6/5/2023 12:00:00 AM -> 327.41800539999997
6/12/2023 12:00:00 AM -> 333.5020082
... -> ...
11/27/2023 12:00:00 AM -> 375.7160034
12/4/2023 12:00:00 AM -> 366.2200012
12/11/2023 12:00:00 AM -> 367.547998
12/18/2023 12:00:00 AM -> 370.3599976
12/25/2023 12:00:00 AM -> 373.48750325000003
1/1/2024 12:00:00 AM -> 367.237503
1/8/2024 12:00:00 AM -> 376.31000359999996
1/15/2024 12:00:00 AM -> 389.012497
1/22/2024 12:00:00 AM -> 398.5859986
1/29/2024 12:00:00 AM -> 402.6699952
2/5/2024 12:00:00 AM -> 408.3839966
2/12/2024 12:00:00 AM -> 406.0880066
2/19/2024 12:00:00 AM -> 403.19250475
2/26/2024 12:00:00 AM -> 406.6660032
3/4/2024 12:00:00 AM -> 403.78334566666666