Typing#

At this point, you have met most of Scheme’s essential data types:

numbers,
strings,
booleans,
pairs,
the empty list (a unique value),
symbols,
functions (also called “procedures”).

(Remember that lists can be formed indirectly using pairs and the empty list.)

It is now time to broach what types actually means from Scheme’s point of view. For example: what do we mean by “this variable is an integer”?

Dynamic typing#

Many programming languages have so-called “static typing”. In these languages, types are determined before actually executing the program. While this improves performance and is usually necessary in order to compile a program to optimized machine code (which is done for LilyPond’s core, written in C++), it does mean that either the type system is inflexible or too simple and certain programs will be difficult to express or error-prone, like, e.g., C, or the type system is rich enough to express most programs one would like to express, but requires a certain learning curve to get used to, like, e.g., C++, Haskell, Rust. Scheme embraces the opposite paradigm, called “dynamic typing”, also found in languages such as Python, JavaScript, Lua or R. Namely, the Scheme interpreter has no idea what the type of a variable is before that variable is evaluated. Taking this invalid Rust program:

fn main() {
    println!("Hello, World!");
    println!("2 + false = {}", 2 + false);
}

When we try to compile this program, the compiler gives an error:

error[E0277]: cannot add `bool` to `{integer}`
 --> tmp.rs:3:34
  |
3 |     println!("2 + false = {}", 2 + false);
  |                                  ^ no implementation for `{integer} + bool`
  |
  = help: the trait `Add<bool>` is not implemented for `{integer}`
  = help: the following other types implement trait `Add<Rhs>`:
            <&'a f32 as Add<f32>>
            <&'a f64 as Add<f64>>
            <&'a i128 as Add<i128>>
            <&'a i16 as Add<i16>>
            <&'a i32 as Add<i32>>
            <&'a i64 as Add<i64>>
            <&'a i8 as Add<i8>>
            <&'a isize as Add<isize>>
          and 48 others

error: aborting due to previous error

Note that no code was actually run during this process. We did not see the text “Hello, World!” printed, even though the first line of our main() function is code that prints this string. This is because the Rust compiler determines types while compiling it, which is before the compiled program can be run. Contrast this with the equivalent Scheme program:

(begin
  (ly:message "Hello, World!")
  (ly:message "2 + #f = ~a" (+ 2 #f)))

If we run this in the Scheme sandbox, we get:

Hello, World!ice-9/boot-9.scm:1685:16: In procedure raise-exception:
In procedure +: Wrong type: #f

The program did print “Hello, World!”. Indeed, the Scheme interpreter does not try to determine types in advance. It sees a begin expression, so it evaluates the inner expressions one by one. The first one prints “Hello, World!”. Afterwards, the second one contains a sub-expression (+ 2 #f), which is evaluated. When the + function notices that it received an invalid argument, it raises an error.

In more concise terms, Scheme values have types, while Scheme variables do not. A variable in Scheme does not have any type information attached to it, but when executing the program, the variable has a value, and this value has a type.

In static languages, compound data types, such as list-like types, are typically homogeneous, meaning that the values they contain must all have the same type. This Rust program does not compile:

fn main() {
  let vec = vec![1, 2, true, "Hello, World!"];
}

On the other hand, this Scheme code is perfectly valid:

'(1 2 #t "Hello, World!")
⇒ (1 2 #t "Hello, World!")

Again, this corresponds to Scheme’s principle of being flexible with types. Not only would enforcing homogeneity bring no performance gains (Scheme would need to be more low-level for the interpreter to be able to take advantage of it), but it would not fit into the design of Scheme as a language that is flexible about types.

This flexibility means that we can write functions taking arguments that do not always have the same type, without any ceremony to explain what we are doing to the interpreter. Just think of the filter function.

(filter (lambda (x)
          (= x 0))
        '(0 1 2 3 4))
⇒ (0)
(filter (lambda (x)
          (eq? x 'sym))
        '(sym #t sym2 #f sym3 sym4))
⇒ (sym)

Here, the second argument was a list of numbers in the first case, and a list of symbols interspersed with booleans in the second case. In the previous section, we saw how to reimplement the filter function ourselves:

(define (filter function lst)
  (let loop ((rest lst)
             (acc '()))
    (if (null? rest)
        (reverse acc)
        (loop (cdr rest)
              (if (function (car rest))
                  (cons (car rest)
                        acc)
                  acc)))))

Conveniently, we did not have to explain to the interpreter how the types of the two arguments relate (the first argument must accept the values in the list passed as the second argument). As long as types are right when the program is run, Scheme is happy.

Basic type predicates#

So far, we have rarely had to wonder about typing when programming. Suppose we write this function:

(define (is-two-or-minus-two? x)
  (= 2 (abs x)))

(abs is the absolute value function: \(abs(x) = x\) if \(x\) is positive or zero and \(abs(x) = -x\) if x is negative or zero.)

It is implicit for the programmer that the argument x must be a number, and the interpreter will raise an error if this is not the case, when it tries to execute abs. Also, we know that the result of abs is always a number, so the call to = will never raise an error.

Now consider a second function:

(define birth-years
  '(("Dowland" . 1563)
    ("Mozart" . 1756)
    ("Schumann" . 1810)
    ("Hába" . 1893)
    ("Weill" . 1900)
    ("Ligeti" . 1923)))

(define (get-500th-birthday-year composer)
  (let ((birth-year (assoc-ref birth-years composer)))
    (+ 500 birth-year)))

(get-500th-birthday-year "Mozart") ⇒ 2256

While this function works on the composers defined in the alist birth-years, it throws an error if we pass an unknown composer, since assoc-ref returns #f in that case, and + does not know how to handle it:

(get-500th-birthday-year "Brahms")
⊨ In procedure +: Wrong type argument in position 1: #f

However, perhaps we’d like to do something else. For example, instead of throwing an error, return #f. We can do this with if:

(define birth-years
  '(("Dowland" . 1563)
    ("Mozart" . 1756)
    ("Schumann" . 1810)
    ("Hába" . 1893)
    ("Weill" . 1900)
    ("Ligeti" . 1923)))

(define (get-500th-birthday-year composer)
  (let ((birth-year (assoc-ref birth-years composer)))
    (if birth-year
        (+ 500 birth-year)
        #f)))

(get-500th-birthday-year "Brahms")
⇒ #f

This works, because we knew that the only possible non-number value was #f, which we can check for using if. But what if we wanted to differentiate not between a number and #f, but, say, a number and a list? For example, suppose that we enrich our alist to contain the full birth date, not just the year, as a three-element list (dd mm yyyy), except that we keep the year only if the exact day is not known.

(define birth-years
  '(("Dowland" . 1563)
    ("Mozart" . (27 01 1756))
    ("Schumann" . (08 06 1810))
    ("Hába" . (21 06 1893))
    ("Weill" . (02 03 1900))
    ("Ligeti" . (28 05 1923))))

(define (get-500th-birthday-year composer)
  (let ((birth-year (assoc-ref birth-years composer)))
    ???))

While dynamic typing is a usual feature of interpreted languages like Scheme, its way of allowing one to know about types at runtime is much less usual. In languages like Python or JavaScript, there is a type or typeof function that returns an object representing the type of a value. For example, in Python:

>>> type(5)
<class 'int'>
>>> type(True)
<class 'bool'>

While Guile does have a facility to access types like this (the class-of function in the (oop goops) module), it is not standard Scheme and its use in practice is very rare, at least in LilyPond. Instead, the more typical approach in Scheme is to use dedicated functions that the language provides, which check whether a value is of a certain type. These functions are called “type predicates”. A predicate is a function that returns a boolean (predicates are usually named with a question mark at the end), and a type predicate is a predicate that returns a boolean telling whether the argument has a certain type. These are the predicates for Scheme’s common types:

number?,
string?,
boolean?,
pair?,
null? (checks for the empty list),
symbol?,
procedure?.

The twist is that we never ask questions having the form “what is the type of this value?”, but only “is this value an X?”, for X a certain type.

In the example above, we can use pair? to distinguish pairs from numbers:

(define birth-dates
  '(("Dowland" . 1563)
    ("Mozart" . (27 01 1756))
    ("Schumann" . (08 06 1810))
    ("Hába" . (21 06 1893))
    ("Weill" . (02 03 1900))
    ("Ligeti" . (28 05 1923))))

(define (get-500th-birthday-date composer)
  (let ((birth-date (assoc-ref birth-dates composer)))
    (cond
     ((pair? birth-date)
      (list (list-ref birth-date 0)
            (list-ref birth-date 1)
            (+ 500 (list-ref birth-date 2))))
     ((not birth-date)
      #f)
     (else
      (+ birth-date 500)))))

(get-500th-birthday-date "Mozart")
⇒ (27 1 2256)
(get-500th-birthday-date "Dowland")
⇒ 2063
(get-500th-birthday-date "Beethoven")
⇒ #f

Derived and custom type predicates#

What if we want to check whether a value is a list? We already studied the definition of lists: a Scheme value is a list if it is the empty list, or a pair whose cdr is a list. This definition can be naturally translated into a predicate:

(define (list? x)
  (or (null? x)
      (and (pair? x)
           (list? (cdr x)))))

The need to check for lists is actually common enough that a list? predicate is provided by default[1].

However, the fact that we can define it ourselves illustrates the elegance of using predicates to express types. In some cases, you may want to define your own types from basic types – LilyPond, for example, defines a color, through the predicate color?, as string or a list of 3 or 4 numbers between 0 and 1. Some languages like Rust have a concept of “interfaces” or “traits”, which, in a very simplified view, allows defining a type X by “valid values of type X are values of type Y or values of type Z”. In its quest for minimalism, Scheme chose an approach that allows this flexibility without needing any additional special language feature. It may seem silly at first to restrict yourself to asking questions of the form “Is x a value of type X?” and never “What is the type of X?”, but realize that it allows a value to have several valid “types” (e.g., "red" is both a valid string, satisfying string?, and a valid color, satisfying color?), and this enables elegantly treating basic and custom types in exactly the same way.