Scala Tutorials Part #2 - Type Inference & types in Scala

Type inference

This is part 2 of the scala tutorial series. Check here for the full series.

If you recall from the first part, we briefly dealt with type inference with a few examples. Now we are going to deal it with a little more depth, below is what this article will cover.

Note: This is a pretty dry topic,mostly theory based, but also important. I have tried to cover the essential ones in the most minimalistic way possible. Many materials have been given for leisure study.

: This article has been translated to chinese by ChanZong Huang, you can check it out here.


What exactly is type inference from a programmer's perspective

This is not something unique to scala. Many languages such as OCaml, Haskell, Rust, Swift, C#(starting with version 3.0) already have these.

Let’s start with the wikipedia’s definition.

Type inference refers to the automatic deduction of the data type of an expression in a programming language

Well, that was obvious as we saw from the previous post that it automatically deduces types.

The primary purpose is to help the programmer avoiding verbose typing but still maintaining the compile time type safety of a statically typed language.

Saying in a succinct manner, type inference is the best of both world’s i.e static and dynamic typing.

Or at-least it tries to be. The reality is largely dependant upon where/how we use it.

An Overview of a type system

A type system is a language component that is responsible for type checking.
Scala is a statically typed language, so there are always a defined set of types(, and anything that does not belong in that set is not regarded as a valid type and an error is thrown at compile time.

Why have any type system at all?

Because computers cannot match human stupidity, and certain things are better handled by the compiler rather than relying on people for getting it right. It also can prevent a ton of bugs that pop up due to improper types. A type system exists to give type safety, and the levels of strictness is what differentiates between different languages and runtimes.

This brings us to another question, what kind of type systems exist and based upon which we can classify different languages.

Language classification according to their type system

Quoting from this wikipedia page,

That was a dizzying number of systems. I also highly encourage you take this course, to understand more. Videos are available for download/offline viewing.

This might also come up on coursera as well, so make sure you add it to your watchlist.

There is also another course which is really good.

Scala as mentioned above can be classified as a statically typed language with type inference. There is a strong relation between functional programming and type inference which we will keep re-visiting from time to time.

Hindley-Milner(HM) type inference

We can talk about type inference for days but perhaps the most famous algorithm of them all is HM type inference algorithm.

The HM algorithm checks the program to deduce what type it belongs to. If you have taken the courses above, then you would have a pretty solid idea what that means.

Below is an example of how a typical type system with type inference would work. It would build a parse tree consisting of all the elements, analyses the elements of what type it could belong to and arrive at a final conclusion.

Scala type system

The above example is pseudo code, the syntax is not much important. It returns true if the sum is less than 10 and false if greater. We can translate/build up from this example to other complicated workflows.

Many algorithms work in almost the same manner. If there are any type errors such as multiplying two strings, it would throw an exception.

Some entry level haskell programming will really help to understand this better. Learn you a haskell is a good website to start with.

Hindley-Milner algorithm is also called as Global type inference. It reads the whole of the code and deduces the types. Scala’s type system works a little different as explained below.

Local vs Global type inference and sub-typing - Why scala choose local type inference

Scala’s follows a combination of sub-typing and local type inference. I tend to compare it with Haskell, since it one of the most famous Functional Programming(F.P) paradigm language out there.

Let’s understand with an example.

If we consider the below code, it gives a compile time error in Intellij.

def factorial(a: Int) = {
    if (a <= 1) 1 else a * factorial(a - 1)

Error message

Scala type error

Syntax details can be ignored for now (we can deal with lot more detail while learning methods). The program computes the factorial value based on the number passed in. If we notice the error, the pre-compiler is not able to infer the type of the recursive function. The same(similar) code can be used in haskell without any errors.

let factorial 0 = 1; factorial n = n * factorial (n - 1)

The above code when executed inside the haskell GHCI shell (kind of like scala REPL) compiles with no errors.

Haskell Global type inference

This is a real world example of Global vs Local type inference. In scala, we have to annotate the types wherever local type inference does not help (also see below on when to use type inference).

The correct version of the above code would be as follows. Notice the type Int is explicitly mentioned which is not present in the code above.

 def factorial(a:Int): Int = {
    if(a <=1) 1 else a * factorial(a-1)

For a language that is multi-paradigm, it is really hard to do global/hindley-milner style type inference since it restricts doing OOP features such as inheritance and method overloading. We are not going to in detail of why languages such as Haskell cannot do such things (there are lots of resources on the net if you are curious on systems programming/compiler hacking), but the point is scala has made a different trade-off.

Systems do exist which combine these together, but to a programmer if there is a type error, the compiler has to give meaningful error messages so that they can be fixed. In reality, this is very hard to do. Continuous research is being done in this area to improve them.

A brief overview of scala's type system and subtyping

As mentioned above a type system is made of pre-defined components of types and this forms the foundation of how scala infers them.

Scala type system

The picture says it all, you can try to dig into the source code by the usual intellij route of ctrl+click and it all points to the Any class. Please note that types are not regular classes, although they seem to be. We will deal with it in a future article in detail.

Sub-typing is something that is not supported by the Hindley-Milner algorithm, but is essential in a multi-paradigm world. This is also another reason why scala does not use the HM algorithm.

Let’s look at the below example to understand sub-typing.

Scala sub-typing

We are constructing a heterogeneous list. Sub-typing converts the lower type into a higher type if possible. A simple example would converting a Int to a Double which is the first example. If it cannot be fit, it goes to the top level i.e the Any type. All of this conversion can be translated to the type system hierarchy above.

This makes Objected oriented programming much easier to handle.

For more information you can visit the scala docs.

When to use type inference?

There is a fine line that divides dynamic typing (no types) and static typing with type inference. As they say “all code should look like well written prose”, it is important to know when to use them and when not to.

When to use them?

When it saves programmer time and also where type information does not really matter. Situations could be inside of a function or a loop where the information about types is obvious.

When not to use them?

Simple, when type information is important i.e it should not leave the programmer who reads to code guessing about types.

It is hard to give a code example since it really depends on application under consideration. The only fool-proof way to deal with this is to conduct code reviews with peer programmers and see if they can understand them.

After all writing code is O(K) and reading code would be O(N), where K would be a constant with not much variation, since only a single person would be writing it and N would be the size of your team trying to read the code. It multiplies with team size.

With guessing comes mistakes, with mistakes come bad code, and with bad code comes frustration, with frustration comes the axe murderer

It is a matter of code readability rather than anything else. With freedom comes responsibility

Congratulations !! If you have understood/reached this far, then you should be proud of yourself. Rather than saying this is a pretty difficult topic, I would say it is a very non-intuitive one to get your head around.

Stay tuned !! This is just the beginning.


Tagged Under


Search this website...

Keeping up with blogs...

I blog occasionally and mostly write about Software engineering and the likes. If you are interested in keeping up with new blog posts, you should follow me on twitter where I usually tweet when I publish them. You can also use the RSS feed , or even subscribe via email below.

Feedio Subscribe

Share & Like

If you like this post, then you can either share/discuss/vote up on the below sites.

Thoughts ...

Please feel free to share your comments below for discussion

Blog comments powered by Disqus