Introduction to Types in Programming Languages

17 January 2018

Computers are not magic

Computers are stupid. Really. Every computer, phone, or tablet is based on the simple fact that a switch has two possible states: on and off. And as you add switches, the number of possible states grows exponentially. If you have access to electricity at home (which I hope), you probably have switch-controlled lights. Consider your living room lights. They have two possible states: on and off. Now also consider your kitchen lights, which likewise have two possible states. Taken together, they give four different states: both are on, both are off, only the kitchen is on, or only the living room is on. If you add your bathroom lights, you get eight different states, then sixteen, thirty-two, and so on: each new switch doubles the count. With only ten switches, you already have 2^10 = 1024 possible states.
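If you want to check the arithmetic yourself, here is a minimal C sketch (the variable names are mine, purely for illustration): each switch doubles the number of states, so n switches give 2^n states.

#include <stdio.h>

int main(void)
{
  int switches = 10;
  unsigned long states = 1UL << switches;                   /* doubling once per switch: 2^10 */
  printf("%d switches -> %lu states\n", switches, states);  /* prints 1024 */
  return 0;
}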

Well, a computer is just that. A (very) large number of switches in a box, so large that, as a whole, it can be in an astronomically large number of states. And when you press a key on your keyboard or move your mouse even slightly, you toggle some of these switches to change the state of your computer. Any program you run on your computer follows the same principle: a program is a sequence of switches being toggled on and off, which may eventually produce a result (such as displaying this very webpage on your screen). That is why you may have heard of computers as executing binary code (any hacker in your favorite TV series probably reads and writes fluently in binary code because it's badass), which is simply a sequence of 0's and 1's (such as 001101010001). Binary code is nothing more than instructions telling the computer to toggle a switch off (0) or on (1). Which brings me to my main point: a computer is stupid; it only toggles the switches we tell it to, and it has no inherent intelligence (and I don't really want to go into long arguments about what exactly defines intelligence).

Enter programming languages

Although definitely badass, writing programs in binary code is totally impractical, error-prone and trichotillomania-inducing. That's why programmers do not usually write programs in binary code anymore, but instead use higher-level languages, that is, programming languages which are (usually) easier for a human being to understand. It is then the role of a compiler to translate from a high-level language to binary code, so that programs can be understood by a computer.

Of course, even if compilers were free of bugs (which is definitely not the case), writing programs in high-level languages does not eliminate the risk of making mistakes. If it did, your favorite piece of office software (or, to be more honest with yourself, your favorite game) would not have crashed mid-work (aka. mid-boss battle), causing you to lose hours of progress. The consequences can be even worse when programs are used in critical systems: you don't want the program controlling the engines of a rocket to unexpectedly stop mid-flight, or the altimeter of a plane to report the wrong altitude.

Therefore, programmers, engineers and researchers are constantly striving not only to eradicate bugs, but also to find better ways of avoiding mistakes in the first place. This leads us to types in programming languages.

Types and typing

Types in programming languages are, in essence, a way to find and track down some bugs more easily. They can even detect certain mistakes before the program is executed at all, which is hugely beneficial for critical systems. To understand why, let's start by explaining what a type is in a programming language.

In a programming language called C, you can write code like this:

int add(int x, int y) 
{
  return (x+y);
}

This code tells the computer to define a function named add which expects two values x and y, and adds them together to produce a new value equal to x+y. This is very much like a mathematical function which takes two inputs and has one output corresponding to the sum of its inputs. But what the function is and does exactly does not really matter: for all intents and purposes, you can consider it to be a black box where you insert two things and get a thing back. The important part is the annotations int (short for integer). These annotations indicate that the two things you put in the box must be integers, and that the output will always be an integer. Such annotations are what we call types. We also say that the inputs must be of type int and the output is of type int.
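To make this concrete, here is a minimal usage sketch (the main function and the printf call are mine, not part of the original example):

#include <stdio.h>

int add(int x, int y)
{
  return (x+y);
}

int main(void)
{
  int sum = add(2, 3);  /* both inputs are of type int */
  printf("%d\n", sum);  /* prints 5, which is also an int */
  return 0;
}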

Such types can be crucial to the programmer. In particular, since the language now knows that our black box (aka. the function add) can only be applied to integers, it can verify that the programmer did not make any mistake by, for example, trying to put a carrot in the black box. The act of adding types and verifying that every box/function is used correctly is what we call typing. Additionally, whether this verification takes place before or during the execution of the program is the fundamental difference between what we call static typing and dynamic typing, which will be the topic of the next post.
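As a rough sketch of what this verification looks like in practice (the exact wording of the diagnostic depends on the compiler, and the main function below is mine), handing the box something that is not an integer gets flagged:

/* assuming the add function defined earlier in this post */
int main(void)
{
  /* "carrot" is a string, not an int: a C compiler will reject this
     (or at least emit a loud warning) before the program ever runs */
  int oops = add("carrot", 3);
  return 0;
}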