Is there a reason to have a bottom type in a programming language?

Question

A bottom type is a construct primarily appearing in mathematical type theory. It is also called the empty type. It is a type that has no values, but is a subtype of all types.

If a function's return type is the bottom type, that means that it does not return. Period. Maybe it loops forever, or maybe it throws an exception.

What is the point of having this weird type in a programming language? It's not that common, but it is present in some, such as Scala and Lisp.

Unit types (e.g. void in C, or unit in OCaml) are much more common than bottom types. Are you sure you are asking about the bottom type, not the unit type? Are you sure Lisp has a bottom type in the language? — Basile Starynkevitch, Mar 24 '15 at 05:47
void is not even a unit type. void is pretty much useless. — Display Name, Mar 24 '15 at 06:28
@SargeBorsch: are you sure of that? Of course one cannot in C explicitly define a void data... — Basile Starynkevitch, Mar 24 '15 at 08:21
@BasileStarynkevitch there are no values of type void, and unit type must have one value. Also, as you pointed out, you cannot even declare a value of type void, that means it's not even a type, just a special corner case in the language. — Display Name, Mar 24 '15 at 08:56
I know that and I agree with that. To be picky, it is not certain that C has types (in the strict denotational-semantics sense). Look into CompCert if you really care about formal typing in C (you'll see that, without additional precision, the question makes no sense) — Basile Starynkevitch, Mar 24 '15 at 08:59
Yes, C is bizarre in this, especially in how the pointer and function pointer types are written. But void in Java is nearly the same: not really a type & can't have values. — Display Name, Mar 24 '15 at 09:03
In the semantics of languages with a bottom type, the bottom type is not considered to have no values, but rather to have one value, the bottom value, representing a computation that never completes (normally). Since the bottom value is a value of every type, the bottom type can be a subtype of every type. — Theodore Norvell, Mar 24 '15 at 11:38
twanvl.nl/blog/haskell/conduits-vs-pipes mentions one example of a good use of the void type. Listing the reverse dependencies of the void Haskell library (only containing a definition of this type) gives 49 libraries, suggesting that there are many more. — monocell, Mar 24 '15 at 15:24
FYI, with Julia there is also a non-functional language that has a bottom type (called None). — Martin Ender, Mar 24 '15 at 16:44
@BasileStarynkevitch Common Lisp has the nil type which has no values. It also has the null type which has only one value, the symbol nil (a.k.a., ()), which is a unit type. — Joshua Taylor, Mar 24 '15 at 18:49
I agree on the subject of void and unit types. They don't consistently fall into any category. I like to think of void as a unit type from a philosophical point of view, because I like to think of functions as always having return types (it makes them more consistent), and thus functions that return must always return a value. In practice their implementation makes them different and inconsistent. — GregRos, Mar 24 '15 at 21:26
@TheodoreNorvell In languages with a bottom type, the bottom type has no values at all, and the same is true in type theory. Here is a list of sources: http://goo.gl/8CizpA. Of course languages can call anything they like a bottom type and have values of it, but the general consensus is that the type is uninhabited. The quality of being a subtype to all types isn't restricted to a bottom type though. — GregRos, Mar 24 '15 at 21:54
@JustGreg You are right. I should have said "In the denotational semantics of languages with a bottom type ...". For example, Pierce uses operational semantics in his book. The undefined "value" is certainly not a value in the sense that programmers usually use. Good point about being bottom and being a subtype of all types not being equivalent; I think that rules out Telastyn's answer. — Theodore Norvell, Mar 25 '15 at 00:10

score 36 · Answer 1 · edited Feb 01 '19 at 18:52

I'll take a simple example: C++ vs Rust.

Here is a function used to throw an exception in C++11:

[[noreturn]] void ThrowException(char const* message,
                                 char const* file,
                                 int line,
                                 char const* function);

And here is the equivalent in Rust:

fn formatted_panic(message: &str, file: &str, line: isize, function: &str) -> !;

On a purely syntactic level, the Rust construct is more sensible. Note that the C++ construct specifies a return type even though it also specifies that the function is not going to return. That's a bit weird.

As for standardization, the C++ syntax only appeared with C++11 (it was tacked on top), but various compilers had been providing their own extensions for a while, so third-party analysis tools had to be programmed to recognize the various ways this attribute could be written. Having it standardized is clearly superior.

Now, as for the benefit?

The fact that a function does not return can be useful for:

optimization: one can prune any code after it (it won't return), there is no need to save the registers (as it won't be necessary to restore them), ...
static analysis: it eliminates a number of potential execution paths
maintainability: (see static analysis, but by humans)
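
To make these benefits a little more concrete, here is a minimal Rust sketch. The names fail and port_or_fail are made up for this illustration and do not come from any library; the point is only that the `!` return type lets the compiler treat the error branch as a path that never continues.

```rust
/// Hypothetical helper: it never returns, and the `!` return type says so.
fn fail(message: &str) -> ! {
    panic!("fatal: {}", message)
}

/// Hypothetical caller: the `Err` arm has no value to produce, yet the
/// match still has type `u16`, because `!` coerces to any type.
fn port_or_fail(text: &str) -> u16 {
    match text.parse::<u16>() {
        Ok(port) => port,
        Err(_) => fail("not a valid port number"),
    }
}
```

If a statement is placed after a call to fail, rustc warns that it is unreachable, which is the static-analysis benefit in its simplest form.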

edited Feb 01 '19 at 18:52 · Deduplicator · answered Mar 24 '15 at 08:10 · Matthieu M.

void in your C++ example defines (part of) the function's type -- not the return type. It does restrict the value the function is allowed to return; anything that can convert to void (which is nothing). If the function returns, the return must not be followed by a value. The full type of the function is void (char const*, char const*, int, char const*). + 1 for using char const instead of const char :-) – Clearer Mar 24 '15 at 09:07
That doesn't mean it makes more sense to have a bottom type though, just that it makes sense to annotate functions on whether they return or not as part of the language. Actually, since functions can fail to return due to different reasons, it seems to be better to encode the reason in some way instead of using a catch-all term, kind of like the relatively recent concept of annotating functions based on their side effects. – GregRos Mar 24 '15 at 22:10
Actually, there's a reason to make "does not return" and "has return-type X" independent: Backwards-compatibility for your own code, as the calling-convention might depend on the return-type. – Deduplicator Mar 24 '15 at 22:59
Is [[noreturn]] part of the syntax or an addition of functionality? – Zaibis Mar 25 '15 at 10:19
@Zaibis: It's an addition of functionality, but is not part of the function signature. Thus if you have a handle on a function, this "property" is lost. – Matthieu M. Mar 25 '15 at 13:08
@Deduplicator A properly implemented bottom type is a subtype of any other, i.e. a (mythical) value of type bottom can be implicitly converted to any other type (this is vacuously true, because there are no values of type bottom). Your backwards compatibility is therefore covered by allowing to return subtypes in later versions (like you can do in overridden methods). – Alex Shpilkin Feb 01 '19 at 18:26
@AlexShpilkin Calling-conventions don't just concern themselves with theoretical ideals, but also practicalities. As an example, if a function returns a (bigger) object by value, the caller might have to pass a pointer to the destination. In contrast to that, if just a small integer is returned, that would comfortably fit into the return-register. Of course, if you only have reference-types, there is no returning objects by value. – Deduplicator Feb 01 '19 at 18:50
@Deduplicator Yes, value returns complicate the implementation of function subtyping, though I maintain that this would just mean that a subtyping implementation that doesn’t provide it is just badly engineered. (The point is not being “ideal”, the point is being easy for the programmer to reason about.) A function with a superclass return type ought to be able to return a subclass instance, however much pain this induces in the ABI. (The C++ version of this is that subtyping only works on pointers.) [cont.] – Alex Shpilkin Feb 01 '19 at 19:09
[cont.] For bottom returns in particular, the only problem is indeed a possible hidden return argument; e.g. SysV x86-64 screwed this up by passing it in RDI instead of, say, RAX. I see two ways out: first, we can only declare API not ABI compatibility; second, given that we’re talking about something like C++, Rust or Go, the calling convention is in any case not the C one (and probably unstable), so we might as well design it with subtyping in mind. [cont.] – Alex Shpilkin Feb 01 '19 at 19:20
[cont.] Overall, I’d just say that a discussion on the advantages of ⊥ has to define what qualifies as an implementation of ⊥; and I don’t think a type system that doesn’t have (a → ⊥) ≤ (a → b) is a useful implementation of ⊥. So in this sense the SysV x86-64 C ABI (among others) just doesn’t allow implementing ⊥. – Alex Shpilkin Feb 01 '19 at 19:29

Theodore Norvell · Answer 2 · 2019-02-04T02:13:58.947

Karl's answer is good. Here is an additional use that I don't think anyone else has mentioned. The type of

if E then A else B

should be a type that includes all the values in the type of A and all the values in the type of B. If the type of B is Nothing, then the type of the if expression can be the type of A. I'll often declare a routine

def unreachable( s:String ) : Nothing = throw new AssertionError("Unreachable "+s)

to say that code is not expected to be reached. Since its type is Nothing, unreachable(s) can now be used in any if or (more often) switch without affecting the type of the result. For example

 // Scala's match plays the role of switch here
 val colour: Colour = state match {
         case BLACK_TO_MOVE => BLACK
         case WHITE_TO_MOVE => WHITE
         case _ => unreachable("Bad state")
 }

Scala has such a Nothing type.
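
The same pattern can be sketched in Rust, where the return type `!` plays the role of Nothing. Everything here is an assumption made for illustration: the state is encoded as a raw u8 so that the compiler cannot prove the match exhaustive by itself, and unreachable_state is a hand-written counterpart of the unreachable routine above (Rust also ships a standard unreachable! macro for this purpose).

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Colour {
    Black,
    White,
}

// Hypothetical encoding of the game state as a raw byte.
const BLACK_TO_MOVE: u8 = 0;
const WHITE_TO_MOVE: u8 = 1;

/// Never returns; `!` is Rust's spelling of the bottom type in return position.
fn unreachable_state(s: &str) -> ! {
    panic!("Unreachable {}", s)
}

fn colour_to_move(state: u8) -> Colour {
    match state {
        BLACK_TO_MOVE => Colour::Black,
        WHITE_TO_MOVE => Colour::White,
        // This arm has type `!`, so the whole match is still a `Colour`.
        _ => unreachable_state("Bad state"),
    }
}
```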

Another use case for Nothing (as mentioned in Karl's answer) is that List[Nothing] is the type of lists each of whose members has type Nothing. Thus it can be the type of the empty list.

The key property of Nothing that makes these use cases work is not that it has no values --although in Scala, for example, it does have no values-- it is that it is a subtype of every other type.

Suppose you have a language where every type contains the same value -- let's call it (). In such a language the unit type, which has () as its only value, could be a subtype of every type. That doesn't make it a bottom type in the sense that the OP meant; the OP was clear that a bottom type contains no values. However, as it is a type that is a subtype of every type, it can play much the same role as a bottom type.

Haskell does things a bit differently. In Haskell, an expression that never produces a value can have the type scheme forall a.a. An instance of this type scheme will unify with any other type, so it effectively acts as a bottom type, even though (standard) Haskell has no notion of subtyping. For example, the error function from the standard prelude has type scheme forall a. [Char] -> a. So you can write

if E then A else error ""

and the type of the expression will be the same as the type of A, for any expression A.

The empty list in Haskell has the type scheme forall a. [a]. If A is an expression whose type is a list type, then

if E then A else []

is an expression with the same type as A.
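
For readers who do not know Haskell, roughly the same idea can be sketched with a generic function in Rust, which also has no subtyping between ordinary data types. The function names below are invented for this example; error here is a hand-written stand-in for the Prelude function, not a Rust library call.

```rust
/// Hypothetical analogue of Haskell's `error :: [Char] -> a`: the caller
/// picks `T`, and the function can promise any `T` because it never returns.
fn error<T>(message: &str) -> T {
    panic!("{}", message)
}

/// `if E then A else error ""`: the else branch takes on the type of `A`.
fn checked_half(n: u32) -> u32 {
    if n % 2 == 0 {
        n / 2
    } else {
        error("odd input")
    }
}

/// `if E then A else []`: `Vec::new()` is generic in its element type,
/// so the empty vector adopts the type of the other branch.
fn evens_or_empty(keep: bool, xs: Vec<i32>) -> Vec<i32> {
    if keep {
        xs.into_iter().filter(|x| x % 2 == 0).collect()
    } else {
        Vec::new()
    }
}
```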

What is the difference between the type forall a . [a] and the type [a] in Haskell? Aren't type variables already universally quantified in Haskell type expressions? — Giorgio, Mar 25 '15 at 20:44
@Giorgio In Haskell the universal quantification is implicit if it is clear that you are looking at a type scheme. You can't even write forall in standard Haskell 2010. I wrote the quantification explicitly because this is not a Haskell forum and some people might not be familiar with Haskell's conventions. So there is no difference except that forall a . [a] is not standard whereas [a] is. — Theodore Norvell, Mar 25 '15 at 20:55

leftaroundabout · Answer 3 · 2018-06-02T15:05:00.640

Types form a monoid in two ways, together making a semiring. That's what's called algebraic data types. For finite types, this semiring directly relates to the semiring of natural numbers (including zero), which means you count how many possible values the type has (excluding “nonterminating values”).

The bottom type (I'll call it Vacuous) has zero values†.
The unit type has one value. I'll call both the type and its single value ().
Composition (which most programming languages support quite directly, through records / structs / classes with public fields) is a product operation. For instance, (Bool, Bool) has four possible values, namely (False,False), (False,True), (True,False) and (True,True).
The unit type is the identity element of the composition operation. E.g. ((), False) and ((), True) are the only values of type ((), Bool), so this type is isomorphic to Bool itself.
Alternative types are somewhat neglected in most languages (OO languages kind-of support them with inheritance), but they are no less useful. An alternative between two types A and B basically has all the values of A, plus all the values of B, hence sum type. For instance, Either () Bool has three values, I'll call them Left (), Right False and Right True.
The bottom type is the identity element of the sum: Either Vacuous A has only values of the form Right a, because Left ... doesn't make sense (Vacuous has no values).
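
The counting argument in the list above can be checked with a small Rust sketch. Either and Vacuous are declared locally for illustration (they are not standard library types), and the helper functions are hypothetical names chosen for this example.

```rust
/// A sum type with two alternatives (Haskell's `Either`), spelled out here
/// so that the sketch is self-contained.
enum Either<L, R> {
    Left(L),
    Right(R),
}

/// An enum with no variants has zero values, like `Vacuous` above.
enum Vacuous {}

/// Sum: `Either<(), bool>` has 1 + 2 = 3 values.
fn unit_plus_bool() -> Vec<Either<(), bool>> {
    vec![Either::Left(()), Either::Right(false), Either::Right(true)]
}

/// Product: `(bool, bool)` has 2 * 2 = 4 values.
fn bool_times_bool() -> Vec<(bool, bool)> {
    vec![(false, false), (false, true), (true, false), (true, true)]
}

/// `Either<Vacuous, A>` carries exactly the information of `A` ...
fn into_sum<A>(a: A) -> Either<Vacuous, A> {
    Either::Right(a)
}

/// ... and the conversion back loses nothing either.
fn out_of_sum<A>(e: Either<Vacuous, A>) -> A {
    match e {
        Either::Right(a) => a,
        // No value of `Vacuous` exists, so an empty match covers every case.
        Either::Left(never) => match never {},
    }
}
```

The pair into_sum / out_of_sum is the isomorphism between A and Either Vacuous A, which is what "identity element of the sum" means in practice.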

What's interesting about these monoids is that, when you introduce functions to your language, the category of these types with the functions as morphisms is a monoidal category. Amongst other things, this allows you to define applicative functors and monads, which turn out to be an excellent abstraction for general computations (possibly involving side-effects etc.) within otherwise purely functional terms.

Now, you can actually get quite far by worrying about only one side of the issue (the composition monoid); then you don't really need the bottom type explicitly. For instance, for a long time even Haskell did not have a standard bottom type. Now it has one, called Void.

But when you consider the full picture, as a bicartesian closed category, then the type system is actually equivalent to the whole lambda calculus, so basically you have the perfect abstraction over everything possible in a Turing-complete language. Great for embedded domain-specific languages, for instance there's a project about directly coding electronic circuits this way.

Of course, you may well say that this is all theorists' general nonsense. You don't need to know about category theory at all to be a good programmer, but when you do, it gives you powerful and ridiculously general ways to reason about code and prove invariants.

† mb21 reminds me to note that this should not be confused with bottom values. In lazy languages like Haskell, every type contains a bottom “value”, denoted ⊥. This isn't a concrete thing that you could ever explicitly pass around, instead it's what's “returned” for example when a function loops forever. Even Haskell's Void type “contains” the bottom value, thus the name. In that light, Haskell's bottom type really has one value and its unit type has two values, but in category-theory discussion this is generally ignored.

"The bottom type (I'll call it Void)", which is not to be confused with the value bottom, which is a member of any type in Haskell. — mb21, Jun 02 '18 at 11:38

Karl Bielefeldt · Answer 4 · 2015-03-25T23:33:47.727

18

"Maybe it loops forever, or maybe it throws an exception."

Sounds like a useful type to have in those situations, rare though they may be.

Also, even though Nothing (Scala's name for the bottom type) can have no values, List[Nothing] does not have that restriction, which makes it useful as the type of an empty list. Most languages get around this by making an empty list of strings a different type than an empty list of integers, which kind of makes sense, but makes an empty list more verbose to write, which is a big drawback in a list-oriented language.
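
Rust has no subtyping between list types, so it cannot express List[Nothing] directly, but the underlying observation can still be sketched: a list whose element type has no values exists (it is empty), and it can be turned into a list of anything. In the sketch below, std::convert::Infallible is the standard library's uninhabited type standing in for Nothing, while widen and empty are hypothetical helper names.

```rust
use std::convert::Infallible;

/// Hypothetical helper: turns a list of "nothing" into a list of anything.
fn widen<T>(xs: Vec<Infallible>) -> Vec<T> {
    xs.into_iter()
        // The closure body can never run; the empty match has type `!`.
        .map(|never| -> T { match never {} })
        .collect()
}

/// A list with an uninhabited element type can only be the empty list.
fn empty() -> Vec<Infallible> {
    Vec::new()
}
```

In Scala the conversion is implicit because List is covariant and Nothing is a subtype of every type; here it has to be written out by hand.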

edited Mar 25 '15 at 23:33 · answered Mar 24 '15 at 04:04 · Karl Bielefeldt

“Haskell's empty list is a type constructor”: surely the relevant thing about it here is more that it’s polymorphic, or overloaded — that is, the empty lists from different types are distinct values, but [] represents all of them, and will be instantiated to the specific type as necessary. – Peter LeFanu Lumsdaine Mar 24 '15 at 05:38
Interestingly: If you try to create an empty array in the Haskell interpreter, you get a very definite value with a very indefinite type: [a]. Similarly, :t Left 1 yields Num a => Either a b. Actually evaluating the expression forces the type of a, but not of b: Either Integer b – John Dvorak Mar 24 '15 at 19:47
The empty list is a value constructor. A bit confusingly, the type constructor involved has the same name but the empty list itself is a value not a type (well, there are type level lists too, but that's a whole other topic). The part that makes the empty list work for any list type is the implied forall in its type, forall a. [a]. There are some nice ways to think about forall, but it does take some time to really figure out. – David Mar 24 '15 at 20:32
@PeterLeFanuLumsdaine That is exactly what being a type constructor means. It just means it's a type with a kind different from *. – GregRos Mar 24 '15 at 22:09
In Haskell [] is a type constructor and [] is an expression representing an empty list. But that does not mean that "Haskell's empty list is a type constructor". The context makes it clear whether [] is being used as a type or as an expression. Suppose you declare data Foo x = Foo | Bar x (Foo x); now you can use Foo as a type constructor or as a value, but it's just happenstance that you happened to choose the same name for both. – Theodore Norvell Mar 25 '15 at 20:44
Sheesh, you guys are fixating on an incidental part of the answer mostly made in passing, so I deleted it. – Karl Bielefeldt Mar 25 '15 at 23:34

score 3 · Answer 5 · answered Mar 24 '15 at 12:38

It is useful for static analysis to document the fact that a particular code path is not reachable. For example if you write the following in C#:

int F(int arg) {
    if (arg != 0)
        return arg + 1; // some computation
    else
        Assert(false); // this throws, but the compiler does not know that
}
void Assert(bool cond) { if (!cond) throw ...; }

The compiler will complain that F does not return anything in at least one code path. If Assert were to be marked as non-returning the compiler would not need to warn.

Telastyn · Answer 6 · 2015-03-24T15:03:52.727

2

In some languages, null has the bottom type, since the subtype of all types nicely defines what languages use null for (despite the mild contradiction of having null be both itself and a function that returns itself, avoiding the common arguments about why bot should be uninhabited).

It can also be used as a catch-all in function types (any -> bot) to handle dispatch gone awry.

And some languages allow you to actually resolve bot as an error, which can be used to provide custom compiler errors.

edited Mar 24 '15 at 15:03 · answered Mar 24 '15 at 01:41 · Telastyn

No, a bottom type is not the unit type. A bottom type has no value at all, so a function returning a bottom type should not return (i.e. throw an exception or loop indefinitely) – Basile Starynkevitch Mar 24 '15 at 05:48
@BasileStarynkevitch - I'm not talking about the unit type. The unit type maps to void in common languages (albeit with slightly different semantics for the same use), not null. Though you're also right that most languages do not model null as the bottom type. – Telastyn Mar 24 '15 at 11:33
Is there an example of language where null has the bottom type? – Theodore Norvell Mar 24 '15 at 11:40
@TheodoreNorvell - early versions of Tangent did that - though I am its author, so that's perhaps cheating. I don't have the links saved for others, and it's been a while since I did that research. – Telastyn Mar 24 '15 at 11:51
Ok. So in these languages, I guess, bot is not used to represent computations that don't terminate normally, but rather computations that terminate with a value that is included in every other type. – Theodore Norvell Mar 24 '15 at 12:33
@TheodoreNorvell maybe sort of in some way C#'s null used to have the Null type, with null as its only value. Looking only at reference types, since you can assign null to everything, that means the Null type is similar to a bottom type, under the condition you never use null. Unfortunately, the Null type was a special type, and you couldn't declare things having the Null type, so it never was a useful bottom type. – Martijn Mar 24 '15 at 12:42
@Martijn But you can use null, e.g. you can compare a pointer to null and get a Boolean result. I think the answers are showing that there are two distinct kinds of bottom types. (a) Languages (e.g. Scala) where the type that is a subtype of every type represents computations that don't deliver any results. Essentially it's an empty type, though technically often populated by a useless bottom value representing nontermination. (b) Languages like Tangent, in which the bottom type is a subset of every other type because it contains a useful value that is also found in every other type -- null. – Theodore Norvell Mar 24 '15 at 14:54
It's interesting that some languages have a value with a type you can't declare (common for the null literal), and others have a type you can declare but has no values (a traditional bottom type), and that they fill somewhat comparable roles. – Martijn Mar 24 '15 at 15:07
Could you provide examples for the two some languages instances in your answer to make it more complete? – Mast Mar 25 '15 at 09:43

score 1 · Answer 7 · answered Mar 24 '15 at 12:46

Yes, this is quite a useful type; while its role would be mostly internal to the type system, there are some occasions where the bottom type would appear openly.

Consider a statically typed language in which conditionals are expressions (so the if-then-else construction doubles as the ternary operator of C and friends, and there might be a similar multi-way case statement). Functional programming languages have this, but it happens in certain imperative languages as well (ever since ALGOL 60). Then all branch expressions must ultimately produce the type of the whole conditional expression. One could simply require their types to be equal (and I think this is the case for the ternary operator in C) but this is overly restrictive, especially when the conditional can also be used as a conditional statement (not returning any useful value). In general one wants each branch expression to be (implicitly) convertible to a common type that will be the type of the full expression (possibly with more or less complicated restrictions to allow that common type to be effectively found by the compiler, cf. C++, but I won't go into those details here).

There are two kinds of situations where a general kind of conversion will allow necessary flexibility of such conditional expressions. One is already mentioned, where the result type is the unit type void; this is naturally a super-type of all other types, and allowing any type to be (trivially) converted to it makes it possible to use the conditional expression as conditional statement. The other involves cases where the expression does return a useful value, but one or more branches are incapable of producing one. They will usually raise an exception or involve a jump, and requiring them to (also) produce a value of the type of the whole expression (from an unreachable point) would be pointless. It is this kind of situation that can be gracefully handled by giving exception-raising clauses, jumps, and calls that will have such an effect, the bottom type, the one type that can be (trivially) converted into any other type.

I would suggest writing such a bottom type as * to suggest its convertibility to an arbitrary type. It may serve other useful purposes internally, for instance when trying to deduce a result type for a recursive function that does not declare any, the type inferencer could assign the type * to any recursive call to avoid a chicken-and-egg situation; the actual type will be determined by non-recursive branches, and the recursive ones will be converted to the common type of the non-recursive ones. If there are no non-recursive branches at all, the type will remain *, and correctly indicate that the function has no possible way of ever returning from the recursion. Other than this and as the result type of exception-throwing functions, one can use * as the component type of sequences of length 0, for instance of the empty list; again if ever an element is selected from an expression of type [*] (necessarily an empty list), then the resulting type * will correctly indicate that this can never return without an error.

So is the idea that var foo = someCondition() ? functionReturningBar() : functionThatAlwaysThrows() could infer the type of foo as Bar, since the expression could never yield anything else? — supercat, Mar 24 '15 at 16:04
You’ve just described the unit type— at least in the first part of your answer. A function which returns the unit type is the same as one which is declared as returning void in C. The second part of your answer, where you talk about a type for a function which never returns, or a list with no elements— that is indeed the bottom type! (It’s often written as _|_ rather than *. Not sure why. Perhaps because it looks like a (human) bottom :) — andrewf, Mar 24 '15 at 17:14
For the avoidance of doubt: ‘doesn’t return anything useful’ is different from ‘doesn’t return’; the first is represented by the Unit type; the second by the Bottom type. — andrewf, Mar 24 '15 at 17:19
@andrewf: Yes I understand the distinction. My answer is a bit longish, but the point I wanted to make is that the unit type and the bottom type both play (different but) comparable roles in allowing certain expressions to be used more flexibly (but still safely). — Marc van Leeuwen, Mar 25 '15 at 11:14
@supercat: Yes, that is the idea. Currently in C++ that is illegal, although it would be valid if functionThatAlwaysThrows() were replaced by an explicit throw, due to special language in the Standard. Having a type that does this would be an improvement. — Marc van Leeuwen, Mar 25 '15 at 11:17
@MarcvanLeeuwen: I can see how that could be somewhat useful, though the lack of such a type is far less of an omission IMHO than the inability to distinguish between e.g. 16-bit values which are used as a wrapping algebraic rings (such that given uint16_t a=2,b=65535; uint32_t c=0; the expression c+=(a-b); would add 3 to c), or as numbers (such that the expression would subtract 65533 from c). Even if different processors have different integer sizes that are "most convenient" to work with, allowing code which needs particular integer behaviors to specify them... — supercat, Mar 25 '15 at 15:55
...would make C a usable language for writing portable code. As it is, I think standard writers have historically been more interested in writing a standard which would allow obscure systems to implement a language and call it "C", than in defining one which would allow code written in C to run identically on an 8-bit micro or a 64-bit monster. — supercat, Mar 25 '15 at 15:57

score 1 · Answer 8 · edited Feb 02 '19 at 12:52

In some languages, you can annotate a function to tell both the compiler and developers that a call to this function isn't going to return (and if the function is written in a way that it can return, the compiler won't allow it). That's a useful thing to know, but in the end you can call a function like this like any other. The compiler can use the information for optimisation, to give warnings about dead code, and so on. So there is no very compelling reason to have this type, but no very compelling reason to avoid it either.

In many languages, a function can return "void". What that exactly means depends on the language. In C it means the function returns nothing. In Swift, it means the function returns an object with only one possible value, and since there is only one possible value that value takes zero bits and doesn't actually require any code. In either case, that's not the same as "bottom".

"bottom" would be a type with no possible values. It can never exist. If a function returns "bottom", it cannot actually return, because there is no value of type "bottom" that it could return.

If a language designer feels like it, then there is no reason not to have that type. The implementation is not difficult (you can implement it exactly like a function returning void and marked as "doesn't return"). You can't mix pointers to functions returning bottom with pointers to functions returning void, because they are not the same type.
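
As one data point, Rust keeps the "never returns" property in the function pointer type itself. The following is a minimal sketch; the names abort_with, log, first_or and notify, as well as the two type aliases, are made up for illustration.

```rust
/// Hypothetical diverging function.
fn abort_with(message: &str) -> ! {
    panic!("{}", message)
}

/// Hypothetical ordinary function returning the unit type.
fn log(message: &str) {
    eprintln!("{}", message)
}

// Two distinct function pointer types: only the first promises not to return.
type Diverging = fn(&str) -> !;
type ReturnsUnit = fn(&str);

fn first_or(handler: Diverging, xs: &[i32]) -> i32 {
    match xs.first() {
        Some(x) => *x,
        // Calling through the pointer still has type `!`,
        // so this arm needs no dummy `i32`.
        None => handler("empty slice"),
    }
}

fn notify(logger: ReturnsUnit, message: &str) {
    logger(message)
}
```

Passing log where a Diverging pointer is expected, as in first_or(log, &[1]), is rejected by the compiler with a type mismatch, which matches the point about not mixing the two kinds of function pointer.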
