I am currently looking for a job! If you're hiring a new grad in 2026 for Rust, TypeScript, or React, feel free to shoot me an email at serena (at) quamserena.com.

Referential transparency

From a theory perspective, referential transparency is complicated and idiosyncratic definitions are common. For the purposes of this post I'll focus on the following definition of referential transparency:

A term is referentially transparent iff every occurrence of that term can be replaced by that term's definition and vice-versa without materially changing the program.

In other words, we want to be able to transparently swap out a term for the definition that it references without changing the program.

For example, in OCaml, we have

let add x y = x + y in
print_endline "The sum is:";
print_endline @@ string_of_int @@ add 1 2

in which the term add 1 2 can be freely replaced by its definition x + y, with the arguments 1 and 2 substituted for x and y, without changing program behavior:

print_endline @@ string_of_int @@ 1 + 2

So, we can conclude that the application of add is referentially transparent. Let's try adding a side effect to add and see what happens:

let add x y = (print_endline "Adding..."; x + y) in
print_endline "The sum is:";
print_endline @@ string_of_int @@ add 1 2

We get:

The sum is:
Adding...
3

Substituting add, we have

print_endline "The sum is:";
print_endline @@ string_of_int @@ (print_endline "Adding..."; 1 + 2)

And we get the same thing...

The sum is:
Adding...
3
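
One way to check this mechanically rather than by eye is to record the output in a buffer instead of printing it, then compare the two programs. A minimal sketch, where trace, say, original, and substituted are hypothetical names that are not part of the programs above:

```ocaml
(* Hypothetical helpers: collect output in a buffer instead of printing it,
   so that the two versions of the program can be compared. *)
let trace = Buffer.create 64
let say s = Buffer.add_string trace s; Buffer.add_char trace '\n'

(* Original program: add is applied by name. *)
let original () =
  Buffer.clear trace;
  let add x y = (say "Adding..."; x + y) in
  say "The sum is:";
  say @@ string_of_int @@ add 1 2;
  Buffer.contents trace

(* Substituted program: the application is replaced by add's body,
   with 1 and 2 in place of x and y. *)
let substituted () =
  Buffer.clear trace;
  say "The sum is:";
  say @@ string_of_int @@ (say "Adding..."; 1 + 2);
  Buffer.contents trace

(* Both versions record the same three lines in the same order. *)
let () = assert (original () = substituted ())
```

The comparison is on the whole recorded output, side effect included, which is what "without materially changing the program" is taken to mean here.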

So we can conclude that in this case the application of add is also referentially transparent, despite having side effects. (Yes, this is different than the normal definition of referential transparency. I'll get to that.) OCaml's problems come in when expressions have no parameters:

let sayhi = print_endline "Hi!" in
sayhi;
sayhi

which only prints Hi! once because the side effects are realized when sayhi is bound. The solution is to give our function at least one parameter, in this case the unit type:

let sayhi () = print_endline "Hi!" in
sayhi ();
sayhi ()

and now it works as desired. This is because the zero-parameter (nullary) sayhi is a value, computed once when it is bound, while the one-parameter (unary) sayhi is a function whose body runs each time it is applied, as in sayhi (). Function applications are referentially transparent in OCaml while variables are not.
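
The difference can be made countable. The sketch below swaps the printing for a hypothetical counter, hits, so that the number of times the effect ran can be inspected afterwards. The names sayhi_value, sayhi_fn, after_value, and after_fn are made up for illustration:

```ocaml
(* Hypothetical counter standing in for the printed "Hi!". *)
let hits = ref 0

(* Nullary: the right-hand side runs once, at this binding, and
   sayhi_value is bound to the resulting unit value. *)
let sayhi_value = incr hits

(* Mentioning the name twice runs nothing further. *)
let after_value = (sayhi_value; sayhi_value; !hits)  (* 1 *)

(* Unary: nothing runs at the binding; the body runs on each application. *)
let sayhi_fn () = incr hits

let after_fn = (sayhi_fn (); sayhi_fn (); !hits)  (* 3 *)
```

Replacing sayhi_value with its definition, incr hits, at both use sites would bump the counter twice more, so the substitution changes the program. Replacing sayhi_fn () with the body of sayhi_fn changes nothing.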

Haskell takes a different approach and ensures that all code is always referentially transparent by handling effects explicitly as values, e.g. monads. The problem on the OCaml side is that the syntax gives us no way to express directly the order in which side effects should be realized, so we have to do it indirectly by controlling when a value is bound. Haskell's semantics, on the other hand, have no concept of evaluation order (the language is specified only as non-strict, though implementations such as GHC use call-by-need in practice), and instead the effect ordering is specified explicitly in the syntax. This allows Haskell to be lazily evaluated and also referentially transparent, as the effects will always happen in the same order regardless of when a value is bound.

The important bit to highlight is that:

in languages with explicit effects, evaluation order is semantically irrelevant because it doesn't affect the behavior of the program; and
in languages with implicit side effects, the order in which those effects are realized is determined by the order in which they are evaluated, so evaluation order is semantically important.
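
To make the first point concrete without leaving OCaml, here is a rough model of effects as values. It is only an analogy and not how Haskell implements IO. Every name in it (io, return, >>=, >>, run, out, put_line) is an assumption introduced for this sketch:

```ocaml
(* A description of an effect that yields an 'a. Building one runs nothing. *)
type 'a io = Io of (unit -> 'a)

let return x = Io (fun () -> x)

(* Sequencing is spelled out: perform m, then whatever f builds from its result. *)
let ( >>= ) (Io m) f = Io (fun () -> let Io next = f (m ()) in next ())
let ( >> ) m n = m >>= fun () -> n

(* The only place where effects are actually performed. *)
let run (Io m) = m ()

(* Output goes to a buffer so that it can be inspected. *)
let out = Buffer.create 16
let put_line s = Io (fun () -> Buffer.add_string out s; Buffer.add_char out '\n')

(* Nullary, yet nothing is written when it is bound. *)
let sayhi = put_line "Hi!"

(* The order of effects is fixed by how the program is composed. *)
let program = sayhi >> sayhi >> return (1 + 2)

let before_run = Buffer.contents out  (* still empty *)
let result = run program              (* writes "Hi!" twice, yields 3 *)
```

Because sayhi is only a description, replacing it with put_line "Hi!" inside program gives the same behavior, and it does not matter when sayhi was bound.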

The other definition of referential transparency

For reasons beyond my comprehension, the definition of referential transparency is commonly understood to be along the lines of "swap out a function for the value it returns" instead of "swap it out for its definition." It then becomes important what we should consider the value to be, e.g. do we include side effects in the value? For example, consider the following JavaScript:

function add(x, y) {
    console.log("Adding!");
    return x + y;
}

let val = add(1,2);

Under this (unfortunately common) definition, we would swap out the call to add with its return value:

let val = 1 + 2;

and in doing so would injudiciously remove the side effects (printing Adding!) from the program. I argue that we could just as validly consider the value of add(1, 2) to be:

((x, y) => {
    console.log("Adding!");
    return x + y;
})(1, 2)

which will print Adding! when it is bound to val. JavaScript actually has a name for this pattern: an immediately invoked function expression (IIFE). I argue that the value of a function invocation in JavaScript is its entire invocation, not just what it returns, and so we ought to replace a function invocation with its entire invocation when considering referential transparency, not just the return value. I think the confusion arises because:

In functional languages, the definition of a function is an expression (the return value), and therefore there is no semantic difference between a function's invocation and its return value. (In imperative languages, functions are a series of statements, not expressions.)
Therefore in functional languages it's fine to replace a function application with its return value (that is, the expression it returns), but this doesn't extend to imperative languages because functions aren't expressions.

There is a reason that I switched to JavaScript for this section. I will now repeat the same exercise above in OCaml to highlight the absurdity, since in OCaml functions are expressions. We start with

let add x y = (print_endline "Adding..."; x + y) in
print_endline @@ string_of_int @@ add 1 2

If we replace it with the expression that the function returns, we have

print_endline @@ string_of_int @@ (print_endline "Adding..."; 1 + 2)

If we capriciously drop the first part of the expression, we would write

print_endline @@ string_of_int @@ 1 + 2

There is no reason to drop the first part of the expression. It's still a part of the expression that yields the return value! While this definition of the term "referential transparency" is valid and not contradictory, it isn't well-motivated or useful. We already have a word to describe functions that have side effects ("impure"), and if we were to take up this definition of referential transparency it would be completely overshadowed by purity. Referential transparency, as defined in the first section, instead reveals something more subtle and more important about programming language semantics than is revealed by defining it by return value.

Postscript

Sometimes the term referential transparency is applied to functions, when really it's a property of expressions, which in this case is the application of the function (or invocation in imperative-land). While this usage is technically incorrect it can be understood as meaning "for every application of the function, its application is referentially transparent." I have also glossed over some finer points above (e.g., it is not a requirement that imperative languages discriminate between expressions and statements, it is just common) because I didn't want to distract from the main point.

I should also note that I believe this definition of referential transparency to be closer to what Quine meant when he discussed referential transparency, but I'll leave that for someone who is more familiar with modal logic and Quine's work to address.