The biggest semantic mess in Futhark

**Posted on September 26, 2025**

The original idea behind Futhark was that parallel programming (of certain problems) does not require a complicated language. Indeed, we believed that there was little need for the complexity to exceed that of the functional languages commonly taught to first-year students at universities. (The complexity of parallel algorithms is another matter.) Overall, I think Futhark has succeeded at that.

The meaning of a Futhark program is fairly easily determined using normal environment-based semantics; even tricky things like uniqueness types are complicated mainly in an operational sense and, to some extent, in the type system, so the meaning of a program (once it type checks) is obvious. This semantic simplicity is also evident in the implementation. While the compiler has lots of complicated optimizations and sophisticated transformations, the reference interpreter is largely straightforward and quite similar in structure to how a person studying programming languages would write their first tree-walking interpreter.

You will note that the above paragraph is full of words like *overall*, *largely*, and *fairly*. This is because there is one language feature that has proven a particularly fertile ground for edge cases and implementation bugs. That feature is **size types**.

In the following, I will explain why a seemingly simple type system feature has proven so surprisingly challenging.

### Size Parameters

To recap the basic idea, size types allow Futhark functions to impose constraints on the sizes of their parameters. A simple example is a definition of the dot product, which takes two arguments that must be vectors of the same size.
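
The definition in the post is along these lines (a sketch; the exact body matters less than the shared size parameter):

```futhark
-- Both xs and ys must have the same size n, which the type checker
-- verifies at every call site.
def dotprod [n] (xs: [n]f64) (ys: [n]f64) : f64 =
  reduce (+) 0 (map2 (*) xs ys)
```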

Here, *n* is a size parameter that is implicitly instantiated whenever the function `dotprod` is applied, based on the concrete arrays it is passed. This by itself is not so difficult. Sizes can easily be incorporated into a type checking algorithm by treating them as types of a different kind — the details do not matter, just take my word that it’s fine. (Writing blog posts is easier than writing academic papers.)

The main trouble arises when we introduce the ability to use sizes as term-level variables, like, for example, the definition of `length`.
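
The relevant definition is essentially the following sketch (with the parameter named `x` to match the discussion below):

```futhark
-- The size parameter n is used as an ordinary term-level value.
def length [n] 'a (x: [n]a) : i64 = n
```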

When a size parameter is in scope, it can be used in expressions. Unsurprisingly, the value of a size parameter is the size of the corresponding dimension in some array. What is interesting is that we can access the size of `x` without actually mentioning `x` itself.

Intuitively, we can imagine that the concrete value of *n* is determined at run time by actually looking at `x` (say, by counting its elements), and for now this intuition holds.

But now let us consider what happens for a function that takes the number of columns of a matrix (the inner length):

```futhark
def cols [n] [m] 'a (mat: [n][m]a) : i64 = m
```

There are now two size parameters, *n* and *m*, and we retrieve the latter.

This case is a little more challenging when *n* is zero, as we cannot simply retrieve a row and look at it to determine *m* because there are no rows. Yet an expression such as

```futhark
cols (replicate 0 (replicate 3 0))
```

should still work (and evaluate to 3).

This means we need to extend our notion of how the values of size parameters are determined, as it cannot be done by looking at the syntactic form of an array value and counting the elements: once evaluated, `replicate 0 (replicate 3 0)` is just the empty array `[]`.
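
To make this concrete, Futhark lets a type ascription pin down the inner size of an empty array literal. The binding below is a hypothetical sketch denoting the same value as the `replicate` expression above:

```futhark
-- An empty array whose shape annotation records the inner size 3,
-- even though there is no element to inspect.
def e : [0][3]i32 = []
```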

### Shape Annotation of Arrays

The solution is to extend our (conceptual and perhaps even concrete) array representation such that an array always carries a **shape** with it, in addition to its actual elements.

Then, intuitively, to determine the value of some size parameter, we still look for values (such as `x` above) that have that size somewhere and extract it from those values.

But now, perhaps unwittingly, we have picked a difficult fight.

The problem is that we sometimes have to create multidimensional arrays without having any example of an element! Yet we still somehow have to conjure up the right shape for the array.

As an example, consider the `map` function, of the following type:

```futhark
val map [n] 'a 'b : (f: a -> b) -> (as: [n]a) -> [n]b
```

The element type of the returned array is given by the return type of the function `f` we are mapping with. But if we are mapping over an empty array, then `f` may never be applied:

```futhark
map (\(x: i32) -> [x, x, x]) []
```

How, then, are we supposed to determine that the shape of this empty array is actually `[0][3]`?

When the array is constructed inside `map`, all that is known is that the outer size is *n* (which is known to be 0), and that the element type is some `b`, but we have no value of type `b` we can use to determine what the shape may be! We do have a function `a -> b`, but we also have no `a`; all we have is an array of type `[0]a`, which clearly does not have any elements inside of it.

### The Idea of Shapely Functions

One solution to this problem is due to Barry Jay and explained in the paper *A Semantics for Shape*.

The idea is that any **shapely function** can be evaluated normally (applied to a value, producing a value) or at the shape level (applied to a shape, producing a shape). A shapely function is therefore one where the shape of the result depends *only* on the shape of the input, which rules out functions such as filtering, where the result shape depends on the values as well.
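
Futhark's own type for filtering already reflects this distinction: the result size is existential, meaning it depends on the values and not just the shape. A sketch of the signature (modulo the exact prelude spelling):

```futhark
-- The result size m is existentially bound (?[m].): it cannot be
-- computed from the input shape [n] alone.
val filter [n] 'a : (p: a -> bool) -> (as: [n]a) -> ?[m].[m]a
```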

This by itself is no problem for Futhark, as we only want to allow mapping with functions whose result shape is predictable, precisely to avoid irregular arrays.

Using this approach requires us to have two ways of applying a function: for **value** and for **shape**. This is a slight semantic complication, but perhaps we can live with it.

But we also have to get the input shape from somewhere, and in the case of `map`, all we know is that the input has type `a`.

### Using Instantiated Type Parameters

Things could be made to work if, whenever a function is invoked, we also received the concrete shapes of values of type `a` (assuming this is possible at all; because `a` is used for array elements, we know it must have a meaningful shape).

But if we do that, then why not just take the shape from `b` instead and avoid this entire business of shapely functions?

And indeed, this is the Futhark evaluation model. At any time, a polymorphic function can inquire about the concrete type that a type parameter has been instantiated with and extract a shape if necessary.

This can then be used to annotate any constructed arrays with their full shape.
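
For the `map` example above, the shape of the empty result is thus recovered from the instantiation of `b` rather than from any element. A sketch with a hypothetical binding name:

```futhark
-- map's type parameter b is instantiated with [3]i32 here, so the
-- result can be annotated with shape [0][3] even though the lambda
-- is never applied.
def empty_rows : [0][3]i32 = map (\(x: i32) -> [x, x, x]) []
```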

Note that this is a model: the interpreter does it this way because it is intended to closely mirror the model, but the actual compiler does not, of course, do it literally this way, as type parameters do not exist at run time. It just has to do it in a way that produces the same result. (In practice, it does monomorphisation.)

### Troubles

We didn’t do it this way because it was easy. We did it because we thought it would be easy.

Sadly, it has turned out to not be easy.

The basic problem is that we now have an obligation to always know the full shape of any type at all times (except for those types that can never be used as array elements, but let us leave those aside for now).

This turns out to require machinery more intricate than standard environment-based evaluation.

The fundamental problem is pretty obvious: we need to also evaluate **types** along with expressions, just in case they are eventually used to construct an array, and types occur in various unexpected places.

For example, consider a module that defines some local binding `cnt` and a size-parameterised type that also refers to `cnt`:

```futhark
module M = {
  def cnt = 5i64
  type C [n] = [n][n * cnt]i32
}
```

The usual way definitions of polymorphic types such as `type C [n] = …` work is that they are added as type constructors to a type environment and then instantiated when used.

Now, `M.C [n]` by itself does not have a shape, since *n* is not yet known. At some point in the future, we may end up with an instantiation `M.C [k]` for some concrete *k*, and when that happens we can then compute the shape of the type, which will be `[k][k * M.cnt]`.
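
Concretely, such an instantiation might look like this (hypothetical usage, assuming `cnt` is not hidden by a signature):

```futhark
-- The shape of M.C [2] is [2][2 * M.cnt] = [2][10], but computing it
-- requires giving n * cnt a meaning outside the scope of M.
type t = M.C [2]
```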

But there is no guarantee that `M.cnt` is actually in scope: it may be some hidden definition inside the module `M`. And even if it isn't, it is difficult to take an expression such as `n * cnt` and give it a meaning in a different scope than the one it was originally defined in.

Since Futhark is a pure language, we could, as soon as we interpret the type definition of `C`, substitute the result of evaluating `cnt` into its right-hand side.

But this is also uncomfortable: it’s a syntactic operation, and while substitution-based semantics are fairly common in theoretical work, they are undesirable in implementations because they are quite inefficient. While the expression `n * cnt` is small, others may be large.

### Type Constructors as Closures

Our solution is that a type definition captures not just the right-hand side of the definition, but also its environment — that is, type constructors are **closures**.

When at some point in the future we finally instantiate `M.C` and have a concrete value `k` for `n`, we extend the captured environment with a binding `n => k` and evaluate the expressions occurring in the sizes.

This is a very strange implementation that took us quite a while to work out. If I had more experience implementing dependently typed languages, then perhaps I would not find it so weird, as it really just makes type constructors similar to functions, which they would be in a fully dependently typed language.
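
A value-level analogy of this behaviour (hypothetical names; a sketch of the idea, not of the interpreter's actual code):

```futhark
-- The type constructor acts like this closure: cnt is captured from
-- the defining environment, and the size argument is supplied later.
def C_shape =
  let cnt = 5i64
  in \(k: i64) -> (k, k * cnt)

-- C_shape 2 evaluates to (2, 10), matching the shape [2][10] of M.C [2].
```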

### Takeaway

Size types have proven to be a rich source of complexity in the design and implementation of Futhark, despite their initial simplicity.

Handling shapes, sizes, and type-level computations requires intricate machinery and careful design decisions.

However, these challenges are also what make Futhark powerful and expressive for parallel programming with precise size constraints.

Understanding these subtleties brings us closer to developing robust, efficient, and predictable parallel programs.

Thank you for reading!
https://futhark-lang.org/blog/2025-09-26-the-biggest-semantic-mess.html

Indoor wood burners may be harming your lungs, study warns

Wood-burning stoves have become increasingly popular across Europe in recent years, often promoted as an eco-friendly and renewable alternative to fossil fuels. However, new research suggests that using these stoves inside the home may come at a cost: a decline in lung function.

The study was presented at the European Respiratory Society Congress in Amsterdam by Dr. Laura Horsfall, Principal Research Fellow at University College London’s Institute of Health Informatics. Her team’s findings add to a growing body of evidence that indoor wood burning may be more harmful than many people realize.

In the UK, domestic wood burning now contributes about 20% of harmful fine particle air pollution, known as PM2.5. These particles are tiny—less than 2.5 micrometers across—and can penetrate deep into the lungs. For comparison, road vehicle exhaust now makes up only about 4% of the UK’s PM2.5 emissions.

The shift has been dramatic: annual emissions from wood burning almost doubled between 2009 and 2023, rising from 3,200 to 6,000 metric tons.

“Wood burning at home emits harmful air pollution both indoors and outdoors, including substances we know can cause cancer,” said Dr. Horsfall. Despite this, the number of wood stoves in use has grown rapidly since 2009.

However, the health impacts in high-income countries, where wood stoves tend to be used for lifestyle or aesthetic reasons rather than out of necessity, are still not fully understood.

To explore these effects, Horsfall and her colleagues used data from the English Longitudinal Study of Ageing, which tracks the health of older adults in England. They focused on a lung function measure called FEV1, the amount of air a person can exhale forcefully in one second. A lower FEV1 is linked to a higher risk of conditions such as asthma, COPD, and other respiratory complications.

Studying the health impacts of wood stoves is tricky. People who use solid fuel heating in the UK are often wealthier and healthier than the average population. They smoke less and have lower rates of pre-existing lung disease. This means that in a naive comparison the negative impact of wood burning could easily be masked.

Even after accounting for these factors, the researchers found that people who reported using wood or other solid fuels experienced a faster decline in lung function over an eight-year period compared to non-users.

“Our study suggests that high levels of particulate matter from stoves damage respiratory tissues and trigger inflammation in much the same way as cigarette smoke,” Dr. Horsfall explained.

The team now plans to study whether people living in neighborhoods with a high concentration of stoves—such as affluent areas of London—also experience more prescriptions for inhalers or higher hospital admissions for lung disease.

Professor Ane Johannessen, an environmental health expert at the University of Bergen who was not involved in the study, said the results are significant.

“We already know from studies in low- and middle-income countries that wood smoke exposure inside homes is harmful, causing asthma, COPD, and lung cancer. This research suggests that the same risks could apply in European homes, where stoves are used for lifestyle reasons.”

Although modern eco-design stoves are marketed as cleaner than older models, the study authors stress that even newer stoves may not be risk-free.

Johannessen urged both the public and doctors to take the risks seriously. “Doctors should be asking patients about stove use when investigating unexplained lung problems. And families should be aware that these stoves could be harming their health.”

The findings highlight a pressing need for clearer public health guidance and regulation. While wood stoves may be cozy and marketed as eco-friendly, they may also be silently damaging the lungs of those who use them most.
https://knowridge.com/2025/10/indoor-wood-burners-may-be-harming-your-lungs-study-warns/