Fun with folds

The functional genome

 A fold is a higher order function (a function that has one or more function parameters and/or returns a function) that is recursively applied over a data structure. An easy to grasp example of such a function can be found in Haskell :

foldl (operator) accumulator [...]

Here are the building blocks of the above expression :

  • foldl : A left fold; left as in leftwise applied (…not a leftover fold) – more on this later.
  • (operator) : A binary function
  • accumulator : A starting value
  • […] : A list to fold

 So, say you want for example to sum the (list of) numbers from 1 to 10 :

foldl (+) 0 [1..10]

 This expands to :

( ( ( 1 + 2 ) + 3 ) + ... + 10 )

 Does the power of folds boil down to accumulations? Far from it!


 Folding can be used creatively to perform all sorts of actions, algebra preserving ones:

foldl (max) 0 [1, 20, 2, 14]

or catamorphisms

foldl (Stringify) "" [1, 20, 2, 14]

here we assume that Stringify is a function (you’d need a type class for this) that maps its input to a stringified concatenation, so that we’d get “120214” and have a numeric sequence reduced to a string.

Folds in a C++ Context

 C++17 will have fold expressions! Clang supports them since 3.6 as does the trunk version of gcc. The syntax differs from Haskell and is designed to play well with variadic templates. A fold expression can be written like this:

(args + ...)

 This is a (unary) left fold, because we “anchor” the application of the operator to the left, ie it expands as:

((args$0 + args$1) + ...) + args$n

 If we were to create a right fold :

(... + args)

 the expression expands like this :

args$0 + (... + (args$n-1 + args$n))

 We used the term unary previously not to describe the operator (+) but the accumulator. When an accumulator is explicitly specified

(args + ... +   an)
(  a0 + ... + args)

we dub the fold expression as “binary”. For unary folds the identity value of the operator’s algebra is implicitly used, i.e for ‘+’ a zero is added, for ‘*’ the result is multiplied by 1 etc, so that empty parameter packs have meaningful folding.

Show me some folding

A for_each lambda

auto ForEach = [](auto&& fun, auto&&... args) {
    (fun(args), ...);

To showcase this, let’s assume the existence of a printer functor

struct Printer 
    template <class T> 
    void operator()(T&& arg) { std::cout << arg; }

we can call it on arbitrary sequences like this

ForEach(Printer{}, 1, " to the ", 2, " to the ", 3, '\n');

ForEach applies an arbitrary callable to variadic arguments by folding with the , operator. It also does this in a left to right fashion, which is the associativity of the comma operator. Note that if a right fold was used inside ForEach we’d still be applying the callable from left to right (the associativity of comma does not change). Run the code here.

Summing the contents of an std::array at compile time

namespace detail
	template <class... Ts>
	constexpr auto sum_(Ts&&... args)
		return (args + ...);

	template <typename T, size_t N, size_t... Is>
	constexpr T sum_impl(array<T, N> const &arr, 
		return sum_(get<Is>(arr)...);

template <typename T, size_t N>
constexpr T sum(array<T, N> const &arr)
    return detail::sum_impl(
        arr, make_index_sequence<N>{});

int main()
    constexpr array<int, 4> arr{ { 1, 1, 2, 3 } };
    cout << integral_constant<int, sum(arr)>{};

The core function here is detail::sum_ which uses fold expressions to calculate the total of a sequence.

A more modern summing style

The above is OK and all, but C++17 will give us the tools to be way, way more terse. In this case the function apply abstracts away the process of unfolding an algebraic data type into a function call (what we manually do, in a total of three functions, with the index sequence above). Having this in place, all it takes is to encapsulate our fold :

struct Summer
    template <class... Ts>
    constexpr auto operator()(Ts&&... args) { 
        return (args + ...); 

and then call

apply(Summer{}, arr)

Sadly the current proposal for apply only works for tuples. It takes a minor modification to have it working for std::array as well, specifically the use of a mechanism to obtain the valency of the input data type (for now tuple_size is “hardcoded” into apply). Click here for a demo.

Folding arbitrary functions

A somewhat limiting feature of c++ fold expressions, is that they’re only available for certain operators:

        +  -  *  /  %  ^  &  |  ~  =  <  >  <<  >>
        +=  -=  *=  /=  %=  ^=  &=  |=  <<=  >>=
        ==  !=  <=  >=  &&  ||  ,  .*  ->*

So doing

args Op ...

is not allowed for arbitrary operators. We are going to create a mechanism to bypass this. To solve this we are going to use expression templates, which is the first thing that came to mind when looking at Haskell’s fold diagram, due to its resemblance to Veldhuizen’s expression parse trees:


The code below creates add-hoc expression templates for arbitrary operators using thin wrapper types that can be parametrized by the operant/operator pair.

First thing we need is a wrapper type that apart from data, records the type of the operant we want to use:

template <class F, class T>
struct XX
	T data;

	template <class D>
	constexpr XX(D&& data)
		: data(std::forward<D>(data))
	{ }

	constexpr T give() const
		return data;

Then we create an expression class that records the operants and the type of the operation we want to perform.

template <class L, class F, class R>
struct YY
	L l;
	R r;

	template <class LL, class RR>
	constexpr YY(LL&& l, RR&& r)
		: l(std::forward<LL>(l))
		, r(std::forward<RR>(r))
	{ }

	constexpr auto give() const
		return F{}(l.give(), r.give());

We’ll be using the + operator in our ad-hoc expression templates so we overload it to perform expression parsing for our wrapper class (the two versions are created in way that satisfies the types involved in a left to right evaluation):

template <class F, class R, class T>
constexpr auto operator+(
    XX<F, T> const& lhs, R const& rhs)
	return YY< XX<F, T>, F, R >(lhs, rhs);

template <class F, class T1, class T2, class R>
constexpr auto operator+(
    YY<T1, F, T2> const& lhs, R const& rhs)
	return YY< YY<T1, F, T2>, F, R >(lhs, rhs);

Finally the interface functions use the above machinery:

namespace detail
	template <class... Ts>
	constexpr auto _foldl(Ts&&... args)
		return (args + ...);

	template <class... Ts>
	constexpr auto _foldr(Ts&&... args)
		return (... + args);

template <class F, class... Ts>
constexpr decltype(auto) foldl(Ts&&... args)
	return detail::_foldl(XX<F, Ts>(args)...).give();

template <class F, class... Ts>
constexpr decltype(auto) foldr(Ts&&... args)
	return detail::_foldr(XX<F, Ts>(args)...).give();

What we get out of this, is a foldl and a foldr function that, much like the Haskell counterparts, can be parameterized by the type of a binary operator. And then things get interesting :

// concatenate the stringification of sequence elements
    std::string(" bird in the hand, is worth "), 
    std::string(" in the bush"));

// create a vector out of a sequence
auto k = foldl<Vettore>(1, 2, 30, 12);

// find the maximum element of a sequence
foldl<Max>(1, 20, 3, 5);

The complete code for the above examples can be found here.

Hopefully this is clear enough to proceed to the next step, providing some syntactic sugar. We didn’t get through all this ET trouble to end up with something compile time recursion could (kind of, almost in let’s say a similar way) handle. What we aimed for is the ability to write customizable C++ folds ourselves. Here’s how :

template <class F, class T>
XX<F, T> Op(T const& val)
    return XX<F, T>(val); 

So when using this “operator generator” we’d be writing things like

(Op<Max>(args) + ...)


5. Extension to std::experimental::apply
6. Arbitrary folds
7. All the fold examples in one place