Monoid
Accumulation, reduce operations
__ __ ___ _ _ ___ ___ ____ ____
| \/ |/ _ \| \ | |/ _ \_ _| _ \/ ___|
| |\/| | | | | \| | | | | || | | \___ \
| | | | |_| | |\ | |_| | || |_| |___) |
|_| |_|\___/|_| \_|\___/___|____/|____/
Introduction #
Having explored semigroups and their associative operations, we now encounter a natural extension that adds even more power to our algebraic toolkit: monoids[1]. If semigroups gave us the ability to combine elements associatively, monoids give us something extra - a neutral element that doesn't change anything when combined with other elements.
Think of monoids as semigroups with a "do nothing" operation. This seemingly simple addition unlocks additional power.
From Semigroups to Monoids #
A monoid is essentially a semigroup enhanced with an identity element (also called a neutral element or unit). This identity element acts like zero in addition or one in multiplication - it leaves other elements unchanged when combined with them.
While semigroups are most naturally used with non-empty collections, monoids can also handle empty collections gracefully thanks to their identity element. This makes them useful for operations like:
- Folding over empty lists
- Initializing accumulator values
- Providing default values in computations
Formal Definition #
A monoid is an algebraic structure consisting of:
- A set
M(the underlying set or carrier) - A binary operation
∘ : M × M -> M(closed under the operation) - Associativity law: For all
a, b, c ∈ M:(a ∘ b) ∘ c = a ∘ (b ∘ c) - Identity element
e ∈ Msuch that for alla ∈ M:e ∘ a = a ∘ e = a
The first three properties make a monoid a semigroup, and the fourth property - the identity element - is what distinguishes monoids from semigroups.
The Identity #
The identity element might seem trivial, but it provides crucial benefits:
Empty Case Handling: When folding over an empty collection, the identity element provides a meaningful default result.
Initialization: The identity element serves as a perfect starting point for accumulator-style computations.
Simplification: Many algorithms become simpler when they can assume an identity element exists.
Composability: Functions that return monoids can be safely composed, even when some return "empty" results.
Monoids as Single-Object Categories #
The identity element is not the only hidden gem of monoids. In addition, they provide an associative binary operation. Sounds familiar?
A regular category also has objects, morphisms, an associative composition law, and identity morphisms. Can you see it? Monoids have all the tools to form a category. Hence, every monoid can be viewed as a category with exactly one object.
Take a category with one object, regardless of the nature of that object:
- Objects: just one, call it
•. The single object•is fixed - Morphisms:
Hom(•, •) = M, whereMis original monoid setHom(•, •)is the hom-set of arrows from•to itself
- Composition: define categorical composition by
g ∘ f = g * f- monoid operation - Identity:
id_• = e. The identity arrow is the monoid unit
• --f--> •
• --g--> •
• --h--> •
Composition: g ∘ f = h (defined by *)
Identity: id_• = e
Monoid (M, *, e) Category C with object A
------------------- -------------------------
| Elements: M | | Morphisms: Hom(A,A) |
| Operation: * | <------> | Operation: ∘ |
| Identity: e | | Identity: id_A |
------------------- -------------------------
| |
v v
Category with one object (•) End(A) forms a monoid
Monoid ≅ Single-Object Category
Monoidal category definition #
There is also an ordinary category of monoids, whose objects are monoids and whose morphisms are monoid homomorphisms.[2] That should not be confused with a monoidal category, which is a different notion.
A monoidal category[3] is a category C equipped with:
- A tensor product functor
⊗ : C × C -> C - A unit object
IinC. - Natural isomorphisms ensuring:
- Associativity:
(A ⊗ B) ⊗ C ≅ A ⊗ (B ⊗ C). - Unit laws:
I ⊗ A ≅ A ≅ A ⊗ I.
- Associativity:
- Coherence conditions ensuring these isomorphisms fit together consistently.
A monoidal category is an entire category equipped with a monoid-like structure. Tensor product
⊗plays the role of the binary operation. Unit objectIplays the role of the identity. Associativity and unit laws hold up to natural isomorphism.
Monoidal category of endofunctors #
An important example of a monoidal category is the category of endofunctors.
Category of endofunctors #
Endofunctors form a category (denoted End(C)):
-
Objects: The objects of
End(C)are all endofunctorsF: C -> C -
Morphisms: The morphisms are natural transformations between endofunctors. For endofunctors
F, G: C -> C, a morphismα: F ⟹ GinEnd(C)is a natural transformation where:- For each object
AinC, we have a morphismα_A: F(A) -> G(A)inC - Naturality condition: For any morphism
f: A -> BinC, the following diagram commutes:
F(A) ----α_A----> G(A) | | |F(f) |G(f) ▼ ▼ F(B) ----α_B----> G(B)This means
G(f) ∘ α_A = α_B ∘ F(f)for all morphismsfinC. - For each object
-
Identity: The identity natural transformation
id_F: F ⟹ Fwhere(id_F)_A = id_{F(A)}for each objectA -
Composition: Natural transformations compose component-wise:
- Given
α: F ⟹ Gandβ: G ⟹ H, their composition isβ ∘ α: F ⟹ H (β ∘ α)_A = β_A ∘ α_Afor each objectA
- Given
The collection of all endofunctors on a category
Cforms its own category, denotedEnd(C)or[C, C].
Monoidal category #
And how is it a monoidal category? Because End(C) has additional structure beyond just being a category.
Monoidal Product: Functor composition ∘ serves as the monoidal product
- For endofunctors
F, G: C -> C, their compositionF ∘ Gis also an endofunctor - Object mapping:
(F ∘ G)(A) = F(G(A)) - Morphism mapping:
(F ∘ G)(f) = F(G(f))
Monoidal Unit: The identity functor Id_C: C -> C where Id_C(A) = A and Id_C(f) = f
Associativity: Functor composition is associative: (F ∘ G) ∘ H = F ∘ (G ∘ H)
Unit Laws: F ∘ Id_C = F = Id_C ∘ F for any endofunctor F
(Endofunctors on
C,∘,Id_C) is a monoidal category
Horizontal Composition of Natural Transformations #
In End(C), natural transformations can be composed both vertically (as morphisms in the category) and horizontally (using the monoidal structure):
Vertical Composition: Standard composition of natural transformations
(β ∘ α)_A = β_A ∘ α_A
This is ordinary composition of morphisms in the underlying category. It is literally the same pattern as: given f:A -> B, g:B -> C, their composite is (g ∘ f):A -> C.
A -- f --> B -- g --> C
Horizontal Composition: For natural transformations α: F ⟹ G and β: H ⟹ K:
α * β: F ∘ H ⟹ G ∘ K- Its component at
Ais(α * β)_A = G(β_A) ∘ α_{H(A)} - By naturality of
α, this is equivalentlyα_{K(A)} ∘ F(β_A)
In a general monoidal category, this is a composition "side by side" using the tensor product ⊗. In End(C), that tensor product is functor composition. For ordinary morphisms f:A -> B and g:C -> D, the analogous tensor is f ⊗ g: A ⊗ C -> B ⊗ D.
A ----f----> B
⊗
C ----g----> D
Result: (A ⊗ C) ----f ⊗ g----> (B ⊗ D)
This is "parallel composition": apply f and g independently, then combine the results with the tensor.
When we combine both horizontal and vertical composition, it is helpful to picture a two-dimensional grid like this:
A --f--> B --f'--> C
| | |
g g' g''
| | |
D ------> E ------> F
- Row-wise composition corresponds to ordinary composition inside each row.
- The second direction represents composition across the other axis of the grid.
- In
End(C), these two directions correspond to vertical composition of natural transformations and horizontal composition induced by functor composition.
The Interchange Law #
When we have both kinds of composition (horizontal and vertical), they must cooperate. This is captured by the interchange law.
In a monoidal category, for morphisms:
f:A -> B, f':B -> Cg:D -> E, g':E -> F
the following equation holds:
(f′ ∘ f) ⊗ (g′ ∘ g) = (f′ ⊗ g′) ∘ (f ⊗ g)
A⊗D -- f⊗g --> B⊗E -- f'⊗g' --> C⊗F
= same as
(f' ∘ f) ⊗ (g' ∘ g)
A⊗D ----------------------------> C⊗F
So whether you compose first and then tensor, or tensor first and then compose, you get the same result.
Vertical then horizontal = horizontal then vertical
Take for example:
-
Set with Cartesian product
- Vertical = usual composition of functions.
- Horizontal = product of functions.
-
Endofunctors with composition:
- Vertical = natural transformation composition.
- Horizontal = horizontal composition of natural transformations induced by functor composition.
Examples #
Let's look at some common monoids that build upon the semigroups we've already explored:
Numbers Under Addition #
- Set: All integers
- Operation: Addition (
+) - Identity:
0(because0 + x = x + 0 = x) - Associativity:
(a + b) + c = a + (b + c)
Numbers Under Multiplication #
- Set: All integers
- Operation: Multiplication (
*) - Identity:
1(because1 * x = x * 1 = x) - Associativity:
(a * b) * c = a * (b * c)
Strings Under Concatenation #
- Set: All strings
- Operation: Concatenation (
++) - Identity: Empty string
""(because"" ++ s = s ++ "" = s) - Associativity:
(s1 ++ s2) ++ s3 = s1 ++ (s2 ++ s3)
Lists Under Concatenation #
- Set: All lists of type
[a] - Operation: List concatenation (
++) - Identity: Empty list
[](because[] ++ xs = xs ++ [] = xs) - Associativity:
(xs ++ ys) ++ zs = xs ++ (ys ++ zs)
Notice how each of these semigroups gains an identity element to become a monoid, making them easier to work with in practical applications.
Example usage #
Haskell has built-in support for monoids through the Monoid type class, which builds upon the Semigroup type class:
import Data.Monoid
-- Using Sum monoid for addition
example1 :: Int
example1 = getSum $ mconcat [Sum 1, Sum 2, Sum 3, Sum 4]
-- Result: 10
-- Using Product monoid for multiplication
example2 :: Int
example2 = getProduct $ mconcat [Product 2, Product 3, Product 4]
-- Result: 24
-- Working with strings (concatenation)
example3 :: String
example3 = mconcat ["Hello", " ", "World", "!"]
-- Result: "Hello World!"
-- Folding with empty case handling
safeSum :: [Int] -> Int
safeSum xs = getSum $ foldMap Sum xs
-- safeSum [] = 0 (identity element)
-- safeSum [1,2,3] = 6
-- Custom monoid for tracking statistics
data Stats = Stats { count :: Int, total :: Int } deriving (Show)
instance Semigroup Stats where
Stats c1 t1 <> Stats c2 t2 = Stats (c1 + c2) (t1 + t2)
instance Monoid Stats where
mempty = Stats 0 0
-- Usage: combining statistics from multiple sources
combineStats :: [Int] -> Stats
combineStats = foldMap (\x -> Stats 1 x)
-- combineStats [10, 20, 30] = Stats {count = 3, total = 60}
-- combineStats [] = Stats {count = 0, total = 0}
main :: IO ()
main = do
print example1
print example2
print example3
print $ safeSum [1,2,3]
print $ safeSum []
print $ combineStats [1, 2, 3]
// Monoid interface
interface Monoid<T> {
empty: T;
combine: (a: T, b: T) => T;
}
// String concatenation monoid
const StringMonoid: Monoid<string> = {
empty: "",
combine: (a, b) => a + b
};
// Number addition monoid
const NumberAddMonoid: Monoid<number> = {
empty: 0,
combine: (a, b) => a + b
};
// Array concatenation monoid
const ArrayMonoid = <T>(): Monoid<T[]> => ({
empty: [],
combine: (a, b) => [...a, ...b]
});
// Generic fold function that works with any monoid
function fold<T>(monoid: Monoid<T>, values: T[]): T {
return values.reduce(monoid.combine, monoid.empty);
}
// Usage examples
const result1 = fold(StringMonoid, ["Hello", " ", "World"]);
// Result: "Hello World"
const result2 = fold(NumberAddMonoid, [1, 2, 3, 4, 5]);
// Result: 15
const result3 = fold(ArrayMonoid<number>(), [[1, 2], [3, 4], [5]]);
// Result: [1, 2, 3, 4, 5]
// Safe operations with empty collections
const emptySum = fold(NumberAddMonoid, []);
// Result: 0 (identity element)
// Custom monoid for shopping cart
interface CartItem {
quantity: number;
price: number;
}
const CartMonoid: Monoid<CartItem> = {
empty: { quantity: 0, price: 0 },
combine: (a, b) => ({
quantity: a.quantity + b.quantity,
price: a.price + b.price
})
};
const cart = fold(CartMonoid, [
{ quantity: 2, price: 10.50 },
{ quantity: 1, price: 25.00 },
{ quantity: 3, price: 7.25 }
]);
// Result: { quantity: 6, price: 42.75 }
console.log(result1, result2, result3, emptySum, cart);
using System;
using System.Collections.Generic;
using System.Linq;
// Monoid interface
public interface IMonoid<T>
{
T Identity { get; }
T Combine(T a, T b);
}
// String concatenation monoid
public class StringConcatMonoid : IMonoid<string>
{
public string Identity => string.Empty;
public string Combine(string a, string b) => a + b;
}
// Integer addition monoid
public class IntAddMonoid : IMonoid<int>
{
public int Identity => 0;
public int Combine(int a, int b) => a + b;
}
// List concatenation monoid
public class ListConcatMonoid<T> : IMonoid<List<T>>
{
public List<T> Identity => new List<T>();
public List<T> Combine(List<T> a, List<T> b)
{
var result = new List<T>(a);
result.AddRange(b);
return result;
}
}
// Extension methods for monoid operations
public static class MonoidExtensions
{
public static T Fold<T>(this IEnumerable<T> source, IMonoid<T> monoid)
{
return source.Aggregate(monoid.Identity, monoid.Combine);
}
public static T FoldMap<TSource, T>(
this IEnumerable<TSource> source,
Func<TSource, T> selector,
IMonoid<T> monoid)
{
return source.Select(selector).Fold(monoid);
}
}
// Usage examples
class Program
{
static void Main()
{
var stringMonoid = new StringConcatMonoid();
var intMonoid = new IntAddMonoid();
var listMonoid = new ListConcatMonoid<int>();
// String concatenation
var words = new[] { "Hello", " ", "World", "!" };
var sentence = words.Fold(stringMonoid);
Console.WriteLine(sentence); // "Hello World!"
// Number addition
var numbers = new[] { 1, 2, 3, 4, 5 };
var sum = numbers.Fold(intMonoid);
Console.WriteLine(sum); // 15
// List concatenation
var lists = new[]
{
new List<int> { 1, 2 },
new List<int> { 3, 4 },
new List<int> { 5 }
};
var combined = lists.Fold(listMonoid);
Console.WriteLine(string.Join(", ", combined)); // "1, 2, 3, 4, 5"
// Safe empty collection handling
var emptySum = new int[0].Fold(intMonoid);
Console.WriteLine(emptySum); // 0 (identity element)
// Custom monoid for order totals
var orders = new[] { 10.50m, 25.00m, 7.25m, 15.75m };
var total = orders.FoldMap(x => x, new DecimalAddMonoid());
Console.WriteLine($"Total: ${total}"); // "Total: $58.50"
}
}
// Custom decimal addition monoid
public class DecimalAddMonoid : IMonoid<decimal>
{
public decimal Identity => 0m;
public decimal Combine(decimal a, decimal b) => a + b;
}
Triangle Monoid Example #
Building upon the triangle semigroup from our previous chapter, we have to be a bit more careful. The centroid-based triangle operation does not come with an obvious geometric identity triangle.
The Set and Operation #
To obtain a monoid from that semigroup, we adjoin a new formal identity element e:
- Set: All triangles in 2D space, together with a new symbol
e - Operation: The same triangle combination operation from the semigroup, extended by the rules
e ∘ T = TandT ∘ e = T - Identity:
e
This e is an extra element added so the operation has a true identity. That construction is standard: any semigroup can be turned into a monoid by adjoining an identity element.
Visual Representation #
Adjoined identity:
e
Combining with identity:
e ∘ △ = △
△ ∘ e = △
Regular triangle combination:
△₁ ∘ △₂ = △₃
Where △₃ is formed by:
- Centroid of △₁ as first vertex
- Centroid of △₂ as second vertex
- Midpoint between centroids as third vertex
Let's see how triangle combination works step by step:
Step 1: Original Triangles
Triangle A: Triangle B:
A D
/|\ /|\
/ | \ / | \
/ | \ / | \
B---+---C E---+---F
↑ ↑
A_centroid B_centroid
Step 2: Extract Centroids
Centroid A: ● (x₁,y₁)
Centroid B: ● (x₂,y₂)
Step 3: Calculate Midpoint
Midpoint M: ● ((x₁+x₂)/2, (y₁+y₂)/2)
Step 4: Form New Triangle
Result Triangle (A ∘ B):
Centroid_A ●
/|\
/ | \
/ | \
Centroid_B ●---+---● Midpoint_M
With Coordinates:
Triangle A: (0,3), (0,0), (3,0) Triangle B: (5,2), (4,0), (6,0)
A(0,3) D(5,2)
/|\ /|\
/ | \ / | \
/ | \ / | \
B(0,0)-+-C(3,0) E(4,0)-+-F(6,0)
↑ ↑
Centroid A Centroid B
(1, 1) (5, 0.67)
Combination Process:
1. A_centroid = (1, 1)
2. B_centroid = (5, 0.67)
3. Midpoint = (3, 0.835)
Result Triangle:
A_centroid(1,1) ●
/|\
/ | \
/ | \
B_centroid(5,0.67) ●---+---● Midpoint(3,0.835)
Identity Element Behavior:
Formal Identity + Any Triangle = That Triangle
e ∘ Triangle A:
A A
e /|\ = /|\
(formal unit) / | \ / | \
/ | \ / | \
B---+---C B---+---C
Triangle A ∘ e:
A A
/|\ e = /|\
/ | \ (formal unit) / | \
/ | \ / | \
B---+---C B---+---C
CSV Transform example #
Continuing semigroup CSV Transform example, let's see how monoids enhance data processing by providing safe empty case handling and initialization.
With monoids we gain additional benefits:
- 🔧 Safe Empty Datasets: No errors when processing empty CSV files
- 🚀 Default Initialization: Perfect starting values for accumulator operations
- 🎯 Simplified Processing: No need for special empty case handling
- ⚡ Parallel-Ready: Safe chunking and load balancing without edge cases
- 📊 Robust Statistics: Always valid results even with missing data
- 👥 Person Combination: Enhanced merging with identity element safety
- ➕ Age Summation: Zero-safe addition that works with empty collections
- 📋 Name Collection: Safe concatenation that handles empty lists gracefully
- 🗂️ Dataset Merging: Combines multiple CSV datasets including empty ones
- 🛡️ Error Recovery: Failed operations contribute identity elements safely
// Enhanced monoid interface with identity element
interface Monoid<T> {
empty: T;
combine: (a: T, b: T) => T;
}
// CSV parsing function (enhanced for safety)
function readCSV(data: string): Person[] {
if (!data.trim()) return []; // Safe handling of empty input
const [header, ...rows] = data.trim().split('\n');
if (!header || rows.length === 0) return []; // Safe handling of malformed CSV
const keys = header.split(',');
return rows.map(row => {
const values = row.split(',');
const obj = Object.fromEntries(keys.map((k, i) => [k, values[i]])) as any;
return {
name: obj.name || '',
age: obj.age || '0',
totalAge: undefined,
count: undefined
} as Person;
});
}
// Person type with safe defaults
interface Person {
name: string;
age: string;
totalAge?: number;
count?: number;
}
// Person monoid with safe empty person
const PersonMonoid: Monoid<Person> = {
empty: {
name: '',
age: '0',
totalAge: 0,
count: 0
},
combine: (p1, p2) => {
// Handle empty persons gracefully
if (p1.name === '' && p1.age === '0') return p2;
if (p2.name === '' && p2.age === '0') return p1;
return {
name: p1.name + " & " + p2.name,
age: Math.max(parseInt(p1.age), parseInt(p2.age)).toString(),
totalAge: (p1.totalAge || parseInt(p1.age)) + (p2.totalAge || parseInt(p2.age)),
count: (p1.count || 1) + (p2.count || 1)
};
}
};
// Age sum monoid (safe for empty collections)
const AgeSumMonoid: Monoid<number> = {
empty: 0,
combine: (a, b) => a + b
};
// Name collection monoid (safe concatenation)
const NameListMonoid: Monoid<string[]> = {
empty: [],
combine: (a, b) => [...a, ...b]
};
// Array concatenation monoid (generic)
const ArrayMonoid = <T>(): Monoid<T[]> => ({
empty: [],
combine: (a, b) => [...a, ...b]
});
// Statistics monoid for comprehensive data analysis
interface Stats {
totalAge: number;
count: number;
names: string[];
averageAge: number;
}
const StatsMonoid: Monoid<Stats> = {
empty: {
totalAge: 0,
count: 0,
names: [],
averageAge: 0
},
combine: (s1, s2) => {
const totalAge = s1.totalAge + s2.totalAge;
const count = s1.count + s2.count;
return {
totalAge,
count,
names: [...s1.names, ...s2.names],
averageAge: count > 0 ? totalAge / count : 0
};
}
};
// Safe fold function using monoids
function fold<T>(monoid: Monoid<T>, items: T[]): T {
return items.reduce(monoid.combine, monoid.empty);
}
// Safe map-fold operation
function foldMap<A, B>(monoid: Monoid<B>, mapper: (a: A) => B, items: A[]): B {
return fold(monoid, items.map(mapper));
}
// Usage examples showcasing monoid benefits
const data1 = readCSV('name,age\nAlice,25\nBob,40\nCharlie,35');
const data2 = readCSV('name,age\nDave,28\nEve,32');
const data3 = readCSV(''); // Empty CSV - safe with monoids!
const data4 = readCSV('name,age\nFrank,45\nGrace,29');
console.log("Dataset 1:", data1);
console.log("Dataset 2:", data2);
console.log("Dataset 3 (empty):", data3); // Empty array, handled safely
// 1. Safe combination of all persons (empty datasets don't break anything)
const combinedPerson = fold(PersonMonoid, [
fold(PersonMonoid, data1),
fold(PersonMonoid, data2),
fold(PersonMonoid, data3), // Empty dataset contributes identity element
fold(PersonMonoid, data4)
]);
console.log("Combined person:", combinedPerson);
// 2. Safe age summation (works even with empty datasets)
const allAges = [
...data1.map(p => parseInt(p.age)),
...data2.map(p => parseInt(p.age)),
...data3.map(p => parseInt(p.age)), // Empty array - no problem!
...data4.map(p => parseInt(p.age))
];
const totalAge = fold(AgeSumMonoid, allAges);
console.log("Total age:", totalAge);
// 3. Safe name collection
const allNames = foldMap(
NameListMonoid,
(person: Person) => [person.name],
[...data1, ...data2, ...data3, ...data4]
);
console.log("All names:", allNames);
// 4. Safe dataset merging with automatic empty handling
const allDatasets = fold(ArrayMonoid<Person>(), [data1, data2, data3, data4]);
console.log("Combined datasets:", allDatasets);
console.log("Total records:", allDatasets.length);
// 5. Comprehensive statistics with safe empty handling
const statsFromPerson = (p: Person): Stats => ({
totalAge: parseInt(p.age),
count: 1,
names: [p.name],
averageAge: parseInt(p.age)
});
const overallStats = foldMap(StatsMonoid, statsFromPerson, allDatasets);
console.log("Overall statistics:", overallStats);
// 6. Safe pipeline operations
const processDatasetSafely = (csvData: string): Stats => {
const persons = readCSV(csvData);
return foldMap(StatsMonoid, statsFromPerson, persons);
};
// These all work safely, even with problematic input
const results = [
processDatasetSafely('name,age\nAlice,25'),
processDatasetSafely(''), // Empty CSV
processDatasetSafely('name,age'), // Header only
processDatasetSafely('name,age\nBob,30\nCharlie,35')
].map(result => fold(StatsMonoid, [result]));
const finalResult = fold(StatsMonoid, results);
console.log("Pipeline result:", finalResult);
// 7. Error recovery with identity elements
const processWithRetry = (csvInputs: string[]): Stats => {
const validResults: Stats[] = [];
csvInputs.forEach(csv => {
try {
const result = processDatasetSafely(csv);
validResults.push(result);
} catch (error) {
// Failed processing contributes identity element (safe)
validResults.push(StatsMonoid.empty);
console.log("Processing failed, using identity element");
}
});
return fold(StatsMonoid, validResults);
};
const robustResult = processWithRetry([
'name,age\nAlice,25',
'', // Empty - safe
'invalid,csv,format', // Might fail - safe
'name,age\nBob,30'
]);
console.log("Robust processing result:", robustResult);
{-# LANGUAGE OverloadedStrings #-}
import Data.List.Split (splitOn)
import Data.Monoid (Sum(..), getSum)
import Data.Maybe (fromMaybe, mapMaybe)
-- Person data type with safe defaults
data Person = Person
{ personName :: String
, personAge :: Int
, personTotalAge :: Maybe Int
, personCount :: Maybe Int
} deriving (Show, Eq)
-- Monoid instance for Person (includes identity element)
instance Semigroup Person where
p1 <> p2 = Person
{ personName = if personName p1 == "" then personName p2
else if personName p2 == "" then personName p1
else personName p1 ++ " & " ++ personName p2
, personAge = max (personAge p1) (personAge p2)
, personTotalAge = Just $ (fromMaybe (personAge p1) (personTotalAge p1)) +
(fromMaybe (personAge p2) (personTotalAge p2))
, personCount = Just $ (fromMaybe 1 (personCount p1)) +
(fromMaybe 1 (personCount p2))
}
instance Monoid Person where
mempty = Person "" 0 (Just 0) (Just 0) -- Identity element
-- Statistics data type for comprehensive analysis
data Stats = Stats
{ statsTotalAge :: Int
, statsCount :: Int
, statsNames :: [String]
, statsAverageAge :: Double
} deriving (Show, Eq)
-- Monoid instance for Stats
instance Semigroup Stats where
s1 <> s2 =
let totalAge = statsTotalAge s1 + statsTotalAge s2
count = statsCount s1 + statsCount s2
avgAge = if count > 0 then fromIntegral totalAge / fromIntegral count else 0
in Stats totalAge count (statsNames s1 ++ statsNames s2) avgAge
instance Monoid Stats where
mempty = Stats 0 0 [] 0 -- Identity element for safe empty handling
-- Safe CSV parsing function
readCSV :: String -> [Person]
readCSV csvData
| null (filter (not . null) (lines csvData)) = [] -- Safe handling of empty input
| otherwise =
let allLines = lines csvData
in case allLines of
[] -> []
[_] -> [] -- Header only
(header:rows) ->
let keys = splitOn "," header
parseRow row = do
let values = splitOn "," row
rowMap = zip keys values
nameVal <- lookup "name" rowMap
ageStr <- lookup "age" rowMap
ageVal <- readMaybe ageStr
return $ Person nameVal ageVal Nothing Nothing
in mapMaybe parseRow rows
where
readMaybe :: String -> Maybe Int
readMaybe s = case reads s of
[(x, "")] -> Just x
_ -> Nothing
-- Safe fold function using monoids (no Maybe needed!)
foldMonoid :: Monoid a => [a] -> a
foldMonoid = mconcat -- Always safe, empty list becomes mempty
-- Safe map-fold operation
foldMapSafe :: Monoid b => (a -> b) -> [a] -> b
foldMapSafe f = foldMap f -- Always safe with monoids
-- Convert Person to Stats
personToStats :: Person -> Stats
personToStats p =
let age = personAge p
in Stats age 1 [personName p] (fromIntegral age)
-- Safe dataset processing
processDatasetSafely :: String -> Stats
processDatasetSafely csvData =
let persons = readCSV csvData
in foldMapSafe personToStats persons
-- Error recovery with identity elements
processWithRetry :: [String] -> Stats
processWithRetry csvInputs =
let tryProcess csv = case readCSV csv of
[] -> mempty -- Empty contributes identity element
persons -> foldMapSafe personToStats persons
in foldMonoid (map tryProcess csvInputs)
-- Main demonstration
main :: IO ()
main = do
let csvData1 = "name,age\nAlice,25\nBob,40\nCharlie,35"
let csvData2 = "name,age\nDave,28\nEve,32"
let csvData3 = "" -- Empty CSV - safe with monoids!
let csvData4 = "name,age\nFrank,45\nGrace,29"
let data1 = readCSV csvData1
let data2 = readCSV csvData2
let data3 = readCSV csvData3 -- Empty list
let data4 = readCSV csvData4
putStrLn "Dataset 1:"
mapM_ print data1
putStrLn $ "Dataset 2: " ++ show (length data2) ++ " records"
putStrLn $ "Dataset 3 (empty): " ++ show (length data3) ++ " records" -- Safe empty handling
putStrLn $ "Dataset 4: " ++ show (length data4) ++ " records"
-- 1. Safe combination of all persons (empty datasets don't break anything)
let combinedPersons = map foldMonoid [data1, data2, data3, data4]
let finalCombinedPerson = foldMonoid combinedPersons
putStrLn "\nCombined person:"
print finalCombinedPerson
-- 2. Safe age summation using Sum monoid (works even with empty datasets)
let allPersons = data1 ++ data2 ++ data3 ++ data4 -- Includes empty list safely
let allAges = map (Sum . fromIntegral . personAge :: Person -> Sum Int) allPersons
let totalAge = getSum $ foldMonoid allAges
putStrLn $ "Total age: " ++ show totalAge
-- 3. Safe name collection using list concatenation monoid
let allNames = foldMapSafe (\p -> [personName p]) allPersons
putStrLn $ "All names: " ++ show allNames
-- 4. Safe dataset merging with automatic empty handling
let allDatasets = foldMonoid [data1, data2, data3, data4] -- Empty lists handled automatically
putStrLn $ "Combined datasets: " ++ show (length allDatasets) ++ " total records"
-- 5. Comprehensive statistics with safe empty handling
let overallStats = foldMapSafe personToStats allDatasets
putStrLn "\nOverall statistics:"
print overallStats
putStrLn $ "Processed " ++ show (statsCount overallStats) ++
" records with average age " ++ show (statsAverageAge overallStats)
-- 6. Safe pipeline operations
let pipelineResults = map processDatasetSafely
[ "name,age\nAlice,25"
, "" -- Empty CSV
, "name,age" -- Header only
, "name,age\nBob,30\nCharlie,35"
]
let finalPipelineResult = foldMonoid pipelineResults
putStrLn $ "Pipeline result: " ++ show (statsCount finalPipelineResult) ++ " records processed"
-- 7. Error recovery with identity elements
let robustResult = processWithRetry
[ "name,age\nAlice,25"
, "" -- Empty - safe
, "invalid,csv,format" -- Might fail - safe
, "name,age\nBob,30"
]
putStrLn $ "Robust processing result: " ++ show (statsCount robustResult) ++ " records"
using System;
using System.Collections.Generic;
using System.Linq;
// Enhanced monoid interface with identity element
public interface IMonoid<T>
{
T Empty { get; }
T Combine(T a, T b);
}
// Person type with safe defaults
public class Person
{
public string Name { get; set; } = "";
public string Age { get; set; } = "0";
public int? TotalAge { get; set; }
public int? Count { get; set; }
}
// Person monoid with safe empty person
public class PersonMonoid : IMonoid<Person>
{
public Person Empty => new Person { Name = "", Age = "0", TotalAge = 0, Count = 0 };
public Person Combine(Person p1, Person p2)
{
if (p1.Name == "" && p1.Age == "0") return p2;
if (p2.Name == "" && p2.Age == "0") return p1;
return new Person
{
Name = $"{p1.Name} & {p2.Name}",
Age = Math.Max(int.Parse(p1.Age), int.Parse(p2.Age)).ToString(),
TotalAge = (p1.TotalAge ?? int.Parse(p1.Age)) + (p2.TotalAge ?? int.Parse(p2.Age)),
Count = (p1.Count ?? 1) + (p2.Count ?? 1)
};
}
}
// Age sum monoid (safe for empty collections)
public class AgeSumMonoid : IMonoid<int>
{
public int Empty => 0;
public int Combine(int a, int b) => a + b;
}
// Name collection monoid (safe concatenation)
public class NameListMonoid : IMonoid<List<string>>
{
public List<string> Empty => new List<string>();
public List<string> Combine(List<string> a, List<string> b) => a.Concat(b).ToList();
}
// Array concatenation monoid (generic)
public class ArrayMonoid<T> : IMonoid<List<T>>
{
public List<T> Empty => new List<T>();
public List<T> Combine(List<T> a, List<T> b) => a.Concat(b).ToList();
}
// Statistics monoid for comprehensive data analysis
public class Stats
{
public int TotalAge { get; set; }
public int Count { get; set; }
public List<string> Names { get; set; } = new List<string>();
public double AverageAge { get; set; }
}
public class StatsMonoid : IMonoid<Stats>
{
public Stats Empty => new Stats { TotalAge = 0, Count = 0, Names = new List<string>(), AverageAge = 0 };
public Stats Combine(Stats s1, Stats s2)
{
var totalAge = s1.TotalAge + s2.TotalAge;
var count = s1.Count + s2.Count;
return new Stats
{
TotalAge = totalAge,
Count = count,
Names = s1.Names.Concat(s2.Names).ToList(),
AverageAge = count > 0 ? (double)totalAge / count : 0
};
}
}
public class Program
{
// CSV parsing function (enhanced for safety)
public static List<Person> ReadCSV(string data)
{
if (string.IsNullOrWhiteSpace(data)) return new List<Person>();
var lines = data.Trim().Split('\n');
if (lines.Length < 2) return new List<Person>();
var header = lines[0].Split(',');
var rows = lines.Skip(1);
var result = new List<Person>();
foreach (var row in rows)
{
var values = row.Split(',');
var dict = new Dictionary<string, string>();
for (int i = 0; i < header.Length && i < values.Length; i++)
dict[header[i]] = values[i];
result.Add(new Person
{
Name = dict.ContainsKey("name") ? dict["name"] : "",
Age = dict.ContainsKey("age") ? dict["age"] : "0",
TotalAge = null,
Count = null
});
}
return result;
}
// Safe fold function using monoids
public static T Fold<T>(IMonoid<T> monoid, IEnumerable<T> items)
{
return items.Aggregate(monoid.Empty, monoid.Combine);
}
// Safe map-fold operation
public static B FoldMap<A, B>(IMonoid<B> monoid, Func<A, B> mapper, IEnumerable<A> items)
{
return Fold(monoid, items.Select(mapper));
}
public static void Main()
{
var data1 = ReadCSV("name,age\nAlice,25\nBob,40\nCharlie,35");
var data2 = ReadCSV("name,age\nDave,28\nEve,32");
var data3 = ReadCSV(""); // Empty CSV - safe with monoids!
var data4 = ReadCSV("name,age\nFrank,45\nGrace,29");
Console.WriteLine("Dataset 1: " + string.Join("; ", data1.Select(p => $"{p.Name}:{p.Age}")));
Console.WriteLine("Dataset 2: " + string.Join("; ", data2.Select(p => $"{p.Name}:{p.Age}")));
Console.WriteLine("Dataset 3 (empty): " + string.Join("; ", data3.Select(p => $"{p.Name}:{p.Age}")));
// 1. Safe combination of all persons (empty datasets don't break anything)
var personMonoid = new PersonMonoid();
var combinedPerson = Fold(personMonoid, new[]
{
Fold(personMonoid, data1),
Fold(personMonoid, data2),
Fold(personMonoid, data3),
Fold(personMonoid, data4)
});
Console.WriteLine("Combined person: " + $"{combinedPerson.Name}, Age: {combinedPerson.Age}, TotalAge: {combinedPerson.TotalAge}, Count: {combinedPerson.Count}");
// 2. Safe age summation (works even with empty datasets)
var allAges = data1.Concat(data2).Concat(data3).Concat(data4).Select(p => int.Parse(p.Age)).ToList();
var ageSumMonoid = new AgeSumMonoid();
var totalAge = Fold(ageSumMonoid, allAges);
Console.WriteLine("Total age: " + totalAge);
// 3. Safe name collection
var nameListMonoid = new NameListMonoid();
var allNames = FoldMap(nameListMonoid, (Person p) => new List<string> { p.Name }, data1.Concat(data2).Concat(data3).Concat(data4));
Console.WriteLine("All names: " + string.Join(", ", allNames));
// 4. Safe dataset merging with automatic empty handling
var arrayMonoid = new ArrayMonoid<Person>();
var allDatasets = Fold(arrayMonoid, new List<List<Person>> { data1, data2, data3, data4 });
Console.WriteLine("Combined datasets: " + string.Join("; ", allDatasets.Select(p => $"{p.Name}:{p.Age}")));
Console.WriteLine("Total records: " + allDatasets.Count);
// 5. Comprehensive statistics with safe empty handling
Stats StatsFromPerson(Person p) => new Stats
{
TotalAge = int.Parse(p.Age),
Count = 1,
Names = new List<string> { p.Name },
AverageAge = int.Parse(p.Age)
};
var statsMonoid = new StatsMonoid();
var overallStats = FoldMap(statsMonoid, StatsFromPerson, allDatasets);
Console.WriteLine($"Overall statistics: TotalAge={overallStats.TotalAge}, Count={overallStats.Count}, AverageAge={overallStats.AverageAge}, Names=[{string.Join(", ", overallStats.Names)}]");
// 6. Safe pipeline operations
Stats ProcessDatasetSafely(string csvData)
{
var persons = ReadCSV(csvData);
return FoldMap(statsMonoid, StatsFromPerson, persons);
}
var results = new[]
{
ProcessDatasetSafely("name,age\nAlice,25"),
ProcessDatasetSafely(""),
ProcessDatasetSafely("name,age"),
ProcessDatasetSafely("name,age\nBob,30\nCharlie,35")
}.Select(result => Fold(statsMonoid, new[] { result })).ToList();
var finalResult = Fold(statsMonoid, results);
Console.WriteLine($"Pipeline result: TotalAge={finalResult.TotalAge}, Count={finalResult.Count}, AverageAge={finalResult.AverageAge}, Names=[{string.Join(", ", finalResult.Names)}]");
// 7. Error recovery with identity elements
Stats ProcessWithRetry(IEnumerable<string> csvInputs)
{
var validResults = new List<Stats>();
foreach (var csv in csvInputs)
{
try
{
var result = ProcessDatasetSafely(csv);
validResults.Add(result);
}
catch
{
validResults.Add(statsMonoid.Empty);
Console.WriteLine("Processing failed, using identity element");
}
}
return Fold(statsMonoid, validResults);
}
var robustResult = ProcessWithRetry(new[]
{
"name,age\nAlice,25",
"",
"invalid,csv,format",
"name,age\nBob,30"
});
Console.WriteLine($"Robust processing result: TotalAge={robustResult.TotalAge}, Count={robustResult.Count}, AverageAge={robustResult.AverageAge}, Names=[{string.Join(", ", robustResult.Names)}]");
}
}
Visualizing monoids #
Monoid = Set + Operation + Laws
SET: {a, b, c, d, e, ...}
│
├─ OPERATION: ∘ (binary, closed, associative)
│
└─ IDENTITY: e (neutral element)
Laws:
1. Closure: a ∘ b ∈ Set
2. Associativity: (a ∘ b) ∘ c = a ∘ (b ∘ c)
3. Identity: e ∘ a = a ∘ e = a
-------------------------------------------------------
Identity Element (e) acts as "do nothing":
a ∘ e = a e ∘ a = a
│ │ │ │ │ │
│ │ └────────┘ │ │
│ └────────────────── │
└────────────────────────┘
Same element 'a'
Left-associative: ((a ∘ b) ∘ c) ∘ d
│─────│
│ r₁ │
│─────────│
│ r₂ │
│─────────────│
│ r₃ │
Right-associative: a ∘ (b ∘ (c ∘ d))
│─────│
│ r₁ │
│─────────│
│ r₂ │
│─────────────│
│ r₃ │
Both produce the same result: r₃
-------------------------------------------------------
Decomposition:
Original data: [a, b, c, d, e, f, g, h]
Split into chunks:
┌─────────┬─────────┬─────────┬─────────┐
│ [a, b] │ [c, d] │ [e, f] │ [g, h] │
└─────────┴─────────┴─────────┴─────────┘
│ │ │ │
Worker 1 Worker 2 Worker 3 Worker 4
│ │ │ │
a ∘ b c ∘ d e ∘ f g ∘ h
│ │ │ │
r₁ r₂ r₃ r₄
Combine results:
(r₁ ∘ r₂) ∘ (r₃ ∘ r₄) = final result
Empty chunks are safe:
┌─────────┬─────────┬─────────┬─────────┐
│ [a, b] │ [] │ [e, f] │ [g, h] │
└─────────┴─────────┴─────────┴─────────┘
│ │ │ │
a ∘ b e e ∘ f g ∘ h
│ │ │ │
r₁ r₂ r₃ r₄
Still works: (r₁ ∘ e) ∘ (r₃ ∘ r₄) = r₁ ∘ (r₃ ∘ r₄)
Semigroup vs Monoid Comparison #
Semigroup (no identity):
┌─────────────────────────────────┐
│ SET + ASSOCIATIVE OPERATION │
│ │
│ ⚠️ fold([]) = ERROR │
│ ✅ fold([a]) = a │
│ ✅ fold([a,b,c]) = (a∘b)∘c │
└─────────────────────────────────┘
Monoid (with identity):
┌─────────────────────────────────┐
│ SET + ASSOCIATIVE OPERATION │
│ + IDENTITY ELEMENT │
│ │
│ ✅ fold([]) = e │
│ ✅ fold([a]) = a │
│ ✅ fold([a,b,c]) = (a∘b)∘c │
└─────────────────────────────────┘
The identity element makes monoids much safer!
Parallel processing #
Building upon the parallel processing patterns from semigroups, monoids provide even greater advantages for parallel computation. The identity element eliminates the need for special handling of empty collections and provides guaranteed safe initialization values. Let's explore how monoids excel in parallel scenarios:
Key Advantages of Monoids in Parallel Processing #
- Empty Collection Safety: No need for
Maybe/Optionaltypes - empty collections fold to the identity element - Guaranteed Initialization: Every parallel worker can start with the identity element
- Simplified Load Balancing: Uneven chunk distribution doesn't require special handling
- Robust Error Recovery: Failed workers can be replaced with identity elements
{-# LANGUAGE ParallelListComp #-}
import Control.Parallel.Strategies
import Data.Monoid (Sum(..), getSum)
-- Simple parallel fold using monoid properties
parallelFold :: Monoid a => [a] -> a
parallelFold [] = mempty -- Identity element handles empty case automatically!
parallelFold xs =
let chunkSize = max 1 (length xs `div` 4) -- Divide among 4 cores
chunks = chunksOf chunkSize xs
-- Each chunk folds to its local result, empty chunks become mempty
chunkResults = map (foldMap id) chunks `using` parList rseq
in foldMap id chunkResults -- Combine all results
-- Enhanced parallel map-fold with guaranteed safety
parallelMapFold :: Monoid b => (a -> b) -> [a] -> b
parallelMapFold f xs =
let chunks = chunksOf 1000 xs
-- Each worker processes its chunk independently
chunkResults = map (foldMap f) chunks `using` parList rseq
in foldMap id chunkResults -- Always safe, even with empty chunks
-- Parallel statistics collection using multiple monoids
data Stats = Stats { count :: Sum Int, total :: Sum Int, squares :: Sum Int }
deriving (Show, Eq)
instance Semigroup Stats where
Stats c1 t1 s1 <> Stats c2 t2 s2 = Stats (c1 <> c2) (t1 <> t2) (s1 <> s2)
instance Monoid Stats where
mempty = Stats mempty mempty mempty
-- Collect statistics in parallel
parallelStats :: [Int] -> Stats
parallelStats = parallelMapFold (\x -> Stats (Sum 1) (Sum x) (Sum (x * x)))
-- Helper function
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs =
let (chunk, rest) = splitAt n xs
in chunk : chunksOf n rest
-- Example usage
main :: IO ()
main = do
let numbers = [1..1000000] :: [Int] -- Explicit type annotation
let emptyList = [] :: [Int]
-- Direct parallel fold of monoid values
let sums = map Sum ([1..100] :: [Int]) -- Explicit type for the range
let foldResult = parallelFold sums
putStrLn $ "Parallel fold result: " ++ show (getSum foldResult)
-- Safe parallel sum - works even with empty lists!
let sumResult = parallelMapFold Sum numbers
let emptySumResult = parallelMapFold Sum emptyList
putStrLn $ "Parallel sum: " ++ show (getSum sumResult)
putStrLn $ "Empty list sum: " ++ show (getSum emptySumResult) -- 0, no Maybe needed!
-- Parallel statistics
let stats = parallelStats [1..100]
putStrLn $ "Parallel stats: " ++ show stats
-- Empty stats are safe too
let emptyStats = parallelStats []
putStrLn $ "Empty stats: " ++ show emptyStats -- All zeros, perfectly safe
// Enhanced monoid interface with guaranteed identity
interface Monoid<T> {
empty: T;
combine: (a: T, b: T) => T;
}
// Functional helper: create chunks from array
const createChunks = <T>(data: T[], chunkSize: number): T[][] => {
const chunks: T[][] = [];
for (let i = 0; i < data.length; i += chunkSize) {
chunks.push(data.slice(i, i + chunkSize));
}
return chunks;
};
// Functional helper: simulate parallel chunk processing
const processChunk = <T>(monoid: Monoid<T>) => (chunk: T[]): Promise<T> =>
new Promise(resolve => {
setTimeout(() => {
// Empty chunks safely reduce to identity element
const result = chunk.reduce(monoid.combine, monoid.empty);
resolve(result);
}, Math.random() * 50);
});
// Safe parallel fold - always returns a value, never null/undefined
const parallelFold = <T>(monoid: Monoid<T>) => async (data: T[]): Promise<T> => {
if (data.length === 0) return monoid.empty; // Identity element!
const chunkSize = Math.max(1, Math.ceil(data.length / (navigator.hardwareConcurrency || 4)));
const chunks = createChunks(data, chunkSize);
// Process chunks in parallel, empty chunks automatically become identity
const chunkResults = await Promise.all(chunks.map(processChunk(monoid)));
// Combine all results - empty chunks contribute identity element
return chunkResults.reduce(monoid.combine, monoid.empty);
};
// Enhanced map-fold with load balancing
const parallelMapFold = <T, U>(monoid: Monoid<T>) => (mapper: (item: U) => T) =>
async (data: U[]): Promise<T> => {
const chunks = createChunks(data, 1000);
const chunkResults = await Promise.all(
chunks.map(chunk => processChunk(monoid)(chunk.map(mapper)))
);
return chunkResults.reduce(monoid.combine, monoid.empty);
};
// Enhanced monoids with better parallel characteristics
const NumberAddMonoid: Monoid<number> = {
empty: 0,
combine: (a, b) => a + b
};
const NumberMultiplyMonoid: Monoid<number> = {
empty: 1,
combine: (a, b) => a * b
};
// Statistics monoid for parallel data analysis
interface Statistics {
count: number;
sum: number;
sumSquares: number;
}
const StatisticsMonoid: Monoid<Statistics> = {
empty: { count: 0, sum: 0, sumSquares: 0 },
combine: (a, b) => ({
count: a.count + b.count,
sum: a.sum + b.sum,
sumSquares: a.sumSquares + b.sumSquares
})
};
// Usage demonstration - functional composition style
async function demonstrateParallelMonoids() {
const numbers = Array.from({length: 1000000}, (_, i) => i + 1);
const emptyArray: number[] = [];
// Create specialized functions using partial application
const parallelSum = parallelMapFold<number, number>(NumberAddMonoid)((x: number) => x);
const parallelFoldSum = parallelFold(NumberAddMonoid);
const parallelStats = parallelMapFold<Statistics, number>(StatisticsMonoid)(
(x: number) => ({ count: 1, sum: x, sumSquares: x * x })
);
console.time('Parallel Sum');
const sum = await parallelSum(numbers);
console.timeEnd('Parallel Sum');
console.log('Parallel sum:', sum);
// Empty arrays are perfectly safe - no null checks needed!
const emptySum = await parallelFoldSum(emptyArray);
console.log('Empty array sum:', emptySum); // 0, not null/undefined
// Parallel statistics collection
console.time('Parallel Stats');
const stats = await parallelStats(numbers.slice(0, 10000));
console.timeEnd('Parallel Stats');
const mean = stats.sum / stats.count;
const variance = (stats.sumSquares / stats.count) - (mean * mean);
console.log(`Statistics: count=${stats.count}, mean=${mean.toFixed(2)}, variance=${variance.toFixed(2)}`);
// Functional composition example: pipeline of operations
const processLargeDataset = async (data: number[]) => {
const sum = await parallelSum(data);
const stats = await parallelStats(data);
const product = await parallelMapFold<number, number>(NumberMultiplyMonoid)((x: number) => x)(data.slice(0, 10));
return { sum, stats, product };
};
const results = await processLargeDataset([1, 2, 3, 4, 5]);
console.log('Pipeline results:', results);
}
demonstrateParallelMonoids();
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
// Enhanced monoid interface
public interface IMonoid<T>
{
T Identity { get; }
T Combine(T a, T b);
}
public class IntAddMonoid : IMonoid<int>
{
public int Identity => 0;
public int Combine(int a, int b) => a + b;
}
// Robust parallel processor with monoid guarantees
public class MonoidParallelProcessor<T>
{
private readonly IMonoid<T> _monoid;
public MonoidParallelProcessor(IMonoid<T> monoid)
{
_monoid = monoid;
}
// Safe parallel fold - always returns a value, never null
public async Task<T> ParallelFoldAsync(IEnumerable<T> data)
{
var list = data.ToList();
if (!list.Any()) return _monoid.Identity; // Safe for empty collections!
return await Task.Run(() => ParallelFoldInternal(list));
}
private T ParallelFoldInternal(IList<T> data)
{
if (!data.Any()) return _monoid.Identity;
if (data.Count == 1) return data[0];
var tasks = new List<Task<T>>();
var chunkSize = Math.Max(1, data.Count / Environment.ProcessorCount);
for (int i = 0; i < data.Count; i += chunkSize)
{
var chunk = data.Skip(i).Take(chunkSize).ToList();
tasks.Add(Task.Run(() => chunk.Aggregate(_monoid.Identity, _monoid.Combine)));
}
Task.WaitAll(tasks.ToArray());
return tasks.Select(t => t.Result).Aggregate(_monoid.Identity, _monoid.Combine);
}
// Enhanced map-fold with automatic load balancing
public async Task<T> ParallelMapFoldAsync<U>(
IEnumerable<U> data,
Func<U, T> mapper,
int? chunkSizeOverride = null)
{
var list = data.ToList();
if (!list.Any()) return _monoid.Identity;
var chunkSize = chunkSizeOverride ?? Math.Max(1, list.Count / Environment.ProcessorCount);
var chunks = list.Chunk(chunkSize);
var chunkTasks = chunks.Select(chunk => Task.Run(() =>
{
// Each chunk maps and folds independently
return chunk.Select(mapper).Aggregate(_monoid.Identity, _monoid.Combine);
}));
var results = await Task.WhenAll(chunkTasks);
return results.Aggregate(_monoid.Identity, _monoid.Combine);
}
// PLINQ with monoid safety
public T PLinqMapFold<U>(IEnumerable<U> data, Func<U, T> mapper)
{
return data.AsParallel()
.Select(mapper)
.Aggregate(_monoid.Identity, _monoid.Combine);
}
}
// Statistics monoid for parallel analysis
public struct Statistics
{
public long Count { get; }
public long Sum { get; }
public long SumSquares { get; }
public Statistics(long count, long sum, long sumSquares)
{
Count = count;
Sum = sum;
SumSquares = sumSquares;
}
public double Mean => Count > 0 ? (double)Sum / Count : 0;
public double Variance => Count > 0 ? ((double)SumSquares / Count) - (Mean * Mean) : 0;
}
public class StatisticsMonoid : IMonoid<Statistics>
{
public Statistics Identity => new Statistics(0, 0, 0);
public Statistics Combine(Statistics a, Statistics b) =>
new Statistics(a.Count + b.Count, a.Sum + b.Sum, a.SumSquares + b.SumSquares);
}
// Extension methods for easier usage
public static class MonoidExtensions
{
public static async Task<T> ParallelFoldAsync<T>(this IEnumerable<T> source, IMonoid<T> monoid)
{
var processor = new MonoidParallelProcessor<T>(monoid);
return await processor.ParallelFoldAsync(source);
}
public static IEnumerable<T[]> Chunk<T>(this IEnumerable<T> source, int chunkSize)
{
var enumerator = source.GetEnumerator();
while (enumerator.MoveNext())
{
var chunk = new List<T> { enumerator.Current };
for (int i = 1; i < chunkSize && enumerator.MoveNext(); i++)
{
chunk.Add(enumerator.Current);
}
yield return chunk.ToArray();
}
}
}
// Demonstration
class Program
{
static async Task Main(string[] args)
{
var numbers = Enumerable.Range(1, 1_000_000).ToList();
var emptyList = new List<int>();
// Safe parallel sum - works with empty collections!
var sumMonoid = new IntAddMonoid();
var sumProcessor = new MonoidParallelProcessor<int>(sumMonoid);
var parallelSum = await sumProcessor.ParallelMapFoldAsync(numbers, x => x);
var emptySum = await sumProcessor.ParallelFoldAsync(emptyList);
Console.WriteLine($"Parallel Sum: {parallelSum}");
Console.WriteLine($"Empty List Sum: {emptySum}"); // 0, not null!
// Parallel statistics collection
var statsMonoid = new StatisticsMonoid();
var statsProcessor = new MonoidParallelProcessor<Statistics>(statsMonoid);
var stats = await statsProcessor.ParallelMapFoldAsync(
numbers.Take(100_000),
x => new Statistics(1, x, (long)x * x)
);
Console.WriteLine($"Count: {stats.Count}");
Console.WriteLine($"Mean: {stats.Mean:F2}");
Console.WriteLine($"Variance: {stats.Variance:F2}");
// Empty statistics are safe too
var emptyStats = await statsProcessor.ParallelFoldAsync(
Enumerable.Empty<Statistics>()
);
Console.WriteLine($"Empty Stats: Count={emptyStats.Count}, Mean={emptyStats.Mean}");
}
}
Conclusion #
Monoids provide a perfect foundation for many common programming patterns:
- Reduce operations: When you need to combine all elements in a collection
- Default values: The identity element provides a sensible default for empty cases
- Incremental computation: Building up results step by step with guaranteed correctness
- Parallel processing: The associativity and identity properties enable safe parallelization
- Compositional design: Monoids compose naturally, making complex systems easier to reason about
Source code #
Reference implementation (opens in a new tab)