Course Map

The problem this chapter solves is:

Before reading individual source files, you need one map that connects the tiny ML pipeline, the Rust modules, and the category-theory vocabulary.

The repository is intentionally small, but it still has layers. One layer names the values. Another layer names transformations between values. A third layer uses those transformations to make predictions, measure loss, and update model parameters.

This chapter gives you the whole map before the book zooms in.

Chapter Outcomes

By the end of this chapter, you should be able to:

  • place each printed line from cargo run --bin category_ml into domain value, typed transformation, or training update,
  • explain how src/domain.rs, src/category.rs, src/ml.rs, and src/training.rs divide responsibility,
  • translate the book’s first pipeline into objects, morphisms, product input, loss, and endomorphism language.

Choose Your Path

Use the book-first path if you want the concepts introduced in order:

Welcome
  -> Course Map
  -> Domain Objects
  -> Morphism and Composition
  -> Tiny ML Pipeline
  -> Training as an Endomorphism

Use the code-first path if you learn faster by running something first:

cargo run --example 01_token_sequence
cargo run --bin category_ml

Then come back to this map and place each printed line in one of three locations:

domain value
typed transformation
training update

Both paths are valid. The book-first path reduces surprise. The code-first path reduces abstraction anxiety. The important thing is not to open every file at once. Start with one path, run one command, and attach each new word to one visible Rust shape.

What You Already Know

If you read a program from top to bottom, you already know how to follow a flow. If you read a Rust function signature, you already know that a step has an input type and an output type. If you have seen any ML pipeline, you already know that raw data eventually becomes predictions, loss, and updates.

The map in this chapter puts those familiar habits together:

value
  -> transformation
  -> composed transformations
  -> measured error
  -> repeated update

The category-theory vocabulary is not a separate layer pasted on top. It names shapes that are already present in the Rust and ML readings.

Worked Example: From One Function To A Pipeline

Start with one ordinary function:

fn token_to_vector_id(token_id: usize) -> usize {
    token_id + 100
}

fn main() {
    assert_eq!(token_to_vector_id(7), 107);
}

This has the shape:

usize -> usize

That is a transformation, but it is not yet a good teaching boundary. Both sides use the same raw type, so the signature does not tell us whether the number is a token, a vector row, a dimension, or something else.

The book replaces that vague movement with named stages:

TokenId -> Vector

Then it composes more stages:

TokenId -> Vector -> Logits -> Distribution

That is the basic move for the whole book. Start with a familiar function, give the meaningful values names, then ask which typed transformations can compose safely.
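As a minimal sketch of that move, the raw usize -> usize function can be rewritten with named wrapper types. The TokenId and Vector definitions and the embed function below are hypothetical stand-ins written for this illustration; the project's real types live in src/domain.rs and may differ in fields and constructors.

```rust
// Hypothetical newtypes for illustration only; not the definitions in src/domain.rs.
#[derive(Debug, Clone, Copy, PartialEq)]
struct TokenId(usize);

#[derive(Debug, Clone, PartialEq)]
struct Vector(Vec<f32>);

// A named stage: the signature now says what goes in and what comes out.
fn embed(token: TokenId) -> Vector {
    // Toy lookup: a two-dimensional vector derived from the token index.
    Vector(vec![token.0 as f32, 1.0])
}

fn main() {
    let v = embed(TokenId(7));
    assert_eq!(v, Vector(vec![7.0, 1.0]));
    // `embed(7)` would be rejected by the compiler: a bare usize is not a TokenId.
}
```

The wrappers cost nothing at the call site beyond writing TokenId(7), but they let the compiler reject a vocabulary size or a row index passed where a token is expected.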

Self-check

Before continuing, explain why TokenId -> Vector carries more information than usize -> Vec<f32>. A strong answer should mention both reader clarity and compiler-checked boundaries.

The Whole Pipeline

The first mental model is:

Text -> Tokens -> TrainingPairs -> ModelState -> Prediction -> Loss -> Updated ModelState

Read it as one question:

What object do we have now, and what typed transformation moves us to the next object?

The same diagram with the first concrete Rust names is:

Text
  |
  | tokenize
  v
TokenSequence
  |
  | adjacent pairs
  v
TrainingSet
  |
  | train with current Parameters
  v
Parameters
  |
  | predict
  v
Distribution
  |
  | compare with target token
  v
Loss
  |
  | optimizer step
  v
Parameters

The public names and Rust names are close, but not identical:

| Reader-facing name | Rust name in this project | Why the distinction matters |
|---|---|---|
| Tokens | TokenSequence | the code preserves order, not only a bag of token IDs |
| TrainingPairs | TrainingSet of Product<TokenId, TokenId> | each example has an input token and the next-token target |
| ModelState | Parameters | this tiny model’s trainable state is its embedding and projection parameters |
| Updated ModelState | updated Parameters | training is a state update, not a new kind of object |

The central book pipeline is:

Text
  -> TokenSequence
  -> TrainingSet
  -> Prediction
  -> Loss
  -> Updated Parameters

The concrete Rust shape is slightly more detailed:

TokenSequence -> TrainingSet
TokenId       -> Vector
Vector        -> Logits
Logits        -> Distribution
Distribution x TokenId -> Loss
Parameters    -> Parameters

The same map can be drawn as a learner-facing flow:

raw text
   |
   v
TokenSequence --DatasetWindowing--> TrainingSet
   |
   v
TokenId --Embedding--> Vector --LinearToLogits--> Logits --Softmax--> Distribution
                                                                        |
                                                                        v
                                                       Product<Distribution, TokenId>
                                                                        |
                                                                        v
                                                               CrossEntropy -> Loss

Parameters --TrainStep--> Updated Parameters

The same course map as a compact rendered math view:

\[
\begin{array}{ccccccccc}
\mathrm{Text} & \to & \mathrm{TokenSequence} & \xrightarrow{\mathrm{DatasetWindowing}} & \mathrm{TrainingSet} & \leadsto & \mathrm{Product}\langle\mathrm{Distribution},\mathrm{TokenId}\rangle & \xrightarrow{\mathrm{CrossEntropy}} & \mathrm{Loss} \\
&&& &&& \uparrow \mathrm{Softmax \circ LinearToLogits \circ Embedding} && \\
&&& &&& \mathrm{TokenId} && \\
\mathrm{Parameters} & \xrightarrow{\mathrm{TrainStep}} & \mathrm{UpdatedParameters} &&&&&&
\end{array}
\]

Read the top path as prediction and evaluation. Read the bottom path as training state. The two meet because TrainStep uses the training set, current parameters, prediction path, and loss to produce updated parameters.

If the text diagram is easier to read first, use it first. If the rendered view is easier to track, redraw it and label the Rust object behind every mathematical name. Both views are teaching aids; the proof that the map is real is still the code and the commands.

Read that map in three ways.

The Rust reading is about named types, trait implementations, constructors, fallible boundaries, and tests. The ML reading is about data preparation, embeddings, scores, probabilities, error measurement, and parameter updates. The category-theory reading is about objects, morphisms, products, composition, endomorphisms, and laws.

These are not three different books. They are three readings of the same small program.

Module Map

The source tree follows the learning path. Each file owns one part of the conceptual load, so the reader does not have to learn every abstraction at the same time.

| File | What it teaches | Main shape |
|---|---|---|
| src/domain.rs | Meaningful values | TokenId, Vector, Distribution, Parameters |
| src/category.rs | Typed arrows | Morphism<Input, Output> |
| src/ml.rs | Concrete ML transformations | TokenId -> Vector -> Logits -> Distribution |
| src/training.rs | Repeated updates | Parameters -> Parameters |
| src/structure.rs | Reusable structure | functor, natural transformation, monoid |
| src/calculus.rs | Local derivative flow | chain rule for z = x * y |
| src/sketches.rs | Applied category-theory sketches | typed models plus law checks |
| src/demo.rs | Full guided walkthrough | executable course outline |

This table is not something to memorize. Use it as a navigation tool. When a later chapter names a concept, you should be able to place it in one file and one row of the pipeline.

The Library Surface

The library root collects the modules and re-exports the teaching types. That is why the examples can import clear names instead of reaching through deep paths.

The important design is:

domain nouns
category arrows
ML arrows
training update
structure patterns
calculus rule
applied sketches
demo

That order is also the reading path. The book first asks “what are the values?” Then it asks “what transformations are allowed?” Only after those two questions does it build prediction, loss, and training.

How The Files Fit Together

src/domain.rs defines the nouns. A TokenId is not a VocabSize, a Distribution is not a raw vector, and a LearningRate is not just any floating-point number. This file protects meaning at the boundary where raw machine values enter the tutorial.

src/category.rs defines the arrows. The central trait is:

pub trait Morphism<Input, Output> {
    fn name(&self) -> &'static str;
    fn apply(&self, input: Input) -> CtResult<Output>;
}

That trait says: a morphism is something that transforms an Input into an Output, and the transformation may fail with a typed course error.
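To make the trait shape concrete, here is a small self-contained sketch of one implementation. The CtResult alias below is an assumption (a plain Result with a String error standing in for the project's typed course error), and Lookup is a hypothetical morphism invented for this example, not a type from src/ml.rs.

```rust
// Assumption: a stand-in alias; the project's CtResult uses its own error type.
type CtResult<T> = Result<T, String>;

// Same shape as the trait quoted above.
trait Morphism<Input, Output> {
    fn name(&self) -> &'static str;
    fn apply(&self, input: Input) -> CtResult<Output>;
}

// Hypothetical morphism: looks up a row in a small table and fails on a bad index.
struct Lookup {
    rows: Vec<Vec<f32>>,
}

impl Morphism<usize, Vec<f32>> for Lookup {
    fn name(&self) -> &'static str {
        "lookup"
    }

    fn apply(&self, input: usize) -> CtResult<Vec<f32>> {
        self.rows
            .get(input)
            .cloned()
            .ok_or_else(|| format!("{}: index {input} out of range", self.name()))
    }
}

fn main() {
    let lookup = Lookup {
        rows: vec![vec![0.0, 1.0], vec![2.0, 3.0]],
    };
    // A valid index yields the stored row.
    assert_eq!(lookup.apply(1), Ok(vec![2.0, 3.0]));
    // An invalid index surfaces as an error value instead of a panic.
    assert!(lookup.apply(9).is_err());
}
```

The point of the sketch is the return type: because apply returns a result, a caller can chain steps with the ? operator and let the first failure stop the pipeline.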

src/ml.rs makes the arrows concrete. It implements dataset windowing, embedding lookup, linear projection, softmax, and cross entropy. This is where the abstract phrase “typed transformation” becomes a tiny learning pipeline.
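The last two arrows in that list can be sketched on raw slices. This is a minimal illustration of the standard softmax and single-example cross-entropy formulas, assuming plain f32 slices; the function names and signatures are hypothetical and are not the project's Softmax or CrossEntropy types.

```rust
// Softmax: exponentiate the shifted scores and normalize so they sum to 1.
// Subtracting the maximum keeps the exponentials from overflowing.
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

// Cross entropy for one example: negative log probability of the target index.
// Returns None when the target index is outside the distribution.
fn cross_entropy(distribution: &[f32], target: usize) -> Option<f32> {
    distribution.get(target).map(|p| -p.ln())
}

fn main() {
    // Two equal scores give a uniform distribution over two tokens.
    let p = softmax(&[0.0, 0.0]);
    assert!((p[0] - 0.5).abs() < 1e-6);
    assert!((p[1] - 0.5).abs() < 1e-6);

    // The loss for a probability of 0.5 is ln(2).
    let loss = cross_entropy(&p, 1).expect("target index is in range");
    assert!((loss - std::f32::consts::LN_2).abs() < 1e-6);
}
```

Note that cross_entropy needs two inputs, the distribution and the target. That is the reason the book models loss as an arrow out of a product object.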

src/training.rs defines the update step:

TrainStep : Parameters -> Parameters

Because the output type is the same as the input type, the update can be repeated:

Parameters0 -> Parameters1 -> Parameters2 -> ... -> ParametersN

That is why the training chapter teaches one optimizer step as an endomorphism. It is a transformation from a type back to itself.
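The repetition can be sketched with a generic helper. The function apply_n_times below is a hypothetical stand-in written for this illustration; the project's apply_endomorphism_n_times works on its own Morphism and StepCount types, which this sketch does not reproduce.

```rust
// Apply a same-type step n times: state_0 -> state_1 -> ... -> state_n.
// This only type-checks because the step's output type equals its input type.
fn apply_n_times<T>(step: impl Fn(T) -> T, mut state: T, n: usize) -> T {
    for _ in 0..n {
        state = step(state);
    }
    state
}

fn main() {
    // Toy "training": gradient descent on f(w) = w^2 with learning rate 0.25.
    // The gradient is 2w, so each step maps w to w - 0.25 * 2w = 0.5 * w.
    let step = |w: f64| w - 0.25 * 2.0 * w;

    // 8 -> 4 -> 2 -> 1
    let w = apply_n_times(step, 8.0, 3);
    assert_eq!(w, 1.0);
}
```

A step of type A -> B with A different from B could be applied once but not fed back into itself, which is why the endomorphism shape matters for a training loop.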

src/structure.rs gives names to reusable patterns that appear after the pipeline works: mapping inside a container, converting one wrapper shape to another, and combining traces with an identity value. These ideas are useful because real systems accumulate logs, batches, optional results, gradients, and workflow traces.
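The three patterns can be previewed with standard-library types alone. In this sketch, Trace and first are hypothetical names invented for the illustration; they are not the PipelineTrace, VecFunctor, or OptionFunctor types from src/structure.rs.

```rust
// Hypothetical trace type: an ordered list of step names.
#[derive(Debug, Clone, PartialEq)]
struct Trace(Vec<&'static str>);

impl Trace {
    // Identity element: the empty trace.
    fn empty() -> Self {
        Trace(Vec::new())
    }

    // Associative combine: concatenate step names in order.
    fn combine(&self, other: &Trace) -> Trace {
        let mut steps = self.0.clone();
        steps.extend(other.0.iter().copied());
        Trace(steps)
    }
}

// Converting one wrapper shape to another: Vec<A> -> Option<A>.
fn first<A: Clone>(items: &[A]) -> Option<A> {
    items.first().cloned()
}

fn main() {
    // Mapping inside a container leaves the container shape alone.
    let squares: Vec<i32> = vec![1, 2, 3].into_iter().map(|x| x * x).collect();
    assert_eq!(squares, vec![1, 4, 9]);
    assert_eq!(Some(7).map(|x| x + 1), Some(8));

    // Naturality on one sample: map-then-first equals first-then-map.
    let xs = vec![1, 2, 3];
    let mapped: Vec<i32> = xs.iter().map(|x| x * 10).collect();
    assert_eq!(first(&mapped), first(&xs).map(|x| x * 10));

    // Monoid laws on sample traces: associativity and identity.
    let a = Trace(vec!["embedding"]);
    let b = Trace(vec!["linear"]);
    let c = Trace(vec!["softmax"]);
    assert_eq!(a.combine(&b).combine(&c), a.combine(&b.combine(&c)));
    assert_eq!(Trace::empty().combine(&a), a);
    assert_eq!(a.combine(&Trace::empty()), a);
}
```

Checking a law on a few samples is not a proof, but it is the same kind of executable evidence the book's law-checking tests rely on.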

src/calculus.rs keeps backpropagation deliberately small. It shows the local rule for:

z = x * y
dL/dx = dL/dz * y
dL/dy = dL/dz * x

This is not a full automatic-differentiation engine. It is the smallest local chain-rule shape the later training story can point at.
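The rule above can be written out directly on f64 values. This is a minimal sketch; mul_forward and mul_backward are hypothetical free functions, whereas the project wraps the same idea in its MulOp, Scalar, and LocalGradient types.

```rust
// Forward pass: z = x * y.
fn mul_forward(x: f64, y: f64) -> f64 {
    x * y
}

// Backward pass: given the upstream gradient dL/dz, return (dL/dx, dL/dy).
// dL/dx = dL/dz * y and dL/dy = dL/dz * x.
fn mul_backward(x: f64, y: f64, upstream: f64) -> (f64, f64) {
    (upstream * y, upstream * x)
}

fn main() {
    let (x, y) = (2.0, 3.0);
    assert_eq!(mul_forward(x, y), 6.0);

    // With dL/dz = 1, each input's gradient is the other input.
    assert_eq!(mul_backward(x, y, 1.0), (3.0, 2.0));
}
```

The inputs and the upstream gradient match the values used in section 11 of the demo, so the printed gradients there should be 3 and 2.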

src/sketches.rs connects the tutorial to applied category theory beyond the tiny ML pipeline. It models orders, resources, databases, co-design, signal flow, circuits, and behavior logic as typed Rust values with law-checking tests.

Guided Walkthrough Snapshot

The terminal demo is the spine of the book. It gives a learner one command that uses every major idea in a concrete order.

Source snapshot: src/demo.rs
use crate::calculus::{LocalGradient, MulOp, Scalar};
use crate::category::{Compose, StepCount, apply_endomorphism_n_times};
use crate::domain::{
    LearningRate, Logits, ModelDimension, Parameters, Product, TokenId, TokenSequence, Vector,
    VocabSize,
};
use crate::error::CtResult;
use crate::ml::{
    CrossEntropy, DatasetWindowing, Embedding, LinearToLogits, Softmax, average_loss,
    composed_prediction_matches_direct_prediction,
};
use crate::structure::{
    Functor, Monoid, OptionFunctor, PipelineTrace, TraceStep, VecFunctor,
    monoid_laws_hold_for_pipeline_trace, naturality_square_holds_for_first_option,
};
use crate::training::TrainStep;
use crate::{Identity, Morphism};

/// Run the full terminal walkthrough used by `cargo run --bin category_ml`.
pub fn run_demo() -> CtResult<()> {
    println!("Category theory concepts implemented in Rust 2024");
    println!("=================================================\n");

    let vocab = ["<pad>", "I", "love", "Rust", "."];
    let raw_text = TokenSequence::from_indices([1, 2, 3, 4, 1, 2, 3, 4])?;

    println!("1. Object examples");
    println!("   TokenId(1) means {:?}\n", vocab[1]);

    println!("2. Dataset morphism: TokenSequence -> TrainingSet");
    let dataset = DatasetWindowing.apply(raw_text)?;
    for example in dataset.examples() {
        println!(
            "   {:?} -> {:?}",
            vocab[example.first().index()],
            vocab[example.second().index()]
        );
    }
    println!();

    println!("3. Identity morphism: id_Vector : Vector -> Vector");
    let v = Vector::new(vec![1.0, 2.0, 3.0]);
    let same_v = Identity::<Vector>::new().apply(v.clone())?;
    println!("   input  = {:?}", v);
    println!("   output = {:?}\n", same_v);

    println!("4. Composition: Softmax after Linear after Embedding");
    let params = Parameters::init(VocabSize::new(vocab.len())?, ModelDimension::new(4)?);
    let embedding = Embedding::from_parameters(&params);
    let linear = LinearToLogits::from_parameters(&params);
    let token_to_logits = Compose::<_, _, Vector>::new(embedding, linear);
    let token_to_distribution = Compose::<_, _, Logits>::new(token_to_logits, Softmax);
    let distribution = token_to_distribution.apply(TokenId::new(1))?;
    println!("   P(next token | 'I') = {:?}\n", distribution.as_slice());

    println!("5. Product object: Prediction x Target -> Loss");
    let loss = CrossEntropy.apply(Product::new(distribution, TokenId::new(2)))?;
    println!("   loss for target 'love' = {:.6}\n", loss.value());

    println!("6. Endomorphism: TrainStep : Parameters -> Parameters");
    let before = average_loss(&params, &dataset)?;
    let train_step = TrainStep::new(dataset.clone(), LearningRate::new(1.0)?);
    let trained_params =
        apply_endomorphism_n_times(&train_step, params.clone(), StepCount::new(80))?;
    let after = average_loss(&trained_params, &dataset)?;
    println!("   average loss before training = {:.6}", before.value());
    println!("   average loss after  training = {:.6}\n", after.value());

    println!("7. Functor: fmap over Vec and Option");
    let xs = vec![1, 2, 3];
    let ys = VecFunctor::fmap(xs, |x| x * x);
    let maybe = OptionFunctor::fmap(Some(7), |x| x + 1);
    println!("   VecFunctor fmap square: {:?}", ys);
    println!("   OptionFunctor fmap +1: {:?}\n", maybe);

    println!("8. Natural transformation: Vec<A> -> Option<A>");
    println!(
        "   naturality square holds: {}\n",
        naturality_square_holds_for_first_option()
    );

    println!("9. Monoid: pipeline traces compose associatively with identity");
    let trace = PipelineTrace::from_steps(vec![TraceStep::new("embedding")])
        .combine(&PipelineTrace::from_steps(vec![TraceStep::new("linear")]))
        .combine(&PipelineTrace::from_steps(vec![TraceStep::new("softmax")]));
    println!("   trace = {:?}", trace.names());
    println!(
        "   monoid laws hold for this trace type: {}\n",
        monoid_laws_hold_for_pipeline_trace()
    );

    println!("10. Commutative diagram check");
    println!(
        "   composed prediction == direct prediction: {}\n",
        composed_prediction_matches_direct_prediction(&params)?
    );

    println!("11. Chain rule / local derivative morphism");
    let mul = MulOp;
    let x = Scalar::new(2.0)?;
    let y = Scalar::new(3.0)?;
    let z = mul.forward(x, y)?;
    let upstream = LocalGradient::new(1.0)?;
    let (dl_dx, dl_dy) = mul.backward(x, y, upstream)?;
    println!("   z = x * y = {}", z.value());
    println!(
        "   if dL/dz = {}, then dL/dx = {}, dL/dy = {}\n",
        upstream.value(),
        dl_dx.value(),
        dl_dy.value()
    );

    println!("Compressed categorical training view:");
    println!("   Dataset x Parameters -> Prediction -> Loss -> Gradients -> Updated Parameters");
    println!("   TrainStep is repeated as Parameters0 -> Parameters1 -> ... -> ParametersN");

    Ok(())
}

How To Read The Demo

The demo output is a miniature course outline.

It starts with an object:

TokenId(1)

Then it applies a data-preparation morphism:

TokenSequence -> TrainingSet

Then it shows identity and composition:

Vector -> Vector
TokenId -> Vector -> Logits -> Distribution

Then it uses a product object to measure loss:

Distribution x TokenId -> Loss

Then it repeats an endomorphism:

Parameters -> Parameters

The later demo sections add functors, naturality, monoids, a commutative diagram check, and a local chain-rule example. By the time you finish the demo, you have seen each major term at least once in executable form.

Demo Output Wayfinding Checklist

After running cargo run --bin category_ml, use the numbered output as a map instead of reading it as one long printout.

| Demo section | Source file to inspect next | Rust reading | ML reading | Category-theory reading |
|---|---|---|---|---|
| 1. Object examples | src/domain.rs | TokenId gives a raw index a domain name | tokens are data, not model state | object |
| 2. Dataset morphism | src/ml.rs | DatasetWindowing turns a sequence into pairs | text becomes supervised examples | morphism |
| 3. Identity morphism | src/category.rs | Identity<Vector> returns the same value | a neutral transformation should not change features | identity law |
| 4. Composition | src/ml.rs and src/category.rs | Compose connects matching output and input types | embedding, logits, and softmax form prediction | composition |
| 5. Product object | src/domain.rs and src/ml.rs | Product<Distribution, TokenId> pairs prediction with target | loss needs both prediction and correct next token | product object |
| 6. Endomorphism | src/training.rs | TrainStep returns Parameters | training updates model state | endomorphism |
| 7-9. Structure patterns | src/structure.rs | traits and tests name reusable operations | batches, options, and traces recur in ML systems | functor, naturality, monoid |
| 10. Commutative diagram check | src/ml.rs | two code paths are compared | direct and composed prediction should agree | commutative diagram |
| 11. Chain rule | src/calculus.rs | MulOp::backward returns local gradients | backprop starts from local derivative rules | chain rule |

This table gives you a safe next action. If the output line is clear, continue reading. If it is not clear, open the source file in the second column and look for the type or function named by the line. The goal is not to memorize the demo. The goal is to use it as a routing table from terminal output to chapter, source file, ML role, and category-theory shape.

Source-Backed Wayfinding Rules

This chapter uses sources to keep the opening map practical. Each source supports one local rule for how a first session should move from command output to source files and then to vocabulary.

| Source | What the source supports | Local rule in this chapter | Repository evidence |
|---|---|---|---|
| How People Learn II | Learning should connect new ideas to prior knowledge and learner context. | Start from a familiar function, then attach Rust, ML, and category-theory names to one visible pipeline. | ## Worked Example: From One Function To A Pipeline, ## The Three Readings |
| Rust Book: Packages, Crates, and Modules | Rust packages organize code into crates and modules with separate responsibilities. | Treat the source tree as the learning map: nouns, arrows, ML arrows, training, structure, calculus, sketches, and demo. | src/domain.rs, src/category.rs, src/ml.rs, src/training.rs, src/demo.rs |
| Rust By Example | Small runnable examples make syntax inspectable before a larger explanation. | Run one command, inspect its output, then route the output line to the matching source file. | cargo run --example 01_token_sequence, cargo run --bin category_ml |
| Seven Sketches | Applied category theory is taught through compositional examples and recurring shapes. | Introduce object, morphism, product, composition, endomorphism, and law as names for shapes already visible in the tiny pipeline. | TokenId -> Vector -> Logits -> Distribution, Distribution x TokenId -> Loss, Parameters -> Parameters |
| Category Theory for Programming | Programming examples can make category-theory vocabulary less detached from code. | Translate from Rust file and function evidence to category vocabulary only after the typed path is visible. | ## Demo Output Wayfinding Checklist, Morphism<Input, Output>, Compose<F, G, Middle> |

The transfer pattern is:

source rule -> route through files -> command/output evidence

For this chapter, that means using cargo run --bin category_ml as more than a demo. Treat it as a table of contents whose output routes you to src/domain.rs, src/category.rs, src/ml.rs, src/training.rs, src/structure.rs, src/calculus.rs, and src/sketches.rs.

The table is not evidence that the book has solved every learner’s route through the material. It is evidence that the first-session map is grounded in named sources, real files, and executable output.

Binary Entrypoint

The binary entrypoint is deliberately tiny:

Source snapshot: src/bin/category_ml.rs
fn main() -> category_theory_transformer_rs::CtResult<()> {
    category_theory_transformer_rs::run_demo()
}

The whole file delegates to the library walkthrough, run_demo.

The binary returns CtResult, so fallible work can propagate through Rust’s ordinary Result path. The binary stays short because this book is teaching the typed pipeline, not command-line interface design.

First Run

Start with the smallest visible pipeline:

cargo run --example 01_token_sequence

That command turns text into token IDs and next-token training pairs before any model weights appear.
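The idea behind that first step can be sketched without any project types. The tokenize and adjacent_pairs functions below are hypothetical and use a whitespace split over the same five-entry vocabulary as the demo; the actual example file may tokenize differently.

```rust
// Toy vocabulary lookup: returns None if any word is outside the vocabulary.
fn tokenize(text: &str, vocab: &[&str]) -> Option<Vec<usize>> {
    text.split_whitespace()
        .map(|word| vocab.iter().position(|entry| *entry == word))
        .collect()
}

// Each adjacent pair (current, next) becomes one next-token training example.
fn adjacent_pairs(tokens: &[usize]) -> Vec<(usize, usize)> {
    tokens.windows(2).map(|w| (w[0], w[1])).collect()
}

fn main() {
    let vocab = ["<pad>", "I", "love", "Rust", "."];

    let tokens = tokenize("I love Rust .", &vocab).expect("all words are in the vocabulary");
    assert_eq!(tokens, vec![1, 2, 3, 4]);

    // Four tokens give three (input, target) pairs.
    assert_eq!(adjacent_pairs(&tokens), vec![(1, 2), (2, 3), (3, 4)]);
}
```

No model weights appear anywhere in the sketch, which matches the point of running this step first: the data shape is settled before any parameters exist.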

Then run the full guided demo:

cargo run --bin category_ml

The exact floating-point values are less important than the shape. You should see a loss before training, a lower loss after repeated training, and the same typed pipeline used throughout the walkthrough.

Core Mental Model

Every chapter after this one zooms into one row of the map.

An object is a typed thing the program can talk about precisely.

A morphism is a typed transformation from one object to another.

Composition is a legal connection between transformations, where the output type of one step matches the input type of the next.

An endomorphism is a transformation from a type back to itself.

A law is a property the code checks so the reader can trust the shape, not only the example output.
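These definitions can be exercised on plain closures before meeting the project's trait-based versions. The compose and identity helpers below are hypothetical and written only for this sketch; they are not the Compose and Identity types from src/category.rs.

```rust
// Compose two steps: run `first`, then feed its output to `second`.
// The shared type B is what makes the connection legal.
fn compose<A, B, C>(first: impl Fn(A) -> B, second: impl Fn(B) -> C) -> impl Fn(A) -> C {
    move |a| second(first(a))
}

// The identity transformation returns its input unchanged.
fn identity<T>(value: T) -> T {
    value
}

fn main() {
    let double = |x: i32| x * 2;
    let describe = |x: i32| format!("value={x}");

    // i32 -> i32 -> String: the middle types line up, so this compiles.
    let pipeline = compose(double, describe);
    assert_eq!(pipeline(4), "value=8");

    // Identity law, checked on a few samples: composing with identity changes nothing.
    for x in [-3, 0, 7] {
        assert_eq!(compose(identity, double)(x), double(x));
        assert_eq!(compose(double, identity)(x), double(x));
    }
}
```

Swapping the arguments to compose(describe, double) would fail to compile, because a String output cannot feed an i32 input. That compile error is the type system enforcing the composition rule.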

Checkpoint

Explain this line in your own words:

TokenId -> Vector -> Logits -> Distribution

A strong answer should mention token lookup, the embedding vector, vocabulary scores, the probability distribution, and the fact that the whole path is a composition of typed morphisms.

Where This Leaves Us

This chapter gave the whole shape before the details. You now know the names of the source files, the major pipeline objects, and the difference between objects, morphisms, composition, endomorphisms, and laws.

The next chapter, Domain Objects, slows down and studies the objects themselves. Before a pipeline can compose arrows safely, it needs values whose meanings are clear enough for arrows to start and end at them.

Further Reading

Do not leave this chapter with only a list of links. Use the next sources to practice the map.

Start from this local evidence:

cargo run --example 01_token_sequence
cargo run --bin category_ml
src/domain.rs
src/category.rs
src/ml.rs
src/training.rs

Then read the sources in this order:

| Source | What to transfer back into this chapter | Local evidence to inspect |
|---|---|---|
| How People Learn II | New ideas should connect to prior knowledge, context, and visible learner activity. | ## What You Already Know, ## Demo Output Wayfinding Checklist |
| Rust Book: Packages, Crates, and Modules | A Rust package can expose a library crate, binary crates, and modules with separate responsibilities. | src/bin/category_ml.rs, src/lib.rs, src/domain.rs, src/category.rs |
| Rust By Example | Small runnable examples make syntax inspectable before a larger explanation. | examples/01_token_sequence.rs, examples/02_morphism_composition.rs |
| Seven Sketches | Applied category theory can be introduced through concrete examples before abstraction. | TokenId -> Vector -> Logits -> Distribution |
| Category Theory for Programming | Programming-shaped examples can keep category vocabulary attached to code. | Morphism<Input, Output>, Compose<F, G, Middle> |

After reading one external source, ask four questions:

  1. Which command output line did it make easier to place?
  2. Which source file should you inspect next?
  3. Which category word did it clarify?
  4. Which later chapter should you read after this map?

For this chapter, the commands are:

cargo run --example 01_token_sequence
cargo run --bin category_ml

For terminology recovery, use the Glossary entries for object, morphism, composition, and endomorphism. For source depth, use References and the Seven Sketches Through Rust companion chapter after you can already route the first demo output.

If an external source does not help you connect one terminal line to one source file and one category-theory word, it has not transferred back into the map yet.

Practice After This Chapter

Use Exercise 2 to change the demo input and Exercise 8 to connect one source file back to the course map. Use the demo-output wayfinding checklist above to decide which file to inspect. Those exercises check whether the map is active knowledge rather than only a diagram you read once.

Retrieval Practice

Recall

Name the three readings used throughout the book.

Explain

Why does the book start with a whole-pipeline map before reading individual source files?

Apply

Write a one-line diagram for a pipeline you already know, then label the input object, arrow, and output object.