Exercises

The problem this chapter solves is:

Reading detailed explanations is not enough. You need to practice explaining the code yourself through three lenses: the Rust syntax, the ML concept, and the category-theory concept.

The exercises are deliberately small. A strong answer is not a long essay; it is a precise explanation that connects a line of Rust to the value it protects, the ML step it supports, and the categorical shape it names. When an exercise asks you to edit code, make the smallest change, run the command, and then explain what changed.

For every exercise, use this answer shape:

Rust syntax:
...

ML concept:
...

Category theory concept:
...

The point is not to write long answers.

The point is to connect the same block of code across all three meanings.

Before starting, make sure the basic Rust feedback loop works:

cargo test --all-targets --all-features

That command is part of the learning method. It proves that the examples in the book are not only explanatory text; they are tied to code that the compiler can check.

Worked Example

First study a complete answer. The exercise is:

Explain why TokenId is not a raw usize.

A strong answer:

Rust syntax:
TokenId is a tuple struct around usize. The field is private, so callers use
TokenId::new and index() instead of reaching into the raw value directly.

ML concept:
The number represents a vocabulary position, not an arbitrary count or shape.

Category theory concept:
TokenId is one object in the small category of typed pipeline values. Morphisms
such as Embedding can start from it.

Notice the order: name the syntax, connect it to the ML role, then name only the categorical shape the code supports.

Partially Completed Example

Complete the missing lines for Distribution:

Rust syntax:
Distribution wraps ________ and construction can return ________.

ML concept:
It represents probabilities over possible next tokens, so the values must be
non-negative and sum to ________.

Category theory concept:
It is an object produced by ________ and consumed with a target token by
________.

Expected completion:

Vec<f32>
CtResult<Self>
one
Softmax
CrossEntropy
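
The completed Rust and ML lines can be checked against a small sketch. Everything named here is an assumption made for illustration: `LocalDistribution` stands in for the crate's Distribution, `Result<Self, String>` stands in for CtResult<Self>, and the `1e-4` tolerance is an arbitrary choice. The real constructor may check more or less than this.

```rust
/// Hypothetical stand-in for the crate's Distribution type.
#[derive(Debug, Clone, PartialEq)]
struct LocalDistribution(Vec<f32>);

impl LocalDistribution {
    /// Assumed invariant: non-empty, every value finite and non-negative,
    /// and the values sum to one within a small tolerance.
    fn new(probabilities: Vec<f32>) -> Result<Self, String> {
        if probabilities.is_empty() {
            return Err("distribution must not be empty".to_string());
        }
        if probabilities.iter().any(|p| !p.is_finite() || *p < 0.0) {
            return Err("probabilities must be finite and non-negative".to_string());
        }
        let total: f32 = probabilities.iter().sum();
        if (total - 1.0).abs() > 1e-4 {
            return Err(format!("probabilities must sum to one, got {total}"));
        }
        Ok(Self(probabilities))
    }

    /// Read-only access to the validated values.
    fn probabilities(&self) -> &[f32] {
        &self.0
    }
}

fn main() {
    let valid = LocalDistribution::new(vec![0.25, 0.75]).expect("sums to one");
    assert_eq!(valid.probabilities().len(), 2);

    // Sum is 1.25, so construction is refused.
    assert!(LocalDistribution::new(vec![0.5, 0.75]).is_err());
    // Sum is one, but a negative entry is not a probability.
    assert!(LocalDistribution::new(vec![-0.5, 1.5]).is_err());
}
```

Because the field is private to the type, any value of this type that reaches a later stage has already passed these checks.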

Your Turn

Now solve the same kind of exercise without the filled answer. Pick Loss, TrainingSet, or LearningRate and explain it through the same three lenses.

Transfer Exercise

Design a new wrapper type for a future Transformer chapter, such as SequenceLength, HeadCount, or AttentionScore. State the raw representation, the invariant, and one function that should consume or produce it.

Exercise 1: Explain One Domain Type

Use Domain Objects.

Pick one type:

  • Vector
  • Logits
  • Distribution
  • Loss
  • TrainingSet
  • Parameters

Write:

The problem this solves:

Rust syntax:

ML concept:

Category theory concept:

Pass condition:

  • You name the raw representation.
  • You name the invariant or semantic distinction.
  • You name the pipeline stage where the type appears.

First-principles hint:

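#![allow(unused)]
fn main() {
// A tuple struct: a named wrapper around a single usize.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LocalTokenId(usize);

impl LocalTokenId {
    // Constructor: the one place where a raw usize becomes a token id.
    fn new(index: usize) -> Self {
        Self(index)
    }

    // Accessor: returns the raw index without exposing the field.
    fn index(self) -> usize {
        self.0
    }
}

// Round trip: wrapping and then unwrapping preserves the value.
assert_eq!(LocalTokenId::new(7).index(), 7);
}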

That snippet is intentionally smaller than the real crate. It shows the raw idea: a named wrapper can make one usize mean “token id” instead of “any number.”

Exercise 2: Add A Token

Use the src/demo.rs snapshot in Course Map.

Add one new vocabulary item and extend the token sequence so that it uses the new token.

Run:

cargo run --bin category_ml

Pass condition:

  • the demo still runs
  • the dataset windowing output includes your new transition
  • you can explain why a longer TokenSequence creates more training examples

Exercise 3: Trace DatasetWindowing

Use The Tiny ML Pipeline.

For this input:

[TokenId(4), TokenId(8), TokenId(15), TokenId(16)]

write the training examples produced by windows(2).

Then explain:

Rust syntax:
what does `.windows(2)` do?

ML concept:
why does next-token training need adjacent pairs?

Category theory concept:
why is each example a product object?
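
If `.windows(2)` is new to you, try it first on a different, shorter input. The sketch below uses the standard library's slice method `windows`. `LocalTokenId` and `adjacent_pairs` are hypothetical names for this snippet, not the crate's DatasetWindowing code.

```rust
/// Hypothetical local wrapper, smaller than the crate's TokenId.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LocalTokenId(usize);

/// Pair each token with the token that follows it.
fn adjacent_pairs(tokens: &[LocalTokenId]) -> Vec<(LocalTokenId, LocalTokenId)> {
    tokens
        .windows(2) // overlapping sub-slices of length two
        .map(|window| (window[0], window[1]))
        .collect()
}

fn main() {
    let tokens = [LocalTokenId(1), LocalTokenId(2), LocalTokenId(3)];
    let pairs = adjacent_pairs(&tokens);

    assert_eq!(
        pairs,
        vec![
            (LocalTokenId(1), LocalTokenId(2)),
            (LocalTokenId(2), LocalTokenId(3)),
        ]
    );

    // Three tokens give two pairs: one fewer than the number of tokens.
    assert_eq!(pairs.len(), tokens.len() - 1);

    // A single token has no neighbour, so there is nothing to pair.
    assert!(adjacent_pairs(&tokens[..1]).is_empty());
}
```

The windows overlap: the second element of one pair is the first element of the next.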

Exercise 4: Break A Composition

Use the examples/02_morphism_composition.rs snapshot in Morphism and Composition.

Try to compose Embedding directly with Softmax.

Expected failure shape:

the trait bound ... is not satisfied

Then restore the working version.

Explain:

Rust syntax:
which type did the compiler reject?

ML concept:
which prediction stage was skipped?

Category theory concept:
which middle object failed to match?
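
To see the same idea outside the crate, here is a minimal sketch of typed composition using neutral stand-in types. `Start`, `Middle`, `End`, and `compose` are all hypothetical. The crate most likely expresses composition through its own traits, so the compiler message you see there will be worded differently from the one this sketch produces.

```rust
// Three distinct stage types, deliberately not the crate's own.
struct Start(u32);
struct Middle(u32);
struct End(u32);

/// Compose two steps. The output type of `first` must equal the
/// input type of `second`: both are the same type parameter `B`.
fn compose<A, B, C>(
    first: impl Fn(A) -> B,
    second: impl Fn(B) -> C,
) -> impl Fn(A) -> C {
    move |input| second(first(input))
}

fn first_step(input: Start) -> Middle {
    Middle(input.0 + 1)
}

fn second_step(input: Middle) -> End {
    End(input.0 * 2)
}

fn main() {
    // Start -> Middle -> End: the middle types agree, so this compiles.
    let pipeline = compose(first_step, second_step);
    assert_eq!(pipeline(Start(3)).0, 8);

    // Uncommenting the next line is rejected at compile time:
    // `second_step` returns End, but `first_step` expects Start.
    // let broken = compose(second_step, first_step);
}
```

The mismatch is caught before anything runs, which is the behavior the exercise asks you to observe in the real example.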

Exercise 5: Change The Training Repetition Count

Use the examples/03_training_endomorphism.rs snapshot in Training as an Endomorphism.

Change:

StepCount::new(80)

to each of the following values in turn:

StepCount::new(1)
StepCount::new(10)
StepCount::new(200)

After each change, run:

cargo run --example 03_training_endomorphism

Explain the result:

Rust syntax:
where is the count used?

ML concept:
what happens when training repeats more times?

Category theory concept:
why can the update be repeated?
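
A toy version of repetition may help before you answer. This sketch assumes a one-field `LocalParameters` and an `update` rule that moves the weight halfway toward 1.0. Both are invented for this snippet and are much simpler than the training step in the example file. Real training does not have to improve on every step the way this toy does.

```rust
/// Hypothetical one-field parameter set.
#[derive(Debug, Clone, Copy, PartialEq)]
struct LocalParameters {
    weight: f32,
}

/// One toy update: move the weight halfway toward a target of 1.0.
/// The input and output types are the same.
fn update(parameters: LocalParameters) -> LocalParameters {
    LocalParameters {
        weight: parameters.weight + 0.5 * (1.0 - parameters.weight),
    }
}

/// Apply `step` `count` times, feeding each output back in as the next input.
fn repeat(
    step: impl Fn(LocalParameters) -> LocalParameters,
    count: usize,
    start: LocalParameters,
) -> LocalParameters {
    (0..count).fold(start, |current, _| step(current))
}

fn main() {
    let start = LocalParameters { weight: 0.0 };

    // Zero repetitions leave the parameters unchanged.
    assert_eq!(repeat(update, 0, start), start);
    assert_eq!(repeat(update, 1, start).weight, 0.5);
    assert_eq!(repeat(update, 2, start).weight, 0.75);

    // In this toy, more repetitions end closer to the target.
    let distance = |p: LocalParameters| (1.0 - p.weight).abs();
    assert!(distance(repeat(update, 10, start)) < distance(repeat(update, 1, start)));
}
```

`repeat` type-checks only because `update` returns the same type it accepts.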

Exercise 6: Explain Distribution<T>::map

Use Functors, Naturality, Monoids, and Chain Rule.

Explain the conceptual Distribution<T>::map example.

Use this input distribution:

TokenId(2) -> 0.70
TokenId(3) -> 0.30

and this function:

TokenId -> String

where:

TokenId(2) -> "Rust"
TokenId(3) -> "."

Write the output distribution.

Then explain:

Rust syntax:
why does `self` plus `into_iter()` move the old outcomes?

ML concept:
why do the probabilities stay the same?

Category theory concept:
what does it mean to lift `T -> U` into `Distribution<T> -> Distribution<U>`?
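
If you want something to run while you think, here is a minimal sketch of a `map` with that shape, applied to a different input from the one above. `LocalDistribution<T>` and its `outcomes` field are assumptions for this snippet. The book's conceptual example may store its outcomes differently.

```rust
/// Hypothetical distribution: a list of (outcome, probability) pairs.
#[derive(Debug, Clone, PartialEq)]
struct LocalDistribution<T> {
    outcomes: Vec<(T, f32)>,
}

impl<T> LocalDistribution<T> {
    /// Relabel every outcome with `f`. Taking `self` by value and calling
    /// `into_iter()` moves each outcome out of the old distribution.
    fn map<U>(self, f: impl Fn(T) -> U) -> LocalDistribution<U> {
        LocalDistribution {
            outcomes: self
                .outcomes
                .into_iter()
                .map(|(outcome, probability)| (f(outcome), probability))
                .collect(),
        }
    }
}

fn main() {
    let coin = LocalDistribution {
        outcomes: vec![(0u8, 0.25), (1u8, 0.75)],
    };

    // `coin` is moved into `map` and cannot be used afterwards.
    let labelled = coin.map(|side| {
        if side == 0 {
            "heads".to_string()
        } else {
            "tails".to_string()
        }
    });

    assert_eq!(
        labelled.outcomes,
        vec![("heads".to_string(), 0.25), ("tails".to_string(), 0.75)]
    );
}
```

Only the first element of each pair passes through `f`. The second element is copied across unchanged.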

Exercise 7: Explain One Validation Boundary

Pick one constructor:

  • Distribution::new
  • Loss::new
  • LearningRate::new
  • TrainingSet::new
  • SignalMatrix::new
  • OpenCircuit::new

Write:

The problem this solves:

Rust syntax:
which condition returns `Err(...)`?

ML or software concept:
what bad runtime behavior does this prevent?

Category theory concept:
what intended object or relationship is being protected?
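
For a model answer that does not give away any constructor in the list, here is a sketch of a validation boundary for a hypothetical `LocalHeadCount`, borrowing one of the names suggested in the Transfer Exercise. The invariant (at least one head), the `String` error, and the `head_width` function are all assumptions for illustration.

```rust
/// Hypothetical wrapper: the number of attention heads.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LocalHeadCount(usize);

impl LocalHeadCount {
    /// The validation boundary: zero heads is rejected here, once.
    fn new(count: usize) -> Result<Self, String> {
        if count == 0 {
            return Err("head count must be at least one".to_string());
        }
        Ok(Self(count))
    }

    fn get(self) -> usize {
        self.0
    }
}

/// Consumer: the division cannot hit zero, because a zero
/// LocalHeadCount can never be constructed through `new`.
fn head_width(model_width: usize, heads: LocalHeadCount) -> usize {
    model_width / heads.get()
}

fn main() {
    let heads = LocalHeadCount::new(4).expect("four is non-zero");
    assert_eq!(head_width(64, heads), 16);

    // The bad value is stopped at construction, not at the division.
    assert!(LocalHeadCount::new(0).is_err());
}
```

The three-part answer follows the same pattern: the `Err` branch is the Rust syntax, the prevented division by zero is the runtime behavior, and the guarantee that every `LocalHeadCount` is usable by `head_width` is the relationship being protected.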

Exercise 8: Trace A Full Source File

Use Repository Source Snapshots.

Pick one complete source file and write a five-sentence summary:

  1. What problem does the file solve?
  2. What are the main Rust types or traits?
  3. What ML or software concept does it model?
  4. What category-theory concept does it teach?
  5. Which command proves the file still works?

Exercise 9: Connect One External Reference

Use References.

Pick one external resource and connect it to one source file in this course.

Answer:

External resource:
Source file:
Rust syntax connection:
ML or software concept connection:
Category theory concept connection:
One difference between the full treatment and this tiny implementation:

Exercise 10: Test One Sketch Law

Use Seven Sketches Through Rust.

Pick one law from src/sketches.rs:

  • preorder laws
  • feature/layer Galois law
  • resource monotonicity
  • foreign-key resolution
  • signal-flow matrix composition
  • local-to-global safety truth

Change one input in examples/05_seven_sketches.rs, then run:

cargo run --example 05_seven_sketches

Pass condition:

  • you can explain which law still holds
  • you can explain which constructor or method prevents invalid structure
  • your explanation uses Rust syntax, ML or software concept, and category theory concept
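
If you choose the preorder laws, it can help to see what testing a law means on a small scale. The sketch below checks reflexivity and transitivity by brute force over a short list. The helper names and the sample relations are assumptions for this snippet and are not taken from src/sketches.rs, which may organize its checks quite differently.

```rust
/// Reflexivity: every item is related to itself.
fn is_reflexive<T: Copy>(items: &[T], leq: impl Fn(T, T) -> bool) -> bool {
    items.iter().all(|&a| leq(a, a))
}

/// Transitivity: whenever a <= b and b <= c, also a <= c.
fn is_transitive<T: Copy>(items: &[T], leq: impl Fn(T, T) -> bool) -> bool {
    items.iter().all(|&a| {
        items.iter().all(|&b| {
            items
                .iter()
                .all(|&c| !(leq(a, b) && leq(b, c)) || leq(a, c))
        })
    })
}

fn main() {
    let items = [1u32, 2, 3, 4, 6, 12];

    // "a divides b" satisfies both laws on these items.
    let divides = |a: u32, b: u32| b % a == 0;
    assert!(is_reflexive(&items, divides));
    assert!(is_transitive(&items, divides));

    // Strict less-than fails reflexivity: 1 < 1 is false.
    let strictly_less = |a: u32, b: u32| a < b;
    assert!(!is_reflexive(&items, strictly_less));

    // "Differs by at most one" fails transitivity: 1~2 and 2~3, but not 1~3.
    let near = |a: u32, b: u32| a.abs_diff(b) <= 1;
    assert!(!is_transitive(&items, near));
}
```

A brute-force check like this only covers the items you list. It can show that a law fails, but it cannot prove that a law holds for every possible input.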

Exercise 11: Write A New Block Explanation

Choose any block from the source snapshots that the chapter did not explain in enough detail for you.

Write a block explanation using this structure:

The problem this block solves:

The whole block:

Rust syntax:

ML or software concept:

Category theory concept:

Core mental model:

Pass condition:

  • A beginner can understand the Rust syntax.
  • An ML learner can understand why the block exists.
  • A category-theory learner can name the shape.

Where This Leaves Us

If you can complete these exercises, you can read the project without treating category theory, Rust, and ML as three disconnected subjects. You can start from a line of code, name the syntax, identify the software or ML role, and then describe the categorical shape only as far as the code justifies it.