Rust: the good, the bad, and the ugly

Geert Jan Bex

Motivation

Central question: is Rust a programming language you want to use for scientific computing and/or data analysis?

Rust: the good

Features:

Memory safety without garbage collection
Concurrency without data races
Zero-cost abstractions
Modern tooling and ecosystem(?)
Performance comparable to C/C++
Strong type system and pattern matching

Rust: the bad

Challenges:

Steep learning curve
Limited libraries and ecosystem for scientific computing
Fewer community resources and less support for data science
Slower compile times

Rust: the ugly

Overhyped? Judge for yourselves

Introduction

Course Arc

Start with the project workflow
Build the core Rust mental model
Move from syntax to data and error handling
Finish with scientific-computing examples

Audience

Programmers who are new to Rust
Scientists and engineers assessing Rust
Learners who need practical examples, not language trivia
People who will read compiler diagnostics often

Working Style

Short concept checkpoints
Terminal work for longer code
Small Cargo projects for practice
Scientific-computing relevance throughout

The Through-Line

Explicit types and conversions
Ownership before larger data structures
Traits and iterators for reusable code
Result, tests, seeds, and reproducible runs
Integrated numerical programs at the end

Modules 1-4: Foundations

Getting Started With Rust Projects
Scalar Computation And Numeric Basics
Control Flow And Program Structure
Ownership, Borrowing, And Mutation

Modules 5-9: Building Programs

Data Modeling With Structs And Methods
Reusable Abstractions With Traits
Collections, Iterators, And Text Data
Error Handling
Project Organization, Libraries, And Tests

Modules 10-13: Scientific Workflows

Randomness And Reproducible Runs
Data Parallelism With Rayon
Integrated Numerical Example: Julia Set
Integrated Numerical Example: N-Body Simulation

Example Workflow

cd source-code/hello-world
cargo run
cargo check

Enter one example project
Run the code before changing it
Make one small change
Use compiler output as feedback

Shorter Course Path

Use Modules 1-4 for the Rust core
Add Modules 7-8 for practical data and errors
Use the Julia set as the main integrated example
Leave N-body as follow-up reading or an optional lab

Habits To Build

cargo check as the edit-check loop
Compiler diagnostics as feedback
Explicitness around types, errors, and configuration
Small examples first, integrated examples later
Reproducible commands and inputs

First Hands-On Module

Open the repository structure
Run the smallest Rust example
Inspect Cargo.toml and src/main.rs
Add one dependency-backed command-line tool

Module 1: Getting Started With Rust Projects

Module Arc

Rust projects are Cargo projects
Start from a tiny binary
Build the edit-check-run loop
Add a dependency-backed command-line interface

By The End

Recognize Cargo.toml, Cargo.lock, and src/main.rs
Run cargo check, cargo build, and cargo run
Read a simple compiler diagnostic
Add and use a crate dependency
Move from “hello world” to a small CLI

Project Anatomy

project-name/
├── Cargo.toml
└── src/
    └── main.rs

Cargo.toml: manifest and dependencies
src/main.rs: source entry point
Cargo.lock: lockfile for selected versions, created by Cargo on the first build

Terminal: Check The Toolchain

rustc --version
cargo --version

rustc is the compiler
cargo is the project workflow
Most examples start with Cargo

Terminal: First Run

cd source-code/hello-world
cargo run

Build if needed
Run the binary
Inspect the project files after the first run

The Smallest Program

fn main() {
    println!("Hello, world!");
}

main is the binary entry point
println! writes to standard output
! marks a macro call

Edit-Check-Run Loop

cargo check
cargo build
cargo run

check type-checks the code without producing an executable, so feedback is fast
build produces the executable
run builds and executes

Diagnostics Are Part Of The Workflow

Read the first error carefully
Use the file and line number
Look at the highlighted expression
Apply one fix, then check again

Terminal: Break And Repair

cd source-code/hello-world
cargo check

Introduce one small error
Locate the diagnostic in the source file
Fix only that error
Run cargo check again

From Program To Tool

Scientific programs need inputs
Command-line flags make runs repeatable
Libraries should handle common parsing work
This repository uses clap

Terminal: CLI Help

cd source-code/hello-clap
cargo run -- --help

First --: separates Cargo's own arguments from the arguments passed to the program
--help: generated by clap
Options come from Rust types

Dependency In `Cargo.toml`

[dependencies]
clap = { version = "4", features = ["derive"] }

Crates provide reusable functionality
Cargo resolves and locks versions
Applications usually commit Cargo.lock to version control

Parser As A Type

#[derive(Parser, Debug)]
struct Args {
    #[arg(short, long)]
    name: String,

    #[arg(short, long, default_value_t = 1)]
    count: u32,
}

A struct describes valid input
Attributes describe command-line behavior
The library handles parsing details

Terminal: Run With Options

cd source-code/hello-clap
cargo run -- --name Rust
cargo run -- --name Rust --count 3 --uppercase

Change inputs without editing code
Keep command history as a record
Treat parameters as part of the experiment

Hands-On Sequence

Run source-code/hello-world
Change the printed message
Introduce and fix one compiler error
Run source-code/hello-clap -- --help
Run with several options
Inspect Cargo.toml

Questions

Why prefer cargo check while editing?
What makes a compiler diagnostic useful?
When should parameters become CLI options?
Why keep Cargo.lock for training examples?

Connection To The Next Module

Same project structure
Same Cargo workflow
More attention to concrete types
Numeric behavior becomes the focus

Module 2: Scalar Computation And Numeric Basics

Module Arc

Inspect scalar type families
Compare integer and floating-point behavior
Use methods and constants on floating-point values
Make conversions visible
Add domain-specific numeric types through crates

By The End

Recognize signed, unsigned, floating-point, Boolean, character, and pointer-sized types
Explain why integer and floating-point division differ
Choose Euclidean division when a non-negative remainder matters
Convert integer values to floating-point values explicitly
Use num-complex for complex arithmetic

Terminal: Inspect Scalar Types

cd source-code/basic-types
cargo run

Integer ranges are tied to bit width
usize and isize match pointer width
Floating-point constants are type-specific

Scalar Type Families

Signed integers: i8, i16, i32, i64, i128, isize
Unsigned integers: u8, u16, u32, u64, u128, usize
Floating-point values: f32, f64
Other scalars: bool, char

Floating-Point Constants

f32::MIN
f32::MAX
f32::MIN_POSITIVE
f32::EPSILON

std::f64::consts::PI
std::f64::consts::TAU

Constants are namespaced by type
f32 and f64 constants are distinct
Precision is part of the type choice

Type Inference And Explicit Types

let x = 17;
let y = 5.2;

let a: i32 = 17;
let b: i32 = 5;
let value: f64 = 5.2;

Rust infers types from context
Explicit annotations help when the type matters
Numerical examples should make important types visible

Terminal: Compare Arithmetic

cd source-code/math
cargo run

Compare 17 / 5 with 17.3 / 5.2
Compare % for integers and floating-point values
Keep the operand types visible

Integer Arithmetic

let a: i32 = 17;
let b: i32 = 5;

println!("{}", a / b);
println!("{}", a % b);

Integer / discards the fractional part
% gives the remainder for the same division rule
The result type is still an integer

Negative Integer Division

let a: i32 = -17;
let b: i32 = 5;

println!("{}", a / b);
println!("{}", a % b);
println!("{}", a.div_euclid(b));
println!("{}", a.rem_euclid(b));

/ and % use truncating division
div_euclid and rem_euclid use Euclidean division
Non-negative remainders matter for periodic domains
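The difference between the two division rules is easiest to see with concrete numbers. The sketch below checks the four results for -17 and 5 and then applies rem_euclid to a periodic grid index. The helper wrap_index is a hypothetical name introduced for illustration; it is not part of the course code.

```rust
/// Wrap a possibly negative index onto a periodic grid with `n` cells.
/// `wrap_index` is a hypothetical helper used only in this sketch.
fn wrap_index(i: i64, n: i64) -> usize {
    assert!(n > 0, "grid size must be positive");
    // rem_euclid returns a value in 0..n, even when `i` is negative.
    i.rem_euclid(n) as usize
}

fn main() {
    let a: i32 = -17;
    let b: i32 = 5;

    // Truncating division rounds toward zero; the remainder keeps the sign of `a`.
    assert_eq!(a / b, -3);
    assert_eq!(a % b, -2);

    // Euclidean division pairs with a remainder that is never negative.
    assert_eq!(a.div_euclid(b), -4);
    assert_eq!(a.rem_euclid(b), 3);

    // Both rules satisfy a == q * b + r.
    assert_eq!((a / b) * b + a % b, a);
    assert_eq!(a.div_euclid(b) * b + a.rem_euclid(b), a);

    // One step to the left of cell 0 on a 10-cell periodic grid.
    println!("{}", wrap_index(-1, 10));
}
```

With `%` instead of rem_euclid, the index -1 would stay negative and the cast to usize would produce a huge value rather than the last cell.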

Floating-Point Arithmetic

let x: f64 = 17.3;
let y: f64 = 5.2;

println!("{}", x / y);
println!("{}", x % y);

Floating-point division keeps fractional information
Floating-point % is the remainder of truncated division; the result takes the sign of the left operand
f32 and f64 remain distinct types
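A short sketch of how the floating-point remainder behaves for a negative left operand, using values that are exactly representable so the comparisons are exact. The helper wrap_degrees is a hypothetical name chosen for this example.

```rust
/// Map an angle in degrees onto the range from 0 up to 360.
/// Hypothetical helper; round-off can in rare cases return exactly 360.0.
fn wrap_degrees(angle: f64) -> f64 {
    angle.rem_euclid(360.0)
}

fn main() {
    let x: f64 = -7.5;
    let y: f64 = 2.0;

    // `%` on floats follows truncated division, so the sign comes from `x`.
    assert_eq!(x % y, -1.5);

    // rem_euclid gives a non-negative result for the same operands.
    assert_eq!(x.rem_euclid(y), 0.5);

    println!("{}", wrap_degrees(-90.0));
}
```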

Mathematical Methods

let angle = std::f64::consts::FRAC_PI_6;
let value = 2.0_f64;

println!("{}", angle.sin());
println!("{}", value.sqrt());
println!("{}", value.powi(8));
println!("{}", value.powf(0.5));

Floating-point functions are methods on values
powi takes an integer exponent
powf takes a floating-point exponent

Rounding And Absolute Values

let x = -3.75_f64;

println!("{}", x.abs());
println!("{}", x.floor());
println!("{}", x.ceil());
println!("{}", x.round());
println!("{}", x.trunc());

Choose the operation that matches the numerical meaning
Rounding choices affect downstream results
Method names make the choice explicit
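To make the point about rounding choices concrete, the sketch below checks each method on -3.75 and then shows one place where floor and trunc disagree: assigning a value to a histogram bin. The function bin_index and its parameters are assumptions made for this example.

```rust
/// Index of the bin of width `width` that contains `x`, counted from `x_min`.
/// Hypothetical helper: floor keeps values just below `x_min` in bin -1.
fn bin_index(x: f64, x_min: f64, width: f64) -> i64 {
    ((x - x_min) / width).floor() as i64
}

fn main() {
    let x = -3.75_f64;

    assert_eq!(x.abs(), 3.75);
    assert_eq!(x.floor(), -4.0); // toward negative infinity
    assert_eq!(x.ceil(), -3.0); // toward positive infinity
    assert_eq!(x.round(), -4.0); // nearest integer
    assert_eq!(x.trunc(), -3.0); // toward zero

    // round breaks ties away from zero.
    assert_eq!(2.5_f64.round(), 3.0);
    assert_eq!((-2.5_f64).round(), -3.0);

    // floor puts -0.25 in bin -1; trunc would have put it in bin 0.
    println!("{}", bin_index(-0.25, 0.0, 0.5));
}
```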

Terminal: Polynomial Function

cd source-code/numerical-function
cargo run -- --help
cargo run -- --a 2.0 --b -1.0 --c 0.5

Command-line coefficients are f64
The function signature fixes the numeric type
Output values come from a generated grid

Typed Numeric Functions

fn polynomial(x: f64, a: f64, b: f64, c: f64) -> f64 {
    a * x.powi(2) + b * x + c
}

Parameter types are part of the interface
Return type is explicit
powi(2) matches an integer exponent

Explicit Conversion

let delta_x = (x_max - x_min) / (nr_points as f64 - 1.0);

for i in 0..nr_points {
    let x = x_min + i as f64 * delta_x;
    let result = polynomial(x, args.a, args.b, args.c);
    println!("{x} {result}");
}

Integer loop counts do not silently become f64
as f64 marks the conversion point
Visible conversions make numerical intent reviewable
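The same conversion pattern can be packaged as a function that returns the grid instead of printing it. This is a minimal sketch, not the course's implementation: the name grid is hypothetical, and the handling of zero or one point is an assumption added here because the formula for delta_x divides by zero when nr_points is 1.

```rust
/// Return `nr_points` equally spaced values from `x_min` to `x_max` inclusive.
/// `grid` is a hypothetical helper introduced for this sketch.
fn grid(x_min: f64, x_max: f64, nr_points: usize) -> Vec<f64> {
    match nr_points {
        0 => Vec::new(),
        // A single point would make the denominator below zero.
        1 => vec![x_min],
        _ => {
            // Both conversions from usize to f64 are written out with `as`.
            let delta_x = (x_max - x_min) / (nr_points as f64 - 1.0);
            (0..nr_points).map(|i| x_min + i as f64 * delta_x).collect()
        }
    }
}

fn main() {
    for x in grid(0.0, 1.0, 5) {
        println!("{x}");
    }
}
```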

Terminal: Conversion Diagnostic

cd source-code/numerical-function
cargo check

Remove one as f64 conversion
Locate the type mismatch
Restore the explicit conversion

Avoiding Implicit Double Promotion

fn compute_polynom(x: f32) -> f32 {
    let a = 3.0;
    let b = 2.0;
    let c = 1.0;
    a * x * x + b * x + c
}

Floating-point literals get a concrete type from context
The expression remains f32
No silent promote-to-double-then-convert-back step
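One way to observe the inferred literal types without printing type names is to check the size of each value. The sketch below is an illustration under that assumption; it is not the code in source-code/no-double-promotion.

```rust
use std::mem::size_of_val;

fn compute_polynom(x: f32) -> f32 {
    // These literals are inferred as f32 because they combine with `x`.
    let a = 3.0;
    let b = 2.0;
    let c = 1.0;
    a * x * x + b * x + c
}

fn main() {
    let x: f32 = 1.5;
    let a = 3.0; // inferred as f32 through the multiplication below
    let y = a * x;
    let d = 3.0; // only used with f64 values, so it is f64

    assert_eq!(size_of_val(&a), 4);
    assert_eq!(size_of_val(&y), 4);
    assert_eq!(size_of_val(&d), 8);

    // Mixing widths has to be spelled out; `d * x` would not compile.
    let z = d * f64::from(x);
    println!("{} {}", compute_polynom(x), z);
}
```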

Terminal: Literal Types

cd source-code/no-double-promotion
cargo run --release

Inspect the printed type names
Notice how x, a, b, and c are typed
Treat ambiguity as something the compiler should reject

Complex Numbers

use num_complex::Complex64;

let z1 = Complex64 { re: 1.0, im: 2.0 };
let z2 = Complex64 { re: 3.0, im: 4.0 };

println!("{}", z1 + z2);
println!("{}", z1 * z2);
println!("{}", z1.norm());

Complex numbers come from a crate
Complex64 stores f64 real and imaginary parts
Numeric behavior can be extended through types

Terminal: Complex Arithmetic

cd source-code/complex-numbers
cargo run

Add z1 - z2
Inspect z1.re and z1.im
Compare arithmetic with scalar f64 arithmetic

Hands-On Sequence

Run source-code/basic-types
Add one more floating-point constant
Compare integer and floating-point division
Predict /, %, div_euclid, and rem_euclid
Remove one as f64 conversion and run cargo check
Add z1 - z2 to the complex-number example

Questions

When should a type annotation be written explicitly?
Which division operation matches periodic indexing?
Where should integer-to-float conversion happen?
Which numeric types belong in the standard library?
Which numeric types belong in crates?

Connection To The Next Module

Numeric expressions need control flow
Loops generate grids and samples
Functions separate reusable calculations
match makes discrete choices explicit

Module 3: Control Flow And Program Structure

Module Arc

Make choices with if and else
Repeat work with while, for, and ranges
Name reusable calculations with functions
Represent choices with enums
Split growing examples across source files

By The End

Use if, while, and for
Use half-open and inclusive ranges
Define typed functions
Use blocks and if expressions as values
Select behavior with match
Declare modules with mod

Terminal: Greatest Common Divisors

cd source-code/control-flow
cargo run

Inspect the table of gcd(a, b) values
Identify the repeated calculation
Locate the loop bounds in main

Branches With `if` And `else`

if a > b {
    a -= b;
} else {
    b -= a;
}

The condition must be a bool
Each branch updates one local value
Integers are not used as truth values

Loops With `while`

fn gcd(mut a: i32, mut b: i32) -> i32 {
    while a != b {
        if a > b {
            a -= b;
        } else {
            b -= a;
        }
    }
    a
}

while repeats while the condition is true
mut allows local parameter bindings to change
The final expression a is the return value
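The subtraction loop assumes both arguments are positive: if either is zero, neither branch changes anything and the loop never ends. As a hedged alternative, the remainder form of Euclid's algorithm uses the same while structure and also handles zero. This variant swaps in unsigned integers and is not the version used in source-code/control-flow.

```rust
/// Greatest common divisor using the remainder form of Euclid's algorithm.
/// Unsigned arguments rule out negative input at the type level.
fn gcd(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let remainder = a % b;
        a = b;
        b = remainder;
    }
    // When b reaches zero, a holds the result; gcd(0, 0) is 0 by convention here.
    a
}

fn main() {
    for a in 1..=4 {
        for b in 1..=4 {
            println!("gcd({a}, {b}) = {}", gcd(a, b));
        }
    }
}
```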

Half-Open Ranges

for i in 0..n {
    println!("{i}");
}

Includes 0
Stops before n
Common for indices and repeated work

Inclusive Ranges

for a in 1..=a_max {
    for b in 1..=b_max {
        println!("gcd({a}, {b}) = {}", gcd(a, b));
    }
}

Includes both endpoints
Useful for small numeric tables
Nested loops describe a grid of input pairs

Terminal: Change The Grid

cd source-code/control-flow
cargo run

Change a_max and b_max
Compare 1..=a_max with 1..a_max
Trace one pair such as gcd(9, 6)

Functions With Typed Interfaces

fn polynomial(x: f64, a: f64, b: f64, c: f64) -> f64 {
    a * x.powi(2) + b * x + c
}

Parameter types are explicit
Return type follows ->
The last expression is returned

Semicolon And Return Value

fn polynomial(x: f64, a: f64, b: f64, c: f64) -> f64 {
    a * x.powi(2) + b * x + c
}

No semicolon: expression value is returned
Semicolon: expression becomes a statement
Return type and body must agree

Blocks As Expressions

let square = {
    let x = 2.0;
    x * x
};

A block can produce a value
Local bindings stay inside the block
The final expression determines the block value

`if` Expressions

let weight = if i % 2 == 0 { 2.0 } else { 4.0 };

The whole if has a value
Both branches must have compatible types
Useful for local numerical choices

Tuples

let point: (f64, f64) = (1.0, 2.0);

let x = point.0;
let y = point.1;

let (row, col) = (2_usize, 3_usize);

Tuples group a fixed number of values
Fields are accessed by position
Destructuring names the components

Terminal: Quadrature Choices

cd source-code/enum-match
cargo run
cargo run -- --method gauss

Run both integration methods
Compare the command-line option with the code
Identify where the selected method is handled

Enums For Fixed Choices

#[derive(Clone, ValueEnum)]
enum QuadratureMethod {
    Simpson,
    Gauss,
}

An enum value is one variant from a fixed set
The variants are part of the type
Parsed input becomes a Rust value

Selecting Behavior With `match`

let result = match args.method {
    QuadratureMethod::Simpson => simpson::quad(f, a, b, 1000),
    QuadratureMethod::Gauss => gauss::quad(f, a, b),
};

Each arm handles one variant
The compiler checks coverage
The selected arm produces the value of result
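The enum-plus-match pattern can be tried without clap or the quadrature modules. The sketch below uses a hypothetical Norm enum, chosen only because each variant maps to a one-line calculation; it is not part of the course repository.

```rust
/// Hypothetical set of vector norms, used to illustrate exhaustive matching.
#[derive(Clone, Copy, Debug)]
enum Norm {
    L1,
    L2,
    Max,
}

fn norm(data: &[f64], kind: Norm) -> f64 {
    // Every variant needs an arm; the value of the chosen arm is returned.
    match kind {
        Norm::L1 => data.iter().map(|x| x.abs()).sum::<f64>(),
        Norm::L2 => data.iter().map(|x| x * x).sum::<f64>().sqrt(),
        Norm::Max => data
            .iter()
            .fold(0.0_f64, |acc: f64, x: &f64| acc.max(x.abs())),
    }
}

fn main() {
    let data = [3.0, -4.0];
    for kind in [Norm::L1, Norm::L2, Norm::Max] {
        println!("{:?}: {}", kind, norm(&data, kind));
    }
}
```

Adding a fourth variant without a matching arm makes the match non-exhaustive, and the compiler rejects it until the new case is handled.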

Structural Matches Later

match (self.first_time, self.last_time) {
    (None, None) => { /* first record */ }
    (Some(first), Some(last)) => { /* update range */ }
    _ => unreachable!("timestamps move together"),
}

Patterns can describe tuple structure
match is not only for command choices
This appears in source-code/strings

Closures As Function Arguments

let f = |x: f64| x.sin();

pub fn quad<F>(f: F, a: f64, b: f64, n: usize) -> f64
where
    F: Fn(f64) -> f64,
{
    // implementation
}

The integration algorithm receives the function to integrate
The same algorithm can run on different functions
The trait bound is a preview of later modules
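The body of quad is elided on the slide. As a sketch of what such a function could look like, the code below implements one common form of the composite Simpson rule with the same signature, reusing the 2.0 and 4.0 weights from the if expression shown earlier. The requirement that n is even is an assumption of this sketch, and the course's simpson::quad may differ.

```rust
/// Composite Simpson rule on [a, b] with `n` subintervals (sketch).
/// Assumes `n` is even and at least 2.
pub fn quad<F>(f: F, a: f64, b: f64, n: usize) -> f64
where
    F: Fn(f64) -> f64,
{
    assert!(n >= 2 && n % 2 == 0, "n must be even and at least 2");
    let h = (b - a) / n as f64;

    // The end points have weight 1.
    let mut sum = f(a) + f(b);
    for i in 1..n {
        // Interior points alternate between weight 4 (odd i) and 2 (even i).
        let weight = if i % 2 == 0 { 2.0 } else { 4.0 };
        sum += weight * f(a + i as f64 * h);
    }
    sum * h / 3.0
}

fn main() {
    // The same algorithm runs on two different closures.
    let sine = quad(|x: f64| x.sin(), 0.0, std::f64::consts::PI, 1000);
    let parabola = quad(|x| x * x, 0.0, 3.0, 10);
    println!("{sine} {parabola}");
}
```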

Source Modules

src/
├── main.rs
├── simpson.rs
└── gauss.rs

mod simpson;
mod gauss;

main.rs handles setup and dispatch
Numerical algorithms live in separate files
Qualified names show where functions come from

Qualified Function Calls

simpson::quad(f, a, b, 1000)
gauss::quad(f, a, b)

Module names organize related code
Function names can be reused in different modules
Call sites show the chosen implementation

Hands-On Sequence

Run source-code/control-flow
Change a_max and b_max
Compare 1..=a_max with 1..a_max
Add a second function to source-code/numerical-function
Run both enum-match methods
Change the integrated function from sin to cos
Add an enum variant and inspect the match diagnostic

Questions

Which loops naturally fit ranges?
When should a calculation become a function?
When is a tuple enough, and when is a struct clearer?
Which choices deserve an enum?
What should stay in main.rs?

Connection To The Next Module

Functions make ownership visible at boundaries
Mutation starts to matter more
Loops over collections introduce borrowing questions
Larger structures need clearer data ownership

Module 4: Ownership, Borrowing, And Mutation

Module Arc

Start with local mutation
Compare scalar copies with vector moves
Borrow data for read-only functions
Borrow mutably for in-place updates
Choose signatures from ownership intent

By The End

Use mut when a binding changes
Distinguish copy, move, clone, borrow, and mutable borrow
Prefer &[T] over &Vec<T> for read-only sequences
Use &mut [T] for in-place sequence updates
Read function signatures as ownership contracts

Immutable By Default

let x = 1.0;

let mut y = 1.0;
y += 0.1;

Bindings do not change unless marked mut
Mutation is local and visible
mut belongs to the binding

Terminal: Mutable Grid Point

cd source-code/mutable-variables
cargo run -- --help
cargo run -- --a 1.0 --b 0.0 --c 0.0

Locate the binding that changes
Remove mut and run cargo check
Restore mut

Local Mutation In A Loop

let mut x = x_min;
let delta_x = (x_max - x_min) / (nr_points as f64 - 1.0);

for _ in 0..nr_points {
    let result = polynomial(x, args.a, args.b, args.c);
    println!("{x} {result}");
    x += delta_x;
}

x changes on each iteration
delta_x does not change
The mutable binding is the one that needs mut

Mutable References

fn rhs(x: f64, dxdt: &mut f64, _t: f64) {
    *dxdt = -x;
}

dxdt is borrowed mutably
*dxdt writes through the reference
Mutation appears in the function signature

Terminal: Write Through A Reference

cd source-code/mutable-borrowing
cargo run

Find the &mut at the call site
Find the *dxdt assignment
Rewrite rhs to return f64

Mutable Borrow At The Call Site

let mut dxdt = 0.0;

rhs(x, &mut dxdt, t);
x += dxdt * delta_t;

The caller must own a mutable binding
&mut dxdt grants temporary write access
The caller keeps ownership after the call
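Putting the function and the call site together gives a small explicit Euler loop for dx/dt = -x. The wrapper integrate and the value-returning rhs_value are hypothetical names added for this sketch; rhs_value shows the shape of the rewrite suggested in the terminal exercise.

```rust
/// Right-hand side that writes its result through a mutable reference.
fn rhs(x: f64, dxdt: &mut f64, _t: f64) {
    *dxdt = -x;
}

/// The same right-hand side, returning the value instead (hypothetical rewrite).
fn rhs_value(x: f64, _t: f64) -> f64 {
    -x
}

/// Explicit Euler integration from t = 0 using `nr_steps` steps of `delta_t`.
fn integrate(x0: f64, delta_t: f64, nr_steps: usize) -> f64 {
    let mut x = x0;
    let mut t = 0.0;
    let mut dxdt = 0.0; // owned by the caller, lent out on every step
    for _ in 0..nr_steps {
        rhs(x, &mut dxdt, t);
        x += dxdt * delta_t;
        t += delta_t;
    }
    x
}

fn main() {
    let x_end = integrate(1.0, 0.001, 1000);
    println!("Euler: {x_end}, exact: {}", (-1.0_f64).exp());
    println!("rhs_value(1.0, 0.0) = {}", rhs_value(1.0, 0.0));
}
```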

Copying Scalars

let x = 5.0;
let y = x;

println!("x: {x}, y: {y}");

Scalar types such as f64 implement Copy, so assignment copies the value
Both bindings remain usable
This is the expected behavior for simple numeric values

Moving Owned Data

let xs = vec![1.0, 2.0, 3.0];
let ys = xs;

println!("ys: {ys:?}");

Vec<f64> owns heap-allocated data
Assignment moves ownership to ys
xs is no longer usable after the move

Terminal: Copy Versus Move

cd source-code/copy-vs-move
cargo run
cargo check

Compare the scalar assignment with the vector assignment
Uncomment one line that uses a moved vector
Locate where ownership moved

Cloning Owned Data

let xs = vec![1.0, 2.0, 3.0];
let ys = xs.clone();

println!("xs: {xs:?}, ys: {ys:?}");

clone creates an explicit copy
Both vectors can be used afterward
Large clones are visible in the code

Moving Into A Function

fn mean_move(values: Vec<f64>) -> f64 {
    values.iter().sum::<f64>() / (values.len() as f64)
}

let xs = vec![1.0, 2.0, 3.0];
let mean = mean_move(xs);

The function takes ownership of the vector
The caller cannot use xs after the call
Read-only computations usually should not take ownership

Shared Borrowing

fn mean_borrow(values: &Vec<f64>) -> f64 {
    values.iter().sum::<f64>() / (values.len() as f64)
}

let xs = vec![1.0, 2.0, 3.0];
let mean = mean_borrow(&xs);

&xs grants read-only access
Ownership stays with the caller
xs can still be used after the call

Prefer Slices For Sequences

fn mean(data: &[f64]) -> f64 {
    let sum: f64 = data.iter().sum();
    sum / (data.len() as f64)
}

&[f64] is a borrowed view of contiguous values
A slice can refer to all or part of a vector
Read-only sequence APIs are more flexible with slices
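The flexibility of a slice parameter shows up at the call site: one function accepts a whole vector, part of a vector, and a fixed-size array. A minimal sketch, with the empty-slice behavior noted as a property of this particular implementation:

```rust
fn mean(data: &[f64]) -> f64 {
    let sum: f64 = data.iter().sum();
    // For an empty slice this is 0.0 / 0.0, which is NaN.
    sum / (data.len() as f64)
}

fn main() {
    let xs = vec![1.0, 2.0, 3.0, 4.0];
    let array = [10.0, 20.0];

    println!("{}", mean(&xs)); // &Vec<f64> coerces to &[f64]
    println!("{}", mean(&xs[1..3])); // a view of elements 1 and 2 only
    println!("{}", mean(&array)); // arrays coerce to slices as well
}
```

A parameter of type &Vec<f64> would reject the last two calls.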

Mutable Slices

fn normalize(data: &mut [f64]) {
    let mean_value = mean(data);
    for value in data.iter_mut() {
        *value /= mean_value;
    }
}

&mut [f64] grants write access to sequence elements
iter_mut yields mutable element references
*value writes through each reference

Terminal: Normalize In Place

cd source-code/borrowing-vectors
cargo run

Inspect the original data
Normalize through &mut data
Compare the mean before and after normalization

Borrowing Rules

{
    let first_value = &data[0];
    let this_mean = mean(&data);
    println!("First value: {first_value}");
}

Multiple shared borrows can overlap
Read-only access does not conflict with read-only access
The borrow ends when the reference is no longer used

Shared And Mutable Borrows

{
    let first_value = &data[0];
    normalize(&mut data);
    println!("First value before normalization: {first_value}");
}

first_value borrows from data
normalize(&mut data) needs exclusive access
The two accesses overlap
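One way to resolve this kind of conflict, assuming only the value is needed and not a live reference, is to copy the element out before the mutable borrow starts. The function first_before_and_after is a hypothetical wrapper added so the sketch is self-contained.

```rust
fn mean(data: &[f64]) -> f64 {
    let sum: f64 = data.iter().sum();
    sum / (data.len() as f64)
}

fn normalize(data: &mut [f64]) {
    let mean_value = mean(data);
    for value in data.iter_mut() {
        *value /= mean_value;
    }
}

/// Return the first element before and after normalization.
/// Panics on an empty slice because of the direct indexing.
fn first_before_and_after(data: &mut [f64]) -> (f64, f64) {
    // f64 is Copy: this reads the value, and no shared borrow outlives the line.
    let first_before = data[0];
    normalize(data);
    (first_before, data[0])
}

fn main() {
    let mut data = vec![2.0, 4.0, 6.0];
    let (before, after) = first_before_and_after(&mut data);
    println!("First value before: {before}, after: {after}");
}
```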

Terminal: Borrow Conflict

cd source-code/borrowing-vectors
cargo check

Uncomment the rejected borrowing block
Locate the shared borrow
Locate the attempted mutable borrow
Restore the original code

References In Collections

let xs = vec![1.0, 2.0, 3.0];
let filtered: Vec<&f64> = xs.iter()
    .filter(|&&x| x > 1.5)
    .collect();

filtered stores references into xs
xs remains borrowed while those references are used
The filtered vector does not own the numbers

Independent Filtered Values

let filtered: Vec<f64> = xs.iter()
    .filter(|&&x| x > 1.5)
    .copied()
    .collect();

copied turns &f64 into f64
The filtered vector owns its values
xs can be modified independently afterward

Returning Owned Values

fn return_vector() -> Vec<f64> {
    vec![1.0, 2.0, 3.0]
}

let xs = return_vector();

A function can create owned data
Ownership moves to the caller
Returning a vector does not imply element-by-element copying

Choosing Function Signatures

Small scalar input: pass by value
Read-only sequence input: pass &[T]
In-place sequence update: pass &mut [T]
Ownership transfer: pass an owning type by value
Newly created data: return an owning type

Signature Examples

fn mean(data: &[f64]) -> f64

fn normalize(data: &mut [f64])

fn return_vector() -> Vec<f64>

Read
Modify
Create and return
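The remaining rows of the signature table can be sketched in the same style. The three functions below are hypothetical examples: shift follows the signature named in the hands-on list, while into_sorted and squares are invented here to illustrate ownership transfer and returning new data.

```rust
/// In-place update: borrows the elements mutably.
fn shift(data: &mut [f64], offset: f64) {
    for value in data.iter_mut() {
        *value += offset;
    }
}

/// Ownership transfer: consumes the vector and hands it back sorted.
fn into_sorted(mut values: Vec<f64>) -> Vec<f64> {
    // total_cmp gives f64 a total order, so NaN cannot break the sort.
    values.sort_by(|a, b| a.total_cmp(b));
    values
}

/// Newly created data: the caller receives an owned vector.
fn squares(n: usize) -> Vec<f64> {
    (0..n).map(|i| (i * i) as f64).collect()
}

fn main() {
    let mut data = squares(4);
    shift(&mut data, 0.5);
    let sorted = into_sorted(vec![3.0, 1.0, 2.0]);
    println!("{data:?} {sorted:?}");
}
```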

Hands-On Sequence

Run source-code/mutable-variables
Remove and restore one required mut
Rewrite rhs to return f64
Compare scalar copy with vector move
Change mean_borrow to use a slice
Uncomment one borrow conflict and run cargo check
Add shift(data: &mut [f64], offset: f64)

Questions

Does this function need ownership?
Does this function only read data?
Does this function modify data in place?
Is a clone intentional and worth its cost?
Can a slice make the API more flexible?

Connection To The Next Module

Structs store owned data in fields
Methods borrow through &self or &mut self
Constructors return owned values
Encapsulation depends on clear ownership boundaries

Module 5: Data Modeling With Structs And Methods

Module Arc

Group related data into a named type
Attach behavior to that type
Protect invariants with private fields
Keep main.rs focused on program flow
Generalize the type when one design fits multiple element types

By The End

Define a struct with named fields
Implement methods in an impl block
Distinguish self, &self, and &mut self
Create values with associated functions such as new
Store matrix data in a flat vector
Add trait bounds only where operations need them

Terminal: Matrix Example

cd source-code/structs-and-methods
cargo run -- --help
cargo run -- --rows 2 --cols 5

Create a matrix from command-line dimensions
Fill it through methods
Print values through methods

Why A Struct?

pub struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

Rows, columns, and storage belong together
The type has a useful invariant
data.len() should match rows * cols

Private Fields

pub struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

pub struct Matrix makes the type visible
The fields are private
External code uses the public methods

Associated Function: `new`

impl Matrix {
    pub fn new(rows: usize, cols: usize) -> Self {
        Self {
            rows,
            cols,
            data: vec![0.0; rows * cols],
        }
    }
}

new belongs to Matrix
It does not take self
Self means Matrix inside this impl

Creating A Value

let mut matrix = Matrix::new(args.rows, args.cols);

Type-qualified syntax calls the associated function
The matrix starts with initialized storage
The binding is mutable because elements will be set later

Methods With `&self`

impl Matrix {
    pub fn rows(&self) -> usize {
        self.rows
    }

    pub fn cols(&self) -> usize {
        self.cols
    }
}

&self means shared access
These methods only read fields
Accessors expose selected internal state

Methods With `&mut self`

pub fn set(&mut self, row: usize, col: usize, value: f64) {
    self.data[row * self.cols + col] = value;
}

&mut self means mutable access
The method changes matrix storage
The caller needs a mutable matrix binding

Reading An Element

pub fn get(&self, row: usize, col: usize) -> f64 {
    self.data[row * self.cols + col]
}

get only needs shared access
The indexing formula maps 2D coordinates to flat storage
Row-major layout stores each row contiguously

Filling Through The Interface

for i in 0..matrix.rows() {
    for j in 0..matrix.cols() {
        matrix.set(i, j, (i * matrix.cols() + j) as f64);
    }
}

Dimensions come from accessor methods
Values are written through set
The caller does not touch data directly

Keeping `main.rs` Focused

src/
├── main.rs
└── matrix.rs

mod matrix;
use matrix::Matrix;

main.rs handles CLI and program flow
matrix.rs defines the domain type
Module boundaries keep examples readable

Encapsulation

pub fn new(rows: usize, cols: usize) -> Self
pub fn rows(&self) -> usize
pub fn cols(&self) -> usize
pub fn get(&self, row: usize, col: usize) -> f64
pub fn set(&mut self, row: usize, col: usize, value: f64)

Public methods define supported operations
Private fields protect representation choices
Invariants stay inside the module

Terminal: Add A Method

cd source-code/structs-and-methods
cargo check

Add len(&self) -> usize
Call matrix.len() from main.rs
Try direct field access and inspect the diagnostic

Private Helper Method

fn index(&self, row: usize, col: usize) -> usize {
    row * self.cols + col
}

Repeated indexing logic gets one name
get and set use the same mapping
The helper can stay private
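Assembled from the pieces on the preceding slides, a compact version of the non-generic matrix might look as follows. The bounds assertion inside index is an assumption added for this sketch: without it, an out-of-range column silently maps to an element in the next row.

```rust
pub struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Matrix {
    pub fn new(rows: usize, cols: usize) -> Self {
        Self { rows, cols, data: vec![0.0; rows * cols] }
    }

    pub fn rows(&self) -> usize {
        self.rows
    }

    pub fn cols(&self) -> usize {
        self.cols
    }

    /// Private helper: row-major mapping from (row, col) to the flat index.
    fn index(&self, row: usize, col: usize) -> usize {
        // Assumption of this sketch: reject coordinates outside the matrix.
        assert!(
            row < self.rows && col < self.cols,
            "matrix index ({}, {}) is out of bounds",
            row,
            col
        );
        row * self.cols + col
    }

    pub fn get(&self, row: usize, col: usize) -> f64 {
        self.data[self.index(row, col)]
    }

    pub fn set(&mut self, row: usize, col: usize, value: f64) {
        let index = self.index(row, col);
        self.data[index] = value;
    }
}

fn main() {
    let mut matrix = Matrix::new(2, 3);
    matrix.set(1, 2, 7.5);
    println!("{} x {}: {}", matrix.rows(), matrix.cols(), matrix.get(1, 2));
}
```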

Generic Matrix

pub struct Matrix<T> {
    rows: usize,
    cols: usize,
    data: Vec<T>,
}

T is the element type
Matrix<f64> stores floating-point values
Matrix<i32> stores integer values

Terminal: Generic Structs

cd source-code/generic-structs
cargo run -- --help
cargo run -- --rows 2 --cols 3

Locate the Matrix<f64>
Locate the Matrix<i32>
Add a Matrix<bool> or Matrix<char>

Methods For All `T`

impl<T> Matrix<T> {
    pub fn rows(&self) -> usize {
        self.rows
    }

    pub fn cols(&self) -> usize {
        self.cols
    }
}

Reading dimensions does not depend on the element type
No trait bound is needed
The method works for every Matrix<T>

Checked Indexing

fn index(&self, row: usize, col: usize) -> Option<usize> {
    if row < self.rows && col < self.cols {
        Some(row * self.cols + col)
    } else {
        None
    }
}

Some(index) means the coordinates are valid
None means the coordinates are out of bounds
Bounds checking becomes part of the interface

Borrowing Elements

pub fn get(&self, row: usize, col: usize) -> Option<&T> {
    self.index(row, col).map(|index| &self.data[index])
}

The result may be missing
The element is borrowed, not copied
T does not need to implement Copy or Clone

Setting Elements

pub fn set(&mut self, row: usize, col: usize, value: T) -> Result<(), String> {
    let index = self
        .index(row, col)
        .ok_or_else(|| format!("matrix index ({row}, {col}) is out of bounds"))?;
    self.data[index] = value;
    Ok(())
}

The value is moved into the matrix
Out-of-bounds writes return an error
Result makes failure explicit

Trait Bounds Where Needed

impl<T: Clone> Matrix<T> {
    pub fn new(rows: usize, cols: usize, value: T) -> Self {
        Self {
            rows,
            cols,
            data: vec![value; rows * cols],
        }
    }
}

Repeated initialization needs cloneable values
The Clone bound belongs on this implementation
Other methods can work without T: Clone
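The generic pieces from the last four slides fit together as shown in this sketch. The element type String in main is chosen here to show that get and set work for a type that is not Copy, while only new needs Clone.

```rust
pub struct Matrix<T> {
    rows: usize,
    cols: usize,
    data: Vec<T>,
}

// Only construction by repetition needs T: Clone.
impl<T: Clone> Matrix<T> {
    pub fn new(rows: usize, cols: usize, value: T) -> Self {
        Self { rows, cols, data: vec![value; rows * cols] }
    }
}

// These methods work for every element type.
impl<T> Matrix<T> {
    fn index(&self, row: usize, col: usize) -> Option<usize> {
        if row < self.rows && col < self.cols {
            Some(row * self.cols + col)
        } else {
            None
        }
    }

    pub fn get(&self, row: usize, col: usize) -> Option<&T> {
        self.index(row, col).map(|index| &self.data[index])
    }

    pub fn set(&mut self, row: usize, col: usize, value: T) -> Result<(), String> {
        let index = self
            .index(row, col)
            .ok_or_else(|| format!("matrix index ({row}, {col}) is out of bounds"))?;
        self.data[index] = value;
        Ok(())
    }
}

fn main() {
    let mut labels = Matrix::new(2, 2, String::from("-"));
    labels.set(0, 1, String::from("x")).expect("index is in bounds");
    println!("{:?}", labels.get(0, 1));

    // An out-of-bounds write is reported, not executed.
    if let Err(message) = labels.set(2, 0, String::from("y")) {
        println!("{message}");
    }
}
```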

Hands-On Sequence

Run source-code/structs-and-methods
Add len(&self) -> usize
Try direct access to matrix.data
Add a private index helper
Run source-code/generic-structs
Add a Matrix<bool> or Matrix<char>
Remove and restore the Clone bound

Questions

Which values belong together as one type?
Which fields should stay private?
Which methods need &self?
Which methods need &mut self?
Where does a trait bound actually belong?

Connection To The Next Module

Methods attach behavior to one type
Traits describe behavior shared across types
Trait bounds generalize generic code
Standard traits provide familiar operations

Module 6: Reusable Abstractions With Traits

Module Arc

Implement standard traits for a custom matrix type
Connect custom types to familiar Rust syntax
Use crate-provided numeric traits for generic statistics
Define a numerical interface with a project-specific trait
Use trait bounds when generic code needs behavior
Compare compile-time and run-time dispatch

By The End

Explain what a trait represents
Implement Index, IndexMut, Display, and TryFrom
Recognize associated types such as Output and Error
Define and implement a user-defined trait
Use trait bounds to require behavior
Recognize numeric trait bounds from external crates
Recognize dyn Trait as dynamic dispatch

What Is A Trait?

A named set of behavior
A contract a type can implement
A way to use shared syntax or shared interfaces
Not an inheritance relationship

println!("{value}");
matrix[(row, col)]

Terminal: Matrix With Traits

cd source-code/traits
cargo run

Fill a matrix with indexing syntax
Print a matrix with {}
Iterate over borrowed and owned matrix values

Standard Traits In This Example

Index: read with matrix[(row, col)]
IndexMut: assign with matrix[(row, col)] = value
Display: print with println!("{matrix}")
TryFrom: build from nested vectors fallibly
IntoIterator: loop over matrix values

Indexing With `Index`

impl<T> Index<(usize, usize)> for Matrix<T> {
    type Output = T;

    fn index(&self, index: (usize, usize)) -> &Self::Output {
        let (row, col) = index;
        let flat_index = self.flat_index(row, col)
            .unwrap_or_else(|| panic!("matrix index ({row}, {col}) is out of bounds"));
        &self.data[flat_index]
    }
}

The index type is (usize, usize)
type Output = T defines the indexed value type
The method returns a shared reference

Indexing Syntax

let value = matrix[(row, col)];

The syntax uses the Index implementation
The row and column become one tuple index
Out-of-bounds indexing is treated as a programming error here and panics

Mutable Indexing With `IndexMut`

impl<T> IndexMut<(usize, usize)> for Matrix<T> {
    fn index_mut(&mut self, index: (usize, usize)) -> &mut Self::Output {
        let (row, col) = index;
        let flat_index = self.flat_index(row, col)
            .unwrap_or_else(|| panic!("matrix index ({row}, {col}) is out of bounds"));
        &mut self.data[flat_index]
    }
}

&mut self grants mutable matrix access
The method returns a mutable element reference
Assignment uses this implementation

Assignment Through Indexing

for row in 0..matrix.rows() {
    for col in 0..matrix.cols() {
        matrix[(row, col)] = (row * matrix.cols() + col) as f64;
    }
}

Matrix assignment reads like ordinary indexing
The implementation still controls bounds checks
The call site no longer needs a set method
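For reference, the two indexing traits can be exercised in a single self-contained file. This is a sketch under the assumption that flat_index returns Option<usize>, as the unwrap_or_else calls on the slides suggest; the constructor repeats the Clone-bounded new from the previous module.

```rust
use std::ops::{Index, IndexMut};

pub struct Matrix<T> {
    rows: usize,
    cols: usize,
    data: Vec<T>,
}

impl<T: Clone> Matrix<T> {
    pub fn new(rows: usize, cols: usize, value: T) -> Self {
        Self { rows, cols, data: vec![value; rows * cols] }
    }
}

impl<T> Matrix<T> {
    pub fn rows(&self) -> usize {
        self.rows
    }

    pub fn cols(&self) -> usize {
        self.cols
    }

    fn flat_index(&self, row: usize, col: usize) -> Option<usize> {
        if row < self.rows && col < self.cols {
            Some(row * self.cols + col)
        } else {
            None
        }
    }
}

impl<T> Index<(usize, usize)> for Matrix<T> {
    type Output = T;

    fn index(&self, (row, col): (usize, usize)) -> &T {
        let flat = self
            .flat_index(row, col)
            .unwrap_or_else(|| panic!("matrix index ({}, {}) is out of bounds", row, col));
        &self.data[flat]
    }
}

impl<T> IndexMut<(usize, usize)> for Matrix<T> {
    fn index_mut(&mut self, (row, col): (usize, usize)) -> &mut T {
        let flat = self
            .flat_index(row, col)
            .unwrap_or_else(|| panic!("matrix index ({}, {}) is out of bounds", row, col));
        &mut self.data[flat]
    }
}

fn main() {
    let mut matrix = Matrix::new(2, 3, 0.0_f64);
    for row in 0..matrix.rows() {
        for col in 0..matrix.cols() {
            // Compute the value first, then assign through IndexMut.
            let value = (row * matrix.cols() + col) as f64;
            matrix[(row, col)] = value;
        }
    }
    println!("{}", matrix[(1, 2)]);
}
```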

Formatting With `Display`

impl<T: Display> Display for Matrix<T> {
    fn fmt(&self, formatter: &mut fmt::Formatter<'_>) -> fmt::Result {
        for row in 0..self.rows {
            for col in 0..self.cols {
                if col > 0 {
                    write!(formatter, " ")?;
                }
                write!(formatter, "{}", self[(row, col)])?;
            }
            // End each matrix row with a newline so rows do not run together.
            writeln!(formatter)?;
        }
        Ok(())
    }
}

Display controls {} formatting
T: Display is needed for each element
fmt::Result reports formatting success or failure

Printing A Matrix

println!("floating-point matrix:");
println!("{matrix}");

The matrix controls its own text representation
The caller uses ordinary formatting syntax
Element formatting depends on T: Display

Fallible Conversion With `TryFrom`

impl<T> TryFrom<Vec<Vec<T>>> for Matrix<T> {
    type Error = String;

    fn try_from(rows: Vec<Vec<T>>) -> Result<Self, Self::Error> {
        // validate row lengths, flatten data
    }
}

Nested vectors may be ragged
TryFrom represents conversion that can fail
type Error = String defines the error type

Using `TryFrom`

let integer_matrix =
    Matrix::try_from(vec![vec![1, 0], vec![0, 2]])
        .expect("all rows have the same length");

Conversion produces Result<Matrix<_>, String>
Valid nested vectors become matrices
Ragged rows follow the error path
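
The elided try_from body could be filled in along these lines. This is a sketch under assumptions: the error wording is invented here, and an empty input is treated as a 0 x 0 matrix, which the course code may handle differently.

```rust
use std::convert::TryFrom; // already in the prelude since the 2021 edition

#[derive(Debug)]
struct Matrix<T> {
    rows: usize,
    cols: usize,
    data: Vec<T>,
}

impl<T> TryFrom<Vec<Vec<T>>> for Matrix<T> {
    type Error = String;

    fn try_from(rows: Vec<Vec<T>>) -> Result<Self, Self::Error> {
        let row_count = rows.len();
        // Assumption: the first row fixes the expected row length.
        let col_count = rows.first().map_or(0, |row| row.len());

        let mut data = Vec::with_capacity(row_count * col_count);
        for (index, row) in rows.into_iter().enumerate() {
            // Validate each row length before flattening it.
            if row.len() != col_count {
                return Err(format!(
                    "row {index} has length {}, expected {col_count}",
                    row.len()
                ));
            }
            data.extend(row);
        }
        Ok(Self { rows: row_count, cols: col_count, data })
    }
}

fn main() {
    let valid = Matrix::try_from(vec![vec![1, 0], vec![0, 2]]);
    println!("{valid:?}");

    let ragged = Matrix::try_from(vec![vec![1, 2], vec![3]]);
    println!("{ragged:?}");
}
```

Taking the nested vectors by value lets the conversion move elements into the flat storage without cloning them.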

Terminal: Generic Numerics

cd source-code/generic-numerics
cargo run

Accumulate values in a generic Stats<T>
Convert input values with num-traits
Compute mean and population standard deviation

Numeric Trait Bounds

struct Stats<T: Float + FromPrimitive> {
    sum: T,
    sum_sqr: T,
    count: usize,
}

fn add<U>(&mut self, value: U) -> Option<()>
where
    U: ToPrimitive,

Float supplies floating-point operations
FromPrimitive converts counts and values into T
ToPrimitive accepts several input numeric types

Iteration With `IntoIterator`

for value in &mut matrix {
    *value *= 0.5;
}

Owned iteration consumes the matrix
Shared borrowed iteration yields &T
Mutable borrowed iteration yields &mut T

Borrowed Iteration

impl<'a, T> IntoIterator for &'a Matrix<T> {
    type Item = &'a T;
    type IntoIter = std::slice::Iter<'a, T>;

    fn into_iter(self) -> Self::IntoIter {
        self.data.iter()
    }
}

Items are references into the matrix
The iterator cannot outlive the borrow
The matrix remains usable after shared iteration
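
The three iteration forms can be put side by side. A minimal sketch, assuming a matrix that only keeps its flat `data` vector (the shape fields are left out for brevity):

```rust
struct Matrix<T> {
    data: Vec<T>,
}

// Owned iteration: consumes the matrix and yields T.
impl<T> IntoIterator for Matrix<T> {
    type Item = T;
    type IntoIter = std::vec::IntoIter<T>;

    fn into_iter(self) -> Self::IntoIter {
        self.data.into_iter()
    }
}

// Shared borrowed iteration: yields &T.
impl<'a, T> IntoIterator for &'a Matrix<T> {
    type Item = &'a T;
    type IntoIter = std::slice::Iter<'a, T>;

    fn into_iter(self) -> Self::IntoIter {
        self.data.iter()
    }
}

// Mutable borrowed iteration: yields &mut T.
impl<'a, T> IntoIterator for &'a mut Matrix<T> {
    type Item = &'a mut T;
    type IntoIter = std::slice::IterMut<'a, T>;

    fn into_iter(self) -> Self::IntoIter {
        self.data.iter_mut()
    }
}

fn main() {
    let mut matrix = Matrix { data: vec![2.0, 4.0, 6.0, 8.0] };

    // &mut matrix: scale in place.
    for value in &mut matrix {
        *value *= 0.5;
    }

    // &matrix: read without consuming.
    let total: f64 = (&matrix).into_iter().sum();
    println!("sum after scaling: {total}");

    // matrix: consume; it cannot be used after this loop.
    for value in matrix {
        println!("{value}");
    }
}
```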

Terminal: Trait Experiments

cd source-code/traits
cargo test
cargo run

Change the Display separator
Create a ragged nested vector
Add a loop over &matrix
Add a loop over &mut matrix

User-Defined Traits

pub trait QuadratureRule {
    fn integrate(&self, f: &dyn Fn(f64) -> f64, a: f64, b: f64) -> f64;

    fn name(&self) -> &'static str;
}

The trait names shared numerical behavior
Each rule integrates a function over an interval
Each rule provides a human-readable name

Terminal: Quadrature Rules

cd source-code/user-defined-trait
cargo run
cargo run -- --method gauss
cargo run -- --method simpson --subdivisions 2000

Run both concrete implementations
Compare output names
Vary Simpson subdivisions

Implementing A Trait

pub struct Simpson {
    subdivisions: usize,
}

impl QuadratureRule for Simpson {
    fn integrate(&self, f: &dyn Fn(f64) -> f64, a: f64, b: f64) -> f64 {
        // Simpson implementation
    }

    fn name(&self) -> &'static str {
        "composite Simpson"
    }
}

The type stores data needed by the algorithm
The implementation provides required methods
The trait describes what callers can rely on

Same Trait, Different Type

pub struct GaussLegendre10;

impl QuadratureRule for GaussLegendre10 {
    fn integrate(&self, f: &dyn Fn(f64) -> f64, a: f64, b: f64) -> f64 {
        // Gauss-Legendre implementation
    }

    fn name(&self) -> &'static str {
        "10-point Gauss-Legendre"
    }
}

Different storage
Different algorithm
Same external interface

Trait Objects With `dyn Trait`

fn select_rule(args: &Args) -> Box<dyn QuadratureRule> {
    match args.method {
        QuadratureMethod::Simpson => Box::new(Simpson::new(args.subdivisions)),
        QuadratureMethod::Gauss => Box::new(GaussLegendre10),
    }
}

The concrete rule is selected at run time
Box<dyn QuadratureRule> stores either implementation
Callers use the trait methods

Uniform Use After Selection

let rule = select_rule(&args);
let result = rule.integrate(&f, a, b);

println!("using {} quadrature", rule.name());

The caller does not match on the method again
integrate works through the trait object
name works through the same interface
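
Selection and uniform use can be tried without the command-line layer. In this sketch the `Midpoint` rule is a hypothetical stand-in for the Gauss-Legendre rule, `select_rule` takes a plain string instead of the course's `Args`, and the Simpson body is one possible implementation rather than the course's.

```rust
trait QuadratureRule {
    fn integrate(&self, f: &dyn Fn(f64) -> f64, a: f64, b: f64) -> f64;

    fn name(&self) -> &'static str;
}

struct Simpson {
    subdivisions: usize,
}

impl QuadratureRule for Simpson {
    fn integrate(&self, f: &dyn Fn(f64) -> f64, a: f64, b: f64) -> f64 {
        // Composite Simpson needs an even, non-zero number of subintervals.
        let n = (self.subdivisions.max(2) + 1) / 2 * 2;
        let h = (b - a) / n as f64;
        let mut sum = f(a) + f(b);
        for i in 1..n {
            let weight = if i % 2 == 1 { 4.0 } else { 2.0 };
            sum += weight * f(a + i as f64 * h);
        }
        sum * h / 3.0
    }

    fn name(&self) -> &'static str {
        "composite Simpson"
    }
}

// Hypothetical second rule, used here instead of Gauss-Legendre.
struct Midpoint {
    subdivisions: usize,
}

impl QuadratureRule for Midpoint {
    fn integrate(&self, f: &dyn Fn(f64) -> f64, a: f64, b: f64) -> f64 {
        let n = self.subdivisions.max(1);
        let h = (b - a) / n as f64;
        (0..n).map(|i| f(a + (i as f64 + 0.5) * h)).sum::<f64>() * h
    }

    fn name(&self) -> &'static str {
        "composite midpoint"
    }
}

// Run-time selection: both branches return the same boxed trait object type.
fn select_rule(method: &str, subdivisions: usize) -> Box<dyn QuadratureRule> {
    match method {
        "midpoint" => Box::new(Midpoint { subdivisions }),
        _ => Box::new(Simpson { subdivisions }),
    }
}

fn main() {
    let f = |x: f64| x.sin();
    for method in ["simpson", "midpoint"] {
        let rule = select_rule(method, 1000);
        let result = rule.integrate(&f, 0.0, std::f64::consts::PI);
        println!("using {} quadrature: {result}", rule.name());
    }
}
```

The loop body never inspects which rule it received; only `select_rule` knows the concrete types.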

Static Dispatch

impl<T: Display> Display for Matrix<T>

The concrete type is known at compile time
The compiler generates code for that type
Trait bounds express required behavior

Dynamic Dispatch

Box<dyn QuadratureRule>

The concrete type is selected at run time
Calls go through a trait object
Different concrete types are handled uniformly
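
The two dispatch styles can be compared on one small trait. The `Rule` trait and the two unit structs below are hypothetical, reduced versions of the quadrature types, kept minimal so the difference in the function signatures stands out.

```rust
trait Rule {
    fn name(&self) -> &'static str;
}

struct Simpson;
struct Gauss;

impl Rule for Simpson {
    fn name(&self) -> &'static str {
        "Simpson"
    }
}

impl Rule for Gauss {
    fn name(&self) -> &'static str {
        "Gauss"
    }
}

// Static dispatch: the compiler generates one version per concrete R.
fn describe_static<R: Rule>(rule: &R) -> String {
    format!("using {}", rule.name())
}

// Dynamic dispatch: one version; the method is looked up at run time.
fn describe_dynamic(rule: &dyn Rule) -> String {
    format!("using {}", rule.name())
}

fn main() {
    println!("{}", describe_static(&Simpson));
    println!("{}", describe_static(&Gauss));

    // A Vec holding mixed rule types needs trait objects here.
    let rules: Vec<Box<dyn Rule>> = vec![Box::new(Simpson), Box::new(Gauss)];
    for rule in &rules {
        println!("{}", describe_dynamic(rule.as_ref()));
    }
}
```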

Choosing The Trait Pattern

Use standard traits for familiar Rust syntax
Use user-defined traits for domain roles
Use generic bounds when concrete types stay known
Use trait objects for run-time selection
Put bounds where the operation needs them

Hands-On Sequence

Run source-code/traits
Change the Display separator
Trigger the TryFrom error path with ragged rows
Iterate over &matrix without consuming it
Iterate over &mut matrix and scale values
Run both quadrature methods
Change one name implementation
Add a placeholder quadrature rule

Questions

Which standard trait matches the syntax you want?
Which associated types are part of the trait contract?
Which behavior does a generic implementation need?
Which domain role deserves a user-defined trait?
Is the concrete type known at compile time or selected at run time?

Connection To The Next Module

Iterators are trait-based abstractions
Collections expose borrowed and owned iteration
Iterator adapters compose behavior
Trait bounds appear throughout data-processing code

Module 7: Collections, Iterators, And Text Data

Module Arc

Read structured text into typed values
Store columns in vectors
Transform and filter with iterators
Combine related sequences
Count and classify text tokens
Choose buffered I/O for file processing

By The End

Build vectors with push
Use iter, iter_mut, and into_iter
Use filter, map, zip, unzip, and enumerate
Use sum, fold, and scan
Use HashMap for counts
Use HashSet for unique values
Read and write text through buffers
Process owned String and borrowed &str
Parse timestamped records with chrono
Compare direct parsing with structural matching

Terminal: Iterator Example

cd source-code/iterators
cargo run -- --file data.txt

Read two numeric columns
Store x and y values
Transform and summarize the data

Vectors

let mut xs = Vec::new();
let mut ys = Vec::new();

xs.push(value.x);
ys.push(value.y);

Vec<T> stores a growable sequence
All elements have the same type
The vectors are mutable while they are being filled

Structured Text To Typed Values

#[derive(Deserialize, Debug)]
struct Values {
    x: f64,
    y: f64,
}

let mut reader = csv::Reader::from_path(args.file)?;

serde maps records to a Rust struct
csv handles parsing the file format
The rest of the program works with f64

Deserializing Records

for result in reader.deserialize() {
    let value: Values = result?;
    xs.push(value.x);
    ys.push(value.y);
}

Each record can fail to parse
? propagates parse or I/O errors
Valid records become ordinary Rust values

Borrowed Iteration And `copied`

let filtered_xs: Vec<f64> = xs
    .iter()
    .copied()
    .filter(|x| *x >= 10.0)
    .collect();

iter() yields &f64
copied() turns &f64 into f64
Copying is appropriate for small scalar values

Filtering Values

.filter(|x| *x >= 10.0)

The closure decides which values remain
Iterator adapters are lazy
collect consumes the pipeline

Mapping Values

let cubed_xs: Vec<f64> = xs
    .iter()
    .copied()
    .map(|x| x.powi(3))
    .collect();

map transforms each item
The output item type can differ from the input item type
Element-wise numerical transformations read naturally as pipelines

Collecting Results

let cubed_xs: Vec<f64> = xs
    .iter()
    .copied()
    .map(|x| x.powi(3))
    .collect();

collect builds a concrete collection
The type annotation says what to build
The same iterator could feed different collection types

Combining Columns With `zip`

let filtered_pairs: Vec<(f64, f64)> = xs
    .iter()
    .copied()
    .zip(ys.iter().copied())
    .filter(|(x, _)| *x >= 10.0)
    .collect();

zip combines two iterators
Items become 2-tuples
Related columns can be processed together

Splitting Pairs With `unzip`

let (filtered_xs, filtered_ys): (Vec<f64>, Vec<f64>) =
    filtered_pairs
        .iter()
        .copied()
        .unzip();

unzip splits an iterator over pairs
The result has two collections
The output type is written explicitly

Simple Reductions

let sum_y: f64 = ys.iter().sum();

A reduction turns many values into one value
The type annotation fixes the numeric result type
sum is concise for common accumulation

General Accumulation With `fold`

let sum_of_squares = xs
    .iter()
    .copied()
    .fold(0.0, |accumulator, x| accumulator + x * x);

fold carries an accumulator
The final accumulator is the result
Useful when no specialized reduction exists

Running State With `scan`

let cumulative_sum: Vec<f64> = xs
    .iter()
    .copied()
    .scan(0.0, |state, x| {
        *state += x;
        Some(*state)
    })
    .collect();

scan keeps state between items
Each step can yield a value
Running sums and cumulative quantities fit this pattern

Indices With `enumerate`

for (i, y) in ys.iter().enumerate() {
    println!("Index: {i}, y value: {y:.1}");
}

enumerate attaches an index
The item becomes (index, value)
Manual counters are usually unnecessary
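
The adapters from the previous slides can be exercised together on in-memory data. A minimal sketch: the helper function names and the sample values are made up for illustration, and no CSV file is read.

```rust
// zip + filter + unzip: keep the (x, y) pairs whose x reaches the threshold.
fn filter_pairs(xs: &[f64], ys: &[f64], threshold: f64) -> (Vec<f64>, Vec<f64>) {
    xs.iter()
        .copied()
        .zip(ys.iter().copied())
        .filter(|(x, _)| *x >= threshold)
        .unzip()
}

// fold: carry an accumulator through the sequence.
fn sum_of_squares(values: &[f64]) -> f64 {
    values
        .iter()
        .copied()
        .fold(0.0, |accumulator, x| accumulator + x * x)
}

// scan: keep running state and yield one value per item.
fn cumulative_sum(values: &[f64]) -> Vec<f64> {
    values
        .iter()
        .copied()
        .scan(0.0, |state, x| {
            *state += x;
            Some(*state)
        })
        .collect()
}

fn main() {
    let xs = vec![1.0, 10.0, 20.0];
    let ys = vec![2.0, 3.0, 4.0];

    let (filtered_xs, filtered_ys) = filter_pairs(&xs, &ys, 10.0);
    println!("filtered: {filtered_xs:?} {filtered_ys:?}");
    println!("sum of squares: {}", sum_of_squares(&xs));
    println!("cumulative sum: {:?}", cumulative_sum(&xs));

    for (i, y) in ys.iter().enumerate() {
        println!("Index: {i}, y value: {y:.1}");
    }
}
```

The helpers take `&[f64]` rather than `&Vec<f64>`, so they accept vectors, arrays, and slices alike.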

Terminal: Iterator Exercises

cd source-code/iterators
cargo run -- --file data.txt

Change the filter threshold
Add x.sqrt() for non-negative x
Compute the sum of squares of y
Compute cumulative sums
Build a vector of x + y

Terminal: Count Nucleotides

cd source-code/hashmap-hashset
cargo run --bin count-nucleotides -- --file errors.txt

Count valid nucleotide characters
Collect unique invalid tokens
Print counts in a stable order

Hash Maps For Counting

let mut counts = HashMap::new();

*counts.entry(nucleotide).or_insert(0) += 1;

HashMap<K, V> stores values by key
entry selects or creates a map entry
or_insert(0) provides the initial count

Stable Count Output

for nucleotide in VALID_NUCLEOTIDES {
    counts.entry(nucleotide).or_insert(0);
    println!("{nucleotide}: {}", counts[&nucleotide]);
}

Every valid nucleotide appears in the output
Missing keys receive count 0
Iterating over a fixed list gives stable ordering

Hash Sets For Unique Values

let mut error_tokens = HashSet::new();

error_tokens.insert(nucleotide);

A set stores each value once
Repeated invalid tokens do not create duplicates
Use a set for “which values appeared?”
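
Counting and collecting unique invalid tokens fit in one pass. This sketch works on a string instead of a file, and `count_nucleotides` is a hypothetical helper name, not necessarily how the course binary is organized.

```rust
use std::collections::{HashMap, HashSet};

const VALID_NUCLEOTIDES: [char; 4] = ['A', 'C', 'G', 'T'];

fn count_nucleotides(text: &str) -> (HashMap<char, usize>, HashSet<char>) {
    let mut counts = HashMap::new();
    let mut error_tokens = HashSet::new();

    for nucleotide in text.chars() {
        if VALID_NUCLEOTIDES.contains(&nucleotide) {
            // Create the entry with 0 if needed, then increment it.
            *counts.entry(nucleotide).or_insert(0) += 1;
        } else if !nucleotide.is_whitespace() {
            // Inserting a token twice leaves a single copy in the set.
            error_tokens.insert(nucleotide);
        }
    }
    (counts, error_tokens)
}

fn main() {
    let (counts, error_tokens) = count_nucleotides("ACGT AXC\nZX");

    // Iterate over the fixed list for a stable output order.
    for nucleotide in VALID_NUCLEOTIDES {
        let count = counts.get(&nucleotide).copied().unwrap_or(0);
        println!("{nucleotide}: {count}");
    }

    // HashSet iteration order is unspecified, so sort before printing.
    let mut errors: Vec<char> = error_tokens.into_iter().collect();
    errors.sort();
    println!("invalid tokens: {errors:?}");
}
```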

Buffered Text Input

let file = std::fs::File::open(args.file)
    .expect("Failed to open the DNA sequence file");
let reader = BufReader::new(file);

Buffered input avoids many tiny reads
File processing becomes more efficient
The reader provides byte or line iteration

Byte-Wise Processing

for byte in reader.bytes() {
    let nucleotide = byte.expect("Failed to read the DNA sequence file") as char;
    // process nucleotide
}

Byte-wise processing fits simple ASCII-like sequence data
General Unicode text usually needs a different approach
The input format should drive the reading strategy
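
Byte-wise reading can be tried without a file, because a byte slice also implements `Read`. The `count_byte` helper is hypothetical; it propagates I/O errors with `?` where the slide uses `expect`.

```rust
use std::io::{BufReader, Read};

// Count how often one byte value occurs in any readable source.
fn count_byte<R: Read>(source: R, target: u8) -> std::io::Result<usize> {
    let reader = BufReader::new(source);
    let mut count = 0;
    // bytes() comes from the Read trait and yields io::Result<u8>.
    for byte in reader.bytes() {
        if byte? == target {
            count += 1;
        }
    }
    Ok(count)
}

fn main() -> std::io::Result<()> {
    let data = "ACGTACGA\n";
    let count = count_byte(data.as_bytes(), b'A')?;
    println!("A appears {count} times");
    Ok(())
}
```

Because the function is generic over `Read`, the same code accepts a `std::fs::File`.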

Terminal: Timestamped Strings

cd source-code/strings
cargo run -- --file data.txt

Read timestamped records line by line
Accumulate one record in a String
Parse borrowed text with &str
Aggregate values without loading the whole file

Line-Based Text Input

let reader = BufReader::new(file);

for line in reader.lines() {
    let line = line.expect("Failed to read line");
    // process one line
}

lines() yields owned String values
Records can be built incrementally
The input format drives the reading strategy

Owned `String`, Borrowed `&str`

record_buffer.push_str(&line);
record_buffer.push('\n');

fn parse_record(record_str: &str) -> Result<Record, String> {
    // parse fields from borrowed text
}

String owns growable text
&str borrows text for read-only parsing
Function signatures show ownership intent

Parsing Fields

line.split_once(':').map(|(_, value)| value)

temp_str.trim().parse::<f64>()

Trim text before parsing
Parse external text near the input boundary
Choose splitting logic that fits the field format
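
The two fragments above can be combined into one fallible helper. A sketch only: `parse_field` and its error wording are assumptions, and the course parser may be structured differently.

```rust
// Parse a line of the form "name: value" into an f64.
fn parse_field(line: &str, expected_name: &str) -> Result<f64, String> {
    // split_once returns None when the separator is missing.
    let (name, value) = line
        .split_once(':')
        .ok_or_else(|| format!("missing ':' in line '{line}'"))?;

    if name.trim() != expected_name {
        return Err(format!(
            "expected field '{expected_name}', found '{}'",
            name.trim()
        ));
    }

    // Trim before parsing: "21.5 " or " 21.5" would otherwise fail.
    value
        .trim()
        .parse::<f64>()
        .map_err(|error| format!("invalid number '{}': {error}", value.trim()))
}

fn main() {
    println!("{:?}", parse_field("temperature: 21.5", "temperature"));
    println!("{:?}", parse_field("temperature 21.5", "temperature"));
    println!("{:?}", parse_field("pressure: high", "pressure"));
}
```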

Date And Time Values

use chrono::{DateTime, Utc};

struct Record {
    time: DateTime<Utc>,
    temperature: f64,
    pressure: f64,
}

std::time is not a calendar-date parser
chrono parses the timestamp into a typed value
Later code works with dates, not raw strings

Matching Parser State

match parse_record(&record_buffer) {
    Ok(record) => aggregator.add_record(record),
    Err(err) => {
        eprintln!("Failed to parse record:\n{}", record_buffer);
        eprintln!("Error: {}", err);
    }
}

Success updates the aggregator
Failure reports the record and reason
match keeps both paths visible

Matching Aggregator State

match (self.first_time, self.last_time) {
    (None, None) => { /* first record */ }
    (Some(first), Some(last)) => { /* update range */ }
    _ => unreachable!("timestamps move together"),
}

Match on tuple structure
Make the aggregator invariant explicit
Reuse Module 3 match in data-processing code

Terminal: Structural Matching

cd source-code/structural-matching
cargo run -- --file data.txt

Same data format as source-code/strings
Match on split_once(':')
Destructure field name and value
Keep parser assumptions visible

Field Parser Shapes

match line.split_once(':') {
    Some(("time", value)) => { /* parse timestamp */ }
    Some(("temperature", value)) => { /* parse f64 */ }
    Some(("pressure", value)) => { /* parse f64 */ }
    Some(_) | None => {}
}

Some means the separator was present
The tuple pattern checks the field name
value binds the text after the separator
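
The match above can drive a complete, if simplified, record parser. Assumptions in this sketch: the timestamp is kept as text rather than parsed with chrono, every field is optional, and `PartialRecord` and `parse_number` are hypothetical names.

```rust
#[derive(Debug, Default, PartialEq)]
struct PartialRecord {
    time: Option<String>, // kept as text here; the course parses it with chrono
    temperature: Option<f64>,
    pressure: Option<f64>,
}

fn parse_number(text: &str) -> Result<f64, String> {
    text.trim()
        .parse::<f64>()
        .map_err(|error| format!("invalid number '{}': {error}", text.trim()))
}

fn parse_record(record_str: &str) -> Result<PartialRecord, String> {
    let mut record = PartialRecord::default();
    for line in record_str.lines() {
        // split_once splits at the first ':' only, so the colons inside
        // the timestamp stay in the value part.
        match line.split_once(':') {
            Some(("time", value)) => record.time = Some(value.trim().to_string()),
            Some(("temperature", value)) => record.temperature = Some(parse_number(value)?),
            Some(("pressure", value)) => record.pressure = Some(parse_number(value)?),
            // Unknown fields and lines without a separator are skipped.
            Some(_) | None => {}
        }
    }
    Ok(record)
}

fn main() {
    let text = "time: 2024-01-01T00:00:00Z\ntemperature: 21.5\npressure: 1013.25\n";
    println!("{:?}", parse_record(text));
    println!("{:?}", parse_record("temperature: warm"));
}
```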

Matching While Processing Input

match nucleotide {
    nucleotide if is_valid_nucleotide(nucleotide) => {
        *counts.entry(nucleotide).or_insert(0) += 1;
    }
    nucleotide if nucleotide.is_whitespace() => {}
    _ => {
        error_tokens.insert(nucleotide);
    }
}

Valid nucleotide: increment count
Whitespace: ignore
Anything else: record an error token

Buffered Text Output

let file = std::fs::File::create(args.file).expect("Unable to create file");
let mut output = std::io::BufWriter::new(file);

write!(output, "{random_nucleotide}").expect("Unable to write file");
writeln!(output).expect("Unable to write file");

Buffered output groups small writes
write! and writeln! format text
Output files become inputs for later steps
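
Buffered output can be tested against an in-memory sink. In this sketch `write_sequence` is a hypothetical helper that is generic over `Write`, and it returns the I/O error instead of calling `expect` as the slide does.

```rust
use std::io::{BufWriter, Write};

// Write the nucleotides on one line, followed by a newline.
fn write_sequence<W: Write>(sink: W, nucleotides: &[char]) -> std::io::Result<()> {
    let mut output = BufWriter::new(sink);
    for nucleotide in nucleotides {
        write!(output, "{nucleotide}")?;
    }
    writeln!(output)?;
    // Flush explicitly so that a write error is reported, not lost on drop.
    output.flush()
}

fn main() -> std::io::Result<()> {
    write_sequence(std::io::stdout(), &['A', 'C', 'G', 'T'])
}
```

The explicit `flush` is a design choice: dropping a `BufWriter` also flushes, but any error at that point is silently ignored.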

Terminal: Generate And Count Data

cd source-code/hashmap-hashset
cargo run --bin generate-data -- --count 200 --file data.txt
cargo run --bin read-errors -- --file data.txt --output errors.txt --error-rate 0.2
cargo run --bin count-nucleotides -- --file errors.txt

Generate synthetic sequence data
Introduce controlled errors
Count valid and invalid tokens

Choosing The Collection

Ordered sequence: Vec<T>
Counts by key: HashMap<K, usize>
Unique values: HashSet<T>
Borrowed view of sequence data: &[T]
Iterator pipeline: transformations without intermediate collections

Hands-On Sequence

Run source-code/iterators
Change the filter threshold
Add a map pipeline
Add fold and scan reductions
Use zip to compute x + y
Run count-nucleotides
Count invalid tokens with a second HashMap
Generate a new input file and count it
Run source-code/strings
Change a timestamp or numeric field
Remove a field and inspect the parse error
Run source-code/structural-matching
Compare the two parser implementations

Questions

Is the data a sequence, a lookup table, or a set?
Should the pipeline work with references or copied values?
What operation consumes the iterator?
Does collect need a type annotation?
Should the file be read by record, line, or byte?
Does the parser need owned text or borrowed text?
Should a timestamp remain a string after parsing?
Does a match make the parser shape clearer?

Connection To The Next Module

File reads can fail
Parsing records can fail
Missing values need explicit representation
Data-processing programs need recoverable errors

Module 8: Error Handling

Module Arc

Make absence explicit with Option
Transform present values with map
Report recoverable failure with Result
Convert absence into an error message
Propagate failure with ?
Handle expected and unexpected failure differently

By The End

Distinguish Some, None, Ok, and Err
Choose between Option and Result
Convert Option to Result
Use ? to keep the success path readable
Use expect only when failure means a bug
Recognize fallible main functions

Terminal: Checked Matrix Access

cd source-code/error-handling
cargo run -- --rows 3 --cols 4

Fill the matrix through checked set
Read the matrix through checked get
Inspect the return types in matrix.rs

Missing Values With `Option`

Some(value)
None

Some contains a value
None means no value is present
The type system forces callers to handle both possibilities

Checked Index Calculation

fn index(&self, row: usize, col: usize) -> Option<usize> {
    if row < self.rows && col < self.cols {
        Some(row * self.cols + col)
    } else {
        None
    }
}

Valid coordinates produce Some(flat_index)
Invalid coordinates produce None
No error message is needed at this helper level

Transforming `Option` With `map`

pub fn get(&self, row: usize, col: usize) -> Option<f64> {
    self.index(row, col).map(|index| self.data[index])
}

map transforms the value inside Some
None stays None
The return type exposes possible absence

Manual `match` Equivalent

pub fn get(&self, row: usize, col: usize) -> Option<f64> {
    match self.index(row, col) {
        Some(index) => Some(self.data[index]),
        None => None,
    }
}

Same behavior as the map version
More explicit control flow
Useful when each case needs different work

Recoverable Failure With `Result`

Ok(value)
Err(error)

Ok contains the successful value
Err contains failure information
The error type is part of the function signature

Checked Mutation With `Result`

pub fn set(&mut self, row: usize, col: usize, value: f64) -> Result<(), String> {
    let index = self
        .index(row, col)
        .ok_or_else(|| format!("matrix index ({row}, {col}) is out of bounds"))?;
    self.data[index] = value;
    Ok(())
}

Success returns Ok(())
Failure returns Err(String)
The matrix is modified only after a valid index is found

Converting `Option` To `Result`

.ok_or_else(|| format!("matrix index ({row}, {col}) is out of bounds"))?

Some(index) becomes Ok(index)
None becomes Err(message)
The error message is built only when needed

The `?` Operator

let index = self
    .index(row, col)
    .ok_or_else(|| format!("matrix index ({row}, {col}) is out of bounds"))?;

Ok(index) extracts index
Err(error) returns early from the function
The successful path stays compact

Without `?`

let index = match self
    .index(row, col)
    .ok_or_else(|| format!("matrix index ({row}, {col}) is out of bounds"))
{
    Ok(index) => index,
    Err(error) => return Err(error),
};

The same propagation is written manually
The successful path is less direct
? is shorthand for this common pattern

Handling Known-Good Indices

for i in 0..matrix.rows() {
    for j in 0..matrix.cols() {
        matrix
            .set(i, j, (i * matrix.cols() + j) as f64)
            .expect("loop indices should be in bounds");
    }
}

The loops generate valid indices
Failure would indicate a programming mistake
expect documents that assumption

Reading Known-Good Indices

let value = matrix
    .get(i, j)
    .expect("loop indices should be in bounds");

get still returns Option
The caller decides how to handle None
expect is for impossible-in-this-context failure

Handling `None` At The Call Site

match matrix.get(row, col) {
    Some(value) => println!("{value}"),
    None => println!("no value at ({row}, {col})"),
}

Both cases are visible
The program can continue after absence
The message belongs near the user-facing boundary

Handling `Err` At The Call Site

match matrix.set(row, col, value) {
    Ok(()) => println!("value updated"),
    Err(message) => println!("{message}"),
}

Success and failure are separate cases
The error carries context
Recoverable errors do not need to panic
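
The checked helper, accessor, and mutator from the preceding slides fit into one runnable program. The zero-filling `new` constructor is an assumption; the other methods follow the slides.

```rust
struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Matrix {
    // Hypothetical constructor: a zero-filled rows x cols matrix.
    fn new(rows: usize, cols: usize) -> Self {
        Self { rows, cols, data: vec![0.0; rows * cols] }
    }

    // Absence is enough information at this level: Option.
    fn index(&self, row: usize, col: usize) -> Option<usize> {
        if row < self.rows && col < self.cols {
            Some(row * self.cols + col)
        } else {
            None
        }
    }

    pub fn get(&self, row: usize, col: usize) -> Option<f64> {
        self.index(row, col).map(|index| self.data[index])
    }

    // Failure needs an explanation here: Result with a message.
    pub fn set(&mut self, row: usize, col: usize, value: f64) -> Result<(), String> {
        let index = self
            .index(row, col)
            .ok_or_else(|| format!("matrix index ({row}, {col}) is out of bounds"))?;
        self.data[index] = value;
        Ok(())
    }
}

fn main() {
    let mut matrix = Matrix::new(3, 4);

    for (row, col) in [(1, 2), (7, 0)] {
        match matrix.set(row, col, 5.0) {
            Ok(()) => println!("value updated"),
            Err(message) => println!("{message}"),
        }
        match matrix.get(row, col) {
            Some(value) => println!("{value}"),
            None => println!("no value at ({row}, {col})"),
        }
    }
}
```

The program keeps running after the out-of-bounds attempt, which is the point of returning a value instead of panicking.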

`Option` Or `Result`?

Use Option<T> when absence is enough information
Use Result<T, E> when failure needs explanation
Use expect when failure means a bug
Use match when the program can continue differently
Use ? when the current function should propagate failure

Panics Versus Recoverable Errors

Panic for violated internal assumptions
Return Result for expected operational failure
File input, parsing, and configuration usually need Result
Indexing syntax panics on out-of-bounds access, as Rust slices do
Checked methods can return Option or Result

Fallible `main`

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // fallible work
    Ok(())
}

main can return Result
? can be used in command-line programs
The exact error type can be refined later
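
A fallible main is easiest to see with a helper that can actually fail. In this sketch `sum_values` is a hypothetical function that parses one number per line from a string; no file or CSV crate is involved.

```rust
use std::error::Error;

// Sum one floating-point number per line, skipping blank lines.
fn sum_values(text: &str) -> Result<f64, Box<dyn Error>> {
    let mut total = 0.0;
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        // ? converts the ParseFloatError into Box<dyn Error> and returns early.
        total += line.parse::<f64>()?;
    }
    Ok(total)
}

fn main() -> Result<(), Box<dyn Error>> {
    let total = sum_values("1.5\n2.5\n")?;
    println!("total: {total}");
    Ok(())
}
```

If `main` returns an `Err`, the program prints the error's debug representation and exits with a non-zero status.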

Example From CSV Input

use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    let mut reader = csv::Reader::from_path(args.file)?;
    for result in reader.deserialize() {
        let value: Values = result?;
        // use value
    }
    Ok(())
}

Opening the file can fail
Parsing a record can fail
? propagates both failures

Terminal: Error-Handling Exercises

cd source-code/error-handling
cargo check

Rewrite get with manual match
Add an out-of-bounds read and handle None
Add an out-of-bounds write and handle Err
Rewrite set without ?

Improve The Error Message

format!(
    "matrix index ({row}, {col}) is out of bounds for {} x {} matrix",
    self.rows,
    self.cols,
)

Include the invalid index
Include the valid shape
Put context in the error value

Hands-On Sequence

Run source-code/error-handling
Identify methods returning Option and Result
Rewrite get with match
Handle an out-of-bounds read
Handle an out-of-bounds write
Rewrite set without ?
Improve the ok_or_else message

Questions

Is absence expected and self-explanatory?
Does failure need an error message?
Should this function handle the error or propagate it?
Does expect document a real invariant?
Would a fallible main simplify the program?

Connection To The Next Module

Shared library code should report errors consistently
Tests should cover success and failure paths
Larger packages need reusable fallible functions
Numerical tests often need explicit tolerances

Module 9: Project Organization, Libraries, And Tests

Module Arc

Move from one binary to several targets
Share common rules through a library target
Keep binaries focused on coordination
Put tests near the behavior they check
Test numerical code with appropriate tolerances

By The End

Distinguish binary targets from library targets
Use src/lib.rs for shared package code
Run a specific binary with cargo run --bin
Add unit tests with #[cfg(test)] and #[test]
Use assert! and assert_eq!
Write floating-point checks with tolerances

From One Binary To Several Targets

src/
└── main.rs

src/
├── lib.rs
├── generate-data.rs
├── read-errors.rs
└── count-nucleotides.rs

Small examples can start with one binary
Related tools can live in one package
Shared rules belong in the library

Terminal: Related Binaries

cd source-code/hashmap-hashset
cargo run --bin generate-data -- --count 200 --file data.txt
cargo run --bin read-errors -- --file data.txt --output errors.txt --error-rate 0.2
cargo run --bin count-nucleotides -- --file errors.txt

Generate data
Inject read errors
Count valid and invalid tokens

Binary Targets In `Cargo.toml`

[[bin]]
name = "generate-data"
path = "src/generate-data.rs"

[[bin]]
name = "read-errors"
path = "src/read-errors.rs"

[[bin]]
name = "count-nucleotides"
path = "src/count-nucleotides.rs"

Each target has its own main
cargo run --bin NAME selects the executable
Program arguments still follow --

Library Target

pub const VALID_NUCLEOTIDES: [char; 4] = ['A', 'C', 'G', 'T'];
pub const ERROR_TOKENS: [char; 12] = ['B', 'D', 'E', 'F', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O'];

pub fn is_valid_nucleotide(value: char) -> bool {
    VALID_NUCLEOTIDES.contains(&value)
}

src/lib.rs defines reusable package code
pub exposes items to binaries
Shared rules stay in one place

Importing Library Code

name = "hashmap-hashset"

use hashmap_hashset::{VALID_NUCLEOTIDES, is_valid_nucleotide};

Hyphens in package names become underscores in crate names
Binaries import the package library by crate name
Shared definitions are not duplicated

What Belongs In `main`?

Parse command-line arguments
Open input and output resources
Call reusable logic
Report results

What Belongs Elsewhere?

Shared domain constants
Reusable validation functions
Core numerical algorithms
Testable helper functions

Unit Test Module

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn recognizes_valid_nucleotides() {
        for nucleotide in VALID_NUCLEOTIDES {
            assert!(is_valid_nucleotide(nucleotide));
        }
    }
}

#[cfg(test)] compiles the module only for test builds
#[test] marks a test case
Tests live close to the code they check

Using `super`

use super::*;

use super::quad;

super refers to the parent module
Tests can access nearby implementation items
Selective imports make dependencies clearer

Assertions

assert!(is_valid_nucleotide(nucleotide));

assert_eq!(matrix[(0, 0)], 1);
assert_eq!(matrix[(1, 1)], 4);

assert! checks a Boolean condition
assert_eq! compares expected equality
Equality failures show both values

Testing Error Paths

#[test]
fn rejects_ragged_rows() {
    let result = Matrix::try_from(vec![vec![1, 2], vec![3]]);

    assert!(result.is_err());
}

Invalid inputs need tests too
The test checks behavior, not implementation details
Error text can be tested separately when needed

Numerical Tests With Tolerances

#[test]
fn integrates_sine_on_zero_to_pi() {
    let result = quad(|x| f64::sin(x), 0.0, std::f64::consts::PI, 1000);

    assert!((result - 2.0).abs() < 2.0e-12);
}

Floating-point results often need tolerances
The tolerance depends on method and scale
Too loose hides bugs; too tight creates noise
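
A small helper can make the tolerance explicit in each test. This is one possible convention, not the course's: `approx_eq` is a hypothetical function that accepts a value if it is within either an absolute or a relative tolerance.

```rust
// True when actual is close to expected in the absolute or the relative sense.
fn approx_eq(actual: f64, expected: f64, abs_tol: f64, rel_tol: f64) -> bool {
    let difference = (actual - expected).abs();
    difference <= abs_tol || difference <= rel_tol * expected.abs()
}

fn main() {
    let sum = 0.1 + 0.2;
    // Exact comparison fails because of rounding in binary floating point.
    println!("exact equality: {}", sum == 0.3);
    println!("within tolerance: {}", approx_eq(sum, 0.3, 1e-12, 0.0));
}
```

An absolute tolerance suits results near a known scale, such as the integral of value 2.0 above; a relative tolerance suits results whose magnitude varies widely.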

Testing Trait Behavior

#[test]
fn displays_matrix_rows() {
    let matrix =
        Matrix::try_from(vec![vec![1, 2], vec![3, 4]])
            .expect("rows have the same length");

    assert_eq!(matrix.to_string(), "1 2\n3 4");
}

Test the public behavior users rely on
Formatting behavior is checked through to_string
Trait implementations deserve tests

Running Tests

cd source-code/hashmap-hashset
cargo test

cd ../traits
cargo test displays_matrix_rows

Run all tests while checking the package
Filter by test name while working on one behavior
Keep test names descriptive

Terminal: Add Shared Code

cd source-code/hashmap-hashset
cargo test

Add is_known_token
Test valid nucleotides
Test error tokens
Run the package tests

Terminal: Numerical Test Tolerance

cd source-code/enum-match
cargo test

Locate the Simpson integration test
Change the tolerance temporarily
Restore the original tolerance

Choosing Test Scope

Pure helper functions are easy unit-test targets
CLI parsing can stay in binaries at first
Shared algorithms belong in modules or libraries
Error paths should have at least one direct test
Numerical tests should state their tolerance clearly

Hands-On Sequence

Identify the three [[bin]] sections
Run each binary with cargo run --bin
Inspect public items in src/lib.rs
Add and test is_known_token
Run cargo test
Inspect the Simpson numerical test
Run one specific matrix test by name
Add one additional matrix behavior test

Questions

Which code is shared by more than one binary?
Which rules should live in src/lib.rs?
Does main coordinate or implement domain logic?
Which error path should be tested?
What tolerance is scientifically meaningful?

Connection To The Next Module

Reproducible random examples need explicit seeds
Data-generation tools benefit from shared helpers
Tests make stochastic and file-based code safer
Larger examples need clearer package boundaries

Module 10: Randomness And Reproducible Runs

Module Arc

Run the same random stream twice
Choose a named RNG algorithm
Parse distribution choices from the command line
Convert CLI choices to runtime distribution objects
Generate data that can be piped into another tool

By The End

Explain why seeds matter
Construct a seedable RNG
Sample uniform and normal distributions
Use ValueEnum for distribution choices
Pass an RNG explicitly to sampling code
Visualize generated samples with a histogram

Terminal: Same Seed, Same Stream

cd source-code/random-numbers
cargo run -- --count 5 --seed 42 --distribution uniform
cargo run -- --count 5 --seed 42 --distribution uniform

Same RNG
Same seed
Same sequence of sampling calls
Same output stream

Terminal: Change The Seed

cargo run -- --count 5 --seed 43 --distribution uniform

One input changed
The stream changes
The command records the run configuration

Why Reproducibility Matters

Random initialization affects simulations
Sampling choices affect results
Synthetic data should be regenerable
Bugs are easier to investigate with repeatable runs
A seed is part of the experiment

Choosing A Named RNG

use rand::SeedableRng; // brings seed_from_u64 into scope
use rand_chacha::ChaCha12Rng;

let mut rng = ChaCha12Rng::seed_from_u64(args.seed);

The RNG algorithm is explicit
The seed initializes the stream
Sampling mutates the RNG state

Command-Line Parameters

#[derive(Parser, Debug)]
struct Args {
    #[arg(short, long, default_value_t = 1)]
    count: usize,

    #[arg(short, long, default_value_t = 1234)]
    seed: u64,

    #[arg(short, long, default_value = "uniform")]
    distribution: DistributionKind,
}

Count controls stream length
Seed controls the random stream
Distribution controls the sampled values

CLI Choices With `ValueEnum`

#[derive(Clone, ValueEnum, Debug)]
enum DistributionKind {
    Uniform,
    Normal,
}

cargo run -- --help

Accepted values are explicit
clap parses the enum
Invalid choices are rejected at the boundary

Runtime Distribution Objects

enum RealDistribution {
    Uniform(Uniform<f64>),
    Normal(Normal<f64>),
}

CLI choices describe user intent
Runtime objects perform sampling
The program separates parsing from computation

From Choice To Distribution

impl RealDistribution {
    fn from_kind(kind: DistributionKind) -> Self {
        match kind {
            DistributionKind::Uniform => {
                Self::Uniform(Uniform::new(0.0, 1.0).expect("valid uniform distribution"))
            }
            DistributionKind::Normal => {
                Self::Normal(Normal::new(0.0, 1.0).expect("valid normal distribution"))
            }
        }
    }
}

uniform maps to [0.0, 1.0)
normal maps to mean 0.0, standard deviation 1.0
Distribution construction happens in one place

Sampling Method

fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> f64 {
    match self {
        Self::Uniform(distribution) => distribution.sample(rng),
        Self::Normal(distribution) => distribution.sample(rng),
    }
}

The method receives the RNG explicitly
The exact RNG type is generic
Sampling advances the RNG state

Generating A Stream

let args = Args::parse();
let mut rng = ChaCha12Rng::seed_from_u64(args.seed);
let distribution = RealDistribution::from_kind(args.distribution);

for _ in 0..args.count {
    let random_number = distribution.sample(&mut rng);
    println!("{}", random_number);
}

Parse configuration
Initialize the RNG
Sample repeatedly
Write one value per line

Terminal: Normal Samples

cargo run -- --count 5 --seed 42 --distribution normal

Same seed gives the same normal samples
Different distribution changes the stream of values
The command is enough to repeat the run

Data For Pipelines

cargo run -- --count 1000 --seed 42 --distribution normal > samples.txt

Standard output can be redirected
One sample per line is easy to inspect
Text output can feed other tools

Visualization Pipeline

cargo run -- --count 1000 --seed 42 --distribution normal | ./show-distribution.py

Rust generates samples
Python visualizes a histogram
The command records the full data-generation step

Python Helper

import sys
import plotly.graph_objects as go

values = [float(line.strip()) for line in sys.stdin if line.strip()]
figure = go.Figure(data=[go.Histogram(x=values, nbinsx=20)])
figure.show()

Read numbers from standard input
Build a histogram
Keep visualization separate from generation

Randomness And Program Design

Avoid hidden global RNGs
Pass RNGs into functions that sample
Keep seeds in CLI arguments, config, logs, or metadata
Keep the order of random draws reproducible
Treat random inputs as part of the run

Extending The Example

Change normal mean and standard deviation
Add another DistributionKind
Add a matching RealDistribution variant
Implement the new sampling branch
Compare histograms using the same seed and count

Hands-On Sequence

Run the default command
Repeat a command with the same seed
Change only the seed
Switch from uniform to normal
Inspect cargo run -- --help
Change normal distribution parameters
Add a new distribution choice
Visualize 1000 generated samples

Questions

Which inputs are needed to reproduce this run?
Is the RNG algorithm named explicitly?
Where is the seed recorded?
Does any function create hidden randomness?
Can the generated data be inspected or piped onward?

Connection To Later Modules

Julia set examples are deterministic by design
N-body examples use random initialization
Seeds make simulation comparisons repeatable
Output files and visualization become part of the workflow

Module 11: Data Parallelism With Rayon

Module Arc

Identify independent numerical work
Compare serial and parallel Julia set implementations
Replace a serial range with a parallel iterator
Collect independent results into a matrix
Measure with release builds and meaningful problem sizes

By The End

Explain data parallelism
Use rayon::prelude::*
Use into_par_iter on a range
Map flat indices to matrix coordinates
Avoid concurrent writes to one shared matrix
Control worker threads with RAYON_NUM_THREADS

Why Julia Set Work Parallelizes

z <- z * z + c

Each output grid point starts from its own z
Each point runs the same iteration rule
One grid point does not need another grid point’s result
The output is one value per grid point

Add Rayon

[dependencies]
clap = { version = "4.0", features = ["derive"] }
num-complex = "0.4"
rayon = "1.10"

use rayon::prelude::*;

The crate provides parallel iterators
The prelude brings Rayon traits into scope
into_par_iter becomes available

Terminal: Baseline Output

cd source-code/julia-set/julia-set-baseline
cargo run --release -- --width 400 --height 300 > /tmp/julia-serial.txt

Build with optimizations
Use a fixed image size
Save output for comparison

Serial Matrix Computation

fn iterate_z_matrix(z: &Matrix<Complex64>, c: Complex64, max_iterations: usize) -> Matrix<usize> {
    let mut result = Matrix::new(z.rows(), z.cols(), 0);
    for i in 0..z.rows() {
        for j in 0..z.cols() {
            let z_value = *z.get(i, j).expect("loop indices should be in bounds");
            let iterations = iterate_z_value(z_value, c, max_iterations);
            result.set(i, j, iterations)
                .expect("loop indices should be in bounds");
        }
    }
    result
}

Nested loops visit grid points one by one
The result matrix is mutated in place
This version is the reference behavior

Terminal: Rayon Output

cd ../julia-set-rayon
cargo run --release -- --width 400 --height 300 > /tmp/julia-rayon.txt
diff /tmp/julia-serial.txt /tmp/julia-rayon.txt

Use the same parameters
Compare output, not just speed
Parallel code should preserve results

Parallel Matrix Computation

let data: Vec<usize> = (0..rows * cols)
    .into_par_iter()
    .with_min_len(1000)
    .map(|index| {
        let row = index / cols;
        let col = index % cols;
        let z_value = *z.get(row, col).expect("flat index should be in bounds");
        iterate_z_value(z_value, c, max_iterations)
    })
    .collect();

The flat index range covers every grid point
Rayon splits work across worker threads
Each index yields one output value; with_min_len(1000) keeps each split at least 1000 indices long
collect builds the result vector in index order

Flat Index To Matrix Coordinates

let row = index / cols;
let col = index % cols;

One flat range replaces two nested loops
Division gives the row
Remainder gives the column
The mapping matches row-major storage
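
The index arithmetic can be checked in isolation. This small sketch uses two hypothetical helper functions (to_row_col and to_flat_index are not names from the course code) and assumes cols is greater than zero.

```rust
/// Row-major flat index -> (row, col). Assumes `cols > 0`.
fn to_row_col(index: usize, cols: usize) -> (usize, usize) {
    (index / cols, index % cols)
}

/// (row, col) -> row-major flat index, the inverse of `to_row_col`.
fn to_flat_index(row: usize, col: usize, cols: usize) -> usize {
    row * cols + col
}
```

With cols = 3, flat index 7 maps to row 2, column 1, and to_flat_index(2, 1, 3) maps back to 7.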

Avoid Shared Mutable State

.map(|index| {
    // compute one usize
})
.collect()

Each parallel task returns a value
No task writes into a shared result matrix
The collected vector is wrapped in a Matrix after the parallel work finishes
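
The same "return values, then collect" pattern can be sketched with the standard library only. This is not how Rayon works internally; it is a simplified stand-in that uses std::thread::scope with a fixed number of workers, and the function name parallel_map is hypothetical.

```rust
use std::thread;

/// Apply `kernel` to every index in 0..len using up to `workers` threads.
/// Each thread returns its own Vec; no thread writes into shared output.
fn parallel_map(len: usize, workers: usize, kernel: fn(usize) -> usize) -> Vec<usize> {
    let workers = workers.max(1);
    // Ceiling division, kept at least 1 so that step_by never sees a zero step.
    let chunk_len = ((len + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        // Spawn one thread per contiguous index range.
        let handles: Vec<_> = (0..len)
            .step_by(chunk_len)
            .map(|start| {
                let end = (start + chunk_len).min(len);
                scope.spawn(move || (start..end).map(kernel).collect::<Vec<usize>>())
            })
            .collect();

        // Join in spawn order, so the output keeps index order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

A call such as parallel_map(rows * cols, 4, some_kernel) returns the same vector as the serial (0..rows * cols).map(some_kernel).collect(), which is the property the diff check in the terminal step relies on.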

Matrix From Parallel Data

pub fn from_vec(rows: usize, cols: usize, data: Vec<T>) -> Result<Self, String> {
    if data.len() == rows * cols {
        Ok(Self { rows, cols, data })
    } else {
        Err(format!(
            "matrix data has {} elements, but shape ({rows}, {cols}) requires {}",
            data.len(),
            rows * cols
        ))
    }
}

The vector must match the requested shape
The matrix invariant is checked once
The parallel result becomes a normal matrix
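
To see the shape check in use, here is a minimal sketch of a matrix type around the same from_vec logic. The struct layout and the Option-returning get are assumptions for this sketch; the course's own Matrix type may differ.

```rust
/// Hypothetical minimal matrix with row-major storage.
#[derive(Debug)]
struct Matrix<T> {
    rows: usize,
    cols: usize,
    data: Vec<T>,
}

impl<T> Matrix<T> {
    /// Check the shape invariant once, when the vector is wrapped.
    fn from_vec(rows: usize, cols: usize, data: Vec<T>) -> Result<Self, String> {
        if data.len() == rows * cols {
            Ok(Self { rows, cols, data })
        } else {
            Err(format!(
                "matrix data has {} elements, but shape ({rows}, {cols}) requires {}",
                data.len(),
                rows * cols
            ))
        }
    }

    /// Bounds-checked access; returns None outside the shape.
    fn get(&self, row: usize, col: usize) -> Option<&T> {
        if row < self.rows && col < self.cols {
            self.data.get(row * self.cols + col)
        } else {
            None
        }
    }
}
```

Wrapping six values as a 2 x 3 matrix succeeds, while wrapping three values as a 2 x 2 matrix returns the error message instead of a matrix with a broken invariant.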

Running With Visualization

cd source-code/julia-set/julia-set-rayon
cargo run --release -- --width 800 --height 600 | ../view-fractal.py

Rust computes the iteration counts
Text output feeds the viewer
The command-line interface matches the baseline

Control Worker Threads

RAYON_NUM_THREADS=4 cargo run --release -- --width 800 --height 600 > /dev/null

Rayon uses a worker thread pool
The environment variable fixes the thread count
Thread count matters for scaling experiments

Thread Count Experiment

RAYON_NUM_THREADS=1 cargo run --release -- --width 1200 --height 1200 > /dev/null
RAYON_NUM_THREADS=2 cargo run --release -- --width 1200 --height 1200 > /dev/null
RAYON_NUM_THREADS=4 cargo run --release -- --width 1200 --height 1200 > /dev/null

Keep problem size fixed
Change one variable at a time
Interpret results with CPU core count in mind

Benchmark All Julia Set Variants

cd source-code/julia-set
WARMUP=1 RUNS=3 ./benchmark.sh

Build implementations in release mode
Smoke-test outputs
Compare with hyperfine

Larger Workloads Matter

WIDTH=1600 HEIGHT=1600 MAX_ITERATIONS=1000 WARMUP=1 RUNS=5 ./benchmark.sh

Small workloads can be dominated by overhead
Larger grids expose more parallel work
Release builds are required for meaningful comparisons

Rayon Thread Scaling Script

cd source-code/julia-set/julia-set-rayon
THREAD_COUNTS="1 2 4 8" ./benchmark.sh

Build only the Rayon implementation
Run the same workload with multiple thread counts
Compare scaling trends

When Rayon Fits

Many independent tasks
Enough work per task
CPU-bound computation
Safe collection or reduction of results
Minimal shared mutable state

When Rayon May Not Help

Tiny workloads
Strong dependencies between neighboring results
I/O-bound programs
Heavy synchronization
Work split into tasks that are too small

Hands-On Sequence

Run the serial baseline
Run the Rayon version with the same parameters
Compare outputs with diff
Inspect both iterate_z_matrix functions
Run with RAYON_NUM_THREADS=1
Run with more worker threads
Use a larger image size
Run the Rayon benchmark script

Questions

Which work units are independent?
What data is shared read-only?
Where are output values collected?
What overhead does parallelism introduce?
Is the benchmark problem large enough?

Connection To Later Material

Julia set variants compare implementation styles
Parallel reductions are a natural next topic
Numerical crates may expose parallel iteration
Reproducible random streams need care in parallel code
Larger simulations need explicit performance measurements

Module 12: Integrated Numerical Example: Julia Set

Module Arc

Start with a scalar numerical kernel
Map a rectangular grid to complex values
Store iteration counts in matrix-like output
Compare custom and library-backed storage
Move from command-line parameters to TOML configuration

By The End

Explain the Julia set computation steps
Use Complex64 for complex arithmetic
Map integer grid indices to floating-point coordinates
Separate scalar and matrix-level iteration
Compare implementation variants
Use configuration files for reproducible runs

Implementation Family

julia-set-baseline
julia-set-mdarray
julia-set-mdarray-expr-eval
julia-set-toml-config
julia-set-rayon
view-fractal.py

Mathematical Core

z <- z * z + c

c is fixed for one run
initial z varies across the grid
output is the escape iteration count
max_iterations caps the work per point

Scalar Kernel

fn iterate_z_value(z: Complex64, c: Complex64, max_iterations: usize) -> usize {
    let mut z_n = z;
    for n in 0..max_iterations {
        if z_n.norm() > 2.0 {
            return n;
        }
        z_n = z_n * z_n + c;
    }
    max_iterations
}

One input point
One fixed complex parameter
One iteration count
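
The kernel can be tried without any dependency. The sketch below replaces Complex64 with a hypothetical two-field struct and compares the squared norm with 4.0 instead of the norm with 2.0, which is the same escape test up to floating-point rounding.

```rust
/// Hypothetical stand-in for `num_complex::Complex64`, reduced to what the kernel needs.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Complex {
    re: f64,
    im: f64,
}

impl Complex {
    fn new(re: f64, im: f64) -> Self {
        Self { re, im }
    }

    // Squared magnitude; comparing it with 4.0 avoids a square root.
    fn norm_sqr(self) -> f64 {
        self.re * self.re + self.im * self.im
    }

    // One step of z * z + c, written out in real arithmetic.
    fn square_plus(self, c: Complex) -> Complex {
        Complex::new(
            self.re * self.re - self.im * self.im + c.re,
            2.0 * self.re * self.im + c.im,
        )
    }
}

/// Number of iterations before |z| exceeds 2, capped at `max_iterations`.
fn iterate_z_value(z: Complex, c: Complex, max_iterations: usize) -> usize {
    let mut z_n = z;
    for n in 0..max_iterations {
        if z_n.norm_sqr() > 4.0 {
            return n;
        }
        z_n = z_n.square_plus(c);
    }
    max_iterations
}
```

A starting point that is already outside the radius returns 0, and a point that never escapes returns max_iterations.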

Complex Numbers

use num_complex::Complex64;

let c = Complex64::new(args.c_real, args.c_imag);
z_n = z_n * z_n + c;

Complex numbers come from a crate
Arithmetic uses ordinary operators
norm() supports the escape test

Grid To Complex Plane

let domain_min = -2.0;
let domain_max = 2.0;
let delta_re = (domain_max - domain_min) / (cols as f64);
let delta_im = (domain_max - domain_min) / (rows as f64);

Integer grid dimensions become floating-point steps
cols controls the real-axis spacing
rows controls the imaginary-axis spacing

Initial Complex Grid

for i in 0..rows {
    for j in 0..cols {
        let z_value = Complex64::new(
            domain_min + j as f64 * delta_re,
            domain_min + i as f64 * delta_im,
        );
        z.set(i, j, z_value)
            .expect("loop indices should be in bounds");
    }
}

Nested loops visit the image grid
j sets the real part, i sets the imaginary part
The matrix stores initial complex values
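
The index-to-coordinate mapping can be isolated in one function. This sketch uses the same domain limits as the slides; the function name grid_point and the tuple return type are assumptions made for illustration.

```rust
const DOMAIN_MIN: f64 = -2.0;
const DOMAIN_MAX: f64 = 2.0;

/// Returns (real, imaginary) for grid cell (i, j) on a rows x cols grid.
fn grid_point(i: usize, j: usize, rows: usize, cols: usize) -> (f64, f64) {
    let delta_re = (DOMAIN_MAX - DOMAIN_MIN) / (cols as f64);
    let delta_im = (DOMAIN_MAX - DOMAIN_MIN) / (rows as f64);
    (
        DOMAIN_MIN + j as f64 * delta_re,
        DOMAIN_MIN + i as f64 * delta_im,
    )
}
```

On a 4 x 4 grid the first cell maps to (-2.0, -2.0) and the last cell to (1.0, 1.0): with this step size the upper domain limit itself is never sampled.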

Matrix-Level Iteration

fn iterate_z_matrix(z: &Matrix<Complex64>, c: Complex64, max_iterations: usize) -> Matrix<usize> {
    let mut result = Matrix::new(z.rows(), z.cols(), 0);
    for i in 0..z.rows() {
        for j in 0..z.cols() {
            let z_value = *z.get(i, j).expect("loop indices should be in bounds");
            let iterations = iterate_z_value(z_value, c, max_iterations);
            result.set(i, j, iterations)
                .expect("loop indices should be in bounds");
        }
    }
    result
}

Borrow input grid
Compute one scalar result per point
Return owned output matrix

Terminal: Baseline Variant

cd source-code/julia-set/julia-set-baseline
cargo run --release -- --width 400 --height 300 | ../view-fractal.py

Custom matrix type
Explicit nested loops
Command-line parameters
Text output to visualization

Command-Line Parameters

#[arg(short, long, default_value_t = 1000)]
max_iterations: usize,
#[arg(short = 'x', long, default_value_t = 800)]
width: usize,
#[arg(short = 'y', long, default_value_t = 600)]
height: usize,
#[arg(short = 'r', long, default_value_t = -0.5125)]
c_real: f64,
#[arg(short = 'i', long, default_value_t = 0.5213)]
c_imag: f64,

Numerical parameters are visible
Runs can be repeated from the shell command
Parameter changes affect the image

Terminal: Change The Complex Parameter

cargo run --release -- --width 400 --height 300 --c-real=-0.8 --c-imag 0.156 | ../view-fractal.py

Keep image size fixed
Change only c
Compare the resulting structure

External Array Storage With `mdarray`

type MatrixC = DArray<Complex64, 2>;
type MatrixCSlice = DSlice<Complex64, 2>;
type MatrixI = DArray<usize, 2>;

Dedicated array storage replaces the teaching matrix
Dimension is part of the type alias
Library indexing uses array syntax

`mdarray` Indexing

let mut z = MatrixC::from_elem([rows, cols], Complex64::new(0.0, 0.0));

z[[i, j]] = z_value;
let z_value = z[[i, j]];

Array shape is explicit
Indexing syntax is concise
The algorithm remains recognizable

Expression Evaluation Variant

expr::from_fn([rows, cols], |idx| {
    let i = idx[0];
    let j = idx[1];

    Complex64::new(
        domain_min + j as f64 * delta_re,
        domain_min + i as f64 * delta_im,
    )
})
.eval()

Grid construction becomes an expression
Index mapping stays explicit
More work is delegated to the array library

Matrix Iteration As An Expression

z.expr()
    .map(|&z_value| iterate_z_value(z_value, c, max_iterations))
    .eval()

Element-wise computation reads as a pipeline
The scalar kernel is reused
The matrix loop is hidden by the library abstraction

TOML Configuration

max_iterations = 1000
width = 800
height = 600
c_real = -0.5125
c_imag = 0.5213

Parameters are saved in a file
The run configuration can be edited and shared
The command line names the configuration file

Reading Configuration

#[derive(Debug, Deserialize)]
struct Config {
    max_iterations: usize,
    width: usize,
    height: usize,
    c_real: f64,
    c_imag: f64,
}

fn read_config(path: PathBuf) -> Result<Config, Box<dyn Error>> {
    let config_text = fs::read_to_string(path)?;
    let config = toml::from_str(&config_text)?;
    Ok(config)
}

serde maps TOML into a struct
I/O and parsing can fail
? propagates errors
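
How ? funnels different failures into Box<dyn Error> can be sketched with the standard library alone. The function below is not a TOML parser; it is a deliberately naive key = value lookup, and the name parse_usize_setting is hypothetical.

```rust
use std::error::Error;

/// Find `key = <integer>` in simple line-based text.
/// Both a missing key and an unparsable value end up as Box<dyn Error>.
fn parse_usize_setting(config_text: &str, key: &str) -> Result<usize, Box<dyn Error>> {
    for line in config_text.lines() {
        if let Some((name, value)) = line.split_once('=') {
            if name.trim() == key {
                // `?` converts the ParseIntError into Box<dyn Error> and returns early.
                let parsed = value.trim().parse::<usize>()?;
                return Ok(parsed);
            }
        }
    }
    // A plain message also converts into Box<dyn Error>.
    Err(format!("missing setting: {key}").into())
}
```

The caller sees one error type regardless of which step failed, which is the same shape as read_config returning either an I/O error or a parse error.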

Terminal: TOML Variant

cd ../julia-set-toml-config
cargo run --release -- julia-set.toml | ../view-fractal.py

Run parameters come from TOML
The shell command is shorter
The configuration file carries the details

Text Output And Visualization

cargo run --release -- --width 800 --height 600 | ../view-fractal.py

go.Heatmap(
    z=data,
    colorscale="Viridis",
    colorbar={"title": "Iterations"},
)

Rust computes numerical data
Text output keeps tools loosely coupled
Python provides quick visualization

Comparing Variants

Custom matrix versus array crate
Explicit loops versus expression evaluation
Command-line parameters versus configuration file
Minimal dependencies versus domain-specific crates
Fully visible steps versus delegated library behavior

Hands-On Sequence

Run and visualize the baseline
Change c_real and c_imag
Inspect initialize_z
Inspect iterate_z_value
Run the mdarray variant
Compare explicit loops with expression evaluation
Run the TOML variant
Edit julia-set.toml and rerun

Questions

Which code is the scalar numerical kernel?
Which code maps grid indices to coordinates?
Which storage choice is easiest to read?
Which run style is easier to reproduce later?
Which variant would be easiest to extend?

Connection To The N-Body Example

Both examples combine earlier language features
Both use command-line configuration
Both produce output for external visualization
Julia set is deterministic and compact
N-body adds time evolution, randomness, and diagnostics

Module 13: Integrated Numerical Example: N-Body Simulation

Module Arc

Represent a particle system as Rust data
Initialize from reproducible randomness
Evolve state with a time-integration method
Compute diagnostics during the run
Write structured output for analysis and animation

By The End

Explain how simulation state is stored
Identify random initialization and its seed
Follow the velocity Verlet update stages
Interpret energy and center-of-mass diagnostics
Distinguish evolution output from particle-state output
Use Python helpers to inspect results

Terminal: Default Simulation

cd source-code/n-body-simulation/rust
cargo run

Use default particle count
Use default seed
Run without writing output files

Command-Line Parameters

#[arg(long, default_value_t = 100)]
num_particles: usize,

#[arg(long, default_value_t = 1234)]
seed: u64,

#[arg(long, default_value_t = 0.001)]
delta_time: f64,

#[arg(long = "steps", default_value_t = 100)]
num_steps: usize,

Model size
Random seed
Time step
Number of steps

Optional Outputs

#[arg(long)]
save_evolution: Option<String>,

#[arg(long)]
save_states: Option<String>,

No file name means no output file
Some(filename) enables output
Optional outputs are not errors

Simulation State

pub struct System {
    xs: Vec<f64>,
    ys: Vec<f64>,
    zs: Vec<f64>,
    vxs: Vec<f64>,
    vys: Vec<f64>,
    vzs: Vec<f64>,
    masses: Vec<f64>,
    softening_length: f64,
}

Positions, velocities, and masses are stored in vectors
Fields are private
Methods control access and mutation

Random Initialization

pub fn new(num_particles: usize, seed: u64, softening_length: f64) -> Self {
    let mut rng = rand::rngs::StdRng::seed_from_u64(seed);
    // allocate vectors and sample values
}

The seed fixes the initial condition for a given rand version (StdRng's algorithm may change between releases)
Positions, velocities, and masses are sampled
The constructor returns an initialized system

Initialization Distributions

let position_distribution =
    Uniform::new(0.0, 1.0).expect("position distribution bounds should be valid");
let velocity_distribution =
    Normal::new(0.0, 1.0).expect("velocity distribution parameters should be valid");
let mass_distribution =
    Uniform::new(0.1, 1.0).expect("mass distribution bounds should be valid");

Positions use a uniform distribution
Velocities use a normal distribution
Masses use a positive uniform distribution

Gravitational Softening

let distance_squared =
    dx * dx + dy * dy + dz * dz + self.softening_length * self.softening_length;

Softening prevents extremely large close-range forces
It is a numerical modeling choice
Force and energy calculations use the same softening length

Acceleration On One Particle

fn acceleration_on(&self, index: usize) -> (f64, f64, f64) {
    let mut acceleration = (0.0, 0.0, 0.0);
    for i in 0..self.num_particles() {
        if i != index {
            let dx = self.xs[i] - self.xs[index];
            let dy = self.ys[i] - self.ys[index];
            let dz = self.zs[i] - self.zs[index];
            // accumulate contribution
        }
    }
    acceleration
}

One particle receives contributions from all others
The return value is a 3-tuple
The method only reads system state
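
One possible form of a single pairwise contribution is sketched below. The slides do not show the accumulated term, so this is an assumption: it uses G = 1 and a Plummer-style softened inverse cube, and the function name pair_acceleration is hypothetical.

```rust
/// Acceleration on a particle due to one other particle of mass `mass`,
/// where (dx, dy, dz) points from the particle to the other one.
/// Assumes G = 1 and Plummer-style softening.
fn pair_acceleration(
    dx: f64,
    dy: f64,
    dz: f64,
    mass: f64,
    softening_length: f64,
) -> (f64, f64, f64) {
    let distance_squared =
        dx * dx + dy * dy + dz * dz + softening_length * softening_length;
    // 1 / r^3 with the softened distance; finite even if the particles coincide,
    // as long as the softening length is positive.
    let inverse_distance_cubed = 1.0 / (distance_squared * distance_squared.sqrt());
    (
        mass * dx * inverse_distance_cubed,
        mass * dy * inverse_distance_cubed,
        mass * dz * inverse_distance_cubed,
    )
}
```

With zero softening and unit separation along x, a unit mass gives the acceleration (1.0, 0.0, 0.0). A positive softening length lowers that value and keeps the result finite at zero separation.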

Acceleration Collection

fn accelerations(&self) -> Vec<(f64, f64, f64)> {
    (0..self.num_particles())
        .map(|i| self.acceleration_on(i))
        .collect()
}

Iterator over particle indices
One acceleration tuple per particle
Results are collected before the update

Velocity Verlet: Positions

let accelerations = self.accelerations();
let half_dt_squared = 0.5 * dt * dt;

for i in 0..self.num_particles() {
    let (ax, ay, az) = accelerations[i];
    self.xs[i] += self.vxs[i] * dt + ax * half_dt_squared;
    self.ys[i] += self.vys[i] * dt + ay * half_dt_squared;
    self.zs[i] += self.vzs[i] * dt + az * half_dt_squared;
}

Compute current accelerations
Update positions
Keep old accelerations for velocity update

Velocity Verlet: Velocities

let new_accelerations = self.accelerations();

for i in 0..self.num_particles() {
    let (ax, ay, az) = accelerations[i];
    let (new_ax, new_ay, new_az) = new_accelerations[i];
    self.vxs[i] += 0.5 * (ax + new_ax) * dt;
    self.vys[i] += 0.5 * (ay + new_ay) * dt;
    self.vzs[i] += 0.5 * (az + new_az) * dt;
}

Recompute accelerations after moving particles
Average old and new accelerations
Update velocities in place
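
The two stages can be exercised on a much smaller problem. The sketch below applies the same update to a one-dimensional unit-mass harmonic oscillator with a(x) = -x; this test problem and the helper names are assumptions, not part of the N-body code.

```rust
/// Test problem: unit-mass harmonic oscillator.
fn acceleration(x: f64) -> f64 {
    -x
}

fn energy(x: f64, v: f64) -> f64 {
    0.5 * v * v + 0.5 * x * x
}

/// One velocity Verlet step, in the same two stages as the slides.
fn verlet_step(x: &mut f64, v: &mut f64, dt: f64) {
    let a_old = acceleration(*x);
    *x += *v * dt + 0.5 * a_old * dt * dt;
    // Recompute the acceleration at the new position, then average.
    let a_new = acceleration(*x);
    *v += 0.5 * (a_old + a_new) * dt;
}

/// Largest deviation of the energy from its initial value over a run
/// that starts at x = 1, v = 0.
fn max_energy_error(dt: f64, steps: usize) -> f64 {
    let (mut x, mut v) = (1.0, 0.0);
    let initial = energy(x, v);
    let mut worst: f64 = 0.0;
    for _ in 0..steps {
        verlet_step(&mut x, &mut v, dt);
        worst = worst.max((energy(x, v) - initial).abs());
    }
    worst
}
```

Comparing max_energy_error(0.01, 1000) with max_energy_error(0.005, 2000) covers the same simulated time with half the step. For this oscillator the energy error should shrink by roughly a factor of four, as expected for a second-order method.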

Diagnostics

pub fn kinetic_energy(&self) -> f64
pub fn potential_energy(&self) -> f64
pub fn total_energy(&self) -> f64
pub fn center_of_mass(&self) -> (f64, f64, f64)

Total energy drift exposes integration error
Center of mass tracks system drift
Diagnostics make simulations inspectable
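
Two of these diagnostics are short enough to sketch as free functions over slices. The signatures here are assumptions for illustration; in the course code they are methods on System, whose bodies are not shown.

```rust
/// Sum of 0.5 * m * |v|^2 over all particles.
/// Assumes all slices have the same length.
fn kinetic_energy(masses: &[f64], vxs: &[f64], vys: &[f64], vzs: &[f64]) -> f64 {
    (0..masses.len())
        .map(|i| 0.5 * masses[i] * (vxs[i] * vxs[i] + vys[i] * vys[i] + vzs[i] * vzs[i]))
        .sum()
}

/// Mass-weighted mean position along one axis.
/// Assumes the total mass is positive.
fn center_of_mass_axis(masses: &[f64], positions: &[f64]) -> f64 {
    let total_mass: f64 = masses.iter().sum();
    let weighted: f64 = masses.iter().zip(positions).map(|(m, x)| m * x).sum();
    weighted / total_mass
}
```

A single particle of mass 2 with velocity (3, 4, 0) has kinetic energy 25, and masses 1 and 3 at positions 0 and 4 have their center of mass at 3.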

Evolution CSV Records

#[derive(Serialize)]
struct EvolutionRecord {
    step: usize,
    potential_energy: f64,
    kinetic_energy: f64,
    total_energy: f64,
    center_of_mass_x: f64,
    center_of_mass_y: f64,
    center_of_mass_z: f64,
}

One row per time step
Compact diagnostic output
Useful for plots and sanity checks

Particle-State CSV Records

#[derive(Serialize)]
struct ParticleStateRecord {
    step: usize,
    particle: usize,
    x: f64,
    y: f64,
    z: f64,
    vx: f64,
    vy: f64,
    vz: f64,
    mass: f64,
}

One row per particle per time step
Larger output
Useful for animation and detailed inspection

Optional Writers

let mut evolution_writer = args
    .save_evolution
    .as_deref()
    .map(|filename| csv::Writer::from_path(filename).expect("Failed to create evolution file"));

Option<String> becomes Option<Writer>
Missing output file is valid
Writing happens only when a writer exists

Write Only When Enabled

fn write_evolution_record(
    writer: &mut Option<csv::Writer<std::fs::File>>,
    step: usize,
    system: &System,
) {
    if let Some(writer) = writer.as_mut() {
        writer
            .serialize(evolution_record(step, system))
            .expect("Failed to write evolution record");
    }
}

as_mut borrows the optional writer mutably
Some writes one record
None does nothing
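
The same Option pattern works with any std::io::Write target, which makes it easy to try without the csv crate. The sketch below is a simplified stand-in: the function name write_energy_row and the two-column row format are hypothetical, and it returns an io::Result instead of calling expect.

```rust
use std::io::Write;

/// Write one "step,energy" line if a writer is present; do nothing otherwise.
fn write_energy_row<W: Write>(
    writer: &mut Option<W>,
    step: usize,
    energy: f64,
) -> std::io::Result<()> {
    // as_mut turns &mut Option<W> into Option<&mut W> without taking ownership.
    if let Some(writer) = writer.as_mut() {
        writeln!(writer, "{step},{energy}")?;
    }
    Ok(())
}
```

Passing Some(Vec::new()) collects the bytes in memory, and passing None leaves everything untouched. Returning the error lets the caller decide whether a failed write should stop the simulation.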

Terminal: Save Diagnostics

cargo run -- --steps 200 --save-evolution evolution.csv
../visualize-evolution.py evolution.csv

Save energy and center-of-mass diagnostics
Plot diagnostic time series
Compare runs with different parameters

Terminal: Save Particle States

cargo run -- --steps 100 --save-states states.csv
../animate-states.py states.csv --output animation.html

Save per-particle state over time
Generate an animation
Inspect motion rather than only diagnostics

Parameter Experiments

Reduce --delta-time
Change --seed
Change --softening
Change --num-particles
Compare diagnostics and animations

Interpreting Energy

Total energy should usually vary less with smaller time steps
Softening changes the modeled interaction
Random initialization affects close encounters
Finite precision affects long runs

Hands-On Sequence

Run the default simulation
Save evolution diagnostics
Visualize diagnostics
Repeat with a smaller --delta-time
Change the seed
Save particle states
Animate particle states
Add one extra CSV diagnostic column

Questions

Which methods only read the system?
Which method mutates the system?
Which parameters define a reproducible run?
Which output file is appropriate for diagnostics?
When would a named Vector3 type improve the code?

Relation To The Julia Set Example

Julia set: deterministic grid computation
N-body: random initialization and time evolution
Julia set: matrix-like iteration counts
N-body: diagnostics and structured CSV output
Both use external visualization tools

Rust Scientific Computing Ecosystem

Section Arc

Core numerical data structures
Linear algebra and optimization
Differential equations
DataFrames, plotting, and data formats
Ecosystem checks before adoption

Ecosystem Reality Check

Rust has strong systems foundations
Scientific libraries are uneven
Many crates are domain-specific building blocks
Some workflows still need Python, R, Julia, or C/Fortran libraries

Arrays And Complex Numbers

ndarray: N-dimensional arrays
num-complex: complex scalar types
ndarray-linalg: linear algebra extension traits

Linear Algebra Choices

nalgebra: matrices, vectors, geometry
faer: dense linear algebra in Rust
ndarray-linalg: ndarray plus LAPACK-style operations

Optimization

argmin: numerical optimization framework

Useful for fitting, calibration, and inverse problems
Check available solvers, derivatives, constraints, and maintenance status

Differential Equations

diffsol: ODE and DAE solving
differential-equations: ODE, DDE, and SDE initial-value problems
ode_solvers: ODE solver building block

DataFrames And Analysis

polars: DataFrame and query engine

Lazy queries can optimize larger analysis pipelines
Good fit for structured tabular data and columnar formats

Visualization

plotly: interactive plots from Rust

HTML output works well for reports and lightweight inspection
Publication plotting may still be easier in Python, R, or Julia

Columnar Data Formats

arrow: Apache Arrow memory model
parquet: Apache Parquet files
polars: DataFrame workflows on top of Arrow-style data

HDF5

hdf5: HDF5 access from Rust

Useful when existing instruments or simulations already produce HDF5
Check native-library availability on clusters and CI systems

Adoption Checklist

Does the crate cover the required numerical method?
Are examples and documentation sufficient?
Are releases recent enough for your project risk?
Does it compose with the data structures you use?
Can it build on your target cluster or platform?

Typical Rust Roles

Reliable command-line tools
Fast data conversion and validation
Reproducible simulation kernels
Parallel batch computations
Libraries embedded in larger Python, R, or Julia workflows
