Architecture Overview

The tud_lbm package follows a physics-first folder structure combined with hexagonal architecture principles to create a maintainable, testable, and scalable lattice Boltzmann method library.

Core Design Principles

Physics First: Folders are organized by lattice physics concepts such as “lattice”, “collision operators”, “streaming operators”.
Separation of Concerns: Input adapters, simulation logic, and output handling are separated.
Immutability: State is immutable (JAX NamedTuples), enabling transparent JAX compilation
Registry Pattern: Operators self-register to avoid hardcoded factory dependencies
Protocol Contracts: Operators satisfy protocols, not inheritance hierarchies

Folder Structure Rationale

tud_lbm/
├── lattice/           ← Velocity model (d, q, c, w)
├── operators/         ← Physics transformations (collision, streaming, etc)
│   └── 9 subcategories
├── pipeline/          ← Execution harness (setup, runner, state)
├── config/            ← Pure immutable configuration
├── readers/           ← Input adapters (TOML, dict)
├── io/                ← Output adapters (plotting, saving)
├── cli/               ← Command-line interface
└── registry.py        ← Global operator registry

Brief Intro to Hexagonal Architecture

What is Hexagonal Architecture?

Hexagonal Architecture (also called “Ports and Adapters”) is a design pattern that isolates your core business logic from external dependencies like file formats, databases, user interfaces, and configuration sources. The core physics simulation sits at the center, surrounded by ports (abstract contracts) and adapters (concrete implementations). This separation makes the code highly flexible and extensible: you can add new input formats (JSON, YAML, HDF5), new output formats (HDF5, Parquet, NetCDF), or new execution contexts (CLI, Jupyter, API, batch processing) without modifying a single line of core simulation code. The hexagon represents your irreplaceable business logic, and the ports/adapters represent the interchangeable interfaces to the outside world.

Terminology: Core, Ports, and Adapters

1. The Core (Center of the Hexagon)

The core is your pure business logic — the irreplaceable physics simulation:

# Core: pure physics, no I/O
def build_setup(config: SimulationConfig) -> SimulationSetup:
    """Build simulation from config - doesn't care where config came from."""
    operators = registry.lookup_operators(config)
    return SimulationSetup(operators=operators, lattice=lattice)

def run(state: State, setup: SimulationSetup, n_steps: int) -> State:
    """Run simulation - doesn't know or care where output goes."""
    for _ in range(n_steps):
        state = apply_collision(state, setup)
        state = apply_streaming(state, setup)
        state = apply_boundary(state, setup)
    return state

Core characteristics: - Format-agnostic: Doesn’t know about TOML, JSON, YAML. - Pure: No side effects, no I/O, no database calls - Immutable: Works with frozen dataclasses and JAX arrays - Testable: No mocking needed, just pass data in, check data out

2. Ports (Abstract Interfaces)

Ports are the contracts your core logic depends on. They define WHAT your core needs without defining HOW it gets it:

# Port 1: Configuration Input
# "I need some kind of SimulationConfig"
def build_setup(config: SimulationConfig) -> SimulationSetup:
    pass  # Don't care if config came from TOML, JSON, dict, API, etc.

# Port 2: Operator Registry
# "I need to lookup operators somehow"
collision_fn = registry.get_operator("collision_models", "bgk")
# Don't care if registry is in-memory, database, or REST API

# Port 3: State Output
# "I return computed state, someone else handles it"
final_state = run(state, setup, n_steps=1000)
# Don't care if it goes to file, console, database, or memory

Three Input Ports:

Configuration Input Port — Accepts SimulationConfig dataclass
Operator Lookup Port — Uses registry to find operator implementations
Dependency Port — Receives Lattice and other dependencies

Three Output Ports:

State Output Port — Returns final State and diagnostics
Side Effects Port — Invokes optional callbacks during execution
Extensibility Port — Supports custom operator additions via registry

3. Adapters (Concrete Implementations)

Adapters are the concrete implementations that translate between external formats and the abstract port interfaces. They’re the “drivers” that connect external tools to your core:

Input Adapters (in readers/ directory):

Currently implemented adapters:

# TOML Adapter: Converts TOML file → SimulationConfig
class TOMLAdapter:
    def load_simulation_config(self, path: str) -> SimulationConfig:
        data = toml.parse(path)  # External tool
        return SimulationConfig(**data)  # Internal format

# Dictionary Adapter: Converts dict → SimulationConfig
class DictAdapter:
    def load_simulation_config(self, data: dict) -> SimulationConfig:
        # Already compatible, just validate
        return SimulationConfig(**data)

All adapters implement the same interface, so core doesn’t care which one is used:

from tud_lbm.readers import toml, dict as dict_reader

# Load from TOML file
config = toml.load_simulation_config("config.toml")

# Or load from Python dict
config = dict_reader.load_simulation_config({"lattice_type": "D2Q9", ...})

# Both return SimulationConfig - core is identical
setup = build_setup(config)  # ← No changes needed

Output Adapters (in io/ directory):

Currently implemented adapters handle data export and visualization:

# VTK Adapter: Converts State → VTK file (for ParaView visualization)
class VTKAdapter:
    def write(self, state: State, path: str) -> None:
        vtk_data = convert_state_to_vtk(state)  # Internal → External
        write_vtk_file(vtk_data, path)

# NumPy Adapter: Exports State → NumPy arrays for post-processing
class NumpyAdapter:
    def write(self, state: State, path: str) -> None:
        arrays = {
            'f': numpy.array(state.f),
            'u': numpy.array(state.u),
            'rho': numpy.array(state.rho)
        }
        numpy.savez_compressed(path, **arrays)

Future adapters could provide additional analysis and documentation capabilities:

# Future: Plotting Adapter (visualization in Jupyter/matplotlib)
# generate_plots(state, config) → matplotlib figures

# Future: Jupyter Notebook Generator (auto-generate reproducible notebooks)
# generate_notebook(config, final_state) → .ipynb file with:
#   - Simulation parameters
#   - Plots of velocity/density/pressure fields
#   - Physics analysis (kinetic energy, momentum conservation, etc.)
#   - Executable code cells for reproducibility

# Future: HDF5 Adapter (large-scale data storage with metadata)
# write_hdf5(state_trajectory, "results.h5")

Core doesn’t care which adapter is used. Here’s how to save results in different formats:

final_state = run(state, setup, n_steps=1000)

# Currently implemented output formats
from tud_lbm.io.output_data import write_vtk, write_numpy
write_vtk(final_state, "output.vtk")
write_numpy(final_state, "output.npz")

# Future: Visualization and reproducible notebooks
# from tud_lbm.io import generate_plots, generate_notebook
# generate_plots(final_state, config)
# generate_notebook(config, final_state, "report.ipynb")

Design Illustration

The diagram shows how data flows through the architecture:

┌──────────────────────────────────────────────────────────┐
│  EXTERNAL WORLD                                           │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  │
│  │   TOML   │  │  Dict    │  │  Jupyter │  │   VTK    │  │
│  │(currently)│  │(currently)│  │  input   │  │(currently)│ │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘  │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  │
│  │   JSON   │  │   YAML   │  │  NumPy   │  │ Plots &  │  │
│  │ (future) │  │ (future) │  │(currently)│ │ Notebooks│  │
│  │          │  │          │  │          │  │ (future) │  │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘  │
└───────┼─────────────┼─────────────┼─────────────┼─────────┘
        │             │             │             │
┌───────▼─────────────▼─────────────▼─────────────▼─────────┐
│  ADAPTER LAYER                                            │
│  ┌────────────────┐  ┌────────────────┐  ┌─────────────┐  │
│  │ Input Adapters │  │Output Adapters │  │ CLI/Jupyter │  │
│  │ (readers/)     │  │ (io/)          │  │ (cli/)      │  │
│  └────────────────┘  └────────────────┘  └─────────────┘  │
└───────┬─────────────────────────────┬──────────────────────┘
        │                             │
        │  SimulationConfig           │  State
        │  (port interface)           │  (port interface)
        │                             │
┌───────▼─────────────────────────────▼──────────────────────┐
│  CORE LOGIC (Pure Physics)                                │
│                                                             │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ build_setup(config) → SimulationSetup               │  │
│  │ run(state, setup, n_steps) → final_state            │  │
│  │ Operators: collision, streaming, boundary, etc.     │  │
│  └──────────────────────────────────────────────────────┘  │
│                                                             │
│  Dependencies (Immutable):                                │
│  • Lattice (D2Q9, D3Q19, ...)                             │
│  • Configuration                                           │
│  • Registry (operator lookup)                              │
│  • State & WettingState (JAX pytrees)                      │
└─────────────────────────────────────────────────────────────┘

Data Flow Through Hexagon

Scenario: Run simulation from Jupyter, output to NumPy arrays

Step 1: Input Adapter (Dict → Config)
─────────────────────────────────
config_dict = {
    "lattice_type": "D2Q9",
    "tau": 0.8,
    "grid_shape": [64, 64]
}
↓
[Dict Adapter converts to SimulationConfig]
↓
config = SimulationConfig(
    lattice_type="D2Q9",
    tau=0.8,
    grid_shape=(64, 64)
)


Step 2: Core Logic (Setup)
─────────────────────────
setup = build_setup(config)
↓
[Core looks up operators from registry]
[Core builds immutable SimulationSetup]
[No knowledge of where config came from!]


Step 3: Core Logic (Simulation)
───────────────────────────────
state = init_state(config, setup)
final_state = run(state, setup, n_steps=1000)
↓
[Pure functional computation]
[No I/O, no formatting, no file writing]
[No knowledge of where output will go!]


Step 4: Output Adapter (State → NumPy)
────────────────────────────────────
[NumPy Adapter converts JAX arrays to NumPy]
↓
write_numpy(final_state, "results.npz")
↓
File written, simulation code unchanged

Why This Architecture is Powerful

Flexibility Example 1: Change Input Format

from tud_lbm.readers import toml, dict as dict_reader
from tud_lbm.pipeline import build_setup

# Load from TOML
config = toml.load_simulation_config("config.toml")

# Or load from dict (Jupyter notebook)
config = dict_reader.load_simulation_config({
    "lattice_type": "D2Q9",
    "tau": 0.8
})

# Rest of code is identical!
setup = build_setup(config)
final_state = run(state, setup, n_steps=1000)

Flexibility Example 2: Change Output Format

final_state = run(state, setup, n_steps=1000)

# Multiple outputs simultaneously, no core changes needed!
from tud_lbm.io.output_data import write_vtk, write_numpy
write_vtk(final_state, "results.vtk")
write_numpy(final_state, "arrays.npz")

# Future: Simulation data handling and visualization
# from tud_lbm.io import generate_plots, generate_notebook
# generate_plots(final_state, config, "plots/")
# generate_notebook(config, final_state, "report.ipynb")
#
# The generated notebook would automatically include:
# - Simulation parameters and configuration
# - Velocity/density/pressure field visualizations
# - Physics diagnostics (conservation laws, stability checks)
# - Executable code cells for reproducibility

Flexibility Example 3: Add New Context

# Core is reusable everywhere with same code:

# 1. CLI (click framework)
@click.command()
def run_cli(config_path):
    config = toml_adapter.load_simulation_config(config_path)
    setup = build_setup(config)
    # ... same core code ...

# 2. Jupyter notebook
config = dict_adapter.load_simulation_config({...})
setup = build_setup(config)
# ... same core code ...

# 3. Batch processing with parameter sweep
for params in sweep_parameters():
    config = dict_adapter.load_simulation_config(params)
    setup = build_setup(config)
    # ... same core code ...

Testability Benefit:

def test_poiseuille_flow():
    """Test physics without any I/O"""
    # Create config in memory
    config = dict_adapter.load_simulation_config({
        "lattice_type": "D2Q9",
        "tau": 0.8,
        "grid_shape": [64, 64]
    })

    # Run core logic
    setup = build_setup(config)
    state = init_state(config, setup)
    final_state = run(state, setup, n_steps=1000)

    # Assert physics properties
    assert velocity_profile_is_parabolic(final_state.u)

    # No file system, no mocking, no adapter complexity!

Summary Table

Operator Registration System

Operators self-register via decorators:

from tud_lbm import registry

@registry.collision_operator(name="bgk")
def bgk_collision(f, feq, tau):
    return f - (f - feq) / tau

This avoids the factory anti-pattern where a single factory file must import everything. Instead:

Operators register themselves on import
Registry is a lookup table (key → implementation)
Factory simply does: registry.get_operators(kind)[name]

Benefits: - New operators added without modifying factory - Circular imports prevented (registry is independent) - Operator location reflects physics category

Configuration as Pure Data

Configuration uses immutable dataclasses:

@dataclass(frozen=True)
class SimulationConfig:
    grid_shape: tuple[int, ...]
    lattice_type: str
    tau: float
    run_type: str
    # ... other physics parameters

NOT (❌):

class Simulation:
    def __init__(self, ...):
        self.collision_fn = load_collision_operator(...)
        self.streaming_fn = load_streaming_operator(...)
        # Live objects embedded in config

Why immutable: - JAX pytree-compatible - No hidden state mutations - Thread-safe - Testable: same config → reproducible results - Config can be serialized/deserialized easily

State Management

Simulation state flows through JAX’s functional paradigm:

class State(NamedTuple):
    f: jax.Array           # Populations (d×q array)
    u: jax.Array           # Velocity field
    rho: jax.Array         # Density field

# Pure function: same state → same output
def step(state: State, setup: SimulationSetup) -> State:
    return apply_operators(state, setup.operators)

# Jitted timestep loop
final_state, diagnostics = jax.lax.scan(step, initial_state, n_steps)

Benefits: - Transparent JAX compilation (no side effects) - Reproducible results - Easy checkpointing/restart - Functional composition

Pipeline Execution

Three layers:

build_setup() (composition root) - Loads config - Instantiates operators from registry - Creates SimulationSetup (immutable, jittable)
init_state() (initialization) - Uses selected initializer operator - Returns initial State (NamedTuple)
run() (jitted timestep loop) - Uses JAX scan for efficient compilation - Applies collision, streaming, boundary operators - Returns final state + diagnostics

Input/Output Adapters

Readers (input adapters in readers/)

Convert external formats to SimulationConfig:

TOML file
  ↓
toml.py (parser)
  ↓
SimulationConfig (pure dataclass)

Benefits: - Decouples file format from core logic - Multiple input formats supported - Easy to add CSV, HDF5, JSON, etc.

IO Handlers (output adapters in io/)

Convert State to external formats:

State (JAX arrays)
  ↓
output_data.py (handler)
  ↓
VTK, NumPy, matplotlib plots

Benefits: - Core logic doesn’t know about file formats - Post-processing loosely coupled - Multiple simultaneous outputs - Testable independently

Why Not Everything in Operators?

Some libraries put everything (collision, streaming, boundaries, output) in a monolithic “operators” directory.

Our approach separates:

operators/ — Physics transformations (pure functions)
pipeline/ — Execution (composition, timing, callbacks)
io/ — Output formatting (presentation layer)
readers/ — Input parsing (presentation layer)

Rationale:

Operators are stable (physics doesn’t change)
Pipeline evolves (new execution strategies)
I/O is volatile (new format requests)
Readers are configurable (many input sources)

Separation enables:

Changing output format without touching collision code
Adding new input format without recompiling operators
Testing operators independently of file I/O
Reusing operators in different execution contexts (JAX, NumPy, GPU)

Test Organization

Tests mirror module structure:

tests/
├── test_tud_lbm_imports.py        ← Package structure
├── integration/
│   └── test_poiseuille.py         ← Physics validation
├── operators/
│   ├── test_collision.py          ← Unit tests
│   ├── test_streaming.py
│   └── ...
├── test_registry.py               ← Registry behavior
└── test_config.py                 ← Configuration

This structure: - Easy to find relevant tests - Enables parallel test execution - Clear separation of unit vs integration

Summary

The tud_lbm architecture achieves:

✓ Clarity — Physics-first naming, clear responsibilities ✓ Testability — Operators decoupled from I/O and config ✓ Extensibility — New operators/readers/writers without modifying core ✓ Performance — JAX jit-compatible immutable state ✓ Maintainability — Separation of concerns, minimal duplication