Blog | Dustin Smith's Online Resume

Feb 24, 2026

Interlock: A STAMP-Based Safety Framework for Data Pipelines

How I built a STAMP-based safety framework in Go with declarative sensors, failure classification, and centralized observability for data pipeline reliability on AWS

data engineering, go, aws, safety, dynamodb, step functions, terraform, eventbridge, observability

Feb 7, 2026

PySpark Pipeline Framework: Configuration-Driven Pipelines for the Python Ecosystem

How pyspark-pipeline-framework brings configuration-driven architecture, lifecycle hooks, and resilience patterns to PySpark

python, pyspark, data engineering, open source, configuration, streaming

Jan 26, 2026

Hardening Gastown: Role-Based Access Control for Multi-Agent Workflows

Configuring Gastown for production use with custom role contexts, Claude Code hooks, git guards, and file guards to enforce the principle of least privilege across AI agents

claude code, multi-agent, gastown, security, configuration, ai agents, hooks

Jan 25, 2026

Contributing to Gastown: Multi-Agent Orchestration for Claude Code

Five merged PRs and eight open contributions to Gastown, covering daemon resilience, fresh installation fixes, and autonomous patrol improvements

claude code, multi-agent, open source, gastown, llm, ai agents

Dec 23, 2025

Fitting 100 Statistical Distributions at Scale: 1000x Memory Reduction with PySpark

How spark-bestfit 3.0 fits distributions across Spark, Ray, and local backends with survival analysis, mixture models, and multivariate support

spark, python, data engineering, data science, statistics, optimization, ray, distributed computing, survival analysis

Dec 23, 2025

Building Production-Ready Spark Pipelines with Configuration-Driven Architecture

How spark-pipeline-framework reached 1.0 with Spark Connect support, streaming, and enterprise features

spark, scala, data engineering, open source, observability, spark connect, streaming

Feb 3, 2023

Delivery Hero 2023 January Layoffs

My experience of Delivery Hero's January 2023 layoffs.

layoffs, tech layoffs, delivery hero

Mar 24, 2022

Capital Budgeting with Monte Carlo Simulations in Python

How to use Monte Carlo simulations in Python to make better capital investment decisions, with a practical example of evaluating cloud migration costs.

python, finance, monte carlo, capital budgeting, data science

Oct 25, 2021

Configuration Files in Python Using Dataclasses

How to use the dataconf library to parse HOCON, JSON, YAML, and properties files directly into Python dataclasses with full type safety.

python, dataclasses, configuration, dataconf, type safety

Jul 28, 2021

Data Optimization for Compacted Partitions: Achieving 77% Storage Reduction

How data optimization with linear ordering and Z-ordering achieved a 77% storage reduction and 90% runtime improvements on petabyte-scale data lakes.

apache spark, data engineering, big data, optimization, parquet, orc