Hi! I am Han Lee.
I build and operate machine learning systems, with expertise in GenAI, agentic systems, LLM agents, search engines, recommendation engines, and large language models. I am the guy to call for fixing spaghetti code, processes, and orgs.
I write about machine learning engineering, evaluation, compound AI systems, and the tech industry — drawing on years of experience shipping ML at scale and investing in the sector.
A practical guide to designing and implementing AI evaluation systems, grounded in the research papers that shaped the field.
Early Access: 50% Off
Recent Posts
"Determinism" is the Biggest Cope in AI Adoption
We've never had determinism in software. We just had the illusion of it. Turing's halting problem and Rice's theorem show that no general procedure can verify the correctness of arbitrary programs, so guaranteed correctness was never something anyone could offer. What AI systems shift is not reliability but the evaluation surface.
The AI Great Leap Forward
In 1958, Mao ordered every village to produce steel. The steel was useless. The crops rotted. Today's top-down AI mandates are producing the same pattern: backyard furnaces building demoware nobody evaluates, inflated adoption metrics reported to leadership, essential roles eliminated without understanding second-order effects, and employees strategically sabotaging the knowledge extraction designed to replace them.
A Taxonomy of RL Environments for LLM Agents
A structured guide to RL environments for LLM agents. RL environments are the training grounds that shape what agents can learn. This guide covers the five core components (task distribution, harness, verifier, state management, config), the architectural question of where the model lives relative to the environment, verifier design principles, and a practical decision framework for building your own environments.
It's-a Me, Agentic AI
An intuitive guide to agentic AI model development and agent frameworks using Super Mario as an extended analogy. Small Mario is a base model, the Super Mushroom is the model harness, power-ups are agent skills, and learning to beat levels is reinforcement learning. If you can understand Mario, you can understand how agentic AI systems are built.
The Evaluation Design Lifecycle: From Business Need to Valid Metrics
A systematic process for translating stakeholder needs into valid, actionable AI evaluation metrics — the evaluation design lifecycle.