What is Test-Driven Development? A Practical Guide to TDD
This post covers Test-Driven Development (TDD) in Rust. TDD reduces debugging time, improves design, and creates living documentation. Code examples show the process in action. We'll cover the red-green-refactor cycle, the mindset shift TDD requires, and patterns for writing maintainable tests, then work through an example implementation built with TDD.
Introduction To TDD
Test-Driven Development (TDD) is a software development process created by Kent Beck. In TDD, you write a test for a piece of desired functionality before writing the production code that implements it. The test fails at first since the functionality does not exist. Then you write the minimum code needed to make that test pass. You repeat this cycle for each new piece of functionality while occasionally refactoring as needed.
TDD uses tests as executable specifications. Each test states a specific behavior the software should have. Another way to think about TDD is "outcome-based programming": the focus is on verifying that the software produces the right outcomes, independent of the internal implementation details.
About Tests
Given that tests are a central component of TDD (it's in the name after all), it's important to define what a test should look like. This way, whenever you read the word "test" in this post, you'll know what I'm actually talking about. Trust me, this will make a big difference in your understanding of TDD. I feel that this is important because I see a lot of pushback against TDD coming from the perspective of "I don't want to write tests all day" or "Writing tests will slow me down". Both of these statements originate from a misunderstanding of what maintainable test code looks like.
Note that I'm purposefully choosing something trivial to demonstrate tests. However, the principles can be applied to code of high complexity.
Bad Test Code
It's probably safe to assume that we have all written test code like this at some point:
#[test]
fn test_stack() {
let mut stack = Stack::new();
stack.push(1);
stack.push(2);
assert_eq!(stack.pop(), Some(2));
assert_eq!(stack.pop(), Some(1));
assert!(stack.inner.is_empty());
stack.push(3);
assert_eq!(stack.len(), 1);
}
Here's why this is a bad test:
- Ambiguous test name. "Testing" the stack can mean anything. We have no indication of what to look for when this test fails.
- Multiple assertions. Any assertion can fail and we won't know exactly what the problem is. We end up playing whack-a-mole by implementing a fix to one assertion but causing a failure in another assertion. Having multiple assertions actively gets in the way and slows us down.
- Accessing the private field inner to check the stack's contents. This locks our test to the internal implementation of Stack and will cause test breakage eventually. Tests need to make us go faster, not slow us down by breaking just because we decide to use something else internally.
Good Test Code
The above isn't terrible, but it definitely can be improved.
Let's start with the multiple-assertion issue. Tests should have only one assertion. More than one assertion means you are testing more than one thing.
Test names should also reflect exactly what is being tested. This way you know what failed just by looking at the test runner results.
Here is a list of potential tests that we can write. For the purposes of demonstration, I included my thought process as I wrote the tests. The comments are strictly for the blog post and wouldn't normally be present in your code.
// A stack needs to be able to push things on top.
fn pushes_single_item() { /* ... */ }
// If we push one thing, then we should probably
// try pushing more than one thing.
fn pushes_multiple_items() { /* ... */ }
// The inverse of pushing is popping, so let's
// make two more tests.
fn pops_single_item() { /* ... */ }
fn pops_multiple_items() { /* ... */ }
// What happens if we try to pop an empty stack?
fn pop_returns_none_when_stack_is_empty() { /* ... */ }
// Oh yeah, we need to ensure that popping a
// stack actually takes things off the top.
fn pops_are_in_lifo_order() { /* ... */ }
// It can be useful to know how many items are
// in the stack.
fn stack_maintains_correct_len() { /* ... */ }
I won't show the code for them all, but we can take a look at a few. I'll be using BDD (Behavior-Driven-Development) style tests. BDD tests have three sections which are included in the test as comments:
- Given: "Given" some precondition
- When: "When" something happens
- Then: "Then" there is some change
Here are some tests:
#[test]
fn pushes_single_item() {
// Given an empty stack
let mut stack = Stack::new();
// When a single item is pushed
stack.push(1);
// Then the length is 1
assert_eq!(stack.len(), 1);
}
#[test]
fn pushes_multiple_items() {
// Given an empty stack
let mut stack = Stack::new();
// When we push 2 items
stack.push(1);
stack.push(2);
// Then the length is 2
assert_eq!(stack.len(), 2);
}
#[test]
fn pops_single_item() {
// Given a stack containing 1 item
let mut stack = Stack::new();
stack.push(1);
// When the item is popped
let popped = stack.pop();
// Then the item is returned
assert_eq!(popped, Some(1));
}
#[test]
fn pops_are_in_lifo_order() {
// Given a stack with 2 items
let mut stack = Stack::new();
stack.push(1);
stack.push(2);
// When we pop 2 items
let two = stack.pop();
let one = stack.pop();
// Then we get the items back in LIFO order
assert_eq!((two, one), (Some(2), Some(1)));
}
Some things to note:
- Every test has discrete sections so they are easy to follow.
- The tests are small, making them easy to copy and paste for different cases. The small size also makes it trivial to update the tests later if you make a breaking change (we'll see later in the post how to avoid breaking changes).
- We only use public methods (no accessing .inner). This allows us to work on the implementation while keeping the tests maintainable.
So just to be clear: Whenever you read the word "test" in this post, I am referring to the simple three-section BDD-style tests above, NOT the multi-assertion code from the "bad test code" section.
What TDD Isn't
The pushback I mentioned ("I don't want to write tests all day" or "Writing tests will slow me down") often comes from exposure to bad tests or a misunderstanding of the process. Let's clear up what TDD isn't and then we can finally talk about what it actually is.
TDD isn't:
- Test suites before implementation
- TDD never asks for every test upfront. Write one failing test, pass it, repeat. The suite builds incrementally alongside the code.
- 100% coverage or bug elimination
- Tests verify specific behaviors you choose to drive development. TDD dramatically reduces regressions, but gaps remain.
- Unit tests only
- TDD principles apply to integration and end-to-end testing where they aid design clarity, though unit testing is where TDD fits best.
- Writing tests after implementation
- TDD requires the test before any code. If you thought that TDD was "write some code and then test all possible things", then that's incorrect.
- About writing tests
- Ironically, TDD isn't about tests at all. Tests are just the vehicle used to help create the implementation. When thinking in terms of TDD, it isn't "What test do I write?", but instead it's "What does my program need to do?".
TDD Mindset
Adopting TDD requires a shift in thinking. Instead of thinking in terms of code, you need to start thinking in terms of behaviors. This mindset treats tests as executable requirements that shape the implementation.
Specifying Over Implementing
When starting a new feature, force yourself to stop thinking about the code structure and start thinking about the desired outcome. The test is your opportunity to define the contract the code must fulfill.
Don't ask, "How should I write this function?" Instead, ask, "What does this part of the program need to do?" This shifts the mental effort from problem-solving to requirement articulation. You are trying to determine what a solved problem looks like instead of going straight for an implementation.
Acting as the User
Writing the test first forces you into the perspective of the consumer of the code. Because the implementation doesn't exist yet, you have to invent the API you wish you had. You are the first client of your own code.
This perspective requires you to prioritize usability. If the test is clunky, setting up the scenario requires too much boilerplate, or the method names don't read well, you abandon that approach and try a new design immediately. It costs you nothing because there is no implementation code at this point. It's just a few lines in a test. This enables you to freely design the API based on how you'd like it to be used.
Code Is Temporary
Once a test passes, the behavior is established, but the implementation can still change any time. The TDD mindset encourages you to view the code you just wrote as a temporary solution, regardless of its permanence.
Viewing the code as temporary means you need to be willing to delete or rewrite code as you go. The desired program behavior is encoded in the form of executable tests. So you don't need to worry about breaking anything while adding new behaviors. This frees your mind to focus purely on the one behavior being implemented.
Red-Green-Refactor
The cycle of Red-Green-Refactor is the primary way you'll implement a TDD workflow, so let's take a detailed look at how it works.
The Red Phase
The Red phase begins with a clear intent. You must identify a single behavior or requirement and write a test for it. This test should fail because the functionality does not exist yet.
You must observe the failure. If the test passes when you first write it, you have written a useless test, or the feature already exists. If it fails for a different reason than expected (e.g., a syntax error or a typo), you fix the test before moving on. This step validates that the test is capable of detecting the absence of the feature. It forces you to define the desired outcome before attempting to solve the problem.
The Green Phase
The objective of the Green phase is to transition the test from failing to passing as quickly as possible. This often means ignoring good design principles, performance, or even writing "hardcoded" solutions.
For instance, if a test expects that the function add(2, 2) returns 4, then returning 4 immediately without calculating anything is a valid Green phase strategy. The priority here is speed and feedback. You are establishing a baseline level of functionality and API design at this stage. This is not the final implementation. Writing complex logic during this phase slows down the Red-Green-Refactor loop and defeats the purpose of the process, and you will very likely introduce unnecessary complexity.
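A minimal sketch of this "fake it" strategy (the add function and test name are hypothetical, introduced here for illustration):

```rust
// The driving test: all it demands is that add(2, 2) == 4.
#[test]
fn adds_two_numbers() {
    assert_eq!(add(2, 2), 4);
}

// Green-phase implementation: hard-code the expected value.
// A later test such as add(1, 2) == 3 would force real logic.
fn add(_a: i32, _b: i32) -> i32 {
    4
}
```

The hard-coded return feels like cheating, but that's the point: it proves the test harness works end to end while you defer the real implementation to a later cycle.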
Once the test passes (green), then you can move on to the next test.
The Refactor Phase
The Refactor phase doesn't need to occur on every cycle of Red-Green-Refactor. There are two instances where using the Refactor phase provides the best benefit:
- Refactor as you go
- Refactor after implementing all desired behaviors
As you write tests, you'll likely run into duplicate test code. This code should be refactored regularly in order to keep the tests concise, but not on every Red-Green-Refactor cycle. Use the builder pattern, helper functions, and extension traits as needed to keep your tests maintainable. I will typically refactor test code once I notice that I start repeating things 3 or 4 times.
After implementing all the planned desired behaviors, you can then refactor the implementation code. This might involve extracting methods, renaming variables for clarity, or simplifying conditional logic. You have the tests to fall back on, so don't worry about breaking anything because you'll get a failing test and know right away. This is only performed when you are done because if you are still adding new behaviors, then the code is going to change anyway.
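As a small illustration of "simplifying conditional logic" with tests as a safety net, here is a hypothetical before/after refactor (both functions are assumptions invented for this sketch, not code from the stack example):

```rust
// Before: nested conditionals that work but read poorly.
fn describe_len_before(len: usize) -> &'static str {
    if len == 0 {
        "empty"
    } else {
        if len == 1 {
            "one item"
        } else {
            "many items"
        }
    }
}

// After: a flat match expresses the same behavior more clearly.
// Because behavior is unchanged, the existing tests keep passing.
fn describe_len_after(len: usize) -> &'static str {
    match len {
        0 => "empty",
        1 => "one item",
        _ => "many items",
    }
}
```

If the refactor had accidentally changed an output, a test would go red immediately, which is exactly the feedback the Refactor phase relies on.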
Automating Red-Green-Refactor
I recommend setting up your development environment so it automatically runs tests when you save changes. This will provide the most immediate feedback while using TDD.
If your tooling doesn't have built-in support for re-running tests on change, it's fairly trivial to implement yourself using the watchexec tool combined with just. For Rust, you can do something like this:
# justfile

# run `cargo test`
test:
    cargo test

# run tests on file save
tdd:
    watchexec --clear --debounce 500ms --wrap-process=none just test
Then run just tdd in a terminal off to the side so you can see test results as you code. For other languages, replace cargo test with whatever command builds and runs your tests.
Worked Example
Let's create a stack data structure using TDD.
First Test: pushes_one_item
We start with a test:
#[test]
fn pushes_one_item() {
let mut stack = Stack::default();
stack.push(1);
assert_eq!(stack.len(), 1);
}
Note
We design the API inside the tests. Since there is no implementation yet, we can make up whatever kind of API we would like for the stack. We are technically "using" the code before it's written, so it's easy to change (it doesn't exist yet). You will naturally gravitate toward simple APIs since you are using the API in every test.
We will need to create three things to get this to pass:
- The Stack data structure
- A push method
- A len method so we can confirm our result
Since we only implement things to pass this one test, we can do something like this:
#[derive(Default, Debug)]
pub struct Stack<T> {
data: Option<T>,
}
impl<T> Stack<T> {
pub fn push(&mut self, item: T) {
self.data = Some(item)
}
pub fn len(&self) -> usize {
// Hard-code 1 here because the test
// expects that value.
1
}
}
This isn't even close to a stack, right? That's OK when using TDD. As we declare more behaviors via tests, the implementation will change in order to pass the new tests. The tests give us confidence that changing the code doesn't change the behavior: if we do change existing behavior, a test will fail and we will need to correct the code.
TDD also has a built-in end point: Once we write a test for each behavior, and each passes, then we are done. We don't need to write extra code and we don't need to over-engineer various ways to use the code. We are already using the code in every test, so we know that it works the way we want it to.
Second Test: pushes_two_items
Well, we pushed one item to the stack, so we should probably try to push two items:
#[test]
fn pushes_two_items() {
let mut stack = Stack::default();
stack.push(1);
stack.push(2);
assert_eq!(stack.len(), 2);
}
Remember to run your tests before making any changes. If you've automated your test runner, then you'll already have a "Red" failing test.
Looking at the code, we only have space for a single item, but now we need to store two items. We can use a collection to store more than one thing:
#[derive(Default, Debug)]
pub struct Stack<T> {
// Change `Option<T>` to `Vec<T>`
data: Vec<T>,
}
impl<T> Stack<T> {
pub fn push(&mut self, item: T) {
// Push onto the Vec
self.data.push(item);
}
pub fn len(&self) -> usize {
// Return 2 now instead of 1
2
}
}
After this change, our test suite fails.
No worries though! TDD has helped us by indicating that we broke some expected behavior. The len method is now hard-coded to return 2 instead of 1, which makes the original test fail. To fix this, we need to return the length of the vector instead of a hard-coded value:
//... existing code ...
impl<T> Stack<T> {
//... existing code ...
pub fn len(&self) -> usize {
self.data.len()
}
}
This change makes both of our tests pass.
The remainder of the stack behavior will be omitted since this post is already getting lengthy. Implementation for all remaining features follows the same pattern:
- Write a test
- The test fails (Red)
- Implement functionality to make the test pass (Green)
- If you break other tests, then STOP and fix the code until they all pass again.
- Go back to step 1
This cycle creates a tight feedback loop. In our example, changing the return value of len to 2 broke the first test. Without that test, that bug might have persisted or been discovered much later. The tests act as a safety net, allowing you to modify the implementation without fear of breaking existing behavior.
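To make the end point concrete, here is a sketch (my assumption, not code from the post) of where the Stack plausibly lands after driving pop through the same Red-Green-Refactor cycles:

```rust
#[derive(Default, Debug)]
pub struct Stack<T> {
    data: Vec<T>,
}

impl<T> Stack<T> {
    pub fn push(&mut self, item: T) {
        self.data.push(item);
    }

    // Driven by pops_single_item, pops_are_in_lifo_order,
    // and pop_returns_none_when_stack_is_empty.
    pub fn pop(&mut self) -> Option<T> {
        self.data.pop()
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }
}
```

Note how thin the final code is: each method exists only because a test demanded it, which is the over-engineering guard TDD provides.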
Benefits of TDD
Now that we have seen the process in action, we can look at the specific advantages this workflow provides.
Reduced Debugging Time
When you write code without tests, you often spend a significant amount of time debugging issues that appear much later in the development cycle. With TDD, the feedback loop is immediate. You know exactly when and where a bug was introduced because the test you just wrote failed. You fix errors while the code logic is still fresh in your mind, rather than hunting for a regression days or weeks later.
Living Documentation
Documentation in code comments often becomes outdated as the codebase evolves. Tests, however, cannot become outdated without failing. A well-written test suite serves as accurate documentation. It shows exactly how the code is intended to be used, what inputs are valid, and what outputs to expect. New team members can read the tests to understand the system's behavior without needing to decipher implementation details.
Improved Design
Writing the test first forces you to consider the API from the user's perspective before worrying about implementation details. If the code is difficult to test, it often indicates that the design is too complex. TDD naturally guides you toward simpler, more modular interfaces because complex interfaces are painful to use in a test context.
Challenges of TDD
Adopting TDD is not without its hurdles. It requires discipline and a shift in perspective that can be difficult initially. Understanding these challenges helps in managing expectations when starting out.
The Learning Curve
The most immediate barrier is the change in workflow. We are used to writing code, compiling it, and checking if it works manually. Inserting a test step before every implementation change feels slow and disruptive. It can take weeks or months of practice before the Red-Green-Refactor cycle becomes second nature and the feedback loop starts to pay off in speed.
Brittle Tests
Good tests verify behavior, bad tests verify implementation. If a test is too coupled to the internal structure of the code, refactoring becomes a nightmare. You will find yourself fixing tests that broke not because the behavior changed, but because you renamed a variable or moved a method. Maintaining the distinction between testing public interfaces and private implementation details is a constant effort.
External Dependencies
TDD is easiest with pure functions and isolated logic. Real-world applications often involve databases, file systems, and third-party APIs. These elements are difficult to test due to speed, reliability, and state-management concerns. To practice TDD effectively with external dependencies, you'll need to learn how to write test doubles. Thankfully, LLMs have made this a mostly trivial effort.
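As a brief sketch of what a test double can look like in Rust (the Clock trait, FixedClock, and is_expired are hypothetical names invented for this example):

```rust
// The code under test depends on "the current time", which is
// abstracted behind a trait instead of read from the system clock.
trait Clock {
    fn now(&self) -> u64; // seconds since some epoch
}

// Test double: always reports a fixed, deterministic time.
struct FixedClock(u64);

impl Clock for FixedClock {
    fn now(&self) -> u64 {
        self.0
    }
}

// Behavior depends only on the trait, so tests can inject
// any time they need without waiting or flakiness.
fn is_expired(clock: &dyn Clock, deadline: u64) -> bool {
    clock.now() > deadline
}
```

In a test, FixedClock(100) lets you assert expiry behavior on either side of the deadline instantly; in production, a real implementation of Clock would wrap the system time.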
Maintainable Tests
To combat the fragility of tests mentioned in the previous section, we can apply design patterns within our test code. The goal is to reduce boilerplate and encapsulate setup details so that when the implementation changes, only a small amount of test code needs updating, not every individual test.
Note
If applying design patterns already sounds too heavy for test code, keep this in mind: test code is production code. Not only is it production code, but it also verifies that your production code actually works. Apply the same level of effort that you would for any other code.
You don't need to have pristine test code that runs optimally. Prefer maximally maintainable test code over all else.
The test code in the next few sections is added only in a test context, so it won't be present in a release build (or even a debug build). Note, however, that if you are using a workspace with many crates and you have cross-crate interactions during testing, then the test methods demonstrated will need to be made public in a regular (non-test) configuration.
The Builder Pattern
Tests often require complex objects to be in a specific state. Repeating the setup code in every test creates a maintenance burden. If the constructor signature changes or initialization requires more steps, you have to fix dozens of tests.
A builder makes the test intent clearer by hiding the initialization logic and saving you from having to change test code later. In this example, we'll use an extension trait combined with a builder:
// Create a trait so we can call it on the `Stack`
trait StackTestExt {
// Return a builder to create a `Stack`.
fn test_builder() -> StackBuilder;
}
// Implement on `Stack` so we can easily create
// the builder
impl<T> StackTestExt for Stack<T> {
fn test_builder() -> StackBuilder {
StackBuilder::default()
}
}
// A basic stack builder
#[derive(Default)]
struct StackBuilder {
items: Vec<i32>,
}
impl StackBuilder {
// Add new items to the stack.
fn with_items(mut self, items: &[i32]) -> Self {
self.items.extend_from_slice(items);
self
}
fn build(self) -> Stack<i32> {
let mut stack = Stack::default();
for item in self.items {
// Pushing the items is abstracted
stack.push(item);
}
stack
}
}
// Example test case using the new builder.
#[test]
fn pops_items_in_lifo_order() {
// Given a stack containing 3 items
let mut stack = Stack::test_builder()
.with_items(&[1, 2, 3])
.build();
// When we pop the items
let a = stack.pop();
let b = stack.pop();
let c = stack.pop();
// Then the items are in LIFO order
assert_eq!((a, b, c), (Some(3), Some(2), Some(1)));
}
Using a builder like this allows us to change even the public API of Stack. The construction and pushing of items are completely abstracted away. We could have hundreds of tests, yet only need to change one part of our test module (the builder) without touching any other test code.
Obviously if you take this to the extreme then all interactions with the Stack would be abstracted as well. Determining how far to take it is a matter of preference and experience. A rule of thumb would be: the more complicated your setup code and interactions are, the more likely you should add a test-specific abstraction on top of it.
Wrapper functions
While builders help create the initial state, wrapper functions help perform common actions on that state. Actions like pushing multiple items or setting up complex configurations often result in repetitive code within the test body.
Writing a helper function helps reduce the amount of code present in a test:
fn push_many(stack: &mut Stack<i32>, items: &[i32]) {
for &item in items {
stack.push(item);
}
}
#[test]
fn calculates_average() {
let mut stack = Stack::new();
// Instead of a loop or multiple `.push` calls:
push_many(&mut stack, &[10, 20, 30]);
// (`average` is a hypothetical method used for demonstration)
assert_eq!(stack.average(), 20.0);
}
Note that you can also extend Stack with any number of helper methods, similar to the builder in the previous section. However, avoid adding too many test extension methods via traits, because it can become confusing which methods are actually part of Stack and which were added for tests. A plain helper function makes this obvious, and using a test_ prefix in extension method names clarifies that they are test-only.
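For example, the push_many helper could instead be written as a test_-prefixed extension trait. This is a sketch assuming the Stack from the worked example, restated here so the snippet is self-contained:

```rust
// Minimal stack, restated for a self-contained example.
#[derive(Default)]
pub struct Stack<T> {
    data: Vec<T>,
}

impl<T> Stack<T> {
    pub fn push(&mut self, item: T) {
        self.data.push(item);
    }
    pub fn len(&self) -> usize {
        self.data.len()
    }
}

// Test-only extension trait; the `test_` prefix makes it
// obvious at the call site that this is not part of Stack.
trait StackTestExt<T> {
    fn test_push_many(&mut self, items: &[T])
    where
        T: Clone;
}

impl<T> StackTestExt<T> for Stack<T> {
    fn test_push_many(&mut self, items: &[T])
    where
        T: Clone,
    {
        for item in items {
            self.push(item.clone());
        }
    }
}
```

A test can then call stack.test_push_many(&[1, 2, 3]) and the prefix signals immediately that the method lives in the test module, not in the production API.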
Conclusion
Test-Driven Development is a design methodology that places feedback at the center of the coding process. By focusing on the interface before the implementation, you create software that is modular, decoupled, and easier to maintain.
While the learning curve can be steep and the initial pace may feel slower, the long-term benefits of reduced debugging time and living documentation are substantial. The patterns we discussed help keep the test suite manageable and resilient.
You do not need to adopt TDD for every line of code immediately. Start by writing a test before fixing a bug, or try building a small feature using the Red-Green-Refactor cycle. Over time, the confidence that comes from a comprehensive test suite will change the way you approach software development.
Want to learn Rust?
Establish strong fundamentals with my comprehensive video course which provides the clear, step-by-step guidance you need. Learn Rust starting from the basics up through intermediate concepts to help prepare you for articles on advanced Rust topics.
Enrollment grants you access to a large, vibrant Discord community for support, Q&A, and networking with fellow learners. 🚀 Unlock the full Rust learning experience, including our entire content library, when you enroll at ZeroToMastery.