The Testing Trophy and Testing Classifications
A general guide to the return on investment of the different forms of testing for JavaScript applications.
Overview

Kent C. Dodds explored the Testing Trophy in depth in a 2018 talk at the Assert(js) conference. He developed the idea from a well-known tweet by Guillermo Rauch, the CEO of Vercel and a co-founder of several successful JavaScript projects: "Write tests. Not too many. Mostly integration." There is a great deal of sense in this concise claim.
The four types of tests
- End to End - A helper robot that behaves like a user, clicking around the app and verifying that it functions correctly. Tool: Cypress.
- Integration - Verify that several units work together in harmony. Tool: Jest.
- Unit - Verify that individual, isolated parts work as expected. Tool: Jest.
- Static - Catch typos and type errors as we write code. Tool: TypeScript.
Why do we write tests?
I think it's important to remember why we write tests in the first place. Why do you write tests? Because you were told to? Or because your merge request (MR) will be rejected if it doesn't include tests?
The biggest and most important reason that I write tests is CONFIDENCE. I want to be confident that the code I'm writing for the future won't break the app that I have running in production today. So whatever I do, I want to make sure that the kinds of tests I write bring me the most confidence possible and I need to be cognizant of the trade-offs I'm making when testing.
Testing Types
End to End (E2E)
Typically these will run the entire application (both frontend and backend) and the test will interact with the app just like a typical user would. These tests are written with Cypress.
import { generate } from 'todo-test-utils';

// The cy.findBy* commands come from @testing-library/cypress.
// cy.visitApp() is assumed to be a project-specific custom command.
describe('todo app', () => {
  it('should work for a typical user', () => {
    const user = generate.user();
    const todo = generate.todo();

    cy.visitApp();

    cy.findByText(/register/i).click();
    cy.findByLabelText(/username/i).type(user.username);
    cy.findByLabelText(/password/i).type(user.password);
    cy.findByText(/login/i).click();

    cy.findByLabelText(/add todo/i)
      .type(todo.description)
      .type('{enter}');

    cy.findByTestId('todo-0').should('have.value', todo.description);
    cy.findByLabelText('complete').click();
    cy.findByTestId('todo-0').should('have.class', 'complete');
    // etc...
  });
});
Integration
The test below renders the full app. This is NOT a requirement of integration tests, and most integration tests don't render the full app. They will, however, render with all the providers used in my app (that's what the render method from the imaginary "test/app-test-utils" module does). The idea behind integration tests is to mock as little as possible.
We should only mock:
- Network requests (using MSW)
- Components responsible for animation (because who wants to wait for that in your tests?)
import * as React from 'react';
import { render, screen, waitForElementToBeRemoved } from 'test/app-test-utils';
import userEvent from '@testing-library/user-event';
import { build, fake } from '@jackfranklin/test-data-bot';
import { setupServer } from 'msw/node';
import { handlers } from 'test/server-handlers';
import App from '../app';

const buildLoginForm = build({
  fields: {
    username: fake((f) => f.internet.userName()),
    password: fake((f) => f.internet.password()),
  },
});

// integration tests typically only mock HTTP requests via MSW
const server = setupServer(...handlers);

beforeAll(() => server.listen());
afterAll(() => server.close());
afterEach(() => server.resetHandlers());

test(`logging in displays the user's username`, async () => {
  // The custom render returns a promise that resolves when the app has
  // finished loading (if you're server rendering, you may not need this).
  // The custom render also allows you to specify your initial route
  await render(<App />, { route: '/login' });
  const { username, password } = buildLoginForm();

  // user-event v14+ returns promises, so each interaction is awaited
  // (awaiting is harmless on older, synchronous versions)
  await userEvent.type(screen.getByLabelText(/username/i), username);
  await userEvent.type(screen.getByLabelText(/password/i), password);
  await userEvent.click(screen.getByRole('button', { name: /submit/i }));

  await waitForElementToBeRemoved(() => screen.getByLabelText(/loading/i));

  // assert whatever you need to verify the user is logged in
  expect(screen.getByText(username)).toBeInTheDocument();
});
For these types of tests, we should have a few things configured globally, such as automatically resetting all mocks between tests.
Learn how to set up a test-utils file like the one above in the React Testing Library setup docs.
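As a rough sketch of the global configuration mentioned above, a Jest config file can reset mocks automatically between tests. `resetMocks`, `restoreMocks`, and `setupFilesAfterEnv` are standard Jest options; the setup file path is a hypothetical placeholder, not something this guide defines.

```javascript
// jest.config.js (a minimal sketch, not a complete project config)
const config = {
  // Reset the state and implementation of every mock before each test.
  resetMocks: true,
  // Restore methods replaced with jest.spyOn to their originals.
  restoreMocks: true,
  // Hypothetical setup file, e.g. for importing custom matchers.
  setupFilesAfterEnv: ['<rootDir>/test/setup-env.js'],
};

module.exports = config;
```

With this in place, individual test files do not need their own `beforeEach(() => jest.resetAllMocks())` call.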
Unit
Pure functions are the BEST for unit testing: the same input always produces the same output, so there is nothing to mock.
import cases from 'jest-in-case';
import fizzbuzz from '../fizzbuzz';

cases(
  'fizzbuzz',
  ({ input, output }) => expect(fizzbuzz(input)).toBe(output),
  [
    [1, '1'],
    [2, '2'],
    [3, 'Fizz'],
    [5, 'Buzz'],
    [9, 'Fizz'],
    [15, 'FizzBuzz'],
    [16, '16'],
  ].map(([input, output]) => ({
    // jest-in-case reads each test's name from the `name` property
    name: `${input} => ${output}`,
    input,
    output,
  }))
);
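The test above imports `fizzbuzz` from a module that is not shown. A minimal sketch of what that pure function might look like, assuming it takes a positive integer and returns a string, is below. The implementation is hypothetical and only needs to satisfy the listed cases; export it in whatever way your module system requires.

```javascript
// Hypothetical implementation of the function under test.
// It is pure: the result depends only on `n`, with no side effects.
function fizzbuzz(n) {
  if (n % 15 === 0) return 'FizzBuzz'; // divisible by both 3 and 5
  if (n % 3 === 0) return 'Fizz';
  if (n % 5 === 0) return 'Buzz';
  return String(n);
}
```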
Static
// can you spot the bug?
// I'll bet ESLint's for-direction rule could
// catch it faster than you in a code review 😉
for (let i = 0; i < 10; i--) {
  console.log(i);
}
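For ESLint to flag the loop above, the `for-direction` rule has to be enabled. A minimal sketch using ESLint's flat config format in CommonJS, assuming a recent ESLint version, might look like this (the rule is also included in ESLint's recommended rule set):

```javascript
// eslint.config.js (flat config; a sketch, not a full project setup)
const config = [
  {
    rules: {
      // Report `for` loops whose counter moves away from the stop condition.
      'for-direction': 'error',
    },
  },
];

module.exports = config;
```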
Testing Trophy Trade-offs
Cost
As you move up the testing trophy, the tests become more costly. This comes in the form of actual money to run the tests in a continuous integration environment, but also in the time it takes engineers to write and maintain each individual test.
The higher up the trophy you go, the more points of failure there are and therefore the more likely it is that a test will break, leading to more time needed to analyze and fix the tests.
Speed
As you move up the testing trophy, the tests typically run slower. This is because the higher you are on the testing trophy, the more code your test is running. Unit tests typically test something small that either has no dependencies or mocks them (effectively swapping what could be thousands of lines of code for only a few).
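To illustrate that swap, here is a small sketch in which the unit under test receives its dependency as an argument, so a test can replace a real (and potentially large) data-access layer with a one-line fake. All names here (`formatGreeting`, `loadUser`, `fakeLoadUser`) are hypothetical.

```javascript
// Unit under test: builds a greeting for a user id.
// The dependency (loadUser) is passed in, so tests can swap it out.
function formatGreeting(userId, loadUser) {
  const user = loadUser(userId);
  return `Hello, ${user.name}!`;
}

// In production, loadUser might sit on top of a database client and
// thousands of lines of library code. In a unit test, a tiny fake
// stands in for all of it, which is why the test runs so quickly.
const fakeLoadUser = (id) => ({ id, name: 'Ada' });

const greeting = formatGreeting(42, fakeLoadUser);
```

The trade-off is that this test says nothing about whether the real `loadUser` works, which is where integration and E2E tests come in.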
Confidence: Simple problems 👌 ➡ Big problems 😖
The cost and speed trade-offs are typically referenced when people talk about the testing pyramid 🔺. If those were the only trade-offs, though, I would focus 100% of my efforts on unit tests and totally ignore every other form of testing. Of course we shouldn't do that, because of one super important principle that you've probably heard me say before:
The more your tests resemble the way your software is used, the more confidence they can give you.
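A small sketch of this principle, using a hypothetical `createCounter` module: the first check reaches into internal state, which a consumer never does, while the second only makes the calls a consumer would make.

```javascript
// Hypothetical module: a counter with a small public API.
function createCounter() {
  const state = { count: 0 }; // internal detail
  return {
    _state: state, // exposed only so the first check below is possible
    increment: () => { state.count += 1; },
    value: () => state.count,
  };
}

// Resembles an implementation detail: this breaks if `_state` is renamed
// or restructured, even though consumers would notice no difference.
const a = createCounter();
a.increment();
const detailCheck = a._state.count === 1;

// Resembles real usage: relies only on what a consumer actually calls.
const b = createCounter();
b.increment();
b.increment();
const usageCheck = b.value() === 2;
```

Both checks pass today, but only the second one keeps passing through a refactor that leaves the behavior unchanged.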
Conclusion
Every level comes with its own trade-offs. An E2E test has more points of failure, which often makes it harder to track down what code caused the breakage, but it also means that your test is giving you more confidence. This is especially useful if you don't have much time to write tests. I'd rather have the confidence and be faced with tracking down why a test is failing than not catch the problem via a test in the first place.