---
title: "Maintenance Matters: Good Tests"
date: 2023-11-29T09:41:18-05:00
draft: false
canonical_url: https://www.viget.com/articles/maintenance-matters-good-tests/
references:
  - title: "A year of Rails - macwright.com"
    url: https://macwright.com/2021/02/18/a-year-of-rails.html
    date: 2023-07-03T02:52:03Z
    file: macwright-com-o4dndf.txt
---

*This article is part of a series focusing on how developers can center and streamline software maintenance. The other articles in the Maintenance Matters series are: [Continuous Integration](/elsewhere/maintenance-matters-continuous-integration/), [Code Coverage](https://www.viget.com/articles/maintenance-matters-code-coverage/), [Documentation](https://www.viget.com/articles/maintenance-matters-documentation/), [Default Formatting](https://www.viget.com/articles/maintenance-matters-default-formatting/), [Building Helpful Logs](https://www.viget.com/articles/maintenance-matters-helpful-logs/), [Timely Upgrades](https://www.viget.com/articles/maintenance-matters-timely-upgrades/), and [Code Reviews](https://www.viget.com/articles/maintenance-matters-code-reviews/).*

In this latest entry to our [Maintenance Matters](https://www.viget.com/articles/maintenance-matters/) series, I want to talk about automated testing. Annie said it well in her intro post:

> There is a lot to say about testing, but from a maintainer's perspective, let's define good tests as tests that prevent regressions. Unit tests should have clear expectations and fail when behavior changes, so a developer can either update the expectations or fix their code. Feature tests should pass when features work and break when features break.

This is a topic better suited to a book than a blog post (and indeed [there are many](https://bookshop.org/search?keywords=software+testing)), but I do think there are a few high-level concepts that are important to internalize in order to build robust, long-lasting software -- I hope to cover them here.

My first exposure to automated testing was with Ruby on Rails. Since then, I've written production software in many different languages, but nothing matches the Rails testing story. Tom MacWright said it well in ["A year of Rails"](https://macwright.com/2021/02/18/a-year-of-rails.html):

> Testing fully-server-rendered applications, on the other hand, is amazing. A vanilla testing setup with Rails & RSpec can give you fast, stable, concise, and actually-useful test coverage. You can actually assert for behavior and navigate through an application like a user would. These tests are solving a simpler problem - making requests and parsing responses, without the need for a full browser or headless browser, without multiple kinds of state to track.

Partly, I think Rails testing is so good because it's baked into the framework: run `rails generate` to create a new model or controller and the relevant test files are generated automatically. This helped establish a community focus on testing, which led to a robust third-party ecosystem around it. Additionally, Ruby is such a flexible language that automated testing is really the only viable way to ensure things are working as expected.

This post isn't about Rails testing specifically, but I wanted to be clear on my perspective before we really dive in. And with that out of the way, here's what we'll cover:

1. [Why Test?](#why-test)
2. [Types of Tests](#types-of-tests)
3. [Network Calls](#network-calls)
4. [Flaky Tests](#flaky-tests)
5. [Slow Tests](#slow-tests)
6. [App Code vs. Test Code](#app-code-vs-test-code)

------------------------------------------------------------------------

### Why Test?

The single most important reason to make automated testing part of your development process is that it **gives you confidence to make changes**. This gets more and more important over time. With a reliable test suite in place, you can refactor code, change functionality, and make upgrades with reasonable certainty that you haven't broken anything. Without good tests ... good luck.

Secondarily, testing:

- helps during the development process (testable code is correlated with well-factored code, and it's a good way to review your work before you ship it off);
- provides a guide to code reviewers; and
- serves as a kind of documentation (though not a particularly concise one, and not as a replacement for proper written docs).

### Types of Tests

I write two main kinds of tests, which I call **unit tests** and **integration tests**, though my definitions differ slightly from the original meanings.

- **Unit tests** call application code directly -- instantiate an object, call a method on it, make assertions about the result. I don't particularly care what the object under test does in the course of doing its work -- calling off to other objects, performing I/O, etc. (this is where I differ from the official definition).
- **Integration tests** test the entire system end-to-end, using a framework like [Capybara](https://teamcapybara.github.io/capybara/) or [Playwright](https://playwright.dev/). We sometimes refer to these as "feature" tests in our codebases. (A sketch of both styles follows below.)
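
To make the distinction concrete, here's a minimal sketch of each style with RSpec and Capybara; the `Order` model, `LineItem`, and the checkout copy are hypothetical, invented for illustration:

```ruby
# Unit test: call application code directly and assert on the result.
# Order and LineItem are hypothetical names for this example.
RSpec.describe Order do
  it "computes a total from its line items" do
    order = Order.new(line_items: [LineItem.new(price: 10), LineItem.new(price: 5)])

    expect(order.total).to eq(15)
  end
end

# Feature test: drive the whole stack through the UI with Capybara.
RSpec.describe "Checkout", type: :feature do
  it "lets a user complete a purchase" do
    visit "/cart"
    click_button "Check out"

    expect(page).to have_content("Thanks for your order!")
  end
end
```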

End-to-end, black-box integration tests are absolutely critical and can cover most of your application's functionality by themselves. But it often makes sense to wrap complex logic in a module, test that directly (this is where [test-driven development](https://en.wikipedia.org/wiki/Test-driven_development) can come into play), and then write a simple integration test to ensure that the module is getting called correctly. I avoid [mocking and stubbing](https://en.wikipedia.org/wiki/Mock_object) if at all possible -- again, "tests should pass when features work and break when features break" -- and really only reach for it when it's the only option to hit 100% [code coverage](https://www.viget.com/articles/maintenance-matters-code-coverage/). In all cases, each test case should run against an empty database to avoid ordering issues.
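
With rspec-rails, the usual way to guarantee that clean slate is transactional tests -- a minimal sketch of the configuration:

```ruby
# rails_helper.rb -- give every example an empty database.
RSpec.configure do |config|
  # Wrap each example in a database transaction that is rolled back
  # when it finishes, so no test can see another test's data.
  config.use_transactional_fixtures = true

  # Run examples in a random order to surface any lingering
  # order-dependence.
  config.order = :random
end
```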

### Network Calls

One important exception to the "avoid mocking" rule is third-party APIs: your test suite should be entirely self-contained and shouldn't call out to outside services. We use [webmock](https://github.com/bblimke/webmock#real-requests-to-network-can-be-allowed-or-disabled) in our Ruby apps to block access to the wider web entirely. Some providers offer mock services that provide API-conformant responses you can test against (e.g., [stripe-mock](https://github.com/stripe/stripe-mock)). If that's not an option, you can use something like [VCR](https://github.com/vcr/vcr), which stores network responses as files and returns cached values on subsequent calls. Beware, though: VCR works impressively in small doses, but as a suite grows you can lose a lot of time re-recording "cassettes".
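
To make the webmock setup concrete, here's roughly what that blocking looks like in a spec helper:

```ruby
# spec_helper.rb -- fail loudly on any real network call.
require "webmock/rspec"

# Raise on any outbound HTTP request; keep localhost open so
# Capybara can still talk to its browser driver.
WebMock.disable_net_connect!(allow_localhost: true)
```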

Rather than leaning on VCR, I've instead adopted the following approach:

1. Wrap the API integration in a standalone object/module
2. Create a second stub module with the same interface for use in tests
3. Create a [JSON Schema](https://json-schema.org/) that defines the acceptable API responses
4. Use that schema to validate what comes back from your API modules (both the real one and the stub)

If the responses coming from the real API ever fail to match the schema, that indicates that your app and your tests have fallen out of sync, and you need to update both.
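
Here's a minimal sketch of that approach, assuming a hypothetical weather API; `WeatherClient`, `FakeWeatherClient`, and the endpoint URL are invented for illustration, and the schema validation uses the json_schemer gem (one of several options):

```ruby
require "json_schemer"
require "net/http"
require "json"

# The JSON Schema defining acceptable API responses.
FORECAST_SCHEMA = JSONSchemer.schema({
  "type" => "object",
  "required" => ["temperature", "conditions"],
  "properties" => {
    "temperature" => { "type" => "number" },
    "conditions" => { "type" => "string" }
  }
})

# The real client, used in production. (Hypothetical endpoint.)
class WeatherClient
  def forecast(zip)
    JSON.parse(Net::HTTP.get(URI("https://api.example.com/forecast?zip=#{zip}")))
  end
end

# The stub, used in tests -- same interface, canned data.
class FakeWeatherClient
  def forecast(_zip)
    { "temperature" => 72.0, "conditions" => "sunny" }
  end
end

# Both implementations are validated against the same schema, so a
# drifting API surfaces as a schema failure, not a silently-wrong stub.
def validated_forecast(client, zip)
  client.forecast(zip).tap do |data|
    raise "API response does not match schema" unless FORECAST_SCHEMA.valid?(data)
  end
end
```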

### Flaky Tests

Flaky tests (tests that fail intermittently, or only fail under certain conditions) are bad. They eat up a lot of development time, especially as build times increase. It's important to stay on top of them and squash them as they arise. A single test that fails one time in five maybe doesn't seem so bad, and it's easier to rerun the build than spend time tracking it down. But five tests like that mean the build is failing two-thirds of the time (each passes with probability 0.8, and 0.8^5 ≈ 0.33).

Some frameworks have libraries that will retry a failing test a set number of times before giving up (e.g., [rspec-retry](https://github.com/NoRedInk/rspec-retry), [pytest-rerunfailures](https://pypi.org/project/pytest-rerunfailures/)). These can be helpful, but they're a bandage, not a cure.
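
For reference, a minimal rspec-retry setup (per the gem's README) looks something like this -- a sketch, not an endorsement of papering over flakiness permanently:

```ruby
# spec_helper.rb
require "rspec/retry"

RSpec.configure do |config|
  config.verbose_retry = true                  # log each retry attempt
  config.display_try_failure_messages = true   # show why earlier tries failed

  # Retry only browser-driven feature specs, which flake most often.
  config.around :each, type: :feature do |example|
    example.run_with_retry retry: 3
  end
end
```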

### Slow Tests

The speed of your test suite is a much lower priority than the performance of your application. All else being equal, faster is better, but a slow test suite that fully exercises your application is vastly preferable to a fast one that doesn't. Time spent performance-tuning your tests can generally be better spent on other things. That said, it *is* worth periodically looking for low-hanging speed-ups -- if parallelizing your test runs cuts the build time in half, that's worth a few hours' time investment.
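
As one example, Rails 6+ can parallelize its default Minitest suite out of the box (RSpec users typically reach for a gem like [parallel_tests](https://github.com/grosser/parallel_tests)); a minimal sketch:

```ruby
# test/test_helper.rb -- Rails' built-in parallel test execution.
class ActiveSupport::TestCase
  # Fork one worker per processor; each worker gets its own database.
  parallelize(workers: :number_of_processors)
end
```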

During local development, I'll often run a subset of tests, either by invoking a test file or specific test case directly, or by using a wildcard pattern[^1] to run all the relevant tests. Combining that with running the full suite in [CI](/elsewhere/maintenance-matters-continuous-integration/) provides a good balance of flow and rigor. At some point, if your test suite is getting so slow that it's meaningfully impacting your team's work, it's probably a sign that your app has gotten too large and needs to be broken up into multiple discrete services.

### App Code vs. Test Code

Tests are code, but they're not application code, and the way you approach them should be slightly different. Some (or even a lot of) repetition is OK; don't be too quick to refactor. Ideally, someone can get a sense of what a test is doing by looking at a single screen of code, as opposed to jumping around between early setup, shared examples, complex factories with side-effects, etc.
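
As an example of that single-screen ideal, here's a hypothetical spec (the `Order` discount behavior is invented) where all the setup a reader needs is inline, rather than hidden in `before` blocks or nested shared contexts:

```ruby
RSpec.describe Order do
  describe "#total" do
    it "applies a percentage discount to the subtotal" do
      # Everything this example depends on is visible right here.
      order = Order.new(subtotal: 100, discount_percent: 10)

      expect(order.total).to eq(90)
    end
  end
end
```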

I think of a test case sort of like a page in a book. I don't expect to be able to open any random page in any random book and immediately grasp the material, but assuming I'm otherwise familiar with the book's content, I should be able to look at a single page and have a pretty good sense of what's going on. A book that frequently required me to jump to multiple other pages to understand a concept would not be a very good book, and a test that spreads its setup across multiple other files is not a very good test.

------------------------------------------------------------------------

Automated testing is a (perhaps **the**) critical component of sustainable software development. It's not a replacement for human testing, but with a reliable automated test suite in place, your testers can focus on what's changed and not worry about regressions in other parts of the system. It really doesn't add much time to the development process (provided you know what you're doing), and any increase in velocity you gain by forgoing testing is quickly erased by time spent fixing bugs.

[^1]: For example, if I'm working on the part of the system that deals with sending email, I'll run all the tests with `mail` in the filename via `rspec spec/{models,features,lib}/**/*mail*`.