Robby is joined by Sara Jackson, Senior Developer at thoughtbot, to explore the practical ways teams can foster resilience—not just in their infrastructure, but in their everyday habits. They talk about why documentation is more than a chore, how to build trust in test suites, and how Chaos Engineering at the application layer can help make the case for long-term investment in maintainability.
Sara shares why she advocates for writing documentation on day one, how “WET” test practices have helped her avoid brittle test suites, and why she sees ports as a powerful alternative to full rewrites. They also dive into why so many teams overlook failure scenarios that matter deeply to end users—and how being proactive about those situations can shape better products and stronger teams.
Episode Highlights
[00:01:28] What Well-Maintained Software Looks Like: Sara champions documentation that’s trusted, updated, and valued by the team.
[00:07:23] Invisible Work and Team Culture: Robby and Sara discuss how small documentation improvements often go unrecognized—and why leadership buy-in matters.
[00:10:34] Why Documentation Should Start on Day One: Sara offers a “hot take” about writing things down early to reduce cognitive load.
[00:16:00] What Chaos Engineering Really Is: Sara explains the scientific roots of the practice and its DevOps origins.
[00:20:00] Application-Layer Chaos Engineering: How fault injection can reveal blind spots in the user experience.
[00:24:36] Observability First: Why you need the right visibility before meaningful chaos experiments can begin.
[00:28:32] Pitching Resilience to Stakeholders: Robby and Sara explore how chaos experiments can justify broader investments in system quality.
[00:33:24] WET Tests vs. DRY Tests: Sara explains why test clarity and context matter more than clever abstractions.
[00:40:43] Working on Client Refactors: How Sara approaches improving test coverage before diving into major changes.
[00:42:11] Rewrite vs. Refactor vs. Port: Sara introduces “porting” as a more intentional middle path for teams looking to evolve their systems.
[00:50:45] Delete More Code: Why letting go of unused features can create forward momentum.
[00:51:13] Recommended Reading: Being Wrong by Kathryn Schulz.
Resources & Links
- Sara on Mastodon
- thoughtbot
- RubyConf 2024 Talk – Chaos Engineering on the Death Star
- Book: Being Wrong by Kathryn Schulz
- Flu Shot on GitHub
- ChaosRB on GitHub
- Semian from Shopify: a resiliency toolkit for Ruby
Thanks to Our Sponsor!
Turn hours of debugging into just minutes! AppSignal is a performance monitoring and error-tracking tool designed for Ruby, Elixir, Python, Node.js, JavaScript, and other frameworks.
It offers six powerful features with one simple interface, providing developers with real-time insights into the performance and health of web applications.
Keep your coding cool and error-free, one line at a time!
Use the code "maintainable" to get a 10% discount for your first year. Check them out!