SREs Aren’t Allergic to Meetings

Created on 2025-11-29 07:10

Published on 2025-12-01 11:30

SREs Aren’t Allergic to Meetings — We’re Allergic to Meetings That Don’t Earn Their Keep

Somewhere along the way, “SREs are allergic to meetings” became a personality test. If your calendar is a graveyard of 90-minute “syncs” where someone reads Jira tickets like bedtime stories, then yes, your sinuses will act up. But when the meeting has a clear hypothesis, produces decisions, and protects reliability? We’ll show up early, caffeinated, and maybe even bring pizza.

Let’s get real: the debate isn’t “meetings or no meetings.” It’s “which meetings actually improve reliability, speed, and sanity?” DevOps culture taught us to optimize systems end to end. The calendar is part of that system. Just like a brittle service, a brittle meeting ritual fails under load. And just like a good SLO, a good meeting sets expectations, measures outcomes, and gets tuned when it drifts.

What the Data Says (And Why SREs Should Care)

Across the industry, two trends collide. First, the number of meetings ballooned in the remote/hybrid era even as the average duration shrank a bit; shorter doesn’t always mean better if the signal-to-noise ratio stays the same. Research tracking hybrid work patterns found meetings have gotten slightly shorter and smaller, but still expanded in volume and time-zone sprawl, which pushes collaboration after-hours and erodes focus time. That matters for SREs because deep work is how we eliminate toil, wrangle flaky tests, and de-risk deployments.

Second, the best-performing engineering orgs keep reminding us that culture and disciplined practices beat chaos and calendar sprawl. DORA’s long-running research links healthy engineering culture, good documentation, and speedy, high-quality delivery with better outcomes. That doesn’t mean “more meetings”; it means tighter feedback loops, clearer decisions, and strong knowledge flows — the kind you get from well-run postmortems and design reviews. 

Meanwhile, many companies have tried the “calendar purge” experiment — canceling recurring meetings en masse — with mixed results. Shopify’s 2023 cleanse made headlines, saved a lot of hours, and ignited an industry-wide conversation about meeting ROI. But even at Shopify, the message wasn’t “never meet”; it was “earn your slot.” The lesson for SRE: default to async, but keep the rituals that safeguard reliability and learning. 

Meetings We Actually Show Up For: Postmortems, Chaos Drills, and the Pizza Principle

SREs love meetings that are really just structured experiments with human participants. A blameless postmortem isn’t a meeting; it’s a reliability learning engine. It has a hypothesis (“We can prevent a recurrence if we understand the contributing factors”), a method (timelines, evidence, and analysis), and a result (action items with owners and due dates). Google’s SRE guidance is explicit: postmortems must be blameless, focus on systemic causes, and lead to concrete improvements. That shared psychological safety is why people tell the truth when the pager stops screaming. 

Chaos drills are meetings with reality as the guest speaker. The principles of chaos engineering encourage us to form a hypothesis about steady state, induce failure, and see if the system behaves as expected. Run them regularly and your mean time to WTF drops. If you want fewer 3 a.m. surprises, do more 3 p.m. rehearsals. 

And then there’s the pizza factor. It’s not the carbs; it’s the sense of occasion. A lightweight brown-bag on “Why our retry storm behaved like a toddler on espresso” wins mindshare because it feels like we’re together solving something real. Culture amplifies knowledge.

Two Camps, One Calendar: The Great Meeting Debate

In every org, there are two loud camps.

There’s the Delete-The-Meeting faction. They bring receipts about how meetings fragment flow time, inflate after-hours work, and create what researchers call “meeting hangovers.” Their solution is aggressively async: pre-reads, comments in docs, and only convene when there’s a decision to be made. They’ll cite experiments where cutting recurring meetings freed weeks of time. They’re not wrong; context switching is the enemy of reliability work. 

Then there’s the Sync-Or-Sink faction. They argue that complex systems need rapid sensemaking. Incidents don’t resolve via comment threads; you need a commander, a comms lead, and a video bridge. Architecture changes benefit from synchronous debate because the unknown unknowns surface faster in real time. Hybrid work changed the mechanics of meetings, not the need for moments of alignment. Again, not wrong: incident war rooms, pre-mortems, and postmortems are high-leverage when they’re concise and facilitated.

Both camps agree on a hidden truth: bad meetings are a reliability risk. They throttle velocity, hide decisions, and scatter accountability. Reform is not optional; it’s part of your reliability strategy.

The Calendar as an SRE System: Design It Like You Design Prod

If we treated meetings as production traffic, we’d throttle, cache, shed load, and measure user happiness. So let’s apply SRE thinking.

First, design meetings as services with SLOs. Define the purpose (“approve RFC-42 on canary strategy”), the latency budget (start and end on time), the error budget (one missed pre-read tolerated per quarter), and the success metric (decision logged in the design record within 24 hours). The SLO metaphor forces clarity: if your meeting can’t state its user journey, it doesn’t deserve users.
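To make the metaphor concrete, here is a minimal sketch of a meeting SLO written down as data and checked against a quarter of meeting records. The class names, field names, and the evaluate helper are all hypothetical, invented for illustration; only the four SLO ingredients (purpose, latency budget, error budget, success metric) come from the paragraph above.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional


@dataclass(frozen=True)
class MeetingSLO:
    """Hypothetical SLO for one recurring meeting."""
    purpose: str
    scheduled_minutes: int               # latency budget
    missed_prereads_allowed: int         # error budget, per quarter
    decision_log_deadline: timedelta     # success metric


@dataclass(frozen=True)
class MeetingRecord:
    """What actually happened in one occurrence."""
    actual_minutes: int
    preread_missed: bool
    decision_logged_after: Optional[timedelta]  # None means never logged


def evaluate(slo: MeetingSLO, records: List[MeetingRecord]) -> dict:
    """Summarize one quarter of records against the SLO."""
    overruns = sum(1 for r in records if r.actual_minutes > slo.scheduled_minutes)
    missed = sum(1 for r in records if r.preread_missed)
    late = sum(
        1 for r in records
        if r.decision_logged_after is None
        or r.decision_logged_after > slo.decision_log_deadline
    )
    return {
        "overruns": overruns,
        "error_budget_remaining": slo.missed_prereads_allowed - missed,
        "late_or_missing_decisions": late,
    }
```

A negative error_budget_remaining is the calendar equivalent of a burned error budget: a signal to pause and fix the ritual, not to assign blame.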

Second, instrument the meeting path. For incident reviews, track time-to-draft, time-to-review, percentage of action items closed on schedule, and recurrence rates. For design reviews, measure time from proposal to decision, rework due to unclear non-functional requirements, and support tickets tied to ambiguous decisions. Data wins arguments and shrinks calendars.
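Two of those incident-review metrics are easy to compute from data most teams already have. The sketch below is one possible way to do it, assuming action items carry a due date and an optional close date, and that each incident has been tagged with a single contributing-cause label. The ActionItem shape, the function names, and the tagging scheme are assumptions, not a standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ActionItem:
    owner: str
    due: date
    closed: Optional[date] = None  # None means still open


def on_time_close_rate(items: List[ActionItem], as_of: date) -> Optional[float]:
    """Fraction of items due on or before `as_of` that were closed by their due date."""
    due_items = [i for i in items if i.due <= as_of]
    if not due_items:
        return None  # nothing was due yet, so there is no rate to report
    on_time = sum(1 for i in due_items if i.closed is not None and i.closed <= i.due)
    return on_time / len(due_items)


def recurrence_rate(cause_tags: List[str]) -> float:
    """Fraction of incidents whose cause tag already appeared earlier in the list."""
    seen = set()
    repeats = 0
    for tag in cause_tags:
        if tag in seen:
            repeats += 1
        seen.add(tag)
    return repeats / len(cause_tags) if cause_tags else 0.0
```

Items that are not yet due are left out of the close rate on purpose, so a fresh postmortem does not drag the number down before anyone has had a chance to act.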

Third, practice progressive delivery for rituals. Pilot “no-meeting Wednesdays” with one team and compare their focus time, deployment frequency, and change failure rate against their own baseline before rolling out more widely. You wouldn’t ship a new load balancer to every region at once; don’t ship a new meeting policy that way either.
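A pilot only helps if you decide in advance what “good” looks like. The following is a rough sketch of such a gate, comparing a pilot period with the same team’s baseline. The PeriodStats fields, the safe_to_expand name, and the two-percentage-point tolerance on change failure rate are illustrative assumptions; pick thresholds that fit your own data, and treat small samples with suspicion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodStats:
    """Hypothetical summary of one team over one measurement period."""
    focus_hours_per_week: float
    deploys: int
    failed_deploys: int
    weeks: int

    @property
    def deploys_per_week(self) -> float:
        return self.deploys / self.weeks

    @property
    def change_failure_rate(self) -> float:
        # Share of deployments that caused a failure; 0.0 if nothing shipped.
        return self.failed_deploys / self.deploys if self.deploys else 0.0


def safe_to_expand(baseline: PeriodStats, pilot: PeriodStats,
                   max_cfr_increase: float = 0.02) -> bool:
    """True if the pilot kept or improved focus time and deploy frequency
    without pushing change failure rate past the allowed tolerance."""
    return (
        pilot.focus_hours_per_week >= baseline.focus_hours_per_week
        and pilot.deploys_per_week >= baseline.deploys_per_week
        and pilot.change_failure_rate <= baseline.change_failure_rate + max_cfr_increase
    )
```

This is a before-and-after comparison on one team, so it cannot separate the policy from everything else that changed that month. It is a canary check, not a controlled experiment.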

Real Stories from the Pager Trenches

At one company, we ran a weekly cross-team status meeting that was 75% venting, 25% reading dashboards aloud. The on-call engineers hated it, so we replaced it with a 20-minute “ops review” with a rotating facilitator, a shared pre-read, and a ruthless focus on three questions: what degraded, what improved, and what we’re trying next. We added a simple rule: if a topic can’t be resolved in two minutes, it becomes a follow-up doc with a DRI. Two months later, deployment frequency ticked up, and the number of “FYI” Slack threads dropped because decisions were recorded in one place.

Another team had “postmortems” that were really ceremony without outcomes. They wrote long narratives and stopped there. We added a service-level twist: each postmortem had to produce one reliability hypothesis, one anti-fragility experiment, and one toil-killer automation. The next quarter, repeat incidents for the top offender fell by half. Coincidence? Maybe. Causation? Also maybe. But the conversations changed. People started asking, “What experiment did we ship?” rather than, “How many slides do we have?”

Actionable Approaches That Make Meetings SRE-Friendly

Start by making meetings the last resort, not the first impulse. Default to a short pre-read: two pages, a few diagrams, and a clear call to action. Asynchronous comments settle most questions without summoning twelve people to hear two speak. When you do meet, you’re debating trade-offs, not deciphering context. DORA’s research emphasizes the outsized impact of documentation quality on performance; treat good pre-reads as docs that pay you back in flow time.

Next, timebox with compassion. For high-stakes topics (incident reviews or risky migrations), schedule shorter, more frequent sessions rather than a single marathon. The human brain, especially the post-incident brain, has a limited battery. A crisp 25-minute review with stand-up energy often surfaces the key insights faster than a wandering hour. Hybrid work trend data shows shorter meetings are feasible; embrace that pattern deliberately.

Then, assign roles like you do during incidents. Meetings should have a facilitator (keeps to agenda and time), a scribe (captures decisions and owners), and a decider (breaks ties, sets next steps). In outages, we respect the Incident Commander for a reason — role clarity reduces chaos. Bring that discipline to design reviews and postmortems so you don’t re-litigate decisions for weeks. 

Also, instrument your calendar. Once per quarter, export meeting data and compute an approximate cost. You don’t need a fancy calculator; just multiply headcount × duration × frequency to get person-hours. Compare the “spend” with what you shipped or stabilized in that period. If a meeting can’t justify its burn, spin it down or move it async. The point isn’t punishment; it’s stewardship. Our job is to steward error budgets and infra spend; stewarding attention is the same game.
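The headcount × duration × frequency arithmetic fits in a few lines. This sketch assumes you have already exported recurring meetings into a simple list; the RecurringMeeting shape and the rank_by_cost helper are made up for illustration, and the result is person-hours per quarter rather than money.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RecurringMeeting:
    name: str
    attendees: int
    minutes: int
    occurrences_per_quarter: int

    @property
    def person_hours(self) -> float:
        # headcount x duration x frequency, converted from minutes to hours
        return self.attendees * self.minutes * self.occurrences_per_quarter / 60


def rank_by_cost(meetings: List[RecurringMeeting]) -> List[RecurringMeeting]:
    """Most expensive meetings first, so the review starts where the burn is."""
    return sorted(meetings, key=lambda m: m.person_hours, reverse=True)
```

For example, a weekly 60-minute status meeting with 12 attendees over a 13-week quarter comes to 12 × 60 × 13 / 60 = 156 person-hours, which is roughly a month of one engineer’s working time.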

Finally, treat reliability rituals as sacred, but small. Protect the postmortem and the chaos drill, but keep them tight, blameless, and useful. Publish the timeline before the meeting so the session explores causes and experiments rather than live fact-finding. End every postmortem with action items that meet the “observable and testable” bar. End every chaos drill by comparing actual behavior to your steady-state hypothesis and filing follow-ups. When these rituals hum, you’ll notice fewer ad-hoc “what happened?” meetings because the knowledge lives where everyone can find it. 

Opposing Views, Honestly Considered

View A: Meetings are a tax on builders. Engineers don’t need another “alignment” block; they need quiet hours to remove toil, improve test determinism, and burn down reliability debt. Observational data shows late-night meetings rising in global orgs, contributing to burnout. The fix: aggressive async-first, “no-meeting” days, and a hard cap on attendees. 

View B: Meetings prevent expensive misunderstandings. Architecture decisions and incident learning degrade when done purely async. The richest signals — hesitation, conviction, that “wait, what if…” spark — still show up fastest in real time. The fix: fewer but better meetings with strong facilitation, roles, and pre-reads; make the synchronous time precious and decisive. 

Both positions are right in context. The SRE approach is to measure and iterate. If your change failure rate spikes after a calendar purge, you tuned too far. If your on-call health declines because focus time vanished into status theater, you tuned too little.

The Human Part (Because Reliability Is a Team Sport)

Every team has a lore file. It includes the 2 a.m. Slack channel where you discovered a rogue cron job last edited in 2017, the postmortem where someone quietly admitted they didn’t know how the message bus fan-out actually worked, and the design review that saved you a multi-quarter detour because someone asked the “dumb” question. Meetings — the good ones — are where that lore gets harvested and turned into shared understanding. Blamelessness isn’t just a moral stance; it’s an information-optimization strategy. If people fear the room, they’ll avoid the room, and your system will pay the price later. 

A Simple, SRE-Flavored Playbook You Can Try This Week

Start with a pre-read policy: if there’s no doc by T–24 hours, the meeting auto-cancels. In the doc, write the decision to be made, the options considered, and the edge risks. Bring diagrams for systems, not slides for optics. When the meeting happens, keep cameras on if you can, but keep the mic on the data. At the end, write a one-paragraph decision record and paste it into the team’s knowledge base.
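The T–24 rule is simple enough to automate. Here is a minimal sketch of the decision itself, assuming you can look up when the pre-read was shared; wiring it to a real calendar or docs API is left out, and the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

PREREAD_LEAD = timedelta(hours=24)  # the "T-24 hours" rule


def should_auto_cancel(meeting_start: datetime,
                       doc_shared_at: Optional[datetime],
                       now: datetime) -> bool:
    """Cancel only once the deadline has passed and no doc arrived in time."""
    deadline = meeting_start - PREREAD_LEAD
    if now < deadline:
        return False  # the author still has time to share the pre-read
    return doc_shared_at is None or doc_shared_at > deadline
```

Passing now in explicitly, rather than calling datetime.now() inside, keeps the rule easy to test and avoids mixing naive and timezone-aware timestamps by accident.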

Next, look at your reliability rituals. Run a chaos drill on a production-like slice and time how long it takes the team to identify degraded steady state. Compare the result to your hypothesis. If they diverge, don’t blame the humans; update your mental model and your dashboards. The following week, do a ten-minute debrief in your ops review, not a 60-minute symposium.

Finally, create a calendar budget for your team, just like you have an error budget for your service. If the budget is 10 hours of recurring meetings per engineer per week, force trade-offs. Want to add a product-triage session? Something else has to go. Scarcity breeds clarity.
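The trade-off can be made explicit with a tiny budget check. This is a sketch under the 10-hours-per-week assumption from the paragraph above; the function name and the sample meeting names are invented for illustration.

```python
from typing import Mapping


def minutes_to_free(weekly_minutes: Mapping[str, int],
                    new_meeting_minutes: int,
                    budget_hours: float = 10.0) -> int:
    """Minutes of existing recurring meetings that must go before the
    new meeting fits in the budget. Zero means it already fits."""
    total = sum(weekly_minutes.values()) + new_meeting_minutes
    return max(0, total - int(budget_hours * 60))


# Hypothetical calendar for one engineer, in minutes per week.
calendar = {
    "standup": 75,
    "ops review": 20,
    "design review": 60,
    "1:1s": 60,
    "planning": 120,
    "team sync": 240,
}
overage = minutes_to_free(calendar, new_meeting_minutes=45)
```

With this calendar the existing meetings total 575 minutes, so a 45-minute product-triage session puts the engineer 20 minutes over a 600-minute budget. Something has to shrink or go.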

Closing: SREs Aren’t Meeting-Averse; We’re Waste-Averse

If a meeting increases reliability, accelerates flow, and spreads the kind of knowledge that eliminates future pages, it’s worth the calories — with or without pizza. If it’s theater, kill it with kindness (and data). The hallmark of a great SRE culture isn’t the absence of meetings; it’s the presence of useful ones designed like the systems we steward: measured, resilient, and continuously improved. When we apply SRE discipline to the calendar, we don’t just save time; we save ourselves from the slow drift into process debt. And that’s a meeting I’ll always accept.

References

  1. Announcing the 2024 DORA report (Google Cloud Blog): https://cloud.google.com/blog/products/devops-sre/announcing-the-2024-dora-report

  2. 2024 DORA survey now open (key 2023 findings summary, Google Cloud Blog): https://cloud.google.com/blog/products/devops-sre/2024-dora-survey-now-open/

  3. Hybrid Work Has Changed Meetings Forever (Harvard Business Review): https://hbr.org/2024/06/hybrid-work-has-changed-meetings-forever

  4. Breaking down the infinite workday (Microsoft Work Trend Index, 2025): https://www.microsoft.com/en-us/worklab/work-trend-index/breaking-down-infinite-workday

  5. Blameless Postmortem for System Resilience (Google SRE Book): https://sre.google/sre-book/postmortem-culture/

  6. Principles of Chaos Engineering (principlesofchaos.org): https://principlesofchaos.org/

  7. Shopify is cutting all unnecessary meetings… (Fortune): https://fortune.com/2023/01/03/shopify-cutting-meetings-worker-productivity/

#SRE #SiteReliability #DEVOPS #DevOpsCulture #Postmortems #ChaosEngineering #EngineeringLeadership #RemoteWork #HybridWork #Meetings #ReliabilityEngineering