SRE Journey: From Fear to Ownership

Published: April 26, 2026

Over the past four years as a Site Reliability Engineer at H&R Block, I’ve had the opportunity to grow through some of the most demanding and mission-critical periods of the year: Tax Seasons. Each year brought new challenges, responsibilities, and lessons that shaped my journey from a beginner to a confident SRE owning production systems.

Year 1: Stepping Into the Unknown

When I first joined, I had very little understanding of incident management. Terms like incidents, on-call rotations, and root cause analysis (RCA) were completely new to me. The production environment felt intimidating—large-scale, complex, and critical. I was hesitant to make any changes, worried about the impact.

Year 2: Learning the Foundations

By my second Tax Season, I started to understand how incident management worked. I joined incident calls, observed discussions, and learned troubleshooting approaches from experienced engineers. While I became more comfortable with the concepts, I still lacked the confidence to actively lead or make decisions in production.

After that season, I was added to the on-call rotation. This marked a turning point: I began participating in production deployments, handling real-time issues, and gaining hands-on experience.

Year 3: Taking Responsibility

During my third Tax Season, I stepped into a bigger role as the primary on-call engineer for a week during one of the most critical periods. This was my first real test of ownership—handling incidents, making decisions under pressure, and ensuring system stability.

Year 4: Driving Reliability

In my fourth Tax Season, I once again took on the role of primary on-call during peak critical days. This time, the experience was different—not because it was easier, but because I was prepared.

The most rewarding outcome? We had no major incidents during the season.

This achievement was the result of consistent effort over time: implementing a multi-region architecture, improving system resilience, and completing thorough seasonal readiness activities. I’m proud to have contributed to these initiatives.

Looking Back

From being someone who was hesitant to even touch production systems to becoming the primary SRE for multiple mission-critical Tier 1 applications—including several AKS clusters—this journey has been incredibly fulfilling.

And this is just the beginning.