darrylcauldwell.com On a journey around the datacenter and public cloud.

Is the future of sysadmin to be an SRE?

I recently came across the role of SRE. I didn’t know what it was, so I researched it a little and found that it refers to Site Reliability Engineers. After reading around the subject, here is my considered opinion.

The classic sysadmin role could be defined in generic terms as “IT operations staff responsible for designing, building, and maintaining an organization’s computer infrastructure”. The world of IT is continually growing and changing. We’re presently going through a technological shift towards virtualization and containerization of server services, where a sysadmin only needs to manage the hosting platform, can manage by policy applied around each server instance, and uses light-touch operational administration of each instance. Businesses are changing too, attempting to embrace lean methodologies to gain the efficiencies they promise. This started in software engineering with Agile processes and is now moving to encompass operational management by breaking down the silos between development and operations. A healthy DevOps culture shows in working relationships that allow each classical team to see how its work influences and affects the other and that, by combining knowledge and effort, produce a more robust, reliable, agile product.

But what of the next stages, when server administration is so light touch and infrastructure is delivered by coded workflow? Well, then you only need to hire people who write code. It is within businesses that have evolved to this point that the term Site Reliability Engineer (SRE) comes in. These are engineers who know enough about programming languages, data structures and algorithms, and performance to review the workings of an application and to properly instrument, measure and alert on its running. Alongside these application skills they have knowledge of operational management to ensure the software keeps these capabilities through its operational life, which might include resilience to failures at the component, server and site (cloud provider) level, scalability to accommodate varying workload levels, and security patch management.

Over the years I’ve spoken to a lot of system administrators who came into their roles through evolution, maybe through help desk, various layers of support, or even just running computer systems at home and transitioning those skills to servers at work. It is pretty clear in my mind that the same evolutionary path won’t work for the transition into SRE. The move towards the SRE role requires software engineering skills to understand the application itself, and these skills are classical and learned in a structured way. I have learned a lot about programmatic structure through working with PowerShell, however the disciplines of a computer scientist or software engineer are still quite distant. Programming at any level is nevertheless a good starting point to work from, and the more you look at programming and languages, the more you understand of a developer’s viewpoint.

Today many businesses are on a journey of evolution, and right now only a handful are at the point in the journey where SREs are needed. However, every infrastructure would benefit right now from its systems administrators having better programming skills, so I will be looking to further advance and formalize my programming skills. Bearing the future in mind, I believe all sysadmins should do the same!

Interesting article about the job from a Google Site Reliability Engineer
