Software is Eating the Ops World

One thing I’ve thought a lot about is how the role of the system administrator is changing. This reflection was prompted by a couple of things. One, I’m a co-chair for talks at one of the longest-running system administration conferences, so I should probably think about this kind of thing seriously when planning what talks we’ll accept. The other thing, though, is that I’ve read what some peers have had to say about the tone of the Google Site Reliability Engineering (SRE) book. My own interpretation is that the book thinks of traditional system administrators as “button pushers” who solely operate something that someone else gave them, similar to what you see in many large-organization IT departments. There’s a heavy emphasis on Engineering™, which isn’t present in those departments. I haven’t really dug into the book – so I’m going to leave those thoughts here and circle back in a few.

The idea that administrators don’t program reminds me of my first interview for a system administration internship back in May 2007. This is funny to me because one thing I remember clear as day was stating that I was not a programmer. Code was not something that I did. I just made computers work well and do useful things for people. Other people wrote code, I just lived in their world.

Yet, in the last week I found and patched a bug¹ in Debian apt (a minor one at that, but nonetheless something installed on many Linux machines across the world) and added functionality to the software we’re using to review submissions for LISA ’16². This does not correlate with the 2007 me, who decided to be so resolute in what he could do and could (would?) not do. What gives?

I was one of those stereotypical kids who was good at computers. In short, I ended up making my town’s website at a young age, could fix computers and their software issues, and played with Linux early on. Life was good. What was more difficult to wrap my head around was learning how to actually program. I was given a copy of Visual Basic growing up; I wasn’t quite sure what to do with it. Our basic CS classes in high school had us reading in values and printing out algorithmically generated Christmas trees. I was not exceptional at math, a skill that people often referred to as extremely important to being able to write programs.

Here’s the thing, though: I was able to run my high school’s email server, and make sure backups worked for a small elementary school in the school district, and I was later asked to help my town buy and maintain computers at town hall and basically be the IT manager there. In considering my future, system administration didn’t seem like it got the same respect that developers did, but I felt decent at it. I felt decent at making systems reliable and avoiding data loss and working around the issues that developers seemed to throw at their users (both end users and administrators). Additionally, people seemed to find value in these skills, and even paid for this. I was a lucky high schooler, and undergrad student, no doubt. Speaking of undergrad, I sucked it up and did what I could to pass C++ and Java (the only coding requirements for my major) but wasn’t particularly stellar at either.

I remember being interested in automation – it was important to figure out Windows Deployment Services because there was no way I was going to install Windows by hand on 20+ machines. Reducing patterns into shell scripts was a lot more palatable than going around doing things by hand. During my internship, I wrote a provisioning system in Perl because one of my first assignments was to get Red Hat Enterprise Linux 4 onto 14 machines. There was parsing to be done to move the infrastructure from NIS+ to LDAP. User accounts were previously done by email – we moved to a web-based system after I made one.

The entire time I was working on these things it never felt like programming-with-a-capital-P. I was writing in Perl, PHP, BASH and TCSH – not the compiled languages I had struggled with earlier. But I wasn’t a developer, I wasn’t mathing (is that a word?) and so it never felt real. No complicated algorithmic sorts here - just some SQL and outputting HTML and moving text around when I had to.

I think that I would have been a lot better off if I started with a different mindset: if you made a computer do something, congrats! you did a code. I’m not going to say this is the same as building a complex distributed system for production use, for sure (maybe someday!). I do think I would have been more comfortable with such a thing earlier (rather than now) if I had a few wins early on. Alas.

Today I mostly write in Ruby, and have reached the point where I can jump to another language and flail around enough to get something done. I “get” types now, thanks to toying around with writing something in Go. A few weeks ago I wrote an internal signup form in node.js. I still experience a certain amount of angst when I start writing something, but once I get going it’s okay.

I wrote this because I don’t think I’m the only person in the world who has learned how to code as a system administrator or operations engineer or whatever it’s called today. As software continues to eat the world, I don’t think the operations side of the house is going to get a free pass on continuing to maintain the status quo just because these issues are hard to write logic around. Indeed, some places are figuring out how to take the human out of the loop as we speak. I see this change in the interview stories that people share in various ops-heavy communities (hangops, etc.), and in the kind of work that people at the middle or top of the field (that I look up to) are doing.

Which brings me back to the SRE book. I can understand the frustration with the way in which they write of (write off?) system administration, especially for those who have cut their teeth and evolved over time. But even 10 years ago, I didn’t have the impression that system administration involved actual engineering so much as just MacGyver-ing around the decisions of folks who may or may not have run a system in production. Perhaps I was too naive. As the pendulum swings towards larger organizations being more comfortable running their own systems, more organizations are going to look at processes and techniques that have worked for others and try to apply the ones that make sense to their organization. Obviously, not everyone is a large organization that makes everything from scratch, but there are broader lessons to be learned here.

Administrators (myself included!) have complained about the disconnect between people who write software and us for decades. Now we finally have a chance to sit at the table in some organizations. My own hangups about code ability aside: this is actually exciting! We get to care about the things that are important to us! This shift carries with it some baggage³, but I suppose sitting at the table ain’t free, either.

I’m looking forward to a future of working with (and contributing to) systems that care about ops the way I always have, regardless of what it’s called. That’s pretty cool in my book.

Thanks to Julie Hansbrough, Jarrod Pooler, Karen Rivera, and Carolyn Rowland for their thoughts and corrections.

¹ Shameless feel goodery.
² Jason Dixon’s work on Judy has been a lifesaver as a conference organizer. Thanks Jason!
³ tef’s writing on whiteboard interviews comes to mind when thinking of baggage.