Monday, February 10, 2014
Being the Maintenance Guy
It’s a Friday afternoon and you have a big weekend planned. You have left work half an hour early so that you can get your shopping out of the way. The list is in your hands and the rush hour hasn’t yet started. The supermarket is just down the road. First, however, you need to stop off at the bank to draw some cash. You select the amount you figure you’ll need, but nothing happens. After a moment, the screen displays a warning. There are not enough funds in your account. You feel the panic rise in your chest. No problem, you tell yourself. Just a glitch. You try again. Another pause. The same message.
With your card clutched in your hands, you storm the bank. This has never happened to you before. Somebody is going to have to explain. The teller can see you coming and shrinks in her chair. You slam the card on the counter and demand to know what is wrong with the machine. You need cash. Now.
The teller types anxiously into her computer and looks nervously up at you. Your salary, she says, has not been paid into your account. You tell her to check again. She does. You have ten dollars in your account. Ten dollars and fifty-two cents. You demand that they fix this problem. The money, she says, did not arrive. It is not the bank’s fault. You should take it up with your firm. Sorry. Oh, and have a nice day.
As citizens of a modern technological society, we use computers every day. When you use an ATM, or a cell phone, or make a purchase, or even fill your car with gas, there will have been a computer somewhere in that process, and quite a lot of software. Without programs, a computer is just a fancy piece of metal and plastic. On any given day, you probably use, or cause to be used, dozens of computer programs.
Computer programs do not write themselves. Somebody, somewhere, sat behind a terminal and typed on a keyboard so that you could draw money from your ATM, or fill your car with gas, or pay at a cash register, or use your telephone. The program that person wrote is designed to perform one task in a long chain of tasks. Perhaps more than one person was involved. Chances are, many people were involved, including testers, designers, and quality controllers. Computer systems go through a manufacturing process just like any other usable object. In fact, you will often hear computer programmers refer to their work in ways normally associated with the manufacturing industry.
And, just as with any manufactured goods, things can go wrong.
Take your Friday afternoon ATM fiasco. Your salary did not make it to the bank, which means that something went wrong in the process which transfers funds from your employer’s account to your account. For years, you have gone to the bank to find that your balance was happy and healthy. You never questioned how the funds got there. They just did. Until today, that is.
For the past twenty-five years, I have worked with computers pretty much exclusively in a maintenance capacity of some sort or other. Every morning I check the batches from the night before to see that everything has run smoothly. If anything has gone wrong, I dive into the job’s log to see what happened. Sometimes it is as easy as restarting the job. Other times it is a data issue. Every now and then, the program contains a bug that needs fixing and I put my programming hat on. Once in a while, a whole new program is required because something has corrupted the database and I need a program to fix it.
And then there is the online activity. For an ATM to work on demand, programs need to be running, and programs that allow those programs to run, and systems that work in the background at machine level that hold everything together. When things go wrong with an online system at our company, my job is to see what went wrong and then report it to the correct team. Sometimes, the fault is caused by a program under my control and I have to do some checks. If the program is at fault, it’s back to programming.
For the past twenty-five years, I would say that I have been busy. Very busy. The problem is, most people don’t know you’re there until something goes wrong. Managers only seem to notice the teams that create stuff. They’re not interested in stuff that’s been running for a while. Promotions and accolades come from building pretty new systems, not from keeping the existing systems tuned and lubricated.
Many times, the maintenance team shares a room with the development team and it is in this situation that the attitude of the managers becomes obvious. Worse still, it rubs off on the programmers. I spent a couple of years working in the same room as a team of developers. We, the maintenance people, would quietly do our thing, checking, fixing, tweaking. Meanwhile, just a few yards away, the developers would build new systems. In those years, the maintenance team was never once treated to a drink to celebrate our 100th fix, or a lunch to mark a trouble-free year. We never had a senior manager stop by to thank us for our hard work. For the developers, however, this was a common occurrence. Every milestone was a cause for celebration. And the end of a project was always marked with a big bash.
I’m not complaining. I’m happy to get on with my work. I get a feeling of accomplishment from seeing a system run smoothly, or from averting an impending disaster with a few lines of code, even if the managers don’t notice or care. What does get a little bit annoying is if this attitude comes from the developers themselves, because they should know better.
Like the time a programmer commented on how I was always so busy and another snarkily commented: “If Paul stopped working the whole company would crash”, much to the amusement of his buddy. I ignored them because a) I was too busy and b) someone that ignorant is not worth wasting breath on. Later, however, I thought about that sarcastic throw-away and realised that there was some truth in what they had said. At that time I was busy working on a fix to a batch job from the previous night. I was the only one who knew how to fix it. If I hadn’t fixed it, the batch would not have run the following night, which would have caused problems for other batch jobs. In fact, had I downed tools and stopped working altogether, the whole system would have been affected and, over the following few days, experienced catastrophic failure.
A broken cog may not be too serious in the short term, but if it is left unfixed, the problem soon escalates and brings the whole system to a grinding halt. So, yes, those two clowns were actually spot on. If I had stopped working then the company might eventually have come crashing down. Customers who normally experience uninterrupted service would no longer have been able to do business with our company. Of course, nobody else could see this, but such is the life of a maintenance worker. Nobody knows or cares until something goes wrong.
So the next time you draw cash from an ATM, or make a telephone call, or buy something from a shop, spare a thought for the poor unappreciated souls who work very hard to ensure everything is running smoothly and often give up their weekend so that you can enjoy yours.