Human Side of System Administration

Mark Verber

Draft 0.1 / October 5, 1999

The following are notes I have made to myself over the years, which I am starting to organize and write out. A lot of work is needed before this page will be ready for prime time.

System administrators are human. Users are human. The system designers and engineers are human. System administrators are often the human interface between the people who engineer systems and the people who use those systems. Sometimes system administrators are the interface between users and the systems themselves.  As a result, a core part of a system administrator's job involves working effectively with people.  If you don't know how to work with people, respect other people's dignity, collaborate with others, build consensus, etc., you will be an ineffective system administrator.  Something else to keep in mind: you will never be done. It's a process... but seeing people develop is the most important thing.

Understanding Basic Human Nature

Understanding basic human nature is quite helpful when dealing with other people. The following principles seem to be true for most people, no matter how well educated or experienced. It should be noted that these are rules of thumb, not absolutes, so expect to find exceptions to each of them.  Some similar observations were made in the article "Four rules to understand what makes people tick".

People are Superstitious (and blame shift)

Once people have learned something, they rarely revisit what they learned to question its validity.  As a result, people's behavior can be driven by constraints which are no longer valid.  This often results in what I call "voodoo system administration".  Voodoo system administration is when you follow a procedure to accomplish a result without understanding why you are doing each step, or what each action objectively accomplishes.  It is very common for inexperienced system administrators to do this.  When they start working they are shown ways to accomplish various tasks.  They don't understand all the context, so they carefully learn the process.  Even after they could understand the process, they never go back and think through what they are doing.  Over time, constraints change, but the old process continues to be used.  For example, many people have gotten used to working around limitations in tools.  In effect, they don't trust a tool to function correctly, so they create ways to cope with the tool's limitations.  Once the tool is fixed, the work-around continues to be used, even though the fixed tool is a significantly better solution.  <sync;sync,halt -vs- shutdown> <boiling water for every recipe> <stuck ports> In essence, people are behaving in a superstitious manner.

When overwhelmed by complexity, people look for something greater than themselves for help <no atheists in foxholes>, or resort to "magic" to explain things.

People will blame whatever they understand the least.  A few years ago I was working with a number of really good engineers on a project which initiated HTTP connections to fetch data.  There were two major components used to accomplish this task.  First, there was a freely available library (V0.9) which implemented an HTTP protocol stack.  The engineers had previously used this library, and had invested significant effort to remove bugs and improve the overall quality of the code.  Second, there was a commercial HTTP proxy produced by a vendor with a good reputation for effective testing and building solid products... but which the engineers had no previous experience using.  In a three-month period, numerous problems were detected.  Each time the proxy was suspected, and each time we discovered that the library was at fault.  Why was the HTTP proxy suspected?  Because people didn't have significant experience with it.  It didn't matter that the proxy was significantly more widely used than our library... the people involved hadn't used the proxy, so they mistrusted it.  In later years, bugs were discovered in the proxy, but a significantly larger number have been discovered in the library.

People like to blame problems on "outside" forces. Few people are prepared to admit they made a mistake. Most people assume that if they encounter a problem, it is someone else's fault. Users will assume their program stopped running because someone changed the system.  Someone who says "I didn't change anything" isn't always lying. Sometimes they're just ignorant or forgetful.  <raising machines breaking floating point>. Software vendors will blame the hardware manufacturers. Third party peripheral vendors will be blamed by your primary vendor.  Typically people will blame whatever has caused them the most problems in the past, whether that makes sense or not.  These days people are often likely to blame "the network", a particular server, or a tool for all problems, whether or not it is even possible for this object of hatred to be responsible.

People Look Out For Themselves (and sometimes others)

People are basically selfish.  They have their altruistic moments, but you can't count on this from everyone, nor should you rely on it over an extended period of time. The most consistent behavior you should expect is that people will look out for themselves.  If you make plans which require people to make sacrifices or work harder without winning something of equal value (to them) in return, your plan is doomed.  To use an old saying, "You get what you reward."  This means you have to be very careful what behaviors you measure and reward.  <bug removal -vs- solid code>. Feedback loops and applying systems thinking to the workplace.

People ask for what they care about: merchants want your credit card; they don't care about a certificate. <SET -vs- credit card #s over SSL>

The user who says "Can X be done?" is usually really asking "Would someone please do X?". Make sure you answer both questions.

Path of Least Resistance - They will go around you if you are in the way.  Corollary: if you provide an easier road... they will do what you ask them to do.

Don't rely on a captive audience - they will find a way to take their money elsewhere.

Change is Hard

No one can keep up with infinite change.  The higher the rate of change, the more people get overloaded.  People's ability to manage change is variable.

Blaauw's Law: Established technology tends to persist in the face of new technology.  If you want to make something go away, you have to help it along.  No program or hardware is so obsolete that nobody wants it anymore.  Sometimes it is just easier to turn it off for an extended period of time and see who screams.

Eliot's Observation: Nothing is so good as it seems beforehand.

Managing Your Customers

Being a system administrator would be much easier without the users. It would also be pointless. Network administration is the closest thing to user-less system administration.

The Dynamics of Trust

You will be graded on whatever expectations have been set.  Once you set expectations it is very hard to change them.  Failing to set expectations is extremely risky because some people will assume you are going to deliver more than you intended (or more than is possible).  <no SLA -vs- 98%>
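To make a stated expectation concrete, it can help to translate a percentage into time. The following is a minimal sketch, assuming the 98% in the note above refers to an availability target; the function name and the other targets shown are illustrative, not taken from the original notes.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime permitted in a period at a given availability target."""
    return period_hours * (1 - availability_percent / 100.0)


if __name__ == "__main__":
    # Compare a few hypothetical targets side by side.
    for target in (98.0, 99.0, 99.9):
        hours = allowed_downtime_hours(target)
        print(f"{target}% allows {hours:.1f} hours "
              f"({hours / 24:.1f} days) of downtime per year")
```

At 98% this works out to roughly 175 hours, or a little over a week, per year. Stating the number in days rather than as a percentage makes it harder for anyone to assume more than was promised.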

Direct communication

Easily destroyed

Share the Work

Empower people - don't get in the way. Figure out your value add and get out of the way for the rest of it. <Example: control the namespace for /depot, but let users populate it.>

Helping people serve themselves

Users can be your friends... they can be a big help. Learn who the experts are and rely on them.

"He/she who is annoyed is anointed to fix it." -Jeff Allen

Training Folks

Secretaries are easier to teach than engineers - they understand that shit sometimes happens. Secretaries are used to phone systems and copiers.

Always explain things to users as you fix them. Sometimes they learn.

No amount of intelligence and no amount of education can ever substitute for common sense. <terminal line> Corollary: Brilliant PhD engineers do not usually make good system administrators.

Different people want to know what is going on at different levels of detail.

Good judgment comes from experience... and experience comes from bad judgment.  People learn more from failure than from success.

PR

Honesty is the best policy.

Solving the problems is what makes the difference.

People don't read messages: 10% miss messages. Use high priority too much and it gets ignored.

Bucy's Law: Nothing is ever accomplished by a reasonable man. (Zwicky's corollary: If nobody calls you a fascist, you're doing something wrong.)

Zimmerman's Law of complaints: Nobody notices when things go right.

Your brilliance depends on success and has no relationship to difficulty of the problem.

It's easy to cook the books (lying with statistics) - make sure you know what your statistics mean. <Example of user satisfaction survey.>

Potter's Law: The amount of flak received on any subject is inversely proportional to the subject's true value.

Pareto's Law: 80/20.

The power of tee-shirts or other trinkets

Always say you're sorry. There's no need to lie: "I'm sorry you're so upset about this" is usually both true and acceptable.

Policy, Process, and Procedure

Power of abstraction / clean interfaces

Power of policy

Power of process

The power and limitations of specifications

Common Vocabulary makes a big difference

Not just train - get it into the hiring process

Logs are good. Record what you are told to do. When asked "Why did you?", say "You told me to. Here it is in this log."
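One way to keep such a record is an append-only text file with one entry per request. The sketch below is only an illustration under stated assumptions: the file layout (one JSON object per line) and the names record_request and find_requests are hypothetical, not a tool referred to in these notes.

```python
import json
import time


def record_request(log_path, requester, instruction, now=None):
    """Append one entry recording who asked for what, and when."""
    entry = {
        # time.localtime(None) uses the current time.
        "time": time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now)),
        "requester": requester,
        "instruction": instruction,
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def find_requests(log_path, keyword):
    """Return logged entries whose instruction mentions the keyword."""
    with open(log_path, encoding="utf-8") as f:
        entries = [json.loads(line) for line in f if line.strip()]
    return [e for e in entries if keyword.lower() in e["instruction"].lower()]
```

When someone later asks "Why did you?", a call such as find_requests("requests.log", "quota") returns the matching entries, including who asked and when. A paper notebook serves the same purpose; what matters is that the record is written at the time of the request.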

Document, Document, Document