When I teach troubleshooting to IT professionals, I always use this Sherlock Holmes quote to begin the session:
“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”
And no, we are not talking about Data from Star Trek: The Next Generation here. Although I’m sure he could figure out any computer problem. And while we are on the subject…the reboot of the movie series is awesome! Nice to see we are getting away from the whole, “Captain Kirk, there be whales here!” franchise. No disrespect, but did anyone else have a problem with that? Khan was cool, the rest I could do without.
Anyway, stay with me. Don’t go downloading the movies now and taking up company bandwidth (a subject we will cover in a future instalment). Let’s stay on track.
You need data to figure out what is wrong, or to prevent something from going wrong in the first place. Information is your best friend, and you need to know where to get it. Collecting data and troubleshooting are the lifeblood of the System Admin. They are two sides of the same coin.
So what information sources do we have as a System Admin? When we encounter an issue, where do we begin our investigation?
Number one on my list would be monitoring services, along with having instant access to control them.
Let’s get back to basics here. What are Windows services?
A service is like an application that runs in the background. This means there is no window for the end user to interact with. If there were, the screen would look like that old Windows 3.1 screen saver… the one with all the different colored windows flying at you in random patterns.
Anyone? Anyone? Bueller? Sigh…
So, why is it important for you to monitor these services?
Well, for one, security software such as antivirus and firewalls runs as a service. A good sign that a system has been infected or compromised is finding one of these shut off or disabled. Or perhaps you have mission-critical software running as a service, in which case keeping a finger on the pulse of that service would be very useful.
Some services are not needed, depending on the role the machine plays. Since every running service consumes resources (see process monitoring), some can be disabled to speed up your system. Knowing your services and their functions can make your system both more efficient and more secure.
So, monitoring services and having access to them is a very good thing!
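To make that concrete, here is a minimal sketch in Python of what "keeping a pulse" on a few services could look like. It assumes a Windows machine where the built-in `sc query` command prints an English `STATE : 4  RUNNING` style line. The function names (`parse_service_state`, `query_service_state`, `find_stopped`) are my own for illustration, not part of any tool.

```python
import re
import subprocess

# Matches the STATE line of `sc query` output, e.g. "STATE : 4  RUNNING".
# Assumption: English-language output; localized systems may differ.
STATE_LINE = re.compile(r"^\s*STATE\s*:\s*\d+\s+(\w+)", re.MULTILINE)


def parse_service_state(sc_output):
    """Return the state name (e.g. 'RUNNING') from `sc query` text, or None."""
    match = STATE_LINE.search(sc_output)
    return match.group(1) if match else None


def query_service_state(service_name):
    """Run `sc query <service_name>` (Windows only) and return its state."""
    result = subprocess.run(
        ["sc", "query", service_name],
        capture_output=True,
        text=True,
        check=False,  # a missing service exits non-zero; we treat it as "no state"
    )
    return parse_service_state(result.stdout)


def find_stopped(service_names, get_state=query_service_state):
    """Return the watched services that are not currently RUNNING."""
    return [name for name in service_names if get_state(name) != "RUNNING"]
```

On a Windows box you might call `find_stopped(["WinDefend", "MpsSvc"])` (the Defender antivirus and firewall services, used here only as examples) from a scheduled task and raise an alert if the list comes back non-empty. The `get_state` parameter is there so the logic can be tried out without touching a real machine.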
Another great information source would be event logs…
When I used to work in application support, I was amazed at the number of times I would request event logs, and then have to explain to the system admin what I was talking about or how to gather them. I feel like reviewing these logs has become a lost art.
These logs contain a wealth of information for the System Admin to both diagnose problems and head off system failures or security breaches before they happen.
There are several different kinds of logs, but if you are not familiar with reading them, the top three to get started with would be Application Logs, System Logs and Security Logs.
Investigate these logs on a regular basis. Read them like you would the Sunday paper (do people still read the Sunday paper?). Check them especially when you have a recurring problem.
You will find them to be a rich source of error numbers and messages that can be Googled, often leading to a quick resolution of the problem. You fix the problem, then you are the hero. Isn’t that our goal?
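If you would rather not click through Event Viewer every morning, the same check can be scripted. The sketch below, in Python, builds a call to the built-in `wevtutil` command that lists the newest Critical and Error events from a log. The helper names (`build_query_command`, `recent_problems`) and the defaults are assumptions of mine, so treat this as a starting point rather than a finished tool.

```python
import subprocess

# Numeric event levels as stored in the Windows event log.
LEVELS = {"critical": 1, "error": 2, "warning": 3}


def build_query_command(log_name, levels=("critical", "error"), max_events=10):
    """Build a wevtutil command listing the newest events at the given levels."""
    level_filter = " or ".join(f"Level={LEVELS[name]}" for name in levels)
    return [
        "wevtutil", "qe", log_name,
        f"/q:*[System[({level_filter})]]",  # XPath filter on the event level
        f"/c:{max_events}",                 # cap the number of events returned
        "/rd:true",                         # reverse direction: newest first
        "/f:text",                          # readable text instead of XML
    ]


def recent_problems(log_name):
    """Run the query (Windows only) and return wevtutil's text output."""
    result = subprocess.run(
        build_query_command(log_name),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling `recent_problems("System")` or `recent_problems("Application")` should hand back the latest errors, event IDs included, ready to be searched. Reading the Security log generally requires an elevated (administrator) prompt.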
Monitoring your system services and event logs can seem a daunting task at first. However, the more you learn how to use these valuable resources, the sooner you will come to a place where you will wonder how you ever did your job without them.
Next up, I’m going to talk about Patch Management.