Building the network and installing applications is one thing. Managing the resulting mess is totally different. Almost everything about agency networks is growing: the number of users, the volume of traffic, the number of applications, the required bandwidth. What's not growing is user, constituent and executive patience.
In large part, flagging patience can be attributed to the changing role of networks and computers over the past several years. They have evolved from an interesting adjunct that may help someone do their job better to a utility central to getting the job done. This fundamental change helps explain why downtime might give rise to more expressive and sometimes colorful emotional responses.
Unfortunately, the subject of managing computer resources is so vast that no single article can hope to do more than touch the surface. Nonetheless, there are general concepts, trends and definitions that might help when starting to confront this vital issue.
Traditionally, computer-resource management has been divided into several categories applicable to the management of a network, an individual computer or a running application. The more common of these subdivisions, as enumerated in The Essential Client/Server Survival Guide by Robert Orfali, Dan Harkey and Jeri Edwards, include:
1) Performance monitors that provide text-based and graphical representations of how the hardware, network or application is performing. They generally answer "how much" or "how many" questions. Managers can use them to drill down to a particular piece of the computing environment to get details on how things are going. Perhaps more importantly, these tools usually include the ability to set thresholds for the values they track. The performance monitor silently watches the value unless it goes above or below the threshold; then it takes some action, most often to notify an administrator of the situation. In some cases, the performance monitor can be instructed to carry out an action designed to rectify the problem or situation.
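The silent-until-breached behavior described above can be sketched in a few lines. This is an illustrative example only; the metric source and the alert action are hypothetical stand-ins for whatever a real performance monitor would watch and do.

```python
# Minimal sketch of a threshold-based performance monitor.
# The alert callback is a hypothetical stand-in for paging or
# e-mailing an administrator.

def check_threshold(value, low, high, alert):
    """Stay silent while the value is within range; invoke the
    alert callback only when a threshold is crossed."""
    if value < low:
        alert(f"value {value} fell below threshold {low}")
        return True
    if value > high:
        alert(f"value {value} exceeded threshold {high}")
        return True
    return False

# Example: watch a metric, recording alerts only on a breach.
alerts = []
check_threshold(50, 10, 90, alerts.append)   # in range: stays silent
check_threshold(97, 10, 90, alerts.append)   # breach: alert recorded
```

A real monitor would sample the metric on a schedule and could also trigger a corrective action instead of (or before) notifying anyone.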
2) Inventory or asset-management tools keep track of what's available. They answer "what" questions -- what computers, what software, what network devices do we have?
3) Configuration management deals with the way a particular piece of software or hardware is set up. Such tools answer "how" questions. This information can be used to help understand and optimize system performance. For example, tools to assist "database tuning" may fall into this category, as would tools that help track and adjust the configuration of network devices.
4) Security-management tools can apply to hardware and software. They generally deal with "who" questions -- who has access to what and, with access, what are they allowed to do? In the heterogeneous environments common today, such tools can be vital. Without them, system administrators have to manually grant or revoke rights for each computer or network that an individual needs to access. Well-designed security tools help give administrators a single view of mixed-vendor environments. Security management is an issue whether the administrator is managing network devices, individual computers or even software.
Several trends have emerged in managing computer and software security. Some tools provide a single interface for granting user access to a variety of systems. One interface may be used to create a user account on both UNIX and Windows NT systems -- the administrator just enters the required data, and the tool figures out how to get the job done on each target system. Another approach is to use a universal directory service that defines access rights regardless of the target operating system (see "Directory Service Slugfest," Government Technology, January).
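The single-interface idea amounts to one provisioning call that fans out to per-platform adapters. The sketch below illustrates the pattern only; the UnixTarget and NTTarget classes are invented stand-ins, not the API of any real tool.

```python
# Hedged sketch of single-interface account provisioning.
# Each adapter hides the platform-specific mechanics.

class UnixTarget:
    """Adapter for a UNIX host; a real tool would run useradd or similar."""
    def __init__(self):
        self.accounts = []
    def create_account(self, username):
        self.accounts.append(username)

class NTTarget:
    """Adapter for a Windows NT host; a real tool would call the NT APIs."""
    def __init__(self):
        self.accounts = []
    def create_account(self, username):
        self.accounts.append(username)

def provision_user(username, targets):
    """One administrator action, applied to every target system."""
    for target in targets:
        target.create_account(username)
```

The payoff is that adding support for a new platform means writing one more adapter, not retraining administrators.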
5) Software-distribution tools automate or assist in distributing and configuring software. They contain elements of inventory, configuration and security management. They answer questions like "who got what, and when," and provide an interface to help administrators distribute software without having to walk from machine to machine, CD in hand.
In the salad days of mainframes, software upgrades happened once -- on the mainframe. Today, when an application needs upgrading or an operating system needs a patch, the task is more challenging. Software distribution applications help automate the process of managing a software base that may be distributed across thousands of desktops.
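The "who got what, and when" bookkeeping at the heart of software distribution can be sketched as a simple timestamped log. The DistributionLog class below is a hypothetical illustration, not any product's API.

```python
# Illustrative record-keeping for software distribution:
# answering "who got what, and when."

from datetime import datetime, timezone

class DistributionLog:
    def __init__(self):
        self.records = []

    def record(self, machine, package, version):
        """Note that a package version reached a machine, with a timestamp."""
        self.records.append(
            (machine, package, version, datetime.now(timezone.utc)))

    def who_got(self, package):
        """Answer the 'who got what' question for one package."""
        return [machine for machine, pkg, _ver, _when in self.records
                if pkg == package]
```

A real distribution tool would populate such a log automatically as it pushed packages out, and cross-reference it with the inventory database.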
6) Fault-management software helps administrators answer "what happened" questions -- or, in more modern versions, "what's about to happen?" When a computer goes down, a critical piece of software stops working or the network slows to a crawl, administrators are called on to isolate and repair the problem -- at any time. Fault-management software provides friendly interfaces to help administrators do their jobs and get things running again. It also includes notification mechanisms -- e-mail, pagers or on-screen notifications -- to alert administrators to the problem.
When integrated with the information supplied by performance-management software, fault-management software is now being called on to predict, and sometimes fix, problems before they affect users. In those cases, it may notify administrators of the perceived problem, the corrective action it took and the results.
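The multi-channel notification described above is essentially a fan-out: the same alert goes to every administrator over every configured channel. In this sketch the channel functions are invented stand-ins for real e-mail or pager gateways.

```python
# Hedged sketch of fault-notification fan-out. The two channel
# functions below are hypothetical stand-ins for real gateways.

def email_channel(admin, message):
    return f"EMAIL to {admin}: {message}"

def pager_channel(admin, message):
    return f"PAGE to {admin}: {message}"

def notify(administrators, message, channels):
    """Send the same fault alert to every administrator over
    every configured channel; return the dispatched messages."""
    return [channel(admin, message)
            for admin in administrators
            for channel in channels]
```

Keeping channels as plug-in functions mirrors how management frameworks let sites add their own notification mechanisms.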
It should come as no surprise in this era of consolidation that computer-resource-management technologies have been converging.
Traditionally, network management consisted of tools and techniques for managing only the network. A March 1998 Business Research Group (BRG) report defines network management as "a general term that embraces all the functions and processes involved in managing a network."
The same report defines system management as "the management of separate individual computers, including data management, data security, installation and configuration of software and hardware, fault management and performance tuning."
Because of the explosion in computer use and the complexity of networks and systems, the distinction between network and systems management is fading. BRG calls this new paradigm "dynamic management" and defines it as "management applications that identify user behavior; actively manage the relationship between the user, the network, and applications; and translate the information to a business perspective."
The convergence of management technologies has emphasized what was already becoming clear: No single vendor could possibly provide all the tools necessary to manage large, heterogeneous networks. Instead, what has emerged are several management frameworks that provide a common structure that can accommodate tools from a variety of vendors. Hewlett-Packard's OpenView and IBM's TME 10 are the best known and most widely deployed management frameworks.
Both provide a variety of built-in services and operate on a "manager-agent" basis. The manager is the central program that can graphically show administrators an overview of the network. It is often referred to as the management console.
Management consoles can also display information about individual resources, such as a workstation, a router or other hardware or software entity. The individual resources are called, appropriately, managed nodes. The management console can collect information from the managed nodes in many ways, depending on the kind of node. However, the IBM and HP products provide software that can be installed onto a variety of managed nodes. This software is known as an "agent." Agents can be configured to watch and report on a plethora of information. They can also be instructed to take action based on the information they collect.
For example, an agent installed on a Windows NT server can be configured to monitor disk usage. When a configured threshold is exceeded, the agent may delete all temporary files. If this doesn't handle the problem, the agent may alert the management station, which may place a message on an administrator's workstation or send an alphanumeric page to the administrator, notifying her of the problem.
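The agent behavior just described -- try a local fix first, escalate to the management station only if the problem persists -- can be sketched as follows. The clean_temp_files and alert_console callbacks are hypothetical stand-ins for what a real agent would do.

```python
# Sketch of an agent's disk-usage handling: local remediation
# first, escalation to the management console if that fails.
# Both callbacks are hypothetical stand-ins.

def handle_disk_usage(usage_pct, threshold, clean_temp_files, alert_console):
    if usage_pct <= threshold:
        return "ok"
    usage_pct = clean_temp_files()   # local fix: purge temporary files
    if usage_pct <= threshold:
        return "remediated"
    alert_console(f"disk usage {usage_pct}% still above {threshold}%")
    return "escalated"
```

The design point is that the agent resolves what it can on its own, so the management console (and the administrator behind it) only hears about problems that actually need a human.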
Both HP and IBM offer system-management software that works in their management framework, but these frameworks' real power lies in their ability to accommodate third-party software. This means that tool vendors can write software that users can "plug in" to the framework, thereby customizing the management environment. Because the OpenView and TME 10 frameworks both supply a unified view of management data, common user interfaces, standardized ways for administering agents and the ability to define customized responses to specific conditions, administrators are spared the problem of having to integrate a bunch of software from different vendors.
Timing is Everything
The best time to start working on network and system management is early in the design phase. Network and system management can then be included in all discussions about architecture, operating systems, network protocols, security, applications, etc.
However, most of us aren't fortunate enough to have the opportunity to build a network from the ground up. And, unfortunately, network and system management problems will not just go away. The larger a network and the more important it is to the agency's core services, the more imperative the need to begin to tackle the management problem.
When trying to bring an existing network under control, the key decision is the framework. That choice must take into account existing hardware and software, as well as projections for where the network will be in one, five and 10 years. Despite recent strides in technology, network and system management is not for those seeking a tame existence -- it takes work and commitment. But, again, perhaps the only thing worse would be not to tackle the problem at all.
David Aden is a senior consultant for webworld studios -- a Northern Virginia-based Web application development consulting company. Email