Managing the Dynamic Datacenter

Datacenter Automation




The Table Stakes for Managing IT Just Went Up

Changing the approach to system operations

CIOs see operational excellence as the "table stakes" in the high-roller game of IT. It is often thought of as the starting point for creating business value in this game: businesses use IT for competitive advantage, and the business that uses it most successfully wins. However, operational excellence has always been a moving target - more of a journey than a destination, because it's seldom attained, and then not for very long. What was good enough yesterday no longer meets today's needs and surely won't meet tomorrow's.

Today's financial climate adds massive complications to this journey and totally changes the stakes for CIOs and the IT organizations they run. To succeed in competitive markets, operational excellence needs to constantly improve over the next several years. Attaining operational excellence will require IT to change how it approaches system operations. The stakes for managing IT just went up.

The current business environment presents more challenges than any in years. As its business partners grapple with white-knuckle ups and downs in the marketplace, IT must find ways to deliver efficiency and improved service without being any less nimble or responsive. Innovation can't take a backseat to reducing costs, absorbing change, or continuous improvement. The business side needs every competitive advantage it can get, and IT needs to be part of the solution, or it risks being seen as part of the problem. If you're in IT, it's plain to see that both alignment with and adding value to the business have taken on new urgency.

From a business perspective, IT only adds value when it builds and deploys new services and applications that the business uses. Anything else is simply housekeeping that maintains existing systems and services. Yet today, most IT organizations spend only 10% to 20% of their resources building and deploying these new services and applications, because the other 80% to 90% goes to maintaining existing systems. IT professionals spend most of the day fixing problems, which keeps them from applying time and resources to activities that could truly drive business innovation. IT managers, in turn, may find themselves mired in operations and spending too little time helping the business grow. Systems management tasks for existing systems, such as performance management, backup and recovery, and systems maintenance, are all necessary, but the business does not see them as adding greater value. The key to adding greater value is to increase management focus and the level of resources dedicated to building and deploying new services and applications that support business initiatives.

Fundamentally, IT needs to reset its economics and dramatically improve its operational efficiency. There are three ways to do so: simplify infrastructure; reduce waste and error; and increase automation.

Efforts to simplify infrastructure are already underway in many IT organizations, initiated in part by efforts to "go green." These organizations are reducing costs by taking advantage of larger servers and virtualization to support server consolidation. This reduces both facilities costs and the energy footprint. In addition, second passes at consolidation can yield further efficiencies through datacenter consolidation and the standardization of tools and processes.

A variety of approaches can be used to reduce waste and error:

  • Identify expenditures that are no longer justified, such as support for legacy applications and hardware that have been retired.
  • Evaluate tasks and processes that are not strategic to supporting the business and consider no longer doing them or outsourcing them where it is cost justified.
  • Ensure operational best practices are identified, kept up-to-date, and consistently followed to minimize operational issues. If the appropriate process and policy are not consistently followed, errors occur that negatively impact service. Further resources are then wasted backing out these errors and redoing the task correctly to restore service.
  • Stop wasting your most expensive staff's time on routine and repetitive tasks. Wherever possible, move these tasks to less expensive resources and refocus expensive high-performing staff members on identifying and managing best practices as well as delivering new services and applications.

Increasing IT automation is a particularly enticing area to consider. IT has been automating business processes, so it knows how to automate and knows the value automation can deliver: lower costs, higher quality, and greater responsiveness. But like the cobbler's children who have no shoes, IT is too often so busy serving the automation needs of the business that it gives insufficient attention to its own.

But where can IT get the greatest return from automating its processes? For those familiar with ITIL processes, the place to start is with incident management within service operations. The purpose of incident management is to restore IT service to customers as quickly as possible once an incident occurs. An incident doesn't have to have already created a service issue. Incidents also include situations where a service issue is imminent, such as detecting that an Exchange Server data store will run out of space during the next few days unless action is taken, or that the resource pools of a Web application are nearly consumed, indicating a strong chance of impending end-user performance impact. These too are incidents and should be dealt with proactively. Incident management is the service operation discipline that most directly touches the business; if the business could see into IT, this is where it would most readily understand the value IT adds.
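As an illustration of catching an imminent out-of-space incident, the following is a minimal sketch that fits a straight line through recent usage samples and estimates how many days remain before a data store fills. The names `UsageSample`, `days_until_full`, and `is_imminent`, the linear-growth assumption, and the three-day horizon are all hypothetical choices for this example; they are not taken from any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass
class UsageSample:
    day: float      # days since the first sample
    used_gb: float  # space used at that time


def days_until_full(samples, capacity_gb):
    """Least-squares line through the samples, extrapolated to capacity.

    Returns None when usage is flat or shrinking (no exhaustion predicted).
    Assumes growth is roughly linear, which is a simplification.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(s.day for s in samples) / n
    mean_y = sum(s.used_gb for s in samples) / n
    sxx = sum((s.day - mean_x) ** 2 for s in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one point in time")
    sxy = sum((s.day - mean_x) * (s.used_gb - mean_y) for s in samples)
    growth_per_day = sxy / sxx
    if growth_per_day <= 0:
        return None
    # Evaluate the fitted line at the most recent sample time.
    latest_day = max(s.day for s in samples)
    intercept = mean_y - growth_per_day * mean_x
    fitted_now = intercept + growth_per_day * latest_day
    return max(0.0, (capacity_gb - fitted_now) / growth_per_day)


def is_imminent(samples, capacity_gb, horizon_days=3.0):
    """True when the store is predicted to fill within the horizon."""
    remaining = days_until_full(samples, capacity_gb)
    return remaining is not None and remaining <= horizon_days


history = [UsageSample(0, 90), UsageSample(1, 92),
           UsageSample(2, 94), UsageSample(3, 96)]
print(days_until_full(history, capacity_gb=100))  # 2.0 days at 2 GB/day
print(is_imminent(history, capacity_gb=100))      # True
```

A check like this could open an incident before any user is affected, which is the proactive handling described above.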

Incident management has three components where automation can significantly assist in reducing costs: incident avoidance; incident detection; and incident resolution.

Automated incident avoidance provides very high value by automating the daily, weekly, and monthly checklists that exist as best practices for managing infrastructure and applications. Many suppliers of servers, operating systems, and applications provide recommended checklists that should be run regularly to uncover incidents. In ITIL's problem management, when the resolution to a problem is determined, it too can become part of the ongoing automated preventive maintenance. Unfortunately, preventive maintenance is often neglected. Ironically, it's often deferred in order to resolve incidents that would have been detected in advance if the preventive maintenance had been done! Automating daily maintenance such as checklists can significantly reduce the number of incidents that occur in the first place and avoid wasting resources on incidents and problems that were avoidable.
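A minimal sketch of what an automated checklist might look like follows. It assumes each checklist item can be expressed as a small probe function returning a pass/fail flag and a detail string; the `Check` class, the `run_checklist` function, and the stubbed probes are hypothetical and stand in for whatever a supplier's recommended checks actually inspect.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Check:
    name: str
    frequency: str                         # "daily", "weekly", or "monthly"
    probe: Callable[[], Tuple[bool, str]]  # returns (ok, detail)


def run_checklist(checks: List[Check], frequency: str) -> List[Tuple[str, str]]:
    """Run every check scheduled at `frequency`; return (name, detail) for failures."""
    findings = []
    for check in checks:
        if check.frequency != frequency:
            continue
        try:
            ok, detail = check.probe()
        except Exception as exc:
            # A probe that cannot run is itself a finding worth reporting.
            ok, detail = False, f"probe raised {exc!r}"
        if not ok:
            findings.append((check.name, detail))
    return findings


# Stub probes for illustration only; real probes would query the systems.
checks = [
    Check("backup completed", "daily", lambda: (True, "last run succeeded")),
    Check("data store free space", "daily", lambda: (False, "92% full")),
    Check("certificate expiry", "weekly", lambda: (True, "valid for 40 days")),
]

for name, detail in run_checklist(checks, "daily"):
    print(f"potential incident: {name} ({detail})")
```

Each finding is a candidate incident caught before a customer notices it, and a resolution that comes out of problem management can be added to the list as one more `Check`.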

Today, incident detection typically involves manually sifting through calls and e-mails to the service desk as well as processing operational alerts. ITIL places the service desk at the heart of incident management because, for many IT organizations, a customer complaint is generally the first indication of an incident that threatens service. IT needs to be more proactive in managing incidents and should set a goal of detecting most incidents without requiring a customer to call.

Complex event processing can be an effective tool to automatically determine when patterns of alerts, events, and other data indicate that an incident has just occurred. By automating incident detection, IT can improve service by identifying incidents more quickly and beginning to resolve them sooner. IT can also improve its internal reputation by avoiding the awkward situation of having customers be the ones to report that incidents are occurring.
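To make the idea of pattern-based detection concrete, here is a deliberately small sketch of one kind of rule a complex event processing engine might evaluate: event A followed by event B on the same host within a time window. The `Event` and `SequenceDetector` names and the event kinds are hypothetical, and the sketch assumes events arrive in timestamp order; a real CEP product would offer a far richer rule language.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds
    host: str
    kind: str         # e.g. "disk_latency", "app_timeout" (hypothetical)


class SequenceDetector:
    """Flags `first` followed by `then` on the same host within `window_s`."""

    def __init__(self, first: str, then: str, window_s: float):
        if first == then:
            raise ValueError("first and then must be different event kinds")
        self.first = first
        self.then = then
        self.window_s = window_s
        # host -> timestamps of `first` events not yet matched or expired
        self._pending = defaultdict(deque)

    def feed(self, event: Event) -> Optional[Tuple[str, float, float]]:
        """Return (host, started_at, detected_at) on a match, else None."""
        pending = self._pending[event.host]
        # Discard `first` events that are too old to match anything.
        while pending and event.timestamp - pending[0] > self.window_s:
            pending.popleft()
        if event.kind == self.first:
            pending.append(event.timestamp)
        elif event.kind == self.then and pending:
            started = pending.popleft()
            return (event.host, started, event.timestamp)
        return None


detector = SequenceDetector("disk_latency", "app_timeout", window_s=60)
stream = [
    Event(0, "web1", "disk_latency"),
    Event(30, "web1", "app_timeout"),   # within 60s of the latency alert
    Event(40, "web2", "disk_latency"),
    Event(200, "web2", "app_timeout"),  # too late to count as a match
]
for ev in stream:
    match = detector.feed(ev)
    if match:
        print("incident detected:", match)
```

Only the `web1` pair is reported; the `web2` timeout arrives after the window has closed, so it is treated as unrelated.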

Incident resolution is the process of carrying out the tasks necessary to close an incident and return processing to its normal state. Today this is a time-consuming manual process, and there is always the issue of ensuring that the analyst assigned to the incident actually follows best practices and adheres to company policy. Automation of incident resolution provides several benefits in closing incidents. First, it can ensure that there is little delay between identifying an incident and beginning to resolve it. Second, it can ensure that best practices and company policies are consistently followed. Third, it creates an audit trail to track and enhance processes. And finally, it can significantly reduce the mean time to repair and the resources required to resolve an incident.
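The benefits listed above can be sketched as a tiny runbook executor: the steps run in a fixed order so the documented practice is always followed, and every step is recorded so there is an audit trail. `Step`, `AuditEntry`, and `resolve_incident` are hypothetical names for this example, and the step actions are stubs rather than real remediation commands.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[dict], bool]  # returns True on success


@dataclass
class AuditEntry:
    incident_id: str
    step: str
    succeeded: bool
    at: float  # timestamp from the supplied clock


def resolve_incident(incident_id: str, context: dict, steps: List[Step],
                     audit: List[AuditEntry], clock=time.time) -> bool:
    """Run the steps in order, logging each one; stop at the first failure.

    Returns True if every step succeeded, False if a human should take over.
    """
    for step in steps:
        try:
            ok = bool(step.action(context))
        except Exception:
            ok = False
        audit.append(AuditEntry(incident_id, step.name, ok, clock()))
        if not ok:
            return False
    return True


# Stubbed runbook for illustration; real actions would call system tools.
runbook = [
    Step("restart service", lambda ctx: True),
    Step("verify health check", lambda ctx: ctx.get("healthy", False)),
    Step("close ticket", lambda ctx: True),
]

trail: List[AuditEntry] = []
resolved = resolve_incident("INC-001", {"healthy": True}, runbook, trail)
print("resolved automatically:", resolved)
for entry in trail:
    print(entry.step, "ok" if entry.succeeded else "FAILED")
```

Because the executor stops at the first failed step, the audit trail also shows exactly where an analyst needs to pick up the work.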

The technical solutions that can assist in automating IT processes and procedures, including incident management, fall under a new class of tools called IT process automation or IT process management. These tools allow processes and policies to be automated without scripting, although existing scripts can be reused if desired.

They are graphically oriented and allow you to use a visual editor to create these processes and policies using a familiar drag-and-drop paradigm. These tools make it significantly easier to create, document, automate, and audit process and policy. And they can simultaneously reduce the resources required to support IT and increase levels of service to the business. Some of the capabilities you should look for when considering these tools are:

  • Drag-and-drop visual process editor - makes it easy to design and build processes, and makes it easy for others to evaluate and support
  • Automatic process wiring - makes it easy to link the steps together
  • Full-function workflow engine - ensures that you can handle iterative steps, case logic, and other complexities in processes
  • The ability to manage processes separately from policy - since policy and who implements it will change more often than the process itself, you want to be able to isolate policy from the process
  • Embedded administrator expertise - having domain knowledge built-in makes the technology easier to implement in that domain
  • The ability to initiate processes ad hoc or automatically, based on sophisticated calendars or events - ensures that processes can be initiated how and when you want
  • Automation of routine daily, weekly, and monthly best practices - support for incident avoidance and preventive maintenance
  • Support for client self-service - allows expensive IT human resources to offload tasks to more cost-effective persons
  • Full-function security model - ensures that you can control who has access to the automated processes and who can edit them, protecting system integrity
  • Auditing and logging of all active processes - required for system auditing and to support identifying areas for enhancement
  • Support for the platforms and applications important to your business - IT process automation will be most effective if you select a specific domain to start in rather than approaching the problem broadly
  • Complex event processing - provides the ability to handle complex decision-making based on prior events and the current status of your systems
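One capability in the list above, managing processes separately from policy, is easier to see in a small sketch. Here the process (compare, optionally clean up, notify) is fixed in code, while the policy (threshold, whether cleanup is allowed, whom to notify) is plain data that can change without touching the process. `DiskPolicy` and `low_disk_process` are hypothetical names chosen for this example and do not describe any specific tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DiskPolicy:
    warn_below_pct: float  # act when free space drops below this percentage
    auto_cleanup: bool     # whether automated cleanup is permitted
    notify: str            # team or person to notify


def low_disk_process(free_pct: float, policy: DiskPolicy) -> List[str]:
    """Fixed process; the policy supplies the specifics. Returns planned actions."""
    actions: List[str] = []
    if free_pct >= policy.warn_below_pct:
        return actions
    if policy.auto_cleanup:
        actions.append("purge temp files")
    actions.append(f"notify {policy.notify}")
    return actions


# The same process under two different policies.
production = DiskPolicy(warn_below_pct=20, auto_cleanup=False, notify="storage-team")
test_lab = DiskPolicy(warn_below_pct=10, auto_cleanup=True, notify="lab-admin")

print(low_disk_process(15, production))  # ['notify storage-team']
print(low_disk_process(15, test_lab))    # []
print(low_disk_process(5, test_lab))     # ['purge temp files', 'notify lab-admin']
```

When responsibility for storage moves to another team, only the policy record changes; the process itself is untouched.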

By increasing automation for managing IT itself, the industry has significant opportunities to change the economics of providing IT to the business.

Operational excellence has been and will continue to be a moving target. While the economic climate is challenging, the good news is that there are opportunities to reset the economics of IT and dramatically improve its operational efficiency. There are areas where we can reduce our operational costs. If the industry focuses on what delivers increased value to the business - building and deploying new services and applications the business uses - it can weather the current financial difficulties and emerge with its reputation enhanced.

More Stories By Douglas MacKinnon

Douglas R. MacKinnon, Ph.D. is Director of Product Strategy at Tidal Software. Tidal Software brings a radically simple approach to optimizing the operations of one of your company’s most complex assets: its enormous infrastructure of enterprise applications including SAP, Oracle, Informatica, Symantec/Veritas and many others.

More Stories By Wayne Greene

Wayne Greene is vice president of product management at Tidal Software. He also manages the Tidal Enterprise Scheduler product line and drives the product strategy across the company. Prior to Tidal, he was director of competitive and market intelligence at the Wily Technology Division of Computer Associates. Prior to CA, he was at Hewlett Packard for 18 years in a variety of different roles, spending the last 6 years at HP in the Corporate IT and Outsourcing Services division where he was responsible for the enterprise management toolset. Wayne has a PhD from University of California at Berkeley, and a Bachelor of Science from M.I.T.
