Wednesday, October 14, 2009

Running with scissors? Or maybe trimming the risks out of virtualization instead

What's riskier: standing at the top of a hill in a thunderstorm while holding a golf club...or commuting to work? Skydiving...or flying to Pamplona and then taking part in the Running of the Bulls? OK, now for the really tough one: juggling knives...or implementing virtualization in production?

Before you answer, you should be warned that humans are quite bad at assessing relative risks. TIME Magazine had a cover story a number of years back on that very subject. The problem in a nutshell, scientists say, is that we're "moving through the modern world with what is, in many respects, a prehistoric brain."

Deploying virtualization doesn't sound all that dangerous, especially compared with some of the scarier items above (like, say, if you lack knife-juggling skills). If that was your answer, you probably haven't been in an IT shop recently. OK, maybe it's not the same as spending the night in the polar bear cage at the zoo, but it's not without risks. And since risk is a four-letter word in large IT shops that handle mission-critical systems, it's worth figuring out how to get the benefits while minimizing potential problems.

The IT Process Institute survey that VMware and my employer, CA, sponsored and released a few weeks back (actual survey available here) was put together to identify what some of the more mature IT shops are doing to deal with the risks that virtualization introduces.

The survey itself, which covered 323 different IT organizations, is a bit daunting to wade through, so I pulled out some interesting tidbits worth highlighting here:

People's sights are set higher than just server consolidation. And they are being aggressive. 72% are aggressively virtualizing production servers, but only 19% are using virtualization just to consolidate servers. The bigger focus is on pursuing high availability and disaster recovery. And, nearly another third are shooting for dynamic resource optimization.

If you use virtualization in production, you are going to have to change operating procedures and controls. The survey found that those organizations with a strong foundation of process controls and procedures were likely to only need to modify the controls they already have in place. That's good news for some of the bigger IT shops and their IT ops staff. However, the more complex things you try to do with virtualization, the survey found, the more modifications should be considered. Kind of straightforward, but worth repeating.

Many mature virtualization users have at one point limited the release of virtualization in production until training requirements and management procedures were taken care of. Maybe it's just a phase everyone has to go through, but it seems many have slowed things down to err on the side of caution. The survey shows, however, that many IT organizations have now reached what it calls "a level of confidence needed to aggressively virtualize business critical systems, including those that are in scope for regulatory compliance." That's impressive, actually, and is a big change from a few years ago.

Here's where the running with scissors part comes in

The study identified a bunch of virtualization bogeymen -- things that seriously worry the IT shops working to deploy virtualization. Some of those worries included:

· It makes a mess (technology-wise). Also known as virtual sprawl (a term that VMware was very sensitive about when we started using it a few years back at Cassatt). This can also hinder compliance efforts.
· Things can hide. There are potential issues with discovery tools not tuned to work with virtual systems.
· Too much of a good thing. There is license compliance risk if virtual servers can appear too easily. You might also exceed available resource capacity.
· Putting all your eggs in one basket. Well-meaning administrators can inadvertently make things riskier by stacking critical apps together on one faulty machine.
· It makes a mess (organization-wise). Aggressive adoption probably means specialized training and new organization structures.
· A perfect target. Security is a big concern for survey respondents, who worry about the hypervisor as a new layer of technology that can be attacked.

Those probably all sound familiar. The interesting point is that the survey said they all added up to this: "putting virtualized systems into production without a well-reasoned set of operational controls creates an unacceptable level of production and compliance risk."

OK, time to hit these problems head-on, then.

The survey's recommendations for reducing virtualization risk

So what's a good way to start addressing those risks (besides hiding the scissors)? The survey has three sets of recommendations. I've noted where in the survey to find them so you won't have to dig through it yourself:

· 11 practices for those organizations with "baseline maturity" (generally doing server consolidation-type things with business critical systems). The focus for those orgs is on host access, configuration controls, VM provisioning, and improvements to capacity & performance management. See page 14 of the survey for the exhaustive list.
· 25 practices for "highly mature, but static" uses of virtualization (generally looking at HA & DR issues). There the suggestions are about configuration standardization, approved build images for provisioning, and using a "trust but verify" approach for changes. It takes all of page 17 of the survey to list these suggestions.
· For the brave ones doing "highly mature, but dynamic" things with their virtualization, the research suggested 12 items around configuration discovery & tracking, change approvals, capacity management, and the overall process maturity needed to support automation. See page 22 for this list.

Some virtualization management suggestions

One of the suggestions that's "highly desirable" is a coordinated view between your physical and virtual environment, according to today's Computerworld article from Beth Schultz about "Getting a Grip on Multivendor Virtualization." CA's Stephen Elliot was quoted in the article talking about some of this survey's findings. "A lot of customers are recognizing that virtualization is great, and works wonders," said Elliot, "but certain environments will not be virtualized and so they need to figure out how to manage and automate both worlds together."

I've posted before on why the automation side of the equation is important, as have Lori MacVittie and others. The report chimes in here, too: "Many view automation used to manage dynamic virtual resources as a prerequisite for tapping internal and external cloud computing resources." But that's a subject for another day.

The Computerworld article also has some good comments about the importance of being able to manage across multiple virtualization vendors' environments, something that has also been discussed here, but was outside the more process-oriented scope of this particular survey.

"The key thing that pleasantly surprised us [in the study] is that customers right now...are thinking more proactively about the need to manage their virtual infrastructure," Elliot said in an interview with Jeffrey Burt of eWeek. "Just because they've got new innovations [in their data centers] doesn't mean that their need for management just disappears."

That, after all, would be pretty risky.
