The Value of Configuration Management

Summary:

Real-World Reasons for Investing in CM: At CM Crossroads, most discussion is about software CM. Many SCM practitioners are looking for help on-line, and SCM tool vendors keep adding more and more features to their products, pushing the envelope of CM. But no matter how far the envelope is pushed, software configuration management will remain a subspecialty of "plain old" configuration management.

 

If you're new to CM or if you're struggling to get more visibility for CM in your organization, it's important to understand what CM is about. Rather than talk about versions and baselines, let's step back and look at a much simpler question: what value does configuration management provide? Why should you be interested in software configuration management? Why should your engineering manager schedule a CM audit? What does CM offer that your CTO can understand?

It's hard to put a dollar value on improved branching support or distributed conflict resolution. Every SCM tool vendor has a list of little things that incrementally improve the value of their tool, but what was the initial value, the value that we're incrementally improving? Let's skip past the gleaming chrome and the fancy paint of your SCM environment and look at those big heavy beams underneath, holding it all up: the basic value of CM. In simplest terms, configuration management can save your life, it can save your job, it can save your money, and it can keep you out of jail. How's that for value?

CM Can Save Your Life

1) “Pennsylvania firm recalls ground beef products due to possible E. coli O157:H7 contamination”: Washington, Nov. 3, 2007 - Cargill Meat Solutions Corp., a Wyalusing, Pa., firm, is voluntarily recalling approximately 1,084,384 pounds of ground beef products. Each package or label bears the establishment number "Est. 9400" inside the USDA mark of inspection. The ground beef products subject to recall were produced between Oct. 8 and 11, 2007.

2) “Ohio firm recalls frozen meat pizzas due to possible E. coli O157:H7 contamination”: Washington, Nov. 1, 2007 - General Mills Operations, a Wellston, Ohio, establishment, is voluntarily recalling approximately 3.3 million pounds of frozen meat pizza products. Each package also bears the establishment number "EST. 7750" inside the USDA mark of inspection as well as a "best if used by" date on or before "02 APR 08 WS." The frozen meat pizza products subject to recall were produced on or before Oct. 30.

3) “California firm recalls frozen beef tamales that may contain pieces of metal”: Washington, Nov. 8, 2007 - Circle Foods, LLC., a Chula Vista, Calif., establishment, is voluntarily recalling approximately 3,750 pounds of frozen beef tamales. Each carton also bears the establishment number "EST. 17417" inside the USDA mark of inspection as well as a "use by" date of "110308." The products were produced on Nov. 3.

4) “Test triggers lettuce recall in Bay State”: Boston - The state Department of Public Health is asking consumers to throw out packages of "Dole Hearts Delight" with a "best if used by" date of September 19, 2007 and a production code of "A24924A" or "A24924B."

5) “Metz Fresh announces voluntary recall of spinach”: Salinas, CA - August 28, 2007 - Metz Fresh, LLC is voluntarily recalling bagged spinach. The only Metz Fresh product affected is spinach that bears the tracking codes 12208114, 12208214 and 12208314.

If you look closely at the recall information above, you'll notice that some recall descriptions are more detailed than others. This is largely a function of the company involved in the recall: neither the FDA nor the USDA imposes very stringent CM requirements on companies in the food business.

When you eat a frozen tamale filled with metal shards, you'll probably know when it's time to go to the emergency room, and you'll probably live. But when you get ground beef contaminated with E. coli, you won't generally know it until late in the game. If you get help you might be fine, or you might suffer some organ damage, or you might die. Knowing that you are affected can help you get treatment faster. You might feel better, you might avoid organ damage, or you might just avoid dying. All of the recalls cited above include some kind of CM details, generally lot numbers or an establishment number and a date range. Identifying the products in your fridge can help you avoid trouble, or realize that you need to get help. After the products are consumed it's harder to know, since the box or wrapper is generally in the trash and out the door.

On January 21st, 1993, the Jack in the Box restaurant chain (owned by Foodmaker, Inc.) began taking responsibility for E. coli contamination caused by contaminated, inadequately prepared hamburger meat. Four children suffered from the contamination and died. Food preparation and handling standards have since been tightened nationwide.

In 2000 and 2002, the USDA ordered a record number of beef contamination recalls. In 2007, the number of recalls was one short of that record. The beef industry says it spends in excess of $350 million annually to prevent contamination, but the fact remains that this is a real, recurring problem. In effect, there was one beef recall every three weeks in 2007, and those recalls got started because contamination was detected in a product already shipped from the meat producers. If the manufacturer isn't doing good CM, then the recall has to be very broad. That makes it harder to know if you're affected. And in this case, what you don't know can kill you or someone in your family.

Bridge Collapse: A Half-Inch Closer To Why

The collapse of the Interstate 35W bridge may have originated with the failure of gusset plates that were sized a half-inch too thin in the original 1960s design, the National Transportation Safety Board (NTSB) said Tuesday.

If you think CM only applies to engineering projects, meat and some vegetables, consider the collapse of the Interstate 35W bridge between Minneapolis and St. Paul, Minnesota. Current thinking (the final NTSB report hasn't been released) indicates that the gusset plates (plates that overlap the joints between beams) were too thin. Out of 224 gusset plates, 16 were the wrong thickness. The thicknesses, right and wrong, are specified in the original design. This was not a construction error or a QA failure; it was a CM failure. Adequate CM could have identified the discrepancy between sizes of gusset plates, and adequate CM should have caught the failure during an audit. Instead, 13 people died and more than 100 more were injured.

CM Can Save Your Job

The Topps Meat Company recall was the most spectacular public CM failure of 2007. There is a QA component to the failure, but after the "quality problem" was detected (in this case, by E. coli sufferers appearing in hospitals), there was no way to narrow down the products affected. Topps engaged in what is called carrying over. Basically, meat that was left over from one shipment was mixed in with meat arriving in the next shipment. This allowed contamination to remain in the system essentially forever. The result was that the Topps recall specified the USDA establishment number (basically identifying the Topps meat plant) and specified all products coming from that establishment for one year.

If Topps had paid serious attention to its CM function (in particular, if it had performed even one serious CM audit with the example of every other recent recall in mind), the need for separate, discrete lots or runs would have been obvious, and the carry-over of excess ground meat would have been eliminated in favor of a lot or batch system. The result of this CM failure was dramatic. A week after announcing the recall of 21.7 million pounds of ground beef products, Topps issued a press release explaining that it "cannot overcome the economic reality of a recall this large." Topps shut down its plant, discharged nearly all of its employees, and ceased operations on October 5, 2007.
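The difference between carry-over and discrete lots can be sketched in a few lines of Python. This is only an illustration: the `Batch` record, its fields, and the shipment names are hypothetical, and nothing here describes how Topps or any real plant tracked its production.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Batch:
    """One production run (hypothetical record, for illustration only)."""
    batch_id: str
    inputs: FrozenSet[str]        # supplier shipments ground into this batch
    carried_from: Optional[str]   # earlier batch whose leftovers were mixed in


def recall_scope(batches: List[Batch], bad_shipment: str) -> List[str]:
    """Return the IDs of batches that could contain meat from bad_shipment.

    Batches must be listed in production order.
    """
    affected = set()
    for batch in batches:
        direct = bad_shipment in batch.inputs
        inherited = batch.carried_from in affected
        if direct or inherited:
            affected.add(batch.batch_id)
    return [b.batch_id for b in batches if b.batch_id in affected]


# Discrete lots: nothing is carried from one batch into the next.
discrete_lots = [
    Batch("A", frozenset({"S1"}), None),
    Batch("B", frozenset({"S2"}), None),
    Batch("C", frozenset({"S3"}), None),
]

# Carry-over: each batch includes leftovers from the previous one.
carry_over = [
    Batch("A", frozenset({"S1"}), None),
    Batch("B", frozenset({"S2"}), "A"),
    Batch("C", frozenset({"S3"}), "B"),
]

print(recall_scope(discrete_lots, "S1"))  # ['A']
print(recall_scope(carry_over, "S1"))     # ['A', 'B', 'C']
```

With discrete lots, one bad shipment means one recalled batch. With carry-over, every later batch inherits the problem, which is why the recall had to cover the whole plant for a whole year.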

A less recent CM failure was the WorldCom implosion. WorldCom was one of the spectacular failures that prompted the passage of the Sarbanes-Oxley law, creating an entire new market for CM tools and practitioners, and making ITIL a recognized term for IT professionals worldwide. The collapse of WorldCom disrupted the lives of the 80,000 employees, and eventually left 17,000 of them unemployed. CM can save your job by saving your entire company.

On a smaller scale, there are stories in the media about production software CDs that are inadvertently shipped with viruses on them. Arguably worse are the stories about software deliberately shipped with malicious or inappropriate software, such as rootkits and Easter eggs. While it is conceivable that burning a CD image that includes a virus won't cost you your job, do you want to count on it?

CM Can Save Your Money

WorldCom wasn't a traditional CM failure. There wasn't an omission or oversight that caused the collapse. Instead, it was fraud. Scott Sullivan, WorldCom CFO, ordered changes to accounts so as to allow the company to report phony profits. In effect, a CM policy was already in place: U.S. federal law. But WorldCom didn't implement good enough controls, and so these changes passed through unreviewed. As a result, the company no longer exists.

To add insult to injury, the company culture discouraged the sale of shares. About 32 percent of the employee pension fund was invested in WorldCom shares. In 1999 the pension fund held WorldCom shares valued at $1.19 billion. In 2002, after the implosion, the shares were worth $18.7 million, less than two percent of their earlier value. The other billion dollars went up in smoke.

On the bright side, Playboy magazine featured the "Women of WorldCom" in its December 2002 issue. If you've just lost your income, the town you live in has just lost its single biggest employer, and your retirement portfolio has just shrunk by a third, take heart! You might have the body to appear in Playboy. Maybe I should have included a "CM can save your dignity" section?

If you don't like the telecom industry, you could always try energy. Enron, the other big name associated with the passage of Sarbanes-Oxley, collapsed in late 2001. Once again, accounting fraud was the culprit (and "getting caught" was the cause). Share prices dropped from just over $90 to basically nothing, another 99% reduction in value. Perhaps the share prices were propped up by a large investment in Aeron chairs for IT staffers. Even used, they have a pretty good resale value.

At any rate, Hugh Hefner stepped in and rescued some of the unfortunate workers. The Women of Enron issue came out in June, 2002. Ten lucky gals, out of some 4,500 employees, got to cash one last check. The other 4,490 workers (including all the guys) got to start selling their living room furniture on eBay.

Cleverly, the Enron 401(k) contribution-matching plan used Enron stock. Equally cleverly, employees were prohibited from trading the company's contributions until they reached 50 years of age, and prohibited from trading more than 25% of the contributions after 50. A little aggressive financial management, accounting fraud and corporate collapse, and Enron employees lost an estimated one billion retirement-plan dollars.

The Sarbanes-Oxley Act was passed in response to these kinds of shenanigans. While the act can't prevent all fraudulent accounting, it would have prevented the worst offenses in both cases. Not only would the employees still have their retirement savings, but the corporations' shareholders would still have their billions of dollars of value.

CM Can Keep You Out of Prison

The Enron and WorldCom scandals, and incidents like them, have resulted in significant prison terms for those involved. (Former WorldCom CEO Bernard Ebbers was sentenced to 25 years. Enron CEO Jeff Skilling got 24 years. Jamie Olis, of Dynegy, received 24 years for his part in an Enron-like deal.) Sarbanes-Oxley now requires that all changes affecting a corporation's balance sheet be controlled changes: documented, reviewed, approved. In effect, the law requires CM in the finance department. Of course, the list of things that affect a balance sheet is pretty long, so there is more impacted than just finance. But what the law really does is require a paper trail for changes: "if you're going to be a crook, sign here."
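As a rough sketch of what "documented, reviewed, approved" means for a single change, consider the Python below. The `LedgerChange` record, its fields, and the three checks are assumptions made for illustration; they are not taken from the text of Sarbanes-Oxley or from any real accounting system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LedgerChange:
    """Hypothetical record of one change that affects the balance sheet."""
    description: str
    author: str
    reviewer: Optional[str] = None
    approver: Optional[str] = None


def control_gaps(change: LedgerChange) -> List[str]:
    """List the ways a change falls short of being a controlled change."""
    gaps = []
    if not change.description.strip():
        gaps.append("undocumented")
    # A review only counts if someone other than the author performed it.
    if change.reviewer is None or change.reviewer == change.author:
        gaps.append("no independent review")
    if change.approver is None:
        gaps.append("unapproved")
    return gaps


unreviewed = LedgerChange("Adjust quarterly accrual", author="cfo")
print(control_gaps(unreviewed))  # ['no independent review', 'unapproved']

controlled = LedgerChange(
    "Adjust quarterly accrual",
    author="analyst",
    reviewer="controller",
    approver="audit committee",
)
print(control_gaps(controlled))  # []
```

The names on the record are the paper trail: a change with gaps gets stopped, and a change without gaps has signatures attached to it.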

In high-profile blow-ups like these, the law is generally used as a sort of ultimate CM policy. Sometimes this is the right thing to do, as with Sarbanes-Oxley. Each company is substantially different in how it handles its finances, and in fact many companies outperform their competition in the marketplace based on superior financial management. Having the freedom to experiment with novel approaches is important, and Sarbanes-Oxley permits this freedom (provided that everything is documented). In other cases, like the collapse of the I-35W Mississippi River bridge in Minnesota, the benefit of construction standards is that they serve as a repository of knowledge. Each time a failure is detected, the standards are updated to prevent that failure from happening again. Obviously, failure to comply with the latest standard can lead to repeated failure, rework expenses, financial penalties, lawsuits, ignominy, and even jail time.

In the case of bridges (discrete, identifiable items, each of which is a complex assembly) it would make sense to build and maintain a database of the construction details of each one. This would not be a CM activity directly, but more of a "meta CM" activity. The value, though, is pretty clear. Instead of performing a separate series of panicky investigations, a central registry of designs would allow bridges likely to share a similar vulnerability to be identified. Any remediation or mitigation activities could be coordinated and scheduled based on vulnerability.
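A minimal sketch of such a registry lookup might look like the following. Everything in it is hypothetical: the `BridgeRecord` fields, the design-family labels, and the choice to rank by daily traffic as a stand-in for "vulnerability" are assumptions for illustration, not a description of any real bridge inventory.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BridgeRecord:
    """Hypothetical registry entry for one bridge."""
    bridge_id: str
    design_family: str   # label for a shared structural design
    year_built: int
    daily_traffic: int   # vehicles per day, used here as a crude priority


def inspection_queue(registry: List[BridgeRecord],
                     design_family: str) -> List[BridgeRecord]:
    """Bridges sharing a suspect design, busiest first."""
    matches = [b for b in registry if b.design_family == design_family]
    return sorted(matches, key=lambda b: b.daily_traffic, reverse=True)


registry = [
    BridgeRecord("BR-001", "deck-truss", 1967, 140_000),
    BridgeRecord("BR-002", "girder", 1985, 90_000),
    BridgeRecord("BR-003", "deck-truss", 1962, 25_000),
    BridgeRecord("BR-004", "deck-truss", 1971, 60_000),
]

for bridge in inspection_queue(registry, "deck-truss"):
    print(bridge.bridge_id, bridge.daily_traffic)
```

One query replaces a round of separate investigations: the bridges that share the suspect design come back as a single ordered worklist.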

The key difference between software development and other forms of product development is speed. Very large software systems can be specified, designed, and built in an amazingly short time. Automatic deployment of changes is now a routine feature. This makes software increasingly cheap, but makes systems that depend on software increasingly vulnerable to error. CM on a software system has to keep up with that speed and complexity, which is part of why software CM tools keep adding features. The fundamental principles of CM are the same, though: name, version, identify, control, audit. These heavy beams support the rest of your development infrastructure. They aren't glamorous, but they don't need to be: they are supporting everything else.
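For the software case, the "identify" and "audit" steps can be sketched in a few lines of Python. This is a toy under stated assumptions: the file names and contents are made up, the baseline is a plain dictionary of SHA-256 digests, and a real SCM tool does far more than this.

```python
import hashlib
from typing import Dict


def fingerprint(content: bytes) -> str:
    """Identify a configuration item by the SHA-256 digest of its content."""
    return hashlib.sha256(content).hexdigest()


def audit(baseline: Dict[str, str], actual: Dict[str, bytes]) -> Dict[str, str]:
    """Compare what is deployed against the recorded baseline.

    baseline maps item name to expected digest; actual maps item name
    to the content actually found. Returns one finding per discrepancy.
    """
    findings = {}
    for name, expected in baseline.items():
        if name not in actual:
            findings[name] = "missing"
        elif fingerprint(actual[name]) != expected:
            findings[name] = "modified"
    for name in actual:
        if name not in baseline:
            findings[name] = "uncontrolled"
    return findings


# Hypothetical baseline and deployment.
baseline = {
    "app.cfg": fingerprint(b"timeout=30"),
    "schema.sql": fingerprint(b"CREATE TABLE t (id INT);"),
}
deployed = {
    "app.cfg": b"timeout=300",   # changed without going through control
    "hotfix.sh": b"echo patched",  # never placed under CM at all
}

print(audit(baseline, deployed))
```

An empty result means the deployment matches the baseline; anything else is a discrepancy to be explained.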

CMCrossroads is a TechWell community.
