11. Configuration Management (unfinished)

The practice of handling changes systematically so that a system maintains integrity over time.
— CFEngine, What is CM? 

The field of Configuration Management (CM), also known as “change management” or “change control,” predates the software industry. This wide-ranging subject is concerned with managing and replicating products and environments in a consistent, reliable manner, and it supports Project Management, Software Quality, Continuous Integration, and Continuous Delivery, among other fields. Snooze all you want, but without proper CM a software project is far more likely to descend into chaos.

“Entropy,” courtesy HMP Comics

As we’ve discussed at length, living software projects and operations are dynamic and ever-changing. New or modified requirements ripple outward, affecting design, code, tests, documentation, environments, and more. The resulting entropy from all these changes threatens to spiral out of control if not managed proactively. Additionally, individuals and teams who work on subsets of the project need to stay synchronized and resolve conflicts with others; the larger a project is, the more likely it is to suffer from communication issues. Screw-ups inevitably occur and need to be remedied without delay, typically by rolling back to a “last-known-good” state. Diagnosis should be performed afterward, once service is restored, in order to minimize downtime for customers. As Steve McConnell explains in Code Complete (Ch. 28):

If you don’t control changes to requirements, you can end up writing code for parts of the system that are eventually eliminated. You can write code that’s incompatible with new parts of the system. You might not detect many of the incompatibilities until integration time, which will become finger-pointing time because nobody will really know what’s going on.
[snip]
A significant percentage of the projects that are perceived to be late would actually be on time if they accounted for the impact of untracked but agreed-upon changes. Poor change control allows changes to accumulate off the books, which undermines status visibility, long-range predictability, project planning, risk management specifically, and project management generally.

Apple’s Time Machine visualization

In response to these issues we have the field of configuration management: the practice of evaluating and controlling changes, and of maintaining products, environments, and documentation artifacts in a consistent state throughout their lifecycle. Most often the artifact is source code, but it also includes any other work needed to build or understand the product. People, processes, and hardware are also part of this knowledge area, which encompasses and intersects with the following subjects:

Project Management’s change control process to evaluate changes.
Source code management (aka version control)
Build, release, and deployment engineering
Environment configuration
Auditing of inventory and status
Hardware & Systems Administration

11.1. Full Automation

We are programmed just to do… anything you want us to ♪
— Kraftwerk

We are the robots

With these considerations in mind, the next objective that CM (or SCM, S for Software) both enables and demands is complete automation of the construction of products and production environments, as explained by continuousdelivery.com:

We have two overriding goals:

Reproducibility:
We should be able to provision any environment in a fully automated fashion, and know that any new environment reproduced from the same configuration is identical.

Traceability:
We should be able to pick any environment and be able to determine quickly and precisely the versions of every dependency used to create that environment. We also want to be able to compare previous versions of an environment and see what has changed between them.

These capabilities give us several very important benefits:

Disaster recovery

Higher quality

Capacity management

Rapid response to defects

Auditability
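The two goals above can be made a little more concrete with a small sketch. It assumes an environment is described by a manifest that maps each dependency to a pinned version; the `Manifest` type and the `fingerprint` and `diff_manifests` helpers are invented here for illustration and are not part of any particular tool. Identical manifests yield identical fingerprints (reproducibility), and two manifests can be compared to see what changed (traceability).

```python
import hashlib
import json
from typing import Dict, List

# Hypothetical manifest format: dependency name -> pinned version.
Manifest = Dict[str, str]


def fingerprint(manifest: Manifest) -> str:
    """Return a stable hash; the same configuration always hashes the same."""
    canonical = json.dumps(manifest, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_manifests(old: Manifest, new: Manifest) -> List[str]:
    """Describe what changed between two versions of an environment."""
    changes = []
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before == after:
            continue  # dependency is unchanged
        if before is None:
            changes.append(f"added {name} {after}")
        elif after is None:
            changes.append(f"removed {name} {before}")
        else:
            changes.append(f"changed {name} {before} -> {after}")
    return changes
```

Keeping such manifests under version control is what would let you answer "what exactly is running there, and what changed since last week?" without logging into a machine.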

Now that we’ve covered the purpose of configuration management, we’ll move on to implementing it, and the first step on that road is the use of a version control system.

11.2. Environment Configuration

When many people discuss Configuration Management, they are often speaking of the subset related to the automation of the configuration of computing environments.

CFEngine’s video on CM-based automation touts these benefits:

Increases uptime
Improves performance
Ensures compliance
Prevents errors
Reduces costs

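Remember the three I’s (just made that up):

Identical: environments built from the same configuration come out the same.
Idempotent: applying the same configuration repeatedly leaves the system in the same state as applying it once.
Immutable: running systems are replaced with freshly built ones rather than modified in place. See http://www.infoq.com/presentations/scaling-operations-facebook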
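Of the three, idempotence is the easiest to show in a few lines. The sketch below is a hypothetical `ensure_line` helper (not taken from any of the tools listed next) that describes a desired state, "this line is present in this file," rather than an action. Running it once or a hundred times leaves the file in the same state, and the return value reports whether anything actually changed.

```python
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file at `path`.

    Returns True if the file had to be changed, False if it was
    already in the desired state.
    """
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # already converged; do nothing
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True
```

Contrast this with blindly appending to the file in a shell script, which adds a duplicate line on every run.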

CFEngine
Chef and Puppet
Ansible and Salt
Dockerfiles

Warning: Special Snowflakes

Many have made excuses over the years for inefficient manual procedures, but that is not a defensible position in the 21st century, when the benefits of Continuous Integration and Continuous Delivery are widely recognized. The rise of virtualization, containers, cloud, and infrastructure as a service has made deployment automation the norm: ubiquitous and cheap. Even those slinging their own hardware for performance reasons should have their configuration automated these days. There’s just no excuse, unless you like throwing money away and torturing employees with mindless cleanup tasks.

Further, if your infrastructure pieces (dev boxes, build farms, servers, etc.) are managed as “special snowflakes” (configured by hand), you’ll be in for a world of hurt when (not if) they crash. How badly will an extended downtime hurt your organization, and can it survive one?

11.3. Build Engineering

11.4. Release Engineering

11.5. Deployment


11.6. More

Gorelick/Ozsvald, Chapter 12: Lessons from the Field (p. 326):

Defining servers in code has many benefits: complete parity with the production environment; version control of the configuration; having everything in one place. It also serves as documentation on the setup and dependencies required by a cluster.
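To give a feel for what "defining servers in code" might look like, here is a minimal sketch in plain Python. Everything in it is hypothetical: the `ServerSpec` class, the `render` function, and the output format are invented for illustration and do not correspond to Chef, Puppet, Ansible, or any other real tool. The point is only that one definition, kept in version control, can drive both staging and production.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical declarative description of one kind of server."""
    role: str
    packages: Tuple[str, ...]
    settings: Dict[str, str] = field(default_factory=dict)


def render(spec: ServerSpec) -> str:
    """Turn a spec into a deterministic, diff-friendly text form."""
    lines = [f"# role: {spec.role}"]
    lines += [f"package {name}" for name in sorted(spec.packages)]
    lines += [f"set {key}={value}" for key, value in sorted(spec.settings.items())]
    return "\n".join(lines)


# One definition serves every environment, so staging and production
# cannot quietly drift apart.
WEB = ServerSpec(
    role="web",
    packages=("nginx", "certbot"),
    settings={"worker_processes": "4"},
)
```

Because the rendered output is sorted and deterministic, a change to the spec shows up as a small, reviewable diff, and the file doubles as documentation of what a web server needs.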

11.7. Task Q

Gorelick/Ozsvald:

Advice to a Fellow Developer

My main advice would be to shove as much as you can into a task queue (or a similar loosely coupled architecture) as soon as possible. It takes some initial engineering effort, but as you grow, operations that used to take half a second can grow to half a minute, and you’ll be glad they’re not blocking your main rendering thread. Once you’ve got there, make sure you keep a close eye on your average queue latency (how long it takes a job to go from submission to completion), and make sure there’s some spare capacity for when your load increases.

Finally, be aware that having multiple task queues for different priorities of tasks makes sense. Sending email isn’t very high priority; people are used to emails taking minutes to arrive. However, if you’re rendering a thumbnail in the background and showing a spinner while you do it, you want that job to be high priority, as otherwise you’re making the user experience worse. You don’t want your 100,000-person mailshot to delay all thumbnailing on your site for the next 20 minutes!
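The priority and latency advice can be sketched in a few lines. This is a simplification under stated assumptions: instead of several separate queues it uses a single in-process heap with two priority levels, it runs jobs synchronously, and the `PriorityTaskQueue` class and its method names are made up for this example. A real deployment would more likely use a dedicated task-queue system with worker processes.

```python
import heapq
import itertools
import time
from typing import Callable, List, Tuple

HIGH, LOW = 0, 1  # smaller numbers are served first


class PriorityTaskQueue:
    """Toy in-process task queue with priorities and latency tracking."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, float, str, Callable[[], None]]] = []
        # The counter breaks ties so jobs of equal priority stay in FIFO order.
        self._counter = itertools.count()
        self.latencies: List[float] = []

    def __len__(self) -> int:
        return len(self._heap)

    def submit(self, priority: int, name: str, job: Callable[[], None]) -> None:
        entry = (priority, next(self._counter), time.monotonic(), name, job)
        heapq.heappush(self._heap, entry)

    def run_next(self) -> str:
        """Run the most urgent job and record its submit-to-done latency."""
        _, _, submitted, name, job = heapq.heappop(self._heap)
        job()
        self.latencies.append(time.monotonic() - submitted)
        return name

    def average_latency(self) -> float:
        if not self.latencies:
            return 0.0
        return sum(self.latencies) / len(self.latencies)
```

If a batch of emails is submitted at `LOW` and a thumbnail job arrives afterward at `HIGH`, `run_next` picks the thumbnail first, so the mailshot no longer holds up the spinner. Watching `average_latency` over time is a crude stand-in for the queue-latency monitoring the quote recommends.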