Posts tagged powershell

Puppets

Today’s post will be about another proof-of-concept I’ve been doing recently — using Puppet to manage the test lab (and more). By the way, if you’re interested in working for me, here’s the job description.

What is Puppet?

Puppet is infrastructure management software that lets you control the configuration of multiple servers from one central place. The configuration is defined in a declarative way via so-called manifests. A manifest is a collection of resource definitions, and each resource describes the desired state of one thing, e.g. a file named X should exist and have this or that content, or service Y should be running.

Puppet consists of two components: an agent and a server (a.k.a. master). The agent needs to be installed on each managed machine and its purpose is to apply the manifests sent by the master to the local machine. The agent software is free (Puppet Open Source) and can run on any OS. The master, on the other hand, is part of Puppet Enterprise and is obviously not free software.

Another interesting thing about Puppet is the Forge. It is a place where the community can exchange Puppet modules (packaged, reusable configuration elements).

Last but not least, there is the idea of master-less Puppet. In such a scenario there is no central server and agents get their manifests straight from some package repository, or even have the manifests pushed to them (e.g. using Pulp).

Puppet for Windows

It’s probably not a surprise that Puppet focuses on non-Microsoft operating systems, in particular the Red Hat and Debian Linux distributions. Support for Windows is not as complete, but all the important parts work (e.g. file manipulation, service management, package installation). The only problem might be that the Puppet master is not available for Windows. It would pose a challenge for me (and our IT department) if we wanted to use it, but… this slide explains why we’ve chosen the master-less way. One more reason for going that route is that I’d like to keep my manifests in the source code repository. But I am getting ahead of myself.

Puppet in a test lab

Why do we even need Puppet to manage our test lab? We decided that for each project we run, we automatically create two virtual environments: one for automated and one for manual testing. Spinning up these environments should be effortless and repeatable. This leads directly to Puppet or similar technologies. A big advantage is that, for projects for which we also run the production environment, we can use the very same process to manage the production VMs.

In order to deploy Puppet in the master-less way, one needs to implement the manifest distribution oneself. Since Octopus Deploy, our favorite deployment engine, uses NuGet for packaging, we decided to use the same package format for distributing the manifests. But first, how do you know which manifests should go where? We devised a very simple schema that allows us to describe our machines like this:

<Machines>
	<Machine name="Web">
		<Roles>
			<Role name="Web"/>
			<Role name="App"/>
		</Roles>
	</Machine>
	<Machine name="Web2">
		<Roles>
			<Role name="Web"/>
			<Role name="App"/>
		</Roles>
	</Machine>
</Machines>

And their roles in terms of manifests:

<Roles>
	<Role name="Web">
		<Manifests>
			<Manifest file="Web.pp"/>
			<Manifest file="Common.pp"/>
		</Manifests>
		<Modules>
			<Module name="joshcooper-powershell"/>
		</Modules>
	</Role>
	<Role name="App">
		<Manifests>
			<Manifest file="App.pp"/>
			<Manifest file="Common.pp"/>
		</Manifests>
	</Role>
</Roles>

These files are part of the so-called infra repository. We have one such (git) repo for each Team Project. The infra repo also contains Puppet modules and manifests in a folder structure like this:


/
|- machines.xml
|- roles.xml
|- Modules
|  |- joshcooper-powershell
|  |  |- Modulefile
|  |  \- ...
|  \- puppetlabs-dism
|     \- ...
\- Manifests
   |- app.pp
   |- web.pp
   |- common.pp
   \- ...

On our lovely TeamCity build server we run a PowerShell script that creates one NuGet package for each module (using the Modulefile as the source of metadata) and one package for each machine. It uses the XML files to calculate which manifests should be included in each machine package. We also use the module information in the role definition file to define dependencies of the machine packages, so that when we do


nuget install INFN1069.Infra.Web.1.0.0

on the target machine, NuGet automatically fetches the modules the manifests depend on. I’ll leave the exercise of writing such a PowerShell script to the reader. Last but not least, we need another small script that will run periodically on each machine in the test lab. This script should download the packages and call


puppet apply [folder with manifests] --modulepath=[folder with modules]

to apply the latest manifests.
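
For illustration, here is a minimal sketch of such a periodic "pull and apply" script. The package name, feed URL and folder layout are my assumptions based on the structure above, not the exact script we run.

# Sketch of the periodic pull-and-apply script (assumed names and paths).
$packageId = "INFN1069.Infra.Web"          # machine package built on TeamCity
$feed      = "http://nuget.example.local"  # hypothetical internal NuGet feed
$workDir   = "C:\PuppetDrop"

# Fetch the latest machine package; NuGet pulls the module packages in as dependencies.
nuget install $packageId -Source $feed -OutputDirectory $workDir -ExcludeVersion

# Apply the manifests from the machine package, pointing Puppet at the folder
# where the module packages were unpacked (exact unpacked layout glossed over here).
$manifests = Join-Path $workDir "$packageId\Manifests"
puppet apply $manifests --modulepath=$workDir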


Deploy based on external description of the environment

This post is a part of the automated deployment story

One of the problems with many simple deployment solutions is that the structure of the target environment is embedded in the deployment scripts. This was also the case with the previous, batch-based version of my deployment process. With the advent of PowerShell, everything changed.

Commas

CSV is one of the simplest formats for representing tabular data. It’s easy to read, easy to edit (even in Notepad) and it behaves nicely when stored in a VCS. All these properties make it an ideal candidate for describing the deployment environment. Last but not least, PowerShell has very nice built-in features for handling CSV files: Import-Csv, Export-Csv, ConvertTo-Csv and ConvertFrom-Csv. The most interesting one is Import-Csv, as it loads data stored in a file into object form, with one object per row and one property per column. It makes loading the entire environment description file as easy as

$machines = Import-Csv environments.txt

The structure of the file is also quite simple. Because I deploy to the DEV environment quite frequently (10+ times a day) and it is done by an automated process, I’ve chosen to store credentials straight in the config file.

IP, Environment, Roles, User, Password
192.168.0.1, DEV, Web|App, devUser, p@ssw0rd
192.168.1.1, TEST, Web,,
192.168.1.2, TEST, App,,

Given these, you can imagine that getting the list of machines in a particular environment (you need it to know where to deploy) can be done in another one-liner

$target_machines = $machines | Where-Object { $_.Environment -eq $target_environment }

Roles

You have probably noticed a column called ‘Roles’ in the description file. The concept of a role is key to my deployment strategy. I borrowed the name from cloud computing lingo, but the meaning is slightly different: here a role is a named set of things that are deployed together and form a logical whole. The current list of roles in my application includes web, tools (which is an odd name for application server stuff), tests (this role is installed only on DEV) and db.

As you can see in the sample environment description, the DEV machine serves both roles, while in the test environment there’s a dedicated machine for each role. You can imagine that production would probably have 3 or 4 machines serving the web role and maybe 2 for the application role. Having a simple file that holds all this information really helps organize things.
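
As a small illustration (my own sketch, not taken from the original scripts), the pipe-separated Roles column makes it easy to find the machines that serve a given role in a given environment:

# Machines in the target environment that serve the Web role;
# Roles is pipe-separated, e.g. "Web|App", as in the sample file above.
$web_machines = $machines |
    Where-Object { $_.Environment -eq $target_environment } |
    Where-Object { ($_.Roles -split '\|') -contains 'Web' }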

Orchestration

Looking from 10,000 ft, the process of deploying one role to one machine is as follows (a rough PowerShell sketch follows the list):

  • stop the role
  • update the binaries for this role
  • start the role
  • (optional) verify health of the role
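
Here is that sketch; Stop-Role, Update-RoleBinaries, Start-Role and Test-RoleHealth are hypothetical helpers standing in for the real implementation:

# Deploy a single role to a single machine (hypothetical helper functions).
function Deploy-Role($Machine, $Role) {
    Stop-Role           -Machine $Machine -Role $Role   # stop the role
    Update-RoleBinaries -Machine $Machine -Role $Role   # push the new binaries
    Start-Role          -Machine $Machine -Role $Role   # start the role
    Test-RoleHealth     -Machine $Machine -Role $Role   # optional health check
}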

Given that we have multiple machines, each potentially serving multiple roles, the question arises: how do we orchestrate the whole process so that we minimize the downtime? The answer, as to all difficult questions, is: it depends. Because it depends, we want to abstract away the concrete orchestration and be able to pick the actual implementation based on the requirements at hand. In my project I’ve identified two distinct cases.

If I don’t have to change the structure of the transaction database, I can upgrade the machines one by one. To achieve zero downtime I need to fake the response to the balancer’s health enquiry, telling it that the machine is down before it actually is. This way clients will be served normally until the load balancer picks up the faked ‘down’ response and actually removes the machine from the cluster. Then I can safely upgrade this one machine. Last but not least, I have to wait some time before the load balancer figures out the machine is up again. Voilà! A super-simple zero-downtime deployment.
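
Sketched below with placeholders: Set-FakeDownResponse and Remove-FakeDownResponse stand in for whatever mechanism makes the health endpoint report ‘down’, and the sleep times depend on the balancer’s polling interval:

# Rolling, zero-downtime upgrade: one machine at a time (placeholder helpers).
foreach ($machine in $target_machines) {
    Set-FakeDownResponse -Machine $machine.IP      # health check starts reporting 'down'
    Start-Sleep -Seconds 30                        # let the balancer drain traffic
    Deploy-Role -Machine $machine.IP -Role 'Web'   # upgrade this one machine
    Remove-FakeDownResponse -Machine $machine.IP   # health check reports 'up' again
    Start-Sleep -Seconds 30                        # let the balancer re-add the machine
}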

In case I do need to change the structure (and assuming I can’t write code that works with both structures), I want to upgrade all machines in the environment as quickly as possible. This is where PowerShell v2 jobs come in handy. You can use Start-Job to deploy to each machine in a separate thread of execution. Because the bulk of the work is done not on the coordinating machine but on the target machine, this effectively cuts the deployment time down to the time it takes to deploy to one machine. Cool, isn’t it?
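
A minimal sketch of that parallel variant, assuming a hypothetical Deploy-Machine function (in a real script it would have to be defined inside, or imported into, each job):

# Fan out the deployment: one background job per target machine.
$jobs = $target_machines | ForEach-Object {
    Start-Job -ScriptBlock {
        param($ip, $roles)
        # placeholder: connect to $ip and deploy every role it serves
        Deploy-Machine -IP $ip -Roles ($roles -split '\|')
    } -ArgumentList $_.IP, $_.Roles
}

# Wait for every machine to finish and collect the output.
$jobs | Wait-Job | Receive-Job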
