Posts tagged build automation

Puppets

Today’s post will be about another proof-of-concept I’ve been doing recently — using Puppet to manage the test lab (and more). By the way, if you’re interested in working for me, here’s the job description.

What is Puppet?

Puppet is infrastructure management software that lets you control the configuration of multiple servers from one central place. The configuration is defined declaratively via so-called manifests. A manifest is a collection of resource definitions, and each resource describes the desired state of one thing, e.g. a file named X should exist and have a particular content, or service Y should be running.

Puppet consists of two components, an agent and a server (a.k.a. master). The agent needs to be installed on each managed machine, and its purpose is to apply the manifests sent by the master to the local machine. The agent software is free (Puppet Open Source) and can run on any OS. The master, on the other hand, is part of Puppet Enterprise and obviously is not free software.

Another interesting thing about Puppet is the Forge. It is a place where the community can exchange Puppet modules (packaged, reusable configuration elements).

Last but not least, there is the idea of master-less Puppet. In such a scenario there is no central server; agents get their manifests straight from some package repository or even have the manifests pushed to them (e.g. using Pulp).

Puppet for Windows

It’s probably no surprise that Puppet is focused on non-Microsoft operating systems, in particular Red Hat and Debian Linux distributions. Support for Windows is not as complete, but all the important parts work (e.g. file manipulation, service management, package installation). The only problem might be that the Puppet master is not available for Windows. That would pose a challenge for me (and our IT department) if we wanted to use it, but… this slide explains why we’ve chosen the master-less way. One more reason for going that route is that I’d like to keep my manifests in the source code repository. But I am getting ahead of myself.

Puppet in a test lab

Why do we even need Puppet to manage our test lab? We decided that for each project we run, we automatically create two virtual environments, one for automated and one for manual testing. Spinning up these environments should be effortless and repeatable. This leads directly to Puppet or similar technologies. A big advantage is that, for projects for which we also run the production environment, we can use the very same process to manage the production VMs.

In order to deploy Puppet in the master-less way, one needs to implement the manifest distribution oneself. Since Octopus Deploy, our favorite deployment engine, uses NuGet for packaging, we decided to use the same package format for distributing the manifests. But first, how do you know which manifests should go where? We devised a very simple schema that allows us to describe our machines like this:

<Machines>
	<Machine name="Web">
		<Roles>
			<Role name="Web"/>
			<Role name="App"/>
		</Roles>
	</Machine>
	<Machine name="Web2">
		<Roles>
			<Role name="Web"/>
			<Role name="App"/>
		</Roles>
	</Machine>
</Machines>

And their roles in terms of manifests and modules:

<Roles>
	<Role name="Web">
		<Manifests>
			<Manifest file="Web.pp"/>
			<Manifest file="Common.pp"/>
		</Manifests>
		<Modules>
			<Module name="joshcooper-powershell"/>
		</Modules>
	</Role>
	<Role name="App">
		<Manifests>
			<Manifest file="App.pp"/>
			<Manifest file="Common.pp"/>
		</Manifests>
	</Role>
</Roles>

These files are part of the so-called infra repository. We have one such (git) repo for each Team Project. The infra repo also contains the Puppet modules and manifests in a folder structure like this:


/
|- machines.xml
|- roles.xml
|- Modules
|  |- joshcooper-powershell
|  |  |- Modulefile
|  |  \- ...
|  \- puppetlabs-dism
|     \- ...
\- Manifests
   |- app.pp
   |- web.pp
   |- common.pp
   \- ...

On our lovely TeamCity build server we run a PowerShell script that creates one NuGet package for each module (using the Modulefile as the source of metadata) and one package for each machine. It uses the XML files to calculate which manifests should be included in each package. We also use the module information in the role definition file to define dependencies of the machine packages, so that when we run


nuget install INFN1069.Infra.Web -Version 1.0.0

on the target machine, NuGet automatically fetches the modules that the manifests depend on. I’ll leave writing such a PowerShell script as an exercise for the reader. Last but not least, we need another small script that runs periodically on each machine in the test lab. This script should download the packages and call


puppet apply [folder with manifests] --modulepath=[folder with modules]

to apply the latest manifests.
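
A minimal sketch of such a pull-and-apply script could look like the following. The feed URL, package naming convention and folder layout are assumptions made for illustration, not our actual values.

# Illustrative sketch: download this machine's infra package and apply its manifests.
# The package id, feed URL and folder layout are assumptions.
$workDir   = 'C:\PuppetDrop'
$packageId = "INFN1069.Infra.$env:COMPUTERNAME"

# -ExcludeVersion keeps folder names predictable; the module packages come along
# automatically as NuGet dependencies of the machine package.
& nuget.exe install $packageId -Source 'http://nuget.example.local/nuget' -OutputDirectory $workDir -ExcludeVersion

# Point Puppet at the downloaded manifests; assumes each module package unpacks
# into a folder named after the module.
$manifests = Join-Path $workDir "$packageId\Manifests"
& puppet apply $manifests --modulepath=$workDir

Scheduling something like this with the Windows Task Scheduler every few minutes gives a poor man's equivalent of the agent's regular run interval.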

Go as continuous delivery tool for .NET

Following my previous post regarding a possible design of a continuous delivery scheme for an ISV, I’d like to focus today on ThoughtWorks Go. This tool used to be quite expensive, but just a few days ago ThoughtWorks made it completely free and open source (under the Apache 2.0 license). Because of this dramatic price drop I thought I would give Go a second chance and try to replicate the same stuff I did with TeamCity. Let me share my insights after spending a few days with Go.

Name

The name is probably Go’s biggest problem. It is absolutely impossible to google for any information regarding it. Try ‘NUnit Go’ for example. Really, these days when choosing a name for a product one should think about its googleability.

Installation

As we’re a .NET shop, I installed Go on my Windows machine. It was quick and easy. Good job here. Same for installing the agents.

Documentation

Go’s docs are very clean and nice, but I have the impression that there’s more chrome than content in them, if you know what I mean. Take NUnit integration, for example. The only thing I found was the information that Go ‘supports NUnit out of the box’. It turned out that by ‘support’ they mean it can process NUnit’s TestResult.xml file and display an ugly (yes, I mean very ugly) test summary on the release candidate details page. In order to generate this file I need to run NUnit on my own using the task ‘framework’ (more on that later). Of course, I need to install the NUnit runner on the agent first.
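
In practice, the unit-test job boils down to a task that invokes the NUnit console runner yourself and lets Go pick up the TestResult.xml it writes. A rough example of such a command (the runner path and assembly name are illustrative, not taken from the Go docs):

# Hypothetical task command for the unit-test job: run the NUnit 2.x console runner
# yourself and let Go parse the TestResult.xml it produces. Paths are illustrative.
& 'C:\Tools\NUnit-2.6.3\bin\nunit-console.exe' 'MyApp.Tests\bin\Release\MyApp.Tests.dll' /xml:TestResult.xml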

By the way, there are quite a lot of video how-tos, but personally I don’t think that’s what devs are looking for. On the good side, the HTTP API is very well documented.

Last but not least, I have a feeling that Go’s docs lack transparency a bit, especially compared to Octopus. I mean that things like what the protocol between the server and the agents is, and why it is secure, should be better explained so that I as an ISV can use them to convince my clients to use Go.

Pipelines

Go has a concept of a pipeline, which lets you define complex build and deployment workflows. Each pipeline has one or more stages executed sequentially, either automatically or with manual approval. Each stage consists of multiple jobs which can be executed in parallel on multiple agents. Finally, each job is a sequence of tasks.

To add even more possibilities, pipelines can be chained together so that completion of one pipeline kicks off another one. Pretty neat. I really like it. The sequential-parallel-sequential design is clean and easy to understand, expressive enough to implement complex processes, and constrained enough not to let these processes become a pile of ugly spaghetti.

Agents

Go’s agents are universal. They can execute any shell command for you and pass the results back to the server. They have no built-in intelligence like TeamCity’s (build-specific) or Octopus’s (deployment-specific) agents, and they can be used for both building and deploying. Plus they are free. Good job.

Tasks

Tasks are, in my opinion, the second biggest failure in Go (just after the name). A task can be either an Ant or NAnt script or… any shell command you can imagine. While I appreciate the breadth of possibilities that comes from being able to execute just about anything, I really don’t like the fact that I have to do everything myself.

Do you, like me, enjoy TeamCity’s MSBuild configuration UI? Or its assembly version patching feature? Or maybe its visual NUnit runner configurator? Nothing like this here. To be fair, there is a concept of a command repository which allows you to import frequently used command examples, but it really isn’t comparable to TeamCity.

What surprised me is that there seems to be no plug-in system for tasks, and for sure no lively plug-in ecosystem. I would expect that if ThoughtWorks made a decision to focus on workflow and agents (which are really good), they would publish and document an API that would allow people to easily write custom task types as plugins. For example, if I installed an NUnit plugin into my Go server, I would expect the NUnit runner to be deployed automatically to my agents.

Summary

I managed to build a simple pipeline that builds my source code, packages it into NuGet packages (using OctoPack) and runs the unit tests. It is certainly doable, but it’s way more work than with TeamCity. Because I don’t like the role of a release manager who owns the build and deployment infrastructure, and prefer teams to own their own stuff, I made the decision to drop Go and focus on TeamCity. It is much friendlier, and I don’t want to scare people when I am helping them set up their builds. If ThoughtWorks or the community that will probably form around Go gives some love to defining tasks, I will consider switching to Go in the future. Go is definitely worth observing but, in my opinion, for a .NET shop it is not yet worth adopting.

To be fair, TeamCity is not a perfect tool either. To be able to use it we have to overcome two major problems:

  • No support for defining deployment pipelines (everything is a build type). Bare TeamCity lacks higher-level concepts.
  • While TeamCity’s base price is reasonable, the per-agent price is insane if one wants to use agents to execute long-running tests (e.g. acceptance tests).

More on dealing with these problems in upcoming posts.

TFS 2010 and multiple projects output

This post is a part of the automated deployment story

Some time ago we were forced to switch from TFS 2008 to TFS 2010. I must emphasise here that choosing TFS as the source code repository and CI software was not my choice in the first place; we were forced to use it. Anyway, we wanted to move because the new system was in the same physical network as the whole environment, which would make transporting binary packages much easier and safer.

Apart from the obvious changes, like replacing MSBuild with workflow-based build definitions, there is one thing that is more subtle but has such a tremendous impact that it nearly blocked our adoption of 2010.

It turns out that 2010 overrides the bin directory when building projects, so that the output of all compilations goes to one folder. By doing so it saves a lot of the effort of copying copy-local binaries here and there. There is a downside, however. Having one output directory makes it really hard to build more than one application in the solution: all the binaries are mixed together and you can’t figure out (from the output alone) which ones belong to which application (and which are shared).

The problem is less dramatic with web applications because they have a publish feature out of the box. What publish does is gather all the files related to the app and put them into a zip file. It also gathers all the directly and indirectly referenced binaries, which is cool.

What about console applications then? In the obj directory you can find only the result of compiling the application itself. The libraries it depends on are not there. How can we find them?

We can use the very same file that is used to publish web applications. It is called Microsoft.WebApplication.targets and it is located in the MSBuild folder under Program Files. All you need to do is strip out everything that does not apply to console apps. Here’s what remains of it:

<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <UsingTask TaskName="Microsoft.WebApplication.Build.Tasks.CopyFilesToFolders"  AssemblyFile="Microsoft.WebApplication.Build.Tasks.dll" />

  <PropertyGroup>
    <BuildDependsOn>
      $(BuildDependsOn);
      PackageBinaries;
    </BuildDependsOn>
  </PropertyGroup>

  <Target Name="PackageBinaries" DependsOnTargets="ResolveReferences">
    <!-- Log -->
    <Message Text="Generating binary package for $(MSBuildProjectName)" />

    <!-- copy any referenced assemblies -->
    <Copy SourceFiles="@(ReferenceCopyLocalPaths)"
          DestinationFiles="@(ReferenceCopyLocalPaths->'$(IntermediateOutputPath)\%(DestinationSubDirectory)%(Filename)%(Extension)')"
          SkipUnchangedFiles="true"
          Retries="$(CopyRetryCount)"
          RetryDelayMilliseconds="$(CopyRetryDelayMilliseconds)"/>

    <!-- Copy content files -->
    <Copy SourceFiles="@(Content)" Condition="'%(Content.Link)' == ''"
          DestinationFolder="$(IntermediateOutputPath)\%(Content.RelativeDir)"
          SkipUnchangedFiles="true"
          Retries="$(CopyRetryCount)"
          RetryDelayMilliseconds="$(CopyRetryDelayMilliseconds)" />

  </Target>
</Project>

It automatically hooks into the compile process, calculates the list of transitive dependencies of your project and fetches their binaries from the common output location into the project’s obj folder.

Now all you need to do is import this file in your .csproj files just below this line:

<Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" />

Then, after the build is finished, the obj folders of your projects are populated with all necessary files.

Continuous Delivery presentation

Everyone interested in Continuous Delivery is invited to my presentation at the 73rd KGD.NET meeting. Those who cannot make it to Kraków can, courtesy of the VirtualStudy.pl portal, watch the session live online. The meeting page on VirtualStudy can be found here.

The session is an introduction to Continuous Delivery. In a few words, CD is an approach to software development that emphasises treating every commit as potentially deployable to the production environment. Every commit is built, subjected to various kinds of tests (unit, integration, acceptance, etc.) and ultimately deployed to production.

Continuous delivery puts enormous emphasis on automating all the activities related not only to building and installing the application, but also to preparing the appropriate environment.

During the session I will present a concrete implementation of the continuous delivery idea that I built together with my team as part of our latest project. It is a low-cost implementation (zero paid tools) built mainly on PowerShell.

I also invite you to the other session of the 73rd meeting (chronologically, actually the first one): Mythbusters: can usability specialists and developers work together in harmony?, by Marcin Czyżowski.

Automated deployment security concerns

This post is a part of the automated deployment story

One of the blockers to widespread adoption of automated deployment tools is the fear that they pose a greater security risk than humans executing a written deployment procedure. The greatest fear is that an attacker would be able to push a malicious version of the binaries to the production environment.

If you want to succeed in implementing an automated deployment solution, you must ensure at least the same level of security as when humans copy the binaries manually.

Who can deploy?

When there is no automation, the deployment is performed by designated humans. For the testing environment it can be the QA manager; for production it is usually the operations team. They (and only they) are in possession of the credentials required to log on to the target environment machines, copy the binaries and perform other deployment-related tasks.

People who have just begun automating deployments tend to think that, since everything can be automated, the deployment process can be started straight from the build agent without any human action. This was also true in my case. Fortunately some wiser people reminded me how important it is to be able to name the person (not a machine) who was responsible for a particular deployment.

The rule of thumb is: you can (and should) automate everything but two things. The first is starting the whole procedure. There should be a person who has to hit the ENTER key in order to start the deployment. That person will be the one to blame if something goes wrong (for example, upgrading the production site during peak usage hours). The second thing not to be automated is authentication. You should never store production (or even testing) environment credentials for the script to use automatically. Whenever credentials are required, the script should ask the user to provide them. Sometimes it is possible to keep them temporarily in memory so that the user does not get prompted for the same credentials twice.
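
A minimal PowerShell sketch of this pattern (the variable, computer and parameter names are illustrative, not taken from our actual scripts): prompt once, keep the credential object in memory for the rest of the run, and never write it to disk.

# Prompt the operator once; the PSCredential keeps the password as a SecureString in memory.
if (-not $script:deployCredential) {
    $script:deployCredential = Get-Credential -Message 'Credentials for the target environment'
}

# Reuse the cached credential for subsequent remote calls instead of prompting again.
Invoke-Command -ComputerName 'web01' -Credential $script:deployCredential -ScriptBlock { hostname }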

What can be deployed?

Usually, the more layers of security, the better. Should one get compromised, the others can prevent disaster. Besides controlling who can deploy, it is also worth controlling what can be deployed. The easiest thing (and this is what we have actually implemented) is digital signature-based verification. A binary package is signed using a certificate stored on the build agent. The corresponding public key, along with a gateway script, is securely installed on all machines of the environment.

The only remotely accessible endpoint exposed by the target environment machines is the aforementioned gateway script. This ensures that an attacker can’t bypass the security measures put in place. One can only trigger a deployment on the machine via this script, and the script verifies the deployment package’s signature against the stored public key of the build agent.

Tools

Unfortunately, there is no command-line tool for signing and verifying signatures built into Windows. That’s why PackMan was born. I mentioned this tool before in the context of building a package. When building a package, PackMan also calculates a hash of the data and encrypts it using the private key from the provided certificate. It can be done with a command like this (MSBuild variable syntax):

PackMan.exe -i d:$(PackageDir) -o $(OutDir)\Package-$(BuildVersion).zip --cn DeployerCert -a create

When asked to unpack, PackMan first verifies the signature using the provided public key (PowerShell variable syntax):

PackMan.exe -p ${package_store}\${package_file} --vcn DeployerCert -a unpack -d ${tmp_package_dir}

This ensures that the package is authentic and not corrupted.
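
For reference, the verification step boils down to something like the following plain PowerShell. This is an illustrative sketch, not PackMan’s actual implementation; the certificate subject, store location, file paths and detached-signature layout are all assumptions.

# Illustrative only: verify a detached RSA signature of a package using the public key
# of a certificate installed in the local machine store.
$cert = Get-ChildItem Cert:\LocalMachine\My | Where-Object { $_.Subject -eq 'CN=DeployerCert' }

$data      = [System.IO.File]::ReadAllBytes('C:\Packages\Package-1.0.0.zip')
$signature = [System.IO.File]::ReadAllBytes('C:\Packages\Package-1.0.0.zip.sig')

# Hash the package data and check the signature with the certificate's public RSA key.
$rsa = $cert.PublicKey.Key
if ($rsa.VerifyData($data, 'SHA256', $signature)) {
    Write-Host 'Signature OK, unpacking package'
} else {
    throw 'Signature verification failed, refusing to deploy'
}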
