Posts tagged continuous delivery

Continuous Delivery presentation

I invite everyone interested in Continuous Delivery to my presentation at the 73rd KGD.NET meeting. Those who cannot make it to Kraków can, courtesy of the VirtualStudy.pl portal, watch the session live online. The meeting page on VirtualStudy.pl can be found here.

The session is an introduction to Continuous Delivery. In a few words, CD is an approach to software development that emphasizes treating every commit as potentially deployable to the production environment. Every commit is built, subjected to multiple kinds of tests (unit, integration, acceptance, etc.) and ultimately deployed to production.

Continuous delivery puts a huge emphasis on automating all the activities related not only to building and installing the application, but also to preparing the appropriate environment.

During the session I will present a concrete implementation of the continuous delivery idea that I built together with my team as part of my latest project. It is a low-cost implementation (zero paid tools) based mainly on PowerShell.

I also invite you to the other session of the 73rd meeting (actually the first one in chronological order): Marcin Czyżowski's "Mythbusters: Can usability specialists and developers work together in harmony?"


Automated deployment security concerns

This post is a part of the automated deployment story.

One of the blockers to widespread adoption of automated deployment tools is the fear that they pose a greater security risk than humans executing a written deployment procedure. The greatest of these fears is that an attacker would be able to push a malicious version of the binaries to the production environment.

If you want to succeed in implementing an automated deployment solution, you must ensure at least the same level of security as when humans copy the binaries manually.

Who can deploy?

When there is no automation, the deployment is performed by designated humans. For the testing environment it can be the QA manager; for production it is usually the operations team. They (and only they) are in possession of the credentials required to log on to the target environment machines, copy the binaries and perform other deployment-related tasks.

People who have just begun automating their deployments tend to think that, since everything can be automated, the deployment process can be started straight from the build agent without any human action. This was also true in my case. Fortunately, some wiser people reminded me how important it is to be able to name the person (not a machine) responsible for a particular deployment.

The rule of thumb is: you can (and should) automate everything but two things. The first is starting the whole procedure. There should be a person who has to hit the ENTER key to start the deployment. That person will be the one to blame if something goes wrong (for example, upgrading the production site during peak usage hours). The second thing not to automate is authentication. You should never store production (or even testing) environment credentials for automatic use by the script. Whenever credentials are required, the script should ask the user to provide them. Sometimes it is possible to keep them temporarily in memory so that the user does not get prompted for the same credentials twice.
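A minimal PowerShell sketch of that idea, assuming PowerShell remoting is used to reach the target machine; the variable names are illustrative, not taken from my actual scripts:

# Prompt once and keep the credential object in memory for the rest of the run,
# instead of persisting it to disk or a config file.
if (-not $deploy_credential) {
    $deploy_credential = Get-Credential    # secure prompt; the password stays a SecureString
}

# Reuse the cached credential for every remote call in this deployment run.
Invoke-Command -ComputerName $target_machine -Credential $deploy_credential -ScriptBlock {
    hostname    # placeholder for an actual deployment task
}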

What can be deployed?

Usually, the more layers of security, the better: should one get compromised, another can prevent disaster. Besides controlling who can deploy, it is also worth controlling what can be deployed. The easiest approach (and the one we have actually implemented) is digital signature-based verification. A binary package is signed using a certificate stored on the build agent. The corresponding public key, along with a gateway script, is securely installed on all machines of the environment.

The only remotely accessible endpoint exposed by the target environment machines is the aforementioned gateway script. This ensures that an attacker can't bypass the security measures put in place. One can only trigger a deployment on the machine via this script, and the script verifies the deployment package signature against the stored public key of the build agent.

Tools

Unfortunately, Windows has no built-in command-line tool to sign a package and verify its signature. That's why PackMan was born. I mentioned this tool before in the context of building a package. When building a package, PackMan also calculates a hash of the data and encrypts it using the private key from the provided certificate. It can be done with a command like this (MSBuild variable syntax):

PackMan.exe -i d:$(PackageDir) -o $(OutDir)\Package-$(BuildVersion).zip --cn DeployerCert -a create

When asked to unpack, PackMan first verifies the signature using the provided public key (PowerShell variable syntax):

PackMan.exe -p ${package_store}\${package_file} --vcn DeployerCert -a unpack -d ${tmp_package_dir}

This ensures that the package is authentic and not corrupted.
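To show how the pieces fit together, here is a rough sketch of what such a gateway script on a target machine could look like; the paths, the deploy.ps1 hand-off and the exit-code check are my assumptions, not the actual script:

param([string]$package_file)

$package_store   = 'D:\Packages'       # assumed drop location for incoming packages
$tmp_package_dir = 'D:\PackageTmp'     # assumed scratch directory

# PackMan only unpacks the package if its signature matches the stored public key.
.\PackMan.exe -p "$package_store\$package_file" --vcn DeployerCert -a unpack -d $tmp_package_dir
if ($LASTEXITCODE -ne 0) {
    throw "Package signature verification failed - aborting deployment."
}

# Only a verified package gets its deployment script executed.
& "$tmp_package_dir\deploy.ps1"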


Deploy based on external description of the environment

This post is a part of the automated deployment story.

One of the problems with many simple deployment solutions is that the structure of the target environment is embedded in the deployment scripts. This was also the case with the previous, batch-based version of my deployment process. With the advent of PowerShell, everything changed.

Commas

CSV is one of the simplest formats for representing tabular data. It's easy to read, easy to edit (also in Notepad) and it behaves nicely when stored in a VCS. All these properties make it an ideal candidate for describing the deployment environment. Last but not least, PowerShell has very nice built-in features for handling CSV files: Import-Csv, Export-Csv, ConvertTo-Csv and ConvertFrom-Csv. The most interesting one is Import-Csv, as it loads data stored in a file into object form, with one object per row and one property per column. It makes loading the entire environment description file as easy as

$machines = Import-Csv environments.txt

The structure of the file is also quite simple. Because I deploy to the DEV environment quite frequently (10+ times a day) and it is done by an automated process, I've chosen to store its credentials straight in the config file.

IP, Environment, Roles, User, Password
192.168.0.1, DEV, Web|App, devUser, p@ssw0rd
192.168.1.1, TEST, Web,,
192.168.1.2, TEST, App,,

Given this, you can imagine that getting the list of machines in a particular environment (you need it to know where to deploy) is another one-liner

$target_machines = $machines | Where-Object { $_.Environment -eq $target_environment }

Roles

You have probably noticed a column called 'Roles' in the description file. The concept of a role is key to my deployment strategy. I borrowed the name from cloud computing lingo, but the meaning is slightly different: my role is a named set of things that are deployed together and form a logical whole. The current list of roles in my application includes web, tools (which is an odd name for the application server stuff), tests (this role is installed only on DEV) and db.

As you can see in the sample environment description, the DEV machine serves both roles, while in the test environment there is a dedicated machine for each role. You can imagine that production would probably have three or four machines serving the web role and maybe two for the application role. Having a simple file with all this information really helps organize things.
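As a small illustration (assuming the CSV layout shown above, with roles separated by a pipe character), selecting the machines that serve a given role in a given environment is still a near one-liner:

$machines = Import-Csv environments.txt

# All machines in the target environment that serve the Web role.
$web_machines = $machines | Where-Object { $_.Environment -eq $target_environment -and ($_.Roles -split '\|') -contains 'Web' }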

Orchestration

Looking from 10,000 ft, the process of deploying one role to one machine is as follows:

  • stop the role
  • update the binaries for this role
  • start the role
  • (optional) verify health of the role

Given that we have multiple machines, each potentially serving multiple roles, the question is how to orchestrate the whole process so that downtime is minimized. The answer, as to all difficult questions, is: it depends. And because it depends, we want to abstract away the concrete orchestration and be able to pick the actual implementation based on the requirements at hand. In my project I've identified two distinct cases.

If I don't have to change the structure of the transaction database, I can upgrade the machines one by one. To achieve zero downtime I need to fake the response to the load balancer's health enquiry, telling it that the machine is down before it actually is. This way clients are served normally until the load balancer picks up the faked 'down' response and removes the machine from the cluster. Then I can safely upgrade this one machine. Last but not least, I have to wait some time before the load balancer figures out the machine is up again. Voilà! A super-simple zero-downtime deployment.
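A rough PowerShell sketch of that sequence; Set-FakeHealthResponse and Deploy-Role are hypothetical helpers standing in for the real scripts, and the wait times are arbitrary:

foreach ($machine in $target_machines) {
    # Tell the load balancer the machine is 'down' before it actually is...
    Set-FakeHealthResponse -ComputerName $machine.IP -Status 'Down'
    Start-Sleep -Seconds 60      # ...and wait until it stops routing traffic there

    # Upgrade every role this machine serves.
    Deploy-Role -ComputerName $machine.IP -Roles ($machine.Roles -split '\|')

    # Restore the real health response and give the balancer time to re-add the machine.
    Set-FakeHealthResponse -ComputerName $machine.IP -Status 'Up'
    Start-Sleep -Seconds 60
}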

In case I do need to change the database structure (and assuming I can't write code that works with both structures), I want to upgrade all machines in the environment as quickly as possible. This is where PowerShell v2 jobs come in handy. You can use Start-Job to deploy to each machine in a separate thread of execution. Because the bulk of the work is done not on the coordinating machine but on the target machine, this literally cuts the deployment time down to the time it takes to deploy to a single machine. Cool, isn't it?
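A minimal sketch of this fan-out with PowerShell v2 jobs could look like the following; Deploy-Role is the same hypothetical helper as before, and since each job runs in its own process, the real script would have to load it inside the script block:

$jobs = foreach ($machine in $target_machines) {
    Start-Job -ScriptBlock {
        param($ip, $roles)
        # . .\deploy-functions.ps1   # hypothetical: dot-source Deploy-Role inside the job process
        Deploy-Role -ComputerName $ip -Roles $roles
    } -ArgumentList $machine.IP, ($machine.Roles -split '\|')
}

# Wait for all machines to finish and surface their output and errors.
$jobs | Wait-Job | Receive-Job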


The release candidate repository

This post is a part of the automated deployment story.

I am sorry you had to wait so long for a new episode of the story. The truth is, this episode was written a week ago and saved as a draft. Just before publishing I got an e-mail from the IT department saying that the solution I proposed (using a SharePoint site) is not acceptable from a security point of view. They are not going to allow any access to the site from automated process accounts. Period. So the blog post went to the trash, but not the idea. Here it is, reborn, better than ever!

Release candidate repository

I've taken the idea of a release candidate repository from the Continuous Delivery book. The idea is that 'release candidate' is not a term describing a binary package generated just before a release, waiting for final tests and the go-live flag. Instead, a release candidate is every binary that passes unit tests (or commit stage tests, as the book's authors tend to call them). The candidate then goes through the various stages of the deployment pipeline, passing (or failing) integration tests, automated user acceptance tests, load tests, usability tests, etc. During this journey, a candidate is deployed to various environments. Finally, a successful candidate ends up being deployed to production. As you probably expect, I was in desperate need of a tool that could track my RCs from the very beginning to the end of their lifetime. Buying an expensive, have-it-all tool was, of course, not an option.

Release Candidate Tracker

It took me one afternoon to hack together a quick and dirty solution. As always, you can download it from GitHub. Please don't complain about the code quality. I know it's not the cleanest code on the planet, but it works (sort of). RCT uses an embedded RavenDB database to store its information. The database files are kept in the App_Data folder. If I understand Raven's license correctly, you can use RCT anywhere you want, also in a commercial environment.

Workflow

Here's the workflow describing the typical usage scenario of RCT. First, the build script (MSBuild in my case) creates a release candidate after successfully running the unit (commit stage) tests. To achieve this, it uses the curl tool to call the RCT API.

$id = .\curl -s --data "State=${state}&VersionNumber=${version}&ProductName=${product}" "${server}/ReleaseCandidate/Create"
if ($id -ne $null) {
    .\curl -s --upload-file $script_file "${server}/ReleaseCandidate/AttachScript/${id}"
}

The first call creates an RC entity, while the second one attaches the generated deployment bootstrap script. At this point, the candidate is in the initial UnitTestsPassed state, and you can see it on the release candidate list.

The subsequent stages exercise the candidate using various test suites: integration, user acceptance and so on. However, before the candidate can be tested, it has to be deployed to a suitable test environment. Here comes another feature of RCT: the deployment bootstrap script. You can download it simply by clicking the 'Deploy' link on the candidate list. If you want, you can even associate the .ps1 extension with PowerShell so that it is executed automatically. The script (which will be covered in detail in later episodes) starts the deployment process. Eventually, the deployment script completes the deployment and updates the state of the candidate. There's another API call and another script that does this

.\curl -s --data "Environment=${environment}&VersionNumber=${version}" "${server}/ReleaseCandidate/MarkAsDeployed"

After executing the tests, the state of the candidate needs to be updated to reflect the result. Here’s a script for this task

.\curl -s --data "State=${state}&VersionNumber=${version}" "${server}/ReleaseCandidate/UpdateState"
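If the same call appears in several build scripts, a tiny wrapper keeps them DRY; this helper and the 'AcceptanceTestsPassed' state name are just an illustration, not part of RCT:

# Hypothetical convenience wrapper around the state-update API call.
function Update-CandidateState {
    param($server, $version, $state)
    .\curl -s --data "State=${state}&VersionNumber=${version}" "${server}/ReleaseCandidate/UpdateState"
}

# Example: record that the acceptance test suite passed for this version.
Update-CandidateState -server $server -version $version -state 'AcceptanceTestsPassed'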

Eventually, if the candidate manages to pass all the tests, it can potentially be deployed to production. Of course, not every successful candidate ends up in production. Each change of the candidate's state is reflected on its detail view page.

Any contributions to RCT are, of course, more than welcome. Happy deploying!


Building the deployment package

This post is a part of the automated deployment story.

The first custom step of my build process is building the deployment package. With two or more applications in the solution, it is impractical to just copy over a bunch of DLLs. A much better idea is to package them somehow, so they can easily be handled as a whole. There are a number of packaging technologies available for .NET applications, including:

None of them is a complete and perfect solution. The first two are biased towards packaging and managing reusable libraries, so they put emphasis on versioning and on enabling the use of packages in the IDE and in the build process. Web Deploy is focused on, well, web deployment, and it lacks good support for any other application type.

Web Deploy

In my solution I've chosen Web Deploy for a few reasons. First, it was already there, since it was the foundation of the old deployment scripts. Second, it has a very nice feature: so-called publishing of a web application. You may know this feature from Visual Studio, where it is available from the context menu on web projects. Basically, it strips out all the files that are not necessary to run the application (like .cs and .csproj files).

There are two central concepts in Web Deploy: operations and providers. An operation describes what to do. There are (only!) three possible operations:

  • dump – displays information about an object
  • delete – deletes an object
  • sync – synchronizes two objects

Web Deploy comes with a great variety of providers. Providers implement specific deployment steps, like changing the machine.config file or putting an assembly into the GAC. The most important one for me is the package provider. It allows packing a specified IIS web site into a zip file and cloning it somewhere else. What you can't read on the documentation page (but can find on the Internet, of course) is that you can build a similar package via MSBuild. To do so, just call

MSBuild "ProjectName.csproj" /T:Package

after your main build process completes. The output is a nice zip file containing all the information necessary to deploy the web application.
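For completeness, such a package can later be pushed to a server with the sync operation and the package provider; the path, server name and credentials below are placeholders, not my actual setup:

msdeploy.exe -verb:sync -source:package="D:\Build\ProjectName.zip" -dest:auto,computerName="https://webserver:8172/msdeploy.axd",userName=deployUser,password=secret,authType=Basic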

DRYing the build script

For each application I deploy, I need to call some packaging routine in the build script. This can quickly become copy-pasted spaghetti. Fortunately, I found out that MSBuild supports iteration. Here's how you define the collection of elements

<ItemGroup Label="WebPackages">
  <WebPackages Include="ProjectA\ProjectA.csproj" />
  <WebPackages Include="ProjectB\ProjectB.csproj" />
  <WebPackages Include="ProjectC\ProjectC.csproj" />
</ItemGroup>

And here is how you call a specific routine on all of them

<Target Name="web-package-build" Inputs="@(WebPackages)" Outputs="%(WebPackages.Identity)">
  <Exec Command="$(MSBuild4) &quot;$(SolutionRoot)\%(WebPackages.Identity)&quot; /p:Configuration=$(BuildConfig) /p:DeployOnBuild=true /p:DeployTarget=Package /tv:4.0" />
</Target>

Nice, isn’t it?

One to rule them all

Bare Web Deploy would probably be enough for three or four web applications. As you remember, in my case there are about a dozen of them, plus another dozen console tools. That's why I decided I need another layer. Later on it turned out to be a good idea for another reason as well: security. During the audit process one of the auditors pointed out that we don't have any mechanism to guarantee that the binaries deployed to production are not forged.

I thought it would be a trivial task to implement: just use a command-line tool to zip all the packages (and the console applications, each in its own folder) and another command-line tool to sign the zip file digitally. Unfortunately, it turned out that there are no built-in tools on the Windows platform for these simple tasks. I didn't like the idea of installing applications like 7-Zip or similar on the production servers, so I decided to create a custom tool for the task. This is exactly how PackMan was born.

PackMan is a quick and dirty tool for packaging a bunch of files into a digitally signed zip file. Of course, it also allows you to verify the signature and unpack the contents. PackMan is stream-based, so it has no problem supporting quite large packages (mine is about 120 MB). It uses Windows PKI to retrieve the keys for signing and verification.
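As an aside, here is a PowerShell sketch of how a script (or a tool like PackMan) can pick up such a certificate from the Windows certificate store by its common name; the store location is an assumption:

# Look up the DeployerCert certificate in the local machine's personal store.
$cert = Get-ChildItem Cert:\LocalMachine\My | Where-Object { $_.Subject -like '*CN=DeployerCert*' } | Select-Object -First 1

if ($cert -eq $null) {
    throw 'DeployerCert not found in the certificate store.'
}

# On the build agent, $cert.PrivateKey is used for signing;
# target machines only need the public-key part installed.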
