Deployment via GitLab CI/CD: Part 1, Your Tech Stack



Creating pipelines to deploy your app onto a host (or hosts) is... well... erm... a complete pain in the rear end! I’m a developer, not a DevOps specialist. I generally work on my own, so deploying a web app to a server is pushed onto me as part of my job.

The thing is, these apps are not that complex. Generally a web server, the app and a database. I like to use RabbitMQ to deal with backend processing, so the RabbitMQ server and the daemons that run those background jobs are usually included as well. This is NOT a complex set-up, so why are the CI/CD pipelines for deploying it such a pain?

I’m hoping that writing this set of articles will result in feedback, positive and negative. I’m sure I’m not doing things the right way, if there is a right way. It’s definitely not an easy way; I’ve looked and not found that easy way yet.

The Technical Stack: GitLab

A stack has its own definition when it comes to DevOps, but in this case I am referring to the technology I am using, or trying to use, for this project. So the tech stack here means the systems and apps I’m using.

The first item I’m using is GitLab. Why GitLab and not GitHub? Well, for a start, GitLab is not owned by Microsoft. (Yeah, bet you didn’t know that about GitHub. Kinda puts a different perspective on the whole open-source thing.)

The second is that GitLab works in a very different way. It is built around the idea of offering features to manage a whole project; the Git repository is just one of those features. This makes GitLab a little more feature-rich when it comes to managing your project beyond the repository itself, and that includes the CI/CD side of things. GitHub has since added most of the features available in GitLab, but GitLab’s layout and usage are just easier.

GitLab Runners

All CI/CD applications work in basically the same way. They wait for a trigger to occur, such as the updating of a branch in the repository, then they run a script on a worker machine. In GitLab this worker is known as a GitLab Runner. So to be clear, your CI/CD script never runs on the main server hosting the CI/CD application, but on another machine known as a worker or runner.
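That trigger-then-run model boils down to a `.gitlab-ci.yml` file in the root of your repository. A minimal sketch, where the job name and the `build.sh` script are placeholders of my own choosing:

```yaml
# .gitlab-ci.yml - one job, triggered by pushes to the repository.
# The script lines run on whichever runner picks the job up.
build-job:
  stage: build
  script:
    - echo "This runs on a runner, not on the GitLab server"
    - ./build.sh   # hypothetical build script in your repo
```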

GitLab provides a large pool of pre-defined, ready-to-use shared runners. They can handle lots of simple tasks, but they are shared with lots of different people and are not customised in any way to the software you’re using. For example, if you want to zip instead of tar a file, you need to install the zip package as part of your script, or the job will fail. There are some runners in the pool with pre-installed apps such as Docker, Node.js and so on, so you can build a CI/CD script around those for some tasks.
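One way around the missing-package problem on a shared runner is to install what you need at the start of the job. A sketch, assuming a Debian/Ubuntu-based job image (the image tag and file names are examples, not a recommendation):

```yaml
# Install zip ourselves before the script needs it; without the
# before_script step the `zip` call below would fail on a bare image.
package-job:
  image: ubuntu:22.04
  before_script:
    - apt-get update && apt-get install -y zip
  script:
    - zip -r release.zip dist/
```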

However, GitLab offers a way to have your own private runners. You need an instance or two available, install the gitlab-runner software on them, then register the runners with your project. Best to remember here that you need a unique token from the GitLab site, linked to a given project or group, and you run a command on the runner with this token to register it. There is no automatic set-up for runners.

Runners can be registered with a number of executors, including one that starts a Docker container for each job. I found that a little complex, so I stuck with the simplest form: the “shell” executor.
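The registration itself is one interactive command on the runner machine. A sketch (the exact prompts and flag names vary between gitlab-runner versions, so check `gitlab-runner register --help` for yours):

```shell
# On the runner machine, after installing the gitlab-runner package:
sudo gitlab-runner register
# You are prompted for:
#  - the GitLab URL (e.g. https://gitlab.com/)
#  - the registration token from your project or group
#    (Settings > CI/CD > Runners in the GitLab UI)
#  - a description and tags
#  - the executor; answer "shell" for a shell runner

# Confirm the runner is registered and talking to GitLab:
sudo gitlab-runner list
```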

With your own runners in place, you can preinstall the packages your pipeline needs, such as Node, Docker, Ansible, Fabric, Python and so on. The runners belong to you, so the packages installed are down to you.

For me, I have a dedicated Ubuntu server, so I just use LXC to create two runner containers. On a dedicated Ubuntu server, LXC containers are dead easy.
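For reference, creating a pair of runner containers with LXD’s `lxc` tool looks roughly like this (container names and image tag are just examples):

```shell
# Create two Ubuntu containers to act as runners
lxc launch ubuntu:22.04 runner1
lxc launch ubuntu:22.04 runner2

# Get a shell inside one to install and register gitlab-runner
lxc exec runner1 -- bash
```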

GitLab CI/CD Variables

When you run your CI/CD scripts, you need a lot of variables that are unique to the run, and unique to the type of deployment you are running. By this I mean dev, staging, prod and so on.

For each run, GitLab automatically supplies a series of variables, all starting with CI_. These include items like the branch name, the SHA of the commit being deployed, temporary users and tokens for accessing GitLab services, and others.
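A quick way to see a few of these in action is a job that just echoes them; the variable names below are real predefined ones, the job name is mine:

```yaml
show-vars:
  script:
    - echo "Branch:  $CI_COMMIT_BRANCH"
    - echo "Commit:  $CI_COMMIT_SHA"
    - echo "Project: $CI_PROJECT_NAME"
    # CI_JOB_TOKEN also exists: a short-lived token the job can use
    # to authenticate against GitLab's own APIs and registries.
```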

GitLab also has a feature that holds your own variables needed for deployments: things like API tokens, database names, users, passwords, system settings and so on. Each variable you save can be marked as a standard value or masked. Masked means the value is not shown in the job logs, which is important for tokens and passwords. A variable can also be marked as a file, meaning that when the CI/CD pipeline uses it, it sees a file path rather than a string. Useful for items like SSH keys.
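A sketch of the file-variable idea, assuming a file-type variable I’ve called `SSH_DEPLOY_KEY` has been created in Settings > CI/CD > Variables (the host name is also a placeholder):

```yaml
# GitLab writes the variable's value to a temporary file and puts
# that file's path into $SSH_DEPLOY_KEY, so it can be used directly
# wherever a key *file* is expected.
deploy-job:
  script:
    - chmod 600 "$SSH_DEPLOY_KEY"
    - ssh -i "$SSH_DEPLOY_KEY" deploy@example.com 'echo connected'
```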

GitLab Environments

When we run our CI/CD pipelines, we are going to want different variables available for different types of runs. For example, you don’t want the password used in dev to be used in production.

GitLab allows for this, but it’s not directly linked to the CI/CD settings or the variables screen. It’s defined by using GitLab Environments.

I’m not going to go into all the uses of Environments here, only their use with variables. In the Environments section you can create any number of environments. When you set a variable, you can then state which environment it belongs to, and you can set the same variable name for different environments.

If you don’t state an environment, the variable becomes the default, used whenever it is not set for a given environment.

What this means is that you can set the username as a default to be used by all environments, but set the password to be unique for each one.

In your GitLab CI/CD code, you then state which environment a given job uses. It’s important to understand that environments are NOT linked to branches; they are independent. Why is that, you ask? Well, let’s say you are deploying to three production servers. You would have three jobs triggered by an update to your production branch, each with a different environment attached. Thus your three production servers can have different passwords, tokens and so on. It’s kinda neat, even if most of the time we don’t really need to use it like that.
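The three-servers idea might look like this sketch; the environment names, branch name and `deploy.sh` script are all placeholders of mine:

```yaml
# Two of the three jobs shown. Both fire on the production branch,
# but each resolves $DEPLOY_PASSWORD against its own environment's
# scoped variables.
deploy-prod-1:
  environment: production-1
  rules:
    - if: $CI_COMMIT_BRANCH == "production"
  script:
    - ./deploy.sh "$DEPLOY_PASSWORD"   # production-1's value

deploy-prod-2:
  environment: production-2
  rules:
    - if: $CI_COMMIT_BRANCH == "production"
  script:
    - ./deploy.sh "$DEPLOY_PASSWORD"   # production-2's value
```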

GitLab Artifacts and Packages

This is a concept that is very confusing when you start using GitLab CI/CD. GitLab has a package feature that can store package files, Docker images and so on, but it also has what it calls artifacts, which are also files. So why the two?

First off, you need to understand that each job in your CI/CD pipeline is self-contained. If you create a file in job1, it won’t be available to job2 by default. This is where artifacts come in: you can create a file and save it as an artifact in job1, then tell job2 to use the artifact saved by job1.

Another important point here is that artifacts are considered temporary: they have a set expiry time. If your artifacts contain sensitive data, you can set this to an hour or so. The concept is that they are not packages and will not be kept until you delete them.
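Both ideas, passing a file between jobs and the short expiry, fit in a few lines of pipeline config. A sketch, with my own job and file names:

```yaml
job1:
  stage: build
  script:
    - tar -czf app.tar.gz src/
  artifacts:
    paths:
      - app.tar.gz
    expire_in: 1 hour    # short expiry for anything sensitive

job2:
  stage: deploy
  needs: [job1]          # job2 downloads job1's artifacts automatically
  script:
    - ls -l app.tar.gz
```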

Packages are where you store the compiled packages or Docker images that you want to upload to the host, and where you keep old versions so you can roll back.
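For plain files, GitLab’s generic package registry works well for this, and a job can push to it with the predefined `CI_JOB_TOKEN`. A sketch; the package name “myapp” and the version are examples:

```yaml
# Upload a build to the project's generic package registry so old
# versions stick around for rollbacks.
publish:
  stage: publish
  script:
    - |
      curl --fail --header "JOB-TOKEN: $CI_JOB_TOKEN" \
           --upload-file app.tar.gz \
           "$CI_API_V4_URL/projects/$CI_PROJECT_ID/packages/generic/myapp/1.0.0/app.tar.gz"
```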

The Technical Stack: Deploying to Your Target

This is where things start to get messy, especially if you want a simple deployment. The CI/CD pipeline basically works by running shell commands on the GitLab runner. It has no defined way to ship or deploy code and packages to the target host.

As such, how the code and services get deployed is down to you. Basically this means using Ansible, Fabric, or SSH’ing into the host from the runner. They all do much the same thing, as Ansible and Fabric still SSH into the target and run scripts there.

So which is the best way? The answer is that I have no idea. I have used Ansible in some cases, Fabric in others, and recently just SSH’d in to run Invoke commands on the target. I can’t say I’m happy with any of these. My most recent work has involved Python Invoke, and I quite like it. My reasoning is simple, though: Invoke is made to run locally only. It was part of Fabric, where Fabric took the Invoke command and ran it on the other machine over SSH. What Invoke also lets me do is run the same commands while I’m developing.
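To make the SSH-plus-Invoke approach concrete, here is a sketch of a deploy job. The `SSH_DEPLOY_KEY` file-type variable, host name, path and task name are all assumptions of mine, not fixed GitLab names:

```yaml
deploy:
  stage: deploy
  environment: staging
  script:
    - chmod 600 "$SSH_DEPLOY_KEY"
    # Run the same Invoke task on the target that I run locally
    # while developing; only the transport (SSH) differs.
    - ssh -i "$SSH_DEPLOY_KEY" deploy@staging.example.com "cd /srv/myapp && invoke deploy"
```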

Which is best, I think, is for a future article. My next one will go into the GitLab CI/CD scripts themselves. You might be disappointed that they’re not in this one, but you really do need to understand how runners, variables and artifacts work before we get into the scripts.