More Testable Rake Tasks
A few months ago I had a brief pairing session where I tried to help another developer get some working tests around a rake task they had created. They were having trouble testing within rake and executing tasks, which is a common issue in rake testing. My suggestion was to build the entire logic of the task in a PORO to avoid the pain of testing rake plumbing.
The approach takes the following steps:
- Create a spec file to test the new class
- Create a class that handles doing the thing
- Create a spec file for the rake task
- Add the rake task and defer everything to the new class
For this example say you have some rake task that should just return the current versions of ruby and bundler locally. Not super useful, but it will involve executing some local commands. We want to be able to run:
And have it dump out:
Instead of diving in, doing all the logic in rake, and having to figure out how to test it in that context, we start with the class that will build the results, say VersionDisplayer, and we begin with its spec:
require_relative 'version_displayer'
describe VersionDisplayer do
  describe '#display' do
    let(:version_displayer) { VersionDisplayer.new }
    let(:expected_output) do
      <<~OUTPUT
        Gathering versions
        Ruby VERSION: ruby 2.7.0
        Bundler VERSION: bundler 2.0
      OUTPUT
    end
    before do
      allow(Open3).to receive(:capture2)
        .with('ruby --version').and_return(['ruby 2.7.0', nil])
      allow(Open3).to receive(:capture2)
        .with('bundler --version').and_return(['bundler 2.0', nil])
    end
    it 'returns the current bundler and ruby version' do
      expect { version_displayer.display }.to output(expected_output).to_stdout
    end
  end
end
This leads us to the VersionDisplayer class that satisfies the spec:
require 'open3'
class VersionDisplayer
  def display
    puts "Gathering versions"
    ruby_version, _status = Open3.capture2('ruby --version')
    bundler_version, _status = Open3.capture2('bundler --version')
    puts "Ruby VERSION: #{ruby_version}"
    puts "Bundler VERSION: #{bundler_version}"
  end
end
Now we can write a simple rake integration spec:
require 'rake'
describe ':display_versions' do
  let(:version_displayer) { instance_double('VersionDisplayer') }
  before do
    load File.expand_path('Rakefile')
    allow(VersionDisplayer).to receive(:new).and_return(version_displayer)
    allow(version_displayer).to receive(:display).with(no_args)
  end
  it 'uses version displayer to output the version information' do
    Rake::Task['display_versions'].invoke
    expect(version_displayer).to have_received(:display).with(no_args)
  end
end
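One thing to keep in mind with a spec like this: `Rake::Task#invoke` runs a task at most once per process, so a second example invoking the same task would quietly do nothing unless the task is re-enabled first. Here is a minimal sketch of that behaviour, using a throwaway `:demo` task (a hypothetical name, not part of the example above):

```ruby
require 'rake'

runs = []

# Hypothetical throwaway task that records each time its body executes.
Rake::Task.define_task(:demo) { runs << :ran }

Rake::Task[:demo].invoke   # executes the task body
Rake::Task[:demo].invoke   # already invoked, so the body is skipped
first_count = runs.size

Rake::Task[:demo].reenable # clears the "already invoked" flag
Rake::Task[:demo].invoke   # executes the body again
second_count = runs.size

puts "runs before reenable: #{first_count}, after: #{second_count}"
```

If the rake spec grows beyond a single example, calling `reenable` in a `before` or `after` block is one way to keep the examples independent.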
And finally we add a simple rake task that delegates everything:
require_relative 'version_displayer'
desc 'Displays versions of bundler and ruby'
task :display_versions do
  VersionDisplayer.new.display
end
I much prefer this to embedding the logic in the rake task and writing complex tests around the rake context.
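Once the logic lives in a plain class, a further optional step is to inject its collaborators so the spec does not need to stub `Open3` at all. The following is only a sketch of that idea, not the class from the example above: the class name `InjectableVersionDisplayer` and the `runner` and `out` parameters are hypothetical names introduced here for illustration.

```ruby
require 'open3'
require 'stringio'

# Sketch of a VersionDisplayer variant with injectable collaborators.
# The class name and the `runner:` / `out:` keywords are assumptions
# made for this illustration.
class InjectableVersionDisplayer
  # By default, shell out with Open3 and keep only the captured stdout.
  DEFAULT_RUNNER = ->(command) { Open3.capture2(command).first }

  def initialize(runner: DEFAULT_RUNNER, out: $stdout)
    @runner = runner
    @out = out
  end

  def display
    @out.puts 'Gathering versions'
    ruby_version = @runner.call('ruby --version')
    bundler_version = @runner.call('bundler --version')
    @out.puts "Ruby VERSION: #{ruby_version}"
    @out.puts "Bundler VERSION: #{bundler_version}"
  end
end

# In a test, canned responses stand in for the real commands and a
# StringIO captures the output, so nothing is stubbed globally.
canned = { 'ruby --version' => 'ruby 2.7.0', 'bundler --version' => 'bundler 2.0' }
captured = StringIO.new
InjectableVersionDisplayer.new(runner: canned.method(:fetch), out: captured).display
print captured.string
```

The rake task stays the same one-liner, since the defaults still shell out and write to standard output.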
Configuring Jenkins Properties As Code
Jenkins appears to be the single most popular CI/CD tool despite its age and corresponding legacy tradeoffs. When I landed on the State of California’s Child Welfare project, Jenkins was already the default option. I brushed up on it, since I had largely missed the pipeline features, including an actual configuration as code option, that were added to Jenkins around 2016.
We moved to the state’s infrastructure and Jenkins server by making heavy use of Jenkinsfiles and a few triggering plugins like GitHub Pull Request Builder and the Generic Webhook Trigger. Later in the project, I volunteered to lead the pipeline team to ‘Automate All the Things’. Eventually, we experimented with writing Jenkins Shared Libraries, which are a somewhat clunky way to share code between Jenkinsfiles without breaking down and writing whole Jenkins plugins. After some early successes, we went on to write a significant library of shared steps, which were then used on dozens of the organization’s projects.
After a time it occurred to me that many of the project settings that were a pain to set up in the GUI could be embedded in a shared library custom step. One of the most elaborate plugins to configure was GitHub Pull Request Builder, but we were eventually able to capture all of its settings in a fairly simple step.
Under the covers it sets up a Frankenstein sort of object:
[$class: 'org.jenkinsci.plugins.ghprb.GhprbTrigger',
  spec: 'H/5 * * * *',
  configVersion: 3,
  allowMembersOfWhitelistedOrgsAsAdmin: true,
  orgslist: 'ca-cwds',
  cron: 'H/5 * * * *',
  onlyTriggerPhrase: false,
  useGitHubHooks: true,
  permitAll: false,
  autoCloseFailedPullRequests: false,
  displayBuildErrorsOnDownstreamBuilds: false,
  triggerPhrase: 'retest this please',
  skipBuildPhrase: '.*\\[skip\\W+ci\\].*',
  extensions: [
    [
      $class: 'org.jenkinsci.plugins.ghprb.extensions.build.GhprbCancelBuildsOnUpdate',
      overrideGlobal: false
    ],
    [
      $class: 'org.jenkinsci.plugins.ghprb.extensions.status.GhprbSimpleStatus',
      commitStatusContext: 'Pull Request Testing',
      statusUrl: 'https://jenkins.example.com',
      addTestResults: false,
      completedStatus: [
        [
          $class: 'org.jenkinsci.plugins.ghprb.extensions.comments.GhprbBuildResultMessage',
          message: 'Success',
          result: 'SUCCESS'
        ],
        [
          $class: 'org.jenkinsci.plugins.ghprb.extensions.comments.GhprbBuildResultMessage',
          message: 'Build Failed',
          result: 'FAILURE'
        ],
        [
          $class: 'org.jenkinsci.plugins.ghprb.extensions.comments.GhprbBuildResultMessage',
          message: 'Build Error',
          result: 'ERROR'
        ]
      ]
    ]
  ]
]
This saved significant time for any team trying to set up their first pull-request based build in their Jenkinsfile, as they no longer had to figure out all of the boxes to check and text fields to fill out. So if you’re just getting started trying to reuse code among Jenkinsfiles, don’t forget the option of embedding complex config as well.
Notes:
We’ve pulled in three things now: the Jenkins GitHub plugin, GHPRB, and Generic Webhook Trigger. Had to deal with them as steps. Eventual goal is to move all config out of Jenkins into code. Unfortunate bootstrap issue, especially with triggers or a first time project. Jenkins X is supposed to provide some of this functionality.
Evaluating Concourse CI
Over the holidays I had a chance to do a review of the CI options out there. I’m more than happy with something like Travis, and I’ve often said Jenkins is the WordPress of CI servers even though I’ve used it now for the bulk of my career. Currently, I’m working on building out a full production pipeline for the State of California where we are attempting to be transparent and use AGPL 3.0 for all of our code. That pretty much eliminated Travis CI, Codeship, and Circle CI. I realize they’re free for open source projects in general, but suffice it to say that getting a SaaS solution, even a free one, approved with some level of access to our GitHub organization is a tall order with the existing policies in place. So self-hosted open source was going to be our option.
One of those options was Concourse CI, so I set aside several hours to dive in and take a good look at it.
There is a lengthy tutorial put out by some developers from Stark and Wayne. The tutorial gives you a good sense of how it would be to use Concourse CI. I did find a few stumbling blocks running through it:
Fly Install
There isn’t a convenient brew install, so I had to go looking for this command to install instead. (You can download from the initial docker container, but it downloaded a strange fly.dms file for me, and I found it easier to download and install directly.)
curl -Lo fly https://github.com/concourse/concourse/releases/download/v4.2.2/fly_darwin_amd64 && chmod +x fly && mv fly /usr/local/bin/
Private Keys
The tutorial step where you publish outputs involves putting private keys in your pipeline.yml file, which feels terrible. I was able to use a personal access token on GitHub instead by switching out the :uri under the git resource and replacing the private key with a username and password, where the password is just the personal access token. I was surprised to see the use of private keys at all in the tutorial. I also made sure to lock down the token to only be able to create gists, since this tutorial step updates a particular gist.
Original Config
- name: resource-gist
  type: git
  source:
    uri: git@gist.github.com:e028e491e42b9fb08447a3bafcf884e5.git
    branch: master
    private_key: |-
      -----BEGIN RSA PRIVATE KEY-----
      MIIEpQIBAAKCAQEAuvUl9YU...
      ...
      HBstYQubAQy4oAEHu8osRhH...
      -----END RSA PRIVATE KEY-----
Config With Personal Access Token
- name: resource-gist
  type: git
  source:
    uri: https://gist.github.com/e028e491e42b9fb08447a3bafcf884e5.git
    branch: master
    username: edgibbs
    password: ddf78dffcaf4398bec2323
CredHub
Secrets With Credentials Manager
After the step of making sure I had an up to date VirtualBox and cloning Stark and Wayne’s CLI bucc, it failed right away. They have a note about needing direnv, but I had to do a bit more config with it:
echo 'eval "$(./bin/bucc env)"' > .envrc
brew install direnv
direnv allow
Despite these few missteps I was able to work my way through the tutorial and get an overall impression.
Takeaways
Overall Architecture
Under the covers Concourse relies on three main parts:
- A master server, the ATC or air traffic controller, that is the heart of Concourse
- The TSA which is an ssh server for registering workers with the ATC
- Postgres for storing persistent data
A pipeline in Concourse also has three main items:
- Resources are things like git, artifact creation, or Slack
- Resource types are custom resources that you can define yourself or pull in from community contributions
- Jobs are where the work of the pipeline happens
In contrast with Jenkins, there is a UI with the ATC, but it is used only for reporting or potentially kicking off jobs. The main interface is the fly CLI. And everything essentially runs inside containers like Docker.
Impressions
Much of the workflow derives from the CLI, and pipelines are first class citizens. The ability to test out a pipeline locally with fly is way ahead of setting up something like Jenkins, especially if your organization uses a lot of plugins. Another change is that because Concourse is designed to have atomic jobs, you can actually retrigger a pipeline from a particular stage.
The UI in Concourse is a minimalist dark mode style web interface. Jenkins still feels like it’s straight out of 2002, so it was nice to see a visual representation with boxes and lines versus the tables of Jenkins’ current pipeline representations.
One area that looks like it might be a pain with Concourse is its support for artifacts. Since each job is atomic, you have to handle passing things between jobs by pushing them out and storing them with a resource. Then some later job can pick them up and process them further.
The declarative style syntax of the pipelines appears to be fine, though I wouldn’t have minded the option to escape out to a full programming language.
Overall I thought it was a novel direction for CI servers and worth a deeper dive down the road.
The Dark Side of Javascript Fatigue
Javascript fatigue is a real experience for many developers who don’t spend their day to day in Node.js bashing out javascript. For many developers javascript is an occasional concern. The thing I can’t figure out about the javascript development world is the incredible churn. Churn is often a disaster for a programming community. It frustrates anyone trying to build a solid application that will have a shelf life of a decade or more. Newcomers are treated to overwhelming choices without enough knowledge to choose. Then they find that what they’ve learned is no longer the new and shiny tool only a few months later. And anyone on the outside feels validated in not jumping in.
Many in the javascript community attempt to couch all the churn as a benefit. It’s the incredible pace of innovation. I see sentiments like this:
The truth is, if you don’t like to constantly be learning new things, web development is probably not for you. You might have chosen the wrong career!
Even if we accept that all the ‘innovation’ is moving things forward more quickly, there is rarely any reflection on the consequences. I’ve worked on an approximately 9 year old Rails app for about 5 years now and I’m still shocked by the number of different frameworks and styles of javascript that litter the app:
- Hand rolled pre JQuery javascript
- Javascript cut and paste style
- RJS (an attempt to avoid writing javascript altogether in early rails)
- YUI
- Prototype
- Google Closure
- JQuery
- Angular
Eight different frameworks in about as many years. And though we adopted Angular about 2 years ago we’re already dealing with non-backwards compatibility, Angular 2.0. This is a large burden on maintenance and it costs us very real time to spin up on each one when we have to enhance the app or fix a bug.
This is a monolithic app that’s been built over quite a few years, but the big difference is the Rails app was opinionated and stuck to a lot of default conventions. The framework churn of Rails has been much more gradual and generally backwards compatible. The largest pain we experienced was going from Rails 2 to 3, when Rails was merged with Merb. The knowledge someone built up in their first few years working in Ruby and Rails still applies. The churn certainly exists, but at a measured pace.
In phone screens when I describe our main app, I list off the myriad javascript frameworks we use as a negative they should know about. And almost none of the candidates have heard of Google Closure, even though a critical piece of the app was written in it. They often assume I must be talking about the JVM Clojure.
Javascript has never been popular because of elegance or syntax. Rants like the following are not hard to find:
You see the Node.js philosophy is to take the worst fucking language ever designed and put it on the server.
Large majorities of developers would rather avoid it completely to focus on any modern language and hopefully use a transpiler if they have to touch Javascript. In this environment it might do the javascript community some good to settle down some and focus on some stability.
Mob Programming (Pair Squared)
I came across the idea of mob programming on an Elixir podcast, Elixir Fountain. It’s pair programming on steroids, where you sit together and work on one problem together. There’s still just one driver at a time, though it rotates. There’s already a web site and a conference based on it.
I like the concept, but I’m not sure how effective it would be as a constant practice. I’ve done exercises like mob programming in small doses on particularly hard problems that involve architecture choices and occasionally as an exercise at user group meet ups. Anecdotally I don’t see doing it most of the time, but it is possible it works as a regular practice. It’s different enough that it might need some further experiments.