Working Effectively with CI/CD Pipelines

Overview

  • When and when not to leverage the CI/CD pipeline
  • Assumes familiarity with CI/CD basics
  • Does not cover every edge case; there are exceptions

Not all changes are equal

  1. Updating a Security Group with a new IP address
  2. Deploying a new AWS resource, e.g. Amazon Neptune
  3. Developing a script, e.g. to orchestrate DNS
  4. Updating IAM permissions for an AWS role

What should I use a CI/CD pipeline for?

  • Updating something when the outcome is very predictable
  • Validating more complex changes that have been sufficiently tested
  • Copying existing, working code and using it in other environments
  • Double-checking that your new feature passes tests in a consistent environment
  • Testing changes to the pipeline itself

Feedback loop

  • Running on workstation with hot reload - instant (milliseconds)
  • Compiling on workstation - fast (seconds)
  • Deploying Lambda from workstation - okay (tens of seconds; see the sketch below)
  • Deploying Lambda/Docker from pipeline - slow (minutes)
  • Deploying AWS infrastructure from workstation - very slow (many minutes)
  • Baking an AMI, deploying it to an ASG with rolling updates - extremely slow (tens of minutes)
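
For comparison, a minimal sketch of the "okay (tens of seconds)" workstation path above, assuming an already-deployed function (the function name and handler file are hypothetical):

import io
import zipfile

import boto3


def push_lambda_code(function_name, handler_file):
    # zip a single handler file in memory
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.write(handler_file)
    # update_function_code swaps the deployed package in place; boto3 picks up
    # region and credentials from your local AWS config
    client = boto3.client("lambda")
    client.update_function_code(FunctionName=function_name, ZipFile=buffer.getvalue())


push_lambda_code("my-dev-function", "handler.py")  # hypothetical names

No checkout, no build agents, no rolling updates: just the artifact you changed.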

What should I not use a CI/CD pipeline for?

  • Developing code
  • Experimentation for proof of concepts
  • Researching a new, unfamiliar tech stack
  • Checking whether you have the right syntax for a command
  • Orchestrating changes with complex dependency graphs

Considerations

Parameterisation

  • Scripts should ideally be runnable locally (see the sketch after this list)
  • Mock or add conditional logic for things that can’t run locally, e.g. EC2 instance metadata
  • Pass parameters (env vars, CLI flags) to set URLs, etc.
  • Scripts should not concern themselves with authentication:
    • EC2 instances will have instance profiles
    • Running locally will use personal credentials
    • The CI system will have commands to assume a role
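
A minimal sketch tying these points together (the flag names, env vars, and fake instance ID are hypothetical): parameters arrive via CLI flags with env-var fallbacks, EC2-only lookups sit behind a conditional, and credentials are left entirely to the environment.

import argparse
import os
import urllib.request

import boto3


def main():
    parser = argparse.ArgumentParser(description="example parameterised script")
    # CLI flag wins, env var is the fallback
    parser.add_argument("--api-url", default=os.environ.get("API_URL", "https://dev.example.com"))
    parser.add_argument("--region", default=os.environ.get("AWS_REGION", "ap-southeast-2"))
    parser.add_argument("--local", action="store_true", help="running on a workstation")
    args = parser.parse_args()

    if args.local:
        instance_id = "i-0123456789abcdef0"  # fake value; metadata isn't reachable locally
    else:
        # EC2 instance metadata endpoint, only reachable from an instance (IMDSv1 shown for brevity)
        with urllib.request.urlopen(
            "http://169.254.169.254/latest/meta-data/instance-id", timeout=2
        ) as response:
            instance_id = response.read().decode()

    # no credential handling here: boto3 resolves whatever the environment provides
    # (instance profile, personal ~/.aws config, or a CI-assumed role)
    ec2 = boto3.client("ec2", region_name=args.region)
    print(f"{args.api_url}: running against instance {instance_id}")


if __name__ == "__main__":
    main()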

Mocking

import time
from unittest.mock import patch

import boto3
from botocore.stub import Stubber


def create_aem_volume(client):
    time.sleep(10)  # wait for a dependency to settle
    response = client.create_volume(AvailabilityZone="ap-southeast-2a", Size=100)
    return {"VolumeId": response["VolumeId"]}


@patch("time.sleep")  # patch out the sleep so the test runs instantly
def test_can_create_aem_volume(mock_sleep):
    # normal boto3 client; an explicit region keeps this runnable without local AWS config
    client = boto3.client("ec2", region_name="ap-southeast-2")
    stubber = Stubber(client)  # put a wrapper around the client
    # mock the response for `create_volume` with a hardcoded value
    stubber.add_response("create_volume", {"VolumeId": "a81d2cc238226c8fd333b3ec2b53ada772b386be"})
    stubber.activate()
    # pass the client to our function; any API calls it makes will now be mocked
    response = create_aem_volume(client=client)
    assert response == {"VolumeId": "a81d2cc238226c8fd333b3ec2b53ada772b386be"}

TL;DR: any function using boto3 should accept its clients as arguments to facilitate easy mocking
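
As a follow-up, Stubber's expected_params can also assert the request parameters, so the test fails if the function starts calling the API with different arguments; a sketch matching the call above:

# same canned response, but the stub now also verifies the request
stubber.add_response(
    "create_volume",
    {"VolumeId": "a81d2cc238226c8fd333b3ec2b53ada772b386be"},
    expected_params={"AvailabilityZone": "ap-southeast-2a", "Size": 100},
)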

Running Pipeline Steps Locally

  • Everything we do in the pipeline should be runnable locally (see the sketch below)
  • Yes, we should be able to build and deploy even the jankiest monolith from our workstations
  • Keeping functionality out of Jenkins plugins helps with this
  • All things within reason; needing to run Jenkins locally may be overkill
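
As a sketch of what this looks like in practice (the script name and make targets are hypothetical): each pipeline stage is a thin call into a script that developers can run verbatim from a workstation, so no logic lives only in Jenkins.

"""deploy.py: single entry point used by both the pipeline and developers.

The Jenkins stage is just `python deploy.py --env dev`; a workstation runs
exactly the same command, so nothing works only inside the CI system.
"""
import argparse
import subprocess


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--env", choices=["dev", "test", "prod"], default="dev")
    args = parser.parse_args()

    # plain commands rather than Jenkins plugins, so they run anywhere
    subprocess.run(["make", "build"], check=True)
    subprocess.run(["make", "deploy", f"ENV={args.env}"], check=True)


if __name__ == "__main__":
    main()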

Signs you may need to change your workflow

  • An excessive number of commits in a PR
  • Commit messages that aren't meaningful
  • Using the pipeline as a remote shell
  • Fighting self-healing/immutable infrastructure
  • Complex orchestration/coupling of infrastructure creation to satisfy a dependency graph

Conclusion

  • The CI system is not an efficient first pass for validating code
  • Appropriate feedback loop for appropriate changes:
    • Fast feedback loop for experimentation
    • Thorough testing once reasonably confident
  • Write things to be run locally

Thanks