Practical Tekton
Introduction
This post gives an overview of Tekton based on my initial testing and exploration of the project within a RedHat OpenShift (Kubernetes) cluster (using v4.4 and v4.5).
My goal was to explore the new technologies to see if I could simplify the developer experience for our developers and users. A simple task: take a commit from VS Code (running on the Windows 10 desktop I use in my day job), trigger a pipeline to build the application into a container image, run unit tests and vulnerability scanning, and then deploy the application image to the OpenShift cluster using KNative.
Note that Tekton is known as OpenShift Pipelines in OpenShift, and KNative is known as OpenShift Serverless. They're essentially the same things, and I may use those terms interchangeably in this document.
I'm not going to cover installation of Tekton or KNative here, because there are plenty of documents out there that already cover it, for both vanilla Kubernetes clusters and OpenShift clusters. In any case, the OpenShift versions of these products are relatively easy to install using the Operators feature found in OpenShift.
What is Tekton?
Tekton is a tool for creating CI/CD application container build pipelines within Kubernetes. Tekton is Kubernetes-native, meaning it was designed from the ground up as a CI/CD system running directly inside a Kubernetes cluster. It is implemented using various CRDs (Pipeline, PipelineRun, Task, ClusterTask, Trigger, etc) provided under the tekton.dev and triggers.tekton.dev API groups. As such, if you're an experienced Kubernetes user and comfortable with applying, patching and editing the YAML of Kubernetes resources, you'll be right at home.
Pipelines, Tasks and Steps
To deploy a Tekton CI/CD pipeline, you start off by creating a Pipeline YAML resource, which consists of a number of Tasks. Each Task in the Pipeline gets deployed as its own pod within the namespace, and is made up of a series of steps, with each step being a separate container image in the pod. So, for example, if you have a Task with three steps (build, push, scan), a single pod with three different container images is normally deployed. In addition, Tekton itself adds init containers and other things to the pod in order to execute the pipeline (for example, the pipelines-creds-init-rhel8 init container, which sets up secrets and any script resources specified in your tasks). The order of execution of each Task is determined by the spec.tasks[*].runAfter parameter, which can be used to ensure the Pipeline runs in a specific order (Tasks can also run in parallel by setting this to the same value on different Tasks). Below is a fabricated example to illustrate this:
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: my-pipeline
spec:
  tasks:
  - name: git-clone
    taskRef:
      kind: Task
      name: git-clone
  - name: build
    taskRef:
      kind: Task
      name: build
    runAfter:
    - git-clone
  - name: scan
    taskRef:
      kind: Task
      name: scan
    runAfter:
    - build
  - name: push
    taskRef:
      kind: Task
      name: push
    runAfter:
    - scan
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: build
spec:
  steps:
  - name: build-binary
    image: registry/builder:latest
  - name: test
    image: registry/test:latest
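Steps can also carry an inline script rather than relying on the image's default command (these scripts are among the things the init containers mentioned above materialise into the pod). A minimal sketch, assuming a test image that provides a make target:

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: test
spec:
  steps:
  - name: run-tests
    image: registry/test:latest   # illustrative image name
    script: |
      #!/bin/sh
      # run whatever test entrypoint your image provides (assumed: make test)
      make test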
Tekton has a catalog of pre-created Tasks, and RedHat have a curated list of these that are bundled as ClusterTasks and installed along with OpenShift Pipelines. ClusterTasks are identical to Tasks except that they are global resources, available for use by all users across all namespaces in a cluster, whereas Tasks are namespaced.
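To consume a ClusterTask from your Pipeline, you set kind: ClusterTask in the taskRef. A minimal sketch (buildah is one of the ClusterTasks bundled with OpenShift Pipelines; the pipeline name and parameter value are made up):

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: clustertask-example
spec:
  tasks:
  - name: build
    taskRef:
      kind: ClusterTask   # resolved cluster-wide rather than in this namespace
      name: buildah
    params:
    - name: IMAGE                      # buildah's destination image parameter
      value: registry/myapp:latest     # illustrative value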
Parameters
Inputs to your pipeline are specified in spec.params of the Pipeline object, and these can then be made available to, and referenced by, each task (spec.tasks[*].params):
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: my-pipeline
spec:
  params:
  - name: GIT_REPO
    type: string
    description: URL of the Git repository
  - name: GIT_BRANCH
    type: string
    description: Git branch
  - name: REGISTRY
    type: string
    description: Container registry name
  - name: DEST_IMAGE
    type: string
    description: Name of image that will be built
  - name: DEST_IMAGE_TAG
    type: string
    description: Image tag
  tasks:
  - name: git-clone
    taskRef:
      kind: Task
      name: git-clone
    params:
    - name: source_dir
      value: "$(params.GIT_REPO)"
  - name: build
    taskRef:
      kind: Task
      name: build
    params:
    - name: image-name
      value: "$(params.REGISTRY)/$(params.DEST_IMAGE):$(params.DEST_IMAGE_TAG)"
You can also use parameters that are outputs (results) from Tasks that have already executed. For example, you could take the Git commit SHA from the HEAD of the repository you clone in a git-clone Task, and pass it as an input into the build Task that builds the container (for example, perhaps you want to add a container image LABEL with the commit SHA). These are referenced as $(tasks.<task_name>.results.<result_name>) (you can also create these yourself; more on that in a different post):
tasks:
- name: git-clone
  taskRef:
    kind: Task
    name: git-clone
- name: build
  taskRef:
    kind: Task
    name: build
  params:
  - name: commit
    value: "$(tasks.git-clone.results.commit)"
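For completeness, here's a hedged sketch of how a Task declares and emits such a result (the real git-clone catalog Task does this for you; the image and param here are illustrative):

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: git-clone
spec:
  params:
  - name: url
    type: string
  results:
  - name: commit
    description: SHA of the cloned HEAD
  steps:
  - name: clone
    image: alpine/git:latest    # assumed image with git available
    script: |
      #!/bin/sh
      git clone "$(params.url)" /tmp/source
      cd /tmp/source
      # writing to $(results.commit.path) is what publishes the result
      git rev-parse HEAD | tr -d '\n' > $(results.commit.path)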
PipelineResources
There is also a Tekton resource type called a PipelineResource, which is an input or output object that can be defined in spec.resources and provided to a Task. For example, you might have a Git repo PipelineResource input which defines a Git repo (url, branch) that a git-clone type Task can use as an input, or an output object for a container image produced by a build Task (eg $(resources.outputs.image.url)).
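For illustration only, a minimal git-type PipelineResource looks something like this (the name, URL and branch are made up):

apiVersion: tekton.dev/v1alpha1
kind: PipelineResource
metadata:
  name: my-git-repo
spec:
  type: git
  params:
  - name: url
    value: https://git.example.com/myorg/myapp.git   # illustrative repo
  - name: revision
    value: master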
In my experience with Tekton 0.7, these were hideously complicated to debug: I found them very opaque and hard to troubleshoot, and I had compatibility issues where they didn't fully support the auth mechanism of our container registry (JFrog Artifactory). In most cases using parameters was easier and just as functional, so I quickly stopped using PipelineResources.
In addition, during OpenShift Pipelines version updates I noticed that more ClusterTasks were switching from using PipelineResources as inputs to using standard parameters as inputs (eg $(params.git_url) instead of $(resources.inputs.git.url)). There is a section of the Tekton docs, Why Aren't PipelineResources in Beta?, which also suggests there have been issues, so it appears that they are falling out of favour or at least need some more work. In any case, I've not found a compelling reason to use them over parameters.
Secrets
When using Secrets with Tekton, it is worth noting that, in comparison to other consumers of Secrets, Tekton is very particular about their type and annotations. Most of our "Git secrets" (Secrets containing private SSH keys that are attached to BuildConfigs) were initially created as type: Opaque. Whilst this allows them to be used with BuildConfigs, they are not picked up and used by Tekton. To use them, you will need to ensure the Secret's type is set to type: kubernetes.io/ssh-auth.
In addition, Tekton secrets must have annotations that explicitly define the hosts where they will be used. Lastly, a data.known_hosts entry (which you can take from $HOME/.ssh/known_hosts after you've connected to the host via SSH) is also required to avoid error messages. The example below shows a correctly annotated SSH key Secret for use with Tekton.
apiVersion: v1
kind: Secret
metadata:
  name: my-git-ssh-key
  annotations:
    tekton.dev/git-0: mygithost.example.com
type: kubernetes.io/ssh-auth
data:
  ssh-privatekey: <base64 encoded>
  known_hosts: <base64 encoded>
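One more gotcha worth mentioning: Tekton's creds-init only picks up Secrets that are linked to the ServiceAccount the run executes under (with OpenShift Pipelines this is the pipeline ServiceAccount by default). A sketch, assuming the Secret above:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: pipeline    # OpenShift Pipelines' default SA; adjust if you use your own
secrets:
- name: my-git-ssh-key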
Workspaces
Workspaces were introduced into Tekton as a new feature whilst I was working with it. They're somewhat like standard Volumes in a Pod, in that they present Secrets, ConfigMaps or PersistentVolumeClaims at a location in the filesystem of a given Task's pod (I only used a PVC in my testing). The value of this is that you can share data between Tasks (eg the source code that your build Task needs after your git-clone Task has fetched it).
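A minimal sketch of the wiring, reusing names from the earlier examples: the Pipeline declares the workspace, each Task that needs it mounts it, and the PipelineRun (covered next) binds it to an actual PVC:

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: my-pipeline
spec:
  workspaces:
  - name: source            # declared once, shared by the tasks below
  tasks:
  - name: git-clone
    taskRef:
      kind: Task
      name: git-clone
    workspaces:
    - name: output          # the name the Task itself declares (assumption)
      workspace: source
  - name: build
    taskRef:
      kind: Task
      name: build
    runAfter:
    - git-clone
    workspaces:
    - name: source
      workspace: source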
Running a Pipeline
When you execute a Pipeline, a PipelineRun object is created that connects the Workspaces, ServiceAccounts, Pipeline etc together and deploys the Pods needed to run your Pipeline - or in other words, there is one PipelineRun object created each time the Pipeline is executed.
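A hedged sketch of a PipelineRun for the pipeline above (the parameter value and PVC name are illustrative, and the PVC is assumed to already exist):

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: my-pipeline-run-   # a fresh PipelineRun per execution
spec:
  pipelineRef:
    name: my-pipeline
  serviceAccountName: pipeline     # see the Secrets section above
  params:
  - name: GIT_REPO
    value: git@mygithost.example.com:myorg/myapp.git   # made-up repo
  workspaces:
  - name: source
    persistentVolumeClaim:
      claimName: workspace         # assumed existing PVC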
As all Tekton objects are implemented natively in Kubernetes as CRDs, you could just use oc/kubectl to drive Tekton, but the tkn CLI client is much easier for interacting with Tekton pipelines (in OpenShift, you can also use the Web Console UI: navigate to the Pipelines menu and simply click Start in the context menu). For example, to start your pipeline, run the following command:
tkn pipeline start mypipeline
By default, tkn will go into an interactive mode and prompt you for all the input information it requires. But I quickly got into the habit of passing all of these as additional command line parameters, and immediately displaying the logs.
tkn pipeline start mypipeline --use-param-defaults -w name=workspace,claimName=workspace --showlog
Conclusion
Though I have been using OpenShift and Kubernetes for quite a few years, I was new to both KNative and Tekton. Both are fast-moving projects, with many resources in alpha or beta (RedHat Tech Preview) at best.
As with any new software, it can be a little tricky to work out and understand how it all integrates together. This is particularly true in a company like the one where I work, where we run OpenShift/Kubernetes clusters in a more secure, disconnected, on-premise environment.
In particular, we tend to run on-prem instances of commercial products (eg the JFrog Artifactory container registry) instead of cloud or externally hosted services (like Docker Hub or Quay.io). Even though these services run on a private disconnected network, we are also required to adhere to stricter security policies than most organisations (eg no publicly accessible Git repos or registries, always requiring credentials for any type of pull or push, using private CAs and PKI). This means that, as an early adopter of new software whose developers and testers have built against standard cloud services out on the internet rather than on-prem commercial products, you're likely to be the first to encounter bugs in new projects.
The table below shows the versions I've been using, and each version fixed bugs I'd actually hit myself in the prior version. I can't stress enough that if you're an OpenShift user, or a non-cloud Kubernetes user who manages the versions, you should use the latest OpenShift and Pipelines versions.
| OpenShift | Pipelines | Base Tekton | tkn CLI |
|---|---|---|---|
| OpenShift 4.6 | Tech Preview 1.2 | Tekton 0.16.3 | tkn 0.13.1 |
| OpenShift 4.5 | Tech Preview 1.1 | Tekton 0.14.3 | tkn 0.11.0 |
| OpenShift 4.4 | Tech Preview 1.0 | Tekton 0.11.3 | tkn 0.9.0 |
The other thing with new alpha/beta software is that it changes quite quickly, which means that many of the articles, tutorials, GitHub comments and issues reference objects that have since been deprecated or had their spec changed (for example, one of the prominent Serverless tutorials still at the top of Google searches as I write this details the KNative build component, which has now been deprecated in favour of Tekton as a separate project). Be very careful using older examples and tutorials found on the internet.
Anyway, after a few teething issues I got a working pipeline that built, scanned and deployed a Python container image as a KNative service, and I can say I really like Tekton. In particular, I never really enjoyed writing Groovy, and I much prefer creating pipelines from a combination of bash scripts and YAML editing of Kubernetes resources.
In my next post, I will detail how I'm using KNative (Serverless) as a lightweight way to deploy your application at the end of the Pipeline. KNative promises a lot, but its use in development and build systems is a solid use case you can easily adopt today.