TEP-0100: Embedded TaskRuns and Runs Status in PipelineRuns
October 18, 2023
Summary
This TEP proposes changes to PipelineRun Status to reduce the amount of information stored
about the status of TaskRuns and Runs to improve performance, reduce memory bloat and
improve extensibility.
Instead of the full embedded statuses, the PipelineRunStatus will contain:
- the API versions, kinds and names of its TaskRuns and Runs.
- the names of the PipelineTasks from which the TaskRuns and Runs were executed.
Motivation
Today, we embed the status of each TaskRun and Run into the PipelineRun status.
This causes several issues: performance degradation, memory bloat, and lack of extensibility.
1. Performance Degradation
Every time the status of a TaskRun or Run changes, the status of the parent PipelineRun
is updated as well. For example, the status of a PipelineRun is updated upon completion of
each Step, even if the TaskRun or Run has not completed. This causes extra requests to
etcd and extra load on the Dashboard, which reacts to CRD status updates.
Read more in the related issue.
2. Memory Bloat
Embedded statuses increase the size of the serialized PipelineRuns.
As shared in API WG on 10/01/2022, the embedded statuses are costly for users:
"embedding status more than doubles the storage and has direct consequences on what customers end up paying"
Read more in the related issue.
3. Lack of Extensibility
The above problems will be exacerbated when we support features that execute multiple
TaskRuns and Runs from one PipelineTask. For example:
- Matrix: fan out a given PipelineTask into multiple TaskRuns or Runs. Fanned out TaskRuns and Runs can even be created dynamically by consuming Results from previous TaskRuns and Runs.
- Pipelines in Pipelines: pass in Pipelines to PipelineTasks to run them similarly to Tasks.
Goals
- Improve performance by reducing updates to PipelineRun status from TaskRuns and Runs.
- Improve memory usage by reducing the amount of storage PipelineRun status uses for TaskRuns and Runs.
- Improve extensibility by setting up PipelineRun status to better support upcoming features in Tekton Pipelines.
Non-Goals
- Improve aspects of PipelineRun status other than the embedding of TaskRuns and Runs.
- Make any changes to PipelineRun spec.
Background
PipelineRun Status
The PipelineRunStatus contains the status (ConditionSucceeded) of the PipelineRun and other
details including the complete status of its TaskRuns and Runs. This TEP aims to optimize the
TaskRuns and Runs fields only in the PipelineRunStatus. The other fields, such as the resolved
PipelineSpec, are out of scope and will not be changed.
type PipelineRunStatus struct {
    duckv1beta1.Status `json:",inline"`
    PipelineRunStatusFields `json:",inline"`
}

type PipelineRunStatusFields struct {
    StartTime *metav1.Time `json:"startTime,omitempty"`
    CompletionTime *metav1.Time `json:"completionTime,omitempty"`
    TaskRuns map[string]*PipelineRunTaskRunStatus `json:"taskRuns,omitempty"`
    Runs map[string]*PipelineRunRunStatus `json:"runs,omitempty"`
    PipelineResults []PipelineRunResult `json:"pipelineResults,omitempty"`
    PipelineSpec *PipelineSpec `json:"pipelineSpec,omitempty"`
    SkippedTasks []SkippedTask `json:"skippedTasks,omitempty"`
}
Owner References and Labels
TaskRuns and Runs have owner references to PipelineRuns.
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  ...
  name: myTaskRun
  ownerReferences:
  - apiVersion: tekton.dev/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: PipelineRun
    name: myPipelineRun
...
TaskRuns and Runs also have labels for the source PipelineRuns:
tekton.dev/pipelineRun: <PipelineRunName>.
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  labels:
    app.kubernetes.io/managed-by: tekton-pipelines
    tekton.dev/memberOf: tasks
    tekton.dev/pipeline: myPipeline
    tekton.dev/pipelineRun: myPipelineRun
    tekton.dev/pipelineTask: myPipelineTask
    tekton.dev/task: myTask
  name: myTaskRun
...
Users and tools can rely on these owner references and labels to connect
TaskRuns and Runs to the PipelineRuns that created them.
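As a rough illustration of how a tool might use the tekton.dev/pipelineRun label, the sketch below builds an equality-based label selector and checks whether an object's labels point at a given PipelineRun. The helper names childSelector and ownedBy are hypothetical and are not part of any Tekton library.

```go
package main

import "fmt"

// pipelineRunLabelKey is the label key shown in the TaskRun example above.
const pipelineRunLabelKey = "tekton.dev/pipelineRun"

// childSelector (hypothetical helper) builds an equality-based label
// selector that matches the children of the named PipelineRun.
func childSelector(pipelineRunName string) string {
	return pipelineRunLabelKey + "=" + pipelineRunName
}

// ownedBy (hypothetical helper) reports whether a set of labels marks an
// object as having been created by the named PipelineRun.
func ownedBy(labels map[string]string, pipelineRunName string) bool {
	return labels[pipelineRunLabelKey] == pipelineRunName
}

func main() {
	labels := map[string]string{
		"tekton.dev/pipelineRun":  "myPipelineRun",
		"tekton.dev/pipelineTask": "myPipelineTask",
	}
	fmt.Println(childSelector("myPipelineRun"))
	fmt.Println(ownedBy(labels, "myPipelineRun"))
}
```

The resulting selector string has the same form accepted by kubectl's -l option, for example when listing TaskRuns.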
Tekton Results API
Tekton Results bundles related PipelineRuns, TaskRuns and Runs
into a single unit (called Result).
For example, a PipelineRun with two TaskRuns would have one Result with
three Records:
myPipelineRun [Result]
|
v
--------------------------------------------------------------
| | |
v v v
myPipelineRun [Record] myTaskRun1 [Record] myTaskRun2 [Record]
In addition to grouping related resources, Tekton Results provides long term storage
of PipelineRuns, TaskRuns and Runs away from the runtime storage in etcd.
Read more in TEP-0021.
Proposal
The PipelineRunStatus should contain:
- the API versions, kinds and names of its TaskRuns and Runs.
- the names of the PipelineTasks from which the TaskRuns and Runs were executed.
Users and tools can find the complete status of TaskRuns and Runs in the cluster
in the short term, and can rely on Tekton Results in the long term.
In addition, they can use Owner References and Labels to
identify related objects.
API Changes
1. Add Minimal Embedded Status
We will introduce a new struct to hold references to child TaskRuns and Runs, and
their corresponding WhenExpressions and ConditionChecks:
type ChildStatusReference struct {
    runtime.TypeMeta `json:",inline"` // contains API version and Kind
    Name string `json:"name,omitempty"` // name of the TaskRun/Run
    PipelineTaskName string `json:"pipelineTaskName,omitempty"` // name of the PipelineTask used to create the TaskRun/Run
    ConditionChecks []*PipelineRunChildConditionCheckStatus `json:"conditionChecks,omitempty"` // the condition checks for the TaskRun/Run in the pipeline
    WhenExpressions []WhenExpression `json:"whenExpressions,omitempty"` // the WhenExpressions for the TaskRun/Run in the pipeline
}
The existing fields providing the complete TaskRun and Run statuses are maps keyed by
resource name. The new field is instead a list of sub-objects, as
recommended by the Kubernetes API conventions.
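To make the difference between the two shapes concrete, here is a minimal sketch that converts a name-keyed map into a list of reference sub-objects. The types fullTaskRunStatus and childRef are simplified stand-ins invented for this example (the real structs carry more fields), and toChildRefs is a hypothetical helper, not the controller's implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// fullTaskRunStatus is a simplified stand-in for PipelineRunTaskRunStatus;
// the real struct also embeds the complete TaskRun status.
type fullTaskRunStatus struct {
	PipelineTaskName string
}

// childRef is a simplified stand-in for ChildStatusReference.
type childRef struct {
	APIVersion       string
	Kind             string
	Name             string
	PipelineTaskName string
}

// toChildRefs turns the name-keyed map into a list of sub-objects. The map
// key moves into the Name field, and the list is sorted by name so that the
// output is deterministic.
func toChildRefs(taskRuns map[string]fullTaskRunStatus) []childRef {
	refs := make([]childRef, 0, len(taskRuns))
	for name, s := range taskRuns {
		refs = append(refs, childRef{
			APIVersion:       "tekton.dev/v1beta1",
			Kind:             "TaskRun",
			Name:             name,
			PipelineTaskName: s.PipelineTaskName,
		})
	}
	sort.Slice(refs, func(i, j int) bool { return refs[i].Name < refs[j].Name })
	return refs
}

func main() {
	full := map[string]fullTaskRunStatus{
		"triggers-release-nightly-build": {PipelineTaskName: "build"},
	}
	for _, r := range toChildRefs(full) {
		fmt.Printf("%s %s (pipelineTask=%s)\n", r.Kind, r.Name, r.PipelineTaskName)
	}
}
```

Because each entry carries its own kind and name, the same list can hold TaskRuns, Runs, or other child types without adding a new field per type.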
While the names of TaskRuns and Runs are concatenations of the names of the
PipelineRuns and PipelineTasks, they are sometimes truncated when they are
too long. Therefore, we include the PipelineTask name because tools, such as
the Tekton Dashboard, would still need the PipelineTask name in these situations.
ConditionChecks is present in the existing PipelineRunTaskRunStatus struct,
and WhenExpressions is present in both the existing PipelineRunTaskRunStatus
and PipelineRunRunStatus structs. They provide information which is not available
from the individual TaskRun or Run status, since they represent concepts which
only exist at the PipelineRun level. Therefore, they need to be preserved.
To support ConditionChecks, we will add a new struct PipelineRunChildConditionCheckStatus
which will hold the names and statuses of condition checks for the PipelineTask. It
will inline the PipelineRunConditionCheckStatus currently used in the full embedded
statuses. This is needed because PipelineRunConditionCheckStatus doesn't contain
the ConditionCheckName, which is the equivalent of a PipelineTask's name, just
ConditionName, which is the equivalent of a TaskRun or Run's name. Since we're
going to store an array of PipelineRunChildConditionCheckStatus rather than a map of
ConditionCheckName to PipelineRunConditionCheckStatus, we need the ConditionCheckName
in the new struct.
This struct, and ChildStatusReference.ConditionChecks, will be removed once
Conditions, which have been deprecated, are removed completely. We are not using child
references for the conditions' statuses, because ConditionCheckStatus, the only thing
in PipelineRunConditionCheckStatus other than the ConditionName, isn't replicated
anywhere else, and contains a fairly minimal amount of data - the pod name, start and
completion times, and a corev1.ContainerState. See the issue for deprecating Conditions
for more information on the planned removal of Conditions.
type PipelineRunChildConditionCheckStatus struct {
    PipelineRunConditionCheckStatus `json:",inline"` // the inlined condition check status
    ConditionCheckName string `json:"conditionCheckName,omitempty"` // the condition check's name
}
Alternatives
- Separate TaskRuns and Runs
- Separate TaskRuns and Runs - Use Maps
- Retain TaskRun and Run Status information
2. Deprecate and Remove Full Embedded Status
Deprecate and remove the old fields from PipelineRunStatusFields from the Beta API.
type PipelineRunStatusFields struct {
    ...
    ChildReferences []ChildStatusReference `json:"childReferences,omitempty"`
    ...
}
This is a backwards incompatible change in the Beta API, therefore the fields will be deprecated and removed per our deprecation policy, as described in the Beta API section below.
Example
This is an example PipelineRun status as provided in the documentation:
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  ...
spec:
  ...
status:
  completionTime: "2020-05-04T02:19:14Z"
  conditions:
  - lastTransitionTime: "2020-05-04T02:19:14Z"
    message: "Tasks Completed: 4, Skipped: 0"
    reason: Succeeded
    status: "True"
    type: Succeeded
  startTime: "2020-05-04T02:00:11Z"
  taskRuns:
    triggers-release-nightly-build:
      pipelineTaskName: build
      status:
        completionTime: "2020-05-04T02:10:49Z"
        conditions:
        - lastTransitionTime: "2020-05-04T02:10:49Z"
          message: All Steps have completed executing
          reason: Succeeded
          status: "True"
          type: Succeeded
        podName: triggers-release-nightly-build-pod
        resourcesResult:
        - key: commit
          resourceRef:
            name: git-source-triggers
          value: 9ab5a1234166a89db352afa28f499d596ebb48db
        startTime: "2020-05-04T02:05:07Z"
        steps:
        - container: step-build
          imageID: docker-pullable://golang@sha256:a90f267133
          name: build
          terminated:
            containerID: docker://6b6471f501f59dbb
            exitCode: 0
            finishedAt: "2020-05-04T02:10:45Z"
            reason: Completed
            startedAt: "2020-05-04T02:06:24Z"
Taking the above example, this will be the new minimal PipelineRun status:
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  ...
spec:
  ...
status:
  completionTime: "2020-05-04T02:19:14Z"
  conditions:
  - lastTransitionTime: "2020-05-04T02:19:14Z"
    message: "Tasks Completed: 4, Skipped: 0"
    reason: Succeeded
    status: "True"
    type: Succeeded
  startTime: "2020-05-04T02:00:11Z"
  childReferences:
  - apiVersion: tekton.dev/v1beta1
    kind: TaskRun
    name: triggers-release-nightly-build
    pipelineTaskName: build
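The following sketch shows how a reference entry like the one above could serialize with Go's encoding/json. TypeMeta and ChildRef are local stand-ins written for this example, mirroring only the fields that appear in the minimal status; they are assumptions and not the Tekton types themselves.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TypeMeta is a local stand-in for the API version and kind fields.
type TypeMeta struct {
	APIVersion string `json:"apiVersion,omitempty"`
	Kind       string `json:"kind,omitempty"`
}

// ChildRef is a local stand-in for a minimal child reference. The embedded
// TypeMeta has no JSON name, so encoding/json flattens its fields into the
// parent object.
type ChildRef struct {
	TypeMeta         `json:",inline"`
	Name             string `json:"name,omitempty"`
	PipelineTaskName string `json:"pipelineTaskName,omitempty"`
}

// encode serializes a single reference to JSON.
func encode(r ChildRef) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	ref := ChildRef{
		TypeMeta:         TypeMeta{APIVersion: "tekton.dev/v1beta1", Kind: "TaskRun"},
		Name:             "triggers-release-nightly-build",
		PipelineTaskName: "build",
	}
	out, err := encode(ref)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Each child contributes only these four short fields to the PipelineRun status, regardless of how many Steps the child has or how often their state changes.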
Fetching complete statuses of TaskRuns and Runs
Cluster
If a user is interested in the complete status of a TaskRun or Run, they can
fetch it by its name from the cluster; the name is in the minimal child references.
Taking the example above, this would be the status of the TaskRun:
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  labels:
    app.kubernetes.io/managed-by: tekton-pipelines
    tekton.dev/memberOf: tasks
    tekton.dev/pipeline: triggers-release-nightly
    tekton.dev/pipelineRun: triggers-release-nightly
    tekton.dev/pipelineTask: build
    tekton.dev/task: build
  name: triggers-release-nightly-build
  ownerReferences:
  - apiVersion: tekton.dev/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: PipelineRun
    name: triggers-release-nightly
spec:
  ...
status:
  completionTime: "2020-05-04T02:10:49Z"
  conditions:
  - lastTransitionTime: "2020-05-04T02:10:49Z"
    message: All Steps have completed executing
    reason: Succeeded
    status: "True"
    type: Succeeded
  podName: triggers-release-nightly-build-pod
  resourcesResult:
  - key: commit
    resourceRef:
      name: git-source-triggers
    value: 9ab5a1234166a89db352afa28f499d596ebb48db
  startTime: "2020-05-04T02:05:07Z"
  steps:
  - container: step-build
    imageID: docker-pullable://golang@sha256:a90f267133
    name: build
    terminated:
      containerID: docker://6b6471f501f59dbb
      exitCode: 0
      finishedAt: "2020-05-04T02:10:45Z"
      reason: Completed
      startedAt: "2020-05-04T02:06:24Z"
Results API
If the cluster has been cleaned up, a user can rely on the Results API to get the full
details of the PipelineRun's TaskRuns and Runs. They can use the Tekton CLI plugin
for the Tekton Results API:
tkn-results records list default/results/- --filter name=="triggers-release-nightly"
For more details on fetching Results and Records, see the documentation.
Go Libraries
We will provide functions in the Go libraries to fetch the TaskRuns and Runs of a
given PipelineRun. These functions will be useful for the Tekton Dashboard, Tekton
CLI and other projects using our Go libraries.
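One possible shape for such a helper is sketched below. Everything here is hypothetical: the ChildRef type, the Getter interface, the in-memory memoryGetter fake and the ChildStatuses function are invented for illustration, and a real implementation would sit on top of a Kubernetes client or lister and return full status objects rather than strings.

```go
package main

import (
	"errors"
	"fmt"
)

// ChildRef is a trimmed-down, hypothetical stand-in for a child reference.
type ChildRef struct {
	Kind string
	Name string
}

// Getter is a hypothetical lookup abstraction. Here it returns a short
// status summary for the named object.
type Getter interface {
	Get(kind, name string) (string, error)
}

var errNotFound = errors.New("not found")

// memoryGetter is an in-memory fake used only for this illustration.
type memoryGetter map[string]string

func (m memoryGetter) Get(kind, name string) (string, error) {
	s, ok := m[kind+"/"+name]
	if !ok {
		return "", errNotFound
	}
	return s, nil
}

// ChildStatuses looks up every referenced child and returns the summaries
// keyed by "Kind/Name". Children that no longer exist in the store (for
// example after cleanup) are listed in missing instead of failing the call.
func ChildStatuses(refs []ChildRef, g Getter) (found map[string]string, missing []string, err error) {
	found = make(map[string]string, len(refs))
	for _, r := range refs {
		key := r.Kind + "/" + r.Name
		s, getErr := g.Get(r.Kind, r.Name)
		switch {
		case errors.Is(getErr, errNotFound):
			missing = append(missing, key)
		case getErr != nil:
			return nil, nil, getErr
		default:
			found[key] = s
		}
	}
	return found, missing, nil
}

func main() {
	store := memoryGetter{"TaskRun/triggers-release-nightly-build": "Succeeded"}
	refs := []ChildRef{
		{Kind: "TaskRun", Name: "triggers-release-nightly-build"},
		{Kind: "Run", Name: "already-cleaned-up"},
	}
	found, missing, err := ChildStatuses(refs, store)
	fmt.Println(found, missing, err)
}
```

Reporting missing children separately lets a caller decide whether to fall back to long term storage such as Tekton Results.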
Beta API
Because the PipelineRun status is part of the Pipelines API, removing the full embedded
statuses is backwards incompatible.
To support migration as required by our API compatibility policy,
we will add a behavior flag - embedded-status - used to configure whether the
PipelineRuns should contain:
- the full embedded status of its TaskRuns and Runs - using value full.
- the minimal references to its TaskRuns and Runs - using value minimal.
- both the full embedded status and minimal references of its TaskRuns and Runs; this provides a smoother transition for users and tools - using value both.
Following our policy on updating behavior flags:
- The embedded-status flag will be full by default; users can set it to minimal or both. The existing fields will be deprecated at this point.
- After 9 months in v1beta1, the embedded-status flag will be changed to minimal by default; users can set it to full or both.
- As soon as the next release in v1beta1, the embedded-status flag will be removed as well as the full embedded status fields. In reality, this would take a bit longer (about 3 months) after confirming that users and contributors are ready for the flag to be removed.
Users can opt in to use both at any time, but it is never the default value. It provides
a seamless transition for API clients for a short period needed to upgrade to minimal.
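A minimal sketch of how the three flag values could be validated and mapped to the fields that get populated is shown below. The type and function names (EmbeddedStatus, parseEmbeddedStatus, fieldsToPopulate) are assumptions made for this example and do not describe the actual Tekton configuration code.

```go
package main

import "fmt"

// EmbeddedStatus models the three values of the embedded-status flag.
type EmbeddedStatus string

const (
	Full    EmbeddedStatus = "full"
	Minimal EmbeddedStatus = "minimal"
	Both    EmbeddedStatus = "both"
)

// parseEmbeddedStatus validates a raw flag value. An empty value falls back
// to def, which lets the default change over time without touching callers.
func parseEmbeddedStatus(raw string, def EmbeddedStatus) (EmbeddedStatus, error) {
	switch v := EmbeddedStatus(raw); v {
	case "":
		return def, nil
	case Full, Minimal, Both:
		return v, nil
	default:
		return "", fmt.Errorf("invalid embedded-status %q: want full, minimal or both", raw)
	}
}

// fieldsToPopulate reports which status fields would be written for a value:
// the full embedded status maps, the minimal child references, or both.
func fieldsToPopulate(s EmbeddedStatus) (fullMaps, childReferences bool) {
	return s == Full || s == Both, s == Minimal || s == Both
}

func main() {
	for _, raw := range []string{"", "minimal", "both", "true"} {
		s, err := parseEmbeddedStatus(raw, Full)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		full, refs := fieldsToPopulate(s)
		fmt.Printf("%q -> %s (full maps: %t, childReferences: %t)\n", raw, s, full, refs)
	}
}
```

Modeling the flag as a string type rather than a boolean leaves room for the both value and for any further values that might be needed later.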
Alternatives
V1 API
In V1, we will have the minimal references - ChildReferences - to TaskRuns and Runs in
PipelineRuns:
type PipelineRunStatusFields struct {
    StartTime *metav1.Time `json:"startTime,omitempty"`
    CompletionTime *metav1.Time `json:"completionTime,omitempty"`
    ChildReferences []ChildStatusReference `json:"childReferences,omitempty"`
    PipelineResults []PipelineRunResult `json:"pipelineResults,omitempty"`
    PipelineSpec *PipelineSpec `json:"pipelineSpec,omitempty"`
    SkippedTasks []SkippedTask `json:"skippedTasks,omitempty"`
}
The full embedded statuses of TaskRuns and Runs will not be available in PipelineRuns.
Read more about V1 in TEP-0096: Pipelines V1 API.
Tekton Projects
Tekton Pipelines
The PipelineRun controller currently fetches TaskRuns, whether from etcd or from a
cache, on each reconcile loop. The TaskRuns and Runs fields in PipelineRunStatus
are populated from the Resolved PipelineRunTaskRuns ("rprts") in PipelineRunState.
The direct uses of pipelineRun.Status.TaskRuns and pipelineRun.Status.Runs fields in
the PipelineRun controller would need to be updated to use the TaskRuns and Runs
from the Resolved PipelineRunTaskRuns ("rprts"). For example:
- Cancellation: Implementation uses only the TaskRun names from the full embedded status, which are still available in the minimal references. See code for details.
- Pipeline Results: Implementation uses the TaskRuns from the PipelineRun status. This will be updated to use TaskRuns directly. See code for details.
- Retries: Implementation already uses the TaskRuns from Resolved PipelineRunTaskRuns ("rprts") in PipelineRunState. See code for details.
Making the needed updates is an implementation detail that we will figure out in the relevant pull requests.
Tekton Results
As described in the background section, the Results API enables
users to bundle TaskRuns and Runs to their parent PipelineRuns. It also provides
long term storage of resources. Users can rely on Tekton Results to
provide the mapping that was available in the full embedded statuses.
Note that Results API is still in alpha, but progress is being made towards beta - we estimate that the Results API will be in beta by the time we remove the full embedded statuses.
Tekton Dashboard
Tekton Dashboard shows the status of the TaskRuns and Runs of a given
PipelineRun, and this should continue to be supported. The Tekton Dashboard currently
relies on the full embedded statuses, including when the scheduled cleanup of resources
has removed TaskRuns and Runs from the cluster. The Dashboard will need to be updated
to use the minimal references and rely on Tekton Results for long term storage (read
more in related issue).
We expect the load on the Dashboard to decrease and its performance to improve, given
that PipelineRun statuses would no longer be updated on every Step update.
Tekton Chains
Tekton Chains observes TaskRuns and signs them directly; it doesn't depend
on the full embedded status in PipelineRun status. TEP-0084 proposes that
Tekton Chains starts to sign PipelineRuns - it involves creating a single attestation
record upon completion of a PipelineRun that includes all TaskRuns, the PipelineRun,
and the event-payload instead of a record for each of them. We will ensure that the
proposal in TEP-0084 aligns with the changes to PipelineRuns proposed in this TEP.
Design Evaluation
- API conventions: This design complies with the Kubernetes API conventions by using sub-objects instead of maps for fields, and using string aliases instead of booleans for behavior flags.
- Simplicity: This design simplifies the PipelineRun status by providing the minimum information and updates needed from TaskRuns and Runs.
- Reusability: This design encourages reuse of existing components, such as Owner References and Tekton Results, by removing the duplication caused by embedding the complete statuses of TaskRuns and Runs.
- Flexibility: This design improves the extensibility of Tekton Pipelines to support upcoming features that create multiple TaskRuns, Runs or PipelineRuns from a single PipelineTask. The behavior flag is also flexible to support more configurations if needed.
- Conformance: This design impacts the conformance surface through changes to the PipelineRun interface. The changes are backwards incompatible but will be introduced in a backwards compatible manner first with migration instructions and deprecation warnings.
Alternatives
Add Minimal Embedded Status for TaskRuns and Runs
Instead of using the same field to hold references to both TaskRuns and Runs, we could use a
separate field for each. We would introduce two new structs to store the minimal status of
TaskRuns and Runs in the PipelineRun status, with the names only:
type PipelineRunTaskRunMinimalStatus struct {
    PipelineTaskName string `json:"pipelineTaskName,omitempty"`
    TaskRunName string `json:"taskRunName,omitempty"`
}

type PipelineRunRunMinimalStatus struct {
    PipelineTaskName string `json:"pipelineTaskName,omitempty"`
    RunName string `json:"runName,omitempty"`
}
We would then add the new fields to PipelineRunStatusFields as follows:
type PipelineRunStatusFields struct {
    ...
    TaskRuns map[string]*PipelineRunTaskRunStatus `json:"taskRuns,omitempty"`
    TaskRunsStatuses []PipelineRunTaskRunMinimalStatus `json:"taskRunsStatuses,omitempty"`
    Runs map[string]*PipelineRunRunStatus `json:"runs,omitempty"`
    RunsStatuses []PipelineRunRunMinimalStatus `json:"runsStatuses,omitempty"`
    ...
}
An example PipelineRun status might look like this:
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  ...
spec:
  ...
status:
  completionTime: "2020-05-04T02:19:14Z"
  conditions:
  - lastTransitionTime: "2020-05-04T02:19:14Z"
    message: "Tasks Completed: 4, Skipped: 0"
    reason: Succeeded
    status: "True"
    type: Succeeded
  startTime: "2020-05-04T02:00:11Z"
  taskRunsStatuses:
  - taskRunName: triggers-release-nightly-build
    pipelineTaskName: build
Discussion
While this approach makes it easy to identify TaskRuns vs Runs, as they are in
separate fields, it'd require us to add a new field for new types thus limiting the
extensibility. For example, we may allow PipelineRuns to have child PipelineRuns
in our implementation of Pipelines in Pipelines.
Add Minimal Embedded Status - Use Map
Use maps for the new fields, as we do with the existing fields:
type PipelineRunStatusFields struct {
    ...
    TaskRuns map[string]*PipelineRunTaskRunStatus `json:"taskRuns,omitempty"`
    TaskRunsStatuses map[string]*PipelineRunTaskRunMinimalStatus `json:"taskRunsStatuses,omitempty"`
    Runs map[string]*PipelineRunRunStatus `json:"runs,omitempty"`
    RunsStatuses map[string]*PipelineRunRunMinimalStatus `json:"runsStatuses,omitempty"`
    ...
}
Taking the example above, this would be the PipelineRun status:
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  ...
spec:
  ...
status:
  completionTime: "2020-05-04T02:19:14Z"
  conditions:
  - lastTransitionTime: "2020-05-04T02:19:14Z"
    message: "Tasks Completed: 4, Skipped: 0"
    reason: Succeeded
    status: "True"
    type: Succeeded
  startTime: "2020-05-04T02:00:11Z"
  taskRunsStatuses:
    triggers-release-nightly-build:
Discussion
While this approach is consistent with existing code, it does not comply with the Kubernetes API conventions that recommend against maps. The main problem with maps is:
The crux of maps is that it isn't clear to the user what "left-hand side strings" are "magic keywords" in the config system/API vs. which are user data.
Maps also make it hard to use other keys to identify the resource. We use names
today, but may want to use Namespace, Cluster or other fields later.
Add Minimal Embedded Status - Include TaskRun and Run status
In this approach, we would store the conditionSucceeded field of the TaskRuns and Runs
in the PipelineRun status, for example:
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  ...
spec:
  ...
status:
  completionTime: "2020-05-04T02:19:14Z"
  conditions:
  - lastTransitionTime: "2020-05-04T02:19:14Z"
    message: "Tasks Completed: 4, Skipped: 0"
    reason: Succeeded
    status: "True"
    type: Succeeded
  startTime: "2020-05-04T02:00:11Z"
  resources:
  - apiVersion: v1beta1
    kind: TaskRun
    name: triggers-release-nightly-build
    conditions:
    - lastTransitionTime: "2020-05-04T02:10:49Z"
      message: All Steps have completed executing
      reason: Succeeded
      status: "True"
      type: Succeeded
Discussion
This approach uses more storage than the proposed solution, and the status of the child object can easily be fetched by reference.
Beta API - Use Booleans
We could add a behavior flag - enable-full-embedded-status - used to configure
whether PipelineRuns should contain the embedded status or minimal references of
its TaskRuns and Runs.
Following our policy on updating behavior flags:
- The enable-full-embedded-status flag will be true by default.
- After 9 months in v1beta1, the enable-full-embedded-status flag will be flipped to false by default.
- As soon as the next release in v1beta1, the enable-full-embedded-status flag will be removed as well as the full embedded status fields.
Discussion
While a boolean behavior flag covers the options we need today, the Kubernetes API conventions warn "think twice about boolean fields" because "many ideas start as boolean but eventually trend towards a small set of mutually exclusive options". A boolean could not express a third state in which both the full embedded statuses and the minimal references are present while users migrate. Therefore, we prefer string aliases.
Beta API - Default to Full then Both then Minimal
To provide a smoother migration as required by our API compatibility policy,
we would add a behavior flag - embedded-status - used to configure whether the
PipelineRuns should contain:
- the full embedded status of its TaskRuns and Runs
- the minimal embedded status of its TaskRuns and Runs
- both the full status and minimal references of its TaskRuns and Runs
Following our policy on updating behavior flags:
- The embedded-status flag will be full by default; users can set it to minimal or both.
- After 2 months in v1beta1, the embedded-status flag will be changed to both by default; users can set it to full or minimal.
- After 7 more months in v1beta1, the embedded-status flag will be changed to minimal by default; users can set it to full or both.
- As soon as the next release in v1beta1, the embedded-status flag will be removed as well as the full embedded status fields. In reality, this would take a bit longer after confirming that users and contributors are ready for the flag to be removed.
Discussion
While this approach gives users more control by allowing them to receive both the full and minimal references, it causes more duplication and worsens the problems described above. This remains an option we can support later if we receive feedback that users need it for smoother migration, and the proposal is set up to easily support this expansion.
References
- Issues:
- TEPs:
- Tekton Results
- Pull Requests:
  - [TEP-0100] Fields/flags/docs for embedded TaskRun and Run statuses in PipelineRuns
  - [TEP-0100] Prepare for testing of minimal status implementation
  - [TEP-0100] Switch ApplyTaskResultsToPipelineResults to not use status maps
  - [TEP-0100] Add functionality to be used in supporting minimal embedded status
  - [TEP-0100] Add new updatePipelineRunStatusFromChildRefs function
  - [TEP-0100] Implementation for embedded TaskRun and Run statuses in PipelineRuns