Version 0.30.0
May 17, 2021
Major Features and Improvements
- Upgraded TFX to KFP compiler to use KFP IR schema version 2.0.0.
- InfraValidator can now produce a SavedModel with warmup requests. This feature is enabled by setting `RequestSpec.make_warmup = True`. The SavedModel will be stored in the InfraBlessing artifact (`blessing` output of InfraValidator).
- Pusher's `model` input is now optional, and `infra_blessing` can be used instead to push the SavedModel with warmup requests, produced by an InfraValidator. Note that InfraValidator does not always create a SavedModel, and the producer InfraValidator must be configured with `RequestSpec.make_warmup = True` in order for its output to be pushed by a Pusher.
- Support is added for the JSON_VALUE artifact property type, allowing storage of JSON-compatible objects as artifact metadata.
- Support is added for the KFP v2 artifact metadata field when executing using the KFP v2 container entrypoint.
- InfraValidator for Kubernetes can now override the Pod manifest to customize annotations and environment variables.
- Allow Beam pipeline args to be extended by specifying `beam_pipeline_args` per component.
- Support string RuntimeParameters on Airflow.
- User code specified through the `module_file` argument for the Evaluator, Transform, Trainer and Tuner components is now packaged as a pip wheel for execution. For Evaluator and Transform, these wheel packages are now installed on remote Apache Beam workers.
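The InfraValidator and Pusher items above can be wired together roughly as follows. This is a minimal sketch, assuming upstream `example_gen` and `trainer` components already exist in the pipeline. The helper name `build_warmup_serving_components`, the `serving_dir` parameter, the `num_examples=3` value, and the local Docker serving setup are illustrative choices, not something these release notes prescribe.

```python
def build_warmup_serving_components(example_gen, trainer, serving_dir):
    """Returns (infra_validator, pusher) wired to push a SavedModel with warmup.

    Hypothetical helper: `example_gen` and `trainer` are assumed to be existing
    upstream TFX components, and `serving_dir` an assumed output directory.
    """
    # Imported lazily so this sketch can be loaded without TFX installed.
    from tfx.components import InfraValidator, Pusher
    from tfx.proto import infra_validator_pb2, pusher_pb2

    infra_validator = InfraValidator(
        model=trainer.outputs['model'],
        examples=example_gen.outputs['examples'],
        serving_spec=infra_validator_pb2.ServingSpec(
            tensorflow_serving=infra_validator_pb2.TensorFlowServing(
                tags=['latest']),
            local_docker=infra_validator_pb2.LocalDockerConfig(),
        ),
        request_spec=infra_validator_pb2.RequestSpec(
            tensorflow_serving=infra_validator_pb2.TensorFlowServingRequestSpec(),
            num_examples=3,
            # Without this flag InfraValidator does not emit a SavedModel,
            # so Pusher would have nothing to push from `infra_blessing`.
            make_warmup=True,
        ),
    )

    # `model` is omitted: the SavedModel comes from the InfraBlessing artifact.
    pusher = Pusher(
        infra_blessing=infra_validator.outputs['blessing'],
        push_destination=pusher_pb2.PushDestination(
            filesystem=pusher_pb2.PushDestination.Filesystem(
                base_directory=serving_dir)),
    )
    return infra_validator, pusher
```

Both returned components would then be added to the pipeline's component list alongside the upstream ones.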
Breaking Changes
For Pipeline Authors
- CLI usage with kubeflow changed significantly. You MUST use the new `--build-image` flag to build a container image when updating a pipeline with the kubeflow engine.
  - The `--build-target-image` flag in the CLI is changed to `--build-image` without any container image argument. TFX will auto-detect the image specified in the KubeflowDagRunnerConfig class instance. For example:
    `tfx pipeline create --pipeline-path=runner.py --endpoint=xxx --build-image`
    `tfx pipeline update --pipeline-path=runner.py --endpoint=xxx --build-image`
  - The `--package-path` and `--skaffold_cmd` flags were deleted. The compiled path can be specified when creating a KubeflowDagRunner class instance. The TFX CLI no longer depends on skaffold and uses the Docker SDK directly.
- The default orchestration engine of the CLI was changed from the `beam` orchestrator to the `local` orchestrator. You can still use the `beam` orchestrator with the `--engine=beam` flag.
- Trainer now uses GenericExecutor as default. To use the previous Estimator based Trainer, please set custom_executor_spec to trainer.executor.Executor.
- Changed the pattern spec supported for QueryBasedDriver:
  - @span_begin_timestamp: Start of span interval, Timestamp in seconds.
  - @span_end_timestamp: End of span interval, Timestamp in seconds.
  - @span_yyyymmdd_utc: STRING with format, e.g., '20180114', corresponding to the span interval begin in UTC.
- Removed the already deprecated compile() method on Kubeflow V2 Dag Runner.
- Removed the unused project_id argument from KubeflowV2DagRunnerConfig; it is meaningless when not used with GCP.
- Removed config from LocalDagRunner's constructor, and dropped pipeline proto support from LocalDagRunner's run function.
- Removed the input parameter in the ExampleGen constructor and external_input in dsl_utils, both of which were deprecated in TFX 0.23.
- Changed the storage type of the `span` and `version` custom properties in the Examples artifact from string to int.
- The `ResolverStrategy.resolve_artifacts()` method signature has changed to take an `ml_metadata.MetadataStore` object as the first argument.
- Artifacts param is deprecated/ignored in Channel constructor.
- Removed matching_channel_name from Channel's constructor.
- Deleted all usages of instance_name, which was deprecated in version 0.25.0. Please use .with_id() method of components.
- Removed output channel overwrite functionality from all official components.
- Transform will use the native TF2 implementation of tf.transform unless TF2 behaviors are explicitly disabled. The previous behaviour can still be obtained by setting `force_tf_compat_v1=True`.
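To make the QueryBasedDriver pattern spec listed above concrete, the following sketch shows what each placeholder would stand for given a span interval. It is an illustration only, not TFX's implementation: the function names `span_placeholders` and `render_query` are hypothetical, and rendering `@span_yyyymmdd_utc` as a single-quoted string literal is an assumption based on the '20180114' example in the spec.

```python
from datetime import datetime, timezone


def span_placeholders(span_begin_ts, span_end_ts):
    """Maps each pattern placeholder to its value for one span interval.

    Both arguments are Unix timestamps in seconds (hypothetical helper).
    """
    begin_utc = datetime.fromtimestamp(span_begin_ts, tz=timezone.utc)
    return {
        '@span_begin_timestamp': str(span_begin_ts),
        '@span_end_timestamp': str(span_end_ts),
        # Assumption: substituted as a quoted string such as '20180114'.
        '@span_yyyymmdd_utc': "'" + begin_utc.strftime('%Y%m%d') + "'",
    }


def render_query(pattern, span_begin_ts, span_end_ts):
    """Substitutes every placeholder found in `pattern`."""
    query = pattern
    for name, value in span_placeholders(span_begin_ts, span_end_ts).items():
        query = query.replace(name, value)
    return query


# One-day span starting 2018-01-14 00:00:00 UTC.
print(render_query(
    "t >= @span_begin_timestamp AND t < @span_end_timestamp",
    1515888000, 1515974400))
print(render_query("d = @span_yyyymmdd_utc", 1515888000, 1515974400))
```

The column names `t` and `d` in the sample patterns are placeholders for whatever the real query uses.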
For Component Authors
- N/A
Deprecations
- RuntimeParameter usage for `module_file` and user-defined function paths is marked experimental.
- `LatestArtifactsResolver`, `LatestBlessedModelResolver`, `SpansResolver` are renamed to `LatestArtifactStrategy`, `LatestBlessedModelStrategy`, `SpanRangeStrategy` respectively.
Bug Fixes and Other Changes
- GCP compute project in BigQuery Pusher executor can be specified.
- New extra dependencies for convenience.
  - tfx[airflow] installs all Apache Airflow orchestrator dependencies.
  - tfx[kfp] installs all Kubeflow Pipelines orchestrator dependencies.
  - tfx[tf-ranking] installs packages for TensorFlow Ranking. NOTE: TensorFlow Ranking is only compatible with TF >= 2.0.
- Depends on `google-cloud-bigquery>=1.28.0,<3`. (This was already installed as a transitive dependency from the first release of TFX.)
- Depends on `google-cloud-aiplatform>=0.5.0,<0.8`.
- Depends on `ml-metadata>=0.30.0,<0.31.0`.
- Depends on `portpicker>=1.3.1,<2`.
- Depends on `struct2tensor>=0.30.0,<0.31.0`.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.5.*,<3`.
- Depends on `tensorflow-data-validation>=0.30.0,<0.31.0`.
- Depends on `tensorflow-model-analysis>=0.30.0,<0.31.0`.
- Depends on `tensorflow-serving-api>=1.15,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.5.*,<3`.
- Depends on `tensorflow-transform>=0.30.0,<0.31.0`.
- Depends on `tfx-bsl>=0.30.0,<0.31.0`.
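The `tensorflow` constraint above combines a lower bound, several excluded release series, and an upper bound, which can be hard to read at a glance. The sketch below evaluates plain `X.Y.Z` versions against it. It is a deliberately simplified checker written for illustration (the names `satisfies` and `TF_SPEC` are hypothetical); it handles only the `>=`, `<`, and `!=` operators used here and ignores pre-releases. For real use, `SpecifierSet` from the `packaging` library implements the full version-matching rules.

```python
TF_SPEC = ">=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.5.*,<3"


def _parse(version, width=3):
    """Turns '2.4' into (2, 4, 0); pads with zeros up to `width` parts."""
    parts = [int(p) for p in version.split('.')]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def satisfies(version, spec=TF_SPEC):
    """Returns True if a plain X.Y.Z version meets every clause in `spec`."""
    v = _parse(version)
    for clause in spec.split(','):
        clause = clause.strip()
        if clause.startswith('>='):
            if v < _parse(clause[2:]):
                return False
        elif clause.startswith('!='):
            target = clause[2:]
            if target.endswith('.*'):
                # Prefix match: '!=2.5.*' excludes every 2.5.x release.
                prefix = tuple(int(p) for p in target[:-2].split('.'))
                if v[:len(prefix)] == prefix:
                    return False
            elif v == _parse(target):
                return False
        elif clause.startswith('<'):
            if not v < _parse(clause[1:]):
                return False
        else:
            raise ValueError('unsupported clause: ' + clause)
    return True


for candidate in ['1.15.0', '1.15.2', '2.3.4', '2.4.1', '2.5.0', '3.0.0']:
    print(candidate, satisfies(candidate))
```

Under this reading, the accepted TensorFlow releases are 1.15.2 and later in the 1.x line, the 2.4 series, and 2.6 or later below 3.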
Documentation Updates
- N/A