Full-fledged Kubeflow deployment on Google Cloud

This guide describes how to deploy Kubeflow and a series of Kubeflow components on Google Kubernetes Engine (GKE).


Kubeflow deployed on Google Cloud includes the following:

  1. Full-fledged multi-user Kubeflow running on Google Kubernetes Engine.
  2. Cluster Autoscaler with automatic resizing of the node pool.
  3. Cloud Endpoints integrated with Identity-Aware Proxy (IAP).
  4. GPU and Cloud TPU accelerated nodes available for your machine learning (ML) workloads.
  5. Cloud Logging for easy debugging and troubleshooting.
  6. Other managed services offered by Google Cloud, such as Cloud Storage, Cloud SQL, Anthos Service Mesh, Identity and Access Management (IAM), and Config Controller.

Kubeflow on Google Cloud: central dashboard

Figure 1. User interface of full-fledged Kubeflow deployment on Google Cloud.

Management cluster

Kubeflow on Google Cloud uses a management cluster, which lets you manage Google Cloud resources through Config Controller. The management cluster is separate from the Kubeflow clusters it manages. You can also use a management cluster from a different Google Cloud project by assigning owner permissions to the associated service account.

Kubeflow on Google Cloud: clusters hierarchy

Figure 2. Example of Kubeflow on Google Cloud deployment.

Deployment process

To set up a Kubeflow environment on Google Cloud, complete these steps:

  1. Set up a Google Cloud project.
  2. Set up an OAuth client.
  3. Deploy the management cluster.
  4. Deploy the Kubeflow cluster.
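
The steps above are meant to be completed in order, since each one builds on the previous. As a rough illustration only, the ordering can be sketched as a small checklist in Python. The DeploymentChecklist class and its methods are hypothetical names invented for this sketch; they are not part of Kubeflow or any Google Cloud tooling, and the sketch does not perform any deployment itself.

```python
from dataclasses import dataclass, field

# Step names mirror the numbered list above; the tracker is illustrative only.
DEPLOYMENT_STEPS = (
    "Set up Google Cloud project",
    "Set up OAuth client",
    "Deploy management cluster",
    "Deploy Kubeflow cluster",
)


@dataclass
class DeploymentChecklist:
    """Hypothetical helper that enforces the documented step order."""

    completed: list = field(default_factory=list)

    def next_step(self):
        """Return the next pending step, or None when all steps are done."""
        if len(self.completed) < len(DEPLOYMENT_STEPS):
            return DEPLOYMENT_STEPS[len(self.completed)]
        return None

    def complete(self, step):
        """Mark a step as done; reject steps attempted out of order."""
        expected = self.next_step()
        if expected is None:
            raise ValueError("all steps are already complete")
        if step != expected:
            raise ValueError(f"cannot run {step!r} before {expected!r}")
        self.completed.append(step)


checklist = DeploymentChecklist()
checklist.complete("Set up Google Cloud project")
print(checklist.next_step())  # Set up OAuth client
```

Attempting to complete "Deploy Kubeflow cluster" before the management cluster is marked done raises a ValueError, which mirrors the dependency of the Kubeflow cluster on the management cluster.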

For common issues encountered during these deployment steps and approaches to debugging them, see troubleshooting deployments. If your issue isn’t listed there, report a bug at googlecloudplatform/kubeflow-distribution.

Next steps


Last modified January 3, 2023: Fix broken links (ef94489)