# Deploy to a Google Kubernetes Engine (GKE) cluster
Follow the instructions below to set up a Google Kubernetes Engine (GKE) cluster and a container image repository in Artifact Registry for building and running the controller.
## Costs

In this document, you use the following billable components of Google Cloud: Artifact Registry, Compute Engine, and Google Kubernetes Engine (GKE).
To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.
When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.
## Before you begin

-   Install the Google Cloud SDK.

-   Configure authorization and a base set of properties for the `gcloud` command-line tool. Choose a project that has billing enabled.

-   Install `kubectl` and `gke-gcloud-auth-plugin`:

    `gke-gcloud-auth-plugin` enables `kubectl` to authenticate to GKE clusters using credentials obtained using `gcloud`.
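    If you installed the Google Cloud SDK with its bundled component manager, one way to install both tools is the following sketch (package-manager-based installations differ):

    ```shell
    # Installs kubectl and the GKE auth plugin as Google Cloud SDK components.
    gcloud components install kubectl gke-gcloud-auth-plugin
    ```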
-   To build the binary and the container image for the controller, install all of the following:
-   Set the Google Cloud project you want to use:
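    One way to do this is with `gcloud config` (a sketch, not necessarily the exact command from the original guide):

    ```shell
    gcloud config set project PROJECT_ID
    ```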
    Replace `PROJECT_ID` with the project ID of the Google Cloud project you want to use.
-   Enable the Artifact Registry and GKE APIs:
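    A sketch using `gcloud services enable` with the Artifact Registry and GKE API service names:

    ```shell
    gcloud services enable \
      artifactregistry.googleapis.com \
      container.googleapis.com
    ```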
-   Clone the Git repository and navigate to the directory `projects/k8s-hybrid-neg-controller`.
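    A sketch of this step, where `REPOSITORY_URL` and `REPOSITORY_DIRECTORY` are placeholders for the repository's URL and local directory name:

    ```shell
    # REPOSITORY_URL and REPOSITORY_DIRECTORY are placeholders; use the Git
    # repository that contains the k8s-hybrid-neg-controller project.
    git clone REPOSITORY_URL
    cd REPOSITORY_DIRECTORY/projects/k8s-hybrid-neg-controller
    ```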
## Firewall rules

-   If this is a temporary project, create a firewall rule that allows all TCP, UDP, and ICMP traffic within your VPC network:

    ```shell
    gcloud compute firewall-rules create allow-internal \
      --allow tcp,udp,icmp \
      --network default \
      --source-ranges "10.0.0.0/8"
    ```

    If you are unable to create such a wide rule, you can instead create more specific firewall rules that only allow traffic between your GKE cluster nodes.
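    For example, a narrower rule could match only instances that carry the `hybrid-neg-cluster-node` network tag applied to the cluster nodes later in this guide (a sketch; the rule name is illustrative):

    ```shell
    gcloud compute firewall-rules create allow-gke-node-traffic \
      --allow tcp,udp,icmp \
      --network default \
      --source-tags hybrid-neg-cluster-node \
      --target-tags hybrid-neg-cluster-node
    ```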
-   Create a firewall rule that allows health checks from Google Cloud Load Balancers to Compute Engine instances in your VPC network that have the `allow-health-checks` network tag:
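    A sketch of such a rule, using Google Cloud's documented health check source ranges (the rule name is illustrative):

    ```shell
    gcloud compute firewall-rules create allow-health-checks \
      --allow tcp \
      --network default \
      --source-ranges "35.191.0.0/16,130.211.0.0/22" \
      --target-tags allow-health-checks
    ```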
## Artifact Registry setup

-   Define environment variables that you use when creating the Artifact Registry container image repository:

    ```shell
    REGION=us-west1
    AR_LOCATION="$REGION"
    AR_REPOSITORY=hybrid-neg
    PROJECT_ID="$(gcloud config get project 2> /dev/null)"
    PROJECT_NUMBER="$(gcloud projects describe $PROJECT_ID --format 'value(projectNumber)')"
    ```

    Note the following about the environment variables:

    -   `REGION`: the Compute Engine region where you have or will create your GKE cluster.
    -   `AR_LOCATION`: an Artifact Registry location. To reduce network cost, you can use the region of your GKE cluster. If you have GKE clusters in multiple regions, consider using a multi-region location such as `us`.
    -   `AR_REPOSITORY`: the repository name. You can use a different name if you like.
    -   `PROJECT_ID`: the project ID of your Google Cloud project.
    -   `PROJECT_NUMBER`: the automatically generated project number of your Google Cloud project.
-   Create a container image repository in Artifact Registry:
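    A sketch using the environment variables defined above and the `docker` repository format:

    ```shell
    gcloud artifacts repositories create "$AR_REPOSITORY" \
      --location "$AR_LOCATION" \
      --repository-format docker
    ```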
-   Configure authentication for `gcloud` and other command-line tools to the Artifact Registry host of your repository location:
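    A sketch using the `gcloud` credential helper for the Artifact Registry Docker host of your location:

    ```shell
    gcloud auth configure-docker "${AR_LOCATION}-docker.pkg.dev"
    ```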
-   Grant the Artifact Registry Reader role on the container image repository to the IAM service account assigned to the GKE cluster nodes. By default, this is the Compute Engine default service account:
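    A sketch, assuming the cluster nodes use the Compute Engine default service account:

    ```shell
    gcloud artifacts repositories add-iam-policy-binding "$AR_REPOSITORY" \
      --location "$AR_LOCATION" \
      --member "serviceAccount:${PROJECT_NUMBER}-compute@developer.gserviceaccount.com" \
      --role roles/artifactregistry.reader
    ```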
## Create the Google Kubernetes Engine (GKE) cluster

-   Create the GKE cluster:

    ```shell
    gcloud container clusters create hybrid-neg \
      --enable-dataplane-v2 \
      --enable-ip-alias \
      --enable-l4-ilb-subsetting \
      --enable-master-global-access \
      --enable-private-nodes \
      --gateway-api standard \
      --location "$REGION" \
      --network default \
      --release-channel rapid \
      --subnetwork default \
      --workload-pool "${PROJECT_ID}.svc.id.goog" \
      --enable-autoscaling \
      --max-nodes 3 \
      --min-nodes 1 \
      --num-nodes 1 \
      --scopes cloud-platform,userinfo-email \
      --tags allow-health-checks,hybrid-neg-cluster-node \
      --workload-metadata GKE_METADATA

    kubectl config set-context --current --namespace=hybrid-neg-system
    ```
-   Allow access to the cluster API server from your current public IP address, and from private IP addresses in your VPC network:

    ```shell
    PUBLIC_IP="$(dig TXT +short o-o.myaddr.l.google.com @ns1.google.com | sed 's/"//g')"

    gcloud container clusters update hybrid-neg \
      --enable-master-authorized-networks \
      --location "$REGION" \
      --master-authorized-networks "${PUBLIC_IP}/32,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"
    ```

    If you want to allow access from other IP address ranges, or if you use non-RFC1918 IPv4 address ranges for your GKE cluster nodes and/or Pods, add those address ranges to the `--master-authorized-networks` flag.
## Configure Workload Identity Federation for GKE

Allow the controller manager to authenticate to Google Cloud APIs by using Workload Identity Federation for GKE to grant an IAM role to the controller's Kubernetes service account.
-   Create a custom IAM role with permission to manage zonal network endpoint groups (NEGs):

    ```shell
    gcloud iam roles create compute.networkEndpointGroupAdmin \
      --description "Full control of zonal Network Endpoint Groups (NEGs)" \
      --permissions "compute.instances.use,compute.networkEndpointGroups.attachNetworkEndpoints,compute.networkEndpointGroups.create,compute.networkEndpointGroups.createTagBinding,compute.networkEndpointGroups.delete,compute.networkEndpointGroups.deleteTagBinding,compute.networkEndpointGroups.detachNetworkEndpoints,compute.networkEndpointGroups.get,compute.networkEndpointGroups.list,compute.networkEndpointGroups.listEffectiveTags,compute.networkEndpointGroups.listTagBindings,compute.networkEndpointGroups.use,compute.zones.list" \
      --project $PROJECT_ID \
      --stage GA \
      --title "Zonal Network Endpoint Groups Admin"
    ```

    This custom role provides permissions to manage zonal network endpoint groups using the Compute Engine API.

    You can create the custom role at the organization level instead of at the project level, by replacing the `--project` flag with the `--organization` flag and your organization resource ID.

    You can use predefined roles, such as the Kubernetes Engine Service Agent role (`container.serviceAgent`), instead of creating a custom role. However, the predefined roles typically provide additional permissions that aren't needed to manage zonal NEGs.
-   Grant the custom IAM role on the Google Cloud project to the `hybrid-neg-controller-manager` Kubernetes service account in the `hybrid-neg-system` namespace:

    ```shell
    gcloud projects add-iam-policy-binding $PROJECT_ID \
      --member "principal://iam.googleapis.com/projects/${PROJECT_NUMBER}/locations/global/workloadIdentityPools/${PROJECT_ID}.svc.id.goog/subject/ns/hybrid-neg-system/sa/hybrid-neg-controller-manager" \
      --role projects/$PROJECT_ID/roles/compute.networkEndpointGroupAdmin
    ```
## Configure the controller

-   Create a patch that sets the name of your VPC network on Google Cloud as an environment variable in the controller manager Pod spec:

    ```shell
    export NETWORK=VPC_NETWORK

    eval "echo \"$(cat k8s/components/google-cloud-vpc-network/patch-google-cloud-vpc-network.yaml.template)\"" \
      > k8s/components/google-cloud-vpc-network/patch-google-cloud-vpc-network.yaml
    ```

    Replace `VPC_NETWORK` with the name of the VPC network you want the controller to use.

    You can list the VPC networks in your project with this command:
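    A sketch using the standard `gcloud` command for listing VPC networks:

    ```shell
    gcloud compute networks list
    ```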
## Deploy the hybrid NEG controller

-   Create and export an environment variable called `SKAFFOLD_DEFAULT_REPO` to point to your container image registry:
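    A sketch, assuming the Artifact Registry repository created earlier in this guide:

    ```shell
    # Skaffold pushes the controller manager image to this registry path.
    export SKAFFOLD_DEFAULT_REPO="${AR_LOCATION}-docker.pkg.dev/${PROJECT_ID}/${AR_REPOSITORY}"
    ```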
-   Build the controller manager container image, render the manifests, deploy to the GKE cluster, and tail the logs:
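    The exact invocation depends on the profiles and modules defined in the repository's `skaffold.yaml`; a minimal sketch:

    ```shell
    # Builds, deploys, and streams logs from the deployed Pods.
    skaffold run --tail
    ```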
## Verify that the controller can create hybrid NEGs

-   Create a Kubernetes Deployment resource with Pods running nginx, and expose them using a Kubernetes Service that has the `solutions.cloud.google.com/hybrid-neg` annotation:
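    A sketch of such a Deployment and Service; the annotation value shown here is hypothetical, so consult the controller's documentation for the exact format:

    ```shell
    kubectl apply --filename - <<EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx
            ports:
            - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
      annotations:
        # Hypothetical annotation value; the controller defines the real format.
        solutions.cloud.google.com/hybrid-neg: '{"exposed-ports": {"80": {}}}'
    spec:
      selector:
        app: nginx
      ports:
      - name: http
        port: 80
    EOF
    ```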
Verify that the controller created one hybrid NEG in each of the Compute Engine zones
us-west1-{a,b,c}
:gcloud compute network-endpoint-groups list \ --filter 'name=nginx-80 AND networkEndpointType:NON_GCP_PRIVATE_IP_PORT'
The output looks similar to the following:
-   Verify that the hybrid NEGs in zones `us-west1-{a,b,c}` have two `networkEndpoints` in total:

    ```shell
    for zone in us-west1-a us-west1-b us-west1-c ; do
      gcloud compute network-endpoint-groups list-network-endpoints nginx-80 \
        --format yaml \
        --zone $zone
    done
    ```

    The output looks similar to the following:
## Verify that the controller can delete hybrid NEGs

-   Remove the `solutions.cloud.google.com/hybrid-neg` annotation from the `nginx` Kubernetes Service:
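    A sketch using `kubectl annotate` with a trailing `-`, which removes the annotation:

    ```shell
    kubectl annotate service nginx solutions.cloud.google.com/hybrid-neg-
    ```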
-   Verify that the controller deleted the hybrid NEGs:

    ```shell
    gcloud compute network-endpoint-groups list \
      --filter 'name=nginx-80 AND networkEndpointType:NON_GCP_PRIVATE_IP_PORT'
    ```

    The output matches the following:

    It may take a few seconds for the controller to delete the hybrid NEGs.
## Troubleshoot
If you run into problems, please review the troubleshooting guide.
## Clean up

-   Set up environment variables:
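    Likely the same variables defined earlier in this guide; a sketch:

    ```shell
    REGION=us-west1
    AR_LOCATION="$REGION"
    AR_REPOSITORY=hybrid-neg
    PROJECT_ID="$(gcloud config get project 2> /dev/null)"
    PROJECT_NUMBER="$(gcloud projects describe $PROJECT_ID --format 'value(projectNumber)')"
    ```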
-   Undeploy the controller manager from the GKE cluster:
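    Since the controller was deployed with Skaffold, a sketch of the corresponding cleanup command:

    ```shell
    # Deletes the Kubernetes resources that `skaffold run` created.
    skaffold delete
    ```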
-   Delete the GKE cluster:
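    A sketch, assuming the cluster name and `REGION` used earlier:

    ```shell
    gcloud container clusters delete hybrid-neg \
      --location "$REGION" \
      --quiet
    ```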
-   Delete the container image repository in Artifact Registry:
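    A sketch using the environment variables defined above:

    ```shell
    gcloud artifacts repositories delete "$AR_REPOSITORY" \
      --location "$AR_LOCATION" \
      --quiet
    ```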
-   Remove the IAM policy binding:

    ```shell
    gcloud projects remove-iam-policy-binding $PROJECT_ID \
      --member "principal://iam.googleapis.com/projects/${PROJECT_NUMBER}/locations/global/workloadIdentityPools/${PROJECT_ID}.svc.id.goog/subject/ns/hybrid-neg-system/sa/hybrid-neg-controller-manager" \
      --role projects/$PROJECT_ID/roles/compute.networkEndpointGroupAdmin
    ```