Deployment
This document describes how to deploy the Confidential AI Model Serving example application in a Google Cloud project by using Terraform.
Cost
In this document, you use billable components of Google Cloud, such as Cloud Run, Compute Engine, and Artifact Registry.
Before you begin
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
Deploy the application
This section shows you how to deploy the example application by using Terraform.
- Open Cloud Shell.
- Set an environment variable to contain your project ID:
  Replace project-id with the ID of your project.
- Set another environment variable to contain your preferred region:
  Replace region with a region that supports Cloud Run and Compute Engine, for example us-central1.
- Authorize gcloud:
  You can skip this step if you're using Cloud Shell.
- Authorize Terraform:
  gcloud auth application-default login && gcloud auth application-default set-quota-project $PROJECT_ID
  You can skip this step if you're using Cloud Shell.
- Clone the Git repository containing the code to build the example architecture:
- Initialize Terraform:
- Apply the configuration:
  When the command completes, it prints the URL of the Cloud Run service that runs the broker. The URL looks similar to the following:
  Make a note of this URL; you'll need it later.
  Note: If you haven't used Artifact Registry before, the command might fail with the following error:
  denied: Unauthenticated request. Unauthenticated requests do not have permission "artifactregistry.repositories.uploadArtifacts" on resource
  You can fix this error by running the following command:
  Then re-run terraform apply.
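The steps above can be sketched as one small script. This is only an illustration under several assumptions, not the project's own tooling: the variable name REGION, the use of gcloud auth login for the "Authorize gcloud" step, and the DRY_RUN switch with its run helper are all hypothetical. The clone step is left out because the repository URL isn't shown here, and no Terraform variables are passed because their names aren't shown either. By default the script only prints the commands it would run.

```shell
# Hypothetical sketch of the deployment steps; prints commands unless DRY_RUN=0.
PROJECT_ID="${PROJECT_ID:-project-id}"   # replace project-id with your project ID
REGION="${REGION:-us-central1}"          # assumed variable name; any region that
                                         # supports Cloud Run and Compute Engine
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

deploy() {
  # Authorization steps; both can be skipped in Cloud Shell.
  run gcloud auth login                   # assumed command for "Authorize gcloud"
  run gcloud auth application-default login
  run gcloud auth application-default set-quota-project "$PROJECT_ID"

  # Run these from the directory of the cloned repository that holds the
  # Terraform configuration (path not shown in this document).
  run terraform init
  run terraform apply
}

deploy
```

Run the script once as is to review the commands, then set DRY_RUN=0 to execute them. Keeping the dry run as the default avoids starting a billable deployment by accident.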
Test the deployment
To verify that the deployment was successful, run the test client application:
- Change to the sources directory:
- Run the test client:
  Replace URL with the URL of the Cloud Run service that you obtained previously.
  The command output looks similar to the following:
  [INFO] Running as client
  [INFO] Waiting for workload instances to become available...
  Your prompts will be served by one of the following workload instances:
  Instance       Zone               Prod  Hardware     OS                  Image
  -------------  -----------------  ----  -----------  ------------------  ------------
  workload-x1rf  asia-southeast1-a  true  GCP_AMD_SEV  CONFIDENTIAL_SPACE  bc84e0191c2e
  workload-r3lq  asia-southeast1-a  true  GCP_AMD_SEV  CONFIDENTIAL_SPACE  904c6cbc5c56
  Enter a question>
  - The line Waiting for workload instances to become available indicates that the Confidential Space VM hasn't registered with the broker yet, and that the client is waiting for this process to complete.
  - The table shows the Confidential Space VM instances that are available to handle requests.
    - The Prod column indicates whether the respective instance uses the Production image. The client refuses to interact with instances that use the Debug image unless you specify the --debug command line flag.
- Enter an example prompt and press Enter.
  The client randomly selects one of the available workload instances, encrypts the prompt so that only this specific workload instance can read it, and passes the encrypted message to the broker. The broker forwards the encrypted message to the workload instance, which generates an example response and encrypts it so that only the client can read it. The client then displays the response:
What's next
- Review Cloud Logging logs to follow the interaction between the client, broker, and workload instances.
- Learn more about Confidential Space.