Deployment
This document describes how to deploy the Confidential AI Model Serving example application in a Google Cloud project by using Terraform.
Cost
In this document, you use the following billable components of Google Cloud:

- Compute Engine
- Cloud Run
- Artifact Registry
Before you begin
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
Deploy the application
This section shows you how to deploy the example application by using Terraform.
- Open Cloud Shell.
- Set an environment variable to contain your project ID:
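  A minimal sketch of such a command, assuming a Bash shell such as Cloud Shell (the PROJECT_ID variable name matches the $PROJECT_ID reference used later in this guide):

  ```
  export PROJECT_ID=project-id
  ```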
  Replace project-id with the ID of your project.

- Set another environment variable to contain your preferred region:
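  A minimal sketch, assuming the variable is named REGION:

  ```
  export REGION=region
  ```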
  Replace region with a region that supports Cloud Run and Compute Engine, for example us-central1.

- Authorize gcloud:
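  A sketch of this step, using the standard gcloud login flow:

  ```
  gcloud auth login
  ```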
  You can skip this step if you're using Cloud Shell.
- Authorize terraform:

  ```
  gcloud auth application-default login && gcloud auth application-default set-quota-project $PROJECT_ID
  ```

  You can skip this step if you're using Cloud Shell.
- Clone the Git repository containing the code to build the example architecture:
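  A sketch of this step; the repository URL below is a placeholder, so substitute the actual repository for this example architecture:

  ```
  # Placeholder URL: substitute the repository that contains this example.
  git clone https://github.com/GoogleCloudPlatform/REPOSITORY_NAME.git
  cd REPOSITORY_NAME
  ```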
- Initialize Terraform:
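  Assuming the Terraform configuration sits at the root of the cloned repository (adjust the path if it lives in a subdirectory):

  ```
  terraform init
  ```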
- Apply the configuration:
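  A sketch of the apply step; depending on how the configuration declares its variables, you might need to pass the project ID and region explicitly:

  ```
  terraform apply
  ```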
  When the command completes, it prints the URL of the Cloud Run service that runs the broker. The URL looks similar to the following:
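  A hypothetical example; the service name and suffix in your URL will differ:

  ```
  https://broker-xxxxxxxxxx-uc.a.run.app
  ```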
  Make note of this URL; you'll need it later.
  Note: If you haven't used Artifact Registry before, the command might fail with the following error:
  ```
  denied: Unauthenticated request. Unauthenticated requests do not have permission "artifactregistry.repositories.uploadArtifacts" on resource
  ```
  You can fix this error by running the following command:
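  A likely fix, assuming the build pushes container images to an Artifact Registry repository in the region you selected, is to configure Docker credentials for that registry host:

  ```
  gcloud auth configure-docker $REGION-docker.pkg.dev
  ```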
  Then re-run terraform apply.
Test the deployment
To verify that the deployment was successful, run the test client application:
- Change to the sources directory:
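  Assuming your shell is still at the root of the cloned repository:

  ```
  cd sources
  ```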
- Run the test client:
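  The exact invocation depends on the repository; a hypothetical example, assuming a Python entry point named client.py, is:

  ```
  # Hypothetical entry point; check the repository for the actual client command.
  python3 client.py URL
  ```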
  Replace URL with the URL of the Cloud Run service that you obtained previously.

  The command output looks similar to the following:
  ```
  [INFO] Running as client
  [INFO] Waiting for workload instances to become available...

  Your prompts will be served by one of the following workload instances:

  Instance       Zone               Prod  Hardware     OS                  Image
  -------------  -----------------  ----  -----------  ------------------  ------------
  workload-x1rf  asia-southeast1-a  true  GCP_AMD_SEV  CONFIDENTIAL_SPACE  bc84e0191c2e
  workload-r3lq  asia-southeast1-a  true  GCP_AMD_SEV  CONFIDENTIAL_SPACE  904c6cbc5c56

  Enter a question>
  ```
  - The line Waiting for workload instances to become available indicates that the Confidential Space VM hasn't registered with the broker yet, and that the client is waiting for this process to complete.
  - The table shows the Confidential Space VM instances that are available to handle requests.
  - The Prod column indicates whether the respective instance uses the Production image. The client refuses to interact with instances that use the Debug image unless you specify the --debug command line flag.
- Enter an example prompt and press Enter.

  The client randomly selects one of the available workload instances, encrypts the prompt so that only that specific workload instance can read it, and passes the encrypted message to the broker. The broker forwards the encrypted message to the workload instance, which generates an example response and encrypts it so that only the client can read it. The client then displays the response.
What's next
- Review Cloud Logging logs to follow the interaction between the client, broker, and workload instances.
- Learn more about Confidential Space.