Deployment

This document describes how to deploy Confidential AI Model Serving in a Google Cloud project by using Terraform.

Cost

In this document, you use the following billable components of Google Cloud:

Before you begin

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

  2. Make sure that billing is enabled for your Google Cloud project.

Deploy the application

This section shows you how to deploy the example application by using Terraform.

  1. Open Cloud Shell.

  2. Set an environment variable to contain your project ID:

    export PROJECT_ID=project-id
    

    Replace project-id with the ID of your project.

  3. Set another environment variable to contain your preferred region:

    export REGION=region
    

    Replace region with a region that supports Cloud Run and Compute Engine, for example us-central1.

  4. Authorize gcloud:

    gcloud auth login
    

    You can skip this step if you're using Cloud Shell.

  5. Authorize Terraform:

    gcloud auth application-default login &&
      gcloud auth application-default set-quota-project $PROJECT_ID
    

    You can skip this step if you're using Cloud Shell.

  6. Clone the Git repository containing the code to build the example architecture:

    git clone https://github.com/GoogleCloudPlatform/cloud-solutions.git &&
        cd cloud-solutions/projects/confidential-ai-model-serving/
    
  7. Initialize Terraform:

    terraform init
    
  8. Apply the configuration:

    terraform apply -var "project_id=$PROJECT_ID" -var "region=$REGION"
    

    When the command completes, it prints the URL of the Cloud Run service that runs the broker. The URL looks similar to the following:

    https://broker-PROJECTNUMBER.REGION.run.app/
    

    Make a note of this URL. You need it when you test the deployment.

    Note

    If you haven't used Artifact Registry before, the command might fail with the following error:

    denied: Unauthenticated request. Unauthenticated requests do not have permission
    "artifactregistry.repositories.uploadArtifacts" on resource
    

    You can fix this error by running the following command:

    gcloud auth configure-docker $REGION-docker.pkg.dev
    

    Then re-run terraform apply.
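
To reuse the broker URL in later commands, you can capture it in a shell variable instead of copying it by hand. The following is a minimal sketch and not part of the repository: the function name extract_broker_url and the file name apply.log are hypothetical, and the pattern assumes that the URL has the form shown in step 8.

```shell
# Hypothetical helper: print the first broker URL found on standard input.
# Assumes the URL looks like https://broker-PROJECTNUMBER.REGION.run.app/
# where PROJECTNUMBER is numeric, as in the example URL in step 8.
extract_broker_url() {
  grep -Eo 'https://broker-[0-9]+\.[a-z0-9-]+\.run\.app/?' | head -n 1
}

# Example usage (apply.log is a hypothetical file name):
#   terraform apply -var "project_id=$PROJECT_ID" -var "region=$REGION" | tee apply.log
#   BROKER_URL=$(extract_broker_url < apply.log)
```

If the function prints nothing, the pattern didn't match, and you can copy the URL from the command output manually.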

Test the deployment

To verify that the deployment was successful, run the test client application:

  1. Change to the sources directory:

    cd ../sources
    
  2. Run the test client:

    ./run.sh client --broker URL
    

    Replace URL with the URL of the Cloud Run service that you obtained previously.

    The command output looks similar to the following:

    [INFO] Running as client
    [INFO] Waiting for workload instances to become available...
    
    Your prompts will be served by one of the following workload instances:
    
    Instance   Zone               Prod  Hardware        OS                 Image
    ---------- ------------------ ----- --------------- ------------------ ------------
    workload-x1rf asia-southeast1-a  true  GCP_AMD_SEV     CONFIDENTIAL_SPACE bc84e0191c2e
    workload-r3lq asia-southeast1-a  true  GCP_AMD_SEV     CONFIDENTIAL_SPACE 904c6cbc5c56
    
    Enter a question>
    
    • The line Waiting for workload instances to become available indicates that the Confidential Space VM hasn't registered with the broker yet, and that the client is waiting for the registration to complete.
    • The table shows the Confidential Space VM instances that are available to handle requests.
    • The Prod column indicates whether the respective instance uses the Production image.

      The client refuses to interact with instances that use the Debug image unless you specify the --debug command line flag.

  3. Enter an example prompt and press Enter.

    The client randomly selects one of the available workload instances, encrypts the prompt so that only this specific workload instance can read the prompt, and passes the encrypted message to the broker.

    The broker forwards the encrypted message to the workload instance. The workload instance generates an example response and encrypts it so that only the client can read it.

    The client then displays the response:

    > Who are you?
    That's a good question.
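
To illustrate how to read the Prod column in the instance table, the following sketch filters a saved copy of the table down to the instances that report the Production image. It doesn't reflect how the client itself is implemented: the function name list_prod_instances is hypothetical, and the code assumes whitespace-separated columns in the order shown in the sample output.

```shell
# Hypothetical helper: print the names of instances whose Prod column is "true".
# Assumes whitespace-separated columns in the order shown in the sample output:
# Instance, Zone, Prod, Hardware, OS, Image.
list_prod_instances() {
  # Skip the header and separator rows by matching only rows that start
  # with "workload-", then keep the rows whose third column is "true".
  awk '$1 ~ /^workload-/ && $3 == "true" { print $1 }'
}
```

For example, if you save the sample table to a file, running list_prod_instances with that file as standard input prints both instance names, because both rows show true in the Prod column.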
    

What's next

  • Review Cloud Logging logs to follow the interaction between the client, broker, and workload instances.
  • Learn more about Confidential Space.