Vertex AI LLM Evaluation Services

We offer a comprehensive set of notebooks that demonstrate how to use Vertex AI LLM Evaluation Services in conjunction with other Vertex AI services. Additionally, we have provided notebooks that delve into the theory behind evaluation metrics.

Computation-Based Evaluation:

- Workflow for Evaluating LLM Performance in a Text Classification Task using Gemini and Vertex AI SDK
- LLM Evaluation Workflow for a Classification Task using a Tuned Model and Vertex AI SDK
- LLM Evaluation Workflow for a Classification Task using Gemini and Vertex AI Pipelines
- Complete LLM Model Evaluation Workflow for Classification using KFP Pipelines
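The computation-based workflows above score generated labels against references with deterministic metrics (for example, exact match). As a rough orientation only, and not code taken from these notebooks, the sketch below assumes the Vertex AI SDK's `EvalTask` interface (in older SDK versions it lives under `vertexai.preview.evaluation`); the project ID, model name, experiment name, and dataset are placeholders.

```python
import pandas as pd
import vertexai
from vertexai.evaluation import EvalTask
from vertexai.generative_models import GenerativeModel

# Placeholder project and location; replace with your own values.
vertexai.init(project="your-project-id", location="us-central1")

# Tiny illustrative classification dataset: "prompt" asks the model for a
# label, "reference" holds the expected label for computation-based metrics.
eval_dataset = pd.DataFrame(
    {
        "prompt": [
            "Classify the sentiment of: 'The battery life is fantastic.' Answer with positive, negative, or neutral.",
            "Classify the sentiment of: 'The screen cracked after one day.' Answer with positive, negative, or neutral.",
        ],
        "reference": ["positive", "negative"],
    }
)

# Exact match is a computation-based metric: it compares the generated
# label string to the reference label.
eval_task = EvalTask(
    dataset=eval_dataset,
    metrics=["exact_match"],
    experiment="classification-eval-demo",  # hypothetical experiment name
)

result = eval_task.evaluate(model=GenerativeModel("gemini-1.5-flash"))
print(result.summary_metrics)
```

The pipeline-based notebooks (Vertex AI Pipelines and KFP) wrap the same kind of metric computation in pipeline components rather than direct SDK calls.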

Evaluation of RAG Systems:

- Evaluating Retrieval Augmented Generation (RAG) Systems

Theory Notebooks:

- Metrics for Classification
- Metrics for Summarization
- Metrics for Text Generation
- Metrics for Q&A
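The theory notebooks cover the metrics themselves rather than the Vertex AI tooling. As a reminder of what the classification metrics measure, here is a small, self-contained sketch using scikit-learn (the notebooks may use different libraries; the labels below are made up):

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Made-up ground-truth and predicted labels for a 3-class sentiment task.
y_true = ["positive", "negative", "positive", "neutral", "negative"]
y_pred = ["positive", "negative", "neutral", "neutral", "negative"]

# Macro averaging weights each class equally, which matters for imbalanced data.
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred, average="macro", zero_division=0))
print("recall   :", recall_score(y_true, y_pred, average="macro", zero_division=0))
print("f1       :", f1_score(y_true, y_pred, average="macro", zero_division=0))
```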

Requirements

To run the walkthroughs and demonstrations in these notebooks, you'll need access to a Google Cloud project with the Vertex AI API enabled.
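With the Vertex AI API (aiplatform.googleapis.com) enabled on the project, the notebooks generally start by initializing the SDK against that project. A minimal sketch, with placeholder project ID and region:

```python
import vertexai

# Placeholder values; use your own project ID and a supported region.
vertexai.init(project="your-project-id", location="us-central1")
```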

Getting Help

If you have any questions or run into any problems, please report them through GitHub Issues.