Skip to content

`mcp-gemini-go` MCP Server

This server provides an MCP interface to Google’s Gemini models, allowing for multimodal content generation.

Generates content (text and/or images) based on a multimodal prompt.

Parameters:

  • prompt (string, required): The text prompt for content generation.
  • model (string, optional): The specific Gemini model to use. Defaults to gemini-3.1-flash-image-preview.
  • images (string array, optional): A list of local file paths or GCS URIs for input images.
  • output_directory (string, optional): Local directory to save any generated image(s) to.
  • gcs_bucket_uri (string, optional): GCS URI prefix to store any generated images.

Synthesizes speech from text using Gemini models, allowing for granular control over style, pace, tone, and emotional expression through natural-language prompts.

Parameters:

  • text (string, required): The text to synthesize (up to 800 characters).
  • prompt (string, optional): Stylistic instructions on how to synthesize the content.
  • voice_name (string, optional): The voice to use. Defaults to Callirrhoe. Use the list_gemini_voices tool to see all options.
  • model_name (string, optional): The model to use. Defaults to gemini-3.1-flash-tts-preview.
  • output_directory (string, optional): Local directory to save the generated audio file to.
  • output_filename_prefix (string, optional): A prefix for the output WAV filename.

Lists the available single-speaker voices for use with the Gemini-TTS models.

Provides a list of supported languages and their BCP-47 codes. Currently, only en-US is supported.

The tool utilizes the following environment variables:

  • GOOGLE_CLOUD_PROJECT (string): Required. Your Google Cloud Project ID.
    • Override: You can override this globally for this specific server by setting GEMINI_PROJECT_ID.
  • GOOGLE_CLOUD_LOCATION (string): The preferred Google Cloud location/region for Vertex AI services.
    • Default: "us-central1"
    • Fallback: LOCATION is also supported as a fallback for GOOGLE_CLOUD_LOCATION.
    • Override: You can override this globally for this specific server by setting GEMINI_LOCATION.
  • ALLOW_UNSAFE_MODELS (boolean): Optional (true/false). Allows users to bypass strict local model constraint validation, enabling them to test experimental or pre-release model strings that are not yet hardcoded in the registry.
    • Default: false
  • ENABLE_OPTIONAL_HEADER_CAPTURE (boolean): Optional (true/false). Intended for internal debugging. When set to true, the server intercepts API requests and injects the raw ADC Bearer token to capture and surface the x-goog-sherlog-link header in the tool output. This feature is supported for Gemini.
    • Default: false
Terminal window
export GOOGLE_CLOUD_PROJECT=your-gcp-project
mcptools call gemini_image_generation \
--params '{"prompt": "a picture of a cat sitting on a table", "output_directory": "./output"}' \
mcp-gemini-go

First, ensure the GOOGLE_CLOUD_PROJECT environment variable is set. Then, you can call the gemini_audio_tts tool. The following example generates an audio file and saves it to a local directory named tts_output.

Terminal window
export GOOGLE_CLOUD_PROJECT=$(gcloud config get-value project)
mcptools call gemini_audio_tts \
--params '{"text": "Hello, this is a test of the Gemini Text-to-Speech API.", "output_directory": "./tts_output"}' \
mcp-gemini-go

If you want to test the direct audio output without saving to a file via the output_directory parameter, you can send a raw JSON-RPC request to the server. This is necessary because the mcptools client does not support rendering audio content to the terminal.

The following command pipes a tools/call request to the server, parses the JSON response with jq to extract the base64-encoded audio data, decodes it, and saves it to a local file.

Terminal window
echo '{"jsonrpc":"2.0","method":"tools/call","id":1,"params":{"name":"gemini_audio_tts","arguments":{"text":"This is a direct JSON-RPC output test."}}}' | \
mcp-gemini-go | \
jq -r '.result.content[] | select(.type == "audio") | .data' | \
base64 --decode > test_direct_jsonrpc.wav