Serve a model with a GPU on GKE Autopilot
Please follow the how-to guide (TODO: add link to how-to).
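Until that link is added, the general shape of a GPU-backed serving workload on GKE Autopilot is sketched below. This is a minimal sketch, not the how-to's exact manifest: the container image, port, and resource sizes are placeholders you would replace with the values from the linked guide, and `nvidia-l4` is just one of the accelerator types Autopilot supports.

```yaml
# Minimal sketch: serve a model behind a Service, with one GPU per replica.
# Placeholders (image, ports, sizes) are assumptions, not values from this page.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-server            # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: model-server
  template:
    metadata:
      labels:
        app: model-server
    spec:
      nodeSelector:
        # Autopilot provisions a GPU node matching this accelerator selector.
        cloud.google.com/gke-accelerator: nvidia-l4
      containers:
      - name: server
        image: us-docker.pkg.dev/PROJECT_ID/REPO/model-server:latest  # placeholder image
        ports:
        - containerPort: 8080
        resources:
          limits:
            nvidia.com/gpu: "1"   # GPU request triggers node provisioning on Autopilot
            cpu: "4"
            memory: 16Gi
---
apiVersion: v1
kind: Service
metadata:
  name: model-server
spec:
  selector:
    app: model-server
  ports:
  - port: 80
    targetPort: 8080
```

Applying a manifest like this with `kubectl apply -f` lets Autopilot provision a matching GPU node automatically; there are no node pools to create or manage.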