Serve a model with a GPU on GKE Autopilot
Please follow the how-to guide (TODO: add link to how-to).
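Until that link is added, the general shape of a GPU-backed serving workload on GKE Autopilot is sketched below. This is a minimal sketch, not the how-to's exact manifest: the container image, port, and resource sizes are placeholders you would replace with the values from the linked guide, and `nvidia-l4` is just one of the accelerator types Autopilot supports.

```yaml
# Minimal sketch: serve a model behind a Service, with one GPU per replica.
# Placeholders (image, ports, sizes) are assumptions, not values from this page.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-server            # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: model-server
  template:
    metadata:
      labels:
        app: model-server
    spec:
      nodeSelector:
        # Autopilot provisions a GPU node matching this accelerator selector.
        cloud.google.com/gke-accelerator: nvidia-l4
      containers:
      - name: server
        image: us-docker.pkg.dev/PROJECT_ID/REPO/model-server:latest  # placeholder image
        ports:
        - containerPort: 8080
        resources:
          limits:
            nvidia.com/gpu: "1"   # GPU request triggers node provisioning on Autopilot
            cpu: "4"
            memory: 16Gi
---
apiVersion: v1
kind: Service
metadata:
  name: model-server
spec:
  selector:
    app: model-server
  ports:
  - port: 80
    targetPort: 8080
```

Applying a manifest like this with `kubectl apply -f` lets Autopilot provision a matching GPU node automatically; there are no node pools to create or manage.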