
Calling the Vertex AI Gemini API from PowerShell

4 min read

This post covers how to call Gemini models on Google Cloud's Vertex AI from PowerShell, using both the OpenAI-compatible endpoint and the native Gemini endpoint.

Authentication

Access token (gcloud auth)

No API key is required; your existing Google Cloud credentials are used.

$accessToken = (gcloud auth print-access-token)

API key

Alternatively, load an API key from an environment variable:

$apiKey = $env:VERTEX_API_KEY

Endpoints

OpenAI-compatible endpoint (gcloud auth)

https://{region}-aiplatform.googleapis.com/v1beta1/projects/{projectId}/locations/{region}/endpoints/openapi/chat/completions

The request and response formats are identical to the OpenAI API's. The model name requires a google/ prefix (e.g., google/gemini-2.5-flash-lite).

Native Gemini endpoint (API key)

https://{region}-aiplatform.googleapis.com/v1/projects/{projectId}/locations/{region}/publishers/google/models/{model}:generateContent

For streaming, use :streamGenerateContent.

Basic calls

OpenAI-compatible (gcloud auth)

$projectId   = "your-project-id"
$region      = "us-central1"
$model       = "google/gemini-2.5-flash-lite"
$accessToken = (gcloud auth print-access-token)

$body = @{
    model    = $model
    messages = @(
        @{
            role    = "user"
            content = "What is the population of Tokyo?"
        }
    )
} | ConvertTo-Json -Depth 10

$uri = "https://$region-aiplatform.googleapis.com/v1beta1/projects/$projectId/locations/$region/endpoints/openapi/chat/completions"

$response = Invoke-RestMethod `
    -Uri $uri `
    -Method Post `
    -ContentType "application/json" `
    -Headers @{ Authorization = "Bearer $accessToken" } `
    -Body $body

$response.choices[0].message.content
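For repeated calls, the boilerplate above can be wrapped in a small helper. This is a sketch, not part of the original article; Invoke-GeminiChat is an illustrative name, and it fetches a fresh access token on each call for simplicity.

```powershell
# Illustrative wrapper around the OpenAI-compatible endpoint.
function Invoke-GeminiChat {
    param(
        [Parameter(Mandatory)] [string]$Prompt,
        [string]$Model     = "google/gemini-2.5-flash-lite",
        [string]$ProjectId = "your-project-id",
        [string]$Region    = "us-central1"
    )

    $body = @{
        model    = $Model
        messages = @(@{ role = "user"; content = $Prompt })
    } | ConvertTo-Json -Depth 10

    $uri = "https://$Region-aiplatform.googleapis.com/v1beta1/projects/$ProjectId/locations/$Region/endpoints/openapi/chat/completions"

    $response = Invoke-RestMethod -Uri $uri -Method Post `
        -ContentType "application/json" `
        -Headers @{ Authorization = "Bearer $(gcloud auth print-access-token)" } `
        -Body $body

    # Return just the generated text.
    $response.choices[0].message.content
}

# Usage:
# Invoke-GeminiChat -Prompt "What is the population of Tokyo?"
```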

Native Gemini (API key)

$projectId = "your-project-id"
$region    = "us-central1"
$model     = "gemini-2.5-flash-lite"
$apiKey    = $env:VERTEX_API_KEY

$body = @{
    contents = @(
        @{
            role  = "user"
            parts = @(
                @{ text = "What is the population of Tokyo?" }
            )
        }
    )
} | ConvertTo-Json -Depth 10

$uri = "https://$region-aiplatform.googleapis.com/v1/projects/$projectId/locations/$region/publishers/google/models/${model}:generateContent?key=$apiKey"

$response = Invoke-RestMethod `
    -Uri $uri `
    -Method Post `
    -ContentType "application/json" `
    -Body $body

$response.candidates[0].content.parts[0].text

Response structure

OpenAI-compatible

$response.choices[0].message.content  # generated text
$response.usage.total_tokens          # total token count
$response.model                       # model used

Native Gemini

$response.candidates[0].content.parts[0].text  # generated text
$response.usageMetadata.totalTokenCount        # total token count
$response.modelVersion                         # model version used

For streaming (streamGenerateContent), an array of chunks is returned. Concatenate them to retrieve the full text.

$fullText = ($response | ForEach-Object {
    $_.candidates[0].content.parts[0].text
}) -join ""
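Putting this together, a full streaming request might look like the following sketch. It assumes the same $projectId, $region, $model, $apiKey, and $body variables as the basic native Gemini example above.

```powershell
# Streaming variant: only the endpoint method changes.
$uri = "https://$region-aiplatform.googleapis.com/v1/projects/$projectId/locations/$region/publishers/google/models/${model}:streamGenerateContent?key=$apiKey"

# The endpoint returns a JSON array of chunks, which
# Invoke-RestMethod deserializes into an array of objects.
$response = Invoke-RestMethod `
    -Uri $uri `
    -Method Post `
    -ContentType "application/json" `
    -Body $body

# Concatenate the per-chunk text fragments into the full answer.
$fullText = ($response | ForEach-Object {
    $_.candidates[0].content.parts[0].text
}) -join ""
$fullText
```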

Adding a system prompt

OpenAI-compatible

$body = @{
    model    = $model
    messages = @(
        @{
            role    = "system"
            content = "You are an AI assistant that responds in Japanese. Answer concisely."
        }
        @{
            role    = "user"
            content = "What is the speed of light?"
        }
    )
} | ConvertTo-Json -Depth 10

Native Gemini

$body = @{
    system_instruction = @{
        parts = @(
            @{ text = "You are an AI assistant that responds in Japanese. Answer concisely." }
        )
    }
    contents = @(
        @{
            role  = "user"
            parts = @(@{ text = "What is the speed of light?" })
        }
    )
} | ConvertTo-Json -Depth 10

Multi-turn conversation

Pass the conversation history as an array of messages to hold a multi-turn conversation.

OpenAI-compatible

$body = @{
    model    = $model
    messages = @(
        @{ role = "user"; content = "Do you prefer cats or dogs?" }
        @{ role = "assistant"; content = "I prefer cats." }
        @{ role = "user"; content = "Why is that?" }
    )
} | ConvertTo-Json -Depth 10

Native Gemini

In the native API, the assistant role is specified as "model".

$body = @{
    contents = @(
        @{
            role  = "user"
            parts = @(@{ text = "Do you prefer cats or dogs?" })
        }
        @{
            role  = "model"
            parts = @(@{ text = "I prefer cats." })
        }
        @{
            role  = "user"
            parts = @(@{ text = "Why is that?" })
        }
    )
} | ConvertTo-Json -Depth 10

Available models

Model                 | OpenAI-compatible name       | Description
gemini-2.5-flash-lite | google/gemini-2.5-flash-lite | Lightweight, fast, low-cost
gemini-2.5-flash      | google/gemini-2.5-flash      | Balanced
gemini-2.5-pro        | google/gemini-2.5-pro        | High-precision, for complex tasks

Which approach to use

Situation                                     | Recommended approach
GCP-authenticated environment (dev, CI, etc.) | OpenAI-compatible + gcloud auth
Only an API key available                     | Native Gemini
Migrating from OpenAI                         | OpenAI-compatible (minimizes code changes)
Streaming required                            | Native Gemini

Notes

  • Do not hardcode the API key in scripts; load it from an environment variable ($env:VERTEX_API_KEY).
  • With gcloud auth, tokens expire in about 1 hour. Long-running scripts should refresh the token as needed.
  • Each project has its own rate limits and quotas. Check them before sending large numbers of requests.
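One way to handle token expiry in a long-running script is to cache the token and refresh it before the roughly one-hour lifetime is up. This is a sketch under that assumption; Get-VertexAccessToken and the 50-minute threshold are illustrative choices, not part of the original article.

```powershell
# Cache the gcloud access token and refresh it after 50 minutes,
# comfortably inside the ~1 hour token lifetime.
$script:cachedToken    = $null
$script:tokenFetchedAt = $null

function Get-VertexAccessToken {
    $stale = (-not $script:cachedToken) -or
             (((Get-Date) - $script:tokenFetchedAt).TotalMinutes -gt 50)
    if ($stale) {
        $script:cachedToken    = (gcloud auth print-access-token)
        $script:tokenFetchedAt = Get-Date
    }
    $script:cachedToken
}

# In a request loop, rebuild the header with a fresh token each time:
# $headers = @{ Authorization = "Bearer $(Get-VertexAccessToken)" }
```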
