Lorenze/fix google vertex api using api keys (#4243)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Notify Downstream / notify-downstream (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
Build uv cache / build-cache (3.12) (push) Has been cancelled
Build uv cache / build-cache (3.13) (push) Has been cancelled
Build uv cache / build-cache (3.10) (push) Has been cancelled
Build uv cache / build-cache (3.11) (push) Has been cancelled

* supporting vertex through api key use - expo mode

* docs update here

* docs translations

---------

Co-authored-by: Greyson LaLonde <greyson.r.lalonde@gmail.com>
This commit is contained in:
Lorenze Jay
2026-01-20 09:34:36 -08:00
committed by GitHub
parent ceef062426
commit b267bb4054
6 changed files with 293 additions and 30 deletions

View File

@@ -0,0 +1,75 @@
interactions:
- request:
body: '{"contents": [{"parts": [{"text": "\nCurrent Task: What is the capital
of Japan?\n\nThis is the expected criteria for your final answer: The capital
of Japan\nyou MUST return the actual complete content as the final answer, not
a summary.\n\nBegin! This is VERY important to you, use the tools available
and give your best Final Answer, your job depends on it!\n\nThought:"}], "role":
"user"}], "systemInstruction": {"parts": [{"text": "You are Research Assistant.
You are a helpful research assistant.\nYour personal goal is: Find information
about the capital of Japan\nTo give my best complete final answer to the task
respond using the exact following format:\n\nThought: I now can give a great
answer\nFinal Answer: Your final answer must be the great and the most complete
as possible, it must be outcome described.\n\nI MUST use these formats, my job
depends on it!"}], "role": "user"}, "generationConfig": {"stopSequences": ["\nObservation:"]}}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- '*/*'
accept-encoding:
- ACCEPT-ENCODING-XXX
connection:
- keep-alive
content-length:
- '952'
content-type:
- application/json
host:
- aiplatform.googleapis.com
x-goog-api-client:
- google-genai-sdk/1.59.0 gl-python/3.13.3
x-goog-api-key:
- X-GOOG-API-KEY-XXX
method: POST
uri: https://aiplatform.googleapis.com/v1/publishers/google/models/gemini-2.0-flash-exp:generateContent
response:
body:
string: "{\n \"candidates\": [\n {\n \"content\": {\n \"role\":
\"model\",\n \"parts\": [\n {\n \"text\": \"The
capital of Japan is Tokyo.\\nFinal Answer: Tokyo\\n\"\n }\n ]\n
\ },\n \"finishReason\": \"STOP\",\n \"avgLogprobs\": -0.017845841554495003\n
\ }\n ],\n \"usageMetadata\": {\n \"promptTokenCount\": 163,\n \"candidatesTokenCount\":
13,\n \"totalTokenCount\": 176,\n \"trafficType\": \"ON_DEMAND\",\n
\ \"promptTokensDetails\": [\n {\n \"modality\": \"TEXT\",\n
\ \"tokenCount\": 163\n }\n ],\n \"candidatesTokensDetails\":
[\n {\n \"modality\": \"TEXT\",\n \"tokenCount\": 13\n
\ }\n ]\n },\n \"modelVersion\": \"gemini-2.0-flash-exp\",\n \"createTime\":
\"2026-01-15T22:27:38.066749Z\",\n \"responseId\": \"2mlpab2JBNOFidsPh5GigQs\"\n}\n"
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Type:
- application/json; charset=UTF-8
Date:
- Thu, 15 Jan 2026 22:27:38 GMT
Server:
- scaffolding on HTTPServer2
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
X-Frame-Options:
- X-FRAME-OPTIONS-XXX
X-XSS-Protection:
- '0'
content-length:
- '786'
status:
code: 200
message: OK
version: 1

View File

@@ -728,3 +728,39 @@ def test_google_streaming_returns_usage_metrics():
assert result.token_usage.prompt_tokens > 0
assert result.token_usage.completion_tokens > 0
assert result.token_usage.successful_requests >= 1
@pytest.mark.vcr()
def test_google_express_mode_works() -> None:
"""
Test Google Vertex AI Express mode with API key authentication.
This tests Vertex AI Express mode (aiplatform.googleapis.com) with API key
authentication.
"""
with patch.dict(os.environ, {"GOOGLE_GENAI_USE_VERTEXAI": "true"}):
agent = Agent(
role="Research Assistant",
goal="Find information about the capital of Japan",
backstory="You are a helpful research assistant.",
llm=LLM(
model="gemini/gemini-2.0-flash-exp",
),
verbose=True,
)
task = Task(
description="What is the capital of Japan?",
expected_output="The capital of Japan",
agent=agent,
)
crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()
assert result.token_usage is not None
assert result.token_usage.total_tokens > 0
assert result.token_usage.prompt_tokens > 0
assert result.token_usage.completion_tokens > 0
assert result.token_usage.successful_requests >= 1