Gemini AI

Add Google's multimodal AI to your agent workflows. Agents generate content, analyze documents and images, extract structured data, and process audio and video files through Gemini's Files API.

5 actions available

Vendor submits scanned invoice image that Parseur cannot extract

Agent uploads image to Gemini Files API via Upload File

Agent calls Generate Content with structured extraction prompt for invoice fields

Agent validates extracted JSON against vendor record and expected amounts

Agent creates bill in ERP with validated extracted data

AP team notified via Slack with extraction summary

Gemini-flagged high-risk contract clauses routed to stakeholder for review before proceeding

What This Integration Enables

Gemini integration extends FlowRunner's AI capabilities with Google's multimodal model. Agents use Gemini for tasks that require understanding content beyond text: analyzing scanned documents, interpreting images, processing audio, and generating structured outputs from complex inputs. Combined with FlowRunner's other integrations, Gemini becomes the reasoning layer for document-heavy and multimedia workflows.

Without FlowRunner

Manual scanned invoice entry: Staff type data from low-quality scans that Parseur cannot process

Unreviewed contract stacks: Due diligence teams read every contract page manually

Inbox-based support triage: Support team opens each email to classify and route it

With FlowRunner

AI extraction for any document: Scanned images and complex formats processed automatically via Gemini

AI-summarized due diligence: Contract summaries with risk flags generated in seconds per document

Pre-classified support queue: Emails classified and routed before anyone opens the inbox

Use Case Scenarios

Scanned Invoice Extraction

A vendor submits a scanned invoice image rather than a digital PDF. Parseur cannot extract structured data from a low-quality scan. The agent uploads the image to Gemini using Upload File, then calls Generate Content with a structured extraction prompt: "Extract vendor name, invoice number, invoice date, line items, and total amount from this invoice. Return as JSON." Gemini returns the structured data. The agent validates it and creates the bill in the ERP. Scanned documents that used to require manual entry are handled automatically.
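The validation step in this scenario can be sketched as follows. This is a minimal illustration, not FlowRunner's implementation: the JSON key names (vendor_name, invoice_number, invoice_date, line_items, total_amount, and a per-item amount) are assumptions, since the prompt above does not fix them, and the function name and tolerance are hypothetical.

```python
import json
from decimal import Decimal, InvalidOperation

# Assumed key names; the extraction prompt does not dictate them.
REQUIRED_FIELDS = (
    "vendor_name", "invoice_number", "invoice_date", "line_items", "total_amount",
)


def validate_invoice(raw_json: str, expected_vendor: str,
                     tolerance: Decimal = Decimal("0.01")) -> list[str]:
    """Return a list of problems; an empty list means the extraction passed."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if problems:
        return problems

    # Compare against the vendor record, ignoring case and surrounding spaces.
    if str(data["vendor_name"]).strip().lower() != expected_vendor.strip().lower():
        problems.append("vendor name does not match vendor record")

    # Check that line items add up to the stated total. Decimal avoids
    # binary floating-point rounding surprises with currency values.
    try:
        line_sum = sum(
            (Decimal(str(item["amount"])) for item in data["line_items"]),
            Decimal("0"),
        )
        total = Decimal(str(data["total_amount"]))
    except (KeyError, TypeError, InvalidOperation):
        problems.append("line items or total are not numeric")
        return problems

    if abs(line_sum - total) > tolerance:
        problems.append(f"line items sum to {line_sum}, but total is {total}")
    return problems
```

An empty result lets the agent create the bill; any listed problem is a natural trigger for routing the invoice to a person instead.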

Contract Summary for Due Diligence

An M&A agent is processing a stack of contracts as part of due diligence. For each contract PDF, it uploads the document to Gemini and calls Generate Content: "Summarize this contract in 3 bullet points. Identify any unusual terms, termination clauses, or change-of-control provisions. Return as JSON with fields: summary, unusual_terms, risk_level." The agent stores the structured output in the Notion due diligence database. The deal team reviews AI-generated summaries with risk flags instead of reading every contract themselves.
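Before the structured output is stored, it helps to check that the response has the requested shape. The sketch below assumes the three fields named in the prompt (summary, unusual_terms, risk_level). The allowed risk levels are an assumption, because the prompt does not list them, and the class and function names are hypothetical.

```python
import json
from dataclasses import dataclass

EXPECTED_FIELDS = ("summary", "unusual_terms", "risk_level")
# Assumed vocabulary; the prompt above does not constrain risk_level.
ALLOWED_RISK_LEVELS = {"low", "medium", "high"}


@dataclass
class ContractSummary:
    summary: list[str]
    unusual_terms: list[str]
    risk_level: str


def _as_list(value) -> list[str]:
    """Accept either a single string or a list of strings."""
    if isinstance(value, str):
        return [value]
    return [str(item) for item in value]


def parse_contract_summary(raw_json: str) -> ContractSummary:
    """Parse the model response; raise ValueError if it is malformed."""
    data = json.loads(raw_json)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value is not an object")

    missing = [name for name in EXPECTED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")

    risk = str(data["risk_level"]).strip().lower()
    if risk not in ALLOWED_RISK_LEVELS:
        raise ValueError(f"unexpected risk_level: {data['risk_level']!r}")

    return ContractSummary(
        summary=_as_list(data["summary"]),
        unusual_terms=_as_list(data["unusual_terms"]),
        risk_level=risk,
    )
```

Raising on a malformed response, rather than storing it, keeps incomplete rows out of the due diligence database.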

Customer Feedback Classification

Customer support emails arrive in a shared inbox. The agent reads each email and passes the content to Gemini with a classification prompt: "Classify this support message as one of: billing_issue, technical_issue, feature_request, compliment, or other. Extract the key issue in one sentence. Return as JSON." Based on the classification, the agent routes the email to the appropriate team in Jira or Asana. Support triage happens before anyone opens the inbox.
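The routing step can be sketched as a lookup from the returned label to a destination. The five labels come from the prompt above. The JSON keys (classification, key_issue) and the queue names are hypothetical, and anything unrecognized falls back to the "other" queue so that a malformed response never drops an email.

```python
import json

# Labels are from the classification prompt; queue names are illustrative.
ROUTES = {
    "billing_issue": "finance-support",
    "technical_issue": "engineering-support",
    "feature_request": "product",
    "compliment": "customer-success",
    "other": "general-triage",
}


def route_message(raw_json: str) -> tuple[str, str]:
    """Return (queue, key_issue) for one classified support message."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError:
        return ROUTES["other"], ""
    if not isinstance(data, dict):
        return ROUTES["other"], ""

    # Normalize the label, then fall back to "other" for unknown values.
    label = str(data.get("classification", "")).strip().lower()
    queue = ROUTES.get(label, ROUTES["other"])
    return queue, str(data.get("key_issue", ""))
```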

Human-in-Loop Highlight

Gemini AI is a reasoning tool, not a decision-maker. When an agent uses Gemini to classify a contract as high-risk or to extract terms that might affect a business decision, the AI output is an input to the human, not a substitute for the human. The agent formats the Gemini analysis and routes it to the relevant stakeholder via Slack: "Gemini flagged a change-of-control clause in the [Vendor] contract. Here is the relevant section and the AI summary. Review before I proceed with the next step in the approval workflow." The human sees the AI's work. They decide what happens next.

Agent processes routinely

Detects exception requiring judgment

Clear match: Continues automatically

Ambiguous: Routes to human via Slack

Human decides

Agent resumes with decision
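The branch between continuing automatically and routing to a human can be sketched as below. The rule used here (a high risk level or any flagged clause triggers review) is an assumption made for illustration, as are the Analysis fields and the shape of the returned step. Actual Slack delivery is out of scope.

```python
from dataclasses import dataclass, field


@dataclass
class Analysis:
    """Hypothetical container for one Gemini contract analysis."""
    vendor: str
    risk_level: str
    summary: str
    flagged_clauses: list[str] = field(default_factory=list)


def needs_human_review(analysis: Analysis) -> bool:
    # Assumed rule: anything high-risk or with a flagged clause needs judgment.
    return analysis.risk_level == "high" or bool(analysis.flagged_clauses)


def build_review_message(analysis: Analysis) -> str:
    clauses = ", ".join(analysis.flagged_clauses) or "a high-risk term"
    return (
        f"Gemini flagged {clauses} in the {analysis.vendor} contract. "
        f"AI summary: {analysis.summary} "
        "Review before I proceed with the next step in the approval workflow."
    )


def next_step(analysis: Analysis) -> dict:
    """Decide whether the agent continues or pauses for a human decision."""
    if needs_human_review(analysis):
        return {
            "action": "route_to_human",
            "channel": "slack",
            "message": build_review_message(analysis),
        }
    return {"action": "continue"}
```

The agent resumes only after the human's decision comes back, so the AI output stays an input to the decision and not a substitute for it.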

Agent Capabilities

5 actions

Content Generation

Generate Content: Sends a prompt to the Gemini model and returns the response. Configurable parameters include temperature, which controls how random the output is, and output format (text or structured JSON). Used for document summarization, data extraction from unstructured text, content classification, structured JSON generation from natural language, and any task requiring LLM reasoning within a workflow step. The JSON output mode is particularly useful for extraction workflows: the agent prompts Gemini to extract specific fields from a document and receives structured data it can pass directly to downstream systems.
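To make the two parameters concrete, the sketch below builds a request body in the shape of the public Gemini generateContent REST call. How FlowRunner's action maps onto that call is not documented here, so treat this as an assumption: the field names (contents, parts, fileData, generationConfig, responseMimeType) should be checked against the current Gemini API reference, and the function name is hypothetical. Nothing is sent over the network.

```python
from typing import Optional


def build_generate_request(
    prompt: str,
    *,
    temperature: float = 0.0,
    json_output: bool = True,
    file_uri: Optional[str] = None,
    mime_type: Optional[str] = None,
) -> dict:
    """Build a generateContent-style request body (assumed field names)."""
    parts = []
    if file_uri is not None:
        if mime_type is None:
            raise ValueError("mime_type is required when file_uri is given")
        # Reference a previously uploaded file ahead of the text prompt.
        parts.append({"fileData": {"mimeType": mime_type, "fileUri": file_uri}})
    parts.append({"text": prompt})

    # A low temperature favors repeatable output, which suits extraction.
    generation_config = {"temperature": temperature}
    if json_output:
        generation_config["responseMimeType"] = "application/json"

    return {"contents": [{"parts": parts}], "generationConfig": generation_config}
```

A temperature of 0.0 is used as the default here because extraction and classification steps benefit from consistent output; creative generation would typically use a higher value.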

File Management

Upload File: Uploads a document, image, audio file, or video to the Gemini Files API for processing. Used before Generate Content when the input is a file rather than text. Supported types include PDFs, images, audio, and video.
Get File Info: Retrieves metadata about an uploaded file. Used to verify upload status and access file details before referencing them in generation requests.
List Files: Returns files currently stored in the Gemini Files API. Used to manage the file inventory and reference available files in generation workflows.
Delete File: Removes a file from the Gemini Files API. Used in cleanup workflows to manage storage and avoid referencing stale files.
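Taken together, the file actions suggest a lifecycle: upload, check status, generate, then clean up. The sketch below expresses that order with the four actions passed in as plain callables, so it makes no claim about FlowRunner's or Gemini's client signatures. The state names PROCESSING, ACTIVE, and FAILED are assumed from the Gemini Files API, and all function names are hypothetical.

```python
import time
from typing import Callable


def wait_until_active(get_state: Callable[[], str], *,
                      timeout_s: float = 60.0, poll_s: float = 2.0) -> None:
    """Poll the file state until it is ACTIVE, failing on FAILED or timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state == "ACTIVE":
            return
        if state == "FAILED":
            raise RuntimeError("file processing failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"file still in state {state!r} after {timeout_s}s")
        time.sleep(poll_s)


def process_file(
    upload: Callable[[str], str],         # Upload File: path -> file id
    get_state: Callable[[str], str],      # Get File Info: file id -> state
    generate: Callable[[str, str], str],  # Generate Content: (file id, prompt) -> text
    delete: Callable[[str], None],        # Delete File: file id -> None
    path: str,
    prompt: str,
    *,
    poll_s: float = 2.0,
) -> str:
    """Upload a file, wait until it is ready, generate, and always clean up."""
    file_id = upload(path)
    try:
        wait_until_active(lambda: get_state(file_id), poll_s=poll_s)
        return generate(file_id, prompt)
    finally:
        # Runs on success and on failure, so no stale file is left behind.
        delete(file_id)
```

Putting the delete call in a finally block means cleanup happens even when processing fails, which is what keeps later workflows from referencing stale files.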

Start building with Gemini AI

$100 in credits. No card required. Connect in minutes.

Start Building Free

Book a demo