Sign In

Gemini AI

AI

Add Google's multimodal AI to your agent workflows. Agents generate content, analyze documents and images, extract structured data, and process audio and video files through Gemini's Files API.

5 actions available
Vendor submits scanned invoice image that Parseur cannot extract
Agent uploads image to Gemini Files API via Upload File
Agent calls Generate Content with structured extraction prompt for invoice fields
Agent validates extracted JSON against vendor record and expected amounts
Agent creates bill in ERP with validated extracted data
AP team notified via Slack with extraction summary
Gemini-flagged high-risk contract clauses routed to stakeholder for review before proceeding

What This Integration Enables

Gemini integration extends FlowRunner's AI capabilities with Google's multimodal model. Agents use Gemini for tasks that require understanding content beyond text: analyzing scanned documents, interpreting images, processing audio, and generating structured outputs from complex inputs. Combined with FlowRunner's other integrations, Gemini becomes the reasoning layer for document-heavy and multimedia workflows.

Without FlowRunner

Manual scanned invoice entry Staff type data from low-quality scans that Parseur cannot process
Unreviewed contract stacks Due diligence teams read every contract page manually
Inbox-based support triage Support team opens each email to classify and route it

With FlowRunner

AI extraction for any document Scanned images and complex formats processed automatically via Gemini
AI-summarized due diligence Contract summaries with risk flags generated in seconds per document
Pre-classified support queue Emails classified and routed before anyone opens the inbox

Use Case Scenarios

Scanned Invoice Extraction

A vendor submits a scanned invoice image rather than a digital PDF. Parseur cannot extract structured data from a low-quality scan. The agent uploads the image to Gemini using Upload File, then calls Generate Content with a structured extraction prompt: "Extract vendor name, invoice number, invoice date, line items, and total amount from this invoice. Return as JSON." Gemini returns the structured data. The agent validates it and creates the bill in the ERP. Scanned documents that used to require manual entry are handled automatically.

Contract Summary for Due Diligence

An M&A agent is processing a stack of contracts as part of due diligence. For each contract PDF, it uploads the document to Gemini and calls Generate Content: "Summarize this contract in 3 bullet points. Identify any unusual terms, termination clauses, or change-of-control provisions. Return as JSON with fields: summary, unusual_terms, risk_level." The agent stores the structured output in the Notion due diligence database. The deal team reviews AI-generated summaries with risk flags instead of reading every contract themselves.

Customer Feedback Classification

Customer support emails arrive in a shared inbox. The agent reads each email, passes the content to Gemini with a classification prompt: "Classify this support message as one of: billing_issue, technical_issue, feature_request, compliment, or other. Extract the key issue in one sentence. Return as JSON." Based on the classification, the agent routes the email to the appropriate team in Jira or Asana. Support triage happens before anyone opens the inbox.

Human-in-Loop Highlight

Gemini AI is a reasoning tool, not a decision-maker. When an agent uses Gemini to classify a contract as high-risk or to extract terms that might affect a business decision, the AI output is an input to the human, not a substitute for the human. The agent formats the Gemini analysis and routes it to the relevant stakeholder via Slack: "Gemini flagged a change-of-control clause in the [Vendor] contract. Here is the relevant section and the AI summary. Review before I proceed with the next step in the approval workflow." The human sees the AI's work. They decide what happens next.

Agent processes routinely
Detects exception requiring judgment
Clear match Continues automatically
Ambiguous Routes to human via Slack
Human decides
Agent resumes with decision

Agent Capabilities

5 actions

Content Generation

1
  • Generate Content Sends a prompt to the Gemini model and returns a response. Configurable parameters include temperature for model behavior control and output format selection (text or structured JSON). Used for: document summarization, data extraction from unstructured text, content classification, structured JSON generation from natural language, and any task requiring LLM reasoning within a workflow step. The JSON output mode is particularly useful for extraction workflows: the agent prompts Gemini to extract specific fields from a document and receive structured data it can immediately pass to downstream systems.

File Management

4
  • Upload File Uploads a document, image, audio file, or video to the Gemini Files API for processing. Used before Generate Content when the input is a file rather than text. Supported types include PDFs, images, audio, and video.
  • Get File Info Retrieves metadata about an uploaded file. Used to verify upload status and access file details before referencing them in generation requests.
  • List Files Returns files currently stored in the Gemini Files API. Used to manage the file inventory and reference available files in generation workflows.
  • Delete File Removes a file from the Gemini Files API. Used in cleanup workflows to manage storage and avoid referencing stale files.

Start building with Gemini AI

$100 in credits. No card required. Connect in minutes.