API Feature Comparison
Below is an updated overview (as of 2024-12-12) of the core features and capabilities provided by three prominent AI APIs: Anthropic, Gemini, and OpenAI. This summary includes details on functionality such as function calls, image and vision processing, system directives, and more.
Feature Category
Feature
Anthropic
Gemini
OpenAI
Message & Interaction Structure
Role labels
user, assistant
user, model
user, assistant, system, tool
Named participants
Not available
Not available
Yes (can identify participants by name)
Content represented as array
Supported
Supported
Supported
Content Types & Multimodal Abilities
Text generation
Supported
Supported
Supported
Image interpretation
Supported
Supported
Supported
Audio processing
Not supported
Supported
Not supported
Video processing
Not supported
Supported
Not supported
Image Handling
Supported formats
JPEG, PNG, GIF, WebP
JPEG, PNG, WebP, HEIC, HEIF
PNG, JPEG, WebP, non-animated GIF
Maximum file size
5MB per image
(20MB per request)
20MB per image
Detail level
Not applicable
Not applicable
Low, high, auto
Allowed resolution
Up to 1568×1568
Minimum 768×768, up to 3072×3072
Minimum 512×512, up to 2048×2048
Token counting for images
(width × height) / 750; max 1,600
258 tokens
85 + 170 × {patches}
Image data retention
Deleted immediately after processing
Not specified
Removed once processing is complete
Audio & Video Handling
Supported audio formats
Not applicable
WAV, MP3, AIFF, AAC, OGG, FLAC
Not applicable
Supported video formats
Not applicable
MP4, MPEG, MOV, AVI, MPG, WebM, WMV, 3GPP
Not applicable
System Instructions & Tools
System instruction method
Provided (array of text sections)
Provided (parts array)
Provided (system message)
Function/Tool Usage
Parallel tool calls
Not supported
Not supported
Supported
Tool declarations
Stored in a tools array
Stored in a tools array
Stored in a tools array
Function name constraints
Yes
Yes (up to 63 characters)
Yes (up to 64 characters)
Function definition
name, description, input_schema
name, description, parameters
name, description, parameters
Option structure
JSON Schema for input
Object with properties
JSON Schema for parameters
Forced function usage
Via tool_choice parameter
Via toolConfig parameter
Via tool_choice parameter
Model-triggered invocation
The model outputs a tool_use block with guessed parameters
Produces a functionCall part with estimated parameters
Produces message.tool_calls with predicted arguments
Execution
Client side
Client side
Client side
Injecting tool results
A user message is appended with a tool_result content block
A function message is appended with functionResponse
A new tool message is sent with tool_call_id and content
Built-in code execution
Not supported
Supported
Not supported
Vision-integrated tool use
Supported
Supported
Supported
Generation Configuration
temperature
Supported
Supported
Supported
max_tokens
Supported
Supported
Supported
stop_sequences
Supported
Supported
Supported
top_k
Supported
Supported
Not supported
top_p
Supported
Supported
Supported
seed
Not supported
Not supported
Supported
Multiple candidate outputs
Not supported
Not supported
Supported (via n parameter, may affect streaming)
Streaming & Response Layout
Streamed responses
Supported
Supported
Supported
Enabling streaming
stream=true
streamGenerateContent path
stream=true
Streaming events
Multiple event types
Not specified
Single delta-type event
Response container
content (array)
candidates (array)
choices (array)
Usage Metrics & Errors
Token usage
Yes
Yes
Yes
Granular token breakdown
input, output
prompt, cached, candidates, total
prompt, completion, total
Live usage metrics
Not supported
Not supported
Optional
Error data in response
Not specified
Not specified
Yes (but undocumented)
Error data in streaming
Not specified
Not specified
Yes (but undocumented)
Advanced Capabilities
JSON-specific mode
Partial (using structured prompts)
Supported (responseMimeType)
Supported
Consistency features
Supported (various methods)
Not specified
Not specified
Log probabilities
Not supported
Not supported
Supported (though not in schema by default)
System fingerprint
Not supported
Not supported
Supported
Semantic caching
Not supported
Supported
Not supported
Assistant prefill
Supported
Not supported
Not supported
Preferred formatting
XML tags, JSON
Not specified
Markdown
Safety & Compliance
Request-level safety settings
Stop sequences
Detailed category-based
Moderation endpoint
Safety feedback
Supported
Supported
Not specified