API Feature Comparison

Below is an updated overview (as of 2024-12-12) of the core features and capabilities provided by three prominent AI APIs: Anthropic, Gemini, and OpenAI. This summary includes details on functionality such as function calls, image and vision processing, system directives, and more.

Feature Category

Feature

Anthropic

Gemini

OpenAI

Message & Interaction Structure

Role labels

user, assistant

user, model

user, assistant, system, tool

Named participants

Not available

Not available

Yes (can identify participants by name)

Content represented as array

Supported

Supported

Supported

Content Types & Multimodal Abilities

Text generation

Supported

Supported

Supported

Image interpretation

Supported

Supported

Supported

Audio processing

Not supported

Supported

Not supported

Video processing

Not supported

Supported

Not supported

Image Handling

Supported formats

JPEG, PNG, GIF, WebP

JPEG, PNG, WebP, HEIC, HEIF

PNG, JPEG, WebP, non-animated GIF

Maximum file size

5MB per image

(20MB per request)

20MB per image

Detail level

Not applicable

Not applicable

Low, high, auto

Allowed resolution

Up to 1568×1568

Minimum 768×768, up to 3072×3072

Minimum 512×512, up to 2048×2048

Token counting for images

(width × height) / 750; max 1,600

258 tokens

85 + 170 × {patches}

Image data retention

Deleted immediately after processing

Not specified

Removed once processing is complete

Audio & Video Handling

Supported audio formats

Not applicable

WAV, MP3, AIFF, AAC, OGG, FLAC

Not applicable

Supported video formats

Not applicable

MP4, MPEG, MOV, AVI, MPG, WebM, WMV, 3GPP

Not applicable

System Instructions & Tools

System instruction method

Provided (array of text sections)

Provided (parts array)

Provided (system message)

Function/Tool Usage

Parallel tool calls

Not supported

Not supported

Supported

Tool declarations

Stored in a tools array

Stored in a tools array

Stored in a tools array

Function name constraints

Yes

Yes (up to 63 characters)

Yes (up to 64 characters)

Function definition

name, description, input_schema

name, description, parameters

name, description, parameters

Option structure

JSON Schema for input

Object with properties

JSON Schema for parameters

Forced function usage

Via tool_choice parameter

Via toolConfig parameter

Via tool_choice parameter

Model-triggered invocation

The model outputs a tool_use block with guessed parameters

Produces a functionCall part with estimated parameters

Produces message.tool_calls with predicted arguments

Execution

Client side

Client side

Client side

Injecting tool results

A user message is appended with a tool_result content block

A function message is appended with functionResponse

A new tool message is sent with tool_call_id and content

Built-in code execution

Not supported

Supported

Not supported

Vision-integrated tool use

Supported

Supported

Supported

Generation Configuration

temperature

Supported

Supported

Supported

max_tokens

Supported

Supported

Supported

stop_sequences

Supported

Supported

Supported

top_k

Supported

Supported

Not supported

top_p

Supported

Supported

Supported

seed

Not supported

Not supported

Supported

Multiple candidate outputs

Not supported

Not supported

Supported (via n parameter, may affect streaming)

Streaming & Response Layout

Streamed responses

Supported

Supported

Supported

Enabling streaming

stream=true

streamGenerateContent path

stream=true

Streaming events

Multiple event types

Not specified

Single delta-type event

Response container

content (array)

candidates (array)

choices (array)

Usage Metrics & Errors

Token usage

Yes

Yes

Yes

Granular token breakdown

input, output

prompt, cached, candidates, total

prompt, completion, total

Live usage metrics

Not supported

Not supported

Optional

Error data in response

Not specified

Not specified

Yes (but undocumented)

Error data in streaming

Not specified

Not specified

Yes (but undocumented)

Advanced Capabilities

JSON-specific mode

Partial (using structured prompts)

Supported (responseMimeType)

Supported

Consistency features

Supported (various methods)

Not specified

Not specified

Log probabilities

Not supported

Not supported

Supported (though not in schema by default)

System fingerprint

Not supported

Not supported

Supported

Semantic caching

Not supported

Supported

Not supported

Assistant prefill

Supported

Not supported

Not supported

Preferred formatting

XML tags, JSON

Not specified

Markdown

Safety & Compliance

Request-level safety settings

Stop sequences

Detailed category-based

Moderation endpoint

Safety feedback

Supported

Supported

Not specified