Groq

Groq is known for its ultra-fast inference speeds powered by custom LPU (Language Processing Unit) hardware. Their API provides access to popular open-source models like Llama, Mixtral, and Gemma with industry-leading latency. Ideal for real-time applications.

Official Website Documentation

Endpoints

POST

Chat Completions

/openai/v1/chat/completions

Ultra-fast chat with open-source models.

Streaming

Vision

Function Calling

Models

llama-3.3-70b-versatilellama-3.1-8b-instantmixtral-8x7b-32768gemma2-9b-it

Pricing

Model	Input Price	Output Price	Context Window
Llama 3.3 70B	$0.59	$0.79	128.0K
Llama 3.1 8B	$0.05	$0.08	128.0K
Mixtral 8x7B	$0.24	$0.24	32.8K

Capabilities

Streaming

Function Calling

Tool Use

JSON Mode

Image Input

Audio Input

Try it in Playground

Test Groq APIs directly in our interactive playground with your own API key.

Try it in Playground

Groq

Endpoints

Chat Completions

Capabilities

Try it in Playground

Overview

Groq

Endpoints

Chat Completions

Capabilities

Try it in Playground

Overview