LLM Service Overview

Access large language models through WaveSpeedAI’s unified API.

What is LLM Service?

WaveSpeedAI provides access to multiple large language models (LLMs) from different providers through a single API. Instead of managing multiple API keys and integrations, use one platform to access models from OpenAI, Google, ByteDance, Mistral, and more.
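The single-API idea can be sketched as a request body whose shape stays the same no matter which provider backs the model. This is a minimal illustration, assuming an OpenAI-style chat-completions payload; the model IDs are hypothetical placeholders, not documented values — use View Code in the Playground for the exact request.

```python
# Sketch: one request-body shape reused across providers.
# Model IDs below are hypothetical placeholders.

def build_chat_request(model: str, prompt: str) -> dict:
    """Build a provider-agnostic chat request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# The same helper works regardless of which provider serves the model:
for model in ["openai/gpt-example", "google/gemini-example"]:
    body = build_chat_request(model, "Hello!")
```

Switching providers then means changing only the `model` string, not the integration code.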

Features

| Feature | Description |
| --- | --- |
| Multiple providers | Access OpenAI, Google, ByteDance, Mistral, and NVIDIA models |
| Web Playground | Test models directly at wavespeed.ai/llm |
| Streaming | Real-time response streaming |
| Enable Thinking | Some models support reasoning/thinking mode |
| View Code | Generate API code directly from the Playground |
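Streamed responses are typically delivered incrementally as server-sent events. The sketch below shows one way to consume such a stream; the `data:` framing and the delta field names are assumptions borrowed from the common OpenAI-style wire format, so check the View Code output for the actual format.

```python
import json

def iter_stream_text(lines):
    """Yield text fragments from SSE-style 'data: {...}' lines.

    Assumes an OpenAI-style delta payload; the field names are an
    assumption, not documented WaveSpeedAI behavior.
    """
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"].get("content", "")
        if delta:
            yield delta

# Example with two content chunks followed by the end sentinel:
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # → Hello
```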

Web Playground

Try LLMs directly in your browser:

  1. Go to wavespeed.ai/llm
  2. Select a model from the dropdown
  3. Configure parameters (temperature, max_tokens, etc.)
  4. Type your message and start chatting
  5. Click View Code to get the API code for your configuration

Available Parameters

| Parameter | Description | Range |
| --- | --- | --- |
| max_tokens | Maximum response length | Up to 16,384 |
| temperature | Creativity/randomness | 0.0 - 2.0 |
| top_p | Nucleus sampling | 0.0 - 1.0 |
| top_k | Top-k sampling | 1 - 100 |
| presence_penalty | Penalize repeated topics | -2.0 - 2.0 |
| frequency_penalty | Penalize repeated words | -2.0 - 2.0 |
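The ranges above can be checked client-side before sending a request. This is a minimal sketch (the `validate_params` helper is illustrative, not part of any SDK):

```python
# Documented parameter ranges from the table above.
RANGES = {
    "max_tokens": (1, 16_384),
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "top_k": (1, 100),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}

def validate_params(params: dict) -> dict:
    """Raise ValueError for any parameter outside its documented range."""
    for name, value in params.items():
        lo, hi = RANGES[name]
        if not lo <= value <= hi:
            raise ValueError(f"{name}={value} is outside [{lo}, {hi}]")
    return params

validate_params({"temperature": 0.7, "top_p": 0.9})  # passes silently
```

Validating locally gives a clear error message instead of a rejected API call.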

Use Cases

| Use Case | Description |
| --- | --- |
| Chatbots | Build conversational AI assistants |
| Content Generation | Write articles, marketing copy, stories |
| Code Generation | Generate, explain, and debug code |
| Analysis | Summarize documents, extract information |
| Translation | Translate between languages |
| Reasoning | Complex problem solving with thinking mode |

Pricing

LLM pricing is based on tokens:

| Token Type | Description |
| --- | --- |
| Input tokens | Your messages and prompts |
| Output tokens | The model's response |

Example pricing (varies by model):

  • Input: $0.0750 / 1M tokens
  • Output: $0.3000 / 1M tokens

Check the Playground for each model’s specific pricing.
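Putting the token-based pricing together, a request's cost is input tokens times the input rate plus output tokens times the output rate. A small sketch using the example rates above (real rates vary by model):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float = 0.075, output_rate: float = 0.30) -> float:
    """Estimate request cost in USD; rates are per 1M tokens.

    Defaults are the example rates from this page; check the
    Playground for each model's actual pricing.
    """
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# e.g. a 2,000-token prompt with a 500-token reply:
print(f"${estimate_cost(2_000, 500):.6f}")  # → $0.000300
```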

© 2025 WaveSpeedAI. All rights reserved.