Qwen: Qwen3 Embedding 8B – API Quickstart

Sample code and API for Qwen3 Embedding 8B

Create an API key from your OpenRouter dashboard and set it as an environment variable named `OPENROUTER_API_KEY`.
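As a minimal sketch of the step above, the key can be read back in Python once it is set. The variable name `OPENROUTER_API_KEY` is taken from the Authorization header shown in the request summary on this page; the helper name `load_api_key` is hypothetical and not part of any OpenRouter SDK.

```python
import os


def load_api_key(env_var: str = "OPENROUTER_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises RuntimeError if the variable is unset or blank, so a missing
    key fails early instead of producing an authentication error later.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return key
```

Failing fast here keeps the key out of source code while still giving a clear message when the environment is not configured.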

Use the model ID `qwen/qwen3-embedding-8b` with the OpenRouter API.

OpenRouter provides an OpenAI-compatible embeddings API that you can call directly or through the OpenAI SDK.
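Because the endpoint is described as OpenAI-compatible, a response would presumably follow the OpenAI embeddings shape: a `data` list whose items each carry an `index` and an `embedding`. The sketch below assumes that shape; `extract_embeddings` is a hypothetical helper, and `sample` is a hand-written stand-in, not real model output.

```python
from typing import Any


def extract_embeddings(response: dict[str, Any]) -> list[list[float]]:
    """Return the embedding vectors ordered by their `index` field.

    Assumes an OpenAI-style embeddings response body.
    """
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Hand-written stand-in for a response body (values are illustrative only).
sample = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
    ],
    "model": "qwen/qwen3-embedding-8b",
}

vectors = extract_embeddings(sample)
```

Sorting by `index` keeps each vector aligned with the position of its input text, even if the items arrive out of order.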

In the examples below, the OpenRouter-specific headers are optional. Setting them allows your app to appear on the OpenRouter leaderboards.

For information about using third-party SDKs and frameworks with OpenRouter, please see our frameworks documentation.

Method and URL: POST https://openrouter.ai/api/v1/embeddings
Authorization: Bearer $OPENROUTER_API_KEY
Content-Type: application/json
HTTP-Referer: (optional) your site URL, for rankings
X-Title: (optional) your site name, for rankings
Model: qwen/qwen3-embedding-8b
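The request described above can be sketched with Python's standard library alone. The URL, header names, and model ID come from this page; the JSON body fields `model` and `input` are an assumption based on the OpenAI embeddings format, and the names `build_request` and `send` are hypothetical.

```python
import json
import urllib.request

API_URL = "https://openrouter.ai/api/v1/embeddings"
MODEL = "qwen/qwen3-embedding-8b"


def build_request(api_key, inputs, site_url=None, site_name=None):
    """Build (but do not send) a POST request for the embeddings endpoint."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # The OpenRouter-specific headers are optional; add them only if given.
    if site_url:
        headers["HTTP-Referer"] = site_url
    if site_name:
        headers["X-Title"] = site_name
    # Assumed body shape: OpenAI-style "model" and "input" fields.
    body = json.dumps({"model": MODEL, "input": inputs}).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def send(request, timeout=30.0):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

A call might look like `send(build_request(api_key, ["first text", "second text"]))`. Keeping construction separate from sending makes the headers and body easy to inspect without network access.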

Name	Type	Default	Description
`max_tokens`	integer	—	This sets the upper limit for the number of tokens the model can generate in response.
`temperature`	float	`1`	This setting influences the variety in the model's responses.
`top_p`	float	`1`	This setting limits the model's choices to a percentage of likely tokens: only the top tokens whose probabilities add up to P.
`stop`	array	—	Stops generation immediately if the model encounters any token specified in the stop array.
`frequency_penalty`	float	`0`	This setting aims to control the repetition of tokens based on how often they appear in the input.
`presence_penalty`	float	`0`	Adjusts how often the model repeats specific tokens already used in the input.
`seed`	integer	—	If specified, inference will sample deterministically, so that repeated requests with the same seed and parameters should return the same result.
`top_k`	integer	`0`	This limits the model's choice of tokens at each step, making it choose from a smaller set.
`logit_bias`	map	—	Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100.
`logprobs`	boolean	—	Whether to return log probabilities of the output tokens or not.
`top_logprobs`	integer	—	An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability.