
MCP Token Counter

The token counter tool estimates token counts for text across multiple LLM tokenizers. It supports GPT-4, Claude, Gemini, and Llama tokenization. Agents use it to track context budget consumption, prevent overflow, and decide when to compact sessions. It returns character count, word count, and estimated tokens for each requested model. The tool runs locally with no external API calls, making it fast and free.

Endpoint

POST /api/v1/utils/token-estimate

Parameters

Name    Type     Required  Description
text    string   Yes       The text to count tokens for.
model   string   No        Target model for estimation: gpt-4, claude-3, gemini, llama-3. Defaults to gpt-4.

Examples

Basic token count

POST /api/v1/utils/token-estimate
{"text": "The agent processed 15 tool calls in the current session.", "model": "gpt-4"}

Returns an estimated token count for the specified model.
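A minimal Python sketch of calling the endpoint. The server address is an assumption, and the response field names are illustrative, not confirmed by this page:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: local Delx server address

def build_payload(text, model="gpt-4"):
    """Construct the JSON request body described under Parameters."""
    return json.dumps({"text": text, "model": model}).encode("utf-8")

def estimate_tokens(text, model="gpt-4"):
    """POST the text to the token-estimate endpoint and return the parsed response."""
    req = urllib.request.Request(
        BASE_URL + "/api/v1/utils/token-estimate",
        data=build_payload(text, model),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Since the tool runs locally with no external API calls, latency is dominated by the local round trip.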

Context budget check

POST /api/v1/utils/token-estimate
{"text": "[full session transcript here]", "model": "claude-3"}

Useful for checking how much of a model's context window is consumed. With Claude 3's 200k-token window, a 12,450-token transcript uses only about 6.2% of the budget.
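The percentage quoted above is simple arithmetic on the estimate:

```python
# Share of Claude 3's 200k-token window used by a 12,450-token transcript.
estimated_tokens = 12_450
context_limit = 200_000  # Claude 3 context window, per the text above
usage_pct = estimated_tokens / context_limit * 100  # about 6.2
```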

Use Cases

Supported models and accuracy

Token counts are estimates based on published tokenizer characteristics. For GPT-4 (cl100k_base), accuracy is within 2% of actual. For Claude 3, accuracy is within 5%. For Gemini and Llama, accuracy is within 8%. The estimates are conservative — they slightly overcount to prevent unexpected overflow. For exact counts, use the model provider's tokenizer directly.
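The "conservative, round up" behavior can be illustrated with a crude heuristic. The ~4 characters-per-token ratio below is an assumption for illustration only; the real tool models each tokenizer's published characteristics:

```python
import math

def rough_estimate(text: str, chars_per_token: float = 4.0) -> int:
    """Crude sketch of a conservative estimate: divide by an assumed
    chars-per-token ratio and round UP, so the result slightly
    overcounts rather than risking overflow."""
    return math.ceil(len(text) / chars_per_token)
```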

Integration with heartbeat

Pass your token count as context_usage_pct in heartbeat calls. Calculate it as: (estimated_tokens / model_context_limit) * 100. This gives the heartbeat tool accurate context usage data for wellness scoring. When context_usage_pct exceeds 80%, heartbeat automatically recommends compaction.
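The calculation above can be sketched directly; the function names are illustrative, but the formula and the 80% threshold come from the text:

```python
def context_usage_pct(estimated_tokens: int, model_context_limit: int) -> float:
    """Formula from the docs: (estimated_tokens / model_context_limit) * 100."""
    return estimated_tokens / model_context_limit * 100

def should_compact(usage_pct: float, threshold: float = 80.0) -> bool:
    """Heartbeat recommends compaction when context_usage_pct exceeds 80%."""
    return usage_pct > threshold
```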

FAQ

How accurate are the estimates?

Within 2-8% depending on the model. GPT-4 is most accurate because cl100k_base tokenizer behavior is well-documented. The tool always rounds up to be safe.

Can I count tokens for images or embeddings?

No. The tool counts text tokens only. For multimodal token counting, use the model provider's API directly.

Is there a size limit?

The tool handles inputs up to 1MB of text. For larger inputs, split and sum. In practice, even 200k-token contexts are under 800KB of text.
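The "split and sum" advice can be sketched as follows. The `estimate` callable is a hypothetical stand-in for one call to the token-estimate endpoint, and the 250k-character chunk size is an assumption chosen to stay under 1MB even if every character takes 4 bytes in UTF-8:

```python
def chunk_text(text: str, max_chars: int = 250_000):
    """Yield chunks small enough for one request each (250k chars is
    under 1 MB even at 4 bytes per character in UTF-8)."""
    for i in range(0, len(text), max_chars):
        yield text[i:i + max_chars]

def total_tokens(text: str, estimate) -> int:
    """Estimate each chunk separately and sum the results.
    `estimate` is a hypothetical callable wrapping the endpoint."""
    return sum(estimate(chunk) for chunk in chunk_text(text))
```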

Is it free?

Yes. Token counting is a free utility tool. No API key required, no rate limits for reasonable usage.

CLI usage?

delx utils tokens "your text here" --model gpt-4. Reads from stdin if no text argument is provided.