lm_polygraph.utils.api_with_uncertainty module
API model wrapper with uncertainty estimation, analogous to VLLMWithUncertainty.
Wraps any OpenAI-compatible API model with lm-polygraph uncertainty scoring. Supports both generation (delegated to the wrapped model) and standalone scoring of pre-extracted logprobs.
- Usage:
    from lm_polygraph.estimators import MeanTokenEntropy
    from lm_polygraph.stat_calculators import VLLMLogprobsExtractionCalculator, EntropyCalculator
    from lm_polygraph.utils import APIWithUncertainty

    # Wrap an existing API model
    model_with_uncertainty = APIWithUncertainty(
        model=blackbox_model,
        stat_calculators=[VLLMLogprobsExtractionCalculator(), EntropyCalculator()],
        estimator=MeanTokenEntropy(),
    )

    # Option 1: Generate with immediate scoring
    results = model_with_uncertainty.generate(chats, max_new_tokens=1024, n=8)
    # results[i][j]["uncertainty_score"], results[i][j]["token_ids"], etc.

    # Option 2: Score pre-extracted logprobs separately
    uncertainty = model_with_uncertainty.score(token_ids, logprobs)

    # Get pseudo-tokenizer for step boundary mapping
    tokenizer = model_with_uncertainty.get_tokenizer()
    tokenizer.set_context(token_ids, logprobs)
    text = tokenizer.decode(token_ids[0:5])
- class lm_polygraph.utils.api_with_uncertainty.APILogprobData(logprob: float, token: str)
Bases: object
Minimal logprob entry mirroring vLLM’s logprob format.
- logprob: float
- token: str
- class lm_polygraph.utils.api_with_uncertainty.APIWithUncertainty(model=None, stat_calculators: List = None, estimator=None)
Bases: object
Wraps an OpenAI-compatible API model with uncertainty estimation, analogous to VLLMWithUncertainty for vLLM models.
Delegates generation to the wrapped model and scores outputs using lm-polygraph stat calculators and estimators. Also supports standalone scoring of pre-extracted logprobs via score().
- Args:
- model: API model instance with a generate_texts(chats, **kwargs) method that returns results with “logprobs” in OpenAI API format. Can be None if only score() is used on pre-extracted logprobs.
- stat_calculators: List of lm-polygraph stat calculators (e.g., [VLLMLogprobsExtractionCalculator(), EntropyCalculator()]).
- estimator: lm-polygraph Estimator instance (e.g., MeanTokenEntropy, Perplexity).
- generate(chats: List[List[Dict[str, str]]], compute_uncertainty: bool = True, **kwargs) -> List[List[Dict]]
Generate completions with optional uncertainty scores.
Delegates to the wrapped model’s generate_texts(), converts logprobs to vLLM format, and optionally computes uncertainty scores.
- Args:
chats: List of chat message lists.
compute_uncertainty: If True, compute uncertainty for all outputs.
**kwargs: Generation parameters passed to model.generate_texts() (max_new_tokens, temperature, n, stop, etc.).
- Returns:
- List of lists of result dicts. Each result dict contains:
text: Generated text
logprobs: API-format logprobs
token_ids: Pseudo token IDs (vLLM-compatible)
vllm_logprobs: Logprobs in vLLM format
uncertainty_score: Float (if compute_uncertainty=True)
finish_reason: API finish reason
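The nested return shape is easiest to see on a small example. The sketch below does not call the library; `mock_results` is a hand-written stand-in for the value returned by generate(), with made-up texts and scores, and `pick_most_confident` is a hypothetical helper. It assumes one inner list per chat and that a lower uncertainty_score means a more confident completion.

```python
from typing import Dict, List


def pick_most_confident(results: List[List[Dict]]) -> List[Dict]:
    """For each chat, return the completion with the lowest uncertainty score."""
    return [min(samples, key=lambda r: r["uncertainty_score"]) for samples in results]


# Hand-written stand-in for generate() output: one inner list per chat.
# Only the fields used below are filled in; all values are illustrative.
mock_results = [
    [
        {"text": "Paris", "uncertainty_score": 0.12, "finish_reason": "stop"},
        {"text": "Lyon", "uncertainty_score": 0.87, "finish_reason": "stop"},
    ],
    [
        {"text": "4", "uncertainty_score": 0.05, "finish_reason": "stop"},
    ],
]

best = pick_most_confident(mock_results)
print([r["text"] for r in best])
```

With a real wrapper, the same helper could be applied directly to the output of generate(chats, n=8) to keep one completion per chat.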
- get_tokenizer()
Return pseudo-tokenizer for step boundary mapping.
The returned tokenizer implements decode(token_ids) by looking up token text from logprob entries. Call tokenizer.set_context() with the full trajectory’s token_ids and logprobs before using decode().
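A minimal sketch of how such a pseudo-tokenizer could work, assuming each logprobs entry is a dict from pseudo token ID to an object with a `token` attribute. `SketchTokenizer` is a hypothetical class written for illustration, not the object that get_tokenizer() returns; the IDs and tokens are made up.

```python
from types import SimpleNamespace
from typing import Dict, List


class SketchTokenizer:
    """Illustrative pseudo-tokenizer: decodes IDs via a table built from logprob entries."""

    def __init__(self) -> None:
        self._id_to_text: Dict[int, str] = {}

    def set_context(self, token_ids: List[int], logprobs: List[Dict]) -> None:
        # Rebuild the lookup table from the sampled token at every position.
        self._id_to_text = {
            tid: step[tid].token for tid, step in zip(token_ids, logprobs)
        }

    def decode(self, token_ids: List[int]) -> str:
        # Concatenate token texts; API tokens already carry their own whitespace.
        return "".join(self._id_to_text[tid] for tid in token_ids)


# Made-up trajectory with three tokens.
ids = [11, 22, 33]
steps = [
    {11: SimpleNamespace(token="Hello", logprob=-0.1)},
    {22: SimpleNamespace(token=",", logprob=-0.2)},
    {33: SimpleNamespace(token=" world", logprob=-0.3)},
]

tok = SketchTokenizer()
tok.set_context(ids, steps)
print(tok.decode(ids[0:2]))
```

Because the table is built per trajectory, decode() in this sketch raises KeyError for IDs that were not part of the context passed to set_context().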
- score(token_ids: List[int], logprobs: List[Dict]) -> float
Compute uncertainty score from token IDs and logprobs.
Can be used standalone on pre-extracted logprobs, or called internally by generate(). Mirrors VLLMWithUncertainty.score().
- Args:
token_ids: Pseudo token IDs (from convert_api_logprobs).
logprobs: Logprob dicts in vLLM-compatible format.
- Returns:
Uncertainty score (float). Higher = more uncertain.
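To illustrate the "higher = more uncertain" convention, here is a standalone sketch that scores a sequence by the mean negative log-probability of its sampled tokens. This is a perplexity-style stand-in chosen for illustration; `mean_neg_logprob` is a hypothetical function, and the actual value returned by score() depends on the stat calculators and estimator passed to the wrapper.

```python
from types import SimpleNamespace
from typing import Dict, List


def mean_neg_logprob(token_ids: List[int], logprobs: List[Dict]) -> float:
    """Average negative log-probability of the sampled tokens (higher = less confident)."""
    chosen = [step[tid].logprob for tid, step in zip(token_ids, logprobs)]
    return -sum(chosen) / len(chosen)


# Two made-up two-token sequences in the vLLM-style layout:
# one dict per position, keyed by pseudo token ID.
confident = mean_neg_logprob(
    [1, 2],
    [{1: SimpleNamespace(logprob=-0.1)}, {2: SimpleNamespace(logprob=-0.1)}],
)
uncertain = mean_neg_logprob(
    [1, 2],
    [{1: SimpleNamespace(logprob=-0.5)}, {2: SimpleNamespace(logprob=-1.5)}],
)
print(confident < uncertain)
```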
- lm_polygraph.utils.api_with_uncertainty.convert_api_logprobs(api_logprobs: List[Dict]) -> tuple
Convert OpenAI API logprobs to lm-polygraph/vLLM format.
API returns: [{token: str, logprob: float, top_logprobs: [{token, logprob}]}]
lm-polygraph expects: (List[int], List[Dict[int -> obj_with_logprob_attr]])
Uses hash-based pseudo token IDs since API doesn’t provide real IDs.
- Args:
api_logprobs: List of logprob entries from OpenAI API.
- Returns:
Tuple of (pseudo_token_ids, vllm_format_logprobs).
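The conversion described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the library's code: `LogprobEntry`, `pseudo_id`, and `convert_sketch` are hypothetical names, and the choice of a truncated SHA-1 digest for the pseudo ID is an assumption made so that IDs are stable across runs (Python's built-in hash() of strings varies between processes).

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class LogprobEntry:
    """Stand-in for APILogprobData: exposes .logprob and .token as attributes."""
    logprob: float
    token: str


def pseudo_id(token: str) -> int:
    # Assumed scheme: first 4 bytes of SHA-1, deterministic across processes.
    return int.from_bytes(hashlib.sha1(token.encode("utf-8")).digest()[:4], "big")


def convert_sketch(api_logprobs: List[Dict]) -> Tuple[List[int], List[Dict[int, LogprobEntry]]]:
    token_ids: List[int] = []
    steps: List[Dict[int, LogprobEntry]] = []
    for entry in api_logprobs:
        # Alternatives first, then the sampled token, so it is always present.
        step = {
            pseudo_id(alt["token"]): LogprobEntry(alt["logprob"], alt["token"])
            for alt in entry.get("top_logprobs", [])
        }
        tid = pseudo_id(entry["token"])
        step[tid] = LogprobEntry(entry["logprob"], entry["token"])
        token_ids.append(tid)
        steps.append(step)
    return token_ids, steps


# Made-up input in the OpenAI logprobs layout.
api = [
    {"token": "Hi", "logprob": -0.1,
     "top_logprobs": [{"token": "Hi", "logprob": -0.1}, {"token": "Hey", "logprob": -2.5}]},
    {"token": "!", "logprob": -0.3, "top_logprobs": []},
]
ids, steps = convert_sketch(api)
print(steps[0][ids[0]].token, steps[1][ids[1]].logprob)
```

Identical token strings map to the same pseudo ID in this scheme, which is what lets a pseudo-tokenizer recover text from IDs alone.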