LM-Polygraph Normalization Methods

LM-Polygraph implements several uncertainty normalization methods to convert raw uncertainty scores into more interpretable confidence values bounded between 0 and 1. Here are the key normalization approaches:

MinMax Normalization (MinMaxNormalizer in minmax.py)

  • Takes raw uncertainty scores and linearly scales them to [0,1] range.

  • Flips the sign since uncertainty scores should be negatively correlated with confidence.

  • Uses scikit-learn’s MinMaxScaler internally.

  • Simple but doesn’t maintain a direct connection to output quality.

Quantile Normalization (QuantileNormalizer in quantile.py)

  • Transforms uncertainty scores into their corresponding percentile ranks.

  • Uses empirical CDF to map scores to [0,1] range.

  • Provides uniformly distributed confidence scores.

  • May lose some granularity of original uncertainty estimates.

Performance-Calibrated Confidence (PCC) Methods

Binned PCC (BinnedPCCNormalizer in binned_pcc.py)

  • Splits calibration data into bins based on uncertainty values.

  • Each bin has approximately an equal number of samples.

  • Confidence score is the mean output quality of samples in the corresponding bin.

  • Provides an interpretable connection between confidence and expected quality.

  • Drawback: Can change ordering of samples compared to raw uncertainty.

Isotonic PCC (IsotonicPCCNormalizer in isotonic_pcc.py)

  • Uses Centered Isotonic Regression (CIR) to fit a monotonic relationship.

  • Maps uncertainty scores to output quality while preserving ordering.

  • Enforces monotonicity constraint to maintain uncertainty ranking.

  • More robust than the binned approach while maintaining interpretability.

  • Implementation based on CIR algorithm from Oron & Flournoy (2017).

Common Interface: BaseUENormalizer

All normalizers follow a common interface defined in BaseUENormalizer:

  • fit(): Learns normalization parameters from calibration data.

  • transform(): Applies normalization to new uncertainty scores.

  • dumps()/loads(): Serialization support for fitted normalizers.

Key Benefits of PCC Methods

  • Direct connection to output quality metrics.

  • Bounded interpretable range [0,1].

  • Maintained correlation with generation quality.

  • Easy to explain meaning to end users.

Highlight: Isotonic PCC

The Isotonic PCC approach provides the best balance between:

  • Maintaining the original uncertainty ranking.

  • Providing interpretable confidence scores.

  • Establishing a clear connection to expected output quality.

When using normalized scores, users can interpret them as estimates of relative output quality, making them more useful for downstream applications and human understanding.