lm_polygraph.utils.manager module
- class lm_polygraph.utils.manager.UEManager(data: Dataset, model: Model, estimators: List[Estimator], builder_env_stat_calc: BuilderEnvironmentStatCalculator, available_stat_calculators: List[StatCalculatorContainer], generation_metrics: List[GenerationMetric], ue_metrics: List[UEMetric], processors: List[Processor], ignore_exceptions: bool = True, verbose: bool = True, max_new_tokens: int = 100, log_time: bool = False, save_stats: List[str] = [])
Bases: object
Manager for conducting uncertainty estimation experiments using several uncertainty methods, ground-truth uncertainty values, and correlation metrics at once. Used for running benchmarks.
Examples:
```python
>>> from lm_polygraph import WhiteboxModel
>>> from lm_polygraph.utils.dataset import Dataset
>>> from lm_polygraph.utils.manager import UEManager
>>> from lm_polygraph.estimators import *
>>> from lm_polygraph.ue_metrics import *
>>> from lm_polygraph.generation_metrics import *
>>> model = WhiteboxModel.from_pretrained(
...     'bigscience/bloomz-560m',
...     device='cuda:0',
... )
>>> dataset = Dataset.load(
...     '../workdir/data/triviaqa.csv',
...     'question', 'answer',
...     batch_size=4,
... )
>>> ue_methods = [MaximumSequenceProbability(), SemanticEntropy()]
>>> ue_metrics = [RiskCoverageCurveAUC()]
>>> ground_truth = [RougeMetric('rougeL'), BartScoreSeqMetric('rh')]
>>> man = UEManager(dataset, model, ue_methods, ground_truth, ue_metrics, processors=[])
>>> results = man()
>>> man.save("./manager.man")
```
- calculate(batch_stats: dict, calculators: list, inp_texts: list) → dict
Runs stat calculators and handles errors if any occur. Returns updated batch stats.
- Parameters:
batch_stats (dict): contains current batch statistics to be updated
calculators (list): list of stat calculators to run
inp_texts (list): list of inputs to the model in the batch
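The error handling described for calculate can be pictured with a minimal sketch. It is not the library's implementation: `run_calculators`, the callable-based calculators, and the skip-on-failure policy tied to an `ignore_exceptions` flag are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a stat calculator: takes the current batch
# stats and the input texts, returns a dict of newly computed stats.
Calculator = Callable[[dict, List[str]], Dict[str, list]]


def run_calculators(
    batch_stats: dict,
    calculators: List[Calculator],
    inp_texts: List[str],
    ignore_exceptions: bool = True,
) -> dict:
    """Run each calculator in order and merge its output into batch_stats."""
    for calc in calculators:
        try:
            new_stats = calc(batch_stats, inp_texts)
        except Exception as exc:
            if not ignore_exceptions:
                raise
            # Assumed policy: report the failure and move on to the next one.
            print(f"Calculator {getattr(calc, '__name__', calc)!r} failed: {exc}")
            continue
        batch_stats.update(new_stats)
    return batch_stats


def text_lengths(stats: dict, texts: List[str]) -> Dict[str, list]:
    """Toy calculator: character length of every input text."""
    return {"lengths": [len(t) for t in texts]}


def always_fails(stats: dict, texts: List[str]) -> Dict[str, list]:
    """Toy calculator that simulates a failure."""
    raise ValueError("model call failed")


stats = run_calculators({}, [text_lengths, always_fails], ["ab", "cde"])
print(stats)  # {'lengths': [2, 3]}
```

With `ignore_exceptions=False` the same call would re-raise the `ValueError` instead of skipping the failing calculator.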
- estimate(batch_stats: dict, estimators: list) → Dict[Tuple[str, str], List[float]]
Runs estimators and handles errors if any occur. Returns a dictionary that maps (str, str) keys to lists of uncertainty estimates.
- Parameters:
batch_stats (dict): contains current batch statistics to be updated
estimators (list): list of estimators to run
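The return annotation `Dict[Tuple[str, str], List[float]]` suggests one list of floats per estimator, keyed by a pair of strings. The sketch below shows one way such a result could be assembled. The `(level, name)` key layout, the `run_estimators` helper, and both toy estimator classes are assumptions for illustration, not the library's code.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class MeanNegLogProb:
    """Hypothetical sequence-level estimator used only in this sketch."""

    level = "sequence"
    name = "MeanNegLogProb"

    def __call__(self, batch_stats: dict) -> List[float]:
        # One score per sequence: mean negative token log-probability.
        return [-sum(lp) / len(lp) for lp in batch_stats["log_probs"]]


class NeedsMissingStat:
    """Hypothetical estimator that fails because a stat was never computed."""

    level = "sequence"
    name = "NeedsMissingStat"

    def __call__(self, batch_stats: dict) -> List[float]:
        return batch_stats["not_computed"]  # raises KeyError


def run_estimators(
    batch_stats: dict,
    estimators: list,
    ignore_exceptions: bool = True,
) -> Dict[Tuple[str, str], List[float]]:
    """Collect per-estimator scores under an assumed (level, name) key."""
    estimations: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for est in estimators:
        try:
            values = est(batch_stats)
        except Exception as exc:
            if not ignore_exceptions:
                raise
            print(f"Estimator {est.name} failed: {exc!r}")
            continue
        estimations[(est.level, est.name)].extend(values)
    return dict(estimations)


batch = {"log_probs": [[-0.5, -1.5], [-2.0]]}
result = run_estimators(batch, [MeanNegLogProb(), NeedsMissingStat()])
print(result)  # {('sequence', 'MeanNegLogProb'): [1.0, 2.0]}
```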
- static load(load_path: str, builder_env_stat_calc: BuilderEnvironmentStatCalculator = None, available_stat_calculators: List[StatCalculatorContainer] = None) → UEManager
Loads UEManager from the specified path. To save the calculated manager results, see UEManager.save().
- Parameters:
load_path (str): Path to file with saved benchmark results to load.
- lm_polygraph.utils.manager.order_calculators(stats: List[str], stat_calculators: Dict[str, StatCalculator], stat_dependencies: Dict[str, List[str]]) → Tuple[List[str], Set[str]]
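The signature of order_calculators (requested stats plus a dependency map in, an ordered list and a set out) is consistent with a dependency-ordering step. As a rough illustration of that idea only, the sketch below performs a depth-first topological ordering. The helper name `order_by_dependencies`, the stat names, and the omission of the `stat_calculators` argument are all assumptions. The library's actual algorithm may differ.

```python
from typing import Dict, List, Set, Tuple


def order_by_dependencies(
    stats: List[str],
    stat_dependencies: Dict[str, List[str]],
) -> Tuple[List[str], Set[str]]:
    """Return stats in an order where every dependency comes first,
    together with the set of all stats that end up being computed."""
    ordered: List[str] = []
    done: Set[str] = set()
    in_progress: Set[str] = set()

    def visit(stat: str) -> None:
        if stat in done:
            return
        if stat in in_progress:
            raise ValueError(f"Cyclic dependency involving {stat!r}")
        if stat not in stat_dependencies:
            raise KeyError(f"No calculator known for stat {stat!r}")
        in_progress.add(stat)
        for dep in stat_dependencies[stat]:
            visit(dep)  # dependencies are scheduled before the stat itself
        in_progress.remove(stat)
        done.add(stat)
        ordered.append(stat)

    for stat in stats:
        visit(stat)
    return ordered, done


# Hypothetical stat names and dependencies, for illustration only.
deps = {
    "greedy_tokens": [],
    "greedy_log_probs": ["greedy_tokens"],
    "entropy": ["greedy_log_probs"],
}
order, available = order_by_dependencies(["entropy"], deps)
print(order)  # ['greedy_tokens', 'greedy_log_probs', 'entropy']
```

Requesting only `entropy` pulls in both of its transitive dependencies, and a cycle in the map raises a `ValueError` rather than looping forever.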