lm_polygraph.model_adapters.visual_whitebox_model module

class lm_polygraph.model_adapters.visual_whitebox_model.VisualWhiteboxModel(model: AutoModelForImageTextToText, processor_visual: AutoProcessor, model_path: str | None = None, model_type: str = 'VisualLM', generation_parameters: GenerationParameters = GenerationParameters(temperature=1.0, top_k=50, top_p=1.0, do_sample=False, num_beams=1, presence_penalty=0.0, repetition_penalty=1.0, stop_strings=None, allow_newlines=True, max_new_tokens=100))[source]

Bases: Model

device() → device[source]

static from_pretrained(model_path: str, model_type: str, image_urls: List[str] = None, image_paths: List[str] = None, generation_params: Dict | None = {}, add_bos_token: bool = True, **kwargs)[source]

generate(**args)[source]: Abstract method. Generates the model output with scores from batch formed by HF Tokenizer. Not implemented for black-box models.

generate_texts(input_texts: List[str], input_images: List[Image | str | bytes], **args) → List[str][source]

Abstract method. Generates a list of model answers using input texts batch.

Parameters:: input_texts (List[str]): input texts batch.
Return:: List[str]: corresponding model generations. Have the same length as input_texts.

tokenize(texts: List[str] | List[List[Dict[str, str]]])[source]