API Reference

This section provides detailed API documentation for the TexTeller package. TexTeller is a tool for detecting and recognizing LaTeX formulas in images and converting mixed text and formula images to markdown.

Image to LaTeX Conversion

img2latex(model: VisionEncoderDecoderModel | ORTModelForVision2Seq, tokenizer: RobertaTokenizerFast, images: list[str] | list[ndarray], device: device | None = None, out_format: Literal['latex', 'katex'] = 'latex', keep_style: bool = False, max_tokens: int = 1024, num_beams: int = 1, no_repeat_ngram_size: int = 0) → list[str]

Convert images to LaTeX or KaTeX formatted strings.

Parameters:

model – The TexTeller or ORTModelForVision2Seq model instance
tokenizer – The tokenizer for the model
images – List of image paths or numpy arrays (RGB format)
device – The torch device to use (defaults to available GPU or CPU)
out_format – Output format, either “latex” or “katex”
keep_style – Whether to keep styling commands in the generated LaTeX instead of stripping them
max_tokens – Maximum number of tokens to generate
num_beams – Number of beams for beam search
no_repeat_ngram_size – Size of the n-grams that may not repeat in the generated output (the default of 0 applies no restriction)

Returns:

List of LaTeX or KaTeX strings corresponding to each input image

Example

>>> import torch
>>> from texteller import load_model, load_tokenizer, img2latex
>>>
>>> model = load_model(model_dir=None, use_onnx=False)
>>> tokenizer = load_tokenizer(tokenizer_dir=None)
>>> device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
>>>
>>> res = img2latex(model, tokenizer, ["path/to/image.png"], device=device, out_format="katex")
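For a large set of images, it may help to call img2latex in fixed-size chunks so that memory use stays bounded. The following is a minimal sketch under stated assumptions: `chunked` and `recognize_all` are hypothetical helpers that are not part of TexTeller, and `recognize` is any callable that maps a list of image paths to a list of strings, for example a thin wrapper around img2latex.

```python
from typing import Callable, Iterator, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def recognize_all(
    paths: Sequence[str],
    recognize: Callable[[Sequence[str]], list[str]],
    batch_size: int = 8,
) -> list[str]:
    """Run `recognize` on each chunk and concatenate results in input order."""
    results: list[str] = []
    for batch in chunked(paths, batch_size):
        results.extend(recognize(batch))
    return results
```

With a model and tokenizer loaded as in the example above, `recognize` could be `lambda batch: img2latex(model, tokenizer, list(batch), device=device)`. The batch size of 8 is an arbitrary illustrative default, not a recommended value.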

Paragraph to Markdown Conversion

paragraph2md(img_path: str, latexdet_model: InferenceSession, textdet_model: TextDetector, textrec_model: TextRecognizer, latexrec_model: VisionEncoderDecoderModel | ORTModelForVision2Seq, tokenizer: RobertaTokenizerFast, device: device | None = None, num_beams=1) → str

Convert an image containing both text and mathematical formulas to markdown format.

This function processes a mixed-content image by:

1. Detecting mathematical formulas using a LaTeX detection model
2. Masking detected formula areas and detecting text regions using OCR
3. Recognizing text in the detected regions
4. Converting formula regions to LaTeX using the LaTeX recognition model
5. Combining all detected elements into a properly formatted markdown string
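The final combining step can be pictured with the sketch below. It is an illustration only and not TexTeller's implementation: the `Region` class, the `regions_to_markdown` helper, and the choice of `$...$` and `$$...$$` delimiters are all assumptions, and the regions are assumed to arrive already sorted in reading order.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical recognized element; not a TexTeller type."""
    kind: str     # "text", "embedding" (inline formula), or "isolated" (display formula)
    content: str  # recognized text or LaTeX source


def regions_to_markdown(regions: list[Region]) -> str:
    """Join regions into markdown, giving each isolated formula its own block."""
    blocks: list[str] = []
    inline: list[str] = []  # pieces of the paragraph currently being built
    for region in regions:
        if region.kind == "isolated":
            # A display formula ends the current paragraph.
            if inline:
                blocks.append(" ".join(inline))
                inline = []
            blocks.append(f"$$\n{region.content}\n$$")
        elif region.kind == "embedding":
            inline.append(f"${region.content}$")
        else:
            inline.append(region.content.strip())
    if inline:
        blocks.append(" ".join(inline))
    return "\n\n".join(blocks)
```

Keeping inline formulas in the same paragraph as their surrounding text, while display formulas become separate blocks, mirrors the isolated versus embedded distinction made by the detection model.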

Parameters:

img_path – Path to the input image containing text and formulas
latexdet_model – ONNX InferenceSession for LaTeX formula detection
textdet_model – OCR text detector model
textrec_model – OCR text recognition model
latexrec_model – TexTeller model for LaTeX formula recognition
tokenizer – Tokenizer for the LaTeX recognition model
device – The torch device to use (defaults to available GPU or CPU)
num_beams – Number of beams for beam search during LaTeX generation

Returns:

Markdown formatted string containing the recognized text and formulas

Example

>>> from texteller import load_latexdet_model, load_textdet_model, load_textrec_model, load_model, load_tokenizer, paragraph2md
>>>
>>> # Load all required models
>>> latexdet_model = load_latexdet_model()
>>> textdet_model = load_textdet_model()
>>> textrec_model = load_textrec_model()
>>> latexrec_model = load_model()
>>> tokenizer = load_tokenizer()
>>>
>>> # Convert image to markdown
>>> markdown_text = paragraph2md(
...     img_path="path/to/mixed_content_image.jpg",
...     latexdet_model=latexdet_model,
...     textdet_model=textdet_model,
...     textrec_model=textrec_model,
...     latexrec_model=latexrec_model,
...     tokenizer=tokenizer,
... )

LaTeX Detection

latex_detect(img_path: str, predictor: InferenceSession) → List[Bbox]

Detect LaTeX formulas in an image and classify them as isolated or embedded.

This function uses an ONNX model to detect LaTeX formulas in images. The model identifies two types of LaTeX formulas:

- ‘isolated’: Standalone LaTeX formulas (typically displayed equations)
- ‘embedding’: Inline LaTeX formulas embedded within text

Parameters:

img_path – Path to the input image file
predictor – ONNX InferenceSession model for LaTeX detection

Returns:

List of Bbox objects representing the detected LaTeX formulas with their positions, classifications, and confidence scores

Example

>>> from texteller.api import load_latexdet_model, latex_detect
>>> model = load_latexdet_model()
>>> bboxes = latex_detect("path/to/image.png", model)
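Downstream code often wants to discard weak detections and handle the two formula types separately. The sketch below shows one way to do that. Because this reference does not list the attributes of Bbox, it uses a hypothetical stand-in class, `Detection`, whose `label` and `confidence` fields are assumptions; the real attribute names may differ.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical stand-in for a detected formula; not TexTeller's Bbox class."""
    label: str         # "isolated" or "embedding", the two types described above
    confidence: float  # detection score, assumed to lie in [0, 1]


def split_by_label(
    detections: list[Detection], min_confidence: float = 0.5
) -> dict[str, list[Detection]]:
    """Drop low-confidence detections and group the rest by label."""
    groups: dict[str, list[Detection]] = {"isolated": [], "embedding": []}
    for det in detections:
        if det.confidence >= min_confidence:
            groups.setdefault(det.label, []).append(det)
    return groups
```

The threshold of 0.5 is an arbitrary illustrative default and should be tuned against real images.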

Model Loading

load_model(model_dir: str | None = None, use_onnx: bool = False) → VisionEncoderDecoderModel | ORTModelForVision2Seq

Load the TexTeller model for LaTeX recognition.

This function loads the main TexTeller model, which is responsible for converting images to LaTeX. It can load either the standard PyTorch model or the optimized ONNX version.

Parameters:

model_dir – Directory containing the model files. If None, uses the default model.
use_onnx – Whether to load the ONNX version of the model for faster inference. Requires the ‘optimum’ package and ONNX Runtime.

Returns:

Loaded TexTeller model instance

Example

>>> from texteller import load_model
>>>
>>> model = load_model(use_onnx=True)
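Since the ONNX path depends on optional packages, a caller may want to check for them before setting `use_onnx`. The following is a minimal sketch, assuming the import names `optimum` and `onnxruntime`; the helper `onnx_backend_available` is hypothetical and not part of TexTeller.

```python
import importlib.util


def onnx_backend_available() -> bool:
    """Return True if both `optimum` and `onnxruntime` are installed."""
    # find_spec locates a top-level module without importing it.
    return all(
        importlib.util.find_spec(name) is not None
        for name in ("optimum", "onnxruntime")
    )


use_onnx = onnx_backend_available()
# The flag could then be passed on, e.g. load_model(use_onnx=use_onnx).
```

This check only confirms that the packages are installed; it does not verify that their versions are compatible with TexTeller.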

load_tokenizer(tokenizer_dir: str | None = None) → RobertaTokenizerFast

Load the tokenizer for the TexTeller model.

This function loads the tokenizer used by the TexTeller model for encoding and decoding LaTeX sequences.

Parameters:

tokenizer_dir – Directory containing the tokenizer files. If None, uses the default tokenizer.

Returns:

RobertaTokenizerFast instance

Example

>>> from texteller import load_tokenizer
>>>
>>> tokenizer = load_tokenizer()

load_latexdet_model() → InferenceSession

Load the LaTeX detection model.

This function loads the model responsible for detecting LaTeX formulas in images. The model is implemented as an ONNX InferenceSession for optimal performance.

Returns:

ONNX InferenceSession for LaTeX detection

Example

>>> from texteller import load_latexdet_model
>>>
>>> detector = load_latexdet_model()

load_textdet_model() → TextDetector

Load the text detection model.

This function loads the model responsible for detecting text regions in images. It’s based on PaddleOCR’s text detection model.

Returns:

PaddleOCR TextDetector instance

Example

>>> from texteller import load_textdet_model
>>>
>>> text_detector = load_textdet_model()

load_textrec_model() → TextRecognizer

Load the text recognition model.

This function loads the model responsible for recognizing regular text in images. It’s based on PaddleOCR’s text recognition model.

Returns:

PaddleOCR TextRecognizer instance

Example

>>> from texteller import load_textrec_model
>>>
>>> text_recognizer = load_textrec_model()

KaTeX Conversion

to_katex(formula: str) → str

Convert LaTeX formula to KaTeX-compatible format.

This function processes a LaTeX formula string and converts it to a format that is compatible with KaTeX rendering. It removes unsupported commands and structures, simplifies LaTeX environments, and optimizes the formula for web display.

Parameters:

formula – LaTeX formula string to convert

Returns:

KaTeX-compatible formula string
