Pyntagma

Welcome to the documentation for the pyntagma package!

Pyntagma is a Python library for creating and managing complex data extraction pipelines. Its name derives from the Greek word 'Syntagma', meaning 'composition', reflecting that the package is designed for semi-structured documents.

Pyntagma aims to bring modern document-processing tools together into a single, standardized, and convenient library. It lets practitioners and researchers compose precise, testable rules to extract complex data from large archives.

Installation

Install Pyntagma using:

pip install pyntagma
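
To confirm that the install succeeded, one option is to query the package metadata from Python. The sketch below uses only the standard library (`importlib.metadata`) and makes no assumptions about Pyntagma's own API; the helper name `installed_version` is hypothetical and exists only for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # `installed_version` is a helper defined above, not part of Pyntagma.
    found = installed_version("pyntagma")
    print(f"pyntagma {found}" if found else "pyntagma is not installed")
```

If the script reports that the package is not installed, check that `pip` and the Python interpreter you are running belong to the same environment.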

Features

  • Structured PDF parsing with clear geometry
  • Composable algebra on positions and regions
  • Bidirectional navigation (pages ⇄ lines ⇄ words ⇄ chars)
  • Multimodal AI integration for crop-aware prompts
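
As a rough illustration of what "algebra on regions" and "bidirectional navigation" can mean in general, here is a small standalone sketch. It does not use Pyntagma and is not its API: the classes `Region`, `Word`, and `Line`, and all of their attributes, are hypothetical names invented for this example. See Concepts for how the library itself models these ideas.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Region:
    """Hypothetical axis-aligned box in page coordinates (top < bottom)."""
    x0: float
    top: float
    x1: float
    bottom: float

    def __or__(self, other: Region) -> Region:
        # Union: the smallest box that covers both regions.
        return Region(min(self.x0, other.x0), min(self.top, other.top),
                      max(self.x1, other.x1), max(self.bottom, other.bottom))

    def __and__(self, other: Region) -> Optional[Region]:
        # Intersection: the overlapping box, or None if the boxes are disjoint.
        x0, top = max(self.x0, other.x0), max(self.top, other.top)
        x1, bottom = min(self.x1, other.x1), min(self.bottom, other.bottom)
        if x0 >= x1 or top >= bottom:
            return None
        return Region(x0, top, x1, bottom)


@dataclass
class Word:
    text: str
    region: Region
    # Back-reference to the parent line, set by Line.__post_init__.
    line: Optional[Line] = field(default=None, repr=False, compare=False)


@dataclass
class Line:
    words: List[Word]

    def __post_init__(self) -> None:
        # Wire up child -> parent links so navigation works in both directions.
        for word in self.words:
            word.line = self

    @property
    def region(self) -> Region:
        # A line's region is the union of its words' regions.
        box = self.words[0].region
        for word in self.words[1:]:
            box = box | word.region
        return box


label = Word("Total", Region(10, 100, 40, 112))
amount = Word("42.00", Region(50, 100, 80, 112))
line = Line([label, amount])
```

In this sketch, `line.words` walks down from a line to its words, `amount.line` walks back up, and `|` and `&` combine regions without touching the underlying text.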

Get started with the Overview, then see Concepts for details on the difference between algebra and bidirectional navigation, and AI Tools for model-assisted workflows.