The Challenge

Government procurement professionals face a critical challenge: tender documents are massive, legally complex, and written in Hebrew, a morphologically rich language that many standard NLP tools handle poorly. Finding specific clauses and drafting compliant responses takes hours of manual work.

The Solution: Agentic Workflow Architecture

TenderPilot takes a fundamentally different approach from traditional RAG systems. Instead of simply chunking and embedding documents, we built an autonomous multi-agent workflow in which specialized AI agents collaborate to process, understand, and index tender documents.

How It Works

At the heart of TenderPilot is a stateful orchestration engine that manages specialized agents:

  • Segmenter Agent: Identifies logical document sections (not arbitrary page breaks)
  • Cleaner Agent: Normalizes Hebrew text and handles encoding issues
  • Summarizer Agent: Extracts core meaning for efficient retrieval
  • Question Generator Agent: Creates a "Reverse-HyDE" index by generating questions each segment answers
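To make the Cleaner Agent's job concrete, here is a minimal sketch of Hebrew text normalization using only the Python standard library. It is an illustration under assumptions, not TenderPilot's actual cleaner: the function name `normalize_hebrew` is hypothetical, and the choice to strip vowel points and bidirectional control characters is one reasonable policy among several.

```python
import re
import unicodedata

# Hebrew cantillation marks and vowel points (niqqud). Punctuation such as
# maqaf (U+05BE), paseq (U+05C0) and sof pasuq (U+05C3) is deliberately kept.
_NIQQUD = re.compile("[\u0591-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7]")

# Invisible bidirectional control characters that often survive PDF extraction.
_BIDI_CONTROLS = re.compile("[\u200e\u200f\u202a-\u202e\u2066-\u2069]")

_WHITESPACE = re.compile(r"\s+")


def normalize_hebrew(text: str) -> str:
    """Return a canonical form of `text` suitable for indexing (assumed policy)."""
    # NFKC folds presentation forms (for example wide alef, U+FB21) into
    # the base letters, so visually identical strings compare equal.
    text = unicodedata.normalize("NFKC", text)
    text = _BIDI_CONTROLS.sub("", text)
    text = _NIQQUD.sub("", text)
    # Collapse runs of spaces and line breaks left over from layout.
    return _WHITESPACE.sub(" ", text).strip()
```

Stripping niqqud loses information, so a pipeline that needs the pointed text for display would keep the original string alongside the normalized one.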

Each agent makes context-aware decisions, can retry or backtrack when encountering ambiguous content, and works in parallel while respecting dependencies. The result: human-level document comprehension with complete data sovereignty.
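The orchestration idea can be sketched in a few lines of standard-library Python. Everything here is a simplified assumption for illustration: the agent bodies are placeholders where a real system would call a language model, the names (`SegmentState`, `run_with_retry`, `process_document`) are hypothetical, and "ambiguous content" is modeled as a `ValueError` that triggers a retry.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SegmentState:
    """State carried through the pipeline for one logical section."""
    raw: str
    cleaned: str = ""
    summary: str = ""
    questions: List[str] = field(default_factory=list)
    attempts: Dict[str, int] = field(default_factory=dict)


Agent = Callable[[SegmentState], None]


def segmenter(document: str) -> List[SegmentState]:
    # Placeholder: treat blank lines as section boundaries.
    return [SegmentState(raw=p) for p in document.split("\n\n") if p.strip()]


def cleaner(state: SegmentState) -> None:
    state.cleaned = " ".join(state.raw.split())


def summarizer(state: SegmentState) -> None:
    # Placeholder summary: the first 80 characters of the cleaned text.
    state.summary = state.cleaned[:80]


def question_generator(state: SegmentState) -> None:
    state.questions = [f"What does this section say about: {state.summary}?"]


def run_with_retry(name: str, agent: Agent, state: SegmentState,
                   max_attempts: int = 3) -> None:
    """Run one agent, retrying when it signals ambiguity with ValueError."""
    for attempt in range(1, max_attempts + 1):
        state.attempts[name] = attempt
        try:
            agent(state)
            return
        except ValueError:
            if attempt == max_attempts:
                raise


# Stages run in order within a segment because each depends on the previous.
PIPELINE = [("cleaner", cleaner), ("summarizer", summarizer),
            ("question_generator", question_generator)]


def process_segment(state: SegmentState) -> SegmentState:
    for name, agent in PIPELINE:
        run_with_retry(name, agent, state)
    return state


def process_document(document: str, max_workers: int = 4) -> List[SegmentState]:
    # Segments are independent of each other, so they run in parallel.
    segments = segmenter(document)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_segment, segments))
```

In this sketch, dependencies are respected by running stages sequentially inside each segment, while parallelism comes from processing separate segments concurrently. A production engine would more likely express the dependencies as an explicit graph and persist state between steps.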

Technical Challenges & Solutions

PDF Data Extraction

Tender documents contain complex tables, multi-column layouts, and Hebrew text that standard PDF parsers mangle. We integrated LandingAI, a specialized computer vision product, for high-fidelity extraction that preserves document structure and handles complex layouts accurately before the agentic workflow begins processing.

Retrieval Accuracy

We implemented two mechanisms. A Relevance Filter grades retrieved documents, and a Generated Questions mechanism embeds the questions that each segment answers rather than just the raw text. Together they dramatically improve retrieval precision.
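The following sketch shows the shape of a question-based index with a relevance filter. It rests on several assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the score threshold `min_score` is a simple stand-in for whatever grading the Relevance Filter performs, and the class and segment identifiers are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    # Toy embedding: word counts. A real system would call an embedding model.
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class QuestionIndex:
    """Indexes generated questions; each one points back to its segment."""

    def __init__(self) -> None:
        self._entries: List[Tuple[Counter, str]] = []

    def add(self, segment_id: str, questions: List[str]) -> None:
        for question in questions:
            self._entries.append((embed(question), segment_id))

    def search(self, query: str, top_k: int = 3,
               min_score: float = 0.5) -> List[Tuple[str, float]]:
        query_vec = embed(query)
        # Keep the best-matching question per segment.
        best: Dict[str, float] = {}
        for vec, segment_id in self._entries:
            score = cosine(query_vec, vec)
            if score > best.get(segment_id, 0.0):
                best[segment_id] = score
        ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
        # Relevance filter (assumed form): drop weak matches after ranking.
        return [(seg, score) for seg, score in ranked[:top_k]
                if score >= min_score]


index = QuestionIndex()
index.add("sec-4.2", ["What is the deadline for submitting the bid?",
                      "Where are bids submitted?"])
index.add("sec-7.1", ["What insurance coverage must the supplier hold?"])
hits = index.search("what is the bid submission deadline")
```

The point of matching against questions is that a user's query is usually phrased as a question, so it tends to sit closer to a generated question than to the formal wording of the clause itself.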