Use Case
Better LLM answers start with better input
Turn PDF documents into token-efficient Markdown that preserves structure. Get better answers from GPT, Claude, and open-source models by feeding them properly formatted source material.
No credit card required
Raw PDF text vs. structured Markdown
What you feed the model determines what you get back. Real examples from actual document types.
Tables
An invoice line-item table. Raw extraction scrambles rows and columns -- the LLM guesses at relationships and gets them wrong.
Raw PDF text:
Item Description Unit Price Widget A Industrial widget $24.99 per unit Widget B Premium widget with coating $42.50 per unit Shipping Flat rate $15.00 Total $1,299.49
Structured Markdown:
| Item      | Description                 | Unit Price    |
|-----------|-----------------------------|---------------|
| Widget A  | Industrial widget           | $24.99        |
| Widget B  | Premium widget with coating | $42.50        |
| Shipping  | Flat rate                   | $15.00        |
| **Total** |                             | **$1,299.49** |
Document structure
A service contract. Without headings, the model has no sense of what section it is reading or how to navigate the document.
Raw PDF text:
SERVICE AGREEMENT This Agreement is entered into as of January 15, 2024. DEFINITIONS "Service Provider" means Acme Corp. "Client" means the undersigned party. PAYMENT TERMS Payment is due within 30 days of invoice date. Late payments accrue interest at 1.5% per month. TERMINATION Either party may terminate with 90 days written notice.
Structured Markdown:
# Service Agreement

This Agreement is entered into as of January 15, 2024.

## Definitions

- **"Service Provider"** means Acme Corp.
- **"Client"** means the undersigned party.

## Payment Terms

Payment is due within 30 days of invoice date. Late payments accrue interest at 1.5% per month.

## Termination

Either party may terminate with 90 days written notice.
Fewer tokens, more content
Raw extraction wastes tokens on whitespace artifacts, repeated headers/footers, and rendering noise. Clean Markdown means more of your context window goes to actual document content -- so you can fit more pages into a single prompt or spend fewer tokens per query.
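To make the budget arithmetic concrete, here is a rough sketch of estimating token counts and how many pages fit in a context window. The 4-characters-per-token ratio is a common rule of thumb for English text, not a ParseBridge figure, and real counts depend on the model's tokenizer. `estimateTokens` and `pagesThatFit` are hypothetical helpers, and the sample strings and page sizes are illustrative, not measurements.

```typescript
// Assumption: roughly 4 characters per token for English text.
// Use the model's own tokenizer when you need exact counts.
const CHARS_PER_TOKEN = 4;

// Hypothetical helper: approximate token count from character length.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Hypothetical helper: whole pages that fit into a token budget.
function pagesThatFit(avgTokensPerPage: number, contextBudget: number): number {
  if (avgTokensPerPage <= 0) {
    throw new Error("avgTokensPerPage must be positive");
  }
  return Math.floor(contextBudget / avgTokensPerPage);
}

// Illustrative input: the same sentence with and without extraction noise.
const rawPage =
  "Page 3 of 12      ACME CORP      \n\n\n   Payment   is due   within 30 days.   ";
const cleanPage = "Payment is due within 30 days.";

console.log("raw estimate:", estimateTokens(rawPage));
console.log("clean estimate:", estimateTokens(cleanPage));

// Illustrative page sizes against a 100,000-token budget.
console.log("pages at 800 tokens/page:", pagesThatFit(800, 100_000));
console.log("pages at 500 tokens/page:", pagesThatFit(500, 100_000));
```

The estimate is only good enough for planning how much to put in a prompt. It should not be used for billing or for enforcing hard limits.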
What you can build
Document Q&A
Ask questions about contracts, reports, or manuals and get answers that reference specific sections and table rows.
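One way to let answers point at specific sections is to split the Markdown on its headings before building the prompt. The sketch below is a minimal illustration that assumes ATX-style headings (`#`, `##`, ...) like those in the contract example above. `splitByHeadings` and the `Section` type are hypothetical names, not part of any ParseBridge SDK.

```typescript
interface Section {
  heading: string; // heading text without the leading #s
  level: number;   // 1 for "#", 2 for "##", and so on
  body: string;    // text between this heading and the next one
}

// Hypothetical helper: split Markdown into heading-delimited sections.
// Simplifications: text before the first heading is dropped, and lines
// starting with "#" inside fenced code blocks are treated as headings.
function splitByHeadings(markdown: string): Section[] {
  const sections: Section[] = [];
  let current: Section | null = null;
  let bodyLines: string[] = [];

  const flush = (): void => {
    if (current) {
      current.body = bodyLines.join("\n").trim();
      sections.push(current);
    }
  };

  for (const line of markdown.split("\n")) {
    const match = /^(#{1,6})\s+(.*)$/.exec(line);
    if (match) {
      flush();
      current = { heading: match[2].trim(), level: match[1].length, body: "" };
      bodyLines = [];
    } else if (current) {
      bodyLines.push(line);
    }
  }
  flush();
  return sections;
}

const contract = [
  "# Service Agreement",
  "This Agreement is entered into as of January 15, 2024.",
  "## Payment Terms",
  "Payment is due within 30 days of invoice date.",
  "## Termination",
  "Either party may terminate with 90 days written notice.",
].join("\n");

for (const section of splitByHeadings(contract)) {
  console.log(`${"#".repeat(section.level)} ${section.heading}: ${section.body}`);
}
```

Each section can then be sent on its own or labeled in the prompt, so the model can cite "Payment Terms" rather than an offset into a blob of text.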
Summarization
Generate summaries that respect document hierarchy instead of flattening everything into a single blob.
Data extraction
Pull structured data from invoices, financial statements, and forms with tables that come through as actual tables.
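Because the tables arrive as pipe-delimited Markdown, they can also be read programmatically before or instead of sending them to a model. The following is a minimal sketch that assumes a simple table with one header row, one separator row, and no escaped pipes inside cells. `parseMarkdownTable` and `splitRow` are hypothetical helpers, not a ParseBridge API.

```typescript
type Row = Record<string, string>;

// Split one "| a | b |" line into trimmed cell strings.
function splitRow(line: string): string[] {
  return line
    .trim()
    .replace(/^\|/, "")
    .replace(/\|$/, "")
    .split("|")
    .map((cell) => cell.trim());
}

// Hypothetical helper: turn a simple Markdown table into row objects
// keyed by header. Assumes line 1 is the header and line 2 the separator.
function parseMarkdownTable(table: string): Row[] {
  const lines = table
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.startsWith("|"));
  if (lines.length < 2) return [];

  const headers = splitRow(lines[0]);
  return lines.slice(2).map((line) => {
    const cells = splitRow(line);
    const row: Row = {};
    headers.forEach((header, i) => {
      row[header] = cells[i] ?? "";
    });
    return row;
  });
}

const invoiceTable = [
  "| Item | Description | Unit Price |",
  "|----------|--------------------------------|------------|",
  "| Widget A | Industrial widget | $24.99 |",
  "| Widget B | Premium widget with coating | $42.50 |",
  "| Shipping | Flat rate | $15.00 |",
  "| **Total**| | **$1,299.49** |",
].join("\n");

console.log(parseMarkdownTable(invoiceTable));
```

Bold markers such as `**Total**` are left in the cell values, so strip them separately if you need plain text or numbers.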
Multi-document analysis
Compare clauses across contracts or reconcile data across reports in a single prompt.
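For multi-document prompts, it helps to mark clearly where each document starts and ends. The sketch below shows one way to do that. The `<document>` wrapper is just an illustrative delimiter convention, not something ParseBridge or any model requires, and `buildComparisonPrompt` and `ParsedDoc` are hypothetical names.

```typescript
interface ParsedDoc {
  name: string;     // label shown to the model, e.g. the file name
  markdown: string; // Markdown returned by the parsing step
}

// Hypothetical helper: put the question first, then each document in a
// clearly delimited, numbered block so the model can refer to them by name.
function buildComparisonPrompt(question: string, docs: ParsedDoc[]): string {
  const blocks = docs.map(
    (doc, i) =>
      `<document index="${i + 1}" name="${doc.name}">\n${doc.markdown.trim()}\n</document>`
  );
  return `${question}\n\n${blocks.join("\n\n")}`;
}

const prompt = buildComparisonPrompt(
  "Compare the termination clauses in these contracts.",
  [
    {
      name: "contract-a.pdf",
      markdown: "## Termination\nEither party may terminate with 90 days written notice.",
    },
    {
      name: "contract-b.pdf",
      markdown: "## Termination\nEither party may terminate with 30 days written notice.",
    },
  ]
);

console.log(prompt);
```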
PDF to LLM in two API calls
Parse the document with ParseBridge, then inject the Markdown into your prompt. Works with any model that accepts text.
import OpenAI from "openai";

// 1. Parse the PDF into structured Markdown
const res = await fetch("https://api.parsebridge.com/v1/parse/url", {
  method: "POST",
  headers: {
    Authorization: "Bearer pb_your_api_key",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    url: "https://example.com/contract.pdf",
  }),
});
// Stop early on a failed request instead of prompting with undefined Markdown
if (!res.ok) {
  throw new Error(`Parse request failed with status ${res.status}`);
}
const { markdown } = await res.json();

// 2. Feed the Markdown into your LLM prompt
// The OpenAI client reads OPENAI_API_KEY from the environment by default
const openai = new OpenAI();
const completion = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    {
      role: "user",
      content: `Extract all payment terms and deadlines from this contract:\n\n${markdown}`,
    },
  ],
});
console.log(completion.choices[0].message.content);
50 free pages, no credit card required. View the API docs