
BlueprintParser — Construction Blueprint Intelligence

An open-source pipeline that turns construction PDFs into structured, LLM-queryable data — with a human-in-the-loop viewer, automated takeoff, and on-demand YOLO object detection on top.

In plain English: drop in a drawing set, BP reads every page, finds the schedules and callouts, lets a YOLO model locate doors / windows / tags on the floor plans, and a chat-enabled LLM answers questions like "how many doors on the second floor" by looking at the structured data it just built. You stay in the loop: every number clicks back to the pixels it came from.
[Architecture diagram — BlueprintParser AWS runtime: user browser (/home, /project, /docs) → CloudFront (assets.*, CORS at edge) → ALB (HTTPS → 3000 / 8080) → ECS Fargate blueprintparser-app (2 vCPU / 4 GB, Next.js 16) plus Label Studio on ECS + EFS (labelstudio.*); RDS PostgreSQL 16 (projects, pages, annotations); S3 (PDFs, page PNGs, YOLO output); Secrets Manager (DB, NEXTAUTH, LLM keys); Step Functions blueprintparser-process-blueprint → ECS Fargate cpu-pipeline task (process-worker.js) → Textract (Tesseract fallback) → S3 pages/*.png; SageMaker Processing on ml.g4dn.xlarge for admin-initiated YOLO inference.]
The BP runtime on AWS. ECS Fargate serves the Next.js app and runs the preprocessing worker; SageMaker runs YOLO on-demand; S3 + CloudFront store and serve page images; RDS PostgreSQL holds structured results. Local / development uses the same code with AWS services disabled.
Start Here

Your First Project in Five Minutes

If you have never seen BlueprintParser before, read this section first. It ignores the code, the AWS stack, and the tool-registry plumbing. It shows you the happy path a working estimator takes: upload a PDF, wait a minute, open the viewer, let BP find the things on the page, and export numbers for a bid.

[Diagram: your first project — 5 steps from PDF to bid-ready numbers. 1. Upload a PDF: drag your drawing set into the dashboard. 2. BP reads the pages: OCR, CSI codes, schedules, title blocks — automatic. 3. Open the viewer: pan, zoom, search; panels on the right for every feature. 4. Run detection + tag: YOLO finds doors & windows; map tags from schedules to drawings. 5. Export: takeoff numbers + CSV/Excel for the bid package.]
The happy path — five steps, no jargon. Each step corresponds to a section later in these docs if you want the full depth.

1. Upload a PDF

From the dashboard at /home, drag a drawing set onto the upload card. BP accepts a normal multi-page PDF, including the common 24×36 and 30×42 sheet sizes; pages render at 300 DPI, and any page whose raster would exceed the OCR size limit is automatically re-rasterized at a lower DPI for text extraction (Section 03 has the details). You get a progress bar; when it finishes, the project appears in the project list.

Behind the scenes, the file is uploaded to S3 and a processing job is kicked off. You do not have to wait on the page; you can close the tab and come back later.

2. BP reads the pages

For each page, BP runs OCR, detects CSI MasterFormat codes (the industry-standard classification scheme — "08 14 00 = Wood Doors"), extracts drawing numbers from title blocks, detects schedules and keynotes, and classifies what's on the sheet. Each page takes a few seconds, but pages process in parallel; a 200-page set is usually ready in five to ten minutes on the default Fargate tier.

You don't have to do anything during this step. When the project card shows Ready, click in.

3. Open the viewer

The viewer lives at /project/[id]. It looks like a drawing review tool: a page sidebar on the left, a big canvas in the middle, a toolbar across the top, and a stack of panels you can flip open from the right edge. Pan with V, click with A, scroll to zoom (hold Ctrl, or ⌘ on a Mac, if you're on a trackpad). The panels on the right — Text, CSI, LLM Chat, QTO, Schedules, Keynotes — are the feature surface. You only open the ones you need.

First thing to try
Click LLM Chat and ask "what disciplines are in this project?". The chat has tools to look up the CSI network graph, schedules, and spatial context; it will usually answer with a breakdown and offer to jump to relevant pages.

4. Run detection and tag a schedule

BP's text pipeline already knows where the schedules and keynotes are. What it doesn't know, until you ask, is where every door and window physically is on the floor plans. That's a YOLO run (an admin kicks it off — see Section 5). Once it's done, open the Schedules/Tables panel, point at the door schedule, and click Auto Parse. Then pick which YOLO class the tags are drawn inside (usually circle) and run Map Tags. BP binds each schedule row to every matching shape in the drawings and gives you a count per row.

Auto-QTO (the QTO → Auto tab) does all of that on autopilot: pick a material type, confirm the schedule, run the mapping, review the counts, export.

5. Export

Everything in the QTO panel exports to CSV or Excel through the Export CSV button at the bottom of the panel. One row per tag or area item, with counts, pages, annotations, and notes. Paste it into the bid spreadsheet and you're done.

When something looks off

Most of the complexity in these docs is the answer to one question: what if the happy path doesn't work? If bucket fill leaks through an open doorway, Section 8 covers barriers and the four tuning knobs. If Auto-QTO blocks you at the start, Section 7 covers the YOLO class requirements and how to fix them from Admin. If chat runs out of context room on a big project, Section 9 explains the budget and the presets that trade structure for OCR. And if you want to know how the whole thing runs on AWS, Section 11 is the tour.

Intro

What BlueprintParser Is

In plain English: BP reads a construction drawing set the same way a junior estimator would — it looks at each page, recognizes the text, identifies which part of the project it's about, finds the schedules and callouts, and produces a structured summary. Then it gives you a viewer to review everything and a chat window to ask questions. Nothing leaves the LLM's context without a tool call that cites a page or a row; every number is traceable.

BlueprintParser (BP) is an open-source, self-hostable platform that turns construction PDFs into structured, LLM-queryable data. You upload a multi-page drawing set; BP rasterizes each page, runs OCR, detects CSI MasterFormat codes, extracts structured text annotations, parses tables and schedules, classifies drawing regions, and produces a per-project projectIntelligence bundle — a compact description of the project that is small enough to fit inside an LLM context window but rich enough to answer detailed questions about quantities, trades, cross-references, and specifications.

On top of that structured layer, BP ships a full blueprint viewer with markup, takeoff, tag-mapping, and chat — plus an admin dashboard that runs on-demand YOLO object detection via SageMaker when you want the project to become spatially aware as well as textually.

Feature map: engines + viewer

BP is organized as a set of engines that produce structured data and a single Viewer that consumes it, with a Graph/Output layer that feeds everything back to the LLM and the Admin dashboard. Data flows upload → Preprocessing Engine → (optional) YOLO Post-Pipeline → Viewer surfaces (display + user parsing) → ParsedRegions → Graph/Output → LLM chat and downstream tools. Every stage persists to pageIntelligence or projectIntelligence; nothing is ephemeral.

  • Preprocessing Engine (upload-time, always runs per page): rasterize at 300 DPI → Textract OCR (LAYOUT + TABLES) → drawing-number extraction → CSI code detection (3-tier matching) → text annotations (phones, equipment tags, abbreviations, 37+ types) → shape parse (keynote symbols via Python/OpenCV) → page intelligence analyze (classification, cross-refs, noteBlocks) → text-region classify (6-stage composite: LINE consumption, column-aware proposal, whitespace-rect discovery, Union-Find merge, per-region analysis, decision tree) → heuristic engine (9 rules, text-only mode) → table classifier → CSI spatial map (9×9 grid with title-block + right-margin zones).
  • YOLO Post-Pipeline (admin-triggered, optional): SageMaker Processing job on g4dn.xlarge → YOLO annotations ingested → re-run heuristic engine with YOLO data → re-classify tables → composite region classifier (classifiedRegions) → YOLO density heatmap (text_box + vertical_area + horizontal_area aggregated on a 16×16 grid) → ensemble reducer (cross-signal agreement, suppresses keyword-only false positives) → auto-table-detector (emits AutoTableProposal[], read-only until user commits).
  • Viewer (user surface, /project/[id]): canvas with pdf.js rasterizer + nine overlay layers, a dense toolbar, three mutually-exclusive modes (pointer/move/markup), a stack of toggleable right-side panels (Text, CSI, LLM Chat, QTO, Schedules/Tables, Keynotes, Specs/Notes, Page Intelligence, View All), a bottom Annotation Panel, and user-driven parsing tools (Table Parse, Keynote Parse, Notes Parse, Spec Parse [planned], Symbol Search, Bucket Fill, Shape Parse, Split Area, Scale Calibration). Section 02 enumerates the full tree.
  • Graph / Output Layer (downstream consumers): every user-committed ParsedRegion promotes via /api/regions/promote into pageIntelligence.parsedRegions; CSI tags merge into pages.csiCodes via idempotent mergeCsiCodes; computeProjectSummaries rebuilds projectIntelligence.summaries (schedules, notesRegions, specRegions, parsedTables, yoloTags). The context-builder assembles a budget-allocated LLM payload from all of the above; the CSI network graph + hub pages are derived once per project and surfaced in chat and the View All panel.
  • Admin Dashboard: Pipeline config (toggle stages, concurrency, per-company heuristic overrides), Heuristics tab (DSL editor for rules), AI Models tab (register YOLO models, trigger runs), LLM Config (provider + context-budget allocations across 19 sections), Overview (reprocess controls + Lambda CV job status). Every viewer feature has a corresponding admin tuning surface.

How they connect: Preprocessing runs once per upload and populates JSONB blobs on pages. YOLO Post-Pipeline augments those blobs on admin trigger. The Viewer reads them into a Zustand store (17 slice hooks) and renders overlays; user parsing tools write back to the same blobs via the generic /api/regions/promote commit route. The Graph/Output layer re-derives summaries on every commit and serves them to Chat and the Admin dashboard. The whole stack is one database shape with one write path per mutation, which is why every number in the UI traces back to a pixel on a page.

The two data models

Everything in BP ultimately fits into two axes. Horizontally, a project is a list of pages; each page carries OCR text, a classification, detected text annotations, detected tables, CSI codes, and (optionally) YOLO detections. Vertically, a project is a bundle of cross-cutting data: annotations (user markups + YOLO + takeoff), pageIntelligence (per-page structured analysis), and projectIntelligence (a project-wide summary including the CSI network graph, hub pages, and discipline breakdown).

The preprocessing pipeline, the LLM context builder, the takeoff engine, and the viewer all read and write through those two shapes. Section 03 walks through how the data actually arrives; Section 11 walks through where it is stored and why.
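To make the two axes concrete, here is a rough TypeScript sketch of the shapes involved; the field names are illustrative approximations based on the description above, not the actual drizzle schema.

```ts
// Illustrative sketch of the two axes. Field names are approximate,
// not the exact database schema.

// Horizontal axis: one row per page, each carrying its own analysis blobs.
interface Page {
  pageNumber: number;
  rawText: string;                 // flattened OCR; feeds search_vector
  classification?: string;         // discipline / drawing type
  textAnnotations: unknown[];      // phones, equipment tags, abbreviations...
  tables: unknown[];               // detected and classified table candidates
  csiCodes: { code: string; confidence: number }[];
  yoloDetections?: unknown[];      // present only after a YOLO run
}

// Vertical axis: cross-cutting bundles that span pages.
interface Project {
  pages: Page[];
  annotations: unknown[];          // user markups + YOLO + takeoff
  pageIntelligence: Record<number, unknown>;  // per-page structured analysis
  projectIntelligence: {
    csiGraph: unknown;             // the CSI network graph
    hubPages: number[];
    disciplineBreakdown: Record<string, number>;
  };
}
```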

What runs locally vs. what needs AWS

BP is the same codebase in every deployment tier — the difference is purely which external services are configured. A development machine with Docker and a Groq free-tier API key can run the full viewer against a locally-hosted PostgreSQL instance, parse tables with img2table and Camelot, and chat with an LLM, all without a single AWS credential. Add an S3 bucket and page images become durable; add the full Terraform stack and you get CloudFront, Textract, Step Functions, and Label Studio; add a SageMaker Processing role and a YOLO ECR image and YOLO object detection becomes available on-demand.

| Tier | Requires | Works | Does not work |
| --- | --- | --- | --- |
| Local Docker | Docker Compose, postgres:16, no AWS | Upload, viewer, CSI detect, table parse (img2table/Camelot/TATR), LLM chat via Groq free tier, heuristics, QTO (manual), Bucket Fill | Textract (falls back to Tesseract), SageMaker YOLO, CloudFront, S3 durability |
| Local + S3 | AWS creds for S3, S3_BUCKET, rest local | Everything Local Docker does, plus durable page/thumbnail storage and cross-device viewer load | Textract, SageMaker YOLO, CloudFront |
| Full AWS (CPU-only) | Terraform stack: ECS, RDS, S3, CloudFront, Step Functions, Textract, Secrets Manager | Production pipeline with Step Functions orchestration, Textract OCR, cached page CDN, multi-user auth, Label Studio | YOLO inference (no GPU) |
| Full AWS + SageMaker | Add SageMaker Processing role, a YOLO ECR image, sagemakerEnabled toggle | All of the above plus on-demand YOLO object detection on ml.g4dn.xlarge for Auto-QTO, tag mapping, symbol search | (nothing — this is the full stack) |
Try it without credentials
The /demo route hosts a read-only view of a seeded demo project, including YOLO detections, parsed schedules, and chat. It's the fastest way to kick the tires without installing anything.

Tech stack snapshot

BP is a single Next.js 16 application (App Router, React 19, TypeScript) backed by PostgreSQL 16 via drizzle-orm. State in the viewer lives in a single zustand store with slice selectors. LLM access goes through a thin adapter layer over the Anthropic, OpenAI, and Groq SDKs, plus a generic OpenAI-compatible endpoint for Ollama and self-hosted models. The CSI network graph is rendered with d3-force. Python sidecars (pdfplumber, Camelot, img2table, TATR, OpenCV, Tesseract, and the YOLO inference container) are spawned from TypeScript via stdin/stdout JSON. AWS deployment is codified in infrastructure/terraform/ — 13 files that define the full stack: ECS, RDS, S3, Step Functions, IAM, Secrets Manager, and CloudFront.
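The sidecar pattern itself is small enough to sketch. The following is a minimal illustration of spawning a Python process and exchanging one JSON document over stdin/stdout; the wrapper name and payload shape are assumptions for illustration, not the project's actual helper.

```ts
import { spawn } from "node:child_process";

// Minimal sketch of the stdin/stdout JSON sidecar pattern.
// The helper name and message shape are illustrative assumptions.
function runSidecar<T>(script: string, payload: unknown): Promise<T> {
  return new Promise((resolve, reject) => {
    const proc = spawn("python3", [script]);
    let out = "";
    let err = "";
    proc.stdout.on("data", (chunk) => (out += chunk));
    proc.stderr.on("data", (chunk) => (err += chunk));
    proc.on("close", (code) => {
      if (code !== 0) return reject(new Error(`sidecar exited ${code}: ${err}`));
      try {
        resolve(JSON.parse(out) as T);     // one JSON document per invocation
      } catch (e) {
        reject(e);
      }
    });
    proc.stdin.write(JSON.stringify(payload)); // request goes in on stdin
    proc.stdin.end();
  });
}
```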

User Guide

Inside the Viewer

In plain English: the viewer looks like a dense PDF review tool. Sidebar on the left with page thumbnails. Big canvas in the middle showing the current page with interactive overlays for markup, measurements, YOLO detections, and search hits. A toolbar across the top picks the current mode. Panels on the right slide in for each feature — text, CSI, chat, takeoff, schedules, keynotes. Everything is one click away.

The viewer lives at /project/[id] and is the primary surface for every user-facing feature in BP. It is a single React tree driven by a Zustand store with 17 slice selectors, backed by a client-side pdf.js rasterizer for the canvas and a series of overlay layers for annotations, markups, YOLO detections, keynotes, parse regions, and search highlights. Everything you do inside a project flows through this view.

Feature tree — brute-force inventory

Every feature under the Viewer, nested by the DOM/panel hierarchy it renders into. One-line description under each. If a feature has sub-modes or tabs, those are indented under the parent. This is deliberately exhaustive; skim for the shape, read for the specifics.

  • Canvas core
    • PDFPage.tsx pdf.js rasterizer
      Renders the current page as a bitmap at the user's zoom scale; caches the last 8 rendered pages as ImageBitmaps for instant tab-back.
    • Zoom / Fit / Pan controls
      +/− buttons, Fit-to-window, wheel-zoom in Move mode, drag-to-pan in Move mode.
    • Thumbnail sidebar
      Collapsible left-side page list with page-name + drawing-number labels; click to jump, scrolls synchronously with the main canvas.
  • Modes (mutually exclusive, keyboard-bound)
    • Pointer (A)
      Click-to-select overlays; double-click opens edit dialogs for markups, annotations, parsed regions.
    • Move / Pan (V)
      Click-drag pans the canvas, wheel zooms. No overlay interaction.
    • Markup
      Draw rectangle, polygon, or freehand stroke. Opens MarkupDialog on finish for name + note + color pick from 20-color palette.
    • Group / multi-select
      Shift-click-add plus empty-canvas lasso; applies bulk ops (delete, recolor, category change) across selection.
  • Canvas overlay layers (stable z-order, normalized 0–1 coordinates)
    • SearchHighlightOverlay
      Yellow boxes around tsvector search hits from the toolbar text-search.
    • TextAnnotationOverlay
      Boxes around detected phones, equipment tags, room names, abbreviations (37+ annotation types).
    • KeynoteOverlay
      Keynote shape detections (circles, hexagons, diamonds) with inner-text OCR; gated by the showKeynotes toggle.
    • AnnotationOverlay
      The master layer — YOLO detections, user markups, takeoff items, shape-parse output, symbol-search results. Click-to-select, drag-to-move, vertex-edit on polygons. Also hosts the draw-rect state machine for Parse flows.
    • ParseRegionLayer
      Saved ParsedRegion outlines + grid preview, color-coded by type (keynote amber, notes blue, spec violet, schedule pink). Also renders the shared parseDraftRegion dashed preview while a user is actively parsing.
    • GuidedParseOverlay
      Draggable row + column boundary lines rendered during a Guided Parse (keynote and notes share this via a prop-based API).
    • FastManualParseOverlay
      Stage 4 Notes primitive — double-click snaps to Textract LINE, derives columns from line margins. Pending rework into ParagraphOverlay (paragraph-level hit-test + adjustable BB + Cmd+C/V template paste).
    • DrawingPreviewLayer
      Rubber-band preview while the user is dragging to draw a new markup or bbox.
    • ParsedTableCellOverlay
      TATR cell-structure overlay for parsed tables; click-a-cell to search by its text, double-click to toggle highlight.
  • Toolbar
    • Back-to-dashboard + click-to-rename project name
      Inline edit on the project name, persists to projects.name.
    • Zoom controls (− / % / +) + Fit
      Symmetric bracketed zoom; Fit recalculates for the current page dimensions.
    • Mode toggle (Pointer / Pan / Markup)
      3-state button; keyboard shortcuts A, V.
    • Symbol Search button
      Draw a template bbox to find all instances; exposes Lite / Power / Custom presets for confidence thresholds.
    • Menu dropdown
      Labeling wizard (YOLO training export), Settings, Page Intelligence toggle, Admin link, Help tips toggle, Export PDF (disabled placeholder).
    • Text search
      Full-text search over OCR via Postgres tsvector; highlights on canvas + lists hits in Text panel.
    • Trade filter / CSI code filter
      Filter the CSI Network Graph, View All, and QTO lists by trade or specific CSI division.
    • YOLO toggle (+ per-model confidence sliders)
      Shows when any YOLO annotation is loaded. Dropdown chevron opens per-model sliders (yolo_medium / yolo_primitive / yolo_precise).
    • Six panel toggles
      Text / CSI / LLM Chat / QTO / Schedules/Tables / Keynotes (+ Specs/Notes in the D2 panel orchestrator).
  • Right-side panels (toggleable, stackable)
    • Text Panel (TextPanel.tsx)
      OCR text viewer, searchable, per-word Textract confidence, click-to-jump-to-canvas-position.
    • CSI Panel (CsiPanel.tsx)
      Detected CSI MasterFormat codes grouped by division; toggle between page-scope and project-scope; click a code to highlight triggers on canvas.
    • LLM Chat Panel (ChatPanel.tsx)
      Project- or page-scoped chat, streams via SSE; has 20 tools (search, read-page, highlight, zoom, list-schedules, count-takeoff, etc.).
    • QTO Panel (TakeoffPanel.tsx) — quantity takeoff
      • Count tab
        Click-to-count with color-coded markers; auto-deduplicates via YOLO tag bindings where available.
      • Area tab (+ Scale Calibration + Bucket Fill + Split Area)
        Polygon draw or bucket-fill flood with text-as-wall barrier detection; Scale Calibration is a 2-point known-dimension flow; Split Area slices a saved polygon.
      • Linear tab
        Polyline length; same scale-calibration model as Area.
      • Auto-QTO tab
        Suggests pages with likely schedules (ensemble-driven after Stage 2b); “Find & Parse Doors Schedule” style shortcuts auto-trigger Table Parse.
      • All tab
        Flat list of every committed takeoff item, exportable to CSV.
    • Schedules/Tables Panel (TableParsePanel.tsx)
      • Auto Parse
        Multi-method merger: OCR-positions, Textract TABLES, OpenCV lines, img2table. Returns a consolidated grid + confidence.
      • Guided Parse
        User draws region, server proposes row/col boundaries, user drags to adjust, client extracts cells.
      • Manual Parse
        User draws column BBs + row BBs; grid extraction runs client-side via word-center hit-test.
      • Compare / Edit
        Side-by-side method outputs; edit cell text and re-save.
      • Map Tags section
        Bind tag column of a parsed table to YOLO tag instances; auto-infers scope + pattern.
    • Specs/Notes Panel (SpecsNotesPanel.tsx) — D2 orchestrator
      • Spec Parse tab
        Stage 5 scope, currently stubbed. Will target full-page vertical-column spec layouts (PART / SECTION / GENERAL NOTES dense prose).
      • Notes Parse tab (NotesPanel.tsx)
        • Index
          Project-wide table of detected note regions from summaries.notesRegions; row click jumps to page and opens Parser pre-filled with the region bbox.
        • Classifier
          Per-page Accept / Edit / Reject cards for Layer-1 classified textRegions (notes-numbered + notes-key-value). Accept one-click-promotes via /api/regions/promote; Reject persists to rejectedTextRegionIds with stale-ID cleanup.
        • Parser — Auto sub-mode
          Server runs parseNotesFromRegion (numbered-first, K:V fallback) + CSI detection; client shows dashed preview on canvas.
        • Parser — Guided sub-mode
          Propose row/col boundaries, user drags on GuidedParseOverlay, client extracts grid.
        • Parser — Fast-manual sub-mode (pending rework)
          Double-click Textract LINE to snap columns. Known-broken on dense multi-line paragraphs; scheduled for redesign as ParagraphOverlay primitive.
        • Parser — Manual sub-mode
          Draw column BBs + row BBs; grid extracted client-side via word-center hit-test. The always-works fallback.
      • Keynotes tab (KeynotePanel.tsx)
        • All Keynotes
          Flat list of every parsed keynote table across the project; CSV export.
        • Auto / Guided / Manual / Compare
          Same sub-mode taxonomy as Table Parse but scoped to bubble-keyed keynote grids.
    • Page Intelligence Panel (PageIntelligencePanel.tsx)
      Read-only dump of pageIntelligence for the current page — classification, crossRefs, textRegions, noteBlocks, heuristicInferences, ensembleRegions. Debug/inspection surface.
    • View All Panel (ViewAllPanel.tsx)
      Project-wide list with per-entity eye toggles (master-eye memento); surfaces schedules, parsed tables, keynotes, notes, specs, YOLO tags, CSI codes. Clickable-graph substrate for future LLM-side reasoning.
  • Bottom Annotation Panel
    Horizontal summary row grouping Markups, YOLO detections, and takeoff items by source; filter chips per category.
  • Dialogs / modals
    • Markup dialog
      Name + note + color on markup save.
    • Bucket Fill Assign dialog
      Assign a filled region to an Area item + color; surfaces HTTP errors inline.
    • Scale Calibration dialog
      Two-point calibration with known real-world distance + unit selector (ft, in, m, mm).
    • Symbol Search config
      Confidence presets (Lite / Power / Custom) + per-project defaults.
    • Export CSV modal
      Keynote / Schedule / Notes export with column selection.
  • Standalone tools (trigger from toolbar or panels)
    • Symbol Search
      Draw a bbox around any symbol on the page; CV matcher finds every other instance across the project.
    • Bucket Fill
      Flood-fill area computation from a click point; text-as-wall paradigm with 1k/2k/3k/4k resolution slider; assigns to Area item with error surfacing.
    • Split Area
      Slice a saved polygon with a user-drawn line into two children.
    • Shape Parse
      Python/OpenCV keynote-shape detector (circles, hexagons, diamonds, pills, squares). Runs at upload; results live in pages.keynotes.
    • Scale Calibration
      Per-page; stored in scaleCalibrations[pageNumber]. Required before any Area/Linear takeoff produces real-world units.
  • ParsedRegion outputs (write path from Viewer into the graph)
    • type: "schedule"
      Tabular grid from Schedules/Tables panel.
    • type: "keynote"
      Key → Description grid from the Keynotes tab.
    • type: "notes"
      Notes-numbered or notes-key-value grid from Notes Parse.
    • type: "spec" (Stage 5 planned)
      Section-header → body list from Spec Parse.
    • type: "legend"
      Symbol legend variant; shares NotesData shape.
    All types commit through the generic POST /api/regions/promote route. Server merges CSI tags into pages.csiCodes via mergeCsiCodes and refreshes projectIntelligence.summaries after the transaction commits.
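To picture the write path, here is a hedged sketch of what a notes-region commit to /api/regions/promote might carry; the field names are inferred from the descriptions above rather than taken from the actual route contract.

```ts
// Hypothetical shape of a commit to POST /api/regions/promote.
// Field names are inferred from the surrounding docs, not the real contract.
const promoteRequest = {
  projectId: 42,
  pageNumber: 7,
  region: {
    type: "notes", // "schedule" | "keynote" | "notes" | "spec" | "legend"
    bbox: { x: 0.62, y: 0.08, w: 0.33, h: 0.41 }, // normalized 0-1 page coords
    grid: [
      ["1", "ALL DIMENSIONS TO FACE OF STUD U.N.O."],
      ["2", "PROVIDE SOLID BLOCKING AT ALL WALL-MOUNTED FIXTURES."],
    ],
    csiCodes: [{ code: "06 10 00", confidence: 0.75 }], // merged via mergeCsiCodes
  },
};
// Server side: write the ParsedRegion into pageIntelligence.parsedRegions,
// merge CSI tags into pages.csiCodes, then refresh projectIntelligence.summaries.
```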

Anatomy

Top: the toolbar. Left: a collapsible page sidebar with thumbnails. Center: the canvas, which renders the current page and its overlays. Right: a stack of toggleable panels — Text, CSI, LLM Chat, QTO, Schedules/Tables, Keynotes, Page Intelligence — which fly in from the right edge when activated. Bottom: the Annotation Panel, a summary row grouping markups, YOLO detections, and takeoff items by source.

[Diagram: viewer chrome — one canvas, six toggleable panels. Toolbar: zoom · mode · symbol · menu · search · trade · CSI · YOLO · 6 panel toggles. Sidebar: page thumbnails. Canvas: PDF page + overlay layers (Search · TextAnnotation · Keynote · Annotation · ParseRegion · GuidedParse · DrawingPreview). Right panels: Text, CSI, LLM Chat, QTO, Tables, Keynotes. Annotation Panel: grouped by source (MARKUPS · YOLO · TAKEOFF); click an annotation to select, hover for tooltip. Every region is independently toggleable; panels remember their open state per session.]
ViewerAnatomyDiagram — toolbar, sidebar, canvas with overlay layers, the right-side panel stack, and the bottom annotation panel. Each region toggles independently.

The toolbar

The toolbar is dense by design — a working estimator needs every mode and every panel within one click of the canvas. Below is a live static rendition with fake data so you can see the exact layout and control styling without loading a real project.

ToolbarDemo — a pixel-for-pixel copy of src/components/viewer/ViewerToolbar.tsx, minus the Zustand wiring.

From left to right: the back arrow returns to the project dashboard; the project name is click-to-rename; the - and + buttons bracket the current zoom percentage and the Fit button auto-fits the page; the 3-state mode toggle selects Pointer / Pan / Markup; the Symbol button opens a draw-a-bbox-to-find-all-instances workflow; the Menu button opens the dropdown (shown below). The right half of the toolbar carries the text search, the trade filter, the CSI code filter, the YOLO toggle (with per-model dropdown when multiple models are loaded), and the six panel toggles.

Modes: pointer, move, markup

The canvas has three mutually-exclusive modes, controlled by setMode() in the viewer store. The internal mode values are "pointer", "move", and "markup". Keyboard shortcuts are A (pointer), V (pan/move), and switching to Markup mode activates the drawing tools. Pointer mode clicks on overlays to select them; move mode click-drags the canvas to pan and mouse-wheel zooms; markup mode lets you draw rectangles, polygons, or freehand strokes.

mode = "pointer"
ModeToggleDemo — click to cycle through pointer / move / markup.

Markup mode

Markup annotations are user-authored overlays: a rectangle, polygon, or freehand stroke with an associated name and optional multi-line note. Each markup gets one of twenty colors drawn from the TWENTY_COLORS palette (src/types/index.ts), and the markup dialog captures a name and note on save. Markups show up in the Annotation Panel at the bottom of the viewer under the MARKUPS category, and they're persisted to the annotations table with source = "user".

ColorSwatchDemo — the 20-color palette. Import is live from TWENTY_COLORS so it stays in sync.
MarkupDialogDemo — the real MarkupDialog.tsx mounted inside the docs. Click to open it.

Menu dropdown

The menu collects operations that don't belong on the main toolbar: a labeling wizard for building YOLO training sets, a settings modal, a toggle for the Page Intelligence panel, a link to the admin dashboard, and a help tips toggle that reveals contextual tooltips across the UI. Export PDF is present but disabled — it's the obvious future feature.

MenuDropdownDemo — items verbatim from ViewerToolbar.tsx:288–331.

YOLO controls in the toolbar

When a project has any YOLO annotations loaded, the purple YOLO button appears. It toggles the canvas overlay and opens the Detection Panel. When multiple models are loaded, the dropdown chevron reveals a per-model panel with independent confidence sliders — useful for tuning the output on a project where yolo_medium is noisy but yolo_precise is conservative.

YOLO runs from admin only
The toolbar YOLO toggle only displays results. To actually run YOLO inference, go to Admin → AI Models and start a SageMaker Processing job. Section 05 explains the full pipeline.
ConfidenceSliderDemo — per-model confidence slider mirroring the viewer toolbar dropdown.

Right-side panel toggles

The right half of the toolbar holds six panel toggles. Panels slide in from the right edge and can be stacked. Each is independently toggleable and each keeps its own internal state.

| Panel | Purpose | Lives in |
| --- | --- | --- |
| Text | OCR text viewer, searchable, shows per-word confidence from Textract. | TextPanel.tsx |
| CSI | Detected CSI MasterFormat codes grouped by division. Page / project scope. | CsiPanel.tsx |
| LLM Chat | Project- or page-scoped chat with 20 tools. Streams via SSE. | ChatPanel.tsx |
| QTO | Quantity takeoff: Count, Area, Linear, Auto-QTO, and All tabs. | TakeoffPanel.tsx |
| Schedules/Tables | Parsed tables with Auto / Guided / Manual / Compare tabs and Map Tags. | TableParsePanel.tsx |
| Keynotes | Detected keynote symbols (circles, hexagons) with per-shape summaries. | KeynotePanel.tsx |

Canvas overlays

The canvas mounts several overlay layers on top of the rendered page. They stack in a stable z-order and each can be toggled or filtered independently. All overlays operate in normalized 0–1 page coordinates so they stay aligned when the user zooms or the page dimensions change across pages.
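That normalization contract is worth making concrete. A minimal sketch of the conversion every overlay performs, assuming a hypothetical helper name:

```ts
// Hypothetical helper: normalized 0-1 page coordinates -> canvas pixels.
// Overlays store boxes this way so zoom changes and differing page sizes
// never invalidate stored geometry.
interface NormBox { x: number; y: number; w: number; h: number }

function toCanvasRect(
  box: NormBox,
  pageWidthPx: number,   // rendered page width at the current zoom
  pageHeightPx: number,  // rendered page height at the current zoom
): { left: number; top: number; width: number; height: number } {
  return {
    left: box.x * pageWidthPx,
    top: box.y * pageHeightPx,
    width: box.w * pageWidthPx,
    height: box.h * pageHeightPx,
  };
}
```

The layers, in stacking order: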

  • SearchHighlightOverlay — yellow boxes around tsvector search hits.
  • TextAnnotationOverlay — boxes around detected phone numbers, equipment tags, room names, and other text-annotation matches.
  • KeynoteOverlay — detected keynote shapes (circles, hexagons, diamonds) with their inner text.
  • AnnotationOverlay — the main YOLO + user markup layer. Click-to-select, click-to-edit.
  • ParseRegionLayer — saved table parse regions, click to jump to the parsed data.
  • GuidedParseOverlay — the live grid lines rendered while tuning a Guided Parse.
  • DrawingPreviewLayer — the rubber-band preview while the user is drawing a new markup.

State management: the 17 slice hooks

The viewer's state lives in a single Zustand store at src/stores/viewerStore.ts (1,986 lines). The store is large but access is scoped through seventeen slice hooks — each hook returns a narrow set of fields memoized by useShallow, so components only re-render on changes to their own slice.

viewerStore.ts — 17 slice hooks around one Zustand store. Subscribe via slice hooks (not individual fields) to minimize re-renders; line numbers are verbatim:

  • useViewerStore (L609) — ViewerState, ~1400 LOC body.
  • useNavigation (L1675) — pageNumber, numPages, mode.
  • usePanels (L1686) — 12 showX flags + toggles.
  • useSelection (L1726) — multi-select ids + helpers.
  • useAnnotationGroups (L1737) — groups, memberships, upsert.
  • useDrawingState (L1751) — _drawing/_drawStart/_drawEnd/_mousePos.
  • useSymbolSearch (L1764) — results, confidence, dismissed.
  • useChat (L1792) — messages, scope.
  • useTableParse (L1803) — step, region, grid, col/row BBs.
  • useKeynoteParse (L1832) — step, region, yolo-class bind.
  • useProject (L1859) — projectId, publicId, dataUrl, isDemo.
  • usePageData (L1882) — pageNames, pageDrawingNumbers.
  • useDetection (L1901) — annotations, showDetections, filters.
  • useYoloTags (L1921) — tags, activeId, visibility, picking mode.
  • useTextAnnotationDisplay (L1940) — shown types + colors + hidden set.
  • useAnnotationFilters (L1956) — active filter, csi filter, trade filter.
  • useQtoWorkflow (L1969) — active wf, cell structure, toggleCellHighlight.
  • useSummaries (L1977) — summary arrays + chunk loader state.

Rule: prefer a slice hook over useViewerStore(s => s.field). Slice hooks use useShallow, so components only re-render when their slice actually changes.
17 slice hooks fan out from useViewerStore. Line numbers are from the current viewerStore.ts.
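For orientation, the slice-hook pattern looks roughly like this; a sketch rather than the verbatim source, and the store import path assumes the usual @/ alias.

```ts
import { useShallow } from "zustand/react/shallow";
import { useViewerStore } from "@/stores/viewerStore"; // assumed alias path

// Sketch of the slice-hook pattern, not the verbatim source. The hook
// narrows the single store to a few fields, and useShallow suppresses
// re-renders unless one of those fields actually changes.
export function useNavigation() {
  return useViewerStore(
    useShallow((s) => ({
      pageNumber: s.pageNumber,
      numPages: s.numPages,
      mode: s.mode,
      setPage: s.setPage, // setter name is a placeholder assumption
    })),
  );
}
```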
Rule of thumb for new UI
Before adding a new visibility flag or filter, grep the store for an existing slice that fits. Binding a new panel to an existing slice means two-way sync with the toolbar and ViewAllPanel eye icons is automatic — no drift risk. Adding a parallel state field usually re-discovers a bug that was already fixed once.

The canvas render gate (drift hazard)

src/components/viewer/AnnotationOverlay.tsx is the center of the drawing logic — 2,581 lines that handle every canvas mode, hit testing, bucket fill commit, split-area, vertex edit, polygon drawing, symbol search, markup, calibration, and keynote/table parse region selection. The file has one structural trap that bit the group-tool fix on 2026-04-19 and keeps coming back: adding a new mode requires touching four places.

The 4-point render gate in AnnotationOverlay.tsx — a new canvas mode must be added to all four conditions, or the canvas drifts (dead clicks, wrong cursor, stolen events):

  1. canvasWantsEvents (L2510–2520) — boolean, true if any mode needs pointer events: activeTakeoffItemId !== null || bucketFillActive || calibrationMode !== "idle" || polygonDrawingMode === "drawing" || mode === "markup" | "pointer" | "group" || tableParseStep / keynoteParseStep !== "idle" || symbolSearchActive || splitAreaActive.
  2. canvasShouldRender (L2521–2527) — if false, the whole canvas returns null before any overlay renders: pageAnnotations.length > 0 || polygonDrawingMode !== "idle" || pendingMarkup !== null || canvasWantsEvents. Annotations on the page, drawing previews, and pendingMarkup all independently force the canvas to mount.
  3. pointerEvents (L2550) — inline style on the <canvas> element: tempPanMode ? "none" : canvasWantsEvents ? "auto" : "none". Hold "v" (tempPanMode) and the canvas becomes transparent to events.
  4. cursor (L2554) — inline style chain, one ternary arm per mode: splitAreaActive ? "crosshair" : bucketFillActive ? (custom SVG cursor) : calibrationMode ? "crosshair" : polygonDrawingMode ? "crosshair" : ...

The drift hazard: adding a new tool means edits to all four locations. Miss #1 and the canvas eats events it shouldn't. Miss #2 and the canvas vanishes. Miss #3 and clicks fall through to the underlying page. Miss #4 and the cursor lies about the mode. The group-tool fix of 2026-04-19 was exactly this bug.
Canvas render gate — four coupled conditions in AnnotationOverlay.tsx. Missing any one produces a different silent regression.

The companion architecture doc at featureRoadMap/BPArchitecture_422.md contains the full mode table and exact line numbers if you're about to add a new tool.

Scale calibration and measurement units

Before any area or linear takeoff will produce real-world numbers, the user has to calibrate the page scale. You click Set Scale in the Area tab, click two points on a known dimension (a grid line, a labeled wall), and enter the real-world distance plus a unit. Calibration is stored per page in scaleCalibrations[pageNumber] — reusing a polygon on a new page requires recalibrating unless the pages share the same scale.

AreaUnitChipDemo — the four base units from src/components/viewer/AreaTab.tsx.
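The calibration math itself is two points and a division. A sketch, with hypothetical function names:

```ts
// Hypothetical sketch of the two-point calibration math. p1/p2 are the
// clicked points in page pixels; knownDistance is the user-entered
// real-world length in the chosen unit (e.g. 24 ft).
function calibrate(
  p1: { x: number; y: number },
  p2: { x: number; y: number },
  knownDistance: number,
): number {
  const pixels = Math.hypot(p2.x - p1.x, p2.y - p1.y);
  return pixels / knownDistance; // pixels per unit, stored per page
}

// Later, measurements convert through the stored factor:
function pixelsToUnits(px: number, pixelsPerUnit: number): number {
  return px / pixelsPerUnit;             // linear length
}
function pixelAreaToUnits(areaPx: number, pixelsPerUnit: number): number {
  return areaPx / (pixelsPerUnit ** 2);  // area scales by the square
}
```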
Engines

From PDF to Structured Data

In plain English: when you upload a PDF, BP opens it, splits it into pages, reads every word with OCR, and runs a sequence of small analyses — what kind of drawing is this, what's in the title block, what CSI codes appear, are there tables — and stores the result in a compact JSON per page. A 200-page set finishes in roughly 5 to 10 minutes on the default Fargate tier.

The preprocessing pipeline is the load-bearing part of BP. Everything the viewer, the LLM, and the takeoff engine depend on — CSI codes, page classifications, text annotations, detected tables, cross-references, note blocks, the CSI spatial heatmap, and the CSI network graph — is computed during preprocessing and then read back on demand. This section walks through what actually happens between POST /api/projects and the moment the viewer loads its first page.

Entry point and orchestration

The pipeline is triggered when the projects route creates a project row. On local development, it's invoked inline via processProject(projectId) in src/lib/processing.ts. On AWS, the same function runs inside the cpu-pipeline ECS task, launched by an AWS Step Functions state machine (infrastructure/terraform/stepfunctions.tf). In both cases the public-ID lookup, the processing body, and the post-processing project analysis are identical — the state machine just gives you durable retries, CloudWatch logging, and isolation from the web task.

src/lib/processing.ts — entry signature
```ts
export async function processProject(projectId: number): Promise<{
  pagesProcessed: number;
  pageErrors: number;
  processingTime: number;
}> {
  // ... fetch project, download PDF, count pages ...
  // ... mapConcurrent(pageNums, pageConcurrency, processOnePage) ...
  // ... analyzeProject + computeProjectSummaries + warmCloudFrontCache
}
```

The 14 per-page stages

The diagram below shows the exact order each page moves through. Every stage is individually wrapped in a try/catch: a failure in Textract doesn't prevent text annotation detection from running on the (possibly empty) output, a failure in CSI detection doesn't prevent heuristics from firing, and so on. Per-page errors are written to pages.error so you can spot partial results in the admin dashboard without the whole project being marked as failed.
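That isolation boils down to a per-stage guard. A sketch of the pattern, not the literal source:

```ts
// Sketch of the stage-isolation pattern: each stage is independently
// guarded, and a failure is recorded without aborting later stages.
async function runStage<T>(
  name: string,
  pageErrors: string[],
  fn: () => Promise<T>,
  fallback: T,
): Promise<T> {
  try {
    return await fn();
  } catch (err) {
    pageErrors.push(`${name}: ${(err as Error).message}`); // ends up in pages.error
    return fallback; // downstream stages run against the (possibly empty) fallback
  }
}

// e.g. an OCR failure yields empty text, but CSI detection still runs on it:
// const ocr = await runStage("ocr", errs, () => analyzePageImageWithFallback(img), emptyOcr);
// const csi = await runStage("csi", errs, () => detectCsiCodes(ocr), []);
```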

[Diagram: processProject() — the 14 per-page stages, run with concurrency 8, each wrapped in try/catch so a failure does not stop subsequent stages: 1 rasterize 300 DPI (pdf-rasterize.ts) → 2 upload PNG + thumb (s3.ts) → 3 re-raster if > 9500 px → 4 OCR (textract.ts) → 5 raw text extract → 6 drawing number (title-block.ts) → 7 CSI detection (csi-detect.ts) → 8 text annotations (text-annotations.ts) → 9 page intelligence (page-analysis.ts) → 10 text regions (text-region-classifier.ts) → 11 heuristic engine, text-only (heuristic-engine.ts) → 12 table classify (table-classifier.ts) → 13 CSI spatial map (csi-spatial.ts) → 14 upsert + search vector (db/schema.ts). After all pages: analyzeProject() → computeProjectSummaries() → buildCsiGraph() → warmCloudFrontCache().]
PipelineFlowDiagram — all 14 per-page stages, in order, pulled from processing.ts:processProject().
  1. Rasterize at 300 DPI (rasterizePage()) — the full-resolution PNG for display. This is what the viewer's canvas eventually renders.
  2. Upload PNG + 72 DPI thumbnail to S3 — both get Cache-Control: public, max-age=31536000, immutable so CloudFront can cache forever. The thumbnail backs the sidebar.
  3. Re-rasterize at a safe DPI if the 300 DPI image exceeds 9500 px in either dimension — Textract rejects images above 10000 px. A 24×36" sheet at 300 DPI is 10800 px; the pipeline re-rasterizes at roughly 263 DPI in that case. The re-rasterized buffer is only used for OCR; the display image stays at 300 DPI.
  4. OCR via Textract with Tesseract fallback (analyzePageImageWithFallback()) — produces a structured TextractPageData with per-word bounding boxes and confidence scores. If Textract is unreachable or credentials are missing, it falls through to Tesseract.
  5. Flatten OCR into raw text (extractRawText()) — the concatenation used by the PostgreSQL search_vector column for /api/search.
  6. Extract the drawing number from the title block (extractDrawingNumber()). This is what becomes pages.name — e.g., "A-101".
  7. Detect CSI codes (detectCsiCodes()) — the 3-tier matcher. Output is written to pages.csi_codes. Section 04 explains the algorithm.
  8. Detect text annotations (detectTextAnnotations()) — runs the 10 detector modules from src/lib/detectors/registry.ts: contact, codes, dimensions, equipment, references, trade, abbreviations, notes, rooms, csi-annotations. Produces a grouped annotation list with sub-categories.
  9. Analyze page intelligence (analyzePageIntelligence()) — discipline and drawing-type classification, cross-references to other sheets, note blocks. This is the first place the pipeline produces a structured summary of the page.
  10. Classify text regions (classifyTextRegions()) — OCR-based identification of where the tables, schedules, legends, and note blocks live on the page. Produces textRegions[] with confidence scores.
  11. Run the heuristic engine in text-only mode (runHeuristicEngine()). Rules that do not require YOLO classes fire here. Section 05 explains how YOLO-augmented heuristics re-run later, after a YOLO job completes.
  12. Classify tables (classifyTables()) — combines text regions and heuristic inferences into classified table candidates (door schedule, finish schedule, keynote table, etc.).
  13. Compute CSI spatial heatmap (computeCsiSpatialMap()) — divides the page into a 9×9 grid plus title-block and right-margin special zones and tallies CSI instances per zone. Initial pass is OCR-only; a YOLO pass can refresh later.
  14. Upsert the pages row and rebuild the search_vector via a raw SQL to_tsvector('english', rawText). This is the single write-point for the whole per-page pipeline.

Project-level analysis (after all pages complete)

Once every page has finished (or errored), the pipeline switches gears. It reads all processed pages back, passes them to analyzeProject() which computes the discipline breakdown, hub pages, cross-reference graph, and the CSI network graph via buildCsiGraph(). The result — a structured projectIntelligence blob and a short text projectSummary — is written back to the projects row. A separate computeProjectSummaries() pass then builds the per-index lookup tables (CSI → pages, trade → pages, keynote → pages, text-annotation → pages) that lookupPagesByIndex() reads at O(1) from LLM tool calls.

The final step is a best-effort CloudFront cache warm: each page PNG gets a HEAD request so CloudFront edge locations pull it ahead of the first viewer hit. Failures are logged and ignored.

Concurrency and tuning

Pages run in parallel via a small mapConcurrent() helper with a default limit of 8. The limit is per-company configurable through companies.pipelineConfig.pipeline.pageConcurrency and the Admin → Pipeline tab — raise it on a beefy Fargate task, lower it if Textract throttles you. The spatial grid size is also configurable via pipelineConfig.pipeline.csiSpatialGrid.
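A bounded-concurrency map is only a few lines; here is a plausible implementation of the idea, though the real helper in processing.ts may differ in detail.

```ts
// Plausible implementation of a bounded-concurrency map; the real
// helper in processing.ts may differ.
async function mapConcurrent<T, R>(
  items: T[],
  limit: number, // e.g. pipeline.pageConcurrency = 8
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function worker() {
    while (next < items.length) {
      const i = next++; // workers pull the next unclaimed index
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker),
  );
  return results;
}
```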

Idempotency
If a page already has textract_data stored, the per-page body is skipped. Re-triggering processing on an existing project (via /api/admin/reprocess) will reuse completed pages and only work on missing or errored ones. To force a full re-run, delete the project rows in the DB first or zero out the textract_data column for the pages you want redone.
Engines

CSI as a Token-Efficient Blueprint Encoding

In plain English: CSI MasterFormat is the construction industry's Dewey Decimal system. Every spec section has a 6-digit code like "08 14 00 = Wood Doors". BP detects these codes on every page, uses them to summarize what's on the sheet, and builds a project-wide graph showing how trades connect. An LLM can navigate a 200-page set through that graph without ever reading most of the OCR.

The hardest thing about putting a construction project in front of an LLM is the raw token cost. A typical 200-page drawing set runs to ~2 million characters of OCR text — roughly 500,000 tokens — and the useful content is scattered across specifications, notes, schedules, dimensions, legends, and title blocks. Dumping all of that into a context window is both expensive and counterproductive: the model gets lost in the noise.

BP's answer is the CSI engine: a three-layer encoding that turns a page into a structured, compact tag set, turns a project into a navigable graph, and lets the LLM zoom from project-level structure down to individual pages through tool calls rather than by paging through OCR. CSI codes are the primary key — they're a shared vocabulary across all construction documents and map directly to how estimators and specifiers think.

Why CSI and not raw keywords

CSI MasterFormat is an industry-standard classification system maintained by the Construction Specifications Institute. Every specification section has a code like 08 14 00 — Division 08 (Openings), section 14 (Wood Doors), subsection 00 (general). Division is the most useful unit: it maps to trade, it's stable across projects, and it's dense enough that 25 divisions can meaningfully describe any project while being small enough to fit the whole project's division breakdown into a single paragraph of LLM context.

Because CSI is a closed vocabulary, BP can turn a full page of OCR (easily 4–10k characters) into a single short tag list like [22 00 00, 23 05 00, 26 05 00] plus confidence scores and then look up detailed division data on demand through tool calls. That pattern is what lets BP scale LLM chat to 200-page projects without ever exceeding a Sonnet-sized context budget.

22 00 00 Plumbing (95%) · 23 00 00 HVAC (88%) · 26 00 00 Electrical (92%) · 08 14 00 Wood Doors (78%) · 03 30 00 Cast-in-Place Concrete (85%) · 09 51 13 Acoustical Panel Ceilings (62%)
CSI chips — how detected codes show up across the UI.

Layer 1: per-page detection (3-tier algorithm)

src/lib/csi-detect.ts implements a rule-based matcher against a MasterFormat database. The matcher runs three tiers in order of specificity; a code can be tagged by any tier it passes, and the tier with the highest confidence wins. Defaults:

| Tier | What it matches | Confidence | Why it exists |
| --- | --- | --- | --- |
| Tier 1 | Exact consecutive-word subphrase from the MasterFormat description (e.g. 'cast-in-place concrete' anywhere in the OCR). | 0.95 | High-signal: the literal phrase is in the text, which essentially never happens by accident. |
| Tier 2 | Bag-of-words overlap — at least tier2MinWords significant words from the description appear anywhere on the page (stop words excluded). | ≤ 0.75 (tier2Weight) | Catches rephrased matches: 'acoustical ceiling panel' matches 'Acoustical Panel Ceilings' without insisting on word order. |
| Tier 3 | Keyword-anchor — at least tier3MinWords high-signal anchor words match a description. | ≤ 0.50 (tier3Weight) | Fallback: rescues obvious trades (plumbing, electrical, HVAC) when neither subphrase nor bag-of-words hits. |

The matcher keeps only codes whose final score beats matchingConfidenceThreshold (default 0.40). All defaults are overridable per-company through companies.pipelineConfig.csi and the Admin → CSI tab, which also lets admins upload a custom CSI database TSV (useful for trades like fire alarm that benefit from an expanded vocabulary).

src/lib/csi-detect.ts:38-52 — DEFAULT_CONFIG
```ts
const DEFAULT_CONFIG: CsiDetectConfig = {
  matchingConfidenceThreshold: 0.4,
  tier2MinWords: 3,
  tier3MinWords: 5,
  tier2Weight: 0.75,
  tier3Weight: 0.50,
};
```
Try the live detector. The input hits /api/csi/detect with a debounce and renders the returned tier + confidence. Falls back to a static example when unauthenticated.
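To make Tier 2 concrete, here is a simplified sketch of bag-of-words matching under the defaults above; the real matcher in csi-detect.ts handles stop words, normalization, and scoring more carefully.

```ts
// Simplified sketch of Tier 2: at least tier2MinWords significant words
// from the MasterFormat description must appear somewhere on the page.
// The stop-word list and coverage scaling are illustrative assumptions.
const STOP_WORDS = new Set(["and", "or", "the", "of", "for", "in"]);

function tier2Match(
  description: string,     // e.g. "Acoustical Panel Ceilings"
  pageWords: Set<string>,  // lowercased OCR vocabulary for the page
  cfg = { tier2MinWords: 3, tier2Weight: 0.75 },
): number {
  const words = description
    .toLowerCase()
    .split(/\W+/)
    .filter((w) => w.length > 2 && !STOP_WORDS.has(w));
  const hits = words.filter((w) => pageWords.has(w)).length;
  if (hits < cfg.tier2MinWords) return 0;
  // Confidence capped at tier2Weight, scaled by coverage of the description.
  return cfg.tier2Weight * (hits / words.length);
}
```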

Layer 2: per-page spatial heatmap

After detection, computeCsiSpatialMap() bins every CSI-tagged text annotation (and, after a YOLO pass, every YOLO-inferred region) into a 9×9 grid plus two special zones: title-block (y > 0.85) and right-margin (x > 0.75, y < 0.85). The output is a list of zones with per-division counts, which is what the LLM sees when it calls getCsiSpatialMap(pageNumber).

The spatial map is how the LLM answers questions like "what's in the top-right of this sheet?" or "where are the MEP systems concentrated?" without having to scan every word box on the page. The 3×3 demo below is a simplified view of a single page; the real default grid is 9×9.

Div 08 (14) | Div 09 (8) | Div 09 (11)
Div 22 (22) | Div 23 (16) | Div 26 (18)
Div 22 (9) | Div 26 (24) | Div 27 (6)
Legend: low → high. The real grid is 9×9 + title-block + right-margin zones.
CSI spatial heatmap — a toy 3×3 grid. Darker = more instances of that division.
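The zone assignment is a handful of comparisons. A sketch using the thresholds quoted above, with the grid size as a parameter (the zone naming is an illustrative assumption):

```ts
// Sketch of zone binning for computeCsiSpatialMap(), using the thresholds
// quoted above. Coordinates are normalized 0-1; zone labels are assumed.
function zoneFor(x: number, y: number, grid = 9): string {
  if (y > 0.85) return "title-block";   // bottom strip
  if (x > 0.75) return "right-margin";  // right strip above the title block
  const col = Math.min(grid - 1, Math.floor(x * grid));
  const row = Math.min(grid - 1, Math.floor(y * grid));
  return `r${row}c${col}`;              // one of 81 cells on the default grid
}
```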

Layer 3: the CSI network graph

At the project level, buildCsiGraph() converts the per-page CSI tags into a graph: nodes are CSI divisions, edges are relationships between divisions (three types: co-occurrence, cross-reference, and containment), and clusters are pre-defined groupings: MEP (22, 23, 26, 27, 28), Architectural (08, 09, 12), Structural (03, 05), and Site (31, 32, 33). The graph carries a fingerprint that BP uses as a cache key so it can avoid re-computing the graph when nothing on the project has changed.

The graph is what makes LLM-driven navigation tractable. Tools like getCrossReferences return hub pages ranked by incoming reference count; lookupPagesByIndex({ index: "csi" }) answers "which pages have Division 22?" in O(1). When the LLM wants to find plumbing plans, it doesn't scan 200 pages — it queries the graph once and gets page numbers back.
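In type terms the graph is compact. An approximate sketch of its shape; the field names are guesses based on the description above, not the verbatim buildCsiGraph() output.

```ts
// Approximate shape of the CSI network graph. Field names are a sketch.
type EdgeType = "co-occurrence" | "cross-reference" | "containment";

interface CsiGraph {
  fingerprint: string;   // cache key; skip the rebuild when unchanged
  nodes: { division: string; pageCount: number }[];
  edges: { from: string; to: string; type: EdgeType; weight: number }[];
  clusters: Record<"MEP" | "Architectural" | "Structural" | "Site", string[]>;
  hubPages: number[];    // ranked by incoming reference count
}
```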

[Demo: CSI division grid — Div 03 Concrete, Div 04 Masonry, Div 05 Metals, Div 06 Wood/Plastics/Composites, Div 08 Openings, Div 09 Finishes, Div 10 Specialties, Div 12 Furnishings, Div 21 Fire Suppression, Div 22 Plumbing, Div 23 HVAC, Div 26 Electrical, Div 27 Communications, Div 28 Electronic Safety, Div 31 Earthwork, Div 32 Exterior Improvements, Div 33 Utilities; clusters: MEP, Architectural, Structural, Site.]
CSI division grid — each division colored by cluster. Colors are imported from src/lib/csi-colors.ts, which is the same module the D3 graph view uses.

How the LLM uses all three layers

All three layers surface to the LLM as tool calls. Section 09 walks through the full tool set, but the CSI-specific story is:

  • getProjectOverview() returns the project-level CSI divisions and cluster membership — the coarse first look.
  • getCsiSpatialMap(pageNumber) returns the per-page heatmap — the "zoom in" query.
  • getCrossReferences(pageNumber?) returns the cross-reference edges and hub pages — navigation.
  • lookupPagesByIndex({ index: "csi", key: "22" }) is the O(1) "give me every page tagged with Division 22" query.
  • detectCsiFromText(text) lets the LLM run the 3-tier matcher on arbitrary input strings (e.g. a user's question).
Why the CSI graph matters for chat
The context builder (src/lib/context-builder.ts) feeds the CSI network graph into the LLM's system context at priority 1.0 — near the top, right after the project report. That means the model sees the division clusters and their edges before it sees raw OCR, so its first tool call is almost always a graph query rather than a full-text search. This is how a chat session starts "hot" even on a 200-page project.
Engines

YOLO Object Detection — Run, Load, Display

In plain English: YOLO is the object-detection model that finds the visual shapes on a drawing — doors, windows, tables, title blocks, tag circles. It's optional and gated behind admin controls because it runs on a GPU instance and costs real money. Once it's run, every downstream feature (Auto-QTO, Map Tags, the spatial heatmap) gets sharper.
YOLO runs only from Admin → AI Models
The viewer's YOLO toolbar button only shows and hides already-loaded detections. It does not kick off inference. To actually run YOLO, you go to Admin → AI Models, pick a model and a project, and click Run. The backend launches a SageMaker Processing job and webhook-ingests the results when the job finishes. Running YOLO costs money (GPU instance hours) and is gated behind a per-company feature toggle and an admin-only permission.

YOLO in BP is the layer that turns blueprints from textual documents into spatially-aware ones. The text pipeline (Section 03) already extracts OCR, CSI codes, classifications, and tables. What it doesn't know is where the doors, windows, grid lines, tables, and title blocks physically are on each page. YOLO solves that. Once YOLO has identified tables, title_block, drawings, door_single, circle, and so on, every downstream feature in BP — Auto-QTO, Map Tags, the spatial heatmap, the heuristic engine, and LLM spatial queries — becomes significantly sharper.

Where the run actually happens

The run path is: admin opens Admin → AI Models, a tab rendered by src/app/admin/tabs/AiModelsTab.tsx, picks a model from the models table and a project, confirms the cost warning, and clicks Run. That fires POST /api/yolo/run, which writes a new processingJobs row and calls startYoloJob() in src/lib/yolo.ts. That function creates an AWS SageMaker Processing job pointing at the YOLO ECR container, mounts the project's pages/ prefix in S3 as input, and sets yolo-output/ as the output destination.

While the job runs (usually a few minutes per project on an ml.g4dn.xlarge), the admin UI polls GET /api/yolo/status every ~5 seconds and shows live status, execution ID, and CloudWatch logs. When the container finishes, it writes per-page detection JSONs to S3; a webhook hits POST /api/yolo/load, which reads the JSONs, normalizes them into the annotations table (with source = "yolo"), and triggers a refresh of the CSI spatial heatmap + heuristic engine in YOLO-augmented mode.
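Under the hood this is a standard SageMaker Processing job. A sketch of roughly what startYoloJob() has to assemble with the AWS SDK; the job name, S3 prefixes, and environment variable names here are illustrative assumptions, not the project's actual values.

```ts
import {
  SageMakerClient,
  CreateProcessingJobCommand,
} from "@aws-sdk/client-sagemaker";

// Illustrative sketch of the job startYoloJob() launches. Example values:
const projectId = 42;
const bucket = process.env.S3_BUCKET!;
const yoloEcrImageUri = process.env.YOLO_ECR_IMAGE!; // hypothetical env var

const sagemaker = new SageMakerClient({});
await sagemaker.send(
  new CreateProcessingJobCommand({
    ProcessingJobName: `bp-yolo-${projectId}-${Date.now()}`,
    RoleArn: process.env.SAGEMAKER_ROLE_ARN!,
    AppSpecification: { ImageUri: yoloEcrImageUri }, // the YOLO container
    ProcessingResources: {
      ClusterConfig: {
        InstanceCount: 1,
        InstanceType: "ml.g4dn.xlarge",
        VolumeSizeInGB: 30,
      },
    },
    ProcessingInputs: [{
      InputName: "pages",
      S3Input: {
        S3Uri: `s3://${bucket}/projects/${projectId}/pages/`,
        LocalPath: "/opt/ml/processing/input",
        S3DataType: "S3Prefix",
        S3InputMode: "File",
      },
    }],
    ProcessingOutputConfig: {
      Outputs: [{
        OutputName: "detections",
        S3Output: {
          S3Uri: `s3://${bucket}/projects/${projectId}/yolo-output/`,
          LocalPath: "/opt/ml/processing/output",
          S3UploadMode: "EndOfJob",
        },
      }],
    },
  }),
);
```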

[Diagram: YOLO run flow — admin click to loaded annotations. Admin → AI Models (pick model + project, click Run) → POST /api/yolo/run (processingJobs row, startYoloJob()) → SageMaker Processing on ml.g4dn.xlarge with the YOLO ECR image (in: pages/, out: yolo-output/) → per-page detection JSONs in S3 (page-N.json) → POST /api/yolo/load webhook on job complete → normalize into the annotations table (source = "yolo") → composite-classifier post-hook refreshes the CSI heatmap + heuristic engine in YOLO-augmented mode. The UI polls /api/yolo/status every ~5 s while the job runs. Safety layers that must all pass: the sagemakerEnabled company toggle (password-gated), the concurrent processingJobs quota cap, the per-user canRunModels permission, and per-company modelAccess grants from the root admin. src/lib/yolo.ts:startYoloJob → infrastructure/terraform/sagemaker.tf.]
YoloRunFlowDiagram — admin click → POST /api/yolo/run → SageMaker Processing → S3 yolo-output → POST /api/yolo/load → annotations + CSI heatmap refresh. Four safety layers gate the run path.

Safety toggles

Because a SageMaker Processing job can cost real money if mis-triggered, BP has several layers of safety:

  • Company-level sagemakerEnabled toggle. Flipped off by default. Flipping it on requires the admin password stored in app_settings. When off, the entire YOLO run path returns an error immediately without touching AWS.
  • Quota enforcement. Per-company concurrent-job caps check against the processingJobs table before starting a new job. Toggleable in the same admin panel.
  • Per-user canRunModels flag. Regular members can view YOLO results but cannot initiate a run. Admins get the flag by default; root admins can grant it selectively.
  • Root-admin-only model sharing. A YOLO model uploaded by one company is not automatically visible to others. The root admin has to grant model access per-company via the modelAccess table.

The Detection Panel

Once a YOLO run completes and results are loaded, they show up in the viewer's Detection Panel (DetectionPanel.tsx, ~780 lines of React). The panel has three sub-tabs:

| Sub-tab | What it shows | How it's built |
| --- | --- | --- |
| Models | Every YOLO annotation grouped by model → class → individual detection. Per-class and per-annotation visibility toggles, a global confidence slider, and a search filter. | Primary view. Reads from annotations where source === "yolo". |
| Tags | YoloTags — user-created tags that bind OCR text (like 'D-01') to specific YOLO shape instances. Created by the Map Tags step (Section 06) or by scan-ins. Each tag shows its instance count, pages, and CSI codes. | Powered by the yolo_tags table. The Tags sub-tab is the main input into Auto-QTO. |
| Shape | Detected primitive shapes on the current page — circles, hexagons, diamonds, etc. Built for keynote tagging and tag-shape discovery. Run on-demand via /api/shape-parse. | Shape-parse is OCR + OpenCV — it does not require a YOLO model run and can be triggered for free. |

Confidence thresholds and filters

Each YOLO model in BP carries a confidence threshold (default 0.25). The threshold applies both to storage (low-confidence detections can be filtered at ingest by the admin config) and to display — the toolbar's per-model slider in the YOLO dropdown filters the overlay live without mutating the underlying data.

ConfidenceSliderDemo (shown at 25%) — matches the per-model slider in the viewer's YOLO dropdown.

On top of confidence, the toolbar exposes a trade filter (dropdown populated from the distinct trades inferred from CSI codes) and a CSI code filter (searchable dropdown). Both apply to the canvas overlay independently of confidence; they let estimators zero in on a single scope without fighting with confidence sliders.
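
Conceptually, the canvas overlay applies a pure predicate over the loaded annotations. A sketch under that assumption — the field names and state shape are illustrative, not the viewer's actual store:

Illustrative overlay predicate (hypothetical shapes)
interface YoloAnnotation { modelId: number; confidence: number; trade?: string; csiCode?: string }

function visibleOnCanvas(
  a: YoloAnnotation,
  perModelMin: Map<number, number>, // the per-model confidence sliders
  tradeFilter?: string,             // toolbar trade dropdown
  csiFilter?: string,               // toolbar CSI code dropdown
): boolean {
  // Display-only: failing the slider hides the box but never mutates the row
  if (a.confidence < (perModelMin.get(a.modelId) ?? 0.25)) return false;
  if (tradeFilter && a.trade !== tradeFilter) return false;
  if (csiFilter && a.csiCode !== csiFilter) return false;
  return true;
}
ts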

Sample YOLO classes

BP ships reference models trained on construction drawings. The specific classes available depend on which models are registered in the models table for your company. Some commonly-useful classes:

door_single ×42, door_double ×8, window ×28 (yolo_medium) · circle ×156, hexagon ×24 (yolo_primitive) · tables ×14, title_block ×52, drawings ×68 (yolo_precise)
Sample YOLO class chips with illustrative counts.

The tables, title_block, and drawings classes are special: Auto-QTO (Section 07) strictly requires them. The drawings class marks the content region of a sheet, and tables + title_block mark regions to exclude from counts (so you don't double-count tags that appear inside a schedule).

How YOLO stacks with heuristics

The heuristic engine (src/lib/heuristic-engine.ts) runs in two modes. Text-only mode fires during the initial processing pass; YOLO-augmented mode re-runs after YOLO data loads. Each rule has optional yoloRequired and yoloBoosters fields. A rule like "if the page contains the word 'concrete' AND a tables class was detected, infer schedule_present with CSI division 03" will skip silently during text-only mode and fire when YOLO runs later.

Shape of a heuristic rule (src/lib/heuristic-engine.ts)
{
  id: "concrete-schedule",
  outputLabel: "schedule_present",
  outputCsiCode: "03",
  minConfidence: 0.6,
  textKeywords: ["concrete", "mix design"],
  yoloRequired: ["tables"],          // will skip until YOLO runs
  yoloBoosters: ["title_block"],     // adds confidence if present
  spatialConditions: [
    { type: "contains", region: "tables", textRegion: "header" },
  ],
}
ts
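
The two-mode behavior reduces to a guard: a rule that names yoloRequired classes skips (rather than fails) until those classes exist, and yoloBoosters only add confidence when present. A hedged sketch of that evaluation — the function shape and the boost size are assumptions, not the real engine:

Illustrative rule evaluation (hypothetical, simplified from src/lib/heuristic-engine.ts)
interface HeuristicRule {
  id: string;
  minConfidence: number;
  textKeywords?: string[];
  yoloRequired?: string[];
  yoloBoosters?: string[];
}

// yoloClasses is null during the text-only pass, a Set after YOLO loads
function evaluateRule(rule: HeuristicRule, pageText: string, yoloClasses: Set<string> | null) {
  // Skip silently if required YOLO classes aren't available yet — fires on the re-run
  if (rule.yoloRequired?.length && !rule.yoloRequired.every(c => yoloClasses?.has(c)))
    return null;

  const text = pageText.toLowerCase();
  if (rule.textKeywords && !rule.textKeywords.some(k => text.includes(k))) return null;

  let confidence = rule.minConfidence;
  for (const booster of rule.yoloBoosters ?? [])
    if (yoloClasses?.has(booster)) confidence += 0.1; // illustrative boost size

  return { ruleId: rule.id, confidence: Math.min(confidence, 1) };
}
ts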

This is the stacking story the rest of the docs will refer back to: YOLO models are not a replacement for heuristics, they're an additional signal that heuristics can chain on top of. A new YOLO class becomes a new input for existing rules, a new input for Auto-QTO, and a new input for the CSI spatial heatmap — without anyone having to touch the rules. Tools stack.

Per-class CSI auto-tagging
You can assign CSI codes directly to a YOLO class in the admin model config. Once set, every annotation of that class automatically inherits the codes — so detecting a water_heater class immediately contributes to Division 22 on the CSI heatmap and graph. That's another place where a new model or a tagged class automatically flows into every other feature without code changes.
Engines

Parsing Schedules and Mapping Tags

In plain English: a door schedule has one row per unique door type ("D-01 = solid wood, 90-min fire rated") and the floor plans have circles with "D-01" written inside them, one for each physical door. Counting the circles = counting the doors. BP parses the schedule into a grid, finds every circle containing each tag, and gives you the count.

Schedules — door schedules, finish schedules, equipment lists, plumbing fixture schedules, electrical panel schedules — are where the ground-truth quantities live on a drawing set. Each row has a tag (D-01, W-03, P-12) that shows up as a circle, hexagon, or diamond somewhere on the floor plans, so turning a schedule into a quantity means (a) extracting the schedule into rows and headers, (b) identifying which column holds the tag, and (c) finding every occurrence of those tags elsewhere in the drawings. BP's table parsing and Map Tags systems exist to do exactly that.

The Schedules/Tables panel

Open the Schedules/Tables panel from the toolbar (the pink-accented button). The panel renders TableParsePanel.tsx, which orchestrates five tabs: All Tables, Auto Parse, Guided, Manual, and Compare/Edit Cells. Each tab points at the same saved region data (pageIntelligence.parsedRegions[]) but exposes a different parsing strategy.

TableParsePanel tab strip (tableParseTab = "auto") — labels verbatim from TableParsePanel.tsx lines 306–319.

Auto Parse — try everything, pick the best

Auto Parse is the default path. You draw a bounding box around the table region on the canvas and click Process Regions. The backend runs a multi-method parse pipeline and merges the results into a single grid. Each method is a fallback for the others — a grid-line-heavy schedule is easy for Camelot; a crowded CAD-printed one is easier for TATR or img2table.

  • img2table — Fast, handles grid-based tables well, returns cell-level bboxes. (src/lib/img2table-extract.ts)
  • Camelot (pdfplumber-backed) — PDF-native extraction via vector paths; works when the PDF has text layers. (src/lib/camelot-extract.ts)
  • TATR (Table Transformer) — Transformer-based structure inference; robust to scanned / image-only tables. (src/lib/tatr-structure.ts)
  • OCR grid detect — OpenCV line detection + OCR word clustering; the pure-image fallback. (src/lib/ocr-grid-detect.ts)

Results are merged via src/lib/grid-merger.ts and persisted through POST /api/table-parse. The saved parse becomes a parsedRegion on the page with { type: "schedule", data: { headers, rows, tagColumn, colBoundaries, rowBoundaries, csiTags } }.

Guided Parse — tune row/column detection

Sometimes auto parse gets the structure nearly right but misplaces a row boundary or merges two columns. Guided Parse is the answer. You still draw a region, but the panel exposes three tuning sliders whose defaults live in GuidedParseTab.tsx:44 (a row-clustering sketch follows the list):

Guided Parse tuning sliders. Defaults: rowTolerance = 0.006, minColGap = 0.015, minHitsRatio = 0.3.
  • Row tolerance — maximum vertical drift between OCR tokens considered part of the same row. Too tight and valid rows split; too loose and adjacent rows merge.
  • Min column gap — smallest horizontal gap that separates two columns. If the panel is merging columns, widen this.
  • Min hits ratio — fraction of rows a column must appear in to count as a real column. Filters out one-off blobs.
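
Row tolerance is easiest to understand as a 1-D clustering threshold over token y-centers. A sketch under that assumption — not the real parser, which also has to reconcile column structure:

Illustrative row clustering (hypothetical, normalized 0–1 coordinates)
interface OcrToken { text: string; yCenter: number }

function clusterRows(tokens: OcrToken[], rowTolerance = 0.006): OcrToken[][] {
  const sorted = [...tokens].sort((a, b) => a.yCenter - b.yCenter);
  const rows: OcrToken[][] = [];
  for (const tok of sorted) {
    const row = rows[rows.length - 1];
    // Start a new row when vertical drift from the previous token exceeds tolerance
    if (!row || tok.yCenter - row[row.length - 1].yCenter > rowTolerance) rows.push([tok]);
    else row.push(tok);
  }
  return rows; // too tight → valid rows split; too loose → adjacent rows merge
}
ts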

As you drag a slider, the panel re-posts to POST /api/table-parse/propose with debounce and redraws the grid line overlay on the canvas. When the grid looks right, you save the parse through the same /api/table-parse endpoint as Auto Parse.

Manual Parse and Compare/Edit Cells

For hostile tables that defeat both Auto and Guided — scanned PDFs with faint grid lines, handwritten schedules, tables that mix multiple scales — Manual Parse lets an estimator define every header and every row by hand. The Compare/Edit Cells tab is an after-the-fact review surface: pick two parse attempts (e.g. Auto with default config and Auto after tweaking), diff the cells, and keep the better one. Both live in the same panel for workflow continuity.

Map Tags — the bridge to Auto-QTO

Parsing a schedule into a grid is useful, but the real leverage comes from Map Tags: binding each row's tag value to every YOLO shape instance that contains that tag text somewhere on the drawings. Once a schedule has been parsed with a tag column identified, the Map Tags section appears. The user picks which YOLO class the tag is drawn inside (usually circle or hexagon) — or selects "no shape" to let BP find free-floating text matches — and clicks Map Tags. The binding is processed by POST /api/projects/[id]/map-tags-batch, which calls into src/lib/yolo-tag-engine.ts to do bbox-intersection + OCR text matching across every page.
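
At its core the binding is a containment-plus-text test: a YOLO shape of the chosen class whose bbox contains an OCR word matching the tag value. A simplified sketch with assumed shapes — the real engine also handles fuzzy matches, multi-word tags, and the "no shape" path:

Illustrative tag ↔ shape matching (hypothetical types)
interface BBox { minX: number; minY: number; maxX: number; maxY: number } // normalized 0–1
interface OcrWord { text: string; bbox: BBox }
interface ShapeAnn { id: number; className: string; bbox: BBox }

const contains = (outer: BBox, inner: BBox) =>
  inner.minX >= outer.minX && inner.minY >= outer.minY &&
  inner.maxX <= outer.maxX && inner.maxY <= outer.maxY;

function matchTagToShapes(tag: string, shapes: ShapeAnn[], words: OcrWord[], shapeClass: string) {
  const norm = (s: string) => s.replace(/\s+/g, "").toUpperCase();
  return shapes.filter(s =>
    s.className === shapeClass &&
    words.some(w => norm(w.text) === norm(tag) && contains(s.bbox, w.bbox)));
}
ts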

MapTagsDemoShell — the real MapTagsSection.tsx from src/components/viewer mounted with fake props (the callbacks just toggle local state). It shows the tag column ("Tag", 4 unique tags) and the YOLO shapes found in the table region; click a YOLO class to bind, then Map Tags.

The result is a set of YoloTags written to the yolo_tags table and surfaced in the Detection Panel's Tags sub-tab. Each YoloTag is the anchor data structure that Auto-QTO reads from to compute final quantities — one row per tag value, one instance per YOLO match, pages tracked per instance. The connection from schedule to drawings to counts is literally this step.

[Diagram] Map Tags — bind schedule rows to YOLO shape instances. Each unique tag value becomes a yolo_tags row with a list of matched annotation IDs across pages, e.g. { tagValue: "D-01", yoloClass: "circle", instances: [annId, annId, annId], pages: [201, 202], count: 3 }. Regions marked tables / title_block are excluded so the schedule's own tag column doesn't double-count.
MapTagsBindingDiagram — schedule rows on the left, drawing pages on the right. Each row's tag value gets bound to every YOLO shape instance whose inner OCR text matches it. Regions tagged tables / title_block are excluded so the schedule's own tag column doesn't double-count.
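
Matching the example rows in the diagram above, the persisted record plausibly has this TypeScript shape — inferred from the diagram, not copied from the schema file:

YoloTag shape (an inference from the diagram, not src/lib/db/schema.ts)
interface YoloTag {
  tagValue: string;    // e.g. "D-01" — one row per unique tag value
  yoloClass: string;   // shape class the tag was found inside, e.g. "circle"
  instances: number[]; // matched annotation IDs across the whole project
  pages: number[];     // page numbers those instances appear on
  count: number;       // instances.length — what Auto-QTO ultimately reports
}
ts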

The 5 tag-matching strategies

Under the hood, findOccurrences() in src/lib/tag-mapping/find-occurrences.ts dispatches on item.itemType to one of five matchers. Auto-QTO almost always picks yolo-object-with-tag-shape (type 4); the other four support composite-classifier, manual QTO, and future workflows. Each matcher produces raw hits; a shared scorer composes pattern match, region weight, scope, and fuzziness into a single confidence score and tier.
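
The scorer's composition isn't spelled out here beyond its four inputs, but a multiplicative sketch conveys the idea — the weights and tier cutoffs below are invented for illustration:

Illustrative shared scorer (hypothetical weights and cutoffs)
interface RawHit { patternScore: number; regionWeight: number; scopeScore: number; fuzziness: number }

function scoreHit(h: RawHit): { confidence: number; tier: "high" | "medium" | "low" } {
  // Multiplicative composition, penalizing fuzzy matches
  const confidence = h.patternScore * h.regionWeight * h.scopeScore * (1 - h.fuzziness);
  const tier = confidence >= 0.8 ? "high" : confidence >= 0.5 ? "medium" : "low";
  return { confidence, tier };
}
ts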

[Diagram] Tag-mapping — the 5 matcher types. findOccurrences() dispatches on item.itemType; scoring is shared across all five:
  1. yolo-only — count every YOLO annotation of a given class; no text, no anchor (e.g. all circle annotations on floor plans).
  2. text-only — count occurrences of a literal OCR string; no YOLO anchor (e.g. every occurrence of "D-01" in word sequences).
  3. yolo-with-inner-text — a YOLO shape containing the tag text (overlap-based); merges with free-floating text hits as a fallback (e.g. circles containing "T-05").
  4. yolo-object-with-tag-shape — a primary object (e.g. a door) bound to a nearby tag shape (e.g. a circle) that contains the tag text; the default for Auto-QTO.
  5. yolo-object-with-nearby-text — a YOLO object with free-floating text adjacent, not inside the object bbox; distance-based (e.g. a door with "D-01" next to it and no shape tag).
Five matcher types. The dispatch is in find-occurrences.ts; the scoring is shared across all 5.
Signal-valve state (for LLM readers)
Two scoring signals — shapeContainBoost and objectAdjacencyBoost — are hardcoded to zero in the current orchestrator (L141–142). A third, windowMatch, is hardcoded to true (L131). These are placeholders for the future Discrepancy Engine; BP today cannot detect "the schedule says 12 doors but only 11 are on plans" as a high-confidence signal. See Section 12 for the full story.

Output shape

The persisted data after a successful parse + map is:

  • pages.page_intelligence.parsedRegions[] — one entry per parsed table, with { type: "schedule", bbox, data: { headers, rows, tagColumn, ... } }.
  • annotations— no new rows (parse alone doesn't add annotations).
  • yolo_tags— one row per unique tag value discovered during Map Tags, with a list of matched annotation IDs and page numbers.
  • projectIntelligence.schedules[] catalog is updated by the post-processing summarizer on the next run of computeProjectSummaries(), so Auto-QTO can find the schedule without loading every page.
Schedules are what Auto-QTO reads first
Auto-QTO's first step ("select schedule") reads from the projectIntelligence.schedules[] catalog to suggest pages that match the chosen material type. If a schedule hasn't been parsed yet, Auto-QTO will take you into this panel to parse it before continuing. See Section 07.
Engines

Auto-QTO: Schedule-Driven Takeoff

In plain English: Auto-QTO is the whole "count everything on the drawings" workflow in one wizard. Pick a material (doors, finishes, equipment, plumbing, electrical), confirm which schedule it came from, pick which shape contains the tags, click Run, review the numbers, export. It's what turns all the earlier work into a number you can paste into a bid.

Auto-QTO is the pipeline from "I parsed a schedule and mapped its tags to YOLO shapes" to "I have a line-item quantity takeoff ready to export to CSV." It is the feature that most directly translates BP's structured preprocessing into a deliverable an estimator actually sends to a bid. It lives in src/components/viewer/AutoQtoTab.tsx and is the Auto QTO sub-tab of the QTO panel.

What Auto-QTO actually does

Given a material type (doors, finishes, equipment, plumbing, or electrical), Auto-QTO:

  1. Finds or asks you to parse the relevant schedule page.
  2. Reads the parsed schedule's tag column.
  3. Asks you which YOLO tag-shape class the tags are drawn inside.
  4. Runs Map Tags (Section 06) over the entire project, binding every unique tag value to its YOLO shape instances, while excluding the schedule region itself and the title block so it doesn't double-count.
  5. Produces a line-item list with counts, pages, and an editable review surface. Estimators can hand-adjust before exporting.
  6. Exports to CSV / Excel for the bid package.

The thing to understand: Auto-QTO does not invent quantities. It simply counts tag occurrences identified by YOLO + OCR, and the fidelity of the count is a function of the fidelity of the YOLO model and the schedule parse. If the model missed a door, Auto-QTO will miss that count; that's why the review step matters and why the user always has an override.

Preflight — the strict YOLO class requirement

Auto-QTO hard-blocks the material picker unless the project's YOLO run includes three specific classes: tables, title_block, and drawings. These are exclusion / inclusion markers for the counting logic — without them, Auto-QTO can't cleanly differentiate "tags inside the schedule" from "tags out on the drawings."

src/components/viewer/AutoQtoTab.tsx:51–52
const QTO_STRICT_EXCLUSION_CLASSES = ["tables", "title_block", "drawings"] as const;
const QTO_RECOMMENDED_CLASSES = ["grid", "vertical_area", "horizontal_area"] as const;
ts

If a strict class is missing, Auto-QTO shows a blocker callout with a link to Admin → AI Models: you need to run a YOLO model that has those classes before you can proceed. The recommended classes (grid, vertical_area, horizontal_area) are soft — missing them just produces a warning banner, not a block.

Why these exact classes
The exclusion classes exist so that the tag in the door schedule's "Door Type" column (e.g. "D-01") does not itself count as a door. Auto-QTO sees that the tag text lives inside a tables region and skips it. The same applies to title blocks (which often have a legend that re-uses tag symbols). Without these markers, a schedule with 20 rows of "D-01 D-02 D-03" would double-count every door.

Material picker

Auto-QTO starts with the material picker. Each option binds to a schedule category that the table classifier uses when suggesting pages for the schedule step. Custom material types are supported via a free text input — BP singularizes by stripping a trailing s as a rough stem.

AutoQtoTab.tsx:14–20
const MATERIALS = [
  { type: "doors",      label: "Doors",       scheduleCategory: "door-schedule",       icon: "D" },
  { type: "finishes",   label: "Finishes",    scheduleCategory: "finish-schedule",     icon: "F" },
  { type: "equipment",  label: "Equipment",   scheduleCategory: "material-schedule",   icon: "E" },
  { type: "plumbing",   label: "Plumbing",    scheduleCategory: "plumbing-schedule",   icon: "P" },
  { type: "electrical", label: "Electrical",  scheduleCategory: "electrical-schedule", icon: "Z" },
];
ts
MaterialPickerDemo — styles match the real wizard.

The 5-step state machine

Once a material is picked, Auto-QTO drops you into a step machine whose state lives in the qto_workflows table. The canonical step IDs come from AutoQtoTab.tsx:11:

AutoQtoTab.tsx:11
const STEP_SEQUENCE = ["select-schedule", "confirm-tags", "map-tags", "review", "done"] as const;
ts
  1. Select Schedule
  2. Confirm Tags
  3. Map Tags
  4. Review
  5. Done
AutoQtoProgressBar (step = "confirm-tags") — click Next/Back to move through the real step IDs.

1. select-schedule

Auto-QTO reads from summaries.schedules (built by computeProjectSummaries()) and surfaces pages whose classified tables match the selected material. Each suggestion shows a confidence badge. If nothing is parsed yet, the wizard can launch into the Table Parse panel inline — you parse the schedule, the wizard picks up where you left off.

2. confirm-tags

Auto-QTO reads the parsed schedule's headers and rows and asks the user to confirm the tag column (pre-selected from the parse). This is also where the user picks the tag-shape class from QTO_TAG_SHAPE_CLASSES: circle, arch_sheet_circle, dot_small_circle, hexagon, hex_pill, diamond, triangle, pill, oval, rectangle, square.

3. map-tags

The user clicks Run Mapping. Auto-QTO invokes POST /api/projects/[id]/map-tags-batch with the schedule's tag column and the selected YOLO shape class. The backend runs Map Tags across every page, excludes regions labeled tables / title_block, and writes the resulting YoloTags. Results stream back as line items with counts per tag value.

4. review

Each row of the review surface is one line item: { itemType, label, yoloClass?, text?, count, pages, annotations }. Auto-QTO flags ambiguity — e.g. if a tag appears on more pages than its schedule row implies — as a QtoFlag. The user can edit counts, add notes, and fix miscategorizations. Edits are stored in qto_workflows.userEdits so they survive a re-run.

5. done

Terminal state. The user exports via TakeoffCsvModal or ExportCsvModal (CSV / Excel). The workflow stays in the project — you can re-enter it later, advance back to review, and re-export if a schedule was updated.

Item types (SHIP 2 taxonomy)

Under the hood, the counting engine supports five item-type strategies via findItemOccurrences() in src/lib/yolo-tag-engine.ts (a dispatch sketch follows the list). Auto-QTO almost always defaults to type 4 (yolo-object-with-tag-shape), but the other four are available to composite-classifier and manual QTO workflows:

  • yolo-only — count instances of a class with no text.
  • text-only — count occurrences of a literal OCR string, no YOLO.
  • yolo-with-inner-text — YOLO shape containing specific text.
  • yolo-object-with-tag-shape — primary object + tag-shape combo (the default for Auto-QTO).
  • text-pattern — detect a repeating tag series (T-01, T-02, T-03, ...).
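
A minimal sketch of that dispatch — the itemType names come from the list above; the matcher bodies are placeholders, not the real yolo-tag-engine:

Illustrative dispatch (hypothetical — the real one lives in src/lib/yolo-tag-engine.ts)
type ItemType = "yolo-only" | "text-only" | "yolo-with-inner-text"
  | "yolo-object-with-tag-shape" | "text-pattern";

interface QtoItem { itemType: ItemType; label: string; yoloClass?: string; text?: string }

// Hypothetical matchers — each returns raw hits for the shared scorer
declare function matchByClass(i: QtoItem): unknown[];
declare function matchByText(i: QtoItem): unknown[];
declare function matchShapeWithText(i: QtoItem): unknown[];
declare function matchObjectWithTag(i: QtoItem): unknown[];
declare function matchPattern(i: QtoItem): unknown[];

function findItemOccurrencesSketch(item: QtoItem) {
  switch (item.itemType) {
    case "yolo-only":                  return matchByClass(item);       // class instances only
    case "text-only":                  return matchByText(item);        // literal OCR string
    case "yolo-with-inner-text":       return matchShapeWithText(item); // shape containing text
    case "yolo-object-with-tag-shape": return matchObjectWithTag(item); // Auto-QTO's default
    case "text-pattern":               return matchPattern(item);       // T-01, T-02, T-03, ...
  }
}
ts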

Demo mode

In demo mode (isDemo === true), Auto-QTO workflows persist in the Zustand store only — they disappear when the tab is closed. The same is true for annotations, markups, takeoff items, and parse results. This is a deliberate design choice so that the public /demo/project/* route can let anonymous users drive a full Auto-QTO workflow without polluting the shared demo project.

Engines

Bucket Fill: Click-to-Area

In plain English: you click inside a room, BP traces the walls for you and tells you the square footage. The hard part is thin walls, open doorways, and text inside the room — solved with four tuning knobs and the option to draw virtual walls.

Bucket Fill is BP's answer to the most tedious part of a manual takeoff: tracing polygons around rooms on a 200-page floor plan set. You click once inside a room, and a browser-side Web Worker floods from that seed point, stops at walls (and any virtual barriers you've drawn across open doorways), simplifies the resulting polygon, and hands it back as normalized 0–1 vertices. If the page is scale-calibrated, BP converts those vertices to a real-world area in the unit you chose at calibration time.

Where it lives

Bucket Fill is the top strip of the Area tab inside the QTO panel (src/components/viewer/AreaTab.tsx). It appears as a four-state button: disabled (no active area item), idle, active, or barrier mode. The state is controlled by two Zustand flags: bucketFillActive and bucketFillBarrierMode. A third store field, bucketFillResolution, drives the dominant tuning knob (see below).

BucketFillButtonDemo — click through the four states (disabled / idle / active / barrier) to see the exact styling each uses. The button stays locked until you pick a target area item.

The 8-stage worker pipeline

The client-side Web Worker at src/workers/bucket-fill.worker.ts does all the heavy lifting. It's one pass: no retry, no speculative seeding. The single-pass design is the reason the tool feels instant even on a 4096-pixel image: no round-trip to the server, no Python subprocess, just an OffscreenCanvas and a tight TypeScript loop.

[Diagram] bucket-fill.worker.ts — the 8-stage pipeline. maxDimension is the dominant tuning knob; text is a wall; the areaFraction from the worker is decorative — real sqft flows through computeRealArea().
  1. ImageBitmap from canvas
  2. Downscale to maxDimension (1k / 2k / 3k / 4k)
  3. Otsu threshold (tolerance offset)
  4. morphClose (dilation radius)
  5. Burn barriers (user lines + polys)
  6. Flood fill (stops at dark pixels)
  7. Trace border + holes (RETR_CCOMP)
  8. Simplify polygon (Douglas–Peucker)
BucketFillStagesDiagram — verified against src/workers/bucket-fill.worker.ts:processFill (L453+). Text is a dark region like any other; the flood stops at letter boundaries.
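
The worker boundary is a plain postMessage protocol. A sketch of how the viewer might drive it — the request shape below is an assumption that mirrors the knobs above and the BucketFillResult JSON later in this section:

Illustrative worker round-trip (hypothetical message shape)
interface FillRequest {
  type: "fill";
  seed: { x: number; y: number };          // normalized 0–1 click point
  maxDimension: 1000 | 2000 | 3000 | 4000; // the dominant knob
  tolerance: number;                        // Otsu threshold offset
  dilation: number;                         // morphClose radius; 0 skips it
  barriers: Array<[{ x: number; y: number }, { x: number; y: number }]>;
  bitmap: ImageBitmap;                      // transferred, not copied
}

const worker = new Worker(new URL("../workers/bucket-fill.worker.ts", import.meta.url));

function runFill(req: FillRequest): Promise<MessageEvent> {
  return new Promise(resolve => {
    worker.onmessage = resolve;            // resolves with { type: "result", polygon, holes, ... }
    worker.postMessage(req, [req.bitmap]); // transfer the bitmap to avoid a copy
  });
}
ts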

Tuning hierarchy — maxDimension is dominant

The four knobs are not equal. If your fill leaks, or stops short, or over-bleeds through text, the order in which to reach for them is:

  • maxDimension (dominant) — largest dimension of the downscaled image before Otsu runs; a 1000 / 2000 / 3000 / 4000 slider. Default 1000. Raise it when thin wall lines get smeared away at low resolution; runtime roughly doubles at each step.
  • Tolerance — offset applied to the Otsu threshold; negative treats more pixels as walls, positive treats more as floor. Default 0. Use it to rescue thin walls when raising maxDimension alone isn't enough.
  • Dilation — morphClose radius after thresholding; fills small gaps in line art (1–2 px door-frame breaks). Default 3. Dilation = 0 skips morphClose entirely — use that for plans with thin mullions where closing bridges real gaps.
  • Barriers — user-drawn virtual walls that seal open doorways, drawn by clicking two points in barrier mode. Tertiary: reach for this when the underlying plan genuinely lacks a wall (e.g. an open doorway you don't want the fill to cross).
Why maxDimension is the big lever
Downscaling happens before Otsu thresholds the image. If you downscale a 3000-pixel floor plan to 1000 pixels, a 1-pixel wall becomes a sub-pixel gradient and Otsu loses it. Doubling maxDimension preserves the wall. Tolerance and dilation can sometimes rescue a low-resolution fill, but it's much cheaper (in user effort) to bump the resolution first.

Text is a wall

Post-2026-04-22, the worker does not pre-erase text blocks. Letter boundaries simply act as dark pixels and the flood stops at them like it stops at walls. The reasoning: pre-erasing OCR'd text was error-prone (it enlarged bboxes and erased parts of adjacent walls) and the user rarely wants a fill to cross text anyway — text in a room almost always labels the room or notes something inside it, which stays inside the polygon.

Area accounting, explained
The areaFraction returned from the worker is decorative. It's the pixel-count ratio of the flood, which slightly under-estimates the real room area because the text blocks inside the room are not filled. For reporting, BP uses computeRealArea(vertices, pageW, pageH, calibration) on the traced outer polygon — which correctly includes the text-punctuated interior. Section 7 and src/lib/areaCalc.ts own this math.

The workflow

  1. Open QTO → Area.
  2. Calibrate the page scale (Set Scale → click two points → enter distance + unit). Without calibration the areas still render but the quantity column will say "page units" instead of a real measurement.
  3. Create an area item (name, color) or click an existing one to make it the active target.
  4. Click the Bucket Fill button to arm. Pick a resolution on the slider (1k / 2k / 3k / 4k).
  5. Click inside the room you want to measure. The worker runs; you see the preview overlay appear.
  6. If the fill leaked through an open doorway, toggle Barrier mode. Click two points to draw a virtual wall. Click inside the room again. Repeat until the fill is sealed.

Holes work natively (courtyards, light wells)

For U-shaped rooms and hallways enclosing a courtyard, the worker runs findHoleBorders() after the outer contour trace. Each hole is simplified separately with Douglas–Peucker. The preview overlay uses fill-rule="evenodd" so the courtyard renders as a true hole rather than a filled island, and computeRealArea() subtracts the hole areas from the outer polygon.

Server fallback

When the client worker fails (very old browser, extremely large images, corrupted ImageBitmap), the viewer falls back to the server path: POST /api/bucket-fill → src/lib/bucket-fill.ts → scripts/bucket_fill.py (Python OpenCV). The server path predates the Web Worker and uses an adaptive-threshold algorithm rather than Otsu, so its results can differ on low-contrast images. It's a safety net, not the preferred path.

BucketFillResult — same shape for worker and server
{
  "type": "result",
  "polygon": [{ "x": 0.142, "y": 0.388 }, ...],
  "holes": [[{ "x": 0.32, "y": 0.51 }, ...]],   // evenodd-compatible
  "vertexCount": 24,
  "areaFraction": 0.017,     // decorative — use computeRealArea()
  "retryHistory": [...]      // present only when worker retried
}
json

Scale calibration and computeRealArea

Bucket Fill returns a polygon in normalized 0–1 coordinates. To turn that into square feet, BP needs two pieces of information: the scale calibration for the current page, and the page's pixel dimensions. src/lib/areaCalc.ts runs computeRealArea(vertices, pageWidth, pageHeight, calibration) — shoelace formula in pixel space, divided by the calibration's pixels-per-unit, returning a real area in the calibrated unit. Holes are subtracted. Supported units:

unit = "ft" — the unit picker cycles through the four base units from AreaTab.tsx AREA_UNITS.
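
Spelled out, the math is a shoelace sum in pixel space divided by pixels-per-unit squared, with holes subtracted. A sketch of the described behavior — the real src/lib/areaCalc.ts may differ in shape:

computeRealArea, sketched (shoelace in pixel space; holes subtracted)
interface Pt { x: number; y: number }                         // normalized 0–1
interface Calibration { pixelsPerUnit: number; unit: string } // e.g. "ft"

function pixelArea(poly: Pt[], pageW: number, pageH: number): number {
  let sum = 0;
  for (let i = 0; i < poly.length; i++) {
    const a = poly[i], b = poly[(i + 1) % poly.length];
    sum += a.x * pageW * (b.y * pageH) - b.x * pageW * (a.y * pageH);
  }
  return Math.abs(sum) / 2; // shoelace formula
}

function computeRealAreaSketch(outer: Pt[], holes: Pt[][], pageW: number, pageH: number, cal: Calibration) {
  const px = pixelArea(outer, pageW, pageH)
    - holes.reduce((s, h) => s + pixelArea(h, pageW, pageH), 0);
  return px / cal.pixelsPerUnit ** 2; // pixels² → calibrated unit²
}
ts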
Calibration is per-page
If you calibrate a scale on one page and then reuse an area item on a different page, the new polygon inherits the area item's color and name but not its scale. You'll need to recalibrate — most sheet sets use different scales per discipline. The viewer will show a warning chip next to polygons on uncalibrated pages.

How it composes with the rest of QTO

Bucket Fill is not a feature on its own — it's an input method into the Area tab of the takeoff panel. Once the polygon is created, it behaves like any other area takeoff entry: the underlying annotations row has source = "takeoff", the group rollup appears in the QTO panel, the item can be edited, re-colored, moved between groups, or exported to CSV with the rest of the project. This is the design pattern the whole tool lives on: new capabilities stack on top of existing ones. Bucket Fill adds a fast path to create area polygons; everything downstream (aggregation, grouping, export) is unchanged.

Engines

The LLM Loop: Tool-Making, Agentic Rounds, Context Budgets

In plain English: you can chat with your project. The model can't read the whole PDF directly — it's too big — so BP gives it 20 tools that query pre-computed structured data (CSI codes, schedules, detections, text search). The model calls tools in rounds, up to ten times, until it can answer. Which tools it has, what order it prefers them in, and how much context it starts with are all tunable per company.

BP's LLM integration is the payoff for everything in sections 3–8. The preprocessing pipeline builds structured data, CSI encodes it compactly, YOLO makes it spatially aware, Auto-QTO materializes quantities. The LLM loop is what a user actually talks to, and it reaches into all of that structured data through a tool set, an agentic round loop, and a per-model context budget. This is the densest section in the docs — there's a lot happening under the hood.

The framing

A blueprint LLM has a fundamental problem: it can't read the PDF. Even if you chunk a 200-page drawing set into text, the raw OCR is too noisy (page numbers, dimensions, plot stamps, revision blocks) and too long to fit into a context window while leaving room for reasoning. BP solves this by inverting the flow:

  1. The LLM does not see the blueprint directly. What it sees is a compact structured summary built by the context builder.
  2. The LLM gets tools. Twenty of them — the full BP_TOOLS set. They query the pre-computed structured data, run BP engines on arbitrary inputs, and (for a small subset) drive the viewer. Tools are what give the model leverage.
  3. Tools compose inside an agentic loop. The model can call multiple tools in parallel per round, feed results back, and iterate up to ten rounds per turn before being forced to answer.
  4. Context budgets are per-model. A Sonnet call gets a very different slice of data than a Groq call. Admins can override priorities per-company via a preset system.

The "LLM tool making" story the user gets isn't a feature in the UI — it's the shape of the tool registry in src/lib/llm/tools.ts and the pattern you use when adding a new tool: write a tool definition with a JSON Schema input, write an executor, flip a switch in executeToolCall(), and the model can call it on the next request. The next subsection enumerates all twenty.

The 20 tools

Every tool in BP_TOOLS gets a card below, pulled live from src/lib/llm/tools.ts. Filter by group. Action tools (the ones that mutate data or drive the viewer) are marked amber so the distinction between "read" and "write" is visually obvious.

searchPages

Search blueprint pages by text content using full-text search. Returns matching pages with text snippets and relevance scores. Use when looking for specific topics, materials, equipment, or references.

  • query: string *
getProjectOverview

Get the full project map: discipline breakdown, page classifications, all trades, all CSI codes, schedule catalog, annotation summary counts, takeoff totals, and pre-computed page indexes. THIS SHOULD BE YOUR FIRST TOOL CALL — it gives you a complete overview before drilling into specifics.

(no parameters)
getPageDetails

Get comprehensive intelligence for a specific page: classification (discipline, drawing type), cross-references to other sheets, general note blocks, detected text regions, heuristic inferences with evidence, classified tables/schedules, parsed schedule data with rows, CSI spatial heatmap, CSI codes, text annotations (37 types), and keynotes.

  • pageNumber: number *
lookupPagesByIndex

Instant O(1) lookup: which pages contain a specific CSI code, trade, keynote, or text annotation. Reads from pre-computed indexes — much faster than searching. Use for questions like 'which pages have Division 08?' or 'where is the electrical trade?'

  • index: string *
  • key: string *
getAnnotations

Get YOLO object detections and user markups, optionally filtered by page, class name, source type, or minimum confidence. Returns bounding boxes, class names, confidence scores, CSI codes, and keywords.

  • pageNumber: number
  • className: string
  • source: string
  • minConfidence: number
getParsedSchedule

Get structured data from a parsed table or schedule on a page. Returns column headers, data rows as dictionaries, tag column identifier, and CSI codes. Use for door schedules, finish schedules, equipment lists, keynote tables.

  • pageNumber: number *
  • category: string
getCsiSpatialMap

Get zone-based heatmap showing where CSI construction divisions are concentrated on a page. Divides page into 9 zones (3x3 grid) plus title-block and right-margin zones. Each zone lists which divisions appear and how many instances. Use for 'what's in the top-right corner?' or 'where are the MEP systems?'

  • pageNumber: number *
getCrossReferences

Get sheet-to-sheet reference graph. Returns edges (which pages reference which), hub pages (referenced by 3+ other pages), and leaf pages. Use for 'what references A-501?' or 'what are the key hub pages?' Omit pageNumber for full project graph.

  • pageNumber: number
getSpatialContext

Get OCR text mapped into YOLO spatial regions (title_block, legend, drawing_area, grid, etc.). Shows what text is inside each detected region. Use for 'what's in the title block?' or 'read the legend.'

  • pageNumber: number *
getPageOcrText

Get the full raw OCR text for a page. This is the complete extracted text without any structuring. Use as a fallback when structured tools don't have what you need, or when you need to read the full page content.

  • pageNumber: number *
detectCsiFromText

Run CSI MasterFormat code detection on arbitrary text. Returns matching CSI codes with descriptions, trades, and divisions. Use to identify what construction category a piece of text belongs to.

  • text: string *
scanYoloClassTexts

Find all unique OCR texts inside YOLO annotations of a specific class. Use to discover what labels exist inside circles, doors, or any detected shape. Specify pageNumber for fast single-page scan, or omit for full project scan.

  • yoloClass: string *
  • yoloModel: string
  • pageNumber: number
mapTagsToPages

Given specific tag text values (like 'D-01', 'T-03'), find every instance. Optionally filter to a YOLO class or specific page. Specify pageNumber for fast single-page search, omit for project-wide.

  • tags: string *
  • yoloClass: string
  • yoloModel: string
  • pageNumber: number
detectTagPatterns

Auto-discover repeating YOLO+OCR patterns across the project. Finds groups like 'circles containing T-01, T-02, T-03...' or 'diamonds with EQ-01, EQ-02...'. Returns pattern groups with instance counts, unique values, and confidence. Requires YOLO data to be loaded.

(no parameters)
getOcrTextInRegion

Read OCR text inside a specific rectangular region on a page. Coordinates are normalized 0-1 (top-left origin). Use to read text in a specific area of the drawing.

  • pageNumber: number *
  • minX: number *
  • minY: number *
  • maxX: number *
  • maxY: number *
navigateToPage (action)

Navigate the blueprint viewer to a specific page. The user will see the page change in their viewer. Use when you want to show them a specific drawing.

  • pageNumber: number *
highlightRegion (action)

Highlight a rectangular region on a page with a pulsing cyan outline. Use to point the user to a specific area — a detected table, a door tag, a note block, etc. Coordinates are normalized 0-1.

  • pageNumber: number *
  • minX: number *
  • minY: number *
  • maxX: number *
  • maxY: number *
  • label: string
createMarkup (action)

Create a persistent markup annotation on the blueprint with a name and optional notes. Use when the user asks you to mark, flag, or annotate something for later reference.

  • pageNumber: number *
  • minX: number *
  • minY: number *
  • maxX: number *
  • maxY: number *
  • name: string *
  • note: string
addNoteToAnnotation (action)

Append a note to a specific annotation by ID. Notes are appended (never overwritten) to preserve existing user notes. Use when the user asks to annotate, comment on, or flag a specific detection.

  • annotationId: number *
  • note: string *
batchAddNotes (action)

Append a note to ALL annotations matching a filter. Notes are appended to each annotation's existing notes. Use for bulk operations like 'add a note to all door detections on page 5' or 'flag all low-confidence detections'.

  • note: string *
  • pageNumber: number
  • className: string
  • source: string
  • minConfidence: number
Data is read live from BP_TOOLS in src/lib/llm/tools.ts. Total: 20 tools.
ToolCardGrid — all 20 tools from BP_TOOLS. Data is read directly from tools.ts.

Why these particular tools exist

The set is small by design. Each tool corresponds to one of the structured surfaces BP already maintains, rather than being a low-level primitive the model has to compose. An LLM given 20 purpose-built tools will pick the right one faster than one given 80 composable primitives.

  • Navigation tools — getProjectOverview, getPageDetails, lookupPagesByIndex, getCrossReferences — answer "where is" questions without paging through pages.
  • Structured reads — getAnnotations, getParsedSchedule, getCsiSpatialMap, getSpatialContext — pull a single structured chunk at a time, so the model can ask for exactly what it needs.
  • Text fallback — searchPages and getPageOcrText let the model hit raw OCR only when structured data is insufficient. Raw OCR sits at priority 10 in the context builder for the same reason: last resort.
  • Engine invocation — detectCsiFromText lets the model run the 3-tier CSI matcher on a user's phrase; detectTagPatterns runs the tag-pattern detector. Tools can wrap BP engines so the model can do analysis on the fly.
  • YOLO tag tools — scanYoloClassTexts, mapTagsToPages, getOcrTextInRegion — bridge between OCR text and YOLO regions. These are what let the model answer "how many doors have a 90-minute fire rating on the second floor" by joining schedule rows to shape detections.
  • Action tools (amber) — navigateToPage, highlightRegion, createMarkup, addNoteToAnnotation, batchAddNotes. The viewer interprets these as side effects. The model can say "show me page 42" and the viewer actually scrolls there.

The agentic loop

BP's chat endpoint (POST /api/ai/chat) invokes streamChatWithTools() on the configured adapter. All three SDK adapters (anthropic.ts, openai.ts, groq.ts) implement the same interface:

src/lib/llm/anthropic.ts — streamChatWithTools (abridged)
async *streamChatWithTools(options: LLMToolUseOptions): AsyncIterable<ToolStreamEvent> {
  const maxRounds = options.maxToolRounds ?? 10;
  const tools: Tool[] = options.tools.map(toAnthropicShape);
  const msgHistory = prepareMessages(options.messages);

  for (let round = 0; round < maxRounds; round++) {
    const stream = await client.messages.stream({ model, system, messages: msgHistory, tools, ... });

    for await (const event of stream) {
      if (event.type === "content_block_delta" && event.delta.type === "text_delta")
        yield { type: "text_delta", text: event.delta.text };
      else if (event.type === "content_block_start" && event.content_block.type === "tool_use")
        yield { type: "tool_call_start", name: event.content_block.name, id: event.content_block.id };
    }

    const finalMsg = await stream.finalMessage();
    const toolUseBlocks = finalMsg.content.filter(b => b.type === "tool_use");

    if (toolUseBlocks.length === 0 || finalMsg.stop_reason !== "tool_use") {
      yield { type: "done" };
      return;
    }

    const toolResults = [];
    for (const block of toolUseBlocks) {
      const result = await options.executeToolCall(block.name, block.input);
      yield { type: "tool_call_result", name: block.name, id: block.id, result: JSON.stringify(result) };
      toolResults.push({ type: "tool_result", tool_use_id: block.id, content: JSON.stringify(result) });
    }

    msgHistory.push({ role: "assistant", content: finalMsg.content });
    msgHistory.push({ role: "user", content: toolResults });
  }

  yield { type: "text_delta", text: "\n\n(Reached maximum tool call rounds)" };
  yield { type: "done" };
}
ts
[Diagram] streamChatWithTools() — the agentic round loop. User message (+ system + history) → LLM stream (Anthropic / OpenAI / Groq, BP_TOOLS injected) → if stop_reason === "tool_use", execute the tool calls via executeToolCall(name, input), append tool_result messages, and start the next round (cap 10); otherwise yield done and return to the caller. The round cap emits "(Reached maximum tool call rounds)". Same interface in groq.ts and openai.ts.
AgenticLoopDiagram — the text/tool_call flow. Text deltas stream as they arrive; tool calls are batched at round boundaries.

The key behaviors to notice: text deltas stream live (the user sees the response materialize a word at a time); tool calls don't block the stream (the model's reasoning text shows up before the tools execute); each round's tool calls run in parallel before the next LLM turn starts; the loop terminates as soon as stop_reason !== "tool_use" or after 10 rounds, whichever comes first. On Opus-sized models, 3 rounds is typical; 10 is a safety cap, not an expected value.

Context budgets per model

Before the loop even starts, the server calls assembleContextWithConfig() in src/lib/context-builder.ts. The function takes a list of candidate sections (CSI codes, classification, annotations, parsed tables, etc.), sorts them by priority, and packs them into a character budget chosen for the current model. Bigger-window models get more context; smaller free-tier models stay lean to leave room for tool rounds.

  • anthropic · claude-opus-* — 200,000 chars (~50,000 tokens)
  • anthropic · claude-sonnet-* — 80,000 chars (~20,000 tokens)
  • anthropic · claude-haiku-* — 30,000 chars (~7,500 tokens)
  • openai · gpt-4o* — 60,000 chars (~15,000 tokens)
  • openai · gpt-4* (Turbo) — 40,000 chars (~10,000 tokens)
  • openai · o1 / o3 — 80,000 chars (~20,000 tokens)
  • groq · any model — 24,000 chars (~6,000 tokens)
  • custom · Ollama / self-hosted — 30,000 chars (~7,500 tokens)
  • (fallback) · DEFAULT_CONTEXT_BUDGET — 24,000 chars (~6,000 tokens)
Character budgets per model. Numbers are verbatim from getContextBudget() in src/lib/context-builder.ts.

The fallback default is DEFAULT_CONTEXT_BUDGET = 24000 characters — ~6000 tokens — which is what unknown providers and unknown models get.

Section registry and presets

SECTION_REGISTRY enumerates the 20 sections the context builder can assemble into a page- or project-scope prompt. Each has a default priority (lower = earlier, higher priority) and a description. At run time the builder sorts by priority, computes per-section budgets from the admin's preset or per-company overrides, fills each section to its budget, and truncates anything that overflows. Unused allocations flow into an overflow pool so the next section can use the slack.
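
The packing step is a priority-ordered fill with an overflow pool. A sketch of the described behavior, with invented types — the real function is assembleContextWithConfig() in src/lib/context-builder.ts:

Illustrative packing loop (hypothetical types)
interface Section { id: string; priority: number; text: string; budget: number } // budget in chars

function packContext(sections: Section[]): string {
  const ordered = [...sections].sort((a, b) => a.priority - b.priority); // lower packs first
  let overflowPool = 0;
  const out: string[] = [];
  for (const s of ordered) {
    const allowance = s.budget + overflowPool; // slack handed forward from earlier sections
    const slice = s.text.slice(0, allowance);  // truncate anything that overflows
    overflowPool = allowance - slice.length;   // unused chars flow into the pool
    if (slice) out.push(slice);
  }
  return out.join("\n\n");
}
ts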

Default priorities from SECTION_REGISTRY (lower = higher priority, packs earlier): project-report 0.5 · csi-graph 1.0 · csi-codes 1.0 · csi-spatial 1.0 · csi-parsed 1.2 · page-classification 1.5 · heuristic-inferences 1.5 · parsed-tables 1.8 · user-annotations 2.0 · spatial-context 2.0 · yolo-counts 3.0 · takeoff-notes 3.0 · yolo-detail 3.2 · cross-refs 3.5 · text-annotations 5.0 · note-blocks 5.5 · tag-patterns 5.5 · detected-regions 6.0 · qto-results 9.0 · raw-ocr 10.0.
20 sections total. Admins can disable sections, override any section's priority per-company, or pick a preset (balanced, structured, verbose) from Admin → LLM Context.

There are three presets in SECTION_PRESETS:

  • balanced — equal-share allocation across every enabled section; simple and predictable. The default for general-purpose chat; unopinionated.
  • structured — front-loads parsed-tables (25%), spatial-context (12%), csi-codes (11%), yolo-counts (10%), csi-spatial (9%), detected-regions (5%), raw-ocr (1%). Use when the project has well-parsed schedules and you want the model to reason from structured data, not from OCR; best for takeoff questions.
  • verbose — front-loads raw-ocr (40%), spatial-context (15%), parsed-tables (10%). For exploratory work on projects that aren't fully preprocessed; the model gets more text to read at the cost of less structure.
Global vs project vs page scope
Chat scope controls which registry is used. Project and page scope use SECTION_REGISTRY with 20 sections. The global dashboard chat (the widget on /home) uses GLOBAL_SECTION_REGISTRY — 6 sections focused on cross-project discovery (project catalog, discipline breakdown, CSI summary, detection counts, search results, search OCR). Same loop, different data surface.

Provider selection

The adapter is chosen by src/lib/llm/resolve.ts based on the llm_configs table and, optionally, per-user overrides from user_api_keys. The fallback chain is:

  1. The user's API key (from user_api_keys, encrypted at rest).
  2. Company-wide config from llm_configs (set by company admin).
  3. Environment variable (ANTHROPIC_API_KEY, OPENAI_API_KEY, GROQ_API_KEY).

The adapter interface is identical across providers — LLMClient in src/lib/llm/types.ts defines streamChat() and streamChatWithTools(). Adding a new provider is a matter of writing a new file in src/lib/llm/ that implements the interface and wiring it into resolve.ts. For OpenAI-compatible endpoints (Ollama, self-hosted vLLM, llama.cpp servers), the existing openai.ts adapter works directly — you set provider = "custom" and a baseUrl.
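
For example, pointing BP at a local Ollama server plausibly comes down to a config like this — the field names are assumptions from the description above, not the llm_configs schema:

Hypothetical custom-provider config (Ollama via the OpenAI-compatible adapter)
const ollamaConfig = {
  provider: "custom",                   // routes through the existing openai.ts adapter
  baseUrl: "http://localhost:11434/v1", // Ollama's OpenAI-compatible endpoint
  model: "llama3.1",                    // any model the server hosts
  apiKey: "ollama",                     // placeholder — most local servers ignore it
  // Context budget falls back to the "custom" row above: 30,000 chars (~7,500 tokens)
};
ts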

Where to configure all of this

The user-facing surface is Admin → LLM Context (src/app/admin/tabs/LlmContextTab.tsx). It exposes:

  • Enable / disable each of the 20 sections per company.
  • Override any section's default priority.
  • Pick a preset or set custom percent allocations.
  • Inspect post-assembly section metadata (included, truncated, char count) — so admins can see exactly what made it into the prompt.
  • Edit the system prompt (overrides DEFAULT_SYSTEM_PROMPT).
  • Attach company-specific domain knowledge (free-text).

The LLM provider / model picker lives next door in Admin → AI Models → LLM Config. Both pages write to the same set of tables and both updates take effect on the next chat turn (no deployment required).

Operations

Admin Dashboard

In plain English: the admin dashboard is the tuning shop. You get 14 tabs, one for each subsystem — users, companies, AI models, CSI config, heuristics, LLM context, pipeline knobs, feature flags. Every dial that controls how the pipeline behaves for your company is in here, not behind a wizard. Built for technical users who want to see everything.

The admin dashboard at /admin is where every company-level tuning knob lives: YOLO model management, CSI detection thresholds, heuristic rules, LLM provider configuration, user and invite management, pipeline concurrency, text-annotation detector toggles, and root-admin-only settings. It is deliberately flat — 14 tabs across the top of one page — rather than hidden behind wizards, because the intended audience is technical.

The 14 tabs

  • Overview — System health snapshot: recent parses, running jobs, disk usage, quotas. (/api/admin/parser-health, /api/admin/recent-parses, /api/admin/running-jobs)
  • Projects — Every project in the company; filter by status, bulk re-trigger processing, delete. (/api/admin/reprocess, /api/projects/[id])
  • AI Models — Upload and register YOLO models, run SageMaker Processing jobs, configure the LLM provider + default model; houses the sagemakerEnabled kill switch + quota (see Section 05 for the run path). (/api/admin/models, /api/yolo/run, /api/admin/llm-config, /api/admin/toggles)
  • Users — Per-company user list, invites, password resets, canRunModels grants. (/api/admin/invites, /api/admin/users/reset-password)
  • Companies — Root admin only: create companies, assign root admins, configure pipelineConfig per company (CSI thresholds, heuristics, pageConcurrency, csiSpatialGrid). (/api/admin/companies, root only)
  • CSI — Company CSI detection config (threshold + tier weights), custom CSI database upload, re-run CSI on all annotations after a database change. (/api/admin/csi/config, /api/admin/csi/upload, /api/admin/models/reprocess-csi)
  • Heuristics — Built-in rules (enable/disable) plus custom rules; each rule supports text keywords, yoloRequired, yoloBoosters, spatial conditions, output labels, and output CSI codes. (/api/admin/heuristics/config)
  • Table Parse — Tuning defaults for the Auto Parse and Guided Parse propose endpoints; controls per-company rowTolerance, minColGap, minHitsRatio defaults. (Same pipelineConfig fields)
  • Page Intelligence — Classifier tuning, cross-ref detector config; test on specific pages. (/api/admin/pipeline)
  • Text Annotations — Enable/disable the 10 detector modules, view counts, configure regex patterns for custom detectors. (/api/admin/text-annotations/config)
  • AI RBAC — Per-role tool access control; lock individual LLM tools out of non-admin roles. (llm_configs + role table)
  • LLM Context — Section registry enable/disable, priority overrides, preset (balanced / structured / verbose), system prompt, domain knowledge, per-section telemetry. (/api/admin/llm-config, pipelineConfig.llm)
  • Pipeline — pageConcurrency (default 8), csiSpatialGrid (default 9×9), queue visibility. (/api/admin/pipeline)
  • Settings — App settings, feature flags, non-sensitive env var reveal; root admin only. (/api/admin/app-settings, root)

Root admin vs company admin

BP is multi-tenant at the row level. Every user-visible table carries a company_id, and every /api/admin/* route runs a row-scope check in src/lib/audit.ts before reading or writing. A company admin sees their own company's projects, users, CSI config, heuristics, and LLM configs — nothing cross-company.

A root admin (a user with isRootAdmin = true) bypasses company scoping. Root admins can create new companies, assign root admins, edit any company's pipelineConfig, reveal global app settings, and flip the sagemakerEnabled toggle on any company. There is intentionally no UI for "become root admin" — the bit is set directly in the database by a system operator.

Destructive admin toggles require a password
The sagemakerEnabled toggle and the quota kill switch both require an admin password stored in app_settings. The password check is enforced in /api/admin/toggles. This is a belt-and-suspenders precaution on top of the RBAC — destructive toggles shouldn't be one forgotten session away from flipping.
Operations

System Architecture

In plain English: BP runs as a single Next.js app on AWS ECS Fargate, with a PostgreSQL database, an S3 bucket for page images, and a SageMaker GPU job for YOLO when you kick it off. On a laptop you can run the same code with just PostgreSQL and a cheap LLM key — Textract, S3, and SageMaker are all optional. The whole AWS stack is in Terraform; redeploying is one script.

This section is a tour of where BP actually runs on AWS and how the pieces fit together. It is intentionally not a deployment tutorial — the README and the Terraform variables file are better starting points for that. The goal here is to answer questions like "what talks to what," "where does the LLM call come from," and "what part of this would I rip out if I wanted to run BP offline."

Topology at a glance

ArchitectureSvgDiagram — the BP runtime on AWS, color-coded by service family.

A browser hits CloudFront at assets.* for page images and thumbnails, and the ALB at the primary domain for everything else. The ALB routes HTTPS to two ECS services: the main app (a Next.js container) and Label Studio (a separate labeling UI for training data work). Both run in Fargate — no EC2 to manage. Secrets come from Secrets Manager; the DB is RDS PostgreSQL; durable storage is S3 behind CloudFront.

The main app service

blueprintparser-app is the Next.js 16 container defined in infrastructure/terraform/ecs.tf. Task definition: 2 vCPU / 4 GB. It serves the entire React app, handles every API route, runs Drizzle queries against RDS, pushes processing jobs to Step Functions, and proxies LLM calls. Auto scaling is CPU-based (target 70%) with memory as a guardrail (80%). A circuit breaker is enabled on deployments so a broken image rolls back automatically.

Task definition highlights (from ecs.tf)
name            = "blueprintparser-app"
cpu             = 2048          // vCPU × 1024
memory          = 4096          // MiB
container_image = "{{ecr}}/beaver_app:latest"
container_port  = 3000
health_check    = { path = "/api/health", interval = 30, timeout = 5 }
execution_role  = "beaver_ecs_execution_role"   // ECR pull, logs, secrets read
task_role       = "beaver_ecs_task_role"         // S3, Textract, SageMaker, SFN
desired_count   = var.ecs_desired_count          // auto-scaled
deployment_controller = "ECS"
circuit_breaker = { enable = true, rollback = true }
hcl

The app task needs direct access to Secrets Manager (to pull DATABASE_URL, NEXTAUTH_SECRET, LLM keys), S3 (to write uploads and read page images), Textract (OCR), SageMaker (start/stop jobs), and Step Functions (start executions). Those are all attached to the task role in iam.tf.

The cpu-pipeline task

Long-running processing is offloaded to a second ECS task named blueprintparser-cpu-pipeline. It's the same container image as the main app, just started with a different command (node scripts/process-worker.js) and a much bigger footprint (8 vCPU, 16 GB memory). The task runs the full preprocessing pipeline for a single project and then exits. This keeps the web task responsive during heavy PDF ingest.

The state machine in stepfunctions.tf (blueprintparser-process-blueprint) is what starts cpu-pipeline tasks. It's a straight line: ValidateInput → CPUProcessing → ProcessingComplete, with a failure branch. Retries happen on TaskFailed with a 30-second interval and 2.0× backoff, up to 2 attempts. The state machine logs to a CloudWatch log group (/aws/states/blueprintparser-process-blueprint) for debuggability.
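
In Amazon States Language, that retry policy corresponds to an object like the one below — a sketch of the described settings, not a copy of stepfunctions.tf:

Retry policy, sketched (values from the description above)
const cpuProcessingRetry = [
  {
    ErrorEquals: ["States.TaskFailed"], // retry only on task failure
    IntervalSeconds: 30,                // first retry after 30 seconds
    BackoffRate: 2.0,                   // then 60 seconds
    MaxAttempts: 2,                     // two attempts, then the failure branch
  },
];
ts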

Local path bypasses Step Functions
On a local dev machine with no AWS, processProject() runs inline from the /api/projects handler in a fire-and-forget promise. Same code, no state machine. This is why the local tier in Section 01 works — you don't need anything AWS just to see the pipeline run.

SageMaker Processing for YOLO

YOLO inference runs out-of-band on SageMaker Processing jobs. BP calls sagemaker:CreateProcessingJob from the app task, with inputs pointing to the project's pages/ prefix in S3 and outputs pointing to yolo-output/. The container image comes from a second ECR repo (beaver_yolo_pipeline) built separately from the app image. The default instance type is ml.g4dn.xlarge, billed per run — which is exactly why the sagemakerEnabled toggle exists.
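
A hedged sketch of that call with the AWS SDK v3 — the job and channel names are illustrative, and the real parameters live in src/lib/yolo.ts and sagemaker.tf:

CreateProcessingJob, sketched (illustrative names)
import { SageMakerClient, CreateProcessingJobCommand } from "@aws-sdk/client-sagemaker";

declare const projectId: string, dataUrl: string, bucket: string;
declare const yoloEcrImageUri: string, sagemakerRoleArn: string;

const sagemaker = new SageMakerClient({});
await sagemaker.send(new CreateProcessingJobCommand({
  ProcessingJobName: `yolo-${projectId}-${Date.now()}`, // illustrative naming
  AppSpecification: { ImageUri: yoloEcrImageUri },      // the beaver_yolo_pipeline image
  ProcessingResources: {
    ClusterConfig: { InstanceCount: 1, InstanceType: "ml.g4dn.xlarge", VolumeSizeInGB: 30 },
  },
  ProcessingInputs: [{
    InputName: "pages",
    S3Input: {
      S3Uri: `s3://${bucket}/${dataUrl}/pages/`,
      LocalPath: "/opt/ml/processing/input",
      S3DataType: "S3Prefix",
      S3InputMode: "File",
    },
  }],
  ProcessingOutputConfig: {
    Outputs: [{
      OutputName: "detections",
      S3Output: {
        S3Uri: `s3://${bucket}/${dataUrl}/yolo-output/`,
        LocalPath: "/opt/ml/processing/output",
        S3UploadMode: "EndOfJob",
      },
    }],
  },
  RoleArn: sagemakerRoleArn,
}));
ts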

Storage layout

S3 is the durability layer. The bucket is blueprintparser-data-{account_id} and the layout is stable:

S3 layout per project
{dataUrl}/                             // {company_id}/{project_public_id}
├── original.pdf                       // raw upload
├── thumbnail.png                      // 72 DPI cover image
├── pages/
│   ├── page_0001.png                  // 300 DPI display image
│   └── page_0002.png
├── thumbnails/
│   ├── page_0001.png                  // 72 DPI thumbnail
│   └── page_0002.png
├── yolo-output/                       // written by SageMaker
│   ├── page_0001_detections.json
│   └── page_0002_detections.json
└── exports/
    ├── takeoff.csv                    // user-exported CSVs
    └── labels.zip                     // Label Studio exports
text

Every file under pages/ and thumbnails/ is cached as public, max-age=31536000, immutable, so CloudFront holds them forever. The cache-warming pass at the end of preprocessing primes edge locations so the first viewer open is fast. Filenames include the page number, so they're effectively content-addressed.
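
The cache policy is set at upload time. A sketch with the AWS SDK v3, using the bucket and key layout above — the env var name is an assumption:

Upload with immutable cache headers (sketch)
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

declare const dataUrl: string; // {company_id}/{project_public_id}
declare const png: Buffer;     // the rendered 300 DPI page image

const s3 = new S3Client({});
// Page images never change once rendered, so CloudFront can hold them forever
await s3.send(new PutObjectCommand({
  Bucket: `blueprintparser-data-${process.env.AWS_ACCOUNT_ID}`,
  Key: `${dataUrl}/pages/page_0001.png`,
  Body: png,
  ContentType: "image/png",
  CacheControl: "public, max-age=31536000, immutable",
}));
ts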

Database storage uses PostgreSQL 16 on a db.t4g.medium with 50 GB gp3 that can auto-grow to 200 GB. Backups are retained 7 days. Multi-AZ in production. All writes go through Drizzle; the schema lives in src/lib/db/schema.ts.

The database schema at 50,000 feet

  • companies — Multi-tenant boundary; holds pipelineConfig (CSI thresholds, heuristics, pageConcurrency, csiSpatialGrid).
  • users — Auth + RBAC: isRootAdmin, canRunModels, companyId.
  • sessions — NextAuth session tokens.
  • projects — One row per uploaded PDF set: status, numPages, projectIntelligence JSONB, projectSummary text.
  • pages — One row per page: rawText, drawingNumber, csiCodes, textAnnotations, pageIntelligence JSONB, search_vector tsvector.
  • annotations — YOLO + user markups + takeoff items: bbox, className, confidence, source, data JSONB.
  • yolo_tags — Map Tags output: tag text ↔ YOLO shape instances.
  • qto_workflows — Auto-QTO state machines: materialType, step, parsedSchedule, lineItems, userEdits.
  • takeoff_groups — Groups in the takeoff panel sidebar.
  • takeoff_items — Individual takeoff items (count/area/linear) organized into groups.
  • chat_messages — Conversation history, keyed by project + page + scope.
  • llm_configs — Company- or user-scoped LLM provider + model + encrypted API key + context section overrides.
  • user_api_keys — User-level API keys (encrypted at rest).
  • models — YOLO model registry: name, type, s3Path, config, isDefault.
  • model_access — Per-company access grants for models owned by another company.
  • processing_jobs — SageMaker / Step Functions job tracking with status + CloudWatch refs.
  • labeling_sessions — Label Studio integration state.
  • app_settings — Global key/value (root admin only); includes the sagemakerEnabled toggle password.
  • audit_log — Admin action history.

Terraform file map

The full stack is in infrastructure/terraform/. Each file has a single responsibility:

infrastructure/terraform/ (13 files)
  • main.tf: Provider + backend + top-level module wiring.
  • variables.tf: All tunable inputs (region, sizing, domain name, ACM ARN, etc.).
  • terraform.tfvars.example: Template for per-environment variable values.
  • terraform.tfvars: Actual per-env values (gitignored in most setups).
  • outputs.tf: Exported outputs: ALB DNS, ECR repo, RDS endpoint, S3 bucket, etc.
  • vpc.tf: VPC, public and private subnets, NAT, route tables, security groups.
  • ecs.tf: ECS cluster, task definitions (app / cpu-pipeline / label-studio), services, auto-scaling.
  • ecr.tf: ECR repositories for the app image and the YOLO inference image.
  • rds.tf: PostgreSQL 16 instance, subnet group, parameter group, backups.
  • s3.tf: Data bucket, CloudFront distribution with OAC, CORS, range requests.
  • iam.tf: Execution + task roles (S3, Textract, SageMaker, SFN) and the Step Functions role.
  • secrets.tf: Secrets Manager entries for DATABASE_URL, NEXTAUTH_SECRET, LLM keys, etc.
  • stepfunctions.tf: State machine definition + CloudWatch log group for the processing pipeline.
infrastructure/terraform/ — 13 files, single source of truth for AWS.

Label Studio side-car

Label Studio runs as a separate ECS task with an EFS volume mounted at /label-studio/data. The ALB routes labelstudio.* to the task; the main app integrates via /api/labeling/* routes. It reads from the same S3 bucket the main app writes to, so round-tripping a project from ingest to labeling and back works without cross-service copying.

Running without AWS

Everything in this section is the deployed tier. You can ignore most of it and still run BP: the repo ships a docker-compose.yml that brings up a local PostgreSQL on port 5433 and lets you run npm run dev against it. Textract, S3, and SageMaker are all gated by env vars — when they're missing, BP falls back to Tesseract for OCR, the local filesystem for images (or a dev-mode S3 emulator if you prefer), and nothing for YOLO. LLM chat still works if you have a Groq free-tier key. The table parsers (img2table, Camelot, TATR) all run locally from scripts/. Bucket Fill runs locally. Auto-QTO runs locally given a parsed schedule. The only hard dependency on AWS is YOLO inference.
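The gating is plain env-var checks. A sketch of the pattern (the flag names are assumptions, not BP's actual names):

// Placeholder signatures standing in for the real OCR implementations.
declare function textractOcr(png: Buffer): Promise<string>;
declare function tesseractOcr(png: Buffer): Promise<string>;

const awsOcr = Boolean(process.env.AWS_REGION && process.env.TEXTRACT_ENABLED);

// Tesseract fallback when Textract isn't configured; YOLO has no fallback.
export const ocrPage = awsOcr ? textractOcr : tesseractOcr;
export const yoloAvailable = Boolean(process.env.SAGEMAKER_ENABLED);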

Meta

How BP Works — for LLMs

This section is written for a language model reading the BP codebase cold. It packs the shape of the system, the construction vocabulary, the load-bearing file paths, and the known traps into one place. If you're a human reader you can still use it — it's just unusually dense because the target reader has a large context window.

Grep anchors
Every subsection has an [LLM-NAV:slug] anchor near its heading so you can jump in from a grep. For the full, file:line-grade navigation manual, the companion doc is featureRoadMap/BPArchitecture_422.md — this section is the client-readable summary of that doc.

Canonical mental model (read this first)

[LLM-NAV:mental-model]

BlueprintParser is a graph builder for construction PDFs. It turns a multi-page drawing set into two data axes:

  • Horizontal (per page) — each page carries OCR, CSI codes, text annotations, detected tables, a classification, and (optionally) YOLO detections. Stored on the pages row with a large pageIntelligence JSONB.
  • Vertical (project-wide) — annotations (YOLO + user markups + takeoff), takeoff_items (count / area / linear), yolo_tags (schedule tag ↔ shape instance), and projectIntelligence (the project-level summary: disciplines, CSI network graph, hub pages).

Everything downstream — the LLM chat, the viewer, takeoff, the CSI spatial heatmap, Auto-QTO — reads from those two shapes. The fastest way to get oriented in the code is: read src/lib/db/schema.ts, src/types/index.ts, and src/lib/processing.ts#processProject().

Construction glossary

[LLM-NAV:glossary]

BP's code uses construction-industry terms unapologetically. A model that doesn't know these will miscalibrate what the code is doing. Each card below gives the plain-English definition and the BP surface it appears in.

Keynote
A numbered callout (often in a circle or hexagon) referring to a spec section or instruction. Example: “③” on a drawing refers to note 3 in a table.
Detected as shapes with inner OCR text; stored as annotations with source="shape-parse" and in pages.keynotes JSONB.
CSI code
A 6-digit number from the Construction Specifications Institute MasterFormat. Example: 08 14 00 = Wood Doors. The primary industry-standard scheme for organizing construction scope.
Detected by the 3-tier matcher in src/lib/csi-detect.ts. Stored in pages.csi_codes JSONB. Drives CSI heatmaps, network graph, and LLM context.
Takeoff
The process of counting everything on a drawing set (doors, wall square footage, linear feet of pipe) for a bid. Historically done with a printed set + highlighter.
takeoff_items table (count / area / linear). Area tab bucket-fills rooms. Auto-QTO materializes schedule-backed takeoffs automatically.
Callout
A graphical reference on a drawing. Usually a circle with a letter/number inside, sometimes with a leader line. Two types: schedule callouts (tag → row) and detail callouts (symbol → detail sheet).
Schedule callouts become YoloTags via Map Tags. Detail callouts become cross-references in page_intelligence.crossRefs.
Schedule
A tabular block on a drawing listing every item of a type (door schedule, window schedule, finish schedule). Each row = one unique tag value.
Detected by table-classifier, parsed with img2table/Camelot/TATR/ocr-grid-detect, stored as pages.page_intelligence.parsedRegions[] with type="schedule".
Trade
A discipline: architectural, structural, MEP (mechanical/electrical/plumbing), civil, etc. Each trade has its own subset of sheets and its own CSI divisions.
Inferred from CSI codes and drawing prefixes. Surfaced in the toolbar trade filter + projectIntelligence.disciplines.
Sheet number
The identifier on the title block — e.g. A-101, E-205. Letter = trade, digit = sheet index. Estimators refer to pages by sheet number, not by 1-based index.
Extracted via extractDrawingNumber() in src/lib/title-block.ts; stored in pages.name.
Title block
The legend block (usually bottom-right corner) with the project name, sheet number, scale, revision, and stamp. Excluded from takeoff counting.
Detected as a YOLO class title_block. Auto-QTO strictly excludes hits inside this region.
Tag / tag shape
A small symbol (circle, hexagon, diamond) containing a tag string like “D-01”. The schedule has one row per tag; the drawings have many instances. Counting the instances = the takeoff.
QTO_TAG_SHAPE_CLASSES in AutoQtoTab.tsx: circle, hexagon, diamond, triangle, pill, oval, rectangle, square, + variants.
Scale calibration
Before a bucket-filled polygon is a real measurement, BP needs to know how many real-world feet per pixel. User clicks two points, types the known distance.
Stored per-page in scaleCalibrations[pageNumber]. Consumed by computeRealArea() in src/lib/areaCalc.ts; a worked sketch of the math follows this glossary.
pageIntelligence / projectIntelligence
BP's internal structured summary of a page / project. A compact JSON that replaces raw OCR for most downstream uses (LLM context, Auto-QTO discovery, CSI heatmap).
pages.page_intelligence + projects.project_intelligence JSONB columns. Built by analyzePageIntelligence() and analyzeProject().
Heuristic engine
A rules engine that fires inferences like “this page contains a door schedule” or “this page is an RCP” based on text keywords + YOLO classes + spatial conditions.
src/lib/heuristic-engine.ts. Two modes: text-only (during processing) and YOLO-augmented (after YOLO).
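To make the scale-calibration card concrete, here is a worked sketch of the math: shoelace pixel area scaled by the square of feet-per-pixel. Function names are illustrative; BP's real version is computeRealArea() in src/lib/areaCalc.ts.

// Shoelace formula over polygon vertices in pixel space.
function shoelaceArea(pts: [number, number][]): number {
  let sum = 0;
  for (let i = 0; i < pts.length; i++) {
    const [x1, y1] = pts[i];
    const [x2, y2] = pts[(i + 1) % pts.length];
    sum += x1 * y2 - x2 * y1;
  }
  return Math.abs(sum) / 2; // area in pixels²
}

// Two clicked points + a known distance give feet-per-pixel;
// area scales by its square.
function realAreaSqFt(pts: [number, number][], feetPerPixel: number): number {
  return shoelaceArea(pts) * feetPerPixel ** 2;
}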

File:line landmarks (the 20 that matter most)

[LLM-NAV:landmarks]

  • processProject (src/lib/processing.ts:165-605): auto pipeline entry. 14 per-page stages + project rollup.
  • mapConcurrent (src/lib/processing.ts:35-52): worker-pool concurrency limit. Default 8.
  • analyzePageImageWithFallback (src/lib/textract.ts:~315): 3-tier OCR fallback: full Textract → half-res → Tesseract.
  • detectCsiCodes (src/lib/csi-detect.ts): 3-tier matcher. Returns CsiCode[] with trade + division + confidence.
  • findOccurrences (src/lib/tag-mapping/find-occurrences.ts:171): tag-mapping entry. Dispatches to 5 matcher types + composes scores.
  • processFill (src/workers/bucket-fill.worker.ts:453): 8-stage flood-fill pipeline. Text is a wall.
  • computeRealArea (src/lib/areaCalc.ts): shoelace → calibrated sqft. Truth for area takeoffs.
  • streamChatWithTools (src/lib/llm/anthropic.ts:85-169): agentic tool-use loop. maxRounds=10.
  • BP_TOOLS (src/lib/llm/tools-defs.ts): 20 tool definitions. Client-safe (no db/fs).
  • executeToolCall (src/lib/llm/tools.ts): server-side tool executors. Full db + fs access.
  • canvasWantsEvents (src/components/viewer/AnnotationOverlay.tsx:2510): render-gate condition #1. Also touch L2521, L2550, L2554.
  • useViewerStore (src/stores/viewerStore.ts:609): Zustand store. 17 slice hooks, L1675 onward.
  • resetAllTools (src/stores/viewerStore.ts): canonical tool reset. Compose new tools into it.
  • focusAnnotationId (src/stores/viewerStore.ts): one-shot signal. Read-and-clear pattern.
  • assembleContextWithConfig (src/lib/context-builder.ts): LLM prompt assembly. Priority-sorted section packing.
  • ALL_DETECTORS (src/lib/detectors/registry.ts:21): 10 text-annotation detectors. Add here + wire to enable config.
  • QTO_STRICT_EXCLUSION_CLASSES (src/components/viewer/AutoQtoTab.tsx:52): tables, title_block, drawings. Required for Auto-QTO.
  • startYoloJob (src/lib/yolo.ts): SageMaker Processing job launch. Only caller: POST /api/yolo/run.
  • resolveConfig (src/lib/llm/resolve.ts): per-company LLM provider + model + key lookup.
  • buildCsiGraph (src/lib/csi-graph.ts, ~430 LOC): project-level CSI relationship graph. Fingerprinted cache key.

The 17 Zustand slice hooks

[LLM-NAV:store-slices]

Every panel, every toolbar button, every canvas overlay reads from one of these slices. Subscribing at the slice level — rather than with a raw useViewerStore(s => s.field) — is how the 1,986-line store doesn't cause cascading re-renders. If you're adding UI that needs state from the store, check this map for the existing slice before creating a new one.

viewerStore.ts — 17 slice hooks around one Zustand store. Subscribe via slice hooks (not individual fields) to minimize re-renders. The base hook is useViewerStore (L609; ViewerState, ~1,400 LOC body).
  • useNavigation (L1675): pageNumber, numPages, mode.
  • usePanels (L1686): 12 showX flags + toggles.
  • useSelection (L1726): multi-select ids + helpers.
  • useAnnotationGroups (L1737): groups, memberships, upsert.
  • useDrawingState (L1751): _drawing / _drawStart / _drawEnd / _mousePos.
  • useSymbolSearch (L1764): results, confidence, dismissed.
  • useChat (L1792): messages, scope.
  • useTableParse (L1803): step, region, grid, col/row BBs.
  • useKeynoteParse (L1832): step, region, yolo-class bind.
  • useProject (L1859): projectId, publicId, dataUrl, isDemo.
  • usePageData (L1882): pageNames, pageDrawingNumbers.
  • useDetection (L1901): annotations, showDetections, filters.
  • useYoloTags (L1921): tags, activeId, visibility, picking mode.
  • useTextAnnotationDisplay (L1940): shown types + colors + hidden set.
  • useAnnotationFilters (L1956): active filter, csi filter, trade filter.
  • useQtoWorkflow (L1969): active wf, cell structure, toggleCellHighlight.
  • useSummaries (L1977): summary arrays + chunk loader state.
Rule: prefer a slice hook over useViewerStore(s => s.field). Slice hooks use useShallow, so components only re-render when their slice actually changes.
17 slice hooks fan out from useViewerStore. Line numbers are from the current viewerStore.ts.
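The pattern itself is ordinary Zustand. A minimal sketch of a slice hook (the store and fields here are stand-ins, not BP's ViewerState):

import { create } from "zustand";
import { useShallow } from "zustand/react/shallow";

interface ViewerState {
  pageNumber: number;
  numPages: number;
  mode: string;
}

const useViewerStore = create<ViewerState>(() => ({
  pageNumber: 1,
  numPages: 0,
  mode: "pan",
}));

// Slice hook: subscribers re-render only when one of these fields changes,
// because useShallow compares the selected object shallowly.
export const useNavigation = () =>
  useViewerStore(
    useShallow((s) => ({ pageNumber: s.pageNumber, numPages: s.numPages, mode: s.mode }))
  );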

The 20 LLM tools — when to call each

[LLM-NAV:tool-selection]

Section 9 has the full tool grid. This subsection is the selection heuristic: given a user question, which tool should an LLM call first?

  • Project overview / disciplines → getProjectOverview. One call returns the cluster summary; always cheaper than scanning pages.
  • A specific page → getPageDetails(pageNumber). Structured summary first, then raw OCR only if needed.
  • Pages containing Division X → lookupPagesByIndex({ index: "csi", key: "X" }). O(1); don't iterate every page.
  • Cross-references / hub pages → getCrossReferences. Returns edges and ranked hubs from the graph.
  • Text location on a page → searchPages / getOcrTextInRegion. Search uses the tsvector index; in-region is bbox-scoped OCR.
  • Parsed schedules → getParsedSchedule(pageNumber). Headers + rows, already structured; don't re-parse from OCR.
  • Spatial layout of a page → getCsiSpatialMap / getSpatialContext. 9×9 heatmap + YOLO-joined text.
  • YOLO detections → getAnnotations({ source: "yolo" }). Filter-based; returns bboxes + classes + confidence.
  • Tag instances across the project → mapTagsToPages. Bridges schedule rows to drawing shapes; cached per tag list.
  • CSI code for arbitrary text → detectCsiFromText(text). Runs the 3-tier matcher on input you provide.
  • Jump the viewer to page X → navigateToPage({ pageNumber }). Side-effecting action; the user sees it happen.
  • Highlight a region → highlightRegion. Cyan pulse on canvas; drives attention.
  • Persist a new annotation → createMarkup. Mutation; writes to the annotations table.

Signal valve state — what BP does NOT do yet

[LLM-NAV:signal-valves]

Don't overpromise on tag mapping fidelity
The tag-mapping scoring system has two signals hardcoded to zero and one hardcoded to true as of 2026-04-22. These are reserved for the future Discrepancy Engine; matchers have not been wired to populate them yet.
src/lib/tag-mapping/find-occurrences.ts:131, 141-142
// :131 — windowMatch hardcoded to true (multi-word text coherence not evaluated)
const windowMatch = true;

// :141-142 — two boosts hardcoded to zero
shapeContainBoost: 0,     // not yet produced by matchers; future refinement
objectAdjacencyBoost: 0,  // not yet produced by matchers; future refinement

Translation for a model reasoning about BP's capabilities: tag mapping scores are conservative. Every returned match has passed a pattern + region-weight + scope check, but the adjacency and shape-containment refinements that would let BP surface subtle discrepancies (e.g. "schedule says 12 doors of type D-01 but only 11 appear on plans") are not yet implemented. Don't claim that capability in responses. Point the user at Section 6 and Section 9 if they ask.

Post-processing flows (the stack-on story)

[LLM-NAV:flows]

Every feature in BP follows the same shape: user action → API route → DB/S3 write → Zustand store update → re-render. If you're reasoning about how a change would propagate, trace that path. The diagram below shows four of the most common flows side-by-side.

Post-processing flows — user action to rendered result. Every feature follows the same shape: action → API → DB → store → re-render. Tools stack.

Run YOLO → show detections:
  1. Admin clicks Run (AiModelsTab.tsx)
  2. POST /api/yolo/run (returns a SageMaker job ID)
  3. SageMaker Processing runs (ml.g4dn.xlarge)
  4. POST /api/yolo/load (webhook on complete)
  5. annotations rows written (source = yolo)
  6. Viewer re-renders (canvas + DetectionPanel)

Draw an area polygon:
  1. Open AreaTab (sets activeTakeoffItemId)
  2. Click vertices (addPolygonVertex())
  3. Preview renders (DrawingPreviewLayer)
  4. Double-click to finalize (computeRealArea())
  5. POST /api/annotations (type=area-polygon)
  6. AreaTab list refreshes (updateTakeoffItem())

Bucket fill commit:
  1. Arm bucket fill (bucketFillActive=true)
  2. Click a seed point (clientBucketFill())
  3. Worker floods + traces (bucket-fill.worker.ts)
  4. Preview overlay (evenodd with holes)
  5. User accepts (BucketFillAssignDialog)
  6. POST /api/annotations (vertices + holes + area)

Create annotation group:
  1. Lasso ≥ 2 items (mode=group)
  2. GroupActionsBar appears (floating bar)
  3. MarkupDialog opens (name + color)
  4. POST /api/annotation-groups (auto-CSI from notes)
  5. hydrateGroupMemberships (Zustand update)
  6. Canvas ring outline (groupIdToColor map)
Four flows, same pattern. The symmetry is load-bearing — features compose because they share these steps.

Known hazards when editing code

[LLM-NAV:hazards]

  • Canvas render gate drift (AnnotationOverlay.tsx:2510-2527, :2550, :2554): adding a new canvas mode without touching all four conditions → silent event loss / wrong cursor.
  • csi-detect.ts is server-only (src/lib/csi-detect.ts uses fs): importing it from client components passes tsc + vitest but fails the Turbopack build. Keep it behind route files and server libs.
  • Native binaries on Mac → Linux container (host npm run build): ships Darwin binaries that crash at runtime. Always build in Docker / CI.
  • In-memory rate limit + brute-force state (src/middleware.ts, src/lib/auth.ts): won't scale past one ECS replica. Move to Redis when scaling.
  • focusAnnotationId is one-shot (viewerStore.ts, read + clear): setting it twice to the same value won't fire the effect unless you clear it between.
  • Python scripts don't talk to S3 (scripts/*.py, except lambda_handler.py): the TS caller handles S3 download → tempdir → subprocess → upload. Don't add boto3 to the Dockerfile.
  • ClientAnnotation.data is a 5-variant union (src/types/index.ts + AnnotationOverlay): heavy use of as any casts. For new access patterns, narrow by data.type and avoid adding more any casts; see the sketch after this table.
  • OAuth has no domain allowlist (src/lib/auth.ts): any email on a matching domain can join an existing company. Fine for self-hosting; dangerous on multi-tenant public deployments.
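For the union trap, the recommended narrowing looks like this. The variant names are invented for illustration; the real union has five members in src/types/index.ts:

type AnnotationData =
  | { type: "area-polygon"; vertices: [number, number][]; holes?: [number, number][][] }
  | { type: "count"; tag: string };

function describe(data: AnnotationData): string {
  switch (data.type) {
    case "area-polygon":
      // Narrowed by the discriminant; no `as any` cast needed.
      return `polygon with ${data.vertices.length} vertices`;
    case "count":
      return `count tag ${data.tag}`;
  }
}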

How to extend BP (recipes)

[LLM-NAV:extend]

Add a new LLM tool

  1. Add a tool definition to BP_TOOLS in src/lib/llm/tools-defs.ts (name, description, JSON Schema input); see the sketch after this list.
  2. Implement execMyTool(input, ctx) in src/lib/llm/tools.ts + route it in executeToolCall().
  3. If it's a viewer action, add a handler in ChatPanel.tsx's tool-result dispatcher.
  4. Test via POST /api/ai/chat with a prompt that would trigger the tool.
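A sketch of step 1, assuming an Anthropic-style { name, description, input_schema } entry shape (check tools-defs.ts for the real one):

// Hypothetical tool definition; the name and fields are invented.
export const countAnnotationsTool = {
  name: "countAnnotations",
  description: "Count annotations on a page, optionally filtered by YOLO class.",
  input_schema: {
    type: "object" as const,
    properties: {
      pageNumber: { type: "number", description: "1-based page index" },
      className: { type: "string", description: "Optional YOLO class filter" },
    },
    required: ["pageNumber"],
  },
};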

Add a new canvas tool mode

  1. Add state to viewerStore.ts and compose into the right slice hook.
  2. Touch ALL FOUR conditions in AnnotationOverlay.tsx: canvasWantsEvents (L2510), canvasShouldRender (L2521), pointerEvents (L2550), cursor (L2554).
  3. Add branches in handleMouseDown/Move/Up.
  4. Compose the tool's state reset into resetAllTools().

Add a new text-annotation detector

  1. Create src/lib/detectors/my-detector.ts exporting a TextDetector; see the sketch after this list.
  2. Add it to ALL_DETECTORS in src/lib/detectors/registry.ts.
  3. Add per-company enable toggle via Admin → Text Annotations.
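A sketch of step 1; the TextDetector shape is assumed for illustration, so check the real interface before copying:

// src/lib/detectors/my-detector.ts (hypothetical)
interface OcrWord {
  text: string;
  bbox: [number, number, number, number];
}

interface TextDetector {
  id: string;
  detect(words: OcrWord[]): { label: string; bbox: [number, number, number, number] }[];
}

// Flags OCR words that look like revision callouts, e.g. "REV 3".
export const revisionDetector: TextDetector = {
  id: "revision-callout",
  detect: (words) =>
    words
      .filter((w) => /^REV\s?\d+$/i.test(w.text))
      .map((w) => ({ label: "revision", bbox: w.bbox })),
};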

Add a new YOLO class

  1. Register the YOLO model in the models table (Admin → AI Models).
  2. Run a SageMaker job to produce detections.
  3. (Optional) assign a CSI code to the class in the admin config so every annotation inherits it.
  4. Downstream features auto-pick up the new class; no code changes needed for Map Tags or Auto-QTO.

Instructions for a model answering a user question about BP

[LLM-NAV:model-behavior]

  1. Check the context budget. On Opus you have room to include raw OCR; on Groq or Haiku, rely on structured tools and skip raw OCR unless the question demands it.
  2. Reach for the right tool first. The selection table above is the heuristic: for "where are the plumbing fixtures," lookupPagesByIndex is always cheaper than searchPages.
  3. Ground every quantitative claim in a tool call. Do not invent counts. Auto-QTO is the source of truth for takeoffs; mapTagsToPages is the source of truth for tag instance counts.
  4. Prefer actions over prose. If the user wants to see page 42, call navigateToPage rather than describing it.
  5. Respect the signal-valve state. BP does not currently do adjacency-based cross-schedule discrepancy detection (see the warning above). If the user asks "does the door schedule match the plans," answer with what mapTagsToPages returns and note that you cannot detect subtler mismatches yet.
Operations

API Reference

In plain English: everything the viewer does — uploading a project, running YOLO, chatting with the LLM, exporting a takeoff — goes through an HTTP endpoint here. This page lists every one, grouped by domain, so you can find where in the code each button talks to the server.

BP exposes roughly 91 HTTP endpoints from Next.js API routes. This reference groups them by domain, with a one-line description for each. This is not an OpenAPI spec — for machine-readable schemas, src/lib/llm/tools.ts has JSON Schemas for the LLM tool surface, which is the most formally typed set of endpoints.

Auth model
Unless marked public, every endpoint requires a valid NextAuth session. Routes marked admin additionally check user.isAdmin || user.isRootAdmin. Routes marked root require isRootAdmin. Destructive admin toggles require an additional admin password stored in app_settings and checked at the route level in /api/admin/toggles. All authenticated routes enforce row-level multi-tenant scoping through src/lib/audit.ts.

Endpoint catalog

Listed here: 88 endpoints across 14 domains, backed by 84 route files (several files export more than one HTTP method, and a few admin/util routes are omitted from the catalog).

Notes on specific routes

A few routes need extra context beyond the short description:

  • POST /api/ai/chat is Server-Sent Events, not request/response. The response stream yields data: lines encoding a sequence of { type: "text_delta" | "tool_call_start" | "tool_call_result" | "done" } events, which the client reads as they arrive. DELETE on the same path clears the scoped conversation history. A minimal client sketch follows this list.
  • POST /api/yolo/run is the only way to trigger YOLO inference. The request body is { projectId, modelId } and the response returns the SageMaker execution ID. Watch it via GET /api/yolo/status. Results ingest via the webhook.
  • POST /api/processing/webhook is the callback surface for Step Functions and SageMaker. Requests are HMAC-SHA256 signed with the PROCESSING_WEBHOOK_SECRET from Secrets Manager. Unsigned or mis-signed requests are rejected.
  • POST /api/projects/[id]/map-tags-batch is the heavy-lifter for Auto-QTO. It takes a parsed schedule's tag column and a target YOLO class (or a free-floating-text marker), and runs the mapping across the entire project at once. Expect it to take several seconds for large projects.
  • POST /api/bucket-fill returns a polygon in normalized 0–1 coordinates, not image pixel space. The viewer converts to canvas coordinates at render time; areaCalc.ts converts to real-world units using the page's scale calibration.
  • POST /api/csi/detect is the public entry point to the 3-tier CSI matcher (see Section 04). Accepts a text string in the body, returns an array of matches with codes, descriptions, divisions, trades, and confidence scores.
  • /api/demo/* is a parallel, read-only mirror of the project and search routes that does not require auth. These routes power the /demo route and the docs page's live component demos.
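For the SSE route above, here is a minimal consumer sketch, assuming only the event shapes listed (the request-body and payload field names are guesses, not BP's contract):

export async function streamChat(
  projectId: string,
  message: string,
  onDelta: (text: string) => void
) {
  const res = await fetch("/api/ai/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ projectId, message }), // body shape assumed
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep any trailing partial line for the next chunk
    for (const line of lines) {
      if (!line.startsWith("data: ")) continue;
      const event = JSON.parse(line.slice(6));
      if (event.type === "text_delta") onDelta(event.text); // field name assumed
    }
  }
}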

Where to read the source

Every endpoint in the catalog above maps to a file under src/app/api/**/route.ts. The Next.js App Router uses directory-based routing, so /api/csi/detect is src/app/api/csi/detect/route.ts and /api/projects/[id]/map-tags-batch is src/app/api/projects/[id]/map-tags-batch/route.ts. Each file exports HTTP method handlers (GET, POST, etc.) and most hand off immediately to helper functions in src/lib/. Handlers are thin by design — the real logic lives in lib/ and is unit-tested.
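As a sketch of that shape, a thin handler for /api/csi/detect could look like this; detectCsiCodes' exact signature is assumed:

// src/app/api/csi/detect/route.ts (illustrative)
import { NextResponse } from "next/server";
import { detectCsiCodes } from "@/lib/csi-detect"; // server-only; keep out of client code

export async function POST(req: Request) {
  const { text } = await req.json();
  if (typeof text !== "string") {
    return NextResponse.json({ error: "text required" }, { status: 400 });
  }
  // Handler stays thin; the real logic lives in lib/ and is unit-tested there.
  return NextResponse.json({ matches: detectCsiCodes(text) });
}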