Topic · Architecture deep dive

The full GUS.ai architecture.

Three read-only data layers. 14 purpose-built tools. Evidence-mandatory reasoning. Multi-channel determinism. SHA-256 audit chain. Six panels covering what GUS.ai actually does under the hood.

~44,000 LOC · 173 source files · 14 tools · 700 RAG chunks · 3 years of applied research

Chapter 16

The Architecture.

Three data layers, fourteen tools, evidence on every answer.

Foundation

Three read-only data layers. Every answer sits on one.

GUS.ai does not reason from thin air. Every response is anchored in real plant documents, telemetry, or process knowledge. None of these layers writes; they are queried.

Document Intelligence

~700 chunks across 8 ingest pipelines

SOPs, P&IDs, owner manuals, recipe specs, sizing sheets, commissioning trial results, equipment specs, parts lists. Every chunk carries source filename, doc type, page, revision, superseded flag, and citation string.

Document Intelligence pipeline

Operational Intelligence

Plant telemetry · TimescaleDB

Hand-curated core tags, plus tags auto-indexed from the full plant tag hierarchy. Sub-second MQTT live stream via HiveMQ. Grafana deep-links on every reading.

Operational Intelligence toolset · ADX User Guide

Process Dependencies

28-node static knowledge graph

Equipment, process variables, recipe steps, outcomes. Three reasoning modes: impact (what breaks if X fails), root cause (what causes both A and B), single-point-of-failure (where the blast radius is largest).

analyze_dependencies tool

The Agent

14 purpose-built tools. Not an all-knowing LLM.

Every tool is invoked only when evidence-grounded reasoning demands it. Every invocation is logged. The system refuses unsafe actions by design. Read-only by architecture, not by policy.

The 14-tool palette

Document Search

1 tool

  • search_documents

Operational Intelligence

10 tools

  • get_current_value
  • get_trend
  • get_alarms
  • get_batch_state
  • list_batch_steps
  • get_batch_history
  • search_tags
  • analyze_dependencies
  • correlate_tags
  • detect_anomalies

System Status

2 tools

  • get_all_vat_status
  • get_events

Live Stream

1 tool

  • get_mqtt_live

14-tool palette · per-tool descriptions

The orchestration pipeline

run_query() · every channel

  1. Kill-switch check
  2. Input scrubber (non-web channels)
  3. Safety guards (block control actions)
  4. Tool dispatch · 10-call budget · per-tool throttle
  5. Tool-result injection scanner
  6. Audit record (immutable JSONL + hash chain)

Orchestration pipeline · safety guards · audit chain

Hard limits: 10 tool calls per query, 2,048 max output tokens, per-tool throttling on top. When the budget is exhausted, GUS.ai synthesizes from the evidence already gathered rather than retrying. No runaway loops.
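A minimal sketch of how a per-query budget with per-tool throttling could behave. Only the 10-call limit comes from the text above; the `ToolBudget` class, the `gather_evidence` helper, and the 0.5-second throttle window are hypothetical names and values chosen for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Tuple

MAX_TOOL_CALLS = 10  # stated in the text: 10 tool calls per query


@dataclass
class ToolBudget:
    """Per-query call budget plus a minimum interval between calls to one tool."""
    max_calls: int = MAX_TOOL_CALLS
    min_interval_s: float = 0.5  # assumed throttle window, not from the source
    calls_made: int = 0
    last_call: Dict[str, float] = field(default_factory=dict)

    @property
    def exhausted(self) -> bool:
        return self.calls_made >= self.max_calls

    def try_acquire(self, tool: str, now: Optional[float] = None) -> bool:
        """Return True and consume one call if the tool may run right now."""
        now = time.monotonic() if now is None else now
        if self.exhausted:
            return False
        last = self.last_call.get(tool)
        if last is not None and now - last < self.min_interval_s:
            return False  # throttled: same tool called again too soon
        self.calls_made += 1
        self.last_call[tool] = now
        return True


def gather_evidence(
    requests: List[Tuple[str, dict]],
    budget: ToolBudget,
    dispatch: Callable[[str, dict], Any],
) -> List[Any]:
    """Run tool requests until the budget is spent, then stop without retrying."""
    evidence = []
    for tool, args in requests:
        if budget.exhausted:
            break  # answer from what has been gathered so far
        if budget.try_acquire(tool):
            evidence.append(dispatch(tool, args))
    return evidence
```

With twelve requests queued, only the first ten run; the caller then synthesizes from those ten results.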

Evidence-Mandatory

The differentiator. No answer without attribution.

Industrial systems demand proof. Every document reference includes source, page, and revision. Every telemetry reading includes tag, timestamp, and quality flag. If the evidence is missing, GUS.ai says so. It does not guess.

ChatGPT

Plausible-sounding guess

When evidence is missing, the model generates fluent text that sounds confident but is not grounded in any verifiable source.

GUS.ai

Refusal with stated requirement

If a query cannot be answered from documents, telemetry, or process knowledge, GUS.ai says so explicitly: 'I do not have the information to answer that. To resolve, we would need [specific evidence].'

Citation primitives

  • Document page (source · type · revision · superseded flag · relevance score)
  • Telemetry tag (tag name · timestamp · value · unit · quality flag)
  • Alarm definition (alarm ID · trip condition · priority · source)
  • Batch record (batch ID · step · recipe · time bounds)
  • Dependency graph node (equipment / variable / step / outcome)

Evidence-mandatory design · citation primitives
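One way to picture the citation primitives and the refusal rule is as typed records plus a gate that will not emit an answer without at least one of them. This is a sketch under stated assumptions: the class names, field names, and `build_answer` function are hypothetical, and only two of the five primitives are modelled. The refusal wording is taken from the text above.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass(frozen=True)
class DocumentCitation:
    source: str
    doc_type: str
    page: int
    revision: str
    superseded: bool
    relevance: float


@dataclass(frozen=True)
class TelemetryCitation:
    tag: str
    timestamp: str
    value: float
    unit: str
    quality: str


Citation = Union[DocumentCitation, TelemetryCitation]

REFUSAL = (
    "I do not have the information to answer that. "
    "To resolve, we would need {needed}."
)


def format_citation(c: Citation) -> str:
    """Render one citation as a short inline reference."""
    if isinstance(c, DocumentCitation):
        flag = " (superseded)" if c.superseded else ""
        return f"{c.source} p.{c.page} rev {c.revision}{flag}"
    return f"{c.tag}={c.value} {c.unit} @ {c.timestamp} ({c.quality})"


def build_answer(draft: str, citations: Sequence[Citation], needed: str) -> str:
    """Attach citations to a draft answer, or refuse if there are none."""
    if not citations:
        return REFUSAL.format(needed=needed)
    refs = "; ".join(format_citation(c) for c in citations)
    return f"{draft} [{refs}]"
```

The point of the gate is that the refusal path is the default: a draft with an empty citation list never reaches the user.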

Domain Depth

Not a generic process AI. Built for what cheese plants actually do.

GUS.ai knows what a set time is, why cook-time deviation matters, what dependencies exist between equipment, and what anomalies look like in a real vat. Three years of applied dairy research.

Batch detection without MES

GUS.ai discovers batches from raw tag transitions (recipe name, step number resets). It knows when a batch starts, what recipe is running, what step is active, what's next. No MES integration required.

Batch detection toolset
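The transition rule described above can be sketched as follows, assuming samples arrive in time order as (timestamp, recipe name, step number) tuples. A new batch is opened when the recipe name changes or the step number drops back. The `Batch` record, the `detect_batches` function, and the sample layout are hypothetical; the actual detection logic may use more signals.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

Sample = Tuple[float, str, int]  # (timestamp, recipe name, step number)


@dataclass
class Batch:
    recipe: str
    start: float
    end: float
    last_step: int


def detect_batches(samples: Iterable[Sample]) -> List[Batch]:
    """Split a time-ordered tag stream into batches.

    A batch boundary is a recipe-name change or a step-number reset
    (the step number going down instead of up).
    """
    batches: List[Batch] = []
    current: Optional[Batch] = None
    for ts, recipe, step in samples:
        is_new = (
            current is None
            or recipe != current.recipe
            or step < current.last_step
        )
        if is_new:
            current = Batch(recipe=recipe, start=ts, end=ts, last_step=step)
            batches.append(current)
        else:
            current.end = ts
            current.last_step = step
    return batches
```

The last entry in the returned list is the batch in progress, so its `recipe` and `last_step` give the running recipe and the active step.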

Spec-vs-actual deviation

Pulls design specs from recipe documents, compares to live batch history. Surfaces deviations in cook time, set time, fill volume, CIP cycle length against the recipe target. Quantified in $/year impact.

Spec-deviation analysis toolset

Causal reasoning over a 28-node graph

Equipment, process variables, recipe steps, outcomes, wired together. Three modes: impact propagation, root-cause F-score ranking, single-point-of-failure blast radius.

analyze_dependencies tool
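The three modes can be illustrated on a toy graph. The nodes and edges below are invented for the example and are not the 28-node graph; the function names are hypothetical. Root cause is shown here as a plain shared-ancestor lookup, which does not reproduce the F-score ranking mentioned above.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical edges: "X": ["Y"] means Y depends on X.
EDGES: Dict[str, List[str]] = {
    "steam_valve": ["cook_water_temp"],
    "cook_water_temp": ["cook_step"],
    "agitator": ["cook_step", "curd_size"],
    "cook_step": ["moisture"],
    "curd_size": ["moisture", "fines_loss"],
    "moisture": [],
    "fines_loss": [],
}


def _reach(start: str, edges: Dict[str, List[str]]) -> Set[str]:
    """Breadth-first search: every node reachable from start, excluding start."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        for nxt in edges.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def impact(node: str, edges: Dict[str, List[str]] = EDGES) -> Set[str]:
    """Impact mode: what breaks downstream if this node fails."""
    return _reach(node, edges)


def common_causes(a: str, b: str, edges: Dict[str, List[str]] = EDGES) -> Set[str]:
    """Root-cause mode (simplified): nodes upstream of both a and b."""
    reverse: Dict[str, List[str]] = {n: [] for n in edges}
    for src, targets in edges.items():
        for dst in targets:
            reverse.setdefault(dst, []).append(src)
    return _reach(a, reverse) & _reach(b, reverse)


def blast_radius_ranking(edges: Dict[str, List[str]] = EDGES) -> List[str]:
    """SPOF mode: nodes ordered by how many others they can take down."""
    return sorted(edges, key=lambda n: (-len(impact(n, edges)), n))
```

In this toy graph the agitator has the largest blast radius, and it is also a shared cause of both moisture and fines-loss deviations.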

MAD z-score anomaly detection

Median Absolute Deviation, not standard deviation. Outlier-resistant. Specifically chosen for process upsets where one extreme reading should not blind the detector to the next.

detect_anomalies tool
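A minimal sketch of MAD-based scoring, using the common modified z-score form 0.6745 · (x − median) / MAD. The 3.5 cutoff is the conventional default for that score and is an assumption here, as are the function names; the threshold used by detect_anomalies is not given in the text.

```python
from statistics import median
from typing import List, Sequence


def mad_zscores(values: Sequence[float]) -> List[float]:
    """Modified z-score for each value, based on median and MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the readings are identical; no spread to scale by.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def find_anomalies(values: Sequence[float], threshold: float = 3.5) -> List[int]:
    """Indices of readings whose modified z-score exceeds the threshold."""
    return [i for i, z in enumerate(mad_zscores(values)) if abs(z) > threshold]


# Hypothetical vat temperature readings with two upsets close together.
readings = [101.0, 100.5, 101.2, 100.8, 140.0, 101.1, 135.0, 100.9]
print(find_anomalies(readings))  # [4, 6]
```

Both upsets are flagged. With a mean and standard deviation, the first extreme reading would inflate the spread estimate and make the second one look less unusual; the median and MAD barely move.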

Pearson correlation matrix

Cross-tag correlation across 2-6 tags. Auto-flags strong / moderate / weak pairs with directionality. Surfaces couplings operators feel but cannot prove.

correlate_tags tool
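A sketch of a pairwise Pearson matrix with strength and direction labels. The 2-6 tag limit comes from the text; the 0.7 and 0.4 cutoffs for strong and moderate are assumed values, and the function names and the tag names in the usage example are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series has no defined correlation; treat as none
    return cov / sqrt(vx * vy)


def classify(r: float) -> Tuple[str, str]:
    """Label a coefficient by strength (assumed cutoffs) and direction."""
    strength = "strong" if abs(r) >= 0.7 else "moderate" if abs(r) >= 0.4 else "weak"
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    return strength, direction


def correlate(series: Dict[str, Sequence[float]]) -> Dict[Tuple[str, str], Tuple[float, str, str]]:
    """Correlate every pair of tags; accepts between 2 and 6 tags."""
    if not 2 <= len(series) <= 6:
        raise ValueError("expected between 2 and 6 tags")
    result = {}
    for a, b in combinations(series, 2):
        r = pearson(series[a], series[b])
        result[(a, b)] = (r, *classify(r))
    return result
```

For example, `correlate({"steam_flow": [1, 2, 3, 4, 5], "vat_temp": [2, 4, 6, 8, 10]})` labels the pair as strong and positive.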

PLC intelligence from L5X

UDTs, controller tags, and alarm definitions parsed from an Allen-Bradley L5X export. Alarm definitions become RAG chunks. Tag descriptions inform retrieval.

PLC Intelligence pipeline

Multi-Channel Determinism

Same reasoning engine. Four ways to ask.

Web kiosk, email, SMS, voice. Same safety guards on every channel. Same evidence-mandatory enforcement. SMS does not weaken the reasoning; it only constrains the output.

Web kiosk

All 14

Full evidence panel with citations, doc-type badges, superseded warnings

3D causal-reasoning flyout, 16 pre-built impact / RCA / SPOF questions, code graph and digital twin views.

Email

All 14

Styled HTML reply with embedded telemetry snapshots, threaded into conversation

IMAP polling. Domain + sender whitelisting. Signature stripping across Gmail, Outlook, Apple Mail. Digest mode batches multiple queries.

SMS

5 of 14

Concatenated SMS, max 4 messages

Twilio A2P 10DLC. Phone whitelist enforced. Industrial abbreviation expansion (70+ rules: CW → Cook Water, etc.).

Voice

4 of 14

Spoken TTS reply, multi-turn conversational loop

Twilio Voice with TwiML. OpenAI TTS default, ElevenLabs premium. 15s per-turn latency cap. Speech timeouts and confidence thresholds.

Per-channel tool allowlist enforced at dispatch

Safety & Audit · 1 of 2

Production-hardened. Not a prototype.

30-finding internal security audit completed. Aligned to IBM and Anthropic enterprise agent security frameworks.

25 FIXED · 2 MITIGATED · 2 OPEN (deferred) · 1 Not applicable

Authentication

Clerk JWT on protected routes, invite-only provisioning, domain whitelist.

Read-only by architecture

The operational data adapter has zero write methods. Safety guards block control verbs (start, stop, override, bypass, write) before tool selection. Not a policy, an architectural impossibility.
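A minimal sketch of a control-verb guard that runs before any tool is selected. The verb list is the one named above; the regular expression, the `check_query` function, and the message wording are hypothetical. Note that plain word matching also blocks harmless questions such as "When did the batch start?", and this sketch makes no attempt to tell commands from questions.

```python
import re
from typing import Optional, Tuple

CONTROL_VERBS = ("start", "stop", "override", "bypass", "write")  # from the text

_CONTROL_PATTERN = re.compile(
    r"\b(" + "|".join(CONTROL_VERBS) + r")\b",
    re.IGNORECASE,
)


def check_query(text: str) -> Tuple[bool, Optional[str]]:
    """Return (allowed, reason). Blocks any query containing a control verb."""
    match = _CONTROL_PATTERN.search(text)
    if match:
        verb = match.group(1).lower()
        return False, f"Blocked: '{verb}' is a control action and this system is read-only."
    return True, None
```

Because the guard runs before tool selection, a blocked query never reaches the dispatcher, and the adapter behind the dispatcher has no write method to call in any case.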

Per-channel tool allowlist

SMS gets 5 of 14 tools. Voice gets 4. Untrusted channels cannot run expensive or data-heavy queries. Web (authenticated) gets the full 14.
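An allowlist check at dispatch could look like the sketch below. The 14 tool names and the per-channel counts come from the document, and email getting all 14 comes from the channel listing. Which specific tools SMS and voice receive is not stated, so the membership of those two sets is hypothetical, as is the `dispatch` function.

```python
from typing import Any, Callable, Dict, FrozenSet

ALL_TOOLS: FrozenSet[str] = frozenset({
    "search_documents",
    "get_current_value", "get_trend", "get_alarms", "get_batch_state",
    "list_batch_steps", "get_batch_history", "search_tags",
    "analyze_dependencies", "correlate_tags", "detect_anomalies",
    "get_all_vat_status", "get_events",
    "get_mqtt_live",
})

CHANNEL_ALLOWLIST: Dict[str, FrozenSet[str]] = {
    "web": ALL_TOOLS,
    "email": ALL_TOOLS,
    # Hypothetical membership: the source gives only the counts (5 and 4).
    "sms": frozenset({
        "search_documents", "get_current_value", "get_alarms",
        "get_batch_state", "get_all_vat_status",
    }),
    "voice": frozenset({
        "get_current_value", "get_alarms",
        "get_batch_state", "get_all_vat_status",
    }),
}


def dispatch(channel: str, tool: str, handler: Callable[..., Any], **kwargs: Any) -> Any:
    """Run a tool handler only if the channel is allowed to use that tool."""
    allowed = CHANNEL_ALLOWLIST.get(channel, frozenset())  # unknown channel: nothing allowed
    if tool not in allowed:
        raise PermissionError(f"tool '{tool}' is not available on channel '{channel}'")
    return handler(**kwargs)
```

Checking at dispatch rather than in the prompt means a tool outside the allowlist cannot run even if the model asks for it.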

Safety & Audit · 2 of 2

Audit, kill-switch, and the receipts.

Tamper-evident hash chain on every record, allowlists per channel, instant emergency stop.

Tool-result injection scanner

Nine classes of prompt-injection patterns stripped from tool results before the model sees them. Defense in depth on top of the system-prompt directive.

Tamper-evident audit log

Every query, every tool call, every result. SHA-256 hash chain (prev_hash + record_hash). Immutable-hash DB trigger rejects UPDATEs. Audit chain verifiable on demand.
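A minimal sketch of a SHA-256 hash chain using the two fields named above, `prev_hash` and `record_hash`. The canonical JSON serialization, the all-zero genesis value, and the function names are assumptions; the actual record layout is not given in the text.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional

GENESIS = "0" * 64  # assumed prev_hash for the first record


def compute_hash(prev_hash: str, payload: Dict[str, Any]) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(chain: List[Dict[str, Any]], payload: Dict[str, Any]) -> Dict[str, Any]:
    """Append a record linked to the current tail of the chain."""
    prev = chain[-1]["record_hash"] if chain else GENESIS
    record = {
        "payload": payload,
        "prev_hash": prev,
        "record_hash": compute_hash(prev, payload),
    }
    chain.append(record)
    return record


def verify_chain(chain: List[Dict[str, Any]]) -> Optional[int]:
    """Return the index of the first broken record, or None if the chain is intact."""
    prev = GENESIS
    for i, record in enumerate(chain):
        if record["prev_hash"] != prev:
            return i
        if record["record_hash"] != compute_hash(prev, record["payload"]):
            return i
        prev = record["record_hash"]
    return None
```

Editing any stored payload changes its recomputed hash, so verification fails at that record; rewriting the hash to match then breaks the link from the next record.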

Emergency kill-switch

Process flag flip. Query endpoints return 503 instantly across all channels, no redeploy required.
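The kill-switch behaviour can be sketched as a single in-process flag checked before any other work. The `KillSwitch` class and `handle_query` function are hypothetical; only the 503 response and the all-channel scope come from the text.

```python
import threading
from typing import Callable, Tuple


class KillSwitch:
    """Process-wide emergency stop backed by a thread-safe flag."""

    def __init__(self) -> None:
        self._tripped = threading.Event()

    def trip(self) -> None:
        self._tripped.set()

    def reset(self) -> None:
        self._tripped.clear()

    def is_tripped(self) -> bool:
        return self._tripped.is_set()


def handle_query(
    channel: str,
    text: str,
    kill_switch: KillSwitch,
    run_query: Callable[[str, str], str],
) -> Tuple[int, str]:
    """Check the kill-switch first; every channel goes through this one gate."""
    if kill_switch.is_tripped():
        return 503, "Service unavailable"
    return 200, run_query(channel, text)
```

Because the check is the first step of the shared entry point, flipping the flag stops web, email, SMS, and voice together without touching the deployment.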

CYBERSECURITY_SUMMARY.pdf · 30-finding security audit

End of topic · Architecture

Three layers, fourteen tools, evidence on every answer. The architecture that powers the reference deployment.

Advanced Process Technologies, Inc. · Confidential · Lactalis Buffalo leadership only