The live application runs its core channel path on Cloudflare Pages, Hono, D1, KV, Workers AI, Agent Memory, and Cloudflare Email. Larger services like R2, Durable Objects, Vectorize, Queues, and AI Gateway are explicit roadmap items until they are bound and used.
Every route enters the same Worker shell.
Conversation stays separate from deterministic pricing.
Inquiries, jobs, memory, and events persist for staff review.
Every inbound request enters through Hono on Cloudflare Pages, then flows through deterministic tools and D1-backed audit records. Roadmap services are marked directly in the diagram.
Bound services that support the interview workflow today.
Kimi K2.6 agent inference and native tool-calling in the channel path.
Businesses, service windows, rate cards, inquiries, jobs, events, and Agent Memory.
Request caps, payload guardrails, idempotency, and edge-ready config.
Companion Worker for inbound email and native outbound reply attempts.
Webhook paths normalize calls and texts into the shared agent loop.
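As a rough illustration of how webhook payloads might be normalized into one shape before reaching the shared agent loop, the sketch below maps Twilio form fields onto a single message type. `CallSid`, `MessageSid`, `From`, `Body`, and `SpeechResult` are standard Twilio webhook parameters. The `InboundMessage` type and both function names are hypothetical and are not taken from the application's code.

```typescript
// Hypothetical shared shape; field names are illustrative only.
type InboundChannel = "voice" | "sms";

interface InboundMessage {
  channel: InboundChannel;
  externalId: string; // Twilio CallSid or MessageSid
  from: string;
  text: string;
}

// Builds a message, or returns null when the sender or text is empty.
function toInboundMessage(
  channel: InboundChannel,
  externalId: string,
  from: string,
  rawText: string
): InboundMessage | null {
  const text = rawText.trim();
  if (from === "" || text === "") return null;
  return { channel, externalId, from, text };
}

// Twilio posts form-encoded bodies. Voice <Gather> callbacks carry
// SpeechResult; Messaging webhooks carry Body.
function normalizeTwilioWebhook(
  form: Record<string, string | undefined>
): InboundMessage | null {
  const from = (form["From"] ?? "").trim();
  const callSid = form["CallSid"];
  const speech = form["SpeechResult"];
  const messageSid = form["MessageSid"];
  const body = form["Body"];

  if (callSid && speech !== undefined) {
    return toInboundMessage("voice", callSid, from, speech);
  }
  if (messageSid && body !== undefined) {
    return toInboundMessage("sms", messageSid, from, body);
  }
  return null; // Unrecognized payload: let the route reject it.
}
```

Returning `null` for anything unrecognized keeps validation in one place, so the route handler can reject malformed payloads before any model call is made.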
Important, but not claimed as live until bindings and code prove it.
Stateful sessions per business and conversation for richer multi-turn calls.
Contracts, call recordings, transcripts, and attachments.
Semantic retrieval upgrade for larger policy and historical corpora.
AI observability, caching, provider fallback, async sync, and retries.
The agent combines Kimi reasoning, staff-approved D1 memory, speech input, and voice output without letting retrieved text override deterministic tools.
Primary reasoning model. Its mixture-of-experts (MoE) architecture activates only a subset of expert sub-networks per token, which keeps inference cost down at high volume.
Live retrieval layer for business policies, procedures, warranty rules, and FAQs. Staff upload PDFs or paste text, then enabled chunks are injected into every channel.
Roadmap speech-to-text layer. The current voice path accepts Twilio Gather speech transcripts and routes them into the same Kimi K2.6 loop.
Text-to-speech for voice responses. ElevenLabs for production-quality voices; Workers AI MeloTTS as a low-latency, on-platform fallback.
Agent Memory is the live RAG path for business operations. Original files are not retained; the app stores extracted Markdown, document metadata, and retrieval chunks in D1, then injects up to four relevant enabled snippets into each Kimi K2.6 request.
Staff add PDF, TXT, Markdown, CSV, or direct policy text from /agent-memory.
Workers AI toMarkdown extracts readable text from PDFs and documents.
Markdown is normalized, capped, chunked, keyword-indexed, and scoped by business_id.
Web, email, SMS, and voice retrieve policy snippets before the agent replies.
Memory can answer warranty rules, safety procedures, escalation policy, and operational FAQs. Pricing still flows through quote_price, and service window availability still flows through check_availability.
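The retrieval step described above can be approximated with plain keyword overlap. The following is a minimal sketch under stated assumptions: the `MemoryChunk` shape, the tokenizer, and the scoring rule are all hypothetical, and the real app reads chunks from D1 rather than from an in-memory array. Only the tenant scoping, the enabled flag, and the cap of four snippets come from the description above.

```typescript
// Hypothetical in-memory stand-in for rows in agent_memory_chunks.
interface MemoryChunk {
  businessId: string;
  enabled: boolean;
  text: string;
}

// The page states that up to four enabled snippets are injected per request.
const MAX_SNIPPETS = 4;

// Lowercases, splits on non-alphanumerics, and drops very short tokens.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 2);
}

// Scores each enabled chunk for one tenant by the number of distinct
// query terms it contains, then keeps the best MAX_SNIPPETS.
function retrieveSnippets(
  chunks: MemoryChunk[],
  businessId: string,
  query: string
): string[] {
  const queryTerms = tokenize(query).filter(
    (term, index, all) => all.indexOf(term) === index
  );
  return chunks
    .filter((chunk) => chunk.enabled && chunk.businessId === businessId)
    .map((chunk) => {
      const chunkTerms = tokenize(chunk.text);
      const score = queryTerms.filter(
        (term) => chunkTerms.indexOf(term) !== -1
      ).length;
      return { text: chunk.text, score };
    })
    .filter((scored) => scored.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, MAX_SNIPPETS)
    .map((scored) => scored.text);
}
```

Snippets returned this way are context only. Prices and availability would still come from the `quote_price` and `check_availability` tools.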
Before Kimi K2.6 answers, the Worker retrieves relevant enabled memory chunks. The model still calls these tools via native function-calling for operational state and customer commitments.
Queries D1 for matching service windows with date-overlap exclusion and job dimension filtering.
Deterministic pricing engine: callout + service + urgency + after-hours/date + travel zone + add-ons. Never LLM-generated.
Demo work-order placeholder today. R2 PDF storage and e-signature envelopes are roadmap.
Payment-pending response today. Stripe Checkout is roadmap.
Writes confirmed job to D1 and returns a demo ServiceM8 reference. External PMS sync is roadmap.
A sixth tool the agent can call at any point to route the conversation to a human. It is triggered automatically when confidence drops below the threshold, the dollar cap is exceeded, or the maximum number of turns is reached.
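The three automatic escalation triggers lend themselves to a small pure check that runs before each reply. The sketch below is illustrative only: the type names, field names, and limit values are assumptions, and only the three conditions themselves come from the description above.

```typescript
// Hypothetical per-business limits; real values would live in agent_configs.
interface EscalationLimits {
  minConfidence: number; // escalate when confidence falls below this
  dollarCap: number;     // escalate when the quoted total exceeds this
  maxTurns: number;      // escalate once this many turns have been used
}

// Hypothetical snapshot of the conversation at the current turn.
interface TurnState {
  confidence: number;
  quotedTotal: number;
  turns: number;
}

type EscalationReason = "low_confidence" | "dollar_cap" | "max_turns";

// Returns the first trigger that fires, or null when the agent may continue.
function escalationReason(
  state: TurnState,
  limits: EscalationLimits
): EscalationReason | null {
  if (state.confidence < limits.minConfidence) return "low_confidence";
  if (state.quotedTotal > limits.dollarCap) return "dollar_cap";
  if (state.turns >= limits.maxTurns) return "max_turns";
  return null;
}
```

Keeping this check outside the model means a handoff does not depend on the model choosing to call the escalation tool.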
9 tables, all scoped by business_id for strict multi-tenant isolation.
businesses
Tenant root table. One row per business.
availability_windows
Technician service-window inventory with category, zone, and notes.
rate_cards
Pricing configuration with deterministic JSON curves.
agent_configs
Per-business AI agent personality and guardrails.
agent_memory_documents
Uploaded or pasted staff knowledge converted to Markdown.
agent_memory_chunks
Retrieval chunks automatically injected into Kimi context.
inquiries
Every inbound interaction across all channels.
jobs
Confirmed jobs with field-service sync status.
events
Full audit trail for every action the agent takes.
Every live D1 query and job/inquiry/event path is scoped by business_id. Roadmap storage surfaces like R2, Vectorize, Queues, and Durable Objects should keep the same tenant prefix rule when added.
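To make the scoping rule concrete, here is a minimal in-memory sketch of a tenant-scoped availability lookup with date-overlap exclusion. It is not the app's D1 query. The field names are hypothetical, and reading "job dimension filtering" as a match on category and zone is an assumption.

```typescript
// Hypothetical row shapes; times are epoch milliseconds.
interface ServiceWindow {
  id: string;
  businessId: string;
  category: string;
  zone: string;
  start: number;
  end: number;
}

interface BookedJob {
  businessId: string;
  start: number;
  end: number;
}

// Half-open intervals [start, end) overlap when each starts before the
// other ends, so back-to-back bookings do not collide.
function intervalsOverlap(
  aStart: number,
  aEnd: number,
  bStart: number,
  bEnd: number
): boolean {
  return aStart < bEnd && bStart < aEnd;
}

// Every filter starts from businessId, mirroring the tenant rule above.
function openWindows(
  windows: ServiceWindow[],
  jobs: BookedJob[],
  businessId: string,
  category: string,
  zone: string
): ServiceWindow[] {
  const tenantJobs = jobs.filter((job) => job.businessId === businessId);
  return windows.filter(
    (w) =>
      w.businessId === businessId &&
      w.category === category &&
      w.zone === zone &&
      !tenantJobs.some((job) =>
        intervalsOverlap(w.start, w.end, job.start, job.end)
      )
  );
}
```

In SQL the same idea would likely appear as a `business_id = ?` predicate on every table in the query, combined with a `NOT EXISTS` subquery for overlapping jobs.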
The LLM never generates prices. Every dollar amount comes from this formula, executed deterministically on the Worker.
LLMs are great at conversation but unreliable at arithmetic. A hallucinated price creates legal liability and erodes customer trust. By running pricing as a pure function on the Worker, the agent can confidently quote exact rates that match your published rate card.
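A pure pricing function of this kind might look like the sketch below, which sums the components listed for the pricing engine (callout, service, urgency, after-hours, travel zone, add-ons). The rate card shape, the key names, and the use of integer cents are assumptions for illustration. The real rate card is stored as JSON in `rate_cards`, and its schema is not shown here.

```typescript
// Hypothetical rate card; all amounts are integer cents to avoid float drift.
interface RateCard {
  calloutCents: number;
  serviceCents: Record<string, number>;
  urgencyCents: Record<string, number>;
  afterHoursCents: number;
  travelZoneCents: Record<string, number>;
  addOnCents: Record<string, number>;
}

interface QuoteRequest {
  service: string;
  urgency: string;
  afterHours: boolean;
  travelZone: string;
  addOns: string[];
}

interface Quote {
  lines: { label: string; cents: number }[];
  totalCents: number;
}

// Fails loudly on unknown keys instead of guessing a price.
function rateFor(table: Record<string, number>, key: string, kind: string): number {
  if (!Object.prototype.hasOwnProperty.call(table, key)) {
    throw new Error("Unknown " + kind + ": " + key);
  }
  return table[key] as number;
}

// Same inputs always produce the same quote: no randomness, no model call.
function quotePrice(card: RateCard, request: QuoteRequest): Quote {
  const lines = [
    { label: "callout", cents: card.calloutCents },
    { label: "service", cents: rateFor(card.serviceCents, request.service, "service") },
    { label: "urgency", cents: rateFor(card.urgencyCents, request.urgency, "urgency") },
    { label: "after_hours", cents: request.afterHours ? card.afterHoursCents : 0 },
    { label: "travel_zone", cents: rateFor(card.travelZoneCents, request.travelZone, "travel zone") }
  ];
  for (const addOn of request.addOns) {
    lines.push({ label: "add_on:" + addOn, cents: rateFor(card.addOnCents, addOn, "add-on") });
  }
  const totalCents = lines.reduce((sum, line) => sum + line.cents, 0);
  return { lines, totalCents };
}
```

Returning the line items alongside the total lets the agent explain a quote from the same numbers that get written to the audit trail.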
Production AI needs more than vibes. These are hard constraints, not suggestions.
The current voice path uses Twilio webhooks and speech transcripts. Full Media Streams plus Whisper is the production evolution.
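For the webhook-based path, the Worker's reply to Twilio is a TwiML document. The sketch below shows one way such a reply could be built, assuming the agent's text is spoken with `<Say>` inside a `<Gather input="speech">` so that the caller's next utterance is posted back. The function names and the action path are hypothetical, while the TwiML verbs and attributes are standard Twilio ones.

```typescript
// Escapes text for safe inclusion in XML element content and attributes.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Builds a TwiML reply that speaks the agent's text and then listens for
// speech, posting the transcript (SpeechResult) to actionPath.
function gatherTwiml(replyText: string, actionPath: string): string {
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    "<Response>" +
    '<Gather input="speech" action="' + escapeXml(actionPath) + '" method="POST">' +
    "<Say>" + escapeXml(replyText) + "</Say>" +
    "</Gather>" +
    "</Response>"
  );
}

// Example with a hypothetical route path:
const exampleTwiml = gatherTwiml(
  "Thanks, what suburb is the job in?",
  "/webhooks/voice/gather"
);
```

Escaping matters here because model output can contain `&` or `<`, and unescaped characters would produce invalid XML that Twilio cannot parse.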
Customer calls demo number. Twilio posts Gather speech results to the Worker. (Live)
Hono route validates shape, applies guardrails, and preserves XML response semantics. (Live)
Transcript → agent reasoning → tool calls → response text. (Live)
Inquiry, transcript, channel metadata, and tool events are persisted. (Live)
When configured, short replies are rendered as audio for Twilio.
Media Streams plus Workers AI speech-to-text for raw audio is roadmap. (Roadmap)
Everything that powers Tradie Front Office AI, in one table.
| Layer | Technology | Purpose |
|---|---|---|
| Framework | Hono 4 | Lightweight, fast web framework for Workers |
| Build | Vite + @hono/vite-build | SSR bundle for Cloudflare Pages |
| Runtime | Cloudflare Workers | V8 isolates at 300+ global PoPs |
| LLM | Kimi K2.6 (MoE) | Reasoning + native function-calling |
| STT | Twilio SpeechResult live; Whisper roadmap | Transcript input now, raw audio STT later |
| TTS | ElevenLabs / MeloTTS | Natural voice synthesis |
| Database | Cloudflare D1 (SQLite) | Relational data, multi-tenant |
| Agent Memory | D1 chunks + Workers AI Markdown Conversion | PDF/text policy retrieval |
| KV Store | Cloudflare Workers KV | Request caps, payload caps, idempotency, and edge config |
| Object Storage | Cloudflare R2 (roadmap) | Contracts, recordings, attachments |
| Vector DB | Cloudflare Vectorize (roadmap) | Semantic RAG upgrade for larger corpora |
| Sessions | Durable Objects (roadmap) | Stateful multi-turn agent sessions |
| Gateway | Cloudflare AI Gateway (roadmap) | LLM caching, rate limits, fallback |
| Queues | Cloudflare Queues (roadmap) | Async PMS sync, notifications |
| Auth | Google OAuth 2.0 + JWT | SSO for dashboard with session cookies |
| Voice | Twilio webhooks live; Media Streams roadmap | Telephony ingress/egress |
| Email | Cloudflare Email Service + Resend fallback | Native inbound/outbound email path |
| SMS | Twilio Messaging | Text message channel |
| Payments | Stripe Checkout (roadmap) | Customer payment collection |
| Contracts | DocuSign (roadmap) | E-signature for work orders and service agreements |
| PMS | ServiceM8 API (roadmap) | Field-service job management sync |
| Alerts | Slack API (roadmap) | Staff notifications & escalations |
| Frontend | Tailwind CSS + Space Grotesk | Utility-first styling, Dispatch Intelligence theme |
| Language | TypeScript (ES2022 target) | Type-safe Workers code |
Open the dashboard, run the live-channel demo, and inspect the API readiness flags.