We don't just tell clients to adopt AI-powered business systems. We run on one.
Arche — the platform that powers HW2 Technologies — is a fully integrated business operating system we built and deployed on our own infrastructure. It handles our CRM, ERP, accounting, email, project management, and AI operations. Every client interaction, invoice, and internal task flows through it. No SaaS. No cloud lock-in. Full data sovereignty.
This is the story of how we built it, what it took, and what we learned.
The Problem We Were Solving
In 2023, we looked at the software landscape for professional services firms and saw the same pattern everywhere: dozens of disconnected SaaS tools, each with its own data model, its own pricing, its own vendor lock-in. CRM in one place, accounting in another, email scattered across inboxes, project status living in someone's head.
The cost wasn't just the subscription fees. It was the integration tax — the hours spent moving data between systems, the decisions made without complete information, the compliance risk of client data scattered across fifteen different cloud vendors.
We wanted to build something different: a single system where all business data lives in one place, where AI can act on complete context, and where you own your infrastructure outright.
The constraint we set for ourselves: it had to run entirely on-premises. No AWS. No Google Cloud. No hosted Supabase. Every compute cycle, every byte of storage, every AI inference — on hardware we control.
What We Built
Over 29 months, we built Arche from first principles.
The data layer is a single PostgreSQL database with a JSONB-based record store. Every business object — contact, invoice, email, task, journal entry — is a record with a typed context blob. This sounds unusual, but it gives us something SaaS platforms can't: a unified query surface across every entity type, and the ability to add new entity types without schema migrations.
The entity system defines 67 distinct object types, each with field definitions, validation rules, lifecycle hooks, and AI configuration. When we add a new business concept — say, a grant tracking entity for a client — we define it once and the full CRUD, search, embedding, and classification pipeline lights up automatically.
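A hedged sketch of what declaring an entity type once can look like. `define_entity`, `validate`, and the `grant` fields here are hypothetical stand-ins for Arche's real definitions, but they show the rule that the definition — not the database — is the schema:

```python
from typing import Any

# Hypothetical registry: one definition per entity type drives the whole pipeline.
ENTITY_TYPES: dict[str, dict[str, Any]] = {}

def define_entity(name: str, fields: dict[str, type],
                  embed_fields: tuple[str, ...] = ()) -> None:
    """Declare a new business concept once; CRUD, search, and embedding reuse it."""
    ENTITY_TYPES[name] = {"fields": fields, "embed_fields": embed_fields}

# Adding a grant-tracking entity for a client is one declaration, not a migration.
define_entity(
    "grant",
    fields={"title": str, "amount": float, "deadline": str},
    embed_fields=("title",),   # which text fields feed the embedding pipeline
)

def validate(entity_type: str, data: dict[str, Any]) -> list[str]:
    """Check a payload against its definition; returns a list of problems.
    Unknown fields are rejected: if it isn't in the definition, it doesn't exist."""
    spec = ENTITY_TYPES[entity_type]["fields"]
    errors = [f"unknown field: {k}" for k in data if k not in spec]
    errors += [f"bad type for {k}" for k, t in spec.items()
               if k in data and not isinstance(data[k], t)]
    return errors
```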
The API layer has 73 route handlers covering every data operation. Every route enforces company isolation, runs a five-minister AI validation council before saving, and fires non-blocking embedding and classification jobs after saving.
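The shape of that write path can be sketched like this — every name is illustrative, and the validation council is reduced to a stub — but the ordering is the point: isolate, validate, persist, then fire background jobs:

```python
import threading
from typing import Any

DB: list[dict[str, Any]] = []   # stand-in for the PostgreSQL record store

def run_validation_council(record: dict[str, Any]) -> list[str]:
    """Placeholder for the five-minister AI validation council."""
    return []   # no problems found

def embed_and_classify(record: dict[str, Any]) -> None:
    """Placeholder for the non-blocking embedding and classification jobs."""

def save_record(record: dict[str, Any], session_company_id: int) -> dict[str, Any]:
    """Sketch of the write path: isolate, validate, persist, then fire async jobs."""
    # 1. Company isolation: a record may only be written under the caller's company.
    if record.get("company_id") != session_company_id:
        raise PermissionError("cross-company write rejected")
    # 2. Synchronous validation runs before anything touches the database.
    problems = run_validation_council(record)
    if problems:
        raise ValueError(f"validation failed: {problems}")
    DB.append(record)   # 3. The actual write.
    # 4. AI jobs fire after the save and never block the response.
    threading.Thread(target=embed_and_classify, args=(record,), daemon=True).start()
    return record
```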
The email system processes and threads incoming IMAP mail across multiple accounts, runs AI classification on every message, and surfaces actionable items in the CRM. We've processed 8,490+ emails through this pipeline. Thread resolution handles the full complexity of enterprise email: References headers, In-Reply-To chains, normalized subject matching, and multi-account disambiguation.
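A simplified version of that resolution order — strongest header signal first, normalized subject last — might look like the following; the `threads` map and the message dicts are illustrative, not Arche's internals:

```python
import re

def normalize_subject(subject: str) -> str:
    """Strip reply/forward prefixes so 'Re: Re: Quote' threads with 'Quote'."""
    return re.sub(r"^\s*((re|fwd?)\s*:\s*)+", "", subject, flags=re.I).strip().lower()

def resolve_thread(msg: dict, threads: dict[str, str]) -> str:
    """Return the thread id for a message.
    `threads` maps known Message-IDs and normalized subjects to thread ids."""
    # 1. In-Reply-To is the strongest signal: a direct parent pointer.
    parent = msg.get("in_reply_to")
    if parent and parent in threads:
        return threads[parent]
    # 2. Walk the References chain, newest ancestor first.
    for ref in reversed(msg.get("references", [])):
        if ref in threads:
            return threads[ref]
    # 3. Fall back to normalized-subject matching.
    subj = normalize_subject(msg.get("subject", ""))
    if subj in threads:
        return threads[subj]
    # 4. No match: this message starts a new thread.
    thread_id = msg["message_id"]
    threads[msg["message_id"]] = thread_id
    threads[subj] = thread_id
    return thread_id
```

Multi-account disambiguation would add an account key to the lookup, but the fallback order stays the same.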
The accounting module is a full double-entry ERP: chart of accounts, journal entries, fiscal year management, period closing, payroll, and financial reporting. It enforces balance checks before posting, prevents modifications to closed periods, and generates audit trails automatically.
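The balance check at the heart of that posting guard is small. Here is a hedged sketch — function and field names are illustrative — using `Decimal` so monetary sums are exact:

```python
from decimal import Decimal

def post_journal_entry(lines: list[dict], period_closed: bool) -> None:
    """Enforce double-entry invariants before an entry reaches the ledger."""
    # Closed periods are immutable: no posting, no edits.
    if period_closed:
        raise ValueError("cannot post to a closed period")
    # Double-entry: total debits must equal total credits, exactly.
    debits = sum(Decimal(str(l.get("debit", 0))) for l in lines)
    credits = sum(Decimal(str(l.get("credit", 0))) for l in lines)
    if debits != credits:
        raise ValueError(f"unbalanced entry: debits {debits} != credits {credits}")

# A balanced entry posts; an unbalanced one never touches the ledger.
post_journal_entry(
    [{"account": "1000", "debit": "250.00"},
     {"account": "4000", "credit": "250.00"}],
    period_closed=False,
)
```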
The AI layer runs entirely on a local Ollama instance. Vector embeddings use nomic-embed-text (768 dimensions). Classification uses llama3.1:8b. Every entity type with meaningful text content gets embedded automatically on create and update. Semantic search across the entire business corpus is available on every screen.
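The scoring step behind that search is plain cosine similarity over stored vectors. This sketch stubs out the Ollama embedding call and uses toy 3-dimensional vectors in place of real 768-dimensional nomic-embed-text output; the function names are illustrative:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: the ranking function behind semantic search."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def semantic_search(query_vec: list[float],
                    corpus: dict[str, list[float]], top_k: int = 3) -> list[str]:
    """Rank every embedded record by similarity to the query vector."""
    ranked = sorted(corpus.items(),
                    key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]

# Toy vectors stand in for embeddings fetched from the local model.
corpus = {
    "invoice-17": [0.9, 0.1, 0.0],
    "email-42":   [0.1, 0.9, 0.0],
    "task-3":     [0.0, 0.1, 0.9],
}
```

In production the corpus lives alongside the records, so the same ranking runs over contacts, invoices, and emails at once — that is what "every screen" means.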
The Hard Parts
Self-hosted AI means you own the infrastructure problems too. Embedding pipeline failures can't be retried by a vendor's reliability team — they're your problem. We built every AI operation as fire-and-forget with graceful degradation: if embedding fails, the record saves anyway. The business never stops because an AI job stalled.
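That degradation pattern can be sketched in a few lines — `embed` here is a deliberately failing stand-in for the local model call:

```python
import logging

def embed(record: dict) -> list[float]:
    """Stand-in for the local embedding call; fails as if the model were down."""
    raise ConnectionError("ollama unreachable")

def save_with_optional_embedding(record: dict, db: list) -> dict:
    """The record always saves; a failed AI job degrades to a logged warning."""
    db.append(record)                      # the business write comes first
    try:
        record["embedding"] = embed(record)
    except Exception as exc:               # any AI failure is non-fatal
        logging.warning("embedding failed for %s: %s", record.get("id"), exc)
        record["embedding"] = None         # can be re-embedded by a retry job later
    return record
```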
Data model discipline was harder than expected. When everything is in one table, the temptation to store anything anywhere is constant. We enforced a strict rule: entity definitions are the schema. Not the database. The database has no enforced schema. If a field isn't in the entity definition, it doesn't exist. This is what prevented our data model from degrading into chaos over 29 months of continuous development.
The migration from "working prototype" to "production system we rely on for real business operations" required building things we hadn't planned: concurrency checks, status transition guards, idempotent sync operations, and a five-minister AI validation council that catches data quality issues before they reach the database.
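A status transition guard of the kind described reduces to a small lookup table; the invoice states below are illustrative, not Arche's actual lifecycle:

```python
# Hypothetical status machine for an invoice; only listed transitions are legal.
TRANSITIONS: dict[str, set[str]] = {
    "draft": {"sent"},
    "sent":  {"paid", "void"},
    "paid":  set(),        # terminal
    "void":  set(),        # terminal
}

def transition(current: str, new: str) -> str:
    """Guard a status change; illegal jumps (e.g. draft -> paid) are rejected."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new
```

The value of the table is that every legal path is enumerable — which is also what makes sync operations idempotent: replaying a transition that already happened fails loudly instead of corrupting state.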
What We Learned
Ownership beats convenience, eventually. The first six months of running our own infrastructure were harder than using SaaS. The next 23 months have been easier, cheaper, and more capable than any SaaS alternative could have been.
AI only works on complete data. The reason our AI features are useful — semantic search, entity classification, relationship inference — is that all our data is in one place. You can't build a meaningful AI layer on top of ten disconnected SaaS tools. You need a unified corpus.
The right abstraction is the entity. Everything in a business is an entity with fields, states, relationships, and lifecycle events. When your entire system is built around that abstraction, adding new capabilities is fast. We add new entity types in hours, not weeks.
What This Means for Our Clients
When we propose a self-hosted AI business platform for a law firm or accounting practice, we're not speculating about whether it works. We're showing them the system we run on.
Every architectural pattern, every data model decision, every integration we've built for Arche is available as a reference implementation for client deployments. The 29 months of iteration are already paid for. What clients get is a proven system tailored to their practice, deployed on hardware they control, with no recurring SaaS fees and no data leaving their premises.
That's a different value proposition than anything else in the market.