Architecture

Why System Design Determines Whether Intelligence Compounds

Whether intelligence compounds or resets is determined by system design, not by model quality alone. Systems that fail to preserve context, assumptions, and decision lineage degrade over time, regardless of how capable individual components may be.



How Accordia Works — Technical Architecture (Canonical)

Architectural Premise

Accordia is built on the premise that analytical reliability is a systems problem, not a prompt or model problem.

Most AI systems fail under organizational use because they:

  • treat documents as flat text,
  • treat memory as conversational history,
  • conflate retrieval with reasoning,
  • and rely on prompt construction to compensate for missing structure.

Accordia instead decomposes intelligence into explicit, inspectable system layers that jointly determine analytical depth, recall, and explainability.


1. Ingestion Pipeline: From Raw Text to Governed Context

1.1 Quality Gating and Signal Filtering

Before any semantic processing, documents pass through a two-tier quality assessment pipeline:

Tier 1 (fast heuristics)

  • Shannon entropy scoring (information density)
  • Alphabetic and symbol ratios
  • Language detection

Tier 2 (model-based validation)

  • Perplexity scoring using lightweight or transformer language models
  • Adaptive thresholds by document type (prose vs technical)

Low-quality, corrupted, or metadata-heavy text is filtered or repaired before indexing.

Why this matters:
Retrieval precision is bounded by ingestion quality. No downstream retrieval or ranking can compensate for polluted context.
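As a rough illustration of the Tier 1 heuristics, a minimal gate might combine Shannon entropy with an alphabetic-character ratio. The threshold values below are illustrative assumptions, not Accordia's actual settings:

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Bits per character; low values indicate repetitive, low-signal text."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def passes_tier1(text: str,
                 min_entropy: float = 3.0,
                 min_alpha_ratio: float = 0.6) -> bool:
    """Fast heuristic gate: information density plus alphabetic ratio."""
    if not text:
        return False
    alpha_ratio = sum(ch.isalpha() or ch.isspace() for ch in text) / len(text)
    return shannon_entropy(text) >= min_entropy and alpha_ratio >= min_alpha_ratio
```

Documents failing this gate would either be repaired or skipped before the more expensive Tier 2 perplexity check runs.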


1.2 Semantic Chunking (Meaning-Preserving Segmentation)

Documents are segmented using embedding-based semantic chunking, not fixed token or character windows.

Process:

  1. Split text into sentences
  2. Generate embeddings per sentence
  3. Slide a window over sentence groups
  4. Detect semantic similarity drops (cosine distance)
  5. Insert boundaries where topic shifts occur

Enhancements:

  • Gradient-based boundary detection (not static thresholds)
  • Low-density start detection (TOC / index stripping)
  • Adaptive chunk sizing based on information density

Result: Chunks represent ideas, not storage artifacts.
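The five steps above can be sketched as follows. The bag-of-words `toy_embed` is a stand-in for a real sentence-embedding model, and the fixed similarity threshold is a simplification of the gradient-based boundary detection described above:

```python
import math
from collections import Counter

def toy_embed(sentence: str) -> Counter:
    # Stand-in for a real sentence-embedding model: bag-of-words counts.
    return Counter(sentence.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_chunks(sentences, boundary_threshold=0.1):
    """Insert a boundary wherever adjacent-sentence similarity drops below threshold."""
    if not sentences:
        return []
    embs = [toy_embed(s) for s in sentences]
    chunks, current = [], [sentences[0]]
    for prev, cur, sent in zip(embs, embs[1:], sentences[1:]):
        if cosine(prev, cur) < boundary_threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
    chunks.append(" ".join(current))
    return chunks
```

With real embeddings, the similarity drop between topically unrelated sentences is what places the chunk boundary at a genuine topic shift.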


1.3 Content-Rich Sentence Classification

Within each chunk, sentences are scored for content richness using multiple signals:

  • entropy (information density)
  • sentence length and structure
  • verb presence and punctuation complexity
  • stop-word ratios
  • position-based weighting (secondary)

Only content-rich sentences are emphasized during:

  • keyphrase extraction
  • contextualization
  • embedding generation

This prevents TOC, headers, and metadata from dominating semantic representations.
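A simplified version of such a multi-signal richness score might look like this; the weights and the tiny stop-word list are illustrative assumptions, not the production values:

```python
import math
from collections import Counter

STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "on", "for"}

def richness_score(sentence: str) -> float:
    """Blend entropy, length, and stop-word signals into one richness score."""
    words = sentence.lower().split()
    if not words:
        return 0.0
    counts = Counter(sentence)
    total = len(sentence)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    stop_ratio = sum(w.strip(".,") in STOP_WORDS for w in words) / len(words)
    length_signal = min(len(words) / 20.0, 1.0)  # saturate at ~20 words
    return 0.5 * (entropy / 5.0) + 0.3 * length_signal + 0.2 * (1.0 - stop_ratio)
```

A TOC line scores low on both entropy and length, so it is de-emphasized during keyphrase extraction and embedding generation.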


1.4 Ingestion-Stage RAG Optimizations (Extension)

To ensure ingestion does not silently degrade downstream reasoning:

  • Near-duplicate detection (LSH-based) prevents semantic collapse from repeated content
  • Structural preservation retains document boundaries and section order
  • Canonical representations normalize heterogeneous formats (PDF, DOC, HTML, transcripts)

These controls ensure ingestion produces a high-signal analytical substrate, not a raw text archive.
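The near-duplicate check can be sketched with word shingles and a MinHash signature. A real LSH index would additionally bucket these signatures into bands for sub-linear lookup, which is omitted here:

```python
import hashlib

def shingles(text: str, k: int = 4):
    """Overlapping k-word shingles as the unit of comparison."""
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + k]) for i in range(max(len(tokens) - k + 1, 1))}

def minhash_signature(shingle_set, num_hashes: int = 64):
    """One min-hash value per seeded hash function."""
    return [
        min(int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)
            for s in shingle_set)
        for seed in range(num_hashes)
    ]

def estimated_jaccard(sig_a, sig_b) -> float:
    """Fraction of matching signature positions estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Chunks whose estimated similarity exceeds a dedup threshold would be collapsed to a single canonical copy before indexing.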


2. Contextualization: Making Chunks Self-Situating

2.1 Context Prefixing

Each semantic chunk is augmented with a context prefix generated at ingestion time.

The prefix:

  • situates the chunk within the source document
  • captures surrounding thematic scope
  • preserves section-level intent

Contextualized chunks are used consistently for:

  • embeddings
  • lexical indexing (BM25)

This allows each chunk to be retrieved independently without reconstructing document context at query time.
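As a minimal sketch, a contextualized chunk might carry its prefix alongside its text; the field names here are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class ContextualizedChunk:
    document: str
    section: str
    text: str

    def indexable_text(self) -> str:
        # The prefix situates the chunk so it can be retrieved in isolation.
        prefix = f"[Document: {self.document} | Section: {self.section}]"
        return f"{prefix} {self.text}"
```

The same `indexable_text` output would feed both the embedding model and the BM25 index, keeping the two retrieval paths consistent.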


2.2 Contextual Retrieval Design

Contextualization ensures:

  • semantic retrieval remains stable as documents grow
  • long documents do not dominate similarity scores
  • retrieval precision does not degrade with scale

This aligns retrieval quality with document meaning rather than document length.


2.3 Long-Context Stability Improvements

Instead of expanding prompt context arbitrarily:

  • document-level meaning is compressed into contextual prefixes
  • retrieval selects meaningful units, not token windows
  • long-context inference is avoided unless explicitly required

This stabilizes reasoning and reduces hallucination pressure.


3. Embedding and Retrieval Layer

3.1 Label-Aware Embeddings

Accordia uses label-aware embeddings to align query and document representations:

  • Queries:
    search_query: {text}

  • Documents / chunks:
    search_document: {content + headings + context prefix}

This leverages the embedding model's training-time task labels, so query and document similarity scores are semantically aligned without any change to the retrieval scoring itself.
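Under this convention, the query and document sides are labeled asymmetrically before embedding. The helper names below are illustrative; the exact label strings follow the embedding model's documented format:

```python
def embed_input_for_query(text: str) -> str:
    # Query-side task label, matching the model's training-time convention.
    return f"search_query: {text}"

def embed_input_for_document(content: str, headings: str = "", context_prefix: str = "") -> str:
    # Document-side label; headings and context prefix travel with the content.
    parts = [p for p in (context_prefix, headings, content) if p]
    return "search_document: " + " ".join(parts)
```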


3.2 Multi-Query Expansion

Each user query is expanded into multiple semantic variants:

  • raw question
  • keyword-focused form
  • entity-anchored form
  • section-level abstraction

Each variant is embedded and retrieved independently.
Results are merged using reciprocal rank fusion (RRF).

Effect: higher recall without sacrificing precision.
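Reciprocal rank fusion itself is compact: with the conventional smoothing constant k = 60, each document's fused score sums 1/(k + rank) across the per-variant result lists:

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k: int = 60):
    """Fuse ranked lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked near the top by several variants outranks one ranked first by only a single variant, which is why expansion raises recall without flooding the final list with noise.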


3.3 Hybrid Retrieval Stack

Accordia combines retrieval signals from:

  • vector similarity search
  • BM25 lexical matching
  • pattern matching (IDs, codes, exact strings)
  • metadata and scope filters

Signals are fused and re-ranked so no single retrieval mode dominates.
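One simple way to fuse heterogeneous signals is to normalize each score set to [0, 1] and take a weighted sum; the weights below are illustrative, and a production system might use RRF or a learned re-ranker instead:

```python
def normalize(scores: dict) -> dict:
    """Min-max normalize so no retrieval mode dominates by raw scale."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {d: (s - lo) / span for d, s in scores.items()}

def hybrid_scores(vector: dict, bm25: dict, pattern_hits: set,
                  weights=(0.5, 0.35, 0.15)) -> dict:
    """Weighted fusion of vector, lexical, and exact-match signals."""
    wv, wb, wp = weights
    nv, nb = normalize(vector), normalize(bm25)
    docs = set(vector) | set(bm25) | set(pattern_hits)
    return {d: wv * nv.get(d, 0.0)
               + wb * nb.get(d, 0.0)
               + wp * (1.0 if d in pattern_hits else 0.0)
            for d in docs}
```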


3.4 REFRAG: Retrieval-Time Context Control

REFRAG operates as a retrieval-time optimization, not an ingestion shortcut.

Stages:

  1. Compress — retrieved chunks are compactly represented
  2. Sense — micro-units are scored for utility (semantic, structural, diversity signals)
  3. Expand — only high-utility fragments are expanded to full text

This increases effective context capacity while controlling token cost and latency.
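A highly simplified sketch of the three stages, using first-sentence truncation as the compressed representation and query-term overlap as the utility signal; both are stand-ins for the real compression and scoring:

```python
def compress(chunk: str) -> str:
    # Stage 1: represent each retrieved chunk compactly (here, its first sentence).
    return chunk.split(".")[0]

def sense(compressed: str, query_terms: set) -> float:
    # Stage 2: cheap utility score computed on the compressed form only.
    words = set(compressed.lower().split())
    return len(words & query_terms) / max(len(query_terms), 1)

def expand(chunks: list, query_terms: set, budget: int = 2) -> list:
    # Stage 3: expand only the highest-utility chunks to full text.
    ranked = sorted(chunks, key=lambda c: sense(compress(c), query_terms), reverse=True)
    return ranked[:budget]
```

Because scoring happens on compressed forms and only the winners are expanded, the prompt carries far fewer tokens than naively concatenating every retrieved chunk.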


4. Memory Model: Persistent, Scoped, Curated

Accordia does not rely on conversational memory.

Instead, it maintains workstream-scoped memory that persists:

  • analytical intents
  • synthesized conclusions
  • assumptions and constraints
  • source linkages
  • refinement history

Memory is:

  • explicitly written
  • selectively retained
  • structurally linked

It is not an append-only chat log.
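A schematic of such a memory store, with explicit writes and supersession links rather than an append-only log; the entry kinds and field names here are hypothetical:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class MemoryEntry:
    workstream: str
    kind: str                 # e.g. "intent" | "conclusion" | "assumption"
    content: str
    sources: list = field(default_factory=list)
    supersedes: Optional[str] = None  # refinement history, not a chat transcript

class WorkstreamMemory:
    def __init__(self):
        self._entries = {}

    def write(self, entry_id: str, entry: MemoryEntry):
        # Explicit writes only; nothing is persisted implicitly.
        self._entries[entry_id] = entry

    def active(self, workstream: str) -> list:
        # Superseded entries remain for lineage but are excluded from active recall.
        superseded = {e.supersedes for e in self._entries.values() if e.supersedes}
        return [eid for eid, e in self._entries.items()
                if e.workstream == workstream and eid not in superseded]
```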


5. Workflow-Aware Execution

All reasoning occurs inside explicit workflows that define:

  • scope boundaries
  • expected artifacts
  • persistence rules
  • reviewability requirements

Outputs are automatically:

  • versioned
  • attached to workstream memory
  • linked to upstream evidence

This converts reasoning into institutional capability, not ephemeral assistance.
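A minimal sketch of workflow-scoped artifact publishing, with automatic versioning and evidence links; the class and field names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Artifact:
    name: str
    version: int
    content: str
    evidence_ids: list = field(default_factory=list)  # links to upstream sources

class Workflow:
    def __init__(self, scope: str):
        self.scope = scope
        self._versions = {}  # artifact name -> list of versions

    def publish(self, name: str, content: str, evidence_ids: list) -> Artifact:
        # Every output is versioned and linked to its evidence, never overwritten.
        history = self._versions.setdefault(name, [])
        history.append(Artifact(name, len(history) + 1, content, evidence_ids))
        return history[-1]
```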


How the System Preserves Context Across Decisions

Preventing intelligence from resetting requires context to persist beyond individual interactions. This depends on the explicit architectural mechanisms described above: quality-gated ingestion, self-situating contextualization, traceable hybrid retrieval, curated workstream memory, and workflow-aware execution that preserves decision lineage.