Internal Framework Processing Diagram

This diagram focuses on the logic inside the NestJS Framework API, not the FE/Admin UI.

Internal NestJS Framework API processing flow

All FE/Admin requests go through this pipeline. Each node shows what data is read/written so Admin can manage, audit, and reuse it.

1. Receive request: chat/action/form/admin; request id + trace. Write: request_logs
2. Authenticate: workspace, user, role; permission guard. Read: users, roles
3. Load config: taxonomy, page-map, prompt/rules, connectors. Read: client_config
4. Normalize context: UTM, page, behavior, session, conversation. Write: sessions/messages
5. Detect intent: unknown_need, compare, demo, asset... Write: intent_trace
6. Slot engine: known / missing; merge without overwriting with empty values. Write: session_slots
7. Case router decides what to call: Need RAG? Need memory? Need custom tool? Are enough slots filled, or should the assistant ask next? Write: case_route, missing_fields
8A. Call Powabase RAG: query approved atoms/chunks; filter: product, industry, painPoint; return sources + citations. Read: atoms/chunks/embeddings
8B. Call Memory Layer: search known user facts (industry, company size, pain point); Mem0 or memory_facts. Read/Write: long-term facts
8C. Call custom tool: ROI, booking, product API; tool registry + timeout; validate input/output. Write: tool_call_logs
8D. Store event/state: chat_turn, slot_filled, conversation_messages, gen_ui_action, session_slots. Write: messages/events/audit
9. Merge data: context + slots + memory + RAG sources + tool result; create a response plan before answering. Write: response_plan, retrieval_trace
10A. Guardrail: No strong enough source? -> low-confidence fallback. Write: content_gaps
10B. Generate Gen UI: answer + citations; block spec + next actions. Write: gen_ui_blocks
10C. Lead/notification: if high intent, handoff via Slack/CRM/email/webhook. Write: leads/notifications
11. Return response: FE/Admin receives a stable payload (message, blocks, state, audit). Write: response_logs

Key point: Powabase, Mem0, and tools are connectors called by NestJS when needed. Framework logic stays in the NestJS pipeline so it can be reused across clients.

How is data stored?

The framework does more than answer chat. Each step creates structured data that Admin can view, edit, audit, and reuse.

Powabase: Knowledge: sources, source_files, raw_chunks, atoms, atom_versions, embeddings, taxonomy_terms, pages
Powabase: Runtime: sessions, conversations, conversation_messages, leads, demo_requests, events, session_slots
Memory Layer: Mem0 or memory_facts; industry / company size, product interest, preference / timeline
Governance / Audit: audit_logs, content_gaps, workflow_tasks, notification_logs
Admin can inspect: source/chunk/atom, retrieval score, lead/event/memory, audit/content gap
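Steps 6 and 7 of the pipeline (slot engine and case router) can be illustrated with a minimal sketch. The type names, the `isEmpty` rule, and the two-value `action` field are assumptions made for illustration; the source only states that slots are merged without overwriting with empty values and that the router records `case_route` and `missing_fields`. The real router also decides whether to call RAG, memory, or a custom tool, which is omitted here.

```typescript
// Hypothetical shapes: the source names session_slots, case_route, and
// missing_fields but does not define them.
type SlotValue = string | number | null | undefined;
type Slots = Record<string, SlotValue>;

// Assumption: null, undefined, and blank strings all count as "empty".
function isEmpty(value: SlotValue): boolean {
  return (
    value === null ||
    value === undefined ||
    (typeof value === "string" && value.trim() === "")
  );
}

// Step 6: merge incoming slots into the known slots. An empty incoming
// value never overwrites a value that is already known.
function mergeSlots(known: Slots, incoming: Slots): Slots {
  const merged: Slots = { ...known };
  for (const key of Object.keys(incoming)) {
    const value = incoming[key];
    if (!isEmpty(value)) {
      merged[key] = value;
    }
  }
  return merged;
}

interface CaseRoute {
  action: "ask_next" | "answer";
  missingFields: string[];
}

// Step 7 (simplified): if any required slot is still empty, ask for it;
// otherwise proceed to answering.
function routeCase(slots: Slots, required: string[]): CaseRoute {
  const missingFields = required.filter((key) => isEmpty(slots[key]));
  return {
    action: missingFields.length > 0 ? "ask_next" : "answer",
    missingFields,
  };
}
```

Keeping both functions pure makes them easy to unit test and leaves the write to session_slots as a separate concern of the calling service.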
Framework Data ERD

In the diagram, relationship lines are routed through dedicated gutters between table clusters, so no table sits on a line.

workspaces: PK id; name, slug, plan, locale, status, created_at
sources: PK id; FK workspace_id; type, title, url, status, owner, uploaded_at
source_files: PK id; FK source_id; storage_path, mime_type, size, checksum, parse_status
source_pages: PK id; FK source_file_id; page_no, ocr_text, image_path, quality_score
raw_chunks: PK id; FK source_page_id; chunk_text, chunk_index, metadata, noise_flag
client_configs: PK id; FK workspace_id; brand, theme, prompt_rules, connector_config
atoms: PK id; FK workspace_id; atom_type, title, content, summary, status, confidence
atom_versions: PK id; FK atom_id; version_no, content_snapshot, changed_by/reason
atom_embeddings: PK id; FK atom_id; embedding vector, model, metadata, indexed_at
atom_chunk_links: PK id; FK atom_id; FK raw_chunk_id; evidence_score, citation_label
users_roles: PK id; FK workspace_id; user_id, role, permissions
taxonomy_terms: PK id; FK workspace_id; type, name, slug, parent_id, synonyms
pages: PK id; FK workspace_id; route_path, page_type, title, context_hint
atom_taxonomy_links: FK atom_id; FK taxonomy_term_id; weight, created_at
page_taxonomy_links: FK page_id; FK taxonomy_term_id; weight, created_at
visitors: PK id; FK workspace_id; anonymous_id, consent_status, merged_lead_id
sessions: PK id; FK workspace_id; visitor_id / lead_id, anonymous_id, utm, current_page
leads: PK id; FK workspace_id; visitor_id (nullable), name, email, phone, status, score, owner
demo_requests: PK id; FK lead_id; preferred_time, solution_interest, handoff_payload
conversations: PK id; FK session_id; channel: web/chat, status, summary, last_message_at
conversation_messages: PK id; FK conversation_id; role, content, citations, tokens, created_at
session_slots: PK id; FK session_id; field_key, value, source, confidence, expires_at
events: PK id; FK session_id; event_type, payload json, created_at
memory_facts: PK id; subject_type/id, fact_key/value, confidence, consent

Main relationship legend: atoms are generic knowledge; pages are route context; visitors handle anonymous users; memory_facts use subject_type/id for visitor, lead, user, or account.
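The evidence relationship in the ERD (atoms linked to raw_chunks through atom_chunk_links) can be sketched as a small in-memory lookup. Column names follow the ERD; the TypeScript types of each column (for example `noise_flag` as a boolean and `evidence_score` as a number), the `Citation` shape, and the choice to skip noisy chunks are assumptions for illustration, not the framework's confirmed behavior.

```typescript
// Row shapes follow the ERD column names; the column types are assumed.
interface RawChunk {
  id: string;
  source_page_id: string;
  chunk_text: string;
  noise_flag: boolean;
}

interface AtomChunkLink {
  id: string;
  atom_id: string;
  raw_chunk_id: string;
  evidence_score: number;
  citation_label: string;
}

// Hypothetical output shape for a citation shown next to an answer.
interface Citation {
  label: string;
  text: string;
  score: number;
}

// Resolve the evidence chunks for one atom through the N:N link table,
// strongest evidence first. Chunks flagged as noisy are skipped (assumption).
function citationsForAtom(
  atomId: string,
  links: AtomChunkLink[],
  chunks: RawChunk[]
): Citation[] {
  const chunkById = new Map<string, RawChunk>();
  for (const chunk of chunks) {
    chunkById.set(chunk.id, chunk);
  }

  const citations: Citation[] = [];
  for (const link of links) {
    if (link.atom_id !== atomId) continue;
    const chunk = chunkById.get(link.raw_chunk_id);
    if (!chunk || chunk.noise_flag) continue;
    citations.push({
      label: link.citation_label,
      text: chunk.chunk_text,
      score: link.evidence_score,
    });
  }
  return citations.sort((a, b) => b.score - a.score);
}
```

In a real deployment this would be a SQL join on atom_chunk_links; the in-memory version only shows which keys connect the tables.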

Clear ERD relationship table

| Group | Relationship | Type | Join key | Meaning |
| --- | --- | --- | --- | --- |
| Tenant | workspaces -> client_configs, sources, users_roles, visitors, sessions, atoms, pages, leads | 1:N | workspace_id | Separates data by client/project and prevents cross-client data mixing. |
| Ingestion | sources -> source_files -> source_pages -> raw_chunks | 1:N chain | source_id, source_file_id, source_page_id | Source documents are split into files, pages, and chunks for OCR, debugging, and indexing. |
| Knowledge | atoms -> atom_versions, atom_embeddings | 1:N | atom_id | One atom has multiple versions and embeddings/indexes for RAG. |
| Evidence | atoms <-> raw_chunks via atom_chunk_links | N:N | atom_id, raw_chunk_id | An atom can have evidence/citations from many chunks; one chunk can support many atoms. |
| Atom classification | atoms <-> taxonomy_terms via atom_taxonomy_links | N:N | atom_id, taxonomy_term_id | An atom can be linked to many labels such as topic, product, industry, persona, and pain point. |
| Page context | pages <-> taxonomy_terms via page_taxonomy_links | N:N | page_id, taxonomy_term_id | Pages/routes are labeled so AI understands the user context. |
| Anonymous user | visitors -> sessions -> conversations -> conversation_messages | 1:N chain | visitor_id, session_id, conversation_id | Anonymous visitors still have sessions and transcripts before becoming leads. |
| Runtime state | sessions -> session_slots, events | 1:N | session_id | Stores known fields, event tracking, Gen UI actions, and RAG queries per session. |
| Lead/demo | leads -> demo_requests | 1:N | lead_id | One lead can have many demo, quote, or handoff requests. |
| Memory | memory_facts -> visitor / lead / user / account | Polymorphic | subject_type, subject_id | One memory table can serve visitors, leads, logged-in users, or company accounts. |
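The polymorphic memory relationship in the last row can be sketched as a lookup keyed by subject_type and subject_id. The column types, the boolean `consent` flag, the 0 to 1 confidence scale, and the default `minConfidence` of 0.5 are all assumptions for illustration; the source only lists the columns subject_type/id, fact_key/value, confidence, and consent.

```typescript
// The four subject kinds named in the relationship table.
type SubjectType = "visitor" | "lead" | "user" | "account";

// Column names follow the ERD; types are assumed.
interface MemoryFact {
  id: string;
  subject_type: SubjectType;
  subject_id: string;
  fact_key: string;
  fact_value: string;
  confidence: number; // assumed to be in the range 0..1
  consent: boolean;   // assumed to be a simple flag
}

// Collect the usable facts for one subject. Because the table is
// polymorphic, BOTH subject_type and subject_id must match: a visitor and
// a lead may share the same id value. For each fact_key the
// highest-confidence consented fact wins.
function factsForSubject(
  facts: MemoryFact[],
  subjectType: SubjectType,
  subjectId: string,
  minConfidence = 0.5 // hypothetical threshold
): Record<string, string> {
  const best = new Map<string, MemoryFact>();
  for (const fact of facts) {
    if (fact.subject_type !== subjectType || fact.subject_id !== subjectId) continue;
    if (!fact.consent || fact.confidence < minConfidence) continue;
    const current = best.get(fact.fact_key);
    if (!current || fact.confidence > current.confidence) {
      best.set(fact.fact_key, fact);
    }
  }
  const result: Record<string, string> = {};
  best.forEach((fact, key) => {
    result[key] = fact.fact_value;
  });
  return result;
}
```

A polymorphic key has no database-level foreign key, so the application code has to enforce that subject_id points at a real row of the stated type.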

Plain-English explanation for each table

| Table | Explanation | When data is created | Why it is required |
| --- | --- | --- | --- |
| workspaces | A workspace is the working area for one client/project. If the framework serves 10 different websites, each website has its own workspace. | Created when onboarding a new client. | Keeps DataSys data separate from other clients; every important table is scoped by workspace_id. |
| client_configs | The per-workspace configuration: brand, language, AI tone, answer rules, lead rules, and CRM/Slack/email connector settings. | Created during client setup and updated when rules, prompts, connectors, or policies change. | Keeps the core framework reusable, without hardcoding client-specific logic into code. |
| users_roles | The list of internal Admin users and their permissions, such as document uploaders, knowledge reviewers, and sales users who view leads. | Created when inviting users into Admin or changing permissions. | Controls who can upload, edit, approve, publish, reindex, or view customer data. |
| sources | The source record for original material. It represents an information source such as a PDF, old website, FAQ, slide deck, pricing sheet, or contract template. | Created when Admin uploads/imports a new data source. | Shows which document an AI answer came from, whether it is still valid, who uploaded it, and whether it is approved. |
| source_files | A physical file in storage under a source. One source can contain many files, for example a document package with multiple PDFs. | Created when a file is uploaded to storage. | Manages file path, format, checksum, size, parse status, and retries when errors occur. |
| source_pages | Parsed/OCR content by page or section. For example, page 5 of a PDF has its own OCR text. | Created after the system parses files, OCRs PDFs/images, or crawls HTML. | Lets Admin inspect OCR quality and lets AI cite the exact page/section. |
| raw_chunks | Small text segments cut from source_pages for easier retrieval. This is technical data, not the main content Admin should edit directly. | Created after document chunking. | Used to debug RAG: which segment was retrieved, which segment is noisy, and which should be linked to an atom. |
| atoms | An approved, normalized knowledge unit used by AI to answer. An atom is not only a product; it can be FAQ, policy, pricing, process, case study, company information, or technical documentation. | Created from raw chunks or directly written/edited by Admin in AMS. | This is the official source of truth for AI. AI should answer from approved/published atoms instead of unreviewed raw text. |
| atom_versions | The change history of an atom. Each edit stores a version so the previous content is preserved. | Created when an atom is edited, merged, split, approved, or republished. | Supports rollback, audit, old/new comparison, and prevents loss of important content. |
| atom_embeddings | The semantic vector of an atom. It helps the system find the right atom even when the user phrases the question differently. | Created when an atom is indexed or reindexed. | Lets RAG find relevant knowledge by meaning, not only by keyword. |
| atom_chunk_links | A link table connecting atoms to original chunks as evidence. One atom can rely on many chunks, and one chunk can support many atoms. | Created when an atom is generated from documents or when Admin manually links evidence. | Allows AI answers to include citations and lets Admin verify which document the atom came from. |
| taxonomy_terms | A shared taxonomy label set. Labels can represent industry, product, pain point, persona, topic, funnel stage, region, or language. | Created when Admin configures taxonomy or imports it from documents. | Used to filter RAG, understand page context, and personalize answers without hardcoding into atoms/pages. |
| atom_taxonomy_links | A link table attaching atoms to multiple taxonomy labels. For example, one atom can belong to ERP, manufacturing, and inventory. | Created when Admin tags an atom or when the system auto-tags after ingestion. | Lets one atom be reused in many contexts without duplicating data. |
| pages | A record describing a website route/page. It does not store full page HTML; it stores context so AI understands where the user is. | Created when the website has a new route, landing page, or campaign page. | Helps AI know the page topic, which atoms to prioritize, and which form/CTA to ask next. |
| page_taxonomy_links | A link table attaching a page to multiple taxonomy labels. For example, an ERP manufacturing page maps to manufacturing, ERP, and inventory_accuracy. | Created when Admin maps a route to industries/topics or when the system auto-maps it. | Lets AI understand context as soon as a user lands on the page, without asking again for industry/product if the page already shows it. |
| visitors | An anonymous visitor whose identity is not known yet. The system identifies them by anonymous_id in cookie/localStorage, without requiring email at first. | Created when a new browser visits the website. | Keeps anonymous visitor context, tracks consent, and merges into a lead when the visitor leaves contact information. |
| sessions | A visit/interaction session for a visitor/lead/user. One visitor can have many sessions across multiple returns to the website. | Created when a user starts a new visit or interaction. | Groups current page, UTM, campaign, known slots, events, and conversations within one visit. |
| conversations | A specific conversation within a session. One session can have one or more conversations depending on UI design. | Created when the user opens chat or sends the first message. | Manages conversation status, summary, handoff, and transcript. |
| conversation_messages | Each line in the conversation: user question, assistant answer, tool API call, or RAG source result. | Created every time there is a message or tool result. | Stores the full transcript, debugs answers, extracts memory/lead insights, and audits tokens/citations. |
| session_slots | Known fields within the session, such as industry, company size, need, timeline, and budget. | Created/updated when the user speaks in chat, clicks Gen UI, or fills a form. | Prevents repeated questions and lets forms automatically hide fields that are already known. |
| events | Runtime behavior and event log. An event is not necessarily a message; it is any action/state worth recording. | Created for page views, CTA clicks, chat turns, Gen UI actions, RAG queries, and CRM syncs. | Used for funnel analysis, lead scoring, flow debugging, and action audit. |
| leads | A profile for a potential customer. A visitor becomes a lead when contact information or clear buying intent exists. | Created when the user leaves email/phone, books a demo, requests a quote, or reaches a high-intent score. | Enables sales follow-up, CRM sync, summaries, owner assignment, and sales status tracking. |
| demo_requests | A specific request from a lead for a demo, consultation, quote, or contact. | Created when the user clicks book demo, submits a consultation form, or AI confirms a demo need. | Used to send Slack/email/CRM notifications and manage schedule and sales handling status. |
| memory_facts | Internal long-term memory fallback when Mem0 is not used. Each record is a memorable fact about a visitor, lead, user, or account. | Created when the system extracts stable information such as industry, company size, pain point, or preference, and proper consent exists. | Lets AI avoid asking again next time and personalize consultation based on long-term history. |
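The visitors, sessions, and leads rows above describe an anonymous visitor being merged into a lead once contact information is provided. A minimal sketch of that merge follows. Column names come from the ERD; the column types, the `ContactInfo` shape, the initial lead status of "new", and the caller-supplied `newLeadId` are hypothetical, and deduplication against existing leads is out of scope here.

```typescript
// Row shapes follow the ERD column names; types are assumed.
interface Visitor {
  id: string;
  workspace_id: string;
  anonymous_id: string;
  consent_status: string;
  merged_lead_id: string | null;
}

interface Session {
  id: string;
  workspace_id: string;
  visitor_id: string | null;
  lead_id: string | null;
}

interface Lead {
  id: string;
  workspace_id: string;
  visitor_id: string | null;
  email: string;
  status: string;
}

// Hypothetical input: the contact details the visitor just submitted.
interface ContactInfo {
  email: string;
}

interface MergeResult {
  visitor: Visitor;
  lead: Lead;
  sessions: Session[];
}

// Create a lead for the visitor, point the visitor at it through
// merged_lead_id, and attach the visitor's sessions to the new lead.
// Returns updated copies; the inputs are not mutated.
function mergeVisitorIntoLead(
  visitor: Visitor,
  sessions: Session[],
  contact: ContactInfo,
  newLeadId: string
): MergeResult {
  const lead: Lead = {
    id: newLeadId,
    workspace_id: visitor.workspace_id,
    visitor_id: visitor.id,
    email: contact.email,
    status: "new", // assumed initial status
  };
  const updatedSessions = sessions.map((session) =>
    // Only touch sessions of this visitor inside the same workspace.
    session.workspace_id === visitor.workspace_id && session.visitor_id === visitor.id
      ? { ...session, lead_id: newLeadId }
      : session
  );
  return {
    visitor: { ...visitor, merged_lead_id: newLeadId },
    lead,
    sessions: updatedSessions,
  };
}
```

Because earlier sessions keep their visitor_id and only gain a lead_id, the transcript history from the anonymous phase stays attached to the new lead.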

Complete data table dictionary

| Table | Description | Example data | How AI/Admin uses it |
| --- | --- | --- | --- |
| workspaces | Tenant/project root. Each customer or project using the framework has its own workspace. | datasys, education_client, real_estate_client | Separates data, config, knowledge, users, and leads by client. |
| client_configs | Per-workspace configuration: brand, theme, prompt rules, connectors, and policies. | Consulting tone, language, CRM connector, and rules for not answering without sources. | NestJS loads config per client so the same core can run many projects. |
| users_roles | Internal users and permissions in Admin/AMS. | admin, editor, reviewer, sales, viewer | Controls permission to upload, edit atoms, approve, reindex, view leads, or audit. |
| sources | Original source documents imported into the system by Admin. | Company profile PDF, old website, FAQ, slides, transcript, pricing sheet. | Tracks which source created which knowledge; supports audit and re-parse when documents change. |
| source_files | Physical source files in storage with technical metadata. | Storage path, MIME type, checksum, and parse status. | Shows which files parsed successfully, failed, or need reprocessing. |
| source_pages | Content by page/section after OCR or parsing. | Page 3 OCR text, screenshot path, quality score. | Admin reviews original content, checks OCR, and traces citations back. |
| raw_chunks | Technical chunks cut from source pages for retrieval/debugging. | A 300-800 token segment about an ERP feature or policy. | Not the main editable source; used to link evidence, debug RAG, and mark noisy chunks. |
| atoms | The official normalized/approved knowledge unit. It is not only a product record. | FAQ, policy, pricing, case study, implementation process, technical doc, sales script. | AI uses it as the official answer source; Admin edits, merges/splits, approves, and publishes it. |
| atom_versions | Atom change history. | Version 1 old pricing, version 2 updated policy. | Rollback, audit who changed what, and compare before/after content. |
| atom_embeddings | Atom vector embedding for semantic search. | Embedding model, vector, metadata filter, indexed_at. | RAG finds relevant atoms based on the question and context. |
| atom_chunk_links | Link table connecting atoms to raw chunks as evidence/citations. | The "implementation process" atom links to chunks from a proposal PDF. | AI answers include sources; Admin verifies which segment an atom came from. |
| taxonomy_terms | Shared taxonomy label system for atoms/pages/queries. | product=ERP, industry=manufacturing, persona=COO, topic=pricing. | Filters RAG, understands page context, and classifies content without hardcoding by product. |
| atom_taxonomy_links | Many-to-many link between atoms and taxonomy terms. | One atom tagged with ERP + manufacturing + inventory_accuracy. | One knowledge item can belong to many topics/industries/personas at once. |
| pages | Route/page context on the website, not hardcoded page content. | /erp-manufacturing, /pricing, /case-study | AI knows where the user is, which knowledge to prioritize, and which CTA/form fits. |
| page_taxonomy_links | Links pages to taxonomy terms. | ERP Manufacturing page tagged with industry=manufacturing and product=ERP. | When the user is on this page, AI automatically understands the initial context. |
| visitors | Anonymous visitor before email, phone, or login is known. | anonymous_id from cookie/localStorage, consent_status, first_seen_at. | Tracks context and temporary memory; merges into a lead when the visitor provides information. |
| sessions | One website visit or interaction session. | visitor_id, lead_id, current_page, UTM, started_at. | Keeps runtime context: which page the user is on, which campaign, and which slots are known. |
| conversations | One conversation in a session. | web chat, status=open, summary=customer asks about ERP for manufacturing. | Groups messages, creates summaries, and analyzes lead insights. |
| conversation_messages | Each message in the conversation: user, assistant, or tool. | role=user, content="I need ERP for 200 employees". | Stores transcript, citation, tokens, and tool result; used for audit and memory extraction. |
| session_slots | Known fields in the session. | industry=manufacturing, company_size=200, pain_point=inventory. | Does not ask known information again; forms automatically skip fields already provided. |
| events | Tracks behavior and system events. | page_view, chat_turn, cta_click, gen_ui_action, rag_query. | Analyzes funnel, debugs flow, calculates lead score, and audits actions. |
| leads | Potential customer profile valuable enough for sales/marketing follow-up. | name, email, company_size, industry, interest, score, owner. | Creates sales summaries, syncs CRM, supports follow-up, and assigns owners. |
| demo_requests | Demo/quote/consultation request attached to a lead. | preferred_time, solution_interest, handoff_payload. | Sales receives schedule/demo requests; CRM/Slack/email receives notifications. |
| memory_facts | Long-term memory fallback when Mem0 is not used. | subject_type=lead, fact_key=company_size, fact_value=200. | Remembers long-term information by visitor/lead/user/account with consent and confidence. |
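The dictionary says taxonomy links let AI understand page context and decide which knowledge to prioritize. One possible way to do that is sketched below: atoms are ranked by how strongly their taxonomy terms overlap with the terms of the current page. The scoring formula (sum of page weight times atom weight over shared terms), the status value "published", and the `RankedAtom` shape are assumptions for illustration; the source does not specify how weights are combined.

```typescript
// Row shapes follow the ERD column names; types are assumed.
interface Atom {
  id: string;
  workspace_id: string;
  title: string;
  status: string;
}

interface AtomTaxonomyLink {
  atom_id: string;
  taxonomy_term_id: string;
  weight: number;
}

interface PageTaxonomyLink {
  page_id: string;
  taxonomy_term_id: string;
  weight: number;
}

// Hypothetical result shape.
interface RankedAtom {
  atom: Atom;
  score: number;
}

// Rank published atoms of one workspace by taxonomy overlap with a page.
// Atoms that share no term with the page are left out.
function rankAtomsForPage(
  pageId: string,
  workspaceId: string,
  atoms: Atom[],
  atomLinks: AtomTaxonomyLink[],
  pageLinks: PageTaxonomyLink[]
): RankedAtom[] {
  // Terms attached to the page, with their weights.
  const pageWeights = new Map<string, number>();
  for (const link of pageLinks) {
    if (link.page_id === pageId) {
      pageWeights.set(link.taxonomy_term_id, link.weight);
    }
  }

  // Assumed score: sum over shared terms of pageWeight * atomWeight.
  const scores = new Map<string, number>();
  for (const link of atomLinks) {
    const pageWeight = pageWeights.get(link.taxonomy_term_id);
    if (pageWeight === undefined) continue;
    const previous = scores.get(link.atom_id) ?? 0;
    scores.set(link.atom_id, previous + pageWeight * link.weight);
  }

  return atoms
    .filter(
      (atom) =>
        atom.workspace_id === workspaceId &&
        atom.status === "published" && // assumed status value
        scores.has(atom.id)
    )
    .map((atom) => ({ atom, score: scores.get(atom.id) ?? 0 }))
    .sort((a, b) => b.score - a.score);
}
```

In practice this kind of taxonomy filter would likely be combined with vector similarity from atom_embeddings; the sketch isolates the taxonomy part only.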

Detailed storage table

| Data group | Where it is stored | Suggested table/collection | Purpose | How Admin handles it |
| --- | --- | --- | --- | --- |
| Source documents | Powabase Storage + Postgres | sources, source_files, source_pages | Store PDFs, old web pages, OCR text, and source metadata. | Upload, inspect pages, re-parse, archive. |
| Technical chunks | Powabase Postgres/RAG index | raw_chunks, atom_chunk_links | Debug retrieval and inspect which text was chunked/indexed. | View, filter, mark noisy, link to atoms, and reindex. Do not edit as the main source. |
| Knowledge atoms | Powabase Postgres + vector/RAG | atoms, atom_versions, atom_embeddings | Official knowledge source for RAG/runtime; not just products, can be FAQ, policy, case study, pricing, or process. | Edit, merge/split, attach taxonomy, approve/publish, inspect versions. |
| Page context | Powabase Postgres | pages, page_taxonomy_links | Identifies which topic, industry, or funnel stage a route/page belongs to. | Attach taxonomy to each route, set context hint, and default CTA. |
| Anonymous visitor | Powabase Postgres | visitors | Identify anonymous visitors via anonymous_id/cookie before email/phone is known. | Check consent and merge visitor into lead when contact info is provided. |
| Context/session | Powabase Postgres | sessions, session_slots | Know which page/campaign the user is in and which fields they already provided. | Debug sessions and check known/missing slots. |
| Conversation | Powabase Postgres | conversations, conversation_messages | Store conversation history: user, assistant, tool result, citation, token, trace. | View transcripts, debug answers, create summaries, extract lead insights. |
| User memory | Mem0 or Powabase fallback | Mem0 memories or memory_facts | Remember long-term facts such as industry, company size, pain point, and preference. | View/edit/delete when permission and consent allow it; avoid unnecessary sensitive PII. |
| Lead/event | Powabase Postgres + CRM connector | leads, events, demo_requests, crm_sync_logs | Track funnel, create sales handoff, sync CRM. | View lead summary, retry CRM sync, export/report. |
| Governance | Powabase Postgres | audit_logs, content_gaps, workflow_tasks | Track who changed what, which questions lack data, and which tasks need review. | Review, assign, resolve content gaps, inspect audit. |
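The Governance row mentions content_gaps for questions that lack data, which ties back to the guardrail step ("No strong enough source? -> low-confidence fallback"). A minimal sketch of that decision follows. The `RetrievedSource` and `GuardrailResult` shapes, the content gap fields, and the default `minScore` of 0.6 are hypothetical; the source does not define what counts as a strong enough source.

```typescript
// Hypothetical shape of one retrieved RAG source.
interface RetrievedSource {
  atom_id: string;
  score: number; // assumed retrieval score, higher is better
  citation_label: string;
}

// Hypothetical payload for a content_gaps record.
interface ContentGap {
  question: string;
  best_score: number | null; // null when nothing was retrieved at all
}

interface GuardrailResult {
  mode: "answer" | "low_confidence_fallback";
  sources: RetrievedSource[];
  contentGap: ContentGap | null;
}

// Keep only sources at or above the threshold. If none qualify, switch to
// the low-confidence fallback and describe the gap so Admin can review it.
function applyGuardrail(
  question: string,
  sources: RetrievedSource[],
  minScore = 0.6 // hypothetical threshold
): GuardrailResult {
  const strong = sources.filter((source) => source.score >= minScore);
  if (strong.length > 0) {
    return { mode: "answer", sources: strong, contentGap: null };
  }
  const bestScore =
    sources.length > 0 ? Math.max(...sources.map((source) => source.score)) : null;
  return {
    mode: "low_confidence_fallback",
    sources: [],
    contentGap: { question, best_score: bestScore },
  };
}
```

Returning the gap as data instead of writing it directly keeps the guardrail testable; the caller decides how and where to persist it.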