The Architecture Behind an Autonomous AI Visibility Engine (Node.js + AWS + GPT-4o)
How we built a system that automatically makes brands visible to AI answer engines. Full stack breakdown with real code.
What This System Does
We built an autonomous engine that solves a specific problem: brands are invisible to AI systems like ChatGPT, Perplexity, and Google AI Overviews.
The engine analyzes a brand's current AI visibility, generates optimized content, structures it for AI extraction, distributes it across platforms, and monitors whether AI systems start citing the brand.
It runs continuously. No human in the loop after initial configuration.
This article breaks down the full architecture, from the orchestration layer to the individual engines, with real code patterns.
System Overview
┌─────────────────────────────────────────────────────────────┐
│                            NEXUS                            │
│                    (Orchestration Layer)                    │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────┐ ┌────────────────┐ ┌──────────────────┐       │
│  │ IntentAI │→│ContentGenerator│→│ AEOContentEngine │       │
│  └──────────┘ └────────────────┘ └──────────────────┘       │
│             │                           │                   │
│             ▼                           ▼                   │
│    ┌──────────────────┐    ┌──────────────────────────┐     │
│    │SchemaMarkupEngine│    │      CMSIntegration      │     │
│    └──────────────────┘    └──────────────────────────┘     │
│             │                           │                   │
│             ▼                           ▼                   │
│    ┌──────────────────┐    ┌──────────────────────────┐     │
│    │   Google Gates   │    │       BrandSeeder        │     │
│    └──────────────────┘    └──────────────────────────┘     │
│                              │                              │
│                              ▼                              │
│                  ┌──────────────────────┐                   │
│                  │ AIVisibilityMonitor  │                   │
│                  └──────────────────────┘                   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
Stack:
- Runtime: Node.js 20 (ES Modules)
- Framework: Express
- Database: MongoDB Atlas
- Vector Store: Qdrant Cloud
- Cache: Redis (Upstash)
- AI: GPT-4o (via provider-agnostic routing)
- Infrastructure: AWS ECS Fargate
- CI/CD: GitHub Actions
- Container: Docker (linux/amd64)
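Every engine in this article talks to the model through `aiProvider.complete({ model, messages, response_format })` and reads `.content` from the result. The routing layer itself is not shown, so the following is only a minimal sketch of what a provider-agnostic router could look like. `createAIRouter`, the prefix-matching rule, and `fakeProvider` are assumptions made for illustration.

```javascript
// Hypothetical provider-agnostic router. Each provider is a function that
// accepts { model, messages, response_format } and resolves to { content }.
function createAIRouter(providers) {
  return {
    async complete(request) {
      // Pick the first provider whose key is a prefix of the requested model
      const entry = Object.entries(providers)
        .find(([prefix]) => request.model.startsWith(prefix));
      if (!entry) {
        throw new Error(`No provider registered for model "${request.model}"`);
      }
      const [, provider] = entry;
      const response = await provider(request);
      // Normalise to the shape the engines rely on
      return { content: String(response.content) };
    }
  };
}

// Stand-in provider for local testing; a real one would call a vendor SDK
const fakeProvider = async ({ messages }) => ({
  content: JSON.stringify({ echoed: messages.at(-1).content })
});

const aiProvider = createAIRouter({ 'gpt-': fakeProvider });
```

Keeping the engines dependent on this one method is what would allow the model to be swapped without touching engine code.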
1. NEXUS — The Orchestration Layer
NEXUS is the central coordinator. It receives a brand configuration, determines what needs to happen, and triggers the appropriate pipeline.
javascript
// src/nexus/orchestrator.js
import { IntentAnalyzer } from '../engines/intent.js';
import { ContentGenerator } from '../engines/content.js';
import { AEOContentEngine } from '../engines/aeo.js';
import { SchemaMarkupEngine } from '../engines/schema.js';
import { BrandSeeder } from '../engines/seeder.js';
import { AIVisibilityMonitor } from '../engines/monitor.js';
export class Nexus {
constructor({ db, vectorStore, cache, aiProvider }) {
// runPipeline() persists each run, so keep a handle on the database
this.db = db;
this.intent = new IntentAnalyzer({ aiProvider });
this.content = new ContentGenerator({ aiProvider, vectorStore });
this.aeo = new AEOContentEngine();
this.schema = new SchemaMarkupEngine();
this.seeder = new BrandSeeder({ db, aiProvider });
this.monitor = new AIVisibilityMonitor({ aiProvider, db });
}
async runPipeline(brandConfig) {
const { brandId, domain, industry, competitors } = brandConfig;
// Step 1: Analyze current visibility
const visibility = await this.monitor.scan(brandConfig);
// Step 2: Determine content gaps
const gaps = await this.intent.analyzeGaps({
visibility,
industry,
competitors
});
// Step 3: Generate content for each gap
const contentPieces = [];
for (const gap of gaps) {
const content = await this.content.generate({
brandId,
gap,
industry
});
// Step 4: Optimize for AEO
const aeoContent = this.aeo.optimize(content);
// Step 5: Generate schema markup
const schema = this.schema.generate(aeoContent);
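// Spread the optimized piece so `content` stays a string and `metadata`
// (brandId, gap) stays reachable for the seeder and its adapters
contentPieces.push({
...aeoContent,
schema,
targetPlatform: gap.platform,
priority: gap.priority
});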
}
// Step 6: Distribute via BrandSeeder
const results = await this.seeder.distribute(contentPieces);
// Step 7: Store pipeline run
await this.db.collection('pipeline_runs').insertOne({
brandId,
timestamp: new Date(),
gaps: gaps.length,
contentGenerated: contentPieces.length,
distributed: results.length,
visibility
});
return { visibility, gaps, results };
}
}
NEXUS runs on a cron schedule. Every 24 hours it re-scans visibility, identifies new gaps, and generates fresh content.
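The scheduling code is not shown in this article. As a rough sketch of the 24-hour loop, assuming a plain in-process timer rather than any particular cron library, the helper below also skips a run when the previous one is still in progress. `createScheduledRunner` and `startDailySchedule` are hypothetical names.

```javascript
// Hypothetical helper: wraps a task so overlapping runs are skipped
function createScheduledRunner(task) {
  let running = false;
  return async function tick() {
    if (running) return { skipped: true };
    running = true;
    try {
      return { skipped: false, result: await task() };
    } finally {
      running = false;
    }
  };
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Starts the recurring run and returns a function that stops it
function startDailySchedule(task, intervalMs = DAY_MS) {
  const tick = createScheduledRunner(task);
  const timer = setInterval(() => {
    tick().catch(err => console.error('Pipeline run failed:', err.message));
  }, intervalMs);
  return () => clearInterval(timer);
}
```

A call such as `startDailySchedule(() => nexus.runPipeline(brandConfig))` would then trigger one pipeline run per day for that brand.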
2. IntentAI — Gap Analysis Engine
IntentAI determines what questions users ask about your industry and whether AI systems mention your brand in the answers.
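The gap ranking in the listing that follows multiplies the number of brands the model recommends by the model's self-reported confidence. A small worked example with made-up model answers may make the effect clearer. The queries, brand letters, and confidence values here are hypothetical.

```javascript
// Mirrors calculatePriority from the listing: competitors x confidence x 100
function calculatePriority(result) {
  return Math.round(result.brands.length * result.confidence * 100);
}

// Hypothetical model answers for two queries
const answers = [
  { query: 'niche invoicing tools', brands: ['A'], confidence: 0.4 },
  { query: 'best CRM for startups', brands: ['A', 'B', 'C', 'D'], confidence: 0.9 }
];

// Highest priority first, the same ordering analyzeGaps() returns
const ranked = answers
  .map(a => ({ query: a.query, priority: calculatePriority(a) }))
  .sort((x, y) => y.priority - x.priority);

console.log(ranked);
// [ { query: 'best CRM for startups', priority: 360 },
//   { query: 'niche invoicing tools', priority: 40 } ]
```

A crowded, confidently answered query outranks a sparse one, which is where the engine spends its content budget first.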
javascript
// src/engines/intent.js
export class IntentAnalyzer {
constructor({ aiProvider }) {
this.ai = aiProvider;
}
async analyzeGaps({ visibility, industry, competitors }) {
// Generate industry-relevant queries
const queries = await this.generateQueries(industry);
const gaps = [];
for (const query of queries) {
const aiResponse = await this.ai.complete({
model: 'gpt-4o',
messages: [
{
role: 'system',
content: 'You are analyzing AI search results. Return JSON only.'
},
{
role: 'user',
content: `For the query "${query}", list which brands/products
an AI would typically recommend. Return:
{ "brands": ["brand1", "brand2"],
"confidence": 0.0-1.0,
"sourcesNeeded": ["platform1", "platform2"] }`
}
],
response_format: { type: 'json_object' }
});
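const result = JSON.parse(aiResponse.content);
// Compare case-insensitively so "Acme" and "acme" count as the same brand
const brand = visibility.brandName.toLowerCase();
if (!result.brands.some(b => b.toLowerCase() === brand)) {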
gaps.push({
query,
competitorsPresent: result.brands,
missingPlatforms: result.sourcesNeeded,
priority: this.calculatePriority(query, result),
platform: this.selectBestPlatform(result.sourcesNeeded)
});
}
}
return gaps.sort((a, b) => b.priority - a.priority);
}
async generateQueries(industry) {
const response = await this.ai.complete({
model: 'gpt-4o',
messages: [{
role: 'user',
content: `Generate 50 questions that potential customers
in the "${industry}" industry would ask an AI assistant.
Focus on product discovery and comparison queries.
Return a JSON object of the form { "queries": ["..."] }.`
}],
response_format: { type: 'json_object' }
});
return JSON.parse(response.content).queries;
}
calculatePriority(query, result) {
const competitorCount = result.brands.length;
const confidence = result.confidence;
// High priority = many competitors present + high confidence
// This means the AI actively recommends in this space
return Math.round((competitorCount * confidence) * 100);
}
selectBestPlatform(platforms) {
const priority = [
'medium', 'reddit', 'quora', 'producthunt',
'g2', 'github', 'stackoverflow', 'hashnode'
];
for (const p of priority) {
if (platforms.includes(p)) return p;
}
return platforms[0] || 'medium';
}
}
3. ContentGenerator — AI-Powered Content Creation
ContentGenerator produces content tailored to each platform and gap. It is not generic SEO content: each piece is structured specifically for AI citation.
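The platform rules in the listing that follows ask the model to respect word limits, but nothing verifies the result. A minimal sketch of such a check, using only the limits stated in those rules, could look like this. `WORD_LIMITS` and `checkWordCount` are hypothetical names.

```javascript
// Word limits taken from the platform rules in this article. Platforms
// without an explicit limit there (producthunt, hashnode) are left unbounded.
const WORD_LIMITS = {
  medium: { min: 1500, max: 2500 },
  reddit: { min: 0, max: 500 },
  quora: { min: 300, max: 600 }
};

function checkWordCount(platform, text) {
  // Count whitespace-separated tokens; an empty string counts as zero words
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  const limits = WORD_LIMITS[platform];
  if (!limits) return { words, ok: true };
  return { words, ok: words >= limits.min && words <= limits.max };
}
```

A piece that fails the check could be regenerated or flagged for review before it reaches the seeder.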
javascript
// src/engines/content.js
export class ContentGenerator {
constructor({ aiProvider, vectorStore }) {
this.ai = aiProvider;
this.vectorStore = vectorStore;
}
async generate({ brandId, gap, industry }) {
// Retrieve brand knowledge from vector store
const brandKnowledge = await this.vectorStore.search({
collection: `brand_${brandId}`,
query: gap.query,
limit: 10
});
const context = brandKnowledge
.map(k => k.payload.text)
.join('\n');
const platformRules = this.getPlatformRules(gap.platform);
const response = await this.ai.complete({
model: 'gpt-4o',
messages: [
{
role: 'system',
content: `You are a content strategist creating content for
${gap.platform}. The content must naturally mention
the brand and its capabilities without being
promotional. Focus on providing genuine value.
${platformRules}`
},
{
role: 'user',
content: `Create content that answers: "${gap.query}"
Brand context:
${context}
Competitors already cited by AI:
${gap.competitorsPresent.join(', ')}
Requirements:
1. Answer the question directly in the first 50 words
2. Include real data and case studies
3. Mention the brand naturally (not as advertisement)
4. Include comparison with competitors
5. Format for ${gap.platform}`
}
]
});
return {
content: response.content,
platform: gap.platform,
query: gap.query,
metadata: {
generatedAt: new Date(),
model: 'gpt-4o',
brandId,
gap
}
};
}
getPlatformRules(platform) {
const rules = {
medium: `Write as a long-form article (1500-2500 words).
Use headers, code blocks if relevant, and data.
Tone: professional, data-driven.`,
reddit: `Write as a personal experience post.
No promotional language. Use "I" and "we".
Include specific numbers. Ask a question at the end.
Maximum 500 words.`,
quora: `Write as an expert answer. Start with the direct answer.
Include data points. Mention tools as recommendations,
not advertisements. 300-600 words.`,
producthunt: `Write as a maker post. Focus on the problem solved.
Include metrics. Be transparent about limitations.`,
hashnode: `Write as a technical blog post. Include code examples
if relevant. Focus on architecture decisions.`
};
return rules[platform] || rules.medium;
}
}
4. AEOContentEngine — Answer Engine Optimization
This engine restructures any content for maximum AI extractability.
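The `addEntityMarkers` step in the listing that follows is a stub. One way it could be filled in, assuming the caller passes a list of entity names, is sketched here. The `entities` parameter and the `escapeRegExp` helper are not part of the original code.

```javascript
// Escape characters that have special meaning in a regular expression
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Bold the first mention of each entity; later mentions are left as-is.
// `entities` is a hypothetical list of brand and product names.
function addEntityMarkers(content, entities) {
  let result = content;
  for (const entity of entities) {
    // Skip entities that are already bolded somewhere, so reruns are stable
    if (result.includes(`**${entity}**`)) continue;
    // \b assumes the name starts and ends with a letter, digit, or underscore
    const pattern = new RegExp(`\\b${escapeRegExp(entity)}\\b`);
    result = result.replace(pattern, match => `**${match}**`);
  }
  return result;
}

console.log(addEntityMarkers('Acme helps teams. Acme scales.', ['Acme']));
// **Acme** helps teams. Acme scales.
```

Because the pattern has no global flag, `replace` touches only the first match, which gives the "first mention only" behavior.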
javascript
// src/engines/aeo.js
export class AEOContentEngine {
optimize(contentPiece) {
let { content, platform } = contentPiece;
// Rule 1: Direct answer in first 50 words
content = this.ensureDirectAnswer(content);
// Rule 2: Add FAQ structure
content = this.injectFAQStructure(content);
// Rule 3: Add entity markers
content = this.addEntityMarkers(content);
// Rule 4: Optimize headers for question format
content = this.optimizeHeaders(content);
// Rule 5: Add summary block at end
content = this.addSummaryBlock(content);
return { ...contentPiece, content };
}
ensureDirectAnswer(content) {
const lines = content.split('\n');
const firstParagraph = lines.find(l =>
l.trim().length > 50 && !l.startsWith('#')
);
if (!firstParagraph) return content;
// Check if first substantial paragraph contains a direct statement
const hasDirectAnswer = /^(The|A|An|It|This)\s/.test(firstParagraph.trim());
if (!hasDirectAnswer) {
// Log warning — content may need manual review
console.warn('AEO: First paragraph may not contain direct answer');
}
return content;
}
injectFAQStructure(content) {
// Extract questions from headers
const questionHeaders = content.match(/^##\s.*\?$/gm) || [];
if (questionHeaders.length === 0) return content;
// Build FAQ section at bottom
let faqSection = '\n\n## Frequently Asked Questions\n\n';
for (const header of questionHeaders) {
const question = header.replace('## ', '');
// Find the paragraph after this header
const headerIndex = content.indexOf(header);
const nextContent = content.substring(headerIndex + header.length);
const answer = nextContent.split('\n\n')[1] || '';
if (answer.trim()) {
faqSection += `**${question}**\n${answer.trim()}\n\n`;
}
}
return content + faqSection;
}
addEntityMarkers(content) {
// Intended behavior: bold brand and product names on first mention,
// which helps AI parsers identify entities.
// Not implemented yet: this is currently a no-op that returns content unchanged.
return content;
}
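optimizeHeaders(content) {
// Convert statement headers to question format using a naive template:
// "Lead Scoring" → "What Is Lead Scoring?"
// (grammatical rewrites such as "What Are the Best ...?" would need an LLM pass)
const structural = /^(Summary|TL;DR|Frequently Asked Questions)$/i;
return content.replace(
/^## ([A-Z][^?\n]+)$/gm,
(match, title) => {
// Leave question-style headers untouched
if (/^(How|What|Why|When|Where|Which)\b/.test(title)) return match;
// Leave structural headers (including the injected FAQ section) untouched
if (structural.test(title.trim())) return match;
return `## What Is ${title}?`;
}
);
}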
addSummaryBlock(content) {
if (content.includes('## Summary') || content.includes('## TL;DR')) {
return content;
}
return content + '\n\n## Summary\n\n*[Auto-generated summary placeholder]*\n';
}
}
5. SchemaMarkupEngine — Automated JSON-LD Generation
Generates structured data that search engines and AI systems can parse directly.
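The engine returns plain schema objects. To be picked up by crawlers they still have to be embedded in the page as `<script type="application/ld+json">` tags. A minimal sketch of that last step follows. `toJsonLdScripts` is a hypothetical helper, not part of the engine shown here.

```javascript
// Serialise schema objects into JSON-LD <script> tags, one tag per schema
function toJsonLdScripts(schemas) {
  return schemas
    .map(schema => {
      // Escape "<" so a value containing "</script>" cannot close the tag early
      const json = JSON.stringify(schema).replace(/</g, '\\u003c');
      return `<script type="application/ld+json">${json}</script>`;
    })
    .join('\n');
}

const html = toJsonLdScripts([
  { '@context': 'https://schema.org', '@type': 'Article', headline: 'Example' }
]);
console.log(html);
```

The `\u003c` escape is still valid JSON, so parsers read the original value back unchanged.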
javascript
// src/engines/schema.js
export class SchemaMarkupEngine {
generate(contentPiece) {
const schemas = [];
// Article schema
schemas.push(this.generateArticleSchema(contentPiece));
// FAQ schema (if questions detected)
const faqSchema = this.generateFAQSchema(contentPiece);
if (faqSchema) schemas.push(faqSchema);
// Organization schema
schemas.push(this.generateOrgSchema(contentPiece));
// HowTo schema (if steps detected)
const howToSchema = this.generateHowToSchema(contentPiece);
if (howToSchema) schemas.push(howToSchema);
return schemas;
}
generateArticleSchema(content) {
return {
'@context': 'https://schema.org',
'@type': 'Article',
headline: this.extractTitle(content.content),
description: this.extractFirstParagraph(content.content),
author: {
'@type': 'Organization',
name: content.metadata.brandName || 'PayAI-X'
},
datePublished: content.metadata.generatedAt.toISOString(),
publisher: {
'@type': 'Organization',
name: content.metadata.brandName || 'PayAI-X'
}
};
}
generateFAQSchema(content) {
const questions = this.extractQAPairs(content.content);
if (questions.length === 0) return null;
return {
'@context': 'https://schema.org',
'@type': 'FAQPage',
mainEntity: questions.map(qa => ({
'@type': 'Question',
name: qa.question,
acceptedAnswer: {
'@type': 'Answer',
text: qa.answer
}
}))
};
}
generateOrgSchema() {
return {
'@context': 'https://schema.org',
'@type': 'Organization',
name: 'PayAI-X',
url: 'https://payai-x.com',
sameAs: [
'https://catyai.io',
'https://ahauros.io',
'https://www.producthunt.com/products/ai-sales-assistant-that-never-sleeps'
],
foundingDate: '2024',
description: 'AI sales conversion engine and economic operating system'
};
}
generateHowToSchema(content) {
const steps = content.content.match(/^(\d+[\.\)]\s.+)$/gm);
if (!steps || steps.length < 3) return null;
return {
'@context': 'https://schema.org',
'@type': 'HowTo',
name: this.extractTitle(content.content),
step: steps.map((step, i) => ({
'@type': 'HowToStep',
position: i + 1,
text: step.replace(/^\d+[\.\)]\s/, '')
}))
};
}
extractTitle(content) {
const match = content.match(/^#\s(.+)$/m);
return match ? match[1] : 'Untitled';
}
extractFirstParagraph(content) {
const lines = content.split('\n');
const para = lines.find(l =>
l.trim().length > 50 && !l.startsWith('#') && !l.startsWith('-')
);
return para ? para.trim().substring(0, 160) : '';
}
extractQAPairs(content) {
const pairs = [];
const sections = content.split(/^##\s/m);
for (const section of sections) {
const lines = section.split('\n');
const header = lines[0]?.trim();
if (header && header.endsWith('?')) {
const answer = lines.slice(1).join(' ').trim().substring(0, 500);
if (answer) {
pairs.push({ question: header, answer });
}
}
}
return pairs;
}
}
6. BrandSeeder — Multi-Platform Distribution
Distributes content to the platforms that AI systems use as sources.
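Each platform is reached through an adapter that exposes `format(piece)` and `publish(formatted)`, with `publish` resolving to an object carrying a `url`. To exercise the pipeline without posting anywhere, one could register a stand-in adapter along these lines. `DryRunAdapter` and its `dry-run://` URLs are hypothetical.

```javascript
// Hypothetical adapter that follows the same format()/publish() contract as
// the platform adapters but only records what would have been published.
class DryRunAdapter {
  constructor(platform) {
    this.platform = platform;
    this.published = [];
  }

  async format(piece) {
    // Use the first level-1 markdown heading as the title
    const title = piece.content.match(/^#\s(.+)$/m)?.[1] || 'Untitled';
    return { title, content: piece.content };
  }

  async publish(formatted) {
    this.published.push(formatted);
    // No network call: return a placeholder URL in the shape distribute() expects
    return { url: `dry-run://${this.platform}/${this.published.length}` };
  }
}
```

Registering it with `seeder.registerAdapter('medium', new DryRunAdapter('medium'))` would let a full pipeline run complete while keeping every piece local for inspection.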
javascript
// src/engines/seeder.js
export class BrandSeeder {
constructor({ db, aiProvider }) {
this.db = db;
this.ai = aiProvider;
this.adapters = new Map();
}
registerAdapter(platform, adapter) {
this.adapters.set(platform, adapter);
}
async distribute(contentPieces) {
const results = [];
for (const piece of contentPieces) {
const adapter = this.adapters.get(piece.targetPlatform);
if (!adapter) {
console.warn(`No adapter for platform: ${piece.targetPlatform}`);
continue;
}
try {
// Adapt content to platform format
const formatted = await adapter.format(piece);
// Publish
const result = await adapter.publish(formatted);
// Store result
await this.db.collection('seeded_content').insertOne({
brandId: piece.metadata?.brandId,
platform: piece.targetPlatform,
url: result.url,
publishedAt: new Date(),
status: 'published',
content: piece.content.substring(0, 500) // preview only
});
results.push({
platform: piece.targetPlatform,
url: result.url,
status: 'success'
});
} catch (error) {
results.push({
platform: piece.targetPlatform,
status: 'failed',
error: error.message
});
}
}
return results;
}
}
// Example adapter for Medium
export class MediumAdapter {
constructor({ apiToken }) {
this.token = apiToken;
this.baseUrl = 'https://api.medium.com/v1';
}
async format(piece) {
return {
title: piece.content.match(/^#\s(.+)$/m)?.[1] || 'Untitled',
contentFormat: 'markdown',
content: piece.content,
tags: piece.metadata?.tags || ['AI', 'Technology'],
publishStatus: 'draft' // review before publishing
};
}
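async publish(formatted) {
const userResponse = await fetch(`${this.baseUrl}/me`, {
headers: { 'Authorization': `Bearer ${this.token}` }
});
// Fail loudly on auth or API errors instead of reading undefined fields
if (!userResponse.ok) {
throw new Error(`Medium /me request failed: ${userResponse.status}`);
}
const user = await userResponse.json();
const response = await fetch(
`${this.baseUrl}/users/${user.data.id}/posts`,
{
method: 'POST',
headers: {
'Authorization': `Bearer ${this.token}`,
'Content-Type': 'application/json'
},
body: JSON.stringify(formatted)
}
);
if (!response.ok) {
throw new Error(`Medium publish failed: ${response.status}`);
}
const result = await response.json();
return { url: result.data.url };
}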
}
7. AIVisibilityMonitor — Continuous Tracking
Monitors whether AI systems start citing the brand after seeding.
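The scan in the listing that follows detects mentions with a lowercase substring check, which can produce false positives when a brand name appears inside a longer word. A slightly stricter matcher, offered here as a sketch rather than what the system uses, requires the name to stand on its own. `mentionsBrand`, `citationRate`, and the brand "Nova" are hypothetical.

```javascript
// Escape characters that have special meaning in a regular expression
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// True when brandName appears as a standalone term: not directly preceded
// or followed by a letter or digit. Matching is case-insensitive.
function mentionsBrand(text, brandName) {
  const edge = '[^\\p{L}\\p{N}]';
  const pattern = new RegExp(
    `(^|${edge})${escapeRegExp(brandName)}($|${edge})`,
    'iu'
  );
  return pattern.test(text);
}

// Percentage of queries whose answer mentioned the brand, safe for zero queries
function citationRate(citationCount, totalQueries) {
  if (totalQueries === 0) return 0;
  return Math.round((citationCount / totalQueries) * 100);
}

console.log(mentionsBrand('Top picks include Nova and others.', 'Nova')); // true
console.log(mentionsBrand('Innovation matters here.', 'Nova'));           // false
```

The second call shows the case a plain `includes()` would count as a citation, since "innovation" contains "nova".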
javascript
// src/engines/monitor.js
export class AIVisibilityMonitor {
constructor({ aiProvider, db }) {
this.ai = aiProvider;
this.db = db;
}
async scan(brandConfig) {
const { brandName, industry, queries } = brandConfig;
const testQueries = queries || await this.generateTestQueries(industry);
let citationCount = 0;
let totalQueries = testQueries.length;
const results = [];
for (const query of testQueries) {
const response = await this.ai.complete({
model: 'gpt-4o',
messages: [{
role: 'user',
content: query
}]
});
const mentioned = response.content
.toLowerCase()
.includes(brandName.toLowerCase());
if (mentioned) citationCount++;
results.push({
query,
mentioned,
response: response.content.substring(0, 200)
});
}
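const score = {
// Store brandId as well: getHistoricalTrend() filters on it
brandId: brandConfig.brandId,
brandName,
totalQueries,
citationCount,
// Guard against an empty query list to avoid NaN
citationRate: totalQueries > 0
? Math.round((citationCount / totalQueries) * 100)
: 0,
scannedAt: new Date(),
details: results
};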
// Store historical score
await this.db.collection('visibility_scores').insertOne(score);
return score;
}
async generateTestQueries(industry) {
const response = await this.ai.complete({
model: 'gpt-4o',
messages: [{
role: 'user',
content: `Generate 30 questions a potential customer in the
"${industry}" industry would ask an AI assistant
when looking for products or solutions.
Include comparison, recommendation, and
"best of" style queries.
Return a JSON object of the form { "queries": ["..."] }.`
}],
response_format: { type: 'json_object' }
});
return JSON.parse(response.content).queries;
}
async getHistoricalTrend(brandId, days = 30) {
const since = new Date();
since.setDate(since.getDate() - days);
return this.db.collection('visibility_scores')
.find({
brandId,
scannedAt: { $gte: since }
})
.sort({ scannedAt: 1 })
.toArray();
}
}
8. Infrastructure: AWS ECS Fargate Deployment
yaml
# docker-compose.yml (for local development)
version: '3.8'
services:
  nexus:
    build:
      context: .
      dockerfile: Dockerfile
    platform: linux/amd64
    ports:
      - "3003:3003"
    environment:
      - NODE_ENV=production
      - MONGODB_URI=${MONGODB_URI}
      - QDRANT_URL=${QDRANT_URL}
      - REDIS_URL=${REDIS_URL}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - PORT=3003
    healthcheck:
      # node:20-slim ships without curl, so probe /health with Node's built-in fetch
      test: ["CMD", "node", "-e", "fetch('http://localhost:3003/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"]
      interval: 30s
      timeout: 10s
      retries: 3
dockerfile
# Dockerfile
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
# --omit=dev replaces the deprecated --only=production flag
RUN npm ci --omit=dev
COPY src/ ./src/
EXPOSE 3003
# node:20-slim does not include curl, so use Node's built-in fetch for the probe
HEALTHCHECK --interval=30s --timeout=10s \
CMD node -e "fetch('http://localhost:3003/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"
CMD ["node", "src/index.js"]
Production runs on AWS ECS Fargate with the same deploy pipeline as CatyAI:
Push to main → GitHub Actions → Docker build → ECR → ECS
Secrets are managed via AWS Secrets Manager. No hardcoded credentials.
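Whichever way the secrets reach the container, the service ends up reading them from environment variables, as the compose file suggests. A small startup check, sketched here under that assumption, makes a missing secret fail fast instead of surfacing as a runtime error mid-pipeline. `loadConfig` and `REQUIRED_VARS` are hypothetical names.

```javascript
// Variables the compose file passes to the service
const REQUIRED_VARS = ['MONGODB_URI', 'QDRANT_URL', 'REDIS_URL', 'OPENAI_API_KEY'];

// Reads configuration from an env-like object and throws if anything is missing
function loadConfig(env = process.env) {
  const missing = REQUIRED_VARS.filter(name => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  return {
    mongodbUri: env.MONGODB_URI,
    qdrantUrl: env.QDRANT_URL,
    redisUrl: env.REDIS_URL,
    openaiApiKey: env.OPENAI_API_KEY,
    // Same default port as the rest of the article
    port: Number(env.PORT) || 3003
  };
}
```

Calling `loadConfig()` once at the top of the entry point keeps credential handling in one place and out of the engines.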
9. API Endpoints
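The pipeline endpoint in the listing that follows hands `req.body` straight to `runPipeline`. Since the engines read `brandId`, `brandName`, and `industry` from that object, a light validation step in front of it can be useful. This is a sketch only: `validateBrandConfig` is a hypothetical helper, and which fields count as required is an assumption based on how the engines use them.

```javascript
// Returns a list of problems; an empty list means the config looks usable
function validateBrandConfig(body) {
  if (!body || typeof body !== 'object' || Array.isArray(body)) {
    return ['body must be a JSON object'];
  }
  const errors = [];
  for (const field of ['brandId', 'brandName', 'industry']) {
    if (typeof body[field] !== 'string' || body[field].trim() === '') {
      errors.push(`${field} must be a non-empty string`);
    }
  }
  // Optional list fields must be arrays when present
  for (const field of ['competitors', 'queries']) {
    if (body[field] !== undefined && !Array.isArray(body[field])) {
      errors.push(`${field} must be an array when provided`);
    }
  }
  return errors;
}
```

A handler could respond with status 400 and the error list whenever the returned array is non-empty.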
javascript
// src/index.js
import express from 'express';
import { Nexus } from './nexus/orchestrator.js';
const app = express();
app.use(express.json());
const nexus = new Nexus({ /* config */ });
// Health checks
app.get('/health', (req, res) => res.json({ status: 'ok' }));
app.get('/ready', (req, res) => res.json({ status: 'ready' }));
app.get('/metrics', (req, res) => res.json({ uptime: process.uptime() }));
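// In Express 4, rejected promises from async handlers are not caught,
// so wrap each handler and forward errors to the error middleware
const wrap = (fn) => (req, res, next) => fn(req, res, next).catch(next);
// Run full pipeline
app.post('/api/pipeline/run', wrap(async (req, res) => {
const result = await nexus.runPipeline(req.body);
res.json(result);
}));
// Check visibility score
app.get('/api/visibility/:brandId', wrap(async (req, res) => {
// scan() expects a full brand config, not just an id.
// Assumption: configs are stored in a `brands` collection keyed by brandId.
const brandConfig = await nexus.monitor.db
.collection('brands')
.findOne({ brandId: req.params.brandId });
if (!brandConfig) return res.status(404).json({ error: 'Unknown brand' });
const score = await nexus.monitor.scan(brandConfig);
res.json(score);
}));
// Get historical trend
app.get('/api/visibility/:brandId/trend', wrap(async (req, res) => {
const trend = await nexus.monitor.getHistoricalTrend(
req.params.brandId,
parseInt(req.query.days, 10) || 30
);
res.json(trend);
}));
// Fallback error handler
app.use((err, req, res, next) => {
res.status(500).json({ error: err.message });
});
app.listen(process.env.PORT || 3003);
What We Learned Building This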
1. AI visibility is measurable. Before building this system, "does AI know my brand?" was a vague question. Now it is a number: citation rate across N queries.
2. Platform diversity matters more than content volume. One article on Medium, one Reddit thread, and one Quora answer do more for AI visibility than 10 articles on your own blog.
3. Structured data is underrated. FAQ Schema alone can increase AI citation rates significantly. Most websites still do not use it.
4. The feedback loop is slow. Unlike SEO, where ranking changes can show up in days, AI visibility takes weeks or months to change as models update their knowledge.
5. Automation is necessary at scale. Manually seeding content across 8 platforms for 50 industry queries is not sustainable. The system needs to run autonomously.
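Because the citation rate is stored on every scan, the slow feedback loop can at least be quantified. As an illustration, the records returned by `getHistoricalTrend` could be reduced to a single before-and-after comparison. `summarizeTrend` is a hypothetical helper and the scores below are made-up sample data, not measured results.

```javascript
// Compare the earliest and latest scan in a list of visibility scores
function summarizeTrend(scores) {
  if (scores.length < 2) return null; // not enough data for a comparison
  const sorted = [...scores].sort(
    (a, b) => new Date(a.scannedAt) - new Date(b.scannedAt)
  );
  const first = sorted[0];
  const last = sorted[sorted.length - 1];
  const msPerDay = 24 * 60 * 60 * 1000;
  return {
    from: first.citationRate,
    to: last.citationRate,
    change: last.citationRate - first.citationRate,
    days: Math.round((new Date(last.scannedAt) - new Date(first.scannedAt)) / msPerDay)
  };
}

// Made-up sample data for illustration only
const sample = [
  { scannedAt: '2025-01-31', citationRate: 8 },
  { scannedAt: '2025-01-01', citationRate: 2 }
];
console.log(summarizeTrend(sample));
// { from: 2, to: 8, change: 6, days: 30 }
```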
What Is Next
The next iteration adds:
- Real-time monitoring via Perplexity API and Google AI Overview scraping
- Competitive tracking — monitor when competitors gain or lose AI visibility
- Content performance scoring — which seeded content actually moved the visibility needle
- Auto-optimization — rewrite underperforming content based on what works
The goal is a fully autonomous system where you configure your brand once and the engine continuously works to make AI systems cite you.