How I Built an AI That Actually Knows My Business
There is a lot of noise right now about AI agents, MCP servers, and connecting AI to your tools. Most of it focuses on the technical setup. Very little of it addresses the more practical question: what data should you actually put in front of AI, how should it be structured, and how do you keep it current?
This is the article I wanted to read before I built my own system. So I wrote it.
MCP, which stands for Model Context Protocol, was introduced by Anthropic in late 2024 as an open standard for connecting AI models to external data sources. By early 2025, OpenAI, Google DeepMind, and hundreds of developer tools had adopted it. The protocol addresses what Anthropic described as an “N×M” data integration problem, where every AI tool previously needed a custom connector built for every data source. Today it is the standard way to give AI real-time access to databases, files, and business systems without custom code for each connection.
The reason this matters is not technical. It is practical. Claude and ChatGPT are not blind to who you are. Both have memory features that carry information across conversations. You can upload files, paste context, and write custom instructions. That works for simple, one-off questions.
It breaks the moment your questions get specific.
There is a real difference between an AI that remembers “I run a SaaS company” and an AI that can answer “which customers signed up last month, have not logged in since, and are still on a paid plan?” The first is a memory note. The second requires structured, queryable data. No amount of pasted context or memory snippets gets you there.
Now multiply that across more than one business. If you run multiple companies, each with its own customers, competitors, products, pricing, and market signals, asking AI to hold all of that coherently in a conversation becomes unreliable fast. Competitor pricing updates, new customers entering your CRM, fresh newsletter articles, recent reviews, changes in your local market: none of that exists in AI’s training data, and none of it fits cleanly in a context window alongside everything else.
The fix I built: a local SQLite database connected to Claude via an MCP server, updated daily by automated scraping workflows in Hexomatic and synced with our CRM database. When I ask Claude something about my business now, it queries real, current, structured records and returns a precise answer built from actual data.
Here is exactly how it works.
The Real Gap: Memory vs. Queryable Data
People assume the solution is better prompts or more context pasted in. It is not.
Pasted context works for one-off questions with small datasets. It breaks on anything requiring counting, filtering, comparing across time, or joining information from two sources. The moment your question sounds like a database query, you need a database.
AI is very good at reasoning over structured data once it has access to that data through a proper connection. It is not a replacement for the data itself. That distinction matters more than any other part of this setup.
The Architecture: Four Components
The system has four parts. Each one does a specific job.
1. A local SQLite database
SQLite is a file-based database. No server, no cloud, no subscription. It runs on your machine. You can open it in any database viewer, query it with standard SQL, and extend it as you need. I use it to store everything: my published articles, my customers, my contacts, my product catalogue, competitor data, and more.
The structure matters less than the habit of keeping it current.
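As an illustration only, a minimal schema for this kind of database might look like the sketch below. The table and column names are my own examples, not a prescribed structure; adapt them to whatever your business actually tracks.

```python
import sqlite3

# Illustrative schema: table and column names are examples, not a standard.
SCHEMA = """
CREATE TABLE IF NOT EXISTS articles (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    published_at TEXT,          -- ISO 8601 date
    url TEXT
);
CREATE TABLE IF NOT EXISTS customers (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    plan TEXT,                  -- e.g. 'free', 'paid'
    signed_up_at TEXT,
    last_login_at TEXT
);
CREATE TABLE IF NOT EXISTS competitor_prices (
    id INTEGER PRIMARY KEY,
    competitor TEXT NOT NULL,
    plan_name TEXT,
    price_usd REAL,
    scraped_at TEXT
);
"""

def init_db(path: str) -> None:
    # executescript runs all CREATE statements in one call
    with sqlite3.connect(path) as conn:
        conn.executescript(SCHEMA)
```

One file on disk, openable in any SQLite viewer, extendable with a new `CREATE TABLE` whenever a new data source appears.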
2. An MCP server
MCP stands for Model Context Protocol. It is an open standard that lets Claude connect to external tools and data sources directly inside a conversation. Instead of copy-pasting data into a prompt, you expose your database through an MCP server and Claude queries it on its own.
Setting up an MCP server is a technical step, but it is not complex. If you get stuck, ask Claude how to do it.
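At its core, what the MCP server exposes is a query tool. A minimal sketch of the underlying function is below; the function name and the read-only choice are my assumptions, not part of the protocol. An MCP server, for example one built with the official Python SDK, would register a function like this as a tool Claude can call mid-conversation.

```python
import sqlite3

def run_readonly_query(db_path: str, sql: str) -> list[dict]:
    # Open the database in read-only mode (SQLite URI syntax) so the
    # AI side can only SELECT from it, never modify it.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    conn.row_factory = sqlite3.Row  # rows behave like dicts
    try:
        rows = conn.execute(sql).fetchall()
        return [dict(r) for r in rows]
    finally:
        conn.close()
```

Read-only access is a deliberate design choice: Claude gets full visibility into the data without any path to changing it.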
3. Hexomatic as the daily data pipeline
This is the part that keeps the external data alive. A static database goes stale within weeks. Competitors change their pricing. New articles get published. Tenders open and close. Local market listings shift.
Hexomatic runs scheduled workflows that scrape specific sources and output the results as CSV files. Those files then need to make their way into your local SQLite database. I built a small local import app that watches a folder on my machine. When a new CSV lands there, it picks it up and inserts the records automatically.
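My import app is specific to my setup, but the idea can be sketched in a few lines: poll a folder, and when a new CSV appears, insert its rows into the matching table. The folder path, table mapping, and polling interval here are illustrative assumptions, not my actual configuration.

```python
import csv
import sqlite3
import time
from pathlib import Path

WATCH_DIR = Path("~/hexomatic-exports").expanduser()  # illustrative path

def import_csv(db_path: str, csv_path: Path, table: str) -> int:
    """Insert every row of the CSV into `table`; returns row count."""
    with csv_path.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        return 0
    # Assumes trusted CSV headers that match the table's column names,
    # since the files come from your own workflows.
    cols = list(rows[0].keys())
    placeholders = ", ".join("?" for _ in cols)
    with sqlite3.connect(db_path) as conn:
        conn.executemany(
            f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})",
            [[r[c] for c in cols] for r in rows],
        )
    return len(rows)

def watch(db_path: str, table: str) -> None:
    seen: set[Path] = set()
    while True:  # simple polling loop; a real app might use file events
        for p in WATCH_DIR.glob("*.csv"):
            if p not in seen:
                import_csv(db_path, p, table)
                seen.add(p)
        time.sleep(60)
```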
The one manual step is downloading the CSV from Hexomatic once a day. That sounds like it defeats the purpose, but I made that choice deliberately. Connecting an automated pipeline directly from the web into your local machine is a security tradeoff I am not comfortable with yet. One manual click per day keeps the data flow air-gapped from any external system. The database stays local, the connection to Claude stays local, and nothing from the web writes directly to my machine without me seeing it first.
This is a part of the setup we are actively thinking about. There will likely be a cleaner solution that handles the import automatically without compromising the security model. For now, one download per day is a reasonable tradeoff given what the rest of the system gives you in return.
4. CRM sync
External data from the web is only half the picture. The other half is your own operational data: customers, signups, payments, job history, service records, support tickets. This lives in your CRM, and it needs to flow into the same SQLite database so Claude can reason across both layers at once.
This is the part most people skip. They connect AI to web data and stop there. But the most useful questions are the ones that cross both layers: which customers in a specific region have not been contacted since a competitor changed their pricing in that area, or which clients are due for a follow-up based on their last service date. Those questions only work when external signals and internal records live in the same place.
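As a sketch of what crossing both layers looks like in practice, here is the first of those questions as a single SQL join. The table and column names are hypothetical; the assumption is simply that internal CRM records and scraped competitor data share a region column and both carry dates.

```python
import sqlite3

# Hypothetical cross-layer query: internal CRM records joined against
# externally scraped competitor pricing, matched by region. Finds
# customers whose region saw a competitor price change after our
# last contact with them.
CROSS_LAYER_SQL = """
SELECT c.name, c.region, c.last_contacted_at, p.competitor, p.scraped_at
FROM customers AS c
JOIN competitor_prices AS p
  ON p.region = c.region
WHERE p.scraped_at > c.last_contacted_at
ORDER BY p.scraped_at DESC;
"""

def customers_behind_market(db_path: str) -> list[tuple]:
    with sqlite3.connect(db_path) as conn:
        return conn.execute(CROSS_LAYER_SQL).fetchall()
```

The query only makes sense because both tables live in one database; split across two systems, Claude would have nothing to join.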
What to Scrape and Insert
The specific data depends on your business. Here is what I track, and why each one matters.
Your own published content
Every article, newsletter, or post goes into the database as soon as it is published. When I ask Claude to suggest a new article angle, it already knows everything I have written. It does not suggest topics I covered two years ago.
Competitor pages
Pricing pages, feature lists, landing pages, blog content. Scraped weekly. When a competitor adds a new plan or removes a feature, it appears in the database within days. Claude can tell me exactly what changed and what it signals.
Customer data from your own tools
Customer records, signups, and payment history flow into the database automatically through the CRM sync. When I ask Claude about customer behavior or revenue trends, it reads from actual records.
Google Maps and local listings for service businesses
For my local service business, I track competitors in South Florida. New businesses entering the market, existing ones closing, review counts shifting. Hexomatic scrapes this weekly. AI can give me a competitive briefing on the local market without me doing any manual research.
Tender and procurement portals
If your business responds to bids or contracts, scraping relevant procurement portals and inserting new listings into the database turns AI into a bid discovery assistant. Ask it for qualified opportunities this week, and it pulls from real current data.
Industry news and job postings
Competitor hiring patterns reveal what they are building before they announce it. A company adding a machine learning team means a product shift is coming. Scraping job boards and inserting the results gives AI a forward-looking signal that would otherwise require hours of manual tracking.
What AI Can Do With This Data
The difference is not subtle. Here are examples of questions I ask that would be impossible to answer properly without the database.
“Which customers have not logged in for 90 days but are still on a paid plan?” Claude queries the database and returns a list I can act on immediately.
“What topics have I written about most in the last six months, and what gaps exist based on what customers are asking?” Claude cross-references my article archive against the support tickets and returns a content gap analysis.
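Under the hood, the first question becomes a plain SQL query. A sketch, assuming an illustrative `customers` table with `plan` and `last_login_at` (ISO date) columns:

```python
import sqlite3

# Hypothetical query behind "paid but inactive for 90 days".
# SQLite's date() function computes the cutoff date; ISO 8601
# strings compare correctly as text.
INACTIVE_PAID_SQL = """
SELECT name, last_login_at
FROM customers
WHERE plan = 'paid'
  AND last_login_at <= date('now', '-90 days');
"""

def inactive_paid_customers(db_path: str) -> list[tuple]:
    with sqlite3.connect(db_path) as conn:
        return conn.execute(INACTIVE_PAID_SQL).fetchall()
```

The point is not that you write this SQL yourself; Claude generates it from the natural-language question and runs it through the MCP connection.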
None of this requires a prompt full of pasted data. Claude already has the context. The conversation can stay focused on the decision, not on the setup.
The Part That Most People Skip
The database alone is not enough. The workflow that keeps it updated is what makes the system worth building.
The Hexomatic workflows run on a fixed schedule. Some are daily: competitor pricing pages, new customer signups, recent articles. Some are weekly: Google Maps competitor data, tender portals, job listings. Some are monthly: a broader competitive landscape sweep and a full content audit.
The schedule is not the point. The habit is. Once the workflows are running, the database stays current without you touching it. AI always has something real to work with.
How to Start
You do not need to build all of this at once. Start with one data source that would immediately change how you use AI.
If you are a content creator, start by inserting your published articles. AI will stop suggesting duplicates and start giving you genuinely useful feedback on gaps.
If you run a SaaS or subscription business, start with your customer data. The ability to ask natural language questions against real customer records is immediately useful.
If you compete in a local market, start with Google Maps data for your area. A weekly scrape and a simple table in SQLite gives AI enough context to brief you on competitive shifts without any manual research.
Build one workflow. Let it run for two weeks. Then extend it.
The underlying point is simple. AI is not going to learn your business on its own. But once you give it the right data, it stops guessing and starts being useful in a way that actually compounds over time. The information going in today makes the answers better next month, and better still the month after.
That is the system worth building.
I am just getting close to the point where this setup delivers the real benefit I had in mind when I started building it. More to come, including updates on how Hexomatic fits deeper into this pipeline as we develop it further. The broader topic of building a second brain, outside of Hexact products and from a more personal angle, is something I cover in my personal newsletter. If that wider context interests you, follow along at publication.aslanyan.net.
If you want to build the Hexomatic workflows that power this kind of setup, start with Hexomatic’s Google Search and Get page content automations. If you would rather have it built for you, the concierge service is the fastest path: calendly.com/hexact/concierge-service-hexact


