
Where Does Your Data Go When You Use AI?
by Deep Parmar
CTO, Sunbots & Xwits

Part 5 of the series "AI, Without the Hype". Start at Part 1.
When you type into a chatbot like ChatGPT, Claude, or Gemini, that text usually leaves your device and travels to a company's servers somewhere in the world. There, it is processed, often logged, and — depending on the tool and your settings — sometimes used to help train future models. The picture you uploaded, the contract you pasted, the email you asked it to rewrite: all of it made a round trip. The exception is on-device AI, which runs the model locally and never sends your data anywhere.
That is the whole answer. The rest of this post is just the detail behind it, because the detail is where people get caught out.
I build AI for a living, and a fair amount of what I build deliberately keeps data on the user's own machine. So this is not a scare piece. It is the honest version of where your words go, and what you can do about it.
What actually happens to your prompts
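Think of a cloud AI tool as sending a postcard rather than sealing a letter. You write your message, it goes through the postal system, and other parties can, in principle, handle it along the way. In practice the connection is usually encrypted, so the parties that matter are the provider and whoever the provider allows to see it.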
Here is the typical journey:
- Your text or image is sent over the internet to the provider's servers.
- The model processes it there and sends an answer back to you.
- A copy of your input and the output is often stored for a period — for safety checks, debugging, and abuse prevention.
- On some plans, with certain settings, that stored data may be reviewed by humans or used to improve the model.
Most reputable providers now let you turn off training on your data, and business or enterprise plans usually promise not to train on your inputs at all. But two things stay true. The data still travels to a server you do not control. And "we don't train on it" is not the same as "we never store it."
What you share without realising
The risk is rarely the casual question. It is the habit. Once a tool is useful, you stop thinking before you paste.
Watch for these:
- Work documents. Strategy notes, unreleased plans, internal numbers — pasted in to "summarise this for me."
- Client and customer data. Names, phone numbers, order details, medical or financial information that belongs to someone else, not you.
- Personal information. Your Aadhaar or PAN number, bank statements, salary slips, private messages.
- Code and credentials. Source code, API keys, passwords copied in by accident inside a larger block of text.
The uncomfortable truth: the moment that data lands on an outside server, you have handed control of it to a third party. You are trusting their security, their staff, and their policies. Sometimes that trust is well placed. The point is that it should be a choice, not an accident.
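
Some of the items in the list above have a recognisable shape, so a rough check can run on your own machine before anything is pasted. The sketch below is illustrative only: the function name `find_sensitive` and the pattern set are assumptions made for this example, not part of any tool mentioned in this post, and the patterns will miss many real identifiers and key formats.

```python
import re

# Illustrative patterns only. Real identifiers and secrets come in many more shapes.
PATTERNS = {
    # PAN: five letters, four digits, one letter.
    "pan": re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b"),
    # Aadhaar: twelve digits, often written in groups of four.
    "aadhaar": re.compile(r"\b[2-9][0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    # Lines such as API_KEY=... or password: ...
    "secret_assignment": re.compile(
        r"\b(?:api[_-]?key|secret|token|password)\b\s*[:=]\s*\S+",
        re.IGNORECASE,
    ),
}


def find_sensitive(text: str) -> dict[str, list[str]]:
    """Return every match for each pattern that appears in the text."""
    found = {}
    for label, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[label] = matches
    return found


if __name__ == "__main__":
    draft = "Summarise this: PAN ABCDE1234F, API_KEY=sk123"
    for label, matches in find_sensitive(draft).items():
        print(f"Check before pasting ({label}): {matches}")
```

An empty result does not mean the text is safe to share. It only means none of these particular patterns matched, so the habit of reading before you paste still applies.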
Cloud vs on-device, simply
There are two fundamentally different places an AI model can run.
Cloud AI runs on the company's powerful servers. You send your data out, it comes back answered. This is most of what you use today. It is powerful and convenient. The trade-off is that your data leaves your device.
On-device AI (also called edge AI) runs the model directly on your phone, laptop, or even inside your web browser. Your data never leaves. There is no server to send it to. It works offline, and there is nothing on a provider's server to leak in a breach, because nothing left your device in the first place.
A simple way to hold it: cloud AI is like consulting a brilliant specialist in another city. You have to send your file across to get an opinion. On-device AI is like having a capable doctor in your own home: less specialised, but the file never leaves the house. I have written more on the privacy-first, client-side approach to AI if you want the engineering view.
Practical privacy habits anyone can adopt
You do not need to be technical. You need a few rules you actually follow.
- Assume it is not private by default. Before you paste, ask: would I be comfortable if this appeared on a public server log? If not, do not paste it.
- Redact before you send. Replace real names, numbers, and account details with placeholders like "Customer A" or "₹X". The AI rarely needs the real value to help you.
- Turn off chat history and training in the settings of any tool you use regularly. It takes two minutes.
- Use a work-approved tool for work data. Business plans usually come with a written promise not to train on your inputs. Personal free accounts often do not.
- Prefer on-device tools for sensitive tasks. For private documents, a browser-based or offline tool means the question of "where did it go" simply does not arise — I built Dhiya to do exactly this, with no server and no API key.
None of this slows you down much. It just moves you from sharing by accident to sharing by choice.
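
For readers who are comfortable with a little code, the "redact before you send" habit can be sketched in a few lines. This is a minimal illustration, not how any particular product works: the names `redact`, `restore` and `AMOUNT` are hypothetical, you supply the list of names to hide yourself, and only rupee amounts are detected automatically. The mapping from placeholder to real value stays on your own machine.

```python
import re

# Rupee amounts such as ₹12,500 or ₹3,000.50 (illustrative pattern only).
AMOUNT = re.compile(r"₹\s?\d+(?:,\d+)*(?:\.\d+)?")


def redact(text: str, names: list[str]) -> tuple[str, dict[str, str]]:
    """Swap the given names and any rupee amounts for placeholders.

    Returns the redacted text and a placeholder-to-original mapping.
    This sketch supports up to 26 names (Customer A to Customer Z).
    """
    mapping: dict[str, str] = {}
    for index, name in enumerate(names):
        placeholder = f"Customer {chr(ord('A') + index)}"
        mapping[placeholder] = name
        text = text.replace(name, placeholder)

    amounts: list[str] = []

    def swap_amount(match: re.Match) -> str:
        amounts.append(match.group(0))
        placeholder = f"₹X{len(amounts)}"
        mapping[placeholder] = match.group(0)
        return placeholder

    text = AMOUNT.sub(swap_amount, text)
    return text, mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back, longest placeholders first."""
    for placeholder in sorted(mapping, key=len, reverse=True):
        text = text.replace(placeholder, mapping[placeholder])
    return text


if __name__ == "__main__":
    original = "Ravi Kumar owes ₹12,500 and Meera Shah paid ₹3,000.50."
    safe, mapping = redact(original, ["Ravi Kumar", "Meera Shah"])
    print(safe)                     # this is the version you would paste
    print(restore(safe, mapping))   # real values restored locally
```

Only the redacted text is pasted into the AI tool. When the answer comes back, `restore` puts the real values in on your side, so the provider never sees them.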
Why this is also a business and legal issue in India
This has stopped being only a personal-comfort question. In India, the Digital Personal Data Protection Act, 2023 (the DPDP Act) sets rules for how organisations handle people's personal data. If you paste a customer's details into a random AI tool, you may be moving that person's data to a third party, possibly across borders, without a proper basis. For a business, that is a compliance problem, not just an etiquette one.
The principle is older than the law: data you collect about other people is borrowed, not owned. Putting it into a tool you have not vetted is a decision with consequences. Increasingly, those consequences have a legal name attached.
The good news is that the safest option is also getting better. On-device AI is becoming genuinely capable — capable enough that "keep it local" is no longer the weak choice. That is exactly where this series goes next.
Next up — Part 6: AI That Runs on Your Phone, Not in Someone's Cloud. Read the whole series at deepap.dev.
