What questions should I ask before starting an AI project?

Five questions decide whether an AI project ships: What decision does this AI actually change? What is the cost of being wrong? Do we have the data to train and evaluate? Who owns the system after launch? And is AI actually the right tool? Skipping any one of these is the fastest path to a six-month prototype that never reaches users.

How do I know if my AI project will succeed?

Projects succeed when there is a clearly defined decision the AI changes, a measurable success metric tied to business value, available evaluation data, and a named owner for the production system. Projects with vague goals like 'add AI to our product' almost always stall before launch.

Why do most AI projects fail to ship?

Most AI projects fail because of unclear scope, missing evaluation data, or no operational owner — not because the model does not work. Without a concrete user decision the AI affects and a metric tied to business outcomes, prototypes drift, stakeholders disengage, and the project quietly dies.

What data do I need before starting an AI project?

You need representative examples of the inputs the system will see in production, ground-truth labels or outputs to evaluate against, and edge cases that show how the system should fail gracefully. Without an evaluation set, you cannot tell whether iterations are improving or regressing the system.

Should I use an existing AI model or build a custom one?

Start with an existing foundation model and only build custom when you have proven that off-the-shelf options cannot meet your quality, cost, latency, or privacy requirements. Custom models add training, deployment, and maintenance burden that most product teams underestimate.

5 Questions Before Starting an AI Project

The Prototype Graveyard

Most AI projects don't fail in production. They fail in planning, or more precisely, they fail because planning was skipped entirely. A client calls with an idea, the team gets excited, someone spins up a notebook, and six months later there's a demo that impresses no one and a project that's 200% over budget.

I've been on both sides of this. At Sunbots, we've built production AI systems for clients across healthcare, legal, retail, and education. The projects that failed all shared one characteristic: someone started building before these five questions had clear answers.

Question 1: What Decision Does This AI Need to Change?

Not "what problem does AI solve?" but specifically: what decision will a human or system make differently because this AI exists?

If you can't name the decision, you don't have a product requirement — you have an experiment. Experiments are fine, but they have a different budget, timeline, and success criteria than products.

For our retail theft detection system, the decision was: "Should a security guard investigate this specific moment in the camera feed right now?" Every design choice — latency, false positive rate, alert format — flowed from that one concrete decision.

Question 2: What Does "Wrong" Cost?

Every AI system makes mistakes. The question isn't whether it will — it's what happens when it does.

A false positive in retail theft detection means a security guard investigates a legitimate customer. Annoying, but low cost. A false negative means a theft goes undetected. Also relatively low cost in the scheme of things. This symmetry let us set reasonable accuracy thresholds and ship faster.

An AI that wrongly flags a legitimate medical claim as fraudulent is a very different situation. Get the error cost wrong and you'll either build an over-engineered system nobody needed or an under-engineered one that causes real harm.

Before starting, write down three things: what a false positive costs, what a false negative costs, and who is accountable when the system errs. If you can't answer all three, you're not ready to build.
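
To see how written-down error costs turn into a design choice, here is a minimal Python sketch that picks an alert threshold by total error cost. The cost figures, scores, labels, and candidate thresholds are all made up for illustration, and the names `ErrorCosts`, `total_error_cost`, and `cheapest_threshold` are hypothetical, not taken from any system described here.

```python
from dataclasses import dataclass


@dataclass
class ErrorCosts:
    false_positive: float  # cost of one false alarm (hypothetical units)
    false_negative: float  # cost of one missed event (hypothetical units)


def total_error_cost(scores, labels, threshold, costs):
    """Sum the cost of every error made when alerting on score >= threshold."""
    total = 0.0
    for score, is_positive in zip(scores, labels):
        predicted = score >= threshold
        if predicted and not is_positive:
            total += costs.false_positive
        elif not predicted and is_positive:
            total += costs.false_negative
    return total


def cheapest_threshold(scores, labels, candidates, costs):
    """Return the candidate threshold with the lowest total error cost."""
    return min(candidates, key=lambda t: total_error_cost(scores, labels, t, costs))


# Illustrative data only: model scores with their ground-truth labels.
scores = [0.1, 0.3, 0.4, 0.5, 0.6, 0.8, 0.9]
labels = [False, True, False, False, False, True, True]
candidates = [0.2, 0.45, 0.7]

symmetric = ErrorCosts(false_positive=1.0, false_negative=1.0)
miss_heavy = ErrorCosts(false_positive=1.0, false_negative=20.0)

print(cheapest_threshold(scores, labels, candidates, symmetric))   # 0.7
print(cheapest_threshold(scores, labels, candidates, miss_heavy))  # 0.2
```

With equal costs the highest threshold wins. Once a miss is priced at 20 times a false alarm, the lowest threshold wins even though it triggers more false alarms. The model is the same in both cases; only the error costs changed.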

Question 3: Do You Actually Have the Data?

Not "will you have the data eventually" — do you have it now, in the format you need, with the labels you require?

I've seen projects where the data existed but was split across 12 vendor systems, required manual export, and had inconsistent timestamps. That data is available in theory but unusable in practice: it needs 3 months of data engineering work before a single model can be trained.

For SmartON's currency detection, we needed labeled images of Indian currency notes under varied lighting conditions. We didn't have them. Before committing to a timeline, we spent 4 weeks on data collection — photographing notes in different environments, lighting conditions, and levels of wear. That 4-week investment saved us from making 3-month accuracy promises we couldn't keep.

Ask: what is our training set? What is our ground truth? Who labeled it? When was it collected? Is the distribution representative of production conditions?
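
Some of these questions can be checked mechanically. The sketch below, assuming you can hand-label a small sample of real production inputs, counts unlabeled training rows and flags labels whose share differs between the two sets. The function names, the 10-point gap threshold, and the "clean" and "worn" labels are illustrative assumptions.

```python
from collections import Counter


def label_shares(labels):
    """Fraction of examples per label, ignoring missing (None) labels."""
    present = [label for label in labels if label is not None]
    counts = Counter(present)
    return {label: count / len(present) for label, count in counts.items()}


def audit(train_labels, production_labels, max_gap=0.10):
    """Report unlabeled training rows and labels whose share gap exceeds max_gap."""
    train = label_shares(train_labels)
    prod = label_shares(production_labels)
    gaps = {}
    for label in sorted(set(train) | set(prod)):
        gap = abs(train.get(label, 0.0) - prod.get(label, 0.0))
        if gap > max_gap:
            gaps[label] = round(gap, 2)
    return {
        "unlabeled_train_rows": sum(label is None for label in train_labels),
        "share_gaps": gaps,
    }


# Made-up example: training data skews toward clean notes, production does not.
train_labels = ["clean"] * 8 + ["worn"] * 2 + [None] * 2
production_labels = ["clean"] * 5 + ["worn"] * 5

report = audit(train_labels, production_labels)
print(report)
```

A non-empty `share_gaps` does not prove the training set is unusable, but it is a cheap early signal that the distribution question deserves a closer look before anyone commits to a timeline.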

Question 4: Who Maintains This After Launch?

AI systems require ongoing maintenance in a way that traditional software doesn't. Models drift as the world changes. Data pipelines need monitoring. Edge cases accumulate. A fraud detection model trained on 2024 transaction patterns may perform poorly on 2025 transaction patterns — not because the model is broken, but because the world changed around it.

Before you build, name the person who will own model monitoring, retrain the model when performance drops, and investigate anomalies in production. If that person doesn't exist in your organization, factor their cost into the project budget — or simplify the system until a current team member can absorb the maintenance load.
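
As a rough sketch of what that owner might watch, the class below tracks accuracy over the most recent labeled predictions and signals when it falls a set margin below the launch baseline. It assumes ground-truth labels eventually arrive for production predictions, which is not true of every system. The class name, window size, and tolerance are hypothetical.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent labeled predictions."""

    def __init__(self, baseline, window=100, tolerance=0.05):
        self.baseline = baseline      # accuracy measured at launch
        self.tolerance = tolerance    # allowed drop before someone should look
        self.outcomes = deque(maxlen=window)  # oldest results fall off automatically

    def record(self, predicted, actual):
        """Store whether one prediction matched its ground-truth label."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_review(self):
        """True once the window is full and accuracy has fallen too far."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.accuracy() < self.baseline - self.tolerance


# Made-up stream: 9 correct predictions, then 4 wrong ones.
monitor = AccuracyMonitor(baseline=0.9, window=10, tolerance=0.05)
for _ in range(9):
    monitor.record("fraud", "fraud")
monitor.record("fraud", "legit")
print(monitor.needs_review())  # False: 9 of the last 10 were correct

for _ in range(3):
    monitor.record("fraud", "legit")
print(monitor.accuracy())      # 0.6
print(monitor.needs_review())  # True
```

A check like this only raises the flag. Deciding whether to retrain, roll back, or investigate the data pipeline is still the named owner's job.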

Question 5: Is AI Actually the Right Tool?

This is the question most AI engineers are reluctant to ask because the honest answer is sometimes "no."

If you're trying to classify documents into 5 categories and you already have 50 clear examples of each, a few well-crafted regex patterns and a decision tree will often outperform a large language model in speed, cost, interpretability, and reliability. Save the LLM for the cases where you have ambiguity that rules can't capture.
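
For a sense of scale, a rules-first classifier can be this small. This sketch covers only the regex half (no decision tree), and the categories and patterns are invented for illustration. Anything that matches no rule comes back as `None`, which marks the ambiguous cases that might justify a model or a human reviewer.

```python
import re

# Hypothetical categories and patterns, for illustration only.
RULES = [
    ("invoice", re.compile(r"\binvoice\b|\bamount due\b", re.IGNORECASE)),
    ("contract", re.compile(r"\bagreement\b|\bhereinafter\b", re.IGNORECASE)),
    ("resume", re.compile(r"\bwork experience\b|\bcurriculum vitae\b", re.IGNORECASE)),
]


def classify(text):
    """Return the first matching category, or None when no rule applies."""
    for category, pattern in RULES:
        if pattern.search(text):
            return category
    return None


print(classify("INVOICE 1042: amount due in 30 days"))         # invoice
print(classify("This Agreement is made between the parties"))  # contract
print(classify("Lunch menu for Friday"))                       # None
```

Rules are checked in order, so a document that matches two patterns gets the first category listed. That ordering is a design choice you can read and change directly, which is part of the interpretability argument above.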

I've refused AI project proposals three times in the last two years because a simpler deterministic system was the right answer. In each case, the client was initially disappointed — and later grateful when they saw how much faster and cheaper the working solution was.

Ask: is this a pattern recognition problem that genuinely requires learning from data, or is it a business logic problem that should be expressed as code?

If you've answered all five and you're ready to build, the next question is: custom AI or off-the-shelf APIs? That comparison is covered in "Custom AI vs. API Wrappers: The Real Cost Comparison."
