Can you build an AI product in 90 days?

Yes. A focused AI product with tight scope and clear users can ship a working version in 60 to 90 days. SmartON's first usable version came together in that window. Reaching full production reliability and broad rollout typically takes several more months after the initial launch.

What does it take to ship an AI product in 90 days?

Shipping in 90 days requires ruthless scope discipline, decisions made in days not weeks, a small co-located team, willingness to defer non-critical features, and using existing models and infrastructure instead of building from scratch. The plan that worked for SmartON kept scope narrow and iterated weekly with real users.

What are the biggest risks of building AI products quickly?

Speed risks include over-fitting to the demo case, skipping evaluation, accumulating technical debt that slows later iteration, and confusing 'works in a demo' with 'works in production'. The mitigation is to build evaluation alongside the model from day one, not after launch.

What did building SmartON in 90 days teach you?

It taught us that constraints clarify priorities, that real users break assumptions faster than internal testing, that voice and vision pipelines have far more edge cases than expected, and that the post-launch work is at least as large as the pre-launch work — even when the launch itself goes well.

What would you do differently when building an AI product fast?

Build the evaluation harness on day one, design the data and feedback loop before the first ML model is trained, write down assumptions explicitly so they can be revisited, and resist the temptation to add features before the core flow is reliable. Most regrets are about features added too early.

Building an AI Product in 90 Days

The Starting Point

In early 2023, I sat in a room with two questions: how do visually impaired users navigate daily financial transactions, and what would it take to build something that genuinely helps? The second question had a constraint attached — we had 90 days before a demo commitment I'd already made.

SmartON started as an Android app that could identify Indian currency notes by pointing a camera at them. It shipped as a multilingual, multimodal assistive AI that handles currency detection, scene understanding, OCR, and document search — all routed through a voice interface that works in Gujarati, Hindi, and English. Here's how those 90 days actually went.

Week 1–2: Define the Minimum Viable Capability

The first thing we did was resist scope. Every conversation about SmartON generated new feature ideas — text-to-speech with personality, a social layer for sharing descriptions, integration with navigation apps. All of them were interesting. None of them were the core problem.

We forced ourselves to answer one question: what is the single most important thing this product needs to do for a visually impaired user to consider it genuinely useful on day one? The answer was currency detection — the ability to identify Indian rupee denominations reliably, fast, and without an internet connection.

Everything else was Phase 2. This decision saved us from an 18-month build and let us focus engineering resources on making one thing excellent instead of five things mediocre.

Week 3–6: Data Collection and Model Training

We quickly hit the first major obstacle: there was no public dataset of Indian currency notes under realistic conditions — varied lighting, worn notes, partial views, different camera angles. We had to build our own.

We spent three weeks photographing notes in every condition we could manufacture: bright sunlight, dim indoor lighting, crumpled notes, notes partially covered by fingers. We collected ~8,000 images, labeled them carefully, and applied aggressive augmentation to expand the effective dataset.
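To make "aggressive augmentation" concrete, here is a minimal sketch of two transforms that mirror the conditions described above: lighting changes and a finger covering part of the note. It is illustrative only, not SmartON's actual pipeline. The function names, the brightness range, and the grayscale list-of-lists image format are all assumptions chosen to keep the example dependency-free.

```python
import random

Image = list[list[int]]  # grayscale pixels, 0-255, row-major (assumed format)


def adjust_brightness(img: Image, factor: float) -> Image:
    """Scale every pixel by `factor`, clamping to the valid 0-255 range."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in img]


def occlude(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Black out a rectangle, a crude stand-in for a finger over the note."""
    out = [row[:] for row in img]  # copy so the source image is untouched
    for r in range(top, min(top + height, len(out))):
        for c in range(left, min(left + width, len(out[r]))):
            out[r][c] = 0
    return out


def augment(img: Image, rng: random.Random) -> Image:
    """Apply one random lighting change, then one random occlusion."""
    rows, cols = len(img), len(img[0])
    lit = adjust_brightness(img, rng.uniform(0.4, 1.6))  # hypothetical range
    h, w = max(1, rows // 4), max(1, cols // 4)
    top = rng.randrange(rows - h + 1)
    left = rng.randrange(cols - w + 1)
    return occlude(lit, top, left, h, w)
```

Calling `augment` several times per photo with a seeded `random.Random` gives reproducible variants, which makes it easier to trace a model failure back to the exact transform that produced the training example.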

The model training itself took two iterations. Our first YOLO-based model hit 94% accuracy in testing, which sounds good until you realize that a 6% error rate means roughly 1 in 17 notes is misidentified. For a financial transaction tool, that is not acceptable. We went back to the data, identified the failure modes (worn serial numbers, low-light conditions), collected targeted examples, and retrained. Second iteration: 98.7% accuracy. We shipped that.
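The arithmetic behind those two accuracy figures is worth spelling out, because "notes per error" is a more intuitive number than a percentage. A small sketch follows; the function name is made up for illustration.

```python
def notes_per_error(accuracy: float) -> float:
    """Average number of notes scanned per misidentified note."""
    error_rate = 1.0 - accuracy
    if error_rate <= 0:
        raise ValueError("accuracy must be below 1.0")
    return 1.0 / error_rate


for acc in (0.94, 0.987):
    print(f"{acc:.1%} accurate: about 1 error in {notes_per_error(acc):.0f} notes")
# 94.0% accurate: about 1 error in 17 notes
# 98.7% accurate: about 1 error in 77 notes
```

Going from 94% to 98.7% looks like a gain of under five points, but it cuts the error rate from 6% to 1.3%, which is between four and five times fewer misidentified notes.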

Week 7–10: Building the Android App and Voice Layer

The model was the hardest part. The Android integration was hard in a different way — predictable engineering challenges rather than research uncertainty.

We chose TensorFlow Lite for on-device inference because the alternative (sending camera frames to a server for inference) would add 300–800 ms of latency, which is unacceptable for a real-time tool. Reaching 30 fps on-device required model quantization, careful memory management, and a custom camera preview pipeline.
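A quick way to see why a server round trip did not fit is to express the latency in camera frames. This sketch uses only the figures quoted above (30 fps, 300 to 800 ms); the function names are illustrative.

```python
def frame_interval_ms(fps: float) -> float:
    """Time available per frame at a given frame rate."""
    return 1000.0 / fps


def frames_elapsed(latency_ms: float, fps: float) -> int:
    """Whole camera frames that go by while waiting `latency_ms` for a result."""
    return int(latency_ms * fps // 1000)


print(f"Budget per frame at 30 fps: {frame_interval_ms(30):.1f} ms")
for latency in (300, 800):
    print(f"{latency} ms round trip = {frames_elapsed(latency, 30)} frames behind")
# Budget per frame at 30 fps: 33.3 ms
# 300 ms round trip = 9 frames behind
# 800 ms round trip = 24 frames behind
```

At 30 fps each frame has about 33 ms, so even the best-case server round trip would answer for a view the camera left nine frames earlier.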

The voice layer came next. We needed text-to-speech that felt natural in three languages and speech recognition that worked in noisy environments. We used Android's native speech APIs for recognition and a custom TTS pipeline for output, with language detection handled by a small classification model running locally.
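For readers who want a feel for the language-routing step, here is a deliberately simple stand-in. It is not the classification model described above. It guesses the language of a piece of text from its Unicode script (the Gujarati and Devanagari blocks), which is enough to pick a TTS voice for written output. The function name and the two-letter codes are assumptions, and Devanagari is treated as Hindi even though other languages share that script.

```python
def detect_script_language(text: str) -> str:
    """Return 'gu', 'hi', or 'en' based on which script dominates `text`.

    Heuristic only: counts characters in the Gujarati (U+0A80-U+0AFF) and
    Devanagari (U+0900-U+097F) Unicode blocks versus ASCII letters.
    """
    counts = {"gu": 0, "hi": 0, "en": 0}
    for ch in text:
        code = ord(ch)
        if 0x0A80 <= code <= 0x0AFF:
            counts["gu"] += 1
        elif 0x0900 <= code <= 0x097F:
            counts["hi"] += 1
        elif ch.isascii() and ch.isalpha():
            counts["en"] += 1
    best = max(counts, key=counts.get)
    # Fall back to English when no recognizable letters were found.
    return best if counts[best] > 0 else "en"
```

A script check like this cannot tell romanized Hindi from English, and it does nothing for spoken input, which is where a trained classifier earns its keep.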

Week 11–13: User Testing and the Pivots

We put the app in front of five visually impaired users in week 11. What we learned in two days of testing reshaped 20% of the product.

The biggest surprise: users wanted the app to tell them the note's orientation, not just its denomination. "Two hundred rupees" is useful. "Two hundred rupees, face side up, portrait orientation" is what you need when you're returning the note to your wallet correctly.
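In code terms, this changed the model's output from a single label to a small record. A hedged sketch of what composing the spoken message could look like is below; the `NoteReading` type, its field names, and the English-only word table are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoteReading:
    denomination: int  # rupees, e.g. 200
    face_up: bool      # True when the front of the note faces the camera
    portrait: bool     # True when the note is held upright


# English only here; a real app would need a table per supported language.
DENOMINATION_WORDS = {
    10: "Ten", 20: "Twenty", 50: "Fifty",
    100: "One hundred", 200: "Two hundred", 500: "Five hundred",
}


def announce(reading: NoteReading) -> str:
    """Build the utterance: denomination first, then side, then orientation."""
    amount = DENOMINATION_WORDS.get(reading.denomination, str(reading.denomination))
    side = "face side up" if reading.face_up else "face side down"
    orientation = "portrait orientation" if reading.portrait else "landscape orientation"
    return f"{amount} rupees, {side}, {orientation}"


print(announce(NoteReading(200, face_up=True, portrait=True)))
# Two hundred rupees, face side up, portrait orientation
```

Putting the denomination first means the most important fact is spoken even if the user moves on before the sentence ends.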

The second surprise: the voice feedback was too slow. We were generating the response after full inference. Users wanted to hear confirmation before the full inference completed — a partial result that could be corrected if wrong was better than silence followed by a correct answer 400ms later. We rebuilt the feedback loop to stream partial results.
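One way to picture the streaming behavior is as a generator that speaks an early guess as soon as confidence crosses a low bar, then speaks a correction only if a later, high-confidence result disagrees. This is a sketch under assumptions: the per-frame `(label, confidence)` input, both thresholds, and the "Correction:" wording are invented for illustration and are not taken from the app.

```python
from typing import Iterable, Iterator, Optional, Tuple


def stream_feedback(
    predictions: Iterable[Tuple[str, float]],
    early_threshold: float = 0.6,   # hypothetical: good enough to speak a guess
    final_threshold: float = 0.9,   # hypothetical: good enough to overrule it
) -> Iterator[str]:
    """Yield utterances as per-frame (label, confidence) predictions arrive."""
    spoken: Optional[str] = None
    for label, confidence in predictions:
        if spoken is None:
            if confidence >= early_threshold:
                spoken = label
                yield label  # tentative result, spoken immediately
        elif label != spoken and confidence >= final_threshold:
            spoken = label
            yield f"Correction: {label}"


frames = [("500", 0.65), ("200", 0.70), ("200", 0.93)]
print(list(stream_feedback(frames)))
# ['500', 'Correction: 200']
```

The asymmetric thresholds are the design choice that matters: a low bar for the first utterance keeps the app responsive, and a high bar for corrections keeps it from flip-flopping between labels.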

We also cut back a feature we'd spent two weeks building: a scene description mode that described everything in the camera view. Users found it overwhelming. They wanted specific, actionable information, such as "glass door slightly left, push bar at waist height", not a comprehensive inventory of the scene. We scoped it down aggressively.

What I'd Do Differently

Three things I wish we'd done earlier:

User testing in week 3, not week 11. About 20% of our product decisions turned out to be wrong, and we discovered that only at the end. Earlier user testing would have caught the orientation requirement and the feedback latency issue before we built around the wrong assumptions.

Set accuracy thresholds before training, not after. We trained to "as good as we can get" rather than "98% or we iterate." Having a target upfront would have given the team clearer stopping criteria.
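A threshold is easiest to enforce when it is a check the team can run after every training round. The sketch below assumes evaluation results are available as `(condition, predicted, actual)` tuples and applies the 98% target per capture condition, so a weak slice such as low light cannot hide behind a good overall average. The function names and condition labels are hypothetical.

```python
from collections import defaultdict


def accuracy_by_condition(results):
    """Map each capture condition to its accuracy.

    `results` is an iterable of (condition, predicted, actual) tuples.
    """
    totals, correct = defaultdict(int), defaultdict(int)
    for condition, predicted, actual in results:
        totals[condition] += 1
        correct[condition] += predicted == actual  # True counts as 1
    return {c: correct[c] / totals[c] for c in totals}


def failing_conditions(results, target=0.98):
    """Conditions whose accuracy is below `target`; empty list means ship."""
    scores = accuracy_by_condition(results)
    return sorted(c for c, acc in scores.items() if acc < target)


results = (
    [("daylight", 100, 100)] * 50
    + [("low_light", 100, 100)] * 48
    + [("low_light", 500, 100)] * 2   # two misreads in low light
)
print(failing_conditions(results))
# ['low_light']
```

Here the overall accuracy is exactly 98%, which would pass a single global check, while the per-condition view shows low light at 96% and flags it for another iteration.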

Simpler architecture first. We over-engineered the scene description module from the start. Starting with a simpler approach and adding complexity only when the simpler approach provably failed would have saved a week.

SmartON is now live at getsmartonai.com. If you're building an assistive AI product or want to talk through your AI build timeline, reach out.
