From research to App Store in 48 hours.
What rapid validation actually looks like when building is cheap.
I sat down last week with a specific brief. Not for a client. For myself.
Find a niche app idea in a growing industry. Something that solves a real problem. Something useful without requiring an account. Build it in under 48 hours. Ship it. See if strangers care.
I opened Gemini Deep Research and started feeding it that brief. What came back surprised me. HVAC load calculators. The kind of tool contractors use to size heating and cooling systems for homes.
I have never touched an HVAC system in my life.
Why HVAC?
Florida and Texas are booming. Construction is up. Every new build and renovation needs a load calculation. The existing software options? Desktop-first tools built in the Windows 95 era, costing $500-$1,000 per year. Or clunky apps that haven’t been updated since 2019.
Meanwhile, contractors are standing in someone’s living room trying to give a quote. They need something fast, accurate, and mobile. No login screen. No onboarding flow. Just open the app and calculate.
That gap was obvious. Not because I know HVAC. Because I know how to read a market.
Building with a daisy chain of AI tools
This is where it got interesting. And honestly, harder than I expected.
HVAC load calculations aren’t simple math. Manual J (the industry standard methodology) accounts for insulation values, window orientation, climate zones, square footage, occupancy, and dozens of other variables. Get it wrong, and a contractor sizes equipment incorrectly. Real consequences.
So I built a validation loop across multiple AI models.
Gemini Deep Research handled the market analysis and regulatory landscape. It surfaced the Manual J methodology, pricing benchmarks, competitor gaps, and growth data I needed to make a go/no-go decision.
Google Stitch generated initial UI concepts. I fed those into Cursor, running Claude, to build the actual application. I layered in the calculation logic piece by piece, testing against real scenarios.
Then I did something that made the whole process work: I built test cases across multiple states, housing types, and climate zones. I’d run the calculations, export the results, and show them to Gemini.
Do these numbers make sense for a 2,000-square-foot home in Tampa with standard insulation?
What about a 1,400-square-foot home in Dallas with upgraded windows?
Each model checked the other’s work. Claude built the logic. Gemini stress-tested the outputs. When the numbers didn’t pass the sniff test, I’d dig in, find the gap, fix it, and run it again.
No single AI tool could have done this alone. The value was in the loop.
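Here's a rough sketch of what a test grid like that could look like in Python. It is not the app's code, and it does not implement Manual J. The names (Scenario, build_scenarios, sniff_test, run_checks, placeholder_calculator) are all made up for illustration, the flat 25 BTU/h per square foot calculator is only a stand-in, and the 300 to 1,000 square feet per ton band is an assumed placeholder you'd want a real expert to set. It also swaps the "ask a second model" step for a plain range check, which is a cruder filter that can run before anything goes to Gemini.

```python
from dataclasses import dataclass
from itertools import product

BTU_PER_TON = 12_000  # one ton of cooling capacity is 12,000 BTU/h


@dataclass(frozen=True)
class Scenario:
    city: str
    square_feet: int
    insulation: str


def build_scenarios(cities, sizes, insulation_levels):
    """Every combination of city, home size, and insulation level."""
    return [Scenario(c, s, i) for c, s, i in product(cities, sizes, insulation_levels)]


def sniff_test(scenario, cooling_btuh, min_sqft_per_ton=300, max_sqft_per_ton=1000):
    """Return a warning string if the result looks implausible, else None.

    The default band is an illustrative assumption, not a Manual J rule.
    """
    if cooling_btuh <= 0:
        return f"{scenario}: non-positive load ({cooling_btuh})"
    sqft_per_ton = scenario.square_feet / (cooling_btuh / BTU_PER_TON)
    if not min_sqft_per_ton <= sqft_per_ton <= max_sqft_per_ton:
        return f"{scenario}: {sqft_per_ton:.0f} sq ft per ton is outside the expected band"
    return None


def run_checks(calculator, scenarios):
    """Run the calculator on every scenario and collect the warnings."""
    warnings = []
    for scenario in scenarios:
        warning = sniff_test(scenario, calculator(scenario))
        if warning is not None:
            warnings.append(warning)
    return warnings


def placeholder_calculator(scenario):
    """Stand-in for the real calculation logic: a flat 25 BTU/h per square foot."""
    return scenario.square_feet * 25.0


# Hypothetical grid: two cities, two home sizes, two insulation levels.
scenarios = build_scenarios(["Tampa", "Dallas"], [1400, 2000], ["standard", "upgraded"])
for warning in run_checks(placeholder_calculator, scenarios):
    print(warning)
```

Anything the range check flags is a candidate for the "dig in, find the gap, fix it" step. Anything it passes still deserves a second opinion, because a number can sit inside a plausible band and still be wrong.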
Constraints as a feature
I gave myself two rules beyond the 48-hour window.
One: bare minimum features only. No account creation. No cloud sync. No premium tier. Just a calculator that works. If a contractor opens the app in someone’s basement with no cell signal, it should still function.
Two: be willing to walk away. If nobody downloads it, that’s an answer. A useful one. I’d rather learn that in 48 hours than in 4 months.
Compare this to Leafed, my book discovery app. I spent weeks building features before putting them in front of real users. I love that app. But the feedback loop was slow. I was polishing when I should have been validating.
With the HVAC calculator, I flipped it. Ship first. Polish only if the market says so.
150 downloads in two weeks
I released it on February 2nd. By day three, 32 downloads. By the two-week mark, 150. Completely organic. Zero marketing spend. Zero social posts. No Product Hunt launch.
150 might sound small. But these aren’t vanity downloads. These are HVAC professionals who searched the App Store for a load calculator, found mine, and chose it. People with a job to do who picked my app over tools that have been on the market for years.
My research suggests that 1,000 active users is the threshold at which monetization makes sense for a niche professional tool. I’m watching the numbers and listening to how people use it before I make any final decisions about pricing.
I did test some pricing after crossing 150 downloads, and I’ve already had some success. So now I’m in “figure out the best way to handle this” mode. I’ve never really priced apps before, so it’s exciting.
Have some pricing tips? Leave me a message and let me know.
What this means if you’re a PM
We spend a lot of time talking about domain expertise. Years of experience in fintech. Deep knowledge of healthcare workflows. A background in whatever vertical your company operates in.
Domain expertise matters. I’m not dismissing it. But when building becomes this cheap and this fast, the bottleneck shifts.
The valuable skill isn’t knowing HVAC. It’s knowing how to find an underserved problem, set constraints that force speed, validate outputs even when you’re not the expert, and ship something real to see if the market responds.
I knew nothing about Manual J calculations two weeks ago. But I knew how to write a research brief. I knew how to evaluate market gaps. I knew how to build a multi-model workflow that caught errors I couldn’t catch on my own. And I knew when good enough was good enough to ship.
That’s product judgment. It doesn’t require domain expertise. It requires the willingness to look in markets you’ve never worked in and ask: Is there a problem here worth solving?
A shelf life on every experiment
The biggest shift in how I’m building now: every project gets a shelf life.
48 hours to build and ship. Two weeks to see if there’s traction. If the numbers say yes, invest more. If they say no, walk away clean.
No sunk cost. No “but I already built the premium features.” No weeks of polishing something nobody asked for.
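The rule is simple enough to write down as a function. This is a minimal sketch, assuming you pick the traction number yourself before you ship. The function name, the verdict strings, the example dates, and the threshold of 100 are all hypothetical. Only the 14-day window comes from the rule above.

```python
from datetime import date, timedelta


def shelf_life_verdict(shipped_on, today, downloads, traction_threshold, window_days=14):
    """Decide what to do with an experiment once its observation window closes."""
    deadline = shipped_on + timedelta(days=window_days)
    if today < deadline:
        return "keep watching"  # window still open, no decision yet
    if downloads >= traction_threshold:
        return "invest more"
    return "walk away"


# Example dates and threshold are placeholders, not the app's real data.
print(shelf_life_verdict(date(2024, 1, 1), date(2024, 1, 15), 150, traction_threshold=100))
```

Setting traction_threshold before launch is the point. It keeps the decision from drifting once you're attached to the thing you built.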
This kind of rapid experimentation works best when you can actually track what you’re doing. I use Monday.com to manage my project pipeline across consulting, apps, and newsletter content. When you’re running multiple experiments at different stages, having one place to see what’s active, what’s waiting on data, and what’s ready to kill makes the “walk away clean” part a lot easier.
People don’t care if your app is sexy. They care if it does the job they need done, works consistently, and doesn’t ask for their email before they’ve gotten any value.
Sometimes the best product thinking happens in markets you’ve never set foot in. You just need the judgment to recognize a problem and the discipline to stop building before you fall in love with the solution.
Until next week,
Mike Watson @ Product Party
P.S. Want to connect? Send me a message on LinkedIn, Bluesky, Threads, or Instagram.



