How to design product tests when you can't ship them.
Notes from four indie experiments this year.
One of my vibe-coded apps, which has been on the market since February, has been slowly gaining paying customers.
Pricing has been something that is new to me, so I’ve been dabbling with different approaches - trying to find the sweet spot.
I doubled the price of my HVAC Load Calculator app a few weeks ago just to test the waters. Ten minutes in RevenueCat, a few updates in the app stores, and the changes were ready to go.
I half-expected trials to drop. But they did not - somehow they stayed consistent (#winning). The few weeks brought twice the revenue from the same paying customers.
The app now has 100 paid subscribers. And although I’m ecstatic about the wins here, what surprised me is that the whole test cost an afternoon of work.
Shipping a test that quickly isn't always possible, but doing the work to determine when a test is worth running is.
The HVAC pricing was one of three experiments this year. The other two went differently.
Another one of my vibe-coded apps (another calculator) raised a different question: was paid pricing even the right monetization shape for it?
I added a paid tier as a test. Twenty downloads a month wasn’t enough volume to read any pricing signal. So…I killed that test after a couple of weeks.
Which raised a smaller question: if pricing won’t tell me anything yet, what will? I pivoted the same app to subtle in-app advertising, using ad networks to track signals while generating a few dollars in revenue. The “Is this app worth pricing?” question can wait until I know what the Discovery Math looks like.
The work was in figuring out what to test, not in shipping it.
Inside a company, the apparatus around a test exists for real reasons. Pricing changes affect real customers. Brand promises matter. Compliance is real.
But the engineering used to be the expensive line item. Once it stopped being expensive, the apparatus didn’t shrink to match. Where corporate testing slows down now has more to do with the apparatus than with how hard the work itself is.
Your expertise is the lever indies don’t have. Years of research sessions, pricing decisions, and competitor analysis. You already know what we have to learn the slow way.
Use it to remove apparatus where you can. Skip the precision that isn’t load-bearing.
A directional answer in a week beats a perfect one in a quarter. Most of a PM's credibility is spent justifying why a test needs more rigor.
The durable version is justifying why it needs less.
Design is half the work. The other half is getting the test approved.
Two moves when you can’t ship the test yourself:
Frame the test as a probe. “We don’t know if X will work. Let’s run a two-week experiment and see what we learn.” plays better than “We’re proposing a six-month strategy around X.”
Design the test before you propose it. Hand someone a clean brief with the cohort, the success metric, and the kill criteria set, and most of the friction is gone.
100 paid subscribers, no analytics, three tests this year. One worked. One died. One pivoted into something I didn't plan. That's testing when the apparatus goes to zero. Yours probably won't.
But the gap between what your tests cost and what they should cost is wider than you think. The expertise to close it is already in your head.
Mike Watson @ Product Party
P.S. Want to connect? Send me a message on LinkedIn, Bluesky, Threads, or Instagram.


