18 — Surveys: When They Help and When They Mislead

The Most Overused Tool in Product Research

Surveys are easy to run, cheap, and produce numbers that look like data. That combination has made them the default research tool at most companies. They are also one of the easiest ways to fool yourself. A bad survey will give you confident-looking answers to questions you didn't mean to ask, and you will make real decisions based on them.

We have seen teams kill good ideas because of survey results that were misleading. We have seen teams green-light bad ideas because the survey said users would buy them. We have seen NPS scores rise while customer churn rose at the same time, which should be impossible if the score meant what teams thought it meant.

This article is about how to use surveys honestly: when they are the right tool, how to write questions that produce real answers, what statistical traps to avoid, and how to read results without overclaiming. Surveys are useful, but only if you understand what they can and cannot do.

When to Use a Survey

Surveys do some things well and other things badly. The difference is whether you already know the right question to ask. If you do, surveys can scale that question to thousands of users and give you statistical confidence. If you don't, surveys will produce numbers that don't mean what you think they mean.

Surveys Are Good For:

Measuring how widespread a known issue is. You've heard from interviews that some users struggle with X. The survey tells you whether twenty percent or seventy percent of users have the same struggle.
Tracking change over time. Once you have a baseline measurement, you can re-run the survey periodically and watch whether the metric is moving.
Comparing segments. Do enterprise users feel differently from small-business users? Does behaviour vary by region? Surveys can answer these comparative questions when sample sizes are large enough.
Validating findings from qualitative work. You interviewed ten users and found a pattern. The survey checks whether the pattern holds across a thousand.
Capturing simple, factual information. Job titles, company sizes, tools used. Things users can answer without guessing or predicting.

Surveys Are Bad For:

Discovering what users want. Users don't know what they want; they only know what they've seen. Surveys force them to choose from options you wrote, and the options reveal your assumptions, not their reality.
Predicting behaviour. "Would you use this?" produces wildly inflated yes-answers. Predictions about future use are almost worthless.
Predicting willingness to pay. "How much would you pay?" produces numbers that bear no relation to what users actually pay. People give the answer that costs them least socially, which is usually a low number.
Understanding why something happens. Surveys can tell you that thirty percent of users do X. They cannot reliably tell you why. The why requires interviews.
Detecting subtle issues. Anything that requires context to explain is hard to capture in a survey. The user will pick the closest option from a multiple choice and you will miss the actual answer.

Designing Questions That Don't Mislead

The single biggest mistake in surveys is poorly worded questions. A small change in wording can flip the result. The user is forced to interpret what you meant, and they often interpret it differently from each other. Here are the most common wording problems.

Leading Questions

"How frustrated are you with the slow load times?" assumes the load times are slow and that the user is frustrated. The user is now choosing how frustrated to admit being. They will often pick a middle option to be polite, which inflates the appearance of frustration.

Better: "How would you rate the load times of the product?" with options ranging from "too slow" to "about right" to "faster than I need." This lets the answer go in any direction.

Double-Barrelled Questions

"How easy and intuitive is the onboarding?" Two questions in one. A user might find it easy but not intuitive, or vice versa, and have no good way to answer. Split the question or pick one dimension.

Vague Words

"How often do you use this feature?" with options like "occasionally," "sometimes," "often." What does "often" mean? Different users will read it differently, making the results meaningless. Use specific frequencies: every day, a few times a week, once a month, less than once a month.

Hypothetical Questions

"Would you use a feature that did X?" Bad. The user imagines a perfect version of X and says yes. They have no idea what X would actually look like or whether they would actually use it. Stick to questions about what they actually do or have done.

Unbalanced Scales

"Excellent / Very Good / Good / Fair" has three positive options and one negative. The result is biased toward positive. Balanced scales have equal positive and negative options, usually with a neutral middle: "Very Good / Good / Neutral / Bad / Very Bad."

A Note on Net Promoter Score

Net Promoter Score (NPS) is the most popular metric in product and customer-success teams. It asks one question: "How likely are you to recommend this product to a friend or colleague?" Users answer on a zero-to-ten scale. The score is calculated by subtracting the percentage of detractors (zero to six) from the percentage of promoters (nine and ten). Passives (seven and eight) count toward the total number of respondents but are otherwise ignored.

NPS has problems. The biggest is the calculation method. By throwing away passives, the score is unstable: small changes in answer distribution produce large changes in score. Two products with very similar customer feeling can have very different NPS numbers, and tracking NPS over time can show big movements that don't reflect real changes in customer feeling.
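To make the instability concrete, here is a minimal Python sketch of the calculation described above. The function name net_promoter_score and both response sets are hypothetical, invented for illustration; they are not real survey data.

```python
def net_promoter_score(ratings):
    """Return NPS (from -100 to 100) for a list of 0-10 ratings."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Passives (7 and 8) sit in the denominator but in neither count.
    return 100 * (promoters - detractors) / len(ratings)


# Two hypothetical products, 100 respondents each. The only difference:
# thirty respondents answer 6 for product A and 7 for product B.
product_a = [6] * 30 + [8] * 40 + [9] * 30
product_b = [7] * 30 + [8] * 40 + [9] * 30

nps_a = net_promoter_score(product_a)
nps_b = net_promoter_score(product_b)
mean_a = sum(product_a) / len(product_a)
mean_b = sum(product_b) / len(product_b)

print(f"Product A: mean rating {mean_a:.1f}, NPS {nps_a:.0f}")
print(f"Product B: mean rating {mean_b:.1f}, NPS {nps_b:.0f}")
```

In this made-up case the average rating differs by 0.3 points while the NPS differs by 30, because a one-point shift across the six/seven boundary reclassifies every affected respondent from detractor to passive.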

The second problem is cultural. NPS varies by country: users in some cultures rarely give nines or tens regardless of how they feel, while users in other cultures give them generously. Comparing NPS scores across regions can mislead.

The third problem is that "likely to recommend" is not the same as "actually recommends." Many users say they are likely to recommend and never do. Many users who are unlikely to recommend are still happy customers; they simply don't recommend products generally.

When NPS Is Useful

Despite all this, NPS can still be useful as a rough tracker, particularly when paired with the open-ended follow-up question: "What is the main reason for your score?" The free-text answers are often more useful than the score itself, because they tell you what users actually feel and why. The score is the headline; the comments are the substance.

When to Be Skeptical

Be skeptical of NPS as a primary metric. Be skeptical of comparing NPS across companies. Be skeptical of small movements in NPS over time. Use it as one input among many, not as the scoreboard. Several alternatives (CSAT, CES, simple satisfaction scales) have similar weaknesses, so the broader lesson is to treat any single survey metric with appropriate humility.

Sample Size and Statistical Significance

Most PMs are not statisticians and don't need to be. But a few rules of thumb help avoid the worst mistakes.

Bigger Is Usually Better, Up to a Point

Below thirty respondents, results are noisy and shouldn't be treated as reliable. Between one hundred and four hundred is the sweet spot for most product surveys, big enough to draw conclusions and small enough to remain practical. Beyond a thousand, you hit diminishing returns: each extra respondent costs the same, but the precision improves slowly.
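One way to see the diminishing returns is to compute the approximate margin of error for a yes/no question at different sample sizes. The sketch below uses the standard normal approximation at 95 percent confidence and assumes a simple random sample, which real surveys rarely are, so treat the numbers as a best case. The function name margin_of_error is ours.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p with n respondents.

    Assumes a simple random sample; p=0.5 is the worst case (widest margin).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1 - p) / n)


for n in (30, 100, 400, 1000, 4000):
    print(f"n={n:>5}: +/- {100 * margin_of_error(n):.1f} percentage points")
```

Under these assumptions the margin is roughly 18 points at thirty respondents, about 10 at one hundred, about 5 at four hundred, and about 3 at one thousand. Because the margin shrinks with the square root of the sample size, halving it requires four times as many respondents.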

Beware of Sampling Bias

If your survey is sent to all users but only certain types answer, the results reflect those types, not your full user base. Power users answer surveys more than casual users. Happy users sometimes answer more, sometimes less, depending on the incentive. The respondents are usually different in some way from the population you wanted to learn about.

The fix is to look at who actually responded and ask whether they are representative. If not, the result needs to be read with that bias in mind. Sometimes you can correct for it (by weighting answers from underrepresented segments). Sometimes you have to accept that the survey only tells you about the respondents, not all users.
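As a rough illustration of weighting, the sketch below re-weights each segment's answers by that segment's share of the real user base. The segment names, counts, and population shares are all hypothetical, and the approach assumes you know the true shares and that respondents within a segment resemble non-respondents in that segment, which is itself an assumption worth questioning.

```python
def weighted_share(responses, population_share):
    """Estimate the population-wide 'yes' share from per-segment responses.

    responses:        {segment: (yes_count, respondent_count)}
    population_share: {segment: fraction of the real user base}
    """
    total = 0.0
    for segment, (yes, n) in responses.items():
        if n == 0:
            raise ValueError(f"no respondents in segment {segment!r}")
        total += population_share[segment] * (yes / n)
    return total


# Hypothetical survey: power users are 20% of the user base
# but two thirds of the respondents.
responses = {"power": (160, 200), "casual": (40, 100)}
population_share = {"power": 0.2, "casual": 0.8}

raw = sum(yes for yes, _ in responses.values()) / sum(n for _, n in responses.values())
weighted = weighted_share(responses, population_share)

print(f"Unweighted: {raw:.0%} say yes")
print(f"Weighted:   {weighted:.0%} say yes")
```

In this invented example the unweighted figure is about 67 percent and the weighted figure is 48 percent. The gap comes entirely from who answered, not from what users think.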

A Difference Has to Be Big to Be Real

If sixty-two percent of group A says yes and fifty-eight percent of group B says yes, with one hundred respondents per group, that difference is probably noise. Differences smaller than five or ten percentage points often disappear when the survey is re-run with a different sample. Don't make decisions on tiny differences. Look for big movements or persistent patterns across multiple surveys.
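The sixty-two versus fifty-eight example can be checked with a two-proportion z-test. This is one common way to test such a difference, sketched here with only the standard library; the function name is ours, and the test assumes independent random samples.

```python
import math


def two_proportion_z_test(yes_a, n_a, yes_b, n_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a, p_b = yes_a / n_a, yes_b / n_b
    pooled = (yes_a + yes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in responses; test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Group A: 62 of 100 say yes. Group B: 58 of 100 say yes.
z, p_value = two_proportion_z_test(62, 100, 58, 100)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

The p-value comes out above 0.5, meaning a gap this size would appear more often than not even if the two groups felt exactly the same. With one hundred respondents per group, the gap needs to be well into double digits before it is distinguishable from noise.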

Common Survey Mistakes

Mistake One: Asking Too Much

Long surveys produce poor data. Users get tired, start clicking randomly, or abandon the survey halfway through. Ten questions is usually plenty. Twenty is borderline. Thirty is too many. If you have a long list of questions you want to ask, run multiple shorter surveys instead of one long one.

Mistake Two: Open-Text Everything

Open-text questions feel rich but are exhausting to analyse and even more exhausting to fill in. One or two open-text questions per survey is right. Use multiple choice for the rest, with a free-text "other" option for users who don't fit. The mix gives you analysable data and rich detail where it matters.

Mistake Three: Forced Choice on Everything

Some users genuinely don't have an opinion on a question. Forcing them to pick a number anyway adds noise. Include a "not applicable" or "I don't know" option where appropriate. The percentage who pick it is itself useful information.

Mistake Four: Ignoring the No-Response

If you sent the survey to ten thousand users and three hundred responded, the three hundred are a self-selected group. They differ from the ninety-seven hundred who didn't respond. The non-respondents matter. Sometimes they matter more than the respondents, because they're the silent majority whose views you don't see.

Mistake Five: Reading the Results You Wanted

Confirmation bias is severe in survey analysis. People focus on the parts of the data that support their existing view and discount the parts that don't. The fix is to write down, before you see the results, what you predict the data will show and what would change your mind. Then read the data against your predictions, including the parts you didn't predict.

A Process for Good Surveys

Here is a sequence that has produced reliable results for us and is worth following until it becomes automatic.

1. Start with a clear question. What do you specifically want to learn? "We want to understand users better" is not specific. "We want to know what percentage of users have tried a competitor in the last six months" is.
2. Decide if a survey is the right tool. If the question requires open exploration or understanding why, run interviews instead. Surveys answer how-many and how-often questions, not why questions.
3. Draft the questions. Keep it under fifteen questions. Mix multiple choice with one or two open-text. Read each question aloud to check for ambiguity.
4. Test on five people first. Before sending widely, have five colleagues or friendly users complete it and tell you where they got confused or stuck. Almost every survey reveals problems at this stage.
5. Send to a representative sample. Think about who is getting the survey and whether they represent your user base. Adjust if needed.
6. Wait for enough responses. Aim for at least one hundred. Below that, the noise is too high.
7. Analyse with skepticism. Write down what you predicted before reading. Look at the data including the parts that contradict your predictions. Look for big patterns; ignore small differences.
8. Share the results, including the limitations. Communicate not just the numbers but the caveats: who responded, what biases might be present, what the survey cannot tell you. The honest framing is what makes the results useful for decisions.

A Final Word

Surveys are a tool, not an answer. Used well, alongside interviews and behavioural data, they help you understand your users at scale. Used poorly, they produce confident numbers that mean nothing and waste real decisions on bad data. The difference is in how you write the questions, how you read the results, and how honest you are about what the numbers can and cannot tell you.

If you remember nothing else: don't use surveys for discovery, don't ask what users would do in the future, don't mistake a single survey result for truth, and treat any score-based metric (NPS included) with humility. With those rules in place, surveys become useful. Without them, they become a way to fool yourself with confidence.

Key Takeaways

Surveys measure known things at scale. They don't discover unknown things, predict behaviour, or explain why.
Run interviews first to find what to ask. Use surveys to measure how widespread the patterns are.
Bad question wording (leading, double-barrelled, vague, hypothetical, unbalanced) produces unreliable answers. Test wording before sending.
NPS and similar single-score metrics have real weaknesses. Use them as rough trackers, paired with open-ended follow-ups, not as primary scoreboards.
Sample sizes below thirty are noise. One hundred to four hundred is the sweet spot. Watch for sampling bias and don't make decisions on small differences.