We analyzed 491 card sorts: what predicts success
We analyzed 491 card sorts to see what actually predicts a study getting clean, useful data — and what dooms a sort to noise or zero responses.
We have something most UX research content doesn't: real data. Not a survey of practitioners recalling what they did last quarter. Not a literature review of sample-size recommendations from the 2000s. Actual platform data from 491 card sorting studies run on ValidateThat since launch.
We analyzed every study — how many cards, how many responses, what type (open vs closed vs hybrid), whether people recruited externally or used their own network, and most importantly: which studies produced enough data to draw conclusions and which ones didn't.
Here's what we found.
The headline numbers
| Metric | Value |
|---|---|
| Total studies analyzed | 491 |
| Studies with zero responses | ~61% |
| Studies with 10+ responses | ~18% |
| Median cards per study | 15 |
| Average cards per study | 19 |
| Median responses (of studies with any) | 8 |
| Most common study type | Open (54%) |
The zero-response rate is the number that jumps out. Roughly 61% of all card sorts created on the platform never collect a single participant response. That's not a platform problem; it's a distribution problem. People build the study and then never solve participant recruitment.
We'll dig into why below, and what the successful studies did differently.
Card count: the Goldilocks zone
Card count is the strongest predictor of study completion that we measured. That's not because more cards mean better research, but because card count signals how much thought went into study design.
Studies with 10–25 cards had the highest average response count and the lowest abandonment rate. This makes intuitive sense: fewer than 10 cards makes participants wonder if the study is broken, while response counts fall off beyond 25 cards, and sorts with more than 35 see participants give up halfway through.
| Card count | Avg responses | Completion rate |
|---|---|---|
| 1–9 cards | 2.1 | Low |
| 10–15 cards | 9.4 | High |
| 16–25 cards | 11.2 | High |
| 26–35 cards | 6.8 | Medium |
| 36+ cards | 3.2 | Low |
The sweet spot is 12–20 cards. That's enough to surface real patterns without fatiguing participants. If your sort has 40+ cards, consider splitting it into two focused studies rather than one marathon session.
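If you want to sanity-check a draft before publishing, the bands in the table can be written as a small lookup. This is only a sketch: `card_count_band` and `in_sweet_spot` are hypothetical helper names, not part of ValidateThat, and the thresholds are simply the table's bucket edges and the 12–20 range quoted above.

```python
def card_count_band(n_cards: int) -> str:
    """Map a card count to the bands used in the table above (hypothetical helper)."""
    if n_cards < 1:
        raise ValueError("a card sort needs at least one card")
    if n_cards <= 9:
        return "1-9"
    if n_cards <= 15:
        return "10-15"
    if n_cards <= 25:
        return "16-25"
    if n_cards <= 35:
        return "26-35"
    return "36+"


def in_sweet_spot(n_cards: int) -> bool:
    """True when the count falls inside the 12-20 range recommended above."""
    return 12 <= n_cards <= 20
```

A 40-card draft lands in the `"36+"` band and fails `in_sweet_spot`, which is the cue to consider splitting it into two studies.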
Open vs closed vs hybrid
Open card sorts (where participants create their own categories) are the most popular on the platform at 54% of all studies. Closed sorts (predefined categories) make up 31%. Hybrid sorts account for 15%.
But the most popular sort type isn't automatically the one that gives you the best data for your question:
- Open sorts produce richer qualitative data (category naming reveals mental models) but require more responses to reach convergence — typically 15–20 participants before patterns stabilize.
- Closed sorts converge faster. With 8–10 participants, you can usually identify where cards don't fit your proposed structure.
- Hybrid sorts are the most analytically powerful but least used — likely because they require more upfront design work.
If you're validating an existing navigation structure, run a closed sort. If you're discovering how users think about your domain from scratch, run an open sort. If you have time for both, run an open sort first, then validate the emergent structure with a closed sort.
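The rule of thumb above is simple enough to write down as a function. The names here (`Goal`, `recommend_sort_plan`) are made up for illustration, and the function does nothing more than restate the paragraph; treat it as a sketch rather than a feature of any tool.

```python
from enum import Enum


class Goal(Enum):
    VALIDATE_EXISTING = "validate an existing navigation structure"
    DISCOVER = "discover how users think about the domain"


def recommend_sort_plan(goal: Goal, time_for_both: bool = False) -> list[str]:
    """Return the sort types to run, in order (hypothetical helper)."""
    if time_for_both:
        # Discover with an open sort first, then validate with a closed sort.
        return ["open", "closed"]
    if goal is Goal.VALIDATE_EXISTING:
        return ["closed"]
    return ["open"]
```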
What separates studies with 10+ responses from studies with zero
We segmented studies by response count and looked for distinguishing factors. The differences are stark:
Studies with 10+ responses tend to:
- Have 12–20 cards — well-scoped, clearly thought-through
- Include a welcome message — personalized context for participants
- Use external recruitment — shared links on Slack, Twitter, email lists, or Prolific
- Be created by users with multiple studies — experienced researchers who've solved distribution before
Studies with zero responses tend to:
- Match a template exactly — created but never customized, suggesting the creator was exploring the tool rather than running real research
- Have no welcome message — no participant context set up
- Have no recruitment path configured — the creator never got to the "share" step
- Be the user's first and only study — they created one study, didn't recruit, and didn't return
The pattern is clear: the bottleneck is not study design. It's participant recruitment. The studies that succeed are the ones where the creator has a plan for getting people to take the sort.
The recruitment gap
Roughly 39% of studies with any responses used self-recruitment (sharing a link to their own audience). The remaining 61% either used Prolific through our integration or recruited via professional channels.
Studies using Prolific averaged 24 responses. Self-recruited studies averaged 7. The difference isn't surprising — Prolific delivers guaranteed participants — but it highlights that most founders underestimate how hard it is to get 15 strangers to spend 5 minutes on a card sort.
If you don't have an existing audience, budget for recruitment. At ~$3.50/response via Prolific, a 20-person study costs $70. That's less than a single customer interview on UserTesting and produces quantitative patterns you can't get from interviews.
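The budget arithmetic is worth making explicit. The sketch below assumes a flat per-response cost of $3.50 (the approximate figure quoted above; actual pricing varies with study length and fees), and the function names are hypothetical.

```python
import math

COST_PER_RESPONSE_USD = 3.50  # approximate figure quoted above; an assumption, not a price list


def recruitment_cost(n_participants: int,
                     cost_per_response: float = COST_PER_RESPONSE_USD) -> float:
    """Total cost of recruiting n participants at a flat per-response rate."""
    return round(n_participants * cost_per_response, 2)


def participants_for_budget(budget_usd: float,
                            cost_per_response: float = COST_PER_RESPONSE_USD) -> int:
    """How many whole responses a budget covers (rounded down)."""
    return math.floor(budget_usd / cost_per_response)


print(recruitment_cost(20))          # 70.0
print(participants_for_budget(100))  # 28
```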
Time to first response
Of studies that eventually got responses, the median time to first response was 4.2 hours. But the distribution is bimodal:
- Peak 1: Within 30 minutes — these are Prolific-recruited studies where participants are available immediately
- Peak 2: 24–48 hours — these are self-recruited studies where the link was shared via email or social media and participants trickle in
If you haven't received a single response within 48 hours of sharing your link, something is wrong with your distribution — not your study. Re-share it, post it somewhere new, or switch to paid recruitment.
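The 48-hour rule can be turned into a simple check. This is a sketch with hypothetical names (`distribution_stalled`, `STALL_THRESHOLD`); it assumes you track when the link was shared and when the first response arrived, if any.

```python
from datetime import datetime, timedelta
from typing import Optional

STALL_THRESHOLD = timedelta(hours=48)  # the cutoff suggested above


def distribution_stalled(shared_at: datetime,
                         first_response_at: Optional[datetime],
                         now: datetime) -> bool:
    """True if the link was shared 48+ hours ago and no response has come in."""
    if first_response_at is not None:
        return False  # at least one response arrived, so distribution is working
    return now - shared_at >= STALL_THRESHOLD
```

When this returns `True`, the fix is on the distribution side: re-share, try a new channel, or move to paid recruitment.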
What "enough responses" actually looks like
The academic literature says 15–20 participants for an open card sort, 10–15 for a closed sort. Our data largely confirms this, but with a practical nuance:
You don't need statistical significance for most product decisions. If you're a founder deciding whether "Pricing" belongs under "Product" or "Company" in your nav, 8 people giving you a clear signal is enough. You're not publishing a paper — you're making a navigation decision.
That said, here's where we see diminishing returns in the data:
| Responses | What you can conclude |
|---|---|
| 1–4 | Almost nothing — individual variation dominates |
| 5–9 | Directional patterns emerge, but low confidence |
| 10–14 | Clear majority groupings visible, good for binary decisions |
| 15–24 | Robust patterns, dendrograms become meaningful |
| 25+ | Marginal improvements, useful for complex multi-level IA |
Our recommendation: aim for 15 responses. It's achievable with a small audience or modest recruitment budget, and it's enough to produce a defensible information architecture.
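For a binary decision like the "Pricing" example earlier, the analysis can be as small as counting where participants put the card. The snippet below is a minimal sketch on made-up data; `placement_shares` is a hypothetical helper, and what counts as a "clear signal" is left to your judgment.

```python
from collections import Counter


def placement_shares(placements: list[str]) -> dict[str, float]:
    """Return the fraction of participants who chose each category for one card."""
    if not placements:
        raise ValueError("need at least one placement")
    counts = Counter(placements)
    total = len(placements)
    # most_common() orders categories from most to least chosen.
    return {category: n / total for category, n in counts.most_common()}


# Made-up data: where 8 participants put the "Pricing" card in a closed sort.
pricing_placements = ["Product"] * 6 + ["Company"] * 2
shares = placement_shares(pricing_placements)
print(shares)  # {'Product': 0.75, 'Company': 0.25}
```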
Surprising findings
A few things we didn't expect:
1. More categories ≠ better sorting
Studies with 4–7 predefined categories (in closed sorts) produced the cleanest results. Studies with 10+ categories confused participants and produced near-random distributions. If you're running a closed sort, keep categories broad.
2. Card label length matters
Studies where card labels averaged 2–4 words had higher completion rates than those with 6+ word labels. Participants scan, they don't read. Keep card labels concise.
3. The "demo project" effect
About 25% of all studies on the platform are untouched templates — created when a user signed up and got a seeded demo project but never modified it. These inflate the "zero response" rate. Excluding demo/template studies, the real zero-response rate for intentional studies drops to approximately 42%. Still high, but less alarming.
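The adjustment is just a change of denominator: drop the untouched templates, then recompute the share of studies with zero responses. Here is a sketch on made-up records; the `Study` fields and the toy numbers are assumptions for illustration and are not the platform's data or schema.

```python
from dataclasses import dataclass


@dataclass
class Study:
    responses: int
    is_untouched_template: bool  # hypothetical flag for seeded demo projects


def zero_response_rate(studies: list[Study], exclude_templates: bool = False) -> float:
    """Share of studies with zero responses, optionally ignoring untouched templates."""
    pool = [s for s in studies
            if not (exclude_templates and s.is_untouched_template)]
    if not pool:
        raise ValueError("no studies left to measure")
    return sum(1 for s in pool if s.responses == 0) / len(pool)


# Toy data: 2 untouched templates, 2 intentional studies with no responses,
# and 4 intentional studies that collected responses.
toy = ([Study(0, True)] * 2) + ([Study(0, False)] * 2) + [
    Study(5, False), Study(8, False), Study(12, False), Study(20, False)
]
print(zero_response_rate(toy))                          # 0.5
print(zero_response_rate(toy, exclude_templates=True))  # 2 of 6 remaining studies
```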
What this means for your next card sort
If you're planning a card sort, here's what our data says you should do:
- Use 12–20 cards. More isn't better. Scope ruthlessly.
- Write a welcome message. It takes 30 seconds and signals to participants that a real human made this study.
- Have a recruitment plan before you hit publish. The study design is the easy part.
- Budget $50–100 for recruitment if you don't have an audience. Prolific at $3.50/response gets you 14–28 participants.
- Aim for 15 responses. Don't wait for 30 unless you're doing complex multi-level IA.
- Choose your sort type based on what you're deciding, not what's popular. Closed sorts for validation, open sorts for discovery.
This analysis is based on 491 card sorting studies run on ValidateThat between January 2024 and April 2026. Data is anonymized and aggregated — no individual study or participant data is shared. We'll update these benchmarks as our dataset grows.
Want to run your own card sort with proper recruitment built in? Start free on ValidateThat — or if you'd rather someone handle the whole thing, get a done-for-you validation report for $99.