We analyzed 491 card sorts: what predicts success

We have something most UX research content doesn't: real data. Not a survey of practitioners recalling what they did last quarter. Not a literature review of sample-size recommendations from the 2000s. Actual platform data from 491 card sorting studies run on ValidateThat since launch.

We analyzed every study — how many cards, how many responses, what type (open vs closed vs hybrid), whether people recruited externally or used their own network, and most importantly: which studies produced enough data to draw conclusions and which ones didn't.

Here's what we found.

The headline numbers

Metric	Value
Total studies analyzed	491
Studies with zero responses	~61%
Studies with 10+ responses	~18%
Median cards per study	15
Average cards per study	19
Median responses (of studies with any)	8
Most common study type	Open (54%)

The zero-response rate is the number that jumps out. Roughly three in five card sorts created on the platform never collect a single participant response. That's not a platform problem; it's a distribution problem. People build the study, then never solve participant recruitment.

We'll dig into why below, and what the successful studies did differently.

Card count: the Goldilocks zone

Card count is the strongest predictor of study completion that we measured. Not because more cards = better research, but because card count signals how much thought went into study design.

Studies with 10–25 cards had the highest average response count and the lowest abandonment rate. This makes intuitive sense: a sort with fewer than 10 cards can make participants wonder whether the study is broken, while one with more than 30 invites them to give up halfway through.

Card count	Avg responses	Completion rate
1–9 cards	2.1	Low
10–15 cards	9.4	High
16–25 cards	11.2	High
26–35 cards	6.8	Medium
36+ cards	3.2	Low

Within that 10–25 range, the sweet spot is 12–20 cards: enough to surface real patterns without fatiguing participants. If your sort has 40+ cards, consider splitting it into two focused studies rather than running one marathon session.
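As a rough illustration, the table above can be turned into a small lookup that tells you which benchmark bucket a draft study falls into. This is only a sketch: the function name and the returned fields are our own assumptions, and the numbers are copied from the table rather than recomputed.

```python
# Buckets and averages copied from the card-count table above.
# The function name and return shape are illustrative assumptions.
CARD_COUNT_BUCKETS = [
    # (min cards, max cards or None for open-ended, avg responses, completion)
    (1, 9, 2.1, "Low"),
    (10, 15, 9.4, "High"),
    (16, 25, 11.2, "High"),
    (26, 35, 6.8, "Medium"),
    (36, None, 3.2, "Low"),
]


def card_count_benchmark(card_count: int) -> dict:
    """Return the benchmark bucket for a study plus two scoping hints."""
    if card_count < 1:
        raise ValueError("a card sort needs at least one card")
    for low, high, avg_responses, completion in CARD_COUNT_BUCKETS:
        if card_count >= low and (high is None or card_count <= high):
            return {
                "bucket": f"{low}+" if high is None else f"{low}-{high}",
                "avg_responses": avg_responses,
                "completion": completion,
                # 12-20 cards is the sweet spot named in the text.
                "in_sweet_spot": 12 <= card_count <= 20,
                # The text suggests splitting sorts with 40+ cards.
                "consider_splitting": card_count >= 40,
            }
    raise AssertionError("unreachable: buckets cover every positive count")
```

Calling `card_count_benchmark(42)` would flag the study for splitting, while `card_count_benchmark(15)` lands in the 10-15 bucket and inside the sweet spot.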

Open vs closed vs hybrid

Open card sorts (where participants create their own categories) are the most popular on the platform at 54% of all studies. Closed sorts (predefined categories) make up 31%. Hybrid sorts account for 15%.

But popularity says little about which type fits your question. Each comes with trade-offs:

Open sorts produce richer qualitative data (category naming reveals mental models) but require more responses to reach convergence — typically 15–20 participants before patterns stabilize.
Closed sorts converge faster. With 8–10 participants, you can usually identify where cards don't fit your proposed structure.
Hybrid sorts are the most analytically powerful but least used — likely because they require more upfront design work.

If you're validating an existing navigation structure, run a closed sort. If you're discovering how users think about your domain from scratch, run an open sort. If you have time for both, run an open sort first, then validate the emergent structure with a closed sort.

What separates studies with 10+ responses from studies with zero

We segmented studies by response count and looked for distinguishing factors. The differences are stark:

Studies with 10+ responses tend to:

Have 12–20 cards — well-scoped, clearly thought-through
Include a welcome message — personalized context for participants
Recruit actively, via links shared on Slack, Twitter, or email lists, or through a paid panel such as Prolific
Be created by users with multiple studies — experienced researchers who've solved distribution before

Studies with zero responses tend to:

Match a template exactly — created but never customized, suggesting the creator was exploring the tool rather than running real research
Have no welcome message — no participant context set up
Have no recruitment path configured — the creator never got to the "share" step
Be the user's first and only study — they created one study, didn't recruit, and didn't return

The pattern is clear: the bottleneck is not study design. It's participant recruitment. The studies that succeed are the ones where the creator has a plan for getting people to take the sort.

The recruitment gap

Roughly 39% of studies with any responses were self-recruited: the creator shared a link with their own audience. The remaining 61% either used Prolific through our integration or recruited via professional channels.

Studies using Prolific averaged 24 responses. Self-recruited studies averaged 7. The difference isn't surprising — Prolific delivers guaranteed participants — but it highlights that most founders underestimate how hard it is to get 15 strangers to spend 5 minutes on a card sort.

If you don't have an existing audience, budget for recruitment. At ~$3.50/response via Prolific, a 20-person study costs $70. That's less than a single customer interview on UserTesting and produces quantitative patterns you can't get from interviews.
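The budget arithmetic is simple enough to sketch in a few lines. The example below assumes the ~$3.50 per-response figure quoted above; real pricing varies, and the function names are hypothetical. Amounts are kept in integer cents to avoid floating-point rounding.

```python
# Per-response price taken from the text (~$3.50 via Prolific).
# Treat it as an adjustable assumption, not a quoted rate.
PRICE_PER_RESPONSE_CENTS = 350


def recruitment_cost_cents(participants: int,
                           price_cents: int = PRICE_PER_RESPONSE_CENTS) -> int:
    """Total recruitment cost, in cents, for a given number of participants."""
    if participants < 0:
        raise ValueError("participants must be non-negative")
    return participants * price_cents


def participants_for_budget(budget_cents: int,
                            price_cents: int = PRICE_PER_RESPONSE_CENTS) -> int:
    """How many whole responses a budget covers (rounded down)."""
    if budget_cents < 0 or price_cents <= 0:
        raise ValueError("budget must be >= 0 and price must be > 0")
    return budget_cents // price_cents


def format_dollars(cents: int) -> str:
    """Render an integer cent amount as a dollar string."""
    return f"${cents / 100:.2f}"


print(format_dollars(recruitment_cost_cents(20)))  # $70.00
print(participants_for_budget(100_00))             # 28
```

The first call reproduces the $70 figure for a 20-person study. The second shows how far a $100 budget stretches at the same rate.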

Time to first response

Of studies that eventually got responses, the median time to first response was 4.2 hours. But the distribution is bimodal:

Peak 1: Within 30 minutes — these are Prolific-recruited studies where participants are available immediately
Peak 2: 24–48 hours — these are self-recruited studies where the link was shared via email or social media and participants trickle in

If you haven't received a single response within 48 hours of sharing your link, something is wrong with your distribution — not your study. Re-share it, post it somewhere new, or switch to paid recruitment.
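The rule of thumb above can be written as a small status check. This is a sketch under stated assumptions: the channel names, status strings, and the "slower than typical" state are our own labels, and the only thresholds used are the 30-minute and 48-hour figures from the text.

```python
# Typical first-response windows taken from the two peaks described above.
# Channel names and status strings are hypothetical labels.
TYPICAL_WINDOW_HOURS = {
    "prolific": 0.5,  # peak 1: within 30 minutes
    "self": 48.0,     # peak 2: 24-48 hours
}
NO_RESPONSE_CUTOFF_HOURS = 48.0


def first_response_status(channel: str,
                          hours_since_share: float,
                          responses: int) -> str:
    """Classify a study by how long it has waited for its first response."""
    if channel not in TYPICAL_WINDOW_HOURS:
        raise ValueError(f"unknown channel: {channel!r}")
    if hours_since_share < 0 or responses < 0:
        raise ValueError("hours and responses must be non-negative")
    if responses > 0:
        return "collecting"
    if hours_since_share >= NO_RESPONSE_CUTOFF_HOURS:
        # Past 48 hours with nothing: re-share, post elsewhere, or pay.
        return "fix_distribution"
    if hours_since_share > TYPICAL_WINDOW_HOURS[channel]:
        return "slower_than_typical"
    return "waiting"
```

For example, a self-recruited study with no responses 30 hours after sharing is still "waiting", while the same study at 48 hours returns "fix_distribution".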

What "enough responses" actually looks like

The academic literature says 15–20 participants for an open card sort, 10–15 for a closed sort. Our data largely confirms this, but with a practical nuance:

You don't need statistical significance for most product decisions. If you're a founder deciding whether "Pricing" belongs under "Product" or "Company" in your nav, 8 people giving you a clear signal is enough. You're not publishing a paper — you're making a navigation decision.

That said, here's where we see diminishing returns in the data:

Responses	What you can conclude
1–4	Almost nothing — individual variation dominates
5–9	Directional patterns emerge, but low confidence
10–14	Clear majority groupings visible, good for binary decisions
15–24	Robust patterns, dendrograms become meaningful
25+	Marginal improvements, useful for complex multi-level IA

Our recommendation: aim for 15 responses. It's achievable with a small audience or modest recruitment budget, and it's enough to produce a defensible information architecture.
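If you want to apply these tiers programmatically, for instance in a script that checks on several running studies, a lookup like the following would do. It is a minimal sketch: the tier text is copied from the table above, the 15-response target comes from our recommendation, and the function names are illustrative.

```python
# Tiers copied from the "what you can conclude" table above.
# Each entry is (minimum responses, conclusion).
RESPONSE_TIERS = [
    (1, "Almost nothing: individual variation dominates"),
    (5, "Directional patterns emerge, but low confidence"),
    (10, "Clear majority groupings visible, good for binary decisions"),
    (15, "Robust patterns, dendrograms become meaningful"),
    (25, "Marginal improvements, useful for complex multi-level IA"),
]
TARGET_RESPONSES = 15  # the recommended target from the text


def interpret_response_count(responses: int) -> str:
    """Return what the benchmark table says a response count supports."""
    if responses < 0:
        raise ValueError("responses must be non-negative")
    conclusion = "No data yet"
    for minimum, text in RESPONSE_TIERS:
        # Tiers are sorted ascending, so the last match is the highest tier.
        if responses >= minimum:
            conclusion = text
    return conclusion


def responses_still_needed(responses: int,
                           target: int = TARGET_RESPONSES) -> int:
    """How many more responses are needed to reach the target."""
    return max(0, target - responses)
```

With 8 responses, `interpret_response_count(8)` returns the "directional patterns" tier and `responses_still_needed(8)` returns 7.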

Surprising findings

A few things we didn't expect:

1. More categories ≠ better sorting

Studies with 4–7 predefined categories (in closed sorts) produced the cleanest results. Studies with 10+ categories confused participants and produced near-random distributions. If you're running a closed sort, keep categories broad.

2. Card label length matters

Studies where card labels averaged 2–4 words had higher completion rates than those with 6+ word labels. Participants scan; they don't read. Keep card labels concise.

3. The "demo project" effect

About 25% of all studies on the platform are untouched templates — created when a user signed up and got a seeded demo project but never modified it. These inflate the "zero response" rate. Excluding demo/template studies, the real zero-response rate for intentional studies drops to approximately 42%. Still high, but less alarming.

What this means for your next card sort

If you're planning a card sort, here's what our data says you should do:

Use 12–20 cards. More isn't better. Scope ruthlessly.
Write a welcome message. It takes 30 seconds and signals to participants that a real human made this study.
Have a recruitment plan before you hit publish. The study design is the easy part.
Budget $50–100 for recruitment if you don't have an audience. At $3.50/response on Prolific, that buys roughly 14–28 participants.
Aim for 15 responses. Don't wait for 30 unless you're doing complex multi-level IA.
Choose your sort type based on what you're deciding, not what's popular. Closed sorts for validation, open sorts for discovery.

This analysis is based on 491 card sorting studies run on ValidateThat between January 2024 and April 2026. Data is anonymized and aggregated — no individual study or participant data is shared. We'll update these benchmarks as our dataset grows.

Want to run your own card sort with proper recruitment built in? Start free on ValidateThat — or if you'd rather someone handle the whole thing, get a done-for-you validation report for $99.