
How to Plan and Run a Tree Test Study

Tree testing reveals whether users can find content in your navigation. Here is how to plan tasks, recruit participants, and analyze your results.

CardSort Team

Tree testing strips away everything visual about your site and asks one simple question: can people find what they're looking for using only your navigation labels and hierarchy? You give participants a text-only version of your site structure, hand them a task, and watch where they go.

It's one of the best ways to pressure-test your information architecture before you invest time in visual design. No colors, no layout, no imagery to guide people — just your labels and your structure doing the work.

Difficulty: Intermediate · Time required: 2-3 hours for planning, 3-7 days for running

What You'll Need

Before you start, gather three things:

  • A defined site structure. Pull this from your wireframes, an existing site, or card sort results. Aim for 20-40 items across 2-4 levels of hierarchy — enough to feel real without overwhelming people.
  • Realistic task scenarios. You'll want 5-8 of these, based on actual user goals (more on writing good ones below).
  • The right participants. Plan for 15-30 people who match your target audience.

You'll also need a tree testing tool (Optimal Workshop's Treejack is a popular option), about 30-45 minutes of each participant's time, and a spreadsheet to track your results.

Step 1: Define Your Research Questions

Start with what you actually want to learn. Vague goals lead to vague results.

Write down 3-5 specific questions tied to real IA decisions you're making or problems you've already spotted in analytics or user feedback. These questions will shape every task you write, so spend time getting them right.

Good research questions look like this:

  • Where do users expect to find account settings in a SaaS dashboard?
  • Can users locate technical documentation without relying on search?
  • Do users understand the difference between "Solutions" and "Services"?

Notice that each one points to a specific, testable thing. You'll know whether the answer is yes or no. That's what you're after.

Step 2: Prepare Your Tree Structure

Your tree is a text-only version of your proposed navigation. No icons, no descriptions, no visual hints — just labels organized in a hierarchy.

Keep it two to four levels deep with 20-40 total items. That's enough complexity to reflect real navigation without making people scroll forever. Use clear, descriptive labels and stay consistent with your terminology across sections.

Here's a quick example:

Products
├── Software Solutions
│   ├── Project Management
│   └── Team Collaboration
├── Hardware
└── Pricing
Support
├── Help Documentation
├── Contact Support
└── Community Forums

Make sure your tree reflects the most current version of your proposed IA. If you've already run a card sort, incorporate those findings. Don't leave in placeholder labels or half-baked sections — participants will navigate them, and you'll get misleading data.
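Before you load the tree into a tool, it's worth a quick sanity check against the size guidelines above. Here's a minimal Python sketch, assuming your tree is written out as nested dicts (the labels below just mirror the example and are illustrative):

```python
# Minimal sketch: check a proposed tree against the 20-40 items,
# 2-4 levels guideline. The tree is a dict of {label: subtree};
# leaves are empty dicts. Labels here are illustrative.
TREE = {
    "Products": {
        "Software Solutions": {
            "Project Management": {},
            "Team Collaboration": {},
        },
        "Hardware": {},
        "Pricing": {},
    },
    "Support": {
        "Help Documentation": {},
        "Contact Support": {},
        "Community Forums": {},
    },
}

def count_items(tree):
    """Total number of labels in the tree."""
    return sum(1 + count_items(children) for children in tree.values())

def depth(tree):
    """Number of levels in the deepest branch."""
    if not tree:
        return 0
    return 1 + max(depth(children) for children in tree.values())

items, levels = count_items(TREE), depth(TREE)
print(f"{items} items across {levels} levels")  # → 10 items across 3 levels
```

This toy tree is smaller than the 20-40 items you'd test for real, but the same two numbers are the ones to watch as your draft grows.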

Step 3: Create Realistic Task Scenarios

This is where a lot of tree tests go sideways. If your tasks read like instructions ("Find the contact page"), you'll learn almost nothing useful. People will just scan for matching words.

Instead, write scenarios that put participants in a realistic situation. Give them a reason to look for something. Here are some examples:

  • "You're comparing software options and need to check pricing before a budget meeting tomorrow. Where would you look?"
  • "You've been using the product for about a month and want to connect with other users for troubleshooting tips. Where would you go?"
  • "A colleague mentioned a specific automation feature, and you want to read the technical details before you commit to using it. Where would you look?"

See the difference? These feel like real moments. They give people enough context to think the way they would on your actual site.

Write 5-8 of these, making sure each one connects back to one of your research questions. Mix in a few easy tasks alongside harder ones to keep participants engaged.

Step 4: Recruit the Right Participants

Getting the wrong people to test your tree is worse than not testing at all. You need participants who would actually use your site or product in real life.

For B2B products, that means recruiting professionals in relevant roles with the right industry background and comfort level with similar tools. For consumer products, match the demographics and behaviors of your actual audience. Either way, screen with 2-3 qualifying questions that check for fit without tipping people off to what you're testing.

Things to screen for:

  • Job role or industry (for B2B)
  • Experience with similar products
  • Geographic location (if it matters for your business)
  • Whether they primarily use mobile or desktop

You want 15-30 participants total. Tree testing doesn't need the large sample sizes that some other research methods require, because you're measuring clear success-or-failure outcomes rather than nuanced behavioral patterns.
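To see why 15-30 participants is enough for pass/fail data, you can put a rough margin of error on a task's success rate. A back-of-the-envelope sketch using the normal approximation (the participant counts are illustrative):

```python
import math

def margin_of_error(successes, n, z=1.96):
    """95% normal-approximation margin of error for a success proportion."""
    p = successes / n
    return z * math.sqrt(p * (1 - p) / n)

# Illustrative: 14 of 20 participants completed a task (70% success).
p_hat = 14 / 20
moe = margin_of_error(14, 20)
print(f"success rate {p_hat:.0%} ± {moe:.0%}")  # → success rate 70% ± 20%
```

Even at roughly ±20 points, a task where 5 of 20 people succeeded is clearly distinguishable from one where 18 of 20 did, and that's the kind of clear-cut signal tree testing produces.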

Step 5: Set Up and Launch Your Tree Test

Load your tree and tasks into your testing platform. Turn on task randomization — this prevents order effects from skewing your data. (If everyone gets the same task first, their performance on it won't be comparable to later tasks.)
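Most platforms handle randomization for you, but the idea is simple enough to sketch. A hypothetical Python version that gives each participant their own stable shuffled order (task names are made up):

```python
import random

# Hypothetical task IDs for illustration.
TASKS = ["find-pricing", "find-docs", "contact-support", "community-forums"]

def task_order(participant_id, tasks=TASKS):
    """Return a per-participant shuffled copy of the task list.

    Seeding the RNG with the participant ID keeps each person's
    order stable across page reloads while still varying between
    participants, which is what breaks up order effects.
    """
    rng = random.Random(participant_id)
    order = list(tasks)
    rng.shuffle(order)
    return order

print(task_order("p-001"))
```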

A few setup details that matter:

  • Set a reasonable time limit per task (2-3 minutes is usually plenty)
  • Add a confidence rating after each task so you can spot cases where people "succeeded" but felt unsure
  • Include an optional comment field — the qualitative insights can be surprisingly useful
  • Run through the whole thing yourself, then have a colleague do it too, before you send it to real participants

When you distribute the link, tell participants roughly how long it'll take (usually 15-20 minutes) and what device to use. Keep the instructions short. People don't need a paragraph explaining what tree testing is — just tell them to navigate the structure as if it were a real website.

Step 6: Monitor and Analyze Results

Once responses start coming in, keep an eye on completion rates but resist the urge to change anything mid-study. Even if you spot an obvious problem, modifying tasks or structure while data is being collected will make your results unreliable.

When the data is in, focus on three things:

  1. Task success rates. How often did people end up in the right place? Tasks where fewer than 70% of participants succeed usually point to a real structural problem.
  2. Directness. Did people go straight to the answer, or did they wander? A high success rate with lots of backtracking means people eventually figured it out, but your labels aren't doing their job.
  3. Time to complete. Tasks that take noticeably longer than others often signal confusing category names or unclear groupings.
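If your tool lets you export raw responses, all three metrics fall out of a per-response table. A minimal Python sketch, assuming one row per participant per task with hypothetical field names:

```python
from statistics import mean, median

# Hypothetical raw export: one row per participant per task.
# "direct" means the participant reached their answer without
# backtracking up the tree.
results = [
    {"task": "find-pricing", "success": True,  "direct": True,  "seconds": 18},
    {"task": "find-pricing", "success": True,  "direct": False, "seconds": 41},
    {"task": "find-pricing", "success": False, "direct": False, "seconds": 75},
    {"task": "find-docs",    "success": True,  "direct": True,  "seconds": 22},
    {"task": "find-docs",    "success": True,  "direct": True,  "seconds": 19},
    {"task": "find-docs",    "success": True,  "direct": False, "seconds": 33},
]

def task_metrics(rows):
    """Success rate, directness, and median time per task."""
    by_task = {}
    for r in rows:
        by_task.setdefault(r["task"], []).append(r)
    return {
        task: {
            "success_rate": mean(r["success"] for r in group),
            "directness": mean(r["direct"] for r in group),
            "median_seconds": median(r["seconds"] for r in group),
        }
        for task, group in by_task.items()
    }

for task, m in task_metrics(results).items():
    flag = "  <- below 70% benchmark" if m["success_rate"] < 0.7 else ""
    print(f"{task}: {m['success_rate']:.0%} success, "
          f"{m['directness']:.0%} direct, {m['median_seconds']}s median{flag}")
```

With real data you'd run this over 15-30 rows per task, but the shape is the same: flag anything under the 70% success benchmark, then look at directness and time to see whether the survivors struggled.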

Look especially hard at:

  • Tasks where many people failed
  • Tasks where successful participants took wildly different paths
  • Spots in your tree where people consistently got stuck or turned around
  • Qualitative comments that reveal how people were thinking

Pull all of this into a spreadsheet so you can prioritize what to fix.

Step 7: Implement Changes and Validate

Now put your findings to work. Start with the biggest problems — the tasks where people clearly couldn't find what they needed, especially if those tasks map to important user goals.

Sometimes the fix is obvious. If users keep looking for "Pricing" under "Products" but you've buried it under "Plans," move it. Go with what your participants showed you, not what makes sense internally.

For bigger structural changes — reworking entire sections, relabeling multiple categories — it's worth running a quick follow-up tree test with 8-10 participants to make sure you actually improved things. This doesn't need to be a full study; just test the tasks that failed the first time around.

Document what you changed and why. This paper trail will save you when a stakeholder asks "why did we reorganize the support section?" six months from now.

Tips and Best Practices

Ground every task in real user behavior. Pull from support tickets, sales calls, search logs — anything that tells you what people are actually trying to do on your site. Your assumptions about user goals are often wrong in small but important ways.

If a significant chunk of your traffic comes from mobile, include some mobile-specific scenarios. Mobile users often think about navigation differently, and a structure that works well on desktop can fall apart on a small screen.

Mix up your task difficulty. Include some broad, exploratory tasks ("find pricing information") alongside narrow, specific ones ("find the API rate limits documentation"). The broad tasks test your top-level categories; the specific ones stress-test your deeper structure.

Finally, don't skip the qualitative feedback. A comment like "I wasn't sure if this would be under Support or Resources" tells you more about the problem than a success rate ever could.

Common Mistakes to Avoid

Testing a rough draft. Your tree should use realistic labels and structure. If you test something half-baked, you'll get feedback on problems that don't exist in your real IA and miss the ones that do.

Ignoring mobile users. Tree testing works on mobile devices. If your audience skews mobile, test with mobile participants.

Overloading on edge cases. It's tempting to test obscure scenarios, but most of your tasks should cover the things most users do most often. Save the edge cases for a follow-up study.

Changing things mid-study. You'll see problems during data collection. Write them down and fix them later. Modifying the study while it's running ruins your data.

Only looking at failures. Successful navigation paths matter too. They show you what's working well in your IA — and those are patterns you'll want to preserve when you make changes elsewhere.

Frequently Asked Questions

How many participants do I need for reliable tree test results? Most teams get solid results with 15-30 people. Tree testing has clear pass/fail outcomes for each task, so you don't need the large sample sizes that more open-ended research methods call for. Even at the lower end of that range, patterns in your data tend to be pretty clear.

What's the difference between tree testing and card sorting? They're complementary. Card sorting helps you build your IA — you learn how users naturally group and label things. Tree testing checks whether the IA you built actually works. Most teams run a card sort first to inform the structure, then tree test to validate it.

When should I run a tree test instead of traditional usability testing? Tree testing is ideal early in the design process, before you've committed to any visual direction. It isolates navigation and findability from everything else — layout, visual hierarchy, imagery. Once you have a working prototype, switch to usability testing so you can evaluate the full experience.

How do I know if my tree test results are good enough to move forward? There's no magic number, but 70% success is a reasonable benchmark for most tasks. Below that, you probably have a structural problem worth fixing before you move into visual design. Also pay attention to how people got to the right answer. If everyone succeeded but took completely different paths, your labels might still need work.

Can tree testing work for mobile-first or app-based information architectures? Absolutely. Just make sure your tasks reflect how mobile users actually behave. Someone on their phone might have different priorities or expectations than a desktop user — shorter sessions, more task-focused, less likely to browse. You might find that what works for desktop navigation needs adjustments for mobile, which is exactly the kind of thing tree testing can surface.

Ready to Try It Yourself?

Start your tree test study for free. Follow this guide step-by-step.