Was this helpful? Yes or No

Early in the new millennium, I started a career as a Microsoft Program Manager. I'd been managing a team of designers, developers, and writers working on Office docs but moved to PM because I wanted to make the software that I'd been documenting.

As is so often the case when changing careers, my previous experience carried into my new responsibilities. My first job was to make Office docs available via the internet. Up to this point, documentation was installed on your computer. Before that, it was shipped on paper.

There's Always Room for Improvement

There are plenty of reasons why content on the internet is better than content on a user’s hard drive. But they all boil down to this: you can keep improving the content after you ship the product. Today this same principle applies to the entire product, but in 2002 we still shipped on DVDs every 3 years. So, this was pushing the envelope.

Being able to improve content after shipping allowed writers to focus on the product in market rather than the product in development. But to make this meaningful, writers needed to know where to focus. The signals at our disposal were limited. We could see what customers searched for and what they clicked. We could see what got the most views. But we couldn’t see whether content was effective. To address this, we needed direct customer feedback.

Setting a New Standard for Listening

The industry standard for gathering feedback at the time was to ask customers to rate content. Typically, it was a star rating from 1 to 5 or, in the case of MSDN (the Microsoft Developer docs site), 1 to 9. As I set about designing Office's version of this experience, I spoke to the folks at MSDN. I learned that only about 1% of page views resulted in any rating at all. Looking at sample data, I also saw that the distribution was U-shaped: responses clustered at the bottom and the top of the scale.

I believed we could do better than 1% and formed a hypothesis. First, engage users by asking them a direct question about the content rather than asking for a rating. Second, make sure that the question required very little thought. The choice of answer needed to be polarizing. What I came up with was this:

Was this helpful?

Yes | No

If the user selected No, we'd ask for more details. That was it.
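
In code, the interaction is about as simple as the design. Here's a minimal sketch in TypeScript for a plain web page; the markup and the /api/feedback endpoint are placeholders for illustration, not what we actually shipped.

```ts
// Minimal sketch of the binary feedback prompt (placeholder names throughout).
function renderFeedbackPrompt(container: HTMLElement): void {
  const question = document.createElement("p");
  question.textContent = "Was this helpful?";

  const yes = document.createElement("button");
  yes.textContent = "Yes";
  yes.onclick = () => submitFeedback({ helpful: true });

  const no = document.createElement("button");
  no.textContent = "No";
  no.onclick = () => {
    // Only the "No" path asks for more detail, so the first decision stays cheap.
    const details = window.prompt("Tell us more. What went wrong?") ?? "";
    submitFeedback({ helpful: false, details });
  };

  container.append(question, yes, no);
}

function submitFeedback(payload: { helpful: boolean; details?: string }): void {
  // Hypothetical endpoint; in practice this would go to a telemetry service.
  void fetch("/api/feedback", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
}
```

The point of the design shows up in the code: answering costs one click, and only the unhappy path asks for anything more.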

You might think this is obvious. This sort of question is ubiquitous now. But at the time, I hadn't seen anything like it.

Success?!?

When I was working on this, we didn’t have a framework for testing the hypothesis. The measure of a good idea at the time was whether it could survive in a room of engineers and other PMs. But this idea survived, and we built the design.

When we shipped, literally years after I first formed my hypothesis, I got to see if it was right. Sort of. We saw a 10% response rate. But we couldn’t say for sure whether that was because the design was better than a rating scale or whether Office users were simply different from MSDN users.

Regardless, I moved on to the next release of Office.

A Well-Intentioned Failure

However, we now had data, and with data come data scientists. The data scientists complained that the binary choice made analysis difficult. So, a different PM took that feedback and determined that a star rating from 1 to 5 would solve the problem. I suggested this would significantly reduce the volume of feedback, but I had no proof, and the star rating went into development.

But when it shipped, something dramatic happened: feedback dropped to virtually zero. The drop was so stark that the engineers believed the feedback wasn't being recorded. But once that was ruled out, it became clear that the star rating didn't work.

Finally, a Proper Experiment

This led us to develop our first experiment. We figured out a way to ship 3 alternate designs: my original design, the star rating, and a compromise that used a simple question but changed the choices to Yes, No, and Somewhat. In that experiment, the original design got the highest percentage of responses, the star rating continued to flop, and the compromise got fewer responses but still a meaningful number. To address the concerns of the data scientists, the compromise design shipped and remained the standard for many years.
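
Mechanically, an experiment like this comes down to putting each user into one of the designs and comparing response rates. Here's a rough sketch in TypeScript; the variant names, the hash, and the metric helper are illustrative placeholders, not the pipeline we actually used.

```ts
// Rough sketch of the experiment mechanics (placeholder names and hashing).
type Variant = "yes-no" | "stars" | "yes-no-somewhat";

const variants: Variant[] = ["yes-no", "stars", "yes-no-somewhat"];

// Deterministic bucketing: the same user always sees the same design.
function assignVariant(userId: string): Variant {
  let hash = 0;
  for (const ch of userId) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // simple string hash
  }
  return variants[hash % variants.length];
}

// The metric that settled the argument: responses per prompt shown.
function responseRate(responses: number, promptViews: number): number {
  return promptViews === 0 ? 0 : responses / promptViews;
}
```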

Epilogue

Years later, the pattern of asking a simple yes-or-no question is everywhere, which led me to wonder whether our design had changed at all. So, I checked and found this:

Is this page helpful?

Yes | No

© Nick Simons