
The Fearless Critic

Until August of 2008, no one in the wine world had heard the name Robin Goldstein. Now, no one can forget him.

That summer, Goldstein announced that he'd created a fictitious restaurant in Italy whose reserve wine list was composed entirely of wines that had received terrible scores from Wine Spectator. And yet the magazine bestowed an Award of Excellence on the nonexistent restaurant for its outstanding wine list.

While Goldstein clearly had his fun at the expense of Wine Spectator (never mind loads of coverage, such as a recent Freakonomics podcast), he insists that his goal was to point out that such a program, and the field of wine criticism in general, would be more useful to consumers if it removed all potential for bias, and that the scientific method works better than a 100-point system.

Since 2008, Goldstein has helped recast the way that much of the wine industry and many consumers look at critics. He's done this mostly with his annually updated book The Wine Trials, which shows readers which wines performed best in a series of blind tastings among casual sippers and experts alike. Interestingly, the most expensive, high-profile wines don't always perform as well as one would expect.

Read on for our full interview with Goldstein, who insists he's not out to undermine every wine critic in the world, or to prove that all inexpensive wines taste better than pricey ones. Rather, he wants to reform the way critics evaluate wine and consumers perceive it. Only then will everyone truly be drinking what tastes best.

You're now on your third edition of The Wine Trials. What's been your goal with the project: Is it to devalue critics, to bring transparency to their processes, to bring fairness to the marketplace? All of the above?

Goldstein: I think there are two main goals. The first is to question the reliance that we place on wine critics' ratings — not all critics, but the critics whose ratings correlate strongly with prices, and who reserve their highest ratings for very expensive wines. The experiments that I've conducted for The Wine Trials, along with a lot of the other research that I cite in the book, point toward a very different conclusion: that for most people, more expensive wines don't taste better than cheaper wines. So if you can't trust most critics' ratings, then how should you choose wine? The Wine Trials argues that blind wine tasting can help you understand your palate, learn your true wine preferences, and become a more skeptical — and better — consumer.

It's important to note that I'm not encouraging you to taste blind over dinner, nor am I suggesting that there's no value in tasting non-blind when your only goal is to savor and enjoy a bottle of wine. On the contrary, research on the placebo effect shows that you'll often enjoy wine more — sometimes a lot more — when you know more about it. But blind tasting is important because it allows you to discover what you like and don't like — and to learn to identify what's what — in a rigorous way, and to get to the bottom of which wines are really worth the money to you, above and beyond your natural price bias, and which ones you're just enjoying because you know they're expensive.

The second goal, which corresponds to the second half of the book, is to let consumers know about the inexpensive wines that did best in our own blind tastings. The Wine Trials needs a new vintage every year because wine does, too — there are totally new wines on the shelf every year, and our blind-tasting panel starts fresh with every vintage. The 2011 edition covers the wines under $15 that you'll be able to find at stores and restaurants in the coming year.

Although we believe that our blind-tasting methodology is more scientifically grounded than others (especially those that rely on non-blind or semi-blind tasting settings), I want to reiterate that every wine drinker and every reader has a unique palate; there's no substitute for blind tasting yourself so you can benchmark your own preferences against our tasting panel's.

Have you ever tried to get a well-known wine critic to participate in your tastings?
No, I haven't reached out to any critic who's known for publishing ratings. Given that The Wine Trials is largely a response to that phenomenon, you can probably understand why. We do draw from many different parts of the wine world in assembling the panel for our New York tastings; we've had importers, distributors, winemakers, sommeliers, wine bloggers, people who study wine academically, and chefs.

Here's an interesting tidbit: In our 2008 blind-tasting experiment, which forms the basis for the first half of The Wine Trials, we conducted something that I call the "twin wine test" in the book. Basically, every taster got one flight of six wines that contained two identical wines. We then used the distance between the two identical wines in each taster's best-to-worst ranking as a measure of accuracy and consistency. The wine experts (about 70 of our 500 tasters) did slightly better on this test than the non-experts, but not a whole lot better. Generally speaking–and this is only a correlation, not an absolute rule–people who were involved in the production of wine did better than professional critics. I wish the world of critics included more people who have been involved in making wine–but on the other hand, winemakers often object to critics' work, so it's understandable why they might not want to join their ranks.
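For readers who want to see the mechanics, here's a minimal sketch in Python of how a twin-test error could be scored. It's our hypothetical illustration of the idea Goldstein describes, not his actual code; the function name and the adjacency convention are our assumptions.

```python
# Hypothetical sketch of scoring the "twin wine test" (not Goldstein's code).
def twin_test_error(ranking: list[str], twin_a: str, twin_b: str) -> int:
    """Distance, in rank positions, between two identical wines.

    ranking: wine IDs ordered from most to least preferred by one taster.
    A perfectly consistent taster would rank the twins back to back
    (distance 1); larger gaps suggest a noisier palate.
    """
    return abs(ranking.index(twin_a) - ranking.index(twin_b))

# Example: wines "C" and "E" are secretly the same wine, but this taster
# placed them three positions apart in a six-wine flight.
print(twin_test_error(["C", "A", "B", "E", "D", "F"], "C", "E"))  # prints 3
```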

Have you ever tasted an expensive wine that's truly blown you away with how good it tasted, or an inexpensive wine that really tasted gross, confirming what the critics would say? A situation where, honestly, price bias aside, the cheap one is truly terrible and the expensive one is truly good.
Of course. The polemic sections of The Wine Trials are often misinterpreted as implying that all expensive wines are bad, and all cheap wines are good. That's completely untrue. There are many expensive wines that I think are amazing, and totally worth the money. For me, it's often the older wine that's worth spending more for. I have a serious weakness for old Burgundy; it's hard to find at restaurants and everyday wine shops, and so when I want it, I'm prepared to pay more for it. It costs time and money to cellar wine for decades, and I think it's fair that you should pay extra for that service. I wish older vintages were more readily available, especially on big, long, expensive wine lists. If I'm going to pay $100 for a wine, it should be ready to drink, but so often, it's not–you're spending that much on a brand-new release of Barolo that's way too harsh and tannic in its first few years. There are also a few wines that cost so much to make that they justify the much higher prices from a cost-of-production standpoint. German TBA/icewine comes to mind–those grapes are so expensive to farm and harvest that it's simply impossible for those wines to be sold at entry-level prices–but still, from a consumer standpoint, you shouldn't spend a lot for it if you don't like it in a blind tasting.

There's also tons of shitty cheap wine. I'm not making the argument that cheap wine, generally speaking, is better than expensive wine. Even if consumers, on average, prefer the cheap wine, that doesn't mean it's better in some platonic sense. It just means that a lot of people are overpaying for expensive wine. Either way, by no means is picking a four- or five-dollar bottle at random a good idea. You need to understand what you're buying, and The Wine Trials aims to be a good starting point for that. What's really a big mistake, for anyone, is picking out a hundred-dollar bottle at random and assuming you're going to love the liquid inside more (beyond the placebo effect). That's even worse than picking a cheap bottle at random, because you're taking a much bigger financial risk.

But the ultimate goal of our experiments, and of The Wine Trials, isn't to find out which wines suck. The goal is to help you learn what you like through the process of blind tasting, and to point you toward the widely available wines under $15 that are great values across every major wine style. We exclude from the book wines that aren't good. We had more than 600 inexpensive wines nominated by wine industry professionals for the 2011 edition, and out of those, only 175 made it into the book. Then we narrowed the field further by picking the category winners–our top picks. Even for those, most reviews discuss both the pros and cons of the wines. It's important not to over-generalize that cheap wine is better than expensive wine.

What was your impression of Robert Parker's failure to identify any of the 2005 Bordeaux wines at that tasting last year? Proof that critics are very fallible, or hey, he could nail them all tomorrow?
It didn't surprise me. It's very difficult to taste a wine blind and identify who the producer is, or what exact subregion it's from. I guess it might have surprised some people who'd bought into the myth that Parker's palate was superhuman. It's not. In my travels, I often have people come up to me and say, "I have a friend who has an amazing palate–she can taste any wine and tell me its exact grape variety or varieties, region, appellation, producer and vintage." That's hogwash. There's nobody in the world who can do that every time, or even most of the time–not amateurs, not famous critics, not even producers. They can't do it because, first of all, those signals are merely correlative, not causal–there isn't a single distinct "taste" to the 2010 Napa vintage any more than there is a distinct "style" of all 2010 American cars produced in Detroit. Each is correlated with a norm (one climatic, the other cultural) but deviates significantly from that norm.

But just as importantly, your palate simply isn't designed to measure exact data like these. The idea that someone can precisely identify a wine every time is based on a misunderstanding of how the human brain works. Neuroscience teaches us that the brain is designed to interpret the world's sensory inputs through an individualized, constantly changing perceptual lens that helps you make conscious sense of the world around you and make quick judgments based on imperfect information. Everything we perceive involves a lot of leaps of faith, guesswork, and interpretation. You don't notice how many assumptions you're making about a staircase until you stumble on a stair that's an inch taller than the previous one. Every day, when you wake up, your body and your palate are working from a different starting point–a different set of assumptions, a different perceptual lens–than they were the day before. Your mood changes. What wine you've drunk in your life changes. What you've had for breakfast changes. All of these things affect your impressions of the wine. So it follows that Parker isn't going to identify a particular Bordeaux producer in a blind tasting, and it also follows that he isn't going to be all that consistent with his scores, either. To Parker's credit, he does agree to participate in this sort of blind tasting from time to time, and each time, it seems, these inconsistencies are revealed. So clearly he understands these constant fluctuations, and to some degree, he's comfortable with them.

What's problematic, though, is the degree of accuracy that's implied by his 100-point rating scale. It's baffling that a critic like Parker thinks it's appropriate to convey information about wine to consumers in such a specific way, with such misleadingly high resolution. He defends himself by saying that when he tastes the same wine twice, he's always within a few points of his original rating. Even if that were true, it would be troubling. If a score might vary by, say, three points in either direction, that's a six-point span–a considerable percentage of his overall scale. In fact, Parker is really using more like a 20-point scale than a 100-point scale, because he rarely scores wine below 80, which means that a six-point span covers almost a third of his effective range. If that's true, then Parker points aren't conveying a whole lot–especially when you can predict so much of the Parker score from the price of the wine.
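To make that "almost a third" concrete, here's the arithmetic written out as a short Python snippet (the plus-or-minus-three-point drift is Parker's claim as characterized above; the 80-to-100 effective range is the argument's premise):

```python
# The arithmetic behind "almost a third," as described above.
low, high = 80, 100        # scores below 80 are rarely published, so the
                           # usable scale is effectively 20 points wide
drift = 3                  # claimed repeat-tasting variation, in either direction
span = 2 * drift           # a 6-point uncertainty window
print(span / (high - low)) # 0.3 -> the window covers 30% of the usable scale
```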

Why do you think people are so protective of their reputations in the wine world? They make excuses: I had a cold when I tasted those. It was the shape of the glass. It was a different vintage. It was an off bottle. You never see this with other types of critics. A restaurant critic who gives a great review to a bad restaurant won't say, "Oh, I wolfed the food because I was high that night." Why is it that wine aficionados don't appreciate being stumped?
That's a great question, and it's a complex psychological phenomenon. My first take is that there's more self-confidence and less insecurity when we're talking about food than when we're talking about wine. Food is something we grow up eating and thinking about from before we're even able to talk. With wine, especially in the U.S., people don't start drinking it until they're nearly adults. Parker, for instance, according to his biographer, didn't taste his first dry wine until he was an adult.

In defense of wine critics, though, the tastes and textures of food vary much more than those of wine. The differences between Merlot and Cabernet are subtle compared to the differences between a banana and a Twix bar (and those are still two foods in the "sweet" category). A food critic would never mix up the latter two–not even a first-grader would. A professional wine critic, on the other hand, might sometimes mistake a Merlot for a Cabernet. But by correlating its ratings so strongly with price, and its adjectives so strongly with grape varieties, the mainstream wine media sends us the message that this mistake could never happen to a real wine critic–that it's only the lowly non-expert consumer who would ever be so ignorant, which is why he needs such guidance from ratings.

In other words, as the wine industry has developed over the past couple of decades, it has taken advantage of this simple issue of subtle differences by exploiting people's insecurities about their own inability to detect those differences without wine training. This makes a lot of consumers feel intimidated when they shop for wine or read a wine list, in a way they never would when shopping for groceries or reading a restaurant menu (unless it's the one at, say, WD-50). Unfortunately, many people respond to this kind of condescension by assuming that the more expensive, higher-rated wine is better, and that they're simply unqualified to judge. And yet our experiments show that they'd usually be better off buying cheaper, lower-rated wine.

After your famous stunt with the Wine Spectator Restaurant Awards, the magazine said that you "scammed" them. First, were you expecting a backlash or a plea for help in reforming the system? And have you had any interaction with the magazine since, or any indication that their evaluation processes have changed for the better?
I didn't expect the magazine to be happy about my exposé of their restaurant awards program, but I was surprised by their response. The problems with the Awards of Excellence were clear: not only was nobody visiting the restaurant or discussing the wine program with the staff, they didn't even seem to be comparing the reserve wine list to their own declared standards of wine quality, by which almost every wine on that list was in the bottom 1% of Italian wines rated by the magazine. Clearly, these restaurants were not being evaluated in any meaningful way, and I would have thought that what was called for was a revamping of an obviously flawed process.

I haven't heard from anyone at the magazine since the exposé, and more importantly, they never apologized to readers for the lack of standards or made their process more transparent. So I don't have any reason to believe that the program is now any different from how it was in 2008, when I ran the exposé. If I were a reader, the best way for Wine Spectator to win back my trust would be to explain to me how the process had been reformed since then.

I thought it was ironic that the folks at Wine Spectator spent so much time calling me a fraud for submitting an imaginary restaurant to their awards program in order to test their standards, when their program was perpetuating an actual fraud against readers: awarding restaurants a designation of "excellence," year after year, for what amounted to the purchase of a $250 advertisement in the magazine–which, multiplied across the roughly 4,000 restaurants that apply, nets Wine Spectator about $1 million annually.

Editor: I would argue that the program really grew beyond its means. I knew a lot of people there who worked very hard on the program, and I know for a fact that they (a) didn't rubber-stamp every wine list, and (b) never expected something like this to happen. As the program grew and became more profitable, it seemed to be doing fine as it was–not something that had the potential to blow up in their faces.

In this sense, some of the biggest victims of Wine Spectator's fraudulent program were the restaurants that worked hard to earn its highest level, the Grand Award–and more generally, the restaurants that really did have excellent wine lists. They should be the most incensed about Wine Spectator's sloppiness, because they were under the impression that their $250 was actually paying for the exercise of professional judgment and expertise; that the award meant something; that when they won it, they could display it proudly. Instead, it's now apparent that if Wine Spectator were honest about what the basic Award of Excellence meant, it would be something along the lines of: "We've received in the mail from this restaurant a copy of a wine list, organized in a coherent fashion, that contains at least 150 wines, but we are not certifying that the wines are good, or that they're actually in the restaurant's cellar, or that there is a cellar, or that there is even a restaurant." I don't imagine the Grand Award winners would be very happy with what this implies about Wine Spectator's definition of "excellence."

In your mind, what's the problem with typical blind-tasting procedures employed by critics? That they know too much about what they're tasting, even when the bags are on the bottles? That context, while important, rigs the game?
I think there are many problems. One is that not enough of them use the scientific method, including tests of statistical significance concerned with how likely a result would be to occur by chance–like the probability of a wine being rated high or low by mere statistical chance, given how inconsistent tasters can be. That's a problem that critics, wine fairs, and competitions deal with insufficiently, and it can be mitigated (partly, anyway) by having a larger panel of tasters. In The Wine Trials, we employ a larger panel than most wine fairs, magazines, and competitions use, in the interest of getting more statistical accuracy.

There are many exceptions to this rule–there are plenty of writers and critics out there whose work I really enjoy. But too many of them don't work with panels, don't taste blind, or, if they do taste blind, it's a sort of fake blind, in the following sense: You taste a flight of red wines; you know they're all 2005 Hermitage, but you don't know the producers or exact prices. You rate all the wines, and while the fluctuations between producers are unpredictable, they almost all get scores somewhere in the 90s. You can call that blind tasting, but any experienced wine writer or critic will know that all these wines are generally pretty expensive, so they'll have a strong positive ratings bias for the whole flight. The entire range is biased by their knowledge. While the ratings of individual producers may be unpredictable, in general, that's why you see such a high correlation between price and rating. For both Wine Spectator and Robert Parker's Wine Advocate, the correlation in a simple one-variable model predicting rating from price has been somewhere around 0.5. For something as difficult to judge consistently as wine, that's implausibly huge.
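For the statistically curious, here's a hypothetical sketch of what a one-variable price-to-rating correlation looks like. The data and parameters below are simulated assumptions of ours, not Goldstein's data or code; the point is that a Pearson r of about 0.5 would mean price alone accounts for roughly a quarter of the variance in scores.

```python
# Hypothetical illustration of the price-rating correlation Goldstein cites.
# All numbers here are simulated; the parameters are arbitrary assumptions.
import numpy as np

rng = np.random.default_rng(0)

prices = rng.lognormal(mean=3.0, sigma=0.8, size=200)      # fake bottle prices ($)
ratings = 88 + 3 * np.log(prices) + rng.normal(0, 3, 200)  # fake 100-point scores

# Pearson correlation in a simple one-variable setting: if r is about 0.5,
# then r**2 is about 0.25, i.e. price "explains" ~25% of rating variance.
r = np.corrcoef(prices, ratings)[0, 1]
print(round(r, 2))
```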

Who are some of those critics you enjoy?
I like Eric Asimov's wine writing in The New York Times. He and his colleagues do panel blind tastings, often bringing in other industry people from around the city, as we do. He stays away from the overambitious 100-point scale and instead uses a simpler star scale. He's modest and honest in his writing–nicely up front about the shortcomings of the process, and apologetic about the arbitrariness necessarily involved in selecting the range of wines to be tasted. That's true even though we have our differences on the purpose of blind tasting, which I talk about in The Wine Trials.

I think there are a lot of great blogs out there, too, from unbridled, infectious enthusiasts to much-needed watchdogs. The explosion of blogs has been a really positive phenomenon for the wine world, and a foil for the overreliance on established or famous critics and ratings. There's been a lot more talk about blind tasting because of wine bloggers. There's also been a lot more talk about how cheap wine can taste great, how price isn't an indicator of quality, and how "objective quality" is almost a misleading question–it may not correspond to how much pleasure you, as an individual, derive from the wine. I think the heterogeneity of wine bloggers' opinions is helping people intuit the heterogeneity of wine preferences, too.

OK, so, for argument's sake, what's the best way to taste and evaluate the wines of Bordeaux: All the Pauillac wines together, all the Margaux wines together, with no knowledge of classification? Or is it all Cabernet/Merlot wines from all over the world together?
Generally speaking, our approach is to mix together different regions and styles in a single flight. There's a limit to how much you can do that, though, because you have to keep in mind which wines will exhaust your palate and inhibit your ability to judge competently. For The Wine Trials, we classify wines as Old and New World, and heavy and light, and then we randomize the flights. That's my preferred approach, because I think even knowing that all the wines in a flight are Cabernet Sauvignon biases you.

There's such sloppiness in the fruit adjectives used to review wines in the mainstream wine media. If you know you're tasting five Cabernets, you'll be predisposed to those flavors. You'll be looking for cassis and green bell pepper, and–lo and behold!–there they are. And if anything, wine experts are even more susceptible to this effect than novices, because they're familiar with all of those expected adjectives. The more you know going in, the more what you're describing is about your own personal experience of drinking the wine, and the less it's about the wine itself. The point that so many wine writers miss is that, in spite of the purported usefulness of tasting competing brands in the same appellation against each other, when that's the only way you taste, the mere knowledge of the grape variety completely obscures your ability to describe the wines accurately and evaluate them fairly.
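As an aside, the classify-then-randomize approach Goldstein describes is easy to prototype. The sketch below is our hypothetical illustration: the Old/New World and heavy/light buckets come from the interview, while the function name, data shape, and flight size are our assumptions.

```python
# Hypothetical sketch of building randomized blind flights from
# (world, weight) buckets, as described in the interview.
import random

def build_flights(wines, flight_size=6, seed=None):
    """Group wines so one flight doesn't exhaust the palate, then shuffle
    within each bucket so tasters can't infer grape or region from order.

    wines: iterable of dicts like
        {"name": "...", "world": "Old" or "New", "weight": "heavy" or "light"}
    """
    rng = random.Random(seed)
    buckets = {}
    for wine in wines:
        buckets.setdefault((wine["world"], wine["weight"]), []).append(wine)
    flights = []
    for bucket in buckets.values():
        rng.shuffle(bucket)
        flights += [bucket[i:i + flight_size]
                    for i in range(0, len(bucket), flight_size)]
    return flights
```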

In your mind, what's the perfect scenario: An overhaul of the 100-point system, or no system at all? Or a different system that works better?
I'd like to see critics getting together in larger panels to get more scientifically valid results. If critics merely got together in bigger groups, you'd have massive improvement on that axis.

I'd also like to see critics acknowledge that people have different palate types, and to consider more what types of wines are better and worse for different kinds of consumers. As the technology of understanding the palate progresses, I think it's possible that we'll be able to think more about recommending different wines, or perhaps even rating or ranking wines differently, for different palate types.

In the meantime, in addition to more of a focus on blind tasting-based results, I'd like to see coarser rating scales–way less than 100 points–and a move toward more reliance on rankings than on ratings. Comparisons are a necessary aspect of evaluating value in the marketplace; point scores out of 100 are not.

Give me an example of a blind tasting people can do at home, one that's in the latest edition of The Wine Trials, that will really blow their minds.
I think a great one is Dom Pérignon against Domaine Ste. Michelle, or any match of two or three Champagnes in the $40-plus range against two or three Champagne-style sparkling wines, like a Spanish Cava and a California or Northwest sparkling wine made by the Champagne method. Put them into a blind flight of four or six, and have everyone rank them. That almost always yields surprising results.
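If you try this at home, tallying the results is simple. Here's a minimal sketch (our hypothetical helper, not something from the book) that averages each wine's rank across tasters; the wine with the lowest average rank wins.

```python
# Hypothetical helper for tallying a home blind tasting: average each
# wine's rank across tasters (1 = best); the lowest average wins.
from statistics import mean

def average_ranks(rankings):
    """rankings: one list per taster, wine names ordered best to worst."""
    positions = {}
    for ranking in rankings:
        for place, wine in enumerate(ranking, start=1):
            positions.setdefault(wine, []).append(place)
    scores = {wine: mean(places) for wine, places in positions.items()}
    return sorted(scores.items(), key=lambda kv: kv[1])

# Example with three tasters and four sparkling wines, labeled A-D while blind:
print(average_ranks([["B", "A", "D", "C"],
                     ["B", "D", "A", "C"],
                     ["A", "B", "D", "C"]]))
```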