
It's bad enough men never have to wait in line at the restrooms

Now comes a study of Boston-area coffee shops (via MetaBoston) that concludes:

... [F]emale customers wait an average of 20 seconds longer for their orders than do male customers even when controlling for gender differences in orders.

Author Caitlin Knowles Myers, an assistant professor of economics at Middlebury College, says the wait was even more pronounced in shops with male employees - women workers were less likely to try to hold women customers up. Not only that, but the baristas seem particularly disgusted by ugly customers - they had to wait longer for their orders than the beautiful people.

Myers acknowledged in the study that women customers seemed to be more likely to order "fancy" drinks that would take more time to prepare (75% of women vs. just 55% of men; no discussion of the complexity of orders by ugly people), but said that even when this factor is taken into account, women and ugly people still had to wait longer than pretty boys (although you'd think the reverse would hold: Workers would rush their orders to get them out of their face).

She says possible reasons for the discrimination include male workers trying to get more money out of women customers (the report does not say how); male workers hating women customers (in part because of a conception that they're lousy tippers); or male workers "garnering utility" from female customers (i.e., they want to get to know them better). However, in all cases, it's discrimination against women, she writes.

The report is based on 295 observed customer interactions in visits to eight unnamed coffee shops in "the central Boston area" by a professor and five students this past January; the shops were selected based on whether they had seating arrangements that let the "enumerators" spy on, er, observe workers without letting them know somebody had an eye on them. So I'm thinking Dunkin' Donuts was not included, because it'd be hard not to look out of place staring at counter workers at a Dunk's.

Should you wish to verify the results, Myers provides detailed instructions on how to properly conduct such a study:

In order to control for possible effects of appearance (and possible correlations between appearance and other demographic characteristics of interest), enumerators ranked each customer's appearance on a scale of 1 to 10. This ranking was based less on physical beauty in the sense of Hamermesh and Biddle (1994) and more on the quality and style of clothing and hair.

Also, you'll need to come up with a generic term for the "fancy orders," i.e., all the myriad ways in which you can get your coffee and to account for the fact that being discreet means you might not be able to hear every last bit of dialog between server and customer:

In the cases of orders that were recordable, there was variability in how many details of the order could be overheard. Some orders were recorded exactly as issued, but others could only be classified as, for instance, a cappuccino, and the enumerator could not otherwise discern the size of the cappuccino or other special instructions.

Also, if you don't have a laptop with a stopwatch program, a cell phone with a clock will do in a pinch.


Comments

This was written up on Slate.com. The study is a typical undergraduate-level work. Garbage In - Garbage Out. The statistical analysis actually says that there was NO significant difference between wait times. And the suggestion that college-age male servers are discriminating against college-age customers because of some "gender animus" is something that could only come out of the nutty land of the Gender Studies ghetto that is the American college campus.

Next time you come upon one of these studies in the media, ask yourself "Does this make any sense?" If it doesn't pass the sniff test, let it go.


This sounds like something that'd have been written up in the Annals of Improbable Research.


Thanks for linking to the actual paper. Unfortunately for the author, they basically found that they wasted their time...but wanted to justify that waste of time by trying to get their study published. I'm in research...I know how that goes.

One of the partially discussed caveats in this paper is the potential bias for drink order "fanciness". In fact, once included as a control in their model, they nearly lose their significance and when controlling for the line length, their significance is totally lost (models 5 and 6). A Pearson's Chi-square test on their data shows a significant difference in how fancy an order was in relation to the gender too (p < 0.001)...but they chose not to explore that, of course. I also didn't see any acknowledgment of whether a barista or a clerk was capable of filling the order (goes to fanciness of drink...a coffee can be poured by anyone...a latte/cappuccino requires a barista (usually only 1 per store)). This would significantly increase wait time on average...especially given the differences in gender-fanciness ordering patterns crudely recorded here.
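For anyone who wants to try the chi-square check described above, here is a minimal stdlib-only sketch in Python. The counts are hypothetical, reconstructed from the article's rough figures (about 295 customers, 75% of women vs. 55% of men ordering "fancy" drinks); the study's actual per-customer data would be needed for the real test.

```python
import math

# Hypothetical 2x2 counts (rows: gender, columns: fancy vs. plain),
# scaled to roughly match the article's reported percentages.
#             fancy  plain
women = [109, 36]   # ~75% fancy
men   = [82, 68]    # ~55% fancy

def chi2_2x2(a, b, c, d, yates=True):
    """Pearson chi-square for a 2x2 table, with optional Yates correction."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = 0.0
    for obs, r, col in ((a, row1, col1), (b, row1, col2),
                        (c, row2, col1), (d, row2, col2)):
        exp = r * col / n
        diff = abs(obs - exp)
        if yates:
            diff = max(diff - 0.5, 0.0)
        stat += diff * diff / exp
    # For 1 degree of freedom, the chi-square survival function is
    # P(X > stat) = erfc(sqrt(stat / 2)).
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

stat, p = chi2_2x2(women[0], women[1], men[0], men[1])
print(f"chi2 = {stat:.2f}, p = {p:.5f}")
```

With these made-up counts the gender/fanciness association does come out well below p = 0.001, consistent with the commenter's claim, but the exercise is only as good as the reconstructed table.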

Without having the actual data that generated their statistics, I can't be certain, but I'd bet a good regression analysis would show little to no difference in their wait curves (Fig 1).
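The confounding argument can be illustrated with simulated data: if drink fanciness (not gender) drives the wait, and women order fancy drinks more often, a raw gender gap appears that largely vanishes once you compare like with like. Every number below is invented for the sketch; only the 75%/55% ordering split echoes the article.

```python
import random
import statistics

random.seed(1)

# Assumed model: fancy drinks take ~140s, plain ~70s, and gender affects
# only the probability of ordering fancy (75% women vs. 55% men).
def customer(is_woman):
    fancy = random.random() < (0.75 if is_woman else 0.55)
    wait = random.gauss(140 if fancy else 70, 25)
    return fancy, wait

women = [customer(True) for _ in range(1450)]
men = [customer(False) for _ in range(1500)]

def mean_wait(rows, fancy=None):
    return statistics.fmean(w for f, w in rows if fancy is None or f == fancy)

# Raw gap: ignores what was ordered.
raw_gap = mean_wait(women) - mean_wait(men)
# Adjusted gap: stratify by fanciness, then average the within-stratum gaps.
adj_gap = statistics.fmean(
    mean_wait(women, f) - mean_wait(men, f) for f in (True, False)
)
print(f"raw gap: {raw_gap:.1f}s, fanciness-adjusted gap: {adj_gap:.1f}s")
```

Under these assumptions the raw gap is substantial (roughly 0.20 × 70s ≈ 14s) while the adjusted gap hovers near zero, which is the same direction of movement the commenter describes for models 5 and 6.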

Finally, their mean/SD for wait times overall was about 99s per order, give or take 59s. That puts 68% of the orders between 40 and 158 seconds...and then you're going to quibble over 20 seconds when the average order for anyone could vary by nearly 2 minutes? It's not like their data was fully bimodal (with all male orders accounting for the 40s data and all female orders accounting for the 160s data to give the mean/SD of 100 +/- 60).
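The back-of-envelope arithmetic here is easy to check: the 1-SD band follows directly from the reported mean and SD, and the disputed 20-second gap can be expressed in SD units.

```python
# Reported summary stats from the comment: mean ~99s, SD ~59s per order,
# and a disputed 20-second gender gap. Assumes approximate normality for
# the 68%-within-one-SD rule of thumb.
mean, sd, gap = 99.0, 59.0, 20.0

one_sd_band = (mean - sd, mean + sd)  # ~68% of orders land in this band
effect_size = gap / sd                # gap measured in SD units

print(one_sd_band)             # (40.0, 158.0)
print(round(effect_size, 2))   # 0.34 -- modest relative to the noise
```

A 0.34-SD difference is detectable with enough data, but with 295 observations spread over eight shops it is easy to see why the commenter is skeptical.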


wow,

So you're saying this was a test tailored to fit a predetermined outcome?

Who woulda thunk!

Most political polling does the same thing, which is why they're pretty much irrelevant unless you look at the methodology. Unfortunately, their ability to influence people after being released is the real reason they're done.


No, what I said was that they did not take into account things that could easily influence such a highly variable activity. Their data is not coerced, it's just attempting to answer a question that it's not reasonable to answer based on the method. They used valid tests, but they did not treat the data properly to meet the assumptions of their tests nor did they attempt to use more stringent tests within the publication.

The bigger problem is a media willing to jump on a study of this quality as if it were scientific truth (thus diluting what *is* scientific truth to most laypeople AND giving others false justification in extolling this result as reason for change/complaint/etc). Looking at the Slate article as well, it appears that even an "interesting economist" would give it far more credence than it's worth. Sadly, once our culture seems to jump on these types of banner-waving studies, it's difficult to get them to let go even in the face of tremendous evidence to the contrary.

It can be even uglier when the data is just flat-out faked in order to expose the flaws of a media that would use a single unvalidated study as hard evidence to run up the flagpole.


Tests of statistical significance don't account for bias, confounding, or misclassification issues. They merely reflect the probability that the findings are due to pure statistical chance. In other words, if you did 1,000 studies, what is the chance you would get the given finding? That has nothing to do with the study design issues you mention here (in my dissertation, I determined confidence intervals empirically, using repeated resampling to define the distribution of possible values).
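The resampling approach mentioned here is essentially a percentile bootstrap, which can be sketched on synthetic data. The wait times below are stand-ins generated for illustration, not the study's numbers.

```python
import random
import statistics

random.seed(0)

# Synthetic wait times (seconds) for two groups -- invented values with
# a built-in ~20s gap and the study's rough spread (SD ~59s).
men = [random.gauss(95, 59) for _ in range(150)]
women = [random.gauss(115, 59) for _ in range(145)]

def bootstrap_ci(a, b, reps=5000, alpha=0.05):
    """Percentile bootstrap CI for mean(b) - mean(a): resample each group
    with replacement, recompute the difference, and read off quantiles."""
    diffs = []
    for _ in range(reps):
        ra = random.choices(a, k=len(a))
        rb = random.choices(b, k=len(b))
        diffs.append(statistics.fmean(rb) - statistics.fmean(ra))
    diffs.sort()
    return diffs[int(reps * alpha / 2)], diffs[int(reps * (1 - alpha / 2))]

lo, hi = bootstrap_ci(men, women)
print(f"95% CI for the gap in mean wait: [{lo:.1f}, {hi:.1f}] s")
```

The empirical distribution of resampled differences stands in for a parametric sampling distribution, which is what makes the method attractive when the data's shape is dubious.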

Any monkey can type commands into Stata and get out numbers and call them macaroni. Crunching numbers isn't science. True inference requires that one first understand the data, use appropriate statistical procedures given the form of the data and assumptions of the analyses (do you even know what those are for a Pearson Chi Square????) and understand the meaning of what the analyses do to the data at each step in the process.


Why are you replying to that comment? The parent poster is not the one who ran the study. The data is not available.

"True inference requires that one first understand the data, use appropriate statistical procedures given the form of the data and assumptions of the analyses (do you even know what those are for a Pearson Chi Square????)"

Actually, when it boils down to it, this isn't even a relevant question, given that the data is not available. It's simply an ad hominem attack on the original poster designed to discredit. You might as well have taken your post from a list of stock responses.


If this study is valid at all, I'm guessing that the difference is explained by "men like to look at women, and therefore serve them slightly more slowly."


Gay baristas serving well dressed men?


Gay baristas

That might be redundant.


Provided, natch, by Spatch, who imagines the discussion between one of the study enumerators and a barista:

... PROF: And this other couple: He got his in a minute and a half, she got hers in a minute and fifty-five seconds. What have you to say to that?

BORED BARISTA: The second cup spilled a little, so I went and topped the coffee back off. Gee. I sure hope she didn't, like, die of thirst waiting that long or something.

PROF: Was that sarcasm? It was. Your nineteenth exhibition today.

BORED BARISTA: You're counting my sarcastic remarks too? ...


The final retort
to this report

Provided, natch,
by Spatch,

who imagines the discussion
between one of the study enumerators and a barista

Damn, you were doing so well there.
I just don't get this post-modern poetry stuff.


First, anyone who would like the data is welcome to email me for it. (It's in Stata format.) Several of the statistical critiques don't make sense in this context. I've posted a detailed response in the Slate comments that you can check if you're interested.

Second, we don't say that this can only result from male workers hating female customers. We say that there are several explanations for the differential: unobserved drink complexity (which we address in several ways and conclude is unlikely to be driving everything), discrimination based on dislike of women, a desire to chat up women, expectations about women being less demanding customers, or differences in tips. We're upfront about the fact that this is limited data based on a small survey and we can't differentiate between several of the explanations.

Third, that result about "ugly" people is not significant, not discussed as such in the paper, and, anyway, not about being "ugly" at all. It was an extra control for how people were dressed and was standardized by enumerator to account for differences in their ratings. (We also include enumerator and shop fixed effects.)


As a PhD epidemiologist, I find these studies amusing. I also find amusing the general tendency of Stata-wielding sociologists to think they can enrich tepid data by attaching numbers to it and running multiple fancy statistical procedures. Lipstick on a pig!

This data is suggestive, but far from conclusive nonetheless. I would suggest any new study be interventional, rather than observational. In other words, send the same "pretty" male, female and "ugly" people into multiple establishments at a fixed time each day, enumerate staff characteristics, and observe waiting times and interactions. This would result in a gain in statistical power from a crossover design and control for a number of misclassification issues.


Yep, being a social scientist is hard. It's hard to measure "soft" preferences with data. That's why trying to get at revealed preferences by things like timing a wait is helpful. Other than the appearance ranking (which, again, is an additional control and not a result of interest) I don't see what we attach a number to that isn't measurable.

And, yes, this is suggestive, not conclusive. That's what we say repeatedly in the paper (but that's not nearly as sexy as trumpeting "proof" of discrimination in the press.) And what you describe is an "audit" study of discrimination and would certainly be better and would be a nice follow up. (We didn't have the resources for one in the initial study, which was just a class project.) In an ideal world, one would design the perfect experiment and have no need for regression at all. But the world isn't ideal, so we try to control for confounding factors in the regression.

PS-- I'm an economist, by the way.


Don't get me wrong - many economists and public health folk alike also tend to think numbers = science = meaning. There is a high correlation between this attitude and the fervent belief that statistical significance actually means something really important - as though it were a universal constant of truth like E=mc² (it is an arbitrary threshold of statistical noise - one that has nothing to do with biases, confounding, etc.). Maybe I've just had particularly interesting interactions with sociologists - perhaps it is their defensive reaction to being picked on by self-identified "hard scientists" and the like, I dunno.

It all gets down to data quality - you can use clever analysis to get around sparse data and small data sets, but data quality doesn't improve with transformation.

Get into policy work and message and meaning in research get really fun really quick! Best to remember the immortal and oft-warped/misattributed words of Andrew Lang:

"He uses statistics as a drunken man uses lamp-posts: for support rather than illumination."

Andrew Lang (1844-1912), Scottish author
