Chocolate linked to depression (or vice versa)


The Aztecs consumed chocolate as a beverage and called it bitter water. Linnaeus, who had more sense, named it Theobroma, the food of the gods. Evidently, the gods (if not the Aztecs) were a bit depressed – a recent study by Beatrice Golomb and her colleagues concluded that depressed people consume more chocolate. I am not surprised.

As reported in the LA Times, Science Daily, and elsewhere, Dr. Golomb and her colleagues examined the chocolate consumption of 931 participants who were involved in an epidemiological study. None of the participants had diabetes or cardiovascular disease, and none was on antidepressants.

The participants self-reported their chocolate consumption using what I take to be a well-known and apparently well-documented metric, the Fred Hutchinson Food Frequency Questionnaire. The study measured depression using another metric, the Center for Epidemiological Studies–Depression Scale. People who scored higher on the depression scale also consumed more chocolate – the equivalent of about 8 one-ounce servings per week (that is, approximately 8 28-gram servings, in God’s units).

The study postulates no causative effect, either way, between chocolate consumption and depression. The authors note that there is comparatively little caffeine and theobromine in chocolate, and they did not find a correlation between caffeine and depression. Indeed, the LA Times quotes a food psychologist, Marcia Levin Pelchat, who argues that the effect is learned, not biochemical. That is, according to Dr. Pelchat, we train ourselves to eat chocolate when we are depressed. If that is so, then the correlation is effectively cultural.

There are a number of problems with this study, many of which are mentioned forthrightly in the article. The authors were not, however, data dredging: They stated a hypothesis, that consumption of chocolate would correlate with depression, and examined the data to confirm or disconfirm their hypothesis. They did not search high and low for apparent correlations with high statistical significance (low p-value), and they demonstrated a dose-response relationship (I thought, though, that a graph of chocolate consumption versus depression score would have been more useful than merely dividing the depression scores into 2 categories). The most troublesome aspect of this study to me (as a sometime metrologist), however, is that they report the standard deviation of their data sets but make no effort to include the standard uncertainty of either metric. My guess is that if they had calculated the combined standard uncertainty, their 95-% confidence intervals would have overlapped considerably.
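To make the point concrete, here is a toy calculation – every number below is invented, nothing comes from the paper – showing how folding an estimated questionnaire (instrument) uncertainty into the statistical uncertainty widens the 95-% confidence intervals and can make two group means overlap:

```python
import math

# Toy numbers only -- nothing here comes from the paper.
m_low, m_high = 5.4, 8.4   # hypothetical mean servings/week in two groups
u_stat = 0.7               # hypothetical statistical standard uncertainty
u_instr = 1.2              # hypothetical standard uncertainty of the questionnaire

# Combined standard uncertainty: add the components in quadrature.
u_comb = math.hypot(u_stat, u_instr)

def ci95(mean, u):
    """95-% confidence interval, normal approximation."""
    return (mean - 1.96 * u, mean + 1.96 * u)

# With the statistical uncertainty alone, the intervals are separated...
a, b = ci95(m_low, u_stat), ci95(m_high, u_stat)
print(a[1] < b[0])   # True: no overlap

# ...but with the combined uncertainty they overlap considerably.
c, d = ci95(m_low, u_comb), ci95(m_high, u_comb)
print(c[1] > d[0])   # True: overlap
```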

Well, no, I take back some of what I just said. The most troublesome aspect is that they did not distinguish between proper chocolate and milk chocolate. Whereas I will admit that milk chocolate is better than no chocolate whatsoever, I want to know whether halfway-decent chocolate is better at curing my depression than insipid milk chocolate. I spend a lot of money every year on dark chocolate bars, and (my dental health notwithstanding), I want to know if it’s doing me any good.

Reference. Natalie Rose, Sabrina Koperski, and Beatrice A. Golomb, “Mood Food: Chocolate and Depressive Symptoms in a Cross-sectional Analysis,” Arch. Intern. Med. 170(8):699–703 (2010); http://archinte.ama-assn.org/cgi/co[…]rt/170/8/699. Subscription required.

Acknowledgment. Thanks to Beatrice Golomb for providing a reprint of her article almost instantaneously.

17 Comments

…they report the standard deviation of their data sets but make no effort to include the standard uncertainty of either metric. My guess is that if they had calculated the combined standard uncertainty, their 95-% confidence intervals would have overlapped considerably.

This doesn’t make much sense.

This doesn’t make much sense.

I am afraid it does; what about it do you not understand?

I won’t get into the milk vs. dark chocolate battle again :D

I’m wondering “why just chocolate”? Why not Doritos or Twizzlers? Might depressed people also eat more Cheez-Its and Cherry Garcia? They say “These findings did not appear to be explained by a general increase in fat, carbohydrate, or energy intake”. But that’s not quite the same thing. If you’re constantly popping Cheese Doodles, as we always tell our kids, “you’re gonna spoil your appetite for dinner”. They said “Much lore but few studies describe a relation of chocolate to mood”, but the same anecdotal relationship has been suggested between eating in general and mood.

It sounds like it’s time for a follow-up study: “Chex Mix and Depressive Symptoms”!

Chocolate makes me happy…PERIOD! I’ve fortunately never really battled with depression but there are days that something decadent makes everything better.

Thanks for the article…interesting reading.

I admit I’ve taken to melting dark chocolate in a little hot water, adding some spice, beating, adding milk, heating, and drinking (if feeling decadent, add whipped cream).

I thought chocolate was to help deal with depression?! I heard it was good for it. Obviously if chocolate is a pleasing good thing, its popularity bearing witness, then someone depressed will use it more than usual. Obviously. At least evolution didn’t make a scene here in explaining it.

I am afraid it does; what about it do you not understand?

So, they just reported sample standard deviations, and not standard errors or CIs? That’s fairly standard practice. It’s trivially easy to compute the standard error/CI from the standard deviation, if the right information is given.

Why aren’t CIs typically reported? Because it is well known that you can’t make inferences from the fact that 95% CIs overlap or don’t overlap. You need a proper test, because CIs can overlap and still yield a significant result when the test is done. Now, if they didn’t offer a hypothesis test, that’s a problem; but not reporting standard errors/CIs, and possibly overlapping CIs, are not a problem.
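A quick numerical illustration of that last point, with made-up summary statistics: two 95% confidence intervals can overlap while a two-sided z-test on the difference of means is still significant at the 0.05 level (large-sample normal approximation):

```python
import math

# Made-up summary statistics for two groups -- illustration only.
m1, se1 = 10.0, 0.4   # group 1 mean and standard error
m2, se2 = 11.2, 0.4   # group 2 mean and standard error

half = 1.96           # 95% normal quantile
ci1 = (m1 - half * se1, m1 + half * se1)
ci2 = (m2 - half * se2, m2 + half * se2)
overlap = ci1[1] > ci2[0]          # do the 95% CIs overlap?

se_diff = math.hypot(se1, se2)     # standard error of the difference
z = (m2 - m1) / se_diff
p = math.erfc(z / math.sqrt(2))    # two-sided p-value

print(overlap)        # True: the intervals overlap
print(round(p, 3))    # 0.034: yet the difference is significant at 0.05
```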

So, they just reported sample standard deviations, and not standard errors or CIs? That’s fairly standard practice. It’s trivially easy to compute the standard error/CI from the standard deviation, if the right information is given.

I don’t have the paper in front of me, and I don’t recall what test they used, but they reported p-values and also showed error bars on bar graphs. I was too imprecise when I wrote about overlapping error bars – it is correct that error bars may overlap and the results may nevertheless have statistical significance.

What concerned me about this paper and probably others is that they do not consider the uncertainty of the measuring tool, which I will call Fred. Suppose that those who eat little chocolate under-report their consumption, whereas those who eat a lot of chocolate over-report theirs, but by unknown amounts. Then the dose-response curve will be flattened, and the statistical significance will be less. In other words, Fred may well have a systematic error. If we do not know the sign or the value of the systematic error, then we must compute an uncertainty. That uncertainty must then be included in any calculation of the combined standard uncertainty of the result. The key word is “combined.” The standard deviation of the mean of the raw data does not necessarily represent the entire uncertainty of any experiment.
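A toy simulation of that flattening – all numbers invented, including the mis-reporting factor: if reported consumption is stretched about the mean, the fitted dose-response slope of depression on consumption shrinks by exactly the stretch factor.

```python
# Toy demonstration -- all numbers invented.
def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

actual = [2, 4, 6, 8, 10, 12]       # hypothetical servings/week
depression = [5, 6, 8, 9, 11, 12]   # hypothetical depression scores

# Light eaters under-report, heavy eaters over-report: reported
# consumption is stretched about the mean by an unknown factor (here 1.5).
m = sum(actual) / len(actual)
reported = [m + 1.5 * (a - m) for a in actual]

print(slope(actual, depression))    # true dose-response slope
print(slope(reported, depression))  # flattened: true slope divided by 1.5
```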

Why aren’t CIs typically reported? Because it is well known that you can’t make inferences from the fact that 95% CIs overlap or don’t overlap. You need a proper test, because CIs can overlap and still yield a significant result when the test is done. Now, if they didn’t offer a hypothesis test, that’s a problem; but not reporting standard errors/CIs, and possibly overlapping CIs, are not a problem.

Yes. That was not my concern. My concern was that they did not include Fred’s uncertainties in their calculation.

My position on overlapping error bars, said tongue-in-cheek but not entirely frivolously: If the error bars obviously do not overlap, the means are different; if the error bars overlap greatly, the means are the same; if the error bars barely or nearly overlap, call a statistician. Here “error bar” means 95-% confidence interval.

Another problem, I think, is that they binned the data into depressed and not depressed, and averaged the data in the 2 bins. But the data in fact fall on a curve, so it is not necessarily meaningful to average the data in the bins. Is that right?

I wonder whether they weighted the data to adjust for the great disparity in sampling with respect to gender, and whether they made any adjustments in their analysis that would minimize such bias.

Suppose that those who eat little chocolate under-report their consumption, whereas those who eat a lot of chocolate over-report theirs, but by unkown amounts. Then the dose-response curve will be flattened, and the statistical significance will be less.

If you don’t know how much they consumed, there’s no way to correct for this systematic error (since you have no idea what that systematic error might be). This is less about statistics and more about being humble about what your data tell you. To give an extreme example, let’s say that there’s a stereotype in our culture that people eat chocolate when they’re depressed. Suppose this stereotype makes eating chocolate more salient to depressed people: “Wow, I’m a walking stereotype!” So they are more likely to remember eating chocolate when they fill out their inventory.

Depressed people will report eating more chocolate than non-depressed people even if consumption is the same across the populations, and there is no amount of statistics that will enable you to figure this out, unless you have data on actual consumption - so, you have to understand that you’ve really found a relationship between reported consumption and depression, not consumption and depression.

The extent to which it is reasonable to infer consumption from reported consumption is a matter that can’t be divined from data with no consumption included. Of course, if there is data elsewhere on Fred, maybe you could do it, but that’s not my literature.

Both men and women showed the positive correlation.

If you don’t know the accuracy of Fred, you have to estimate his uncertainty and add it in quadrature with the statistical uncertainty of the study. That is not exactly statistics, but it is error (or uncertainty) analysis. I do not know anything about Fred either, but his uncertainty cannot be ignored.

If you don’t know the accuracy of Fred, you have to estimate his uncertainty and add it in quadrature with the statistical uncertainty of the study.

How would you do this without knowing anything about actual consumption of chocolate?

I do not know anything about Fred either, but his uncertainty cannot be ignored.

Fred is an errorless measure of reported consumption. If you interpret the data properly, there is no uncertainty in Fred.

How would you do this without knowing anything about actual consumption of chocolate?

I haven’t the foggiest – not my field. But presumably you would have to find a study that compared reported consumption (of whatever) with actual consumption. It would be very surprising if the noise in such samples were 0 or even close to 0 – not to mention systematic error.

Fred is an errorless measure of reported consumption. If you interpret the data properly, there is no uncertainty in Fred.

Absolutely. But what we want to know is how actual chocolate consumption correlates with depression.

Not having read the paper, I am wondering whether “FRED” might have been a latent variable derived from some multivariate statistical analysis (some kind of Factor Analysis), arising from the need to try to show some positive correlation between actual chocolate consumption and the relative severity of depression.

Absolutely. But what we want to know is how actual chocolate consumption correlates with depression.

In science you can’t always interpret the data the way you like…but, at any rate, since they found a correlation between depression and reported chocolate consumption, all you have to do is posit that there’s a monotonic relationship between reported consumption and actual consumption. That’s a fairly weak claim (and presumably Fred would be worthless without it), but it allows you to infer a relationship between depression and actual consumption by transitivity with no extra statistics.
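A small sketch of that transitivity argument, on invented data: a rank-based measure of association such as Spearman’s correlation is unchanged by any monotonic mis-reporting, so a rank correlation with reported consumption is automatically a rank correlation with actual consumption.

```python
# Invented data -- for illustration only.
def ranks(xs):
    """Rank of each element (0 = smallest); assumes no ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = float(rank)
    return r

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy)

def spearman(x, y):
    """Spearman rank correlation (no ties)."""
    return pearson(ranks(x), ranks(y))

actual = [1, 3, 2, 8, 5, 13, 9, 21]        # hypothetical servings/week
depression = [2, 5, 4, 9, 6, 15, 10, 18]   # hypothetical depression scores
reported = [a ** 0.5 for a in actual]      # any monotonic mis-reporting

# The monotonic transform leaves the ranks -- and hence the rank
# correlation -- unchanged.
print(spearman(depression, actual) == spearman(depression, reported))  # True
```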

…it allows you to infer a relationship between depression and actual consumption by transitivity with no extra statistics.

Yes, but how strong a relationship? How interesting is it if (to exaggerate for effect) depressives consumed 1 % more chocolate than nondepressives? It is really important to do the full analysis.

I bet if they checked only the females versus date, they’d find a periodic effect.

Chocolate helps.

About this Entry

This page contains a single entry by Matt Young published on May 5, 2010 12:48 PM.
