I wanted to write this as a companion to something I wrote on my professional blog, Will Big Data Give Us a Whole Bunch of Questionable Correlations?
That post, and this one, are also based on a recent episode of Seth Godin’s podcast, Sample Size.
I think both of those things are applicable here, and I wanted to point my readers to both of those links, because I think in the mental health care field, we’re already there. The simple fact of the matter is that anyone can grab hundreds, thousands, even millions of data points and throw them into an artificial intelligence tool or an analytical study.
And when we do, we begin to see correlations. We see the kinds of things that we can’t explain, like why sales of margarine go up when the divorce rate in Maine goes up, if I remember Seth’s example correctly. It does. We can’t explain why it does, and our natural tendency is to start looking for reasons why those two things seem to correlate so closely. We don’t really stop to consider that it might just be random. It may be a coincidence, and the two may one day stop moving in the same direction for the same reason they move together now, which is no reason at all.
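For anyone who wants to see how easily this happens, here is a minimal sketch in Python. It is my own illustration, not Seth's example and not real data: it generates a couple hundred series of pure random noise (the counts `num_series=200` and `num_points=10` are arbitrary assumptions, think of ten yearly readings each) and then looks for the pair that lines up most closely.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(num_series=200, num_points=10, seed=0):
    """Return the largest |correlation| found among unrelated random series.

    Every series is independent noise, so any strong correlation
    found here exists purely by chance.
    """
    rng = random.Random(seed)
    series = [
        [rng.gauss(0, 1) for _ in range(num_points)]
        for _ in range(num_series)
    ]
    # Compare every series against every other one and keep the strongest match.
    return max(abs(pearson(a, b)) for a, b in combinations(series, 2))


best = strongest_chance_correlation()
print(f"Strongest correlation among unrelated series: {best:.2f}")
```

With 200 series there are nearly 20,000 pairs to compare, and with only ten points each, a few of those pairs will track each other very closely even though nothing connects them. The more data we throw in, the more of these we should expect to find.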
I feel like when it comes to mental health, physical health, childhood development, etc., we already do this every single day. We see that people who do “X” seem to be less depressed on average than people who do “Y”, and we immediately see headlines touting “X” as a cure for depression, and “Y” as the explanation for depression, when in fact there’s no proof of either of those things. The correlation may just be random and coincidental, or it might be true, but only in 0.05% of cases.
Studying all of this data can tell us quite a lot that we don’t know, and I welcome anything that can help us understand mental health, on both the cause and the treatment sides of the equation, but we need to be careful about some of what we see in these study results. No single study, and no single seemingly correlated bit of data, is going to explain all of mental health. If it were that simple, we would not be where we are.
As I mention in my other post, the key is not finding various behaviors that correlate with poor mental health outcomes, because the machines will continue to make that easy. The key is going to be understanding which ones really matter, and which ones are just random noise. That’s not so easy. When we see something, for example, that indicates that rates of mental health issues are lower in North Dakota than they are in Louisiana, do we just recommend people move to North Dakota, or is that just something random? (Or worse, is it because so few people in ND get diagnosed in the first place?)
It takes more than that one data point to answer those questions. We should avoid jumping to conclusions or recommendations based on what seems like a correlation when the correlation itself may be questionable.
We also need to be aware of how these correlations may be leading us into mistakes, and leading us down paths to recommendations that are actually false.
Take, for example, the long-believed correlation between suicide rates and holidays. We “know” that people struggling at the holidays can feel a strong level of sadness, missing lost loved ones, seeing the holiday specials with the perfect families, etc. So we end up seeing recommendation after recommendation about checking in with the people you care about at the holidays, a time when they might be more likely to die by suicide. But suicide rates actually tend to be lower during that time of year than at other times. Checking in with folks during the holidays is great, but it absolutely shouldn’t stop once the holiday season is over.
Similarly, we shouldn’t look at correlations between economic status and child abuse, or ethnicity and mental health issues, without also looking at how the underlying data is created. Yes, there are more reports of child abuse in poor communities, and maybe fewer reports of mental health issues in Asian communities, but those correlations may be based on bias and cultural stigma more than the reality of care. We shouldn’t be making decisions based on data that is biased. Let’s face it, there is, very likely, a strong stigma against mental health issues in Asian communities, and a much greater likelihood that a call to Children’s Services is going to be made about a poor family than about a wealthy one.
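To make the point about how the data is created a little more concrete, here is a rough sketch in Python. Every number in it is made up for illustration (the 5% true rate and the 60% and 20% reporting chances are assumptions of mine, not real statistics), and `observed_report_rate` is just a hypothetical helper. It gives two groups exactly the same underlying rate and changes only how likely a case is to be reported.

```python
import random


def observed_report_rate(population, true_rate, report_probability, rng):
    """Simulate how many cases end up in the data, as a share of the population.

    true_rate is the chance that any one person actually has a case;
    report_probability is the chance that an actual case gets recorded.
    """
    reports = 0
    for _ in range(population):
        has_case = rng.random() < true_rate
        if has_case and rng.random() < report_probability:
            reports += 1
    return reports / population


rng = random.Random(1)
TRUE_RATE = 0.05  # identical in both groups (made-up number)

# Only the chance of a case being reported differs (made-up numbers).
group_a = observed_report_rate(100_000, TRUE_RATE, 0.60, rng)
group_b = observed_report_rate(100_000, TRUE_RATE, 0.20, rng)

print(f"Group A reported rate: {group_a:.3f}")
print(f"Group B reported rate: {group_b:.3f}")
```

Anyone looking only at the reports would conclude that group A has roughly three times the problem group B does, when the real rate is the same in both. The gap comes entirely from who gets reported.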
But algorithms are not going to tell us that. That will require understanding, and not jumping to conclusions based on single headlines. I’m afraid there’s already too much of that, though.