Data Is Overrated
As a grizzled "data scientist" who has been in the biz long enough to remember when "data science" became a buzzword, I'm here to tell you that data is, by and large, crap. Far from being the digital gold of this generation, data is more like the fabled tulips of 17th century Holland. Sure, there are omnipresent tech giants siphoning the telemetry off your smartphone, along with the cabal of shadowy data brokers who deal in poorly secured spreadsheets filled with your sensitive personal information. All of this demand might have you believe your data is actually worth something beyond what someone will pay for it.
This is a myth. It is a fallacy perpetrated by the same people who have convinced themselves that you want to see relevant advertising, and therefore you consent by default to ubiquitous surveillance. Just to be clear, even if Amazon could predict exactly what I want the most at any given moment, I wouldn't consent to this exchange. But they can't even do that right. Their laser-targeted mind control ray divined that because I once purchased a math textbook, obviously I want to buy every textbook ever written on the subject. Truly, they have a window into my soul.
Back in the hallowed ages of antiquity, I used to teach statistics. In those days, whenever people would ask me what I did for a living, their immediate response would invariably be to tell me that they hated statistics. "God, it's just so boring, why would anyone want to do that?" This phenomenon was by no means confined to me, and the broader statistical community was getting sick of hearing this tripe. Then, in 2015, the statisticians' union undertook a rebranding campaign, where statistics became "data science" and predictive modeling became "artificial intelligence."
And boy did it work. Data science is more popular than ever, energized by trendy buzzwords that belie the inherent uncertainty in all statistical methods. A tool that is easy to misinterpret, raises more questions than it answers, and often yields no results whatsoever is a tough sell. Sounds like those boring old statistics. AI sounds like it'll do the thinking for me, even though it has all of those same problems. Machine learning sounds like my computer will learn the context of my work and be able to make judgements about it. Wrong. The only judgements a computer can reliably make are those with no uncertainty. The timer hits zero and the traffic light changes. Period. For any problem with uncertainty, all a computer can do is give you the probabilities to interpret for yourself. For those who think that's too much work, there's no shortage of "algorithms" promising to remove the painful burden of thinking, the results of which are predictably bad.
Fun fact: AI does not exist. No matter how much processing power you throw at a regression model, it will not gain sentience.