The Limits of Big Data
Where Arbesman's unit of analysis is the fact, Silver focuses on "predictive validity." He has the good grace and self-awareness to accept human frailty as a design constraint. "But I'm of the view we can never achieve perfect objectivity, rationality or accuracy in our beliefs," Silver writes.

"Instead we can strive to be less subjective, less irrational and less wrong. [emphasis in original] Making predictions based on our beliefs is the best (and perhaps only) way to test ourselves. If objectivity is the concern for a greater truth beyond our personal circumstances, and prediction is the best way to examine how closely aligned our personal perceptions are with that greater truth, the most objective among us are those who make the most accurate predictions."

I wonder, however, if Silver is fully aware of the cumulative effect his mix of cautionary tales and shocking failures might have on readers who take his reporting to heart and mind. He provides example after example of flawed and biased human beings building flawed and biased models and using them in flawed and biased ways. He provides a superb riff on "overfitting" in statistical models. By trying to get their models to fit the data a little too well, Silver explains, statisticians all too frequently end up making them far less accurate and reliable for prediction.

To the extent that Silver's stories present a fair sampling of today's predictive modelers, this book predicts neither a happy nor brave new world of statistics-driven success. In this world, average performance may prove to be quite a few standard deviations away from world class.

Silver cites Philip Tetlock's classic study of expertise, which shows that "experts" in a disconcerting number of disciplines are disproportionately worse than chance at predicting likely outcomes. Experts also tend to be disproportionately overconfident about the quality of their predictions. In short, expertise frequently yields the worst of both worlds: wrong answers stewed in arrogance. This is not a recipe for success.

Between IBM's Jeopardy-winning Watson, Google's search algorithms and Amazon's recommendation engines, there's no doubt that data-driven computational systems can enjoy remarkable success, particularly when they focus on real-life testing rather than abstract theory. "Companies that really 'get' Big Data, like Google, aren't spending a lot of time in model land," Silver writes. "They're running hundreds of thousands of experiments every year and testing their ideas on real customers."

The ironic takeaway from both these fine books, however, is that the more data and facts one has, and the more predictions matter, the more important human judgment becomes. The co-evolution of human beings, datasets, and algorithms will ultimately determine whether Big Data creates new wealth or destroys old value.

A research fellow at MIT Sloan School's Center for Digital Business, Michael Schrage is a former Fortune columnist and the author of Who Do You Want Your Customers To Become?

Translator of the Chinese edition: Ren Wenke