Big data: A high-tech "crystal ball" for predicting the future
Early last year, you might recall, Target found itself at the center of a storm of outrage. The retailer's number crunchers had come up with a statistical method for predicting which of its customers were most likely to become pregnant in the near future, giving Target's marketers a head start on pitching them baby products.

The model worked: Target expanded its customer base for pregnancy and infant-care products by about 30%. But the media brouhaha, with everyone from The New York Times to Fox News accusing the company of "spying" on shoppers, took weeks to die down.

If Target's success at setting its sights on potential moms-to-be gives you the creeps, Eric Siegel's new book could ruin your whole day. Siegel is a former Columbia professor whose company, Predictive Impact, builds mathematical models that cull valuable nuggets of data from floods of raw information. Companies use the tools to forecast everything from what we'll shop for, to which movies we'll watch, to how likely we are to be in a car accident or default on our credit cards.

In Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, Siegel explains how these models work and where the pitfalls are, in clear, colorful terms. Simply put, predictive analytics, or PA, is the science of learning from experience. Starting with data about the past and current behavior of a given group of people -- whether customers, patients, prison inmates up for parole, voters, or employees -- analysts can predict what they'll probably do next.

This kind of high-tech crystal ball is behind "the growing trend to make decisions more 'data driven,'" Siegel writes. "In fact, an organization that doesn't leverage its data in this way is like a person with a photographic memory who never bothers to think."

Predictive Analytics is packed with examples of how Citi, Facebook, Ford, IBM, Google, Netflix, PayPal and many other businesses and government agencies have put PA to work. Pfizer, for instance, has a predictive model to foretell the likelihood that a patient will respond to a given new drug within three weeks. LinkedIn uses PA to pinpoint the fellow members you might want as connections. At the IRS, a mathematical ranking system applied to past tax returns "empowered IRS analysts to find 25 times more tax evasion, without increasing the number of investigations."

And then there's Hewlett-Packard. A couple of years ago, alarmed by annual turnover rates in some divisions as high as 20%, HP decided to try anticipating which of its 330,000 employees worldwide were most likely to quit. Beginning with reams of data on things like salaries, raises, promotions, and job rotations, a team of analysts correlated that information with detailed employment records of people who had already left. Based on the similarities they found, the researchers assigned each current employee a Flight Risk score.
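For readers curious what "learning from experience" looks like in practice, here is a rough sketch of how a score like Flight Risk could be computed. It is not HP's model, whose details are not given here. It assumes a plain logistic regression, three invented features (raises, promotions, job rotations), and a handful of made-up records; every name and number in it is hypothetical.

```python
import math

# Made-up history, for illustration only. Each record is
# (raises in last 3 years, promotions, job rotations, left_company).
HISTORY = [
    (0, 0, 3, 1),
    (1, 0, 2, 1),
    (0, 0, 2, 1),
    (1, 0, 3, 1),
    (3, 1, 0, 0),
    (2, 1, 1, 0),
    (3, 2, 0, 0),
    (2, 0, 0, 0),
]


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit(records, epochs=2000, lr=0.1):
    """Fit logistic-regression weights with batch gradient descent."""
    n_features = len(records[0]) - 1
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for *x, y in records:
            # Prediction error for this past employee.
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / len(records) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(records)
    return weights, bias


def flight_risk(employee, weights, bias):
    """Return a score between 0 and 1; higher means more like past leavers."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, employee)))


weights, bias = fit(HISTORY)
stagnant = flight_risk((0, 0, 3), weights, bias)  # no raises, many rotations
rewarded = flight_risk((3, 1, 0), weights, bias)  # regular raises, a promotion
print(f"stagnant: {stagnant:.2f}, rewarded: {rewarded:.2f}")
```

The model learns only from people who have already left or stayed, then ranks current employees by how closely they resemble the leavers. A real system would draw on far more data and would need to be checked against employees it had never seen.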