Beware the big data "barbell"
If software is eating the world, as described by the prominent venture capitalist Marc Andreessen in 2011, then big data is supposed to be saving it. Right?

Popular use of the term "big data," which is used to describe technologies that help parse datasets too large for conventional tools to handle, has exploded in the last two years -- leaving many business executives wondering if they need it. It is in many ways an echo of the 1960s, when large corporations saw early computers as (expensive, rudimentary, futuristic) competitive tools. To fear, or to embrace? And who, exactly, should need such a thing?

In an attempt to slash through the hype, Fortune rang up Gaurav Dhillon at his office in San Mateo, Calif. If his name sounds familiar, that's because Dhillon is the founder and former chief executive of Informatica (INFA), the nearly $4 billion Redwood City-based software company known for managing the data warehouses of large companies.

Dhillon, who became the chief executive of the data integration company SnapLogic in 2009, believes that big data holds big promise for big businesses -- but only in certain industries. He calls it the "big data barbell." Below are his words, edited and condensed for clarity.

Fortune: Perhaps no term has been more popular in the last year or so than "big data." It's everywhere: in keynotes at technology conferences, in briefing materials and presentation decks, in news articles about various industries. Everybody seems to think they need it -- but big data is a rather specialized type of computing, no? Is big data kind of B.S.?

Dhillon: Coming up on 22 years in the technology industry, I should have some kind of perspective. Back in 2002, I used the term "the information tsunami." And here we are today.

I think what is true is that data under management has gotten bigger. Initially, the roots of this industry in the last century, before the web, were in retail and bar code scans and UPC codes, as you call them, to stock shelves. That was the birth of the data warehousing industry: early analytics. That industry drove marketing decisions, pricing decisions, retail forecasting, and so on.

The trend will continue; it's not suddenly going to change. A scientist said, "Science advances one funeral at a time." So I think the benefit of being able to use data to make decisions, and make bigger data to make more possible decisions, will continue.

The fact that data is "bigger" -- well yes, my garage has more stuff in it than it did 10 years ago! Everybody has more stuff [over time].

But the interesting twist is that big data has an element of data science, which I think is more important. It first makes small data out of big data and then it looks for signals in that small data to understand what to do: Who's going to win the election? What are the correlations between weather and language? Things that we simply didn't have enough processing power for in the last century. And now you've got a democratizing aspect with Hadoop and other things. So you had a fundamental shift around price and performance around compute.

The benefits of that are in some cases pretty clear, and in some cases there is gee-whiz science for which the benefits are not. So I think this aspect of being able to get a lot of information by increasingly electronic things -- the supermarket, bridges, cars, roads -- so you have sensor data. More data doesn't make you any smarter; it just means you spent a lot of money to store it. This is where the market will shake out -- the benefits.

In retail it's clear. Pricing, etc. The financial industry -- that's clear. But in certain industries, it's not clear, putting all this effort in rather than looking at the R&D budget or spending on marketing. I'm not here to tell you it's a panacea; I'm here to tell you that managing that data ... people are going to get varying mileage from it.

Fortune: On this week's episode of Mad Men, the ad agency Sterling Cooper & Partners replaces a meeting room with a new tool: an IBM System/360 mainframe computer. Some characters want the computer for competitive reasons; some want it because they see it as the future. Others are terrified that it will replace them. Is that how people look at big data?

Dhillon: The fear of computers has, in fact, left the building. New generations of employees, people who graduated this millennium, my kids -- 13 and 6. The Millennials are not afraid of computers -- they may not be programmers, but they're tech-savvy. We think of them as citizen integrators. Captain America: The Winter Soldier was all about the dark side of big data. Today, there's more of an arms race of, "We don't want to be left behind." There are Orwellian concerns around big data in society, but not in business. But in business, there are issues around having the wrong data or not being able to get at information -- that's the same as it was 50, 60 years ago. At SnapLogic, we're trying to finish some unfinished business. Why is this so hard in 2014?