Mining Twitter data: getting to the bottom of it all
People who were 13 years old in 2006 are 20 now. Many of them no doubt would like to erase much of their online histories, especially the stuff they wrote on Twitter in their early-teen years: say, somebody who's now a fan of 2 Chainz, but who seven years ago tweeted over and over again about how much they loved Fall Out Boy's "Dance Dance."

Too bad, because those tweets from the past are all accessible, along with every other tweet written since Twitter launched in 2006. The social-media indexing and analysis firm Topsy announced on Wednesday that it had indexed every tweet from the beginning: in all, about 425 billion pieces of content, including pictures and pages linked from tweets. Before now, the index had reached back only to 2010.

The service is free. Users can do all kinds of things with it, such as narrowing the time range of a search and graphing the results. Topsy's algorithms allow results to be ranked by the number of retweets and by the popularity of a particular Twitter user.

Marketers and social scientists will no doubt find uses for all that data that haven't even been thought of yet, including uses for the really old stuff, which at first blush might not appear to be particularly valuable. But data researchers routinely go back a lot further than seven years to study things like car-buying habits. The older Twitter gets, the more valuable the data will be for studying things like consumer sentiment or reactions to politicians.

Topsy makes most of its revenue from selling access to high-end, more granular data-analysis tools, which can do things like break down tweets by geographic origin or compare terms. Subscribers pay a minimum of $1,000 a month for those services.

Good news, by the way, for Fall Out Boy: The band's name has been tweeted 273,000 times in the past 30 days. (Fortune China)