The Limitations of Big Data

Clifton Leaf 2017-08-08
In places where there are tons of potentially useful data to examine, we don’t make the data accessible in ways that people actually want to use.

Chinese translation: Yan Kuangzheng (Fortune China)

“Every revolution in science—from the Copernican heliocentric model to the rise of statistical and quantum mechanics, from Darwin’s theory of evolution and natural selection to the theory of the gene—has been driven by one and only one thing: access to data.”

That was the eye-opening opening of a keynote address given yesterday by the brilliant John Quackenbush, a professor of biostatistics and computational biology at Dana-Farber Cancer Institute who has a dual professorship at the Harvard T.H. Chan School of Public Health and ample other academic credits after his name.

There is also no question that this digital fuel is driving virtually every transformation in healthcare happening today. Speaking at the MedCity Converge conference in Philadelphia, Quackenbush noted that the average hospital is generating roughly 665 terabytes of data annually, with some four-fifths of it in the unstructured forms of images, video, and doctor’s notes.

But the great limiting factor in harnessing all of this information-feedstock is not a “big data problem,” but rather a “messy data problem.”

In sum, in places where there are tons of potentially useful data to examine, we don’t make the data accessible in ways that people actually want to use. Either the data isn’t easy or intuitive to access or it simply isn’t informative. Or it’s in the wrong format. Or it’s incomplete, or created with incompatible “standards” (of which we seem to have an unlimited, irreconcilable supply). Or it captures just one dimension of a multidimensional realm. (“Biological systems are really complex, adaptive systems with many moving parts, that we’ve only begun to scratch the surface of understanding,” he says.)

Or—and this one seems to be a surprisingly common misstep—the data doesn’t really address the question the end user wants to answer. It’s off-purpose, in other words.

Take the case of population-level data, which government and academic institutions routinely collect: “Statistics operate on population data and medical research is driven by population data,” says Quackenbush, “but medical care is driven by individual-level data. So when we’re driving [our data research] to the clinic, we have to think about how we’re going to make that individual-level [data] available in a meaningful format.”

Ultimately, the goal, he says, should be to “create intuitive graphical representations of the underlying data” in ways that allow non-data scientists “to explore it without having to sit at a terminal and type in a bunch of obscure commands.”

“What you want to think about doing when you make data available to people is to create interfaces that allow them to dive in and make sense of that data, using their own intuition,” Quackenbush says.

Without doing that, all of our growing mounds of big data will simply be big blobs on ever-bigger data servers.

What’s to stop that from happening? The incentive for turning all this raw feedstock into a usable fuel “is not going to be enhancing healthcare or making people better,” Quackenbush says flatly. “The driver is really going to be the most important ‘–omics’ science of all: which is economics. We have to show that there’s an advantage to bringing this kind of data and information together if we’re really going to make advances.”
