Artificial Intelligence (AI) is often regarded as the modern-day equivalent of electricity, powering countless human interactions daily. However, startups and developing nations face a clear disadvantage as Big Tech companies and richer nations dominate the field, especially when it comes to two critical areas: training datasets and computational power.
The global regulatory landscape for AI is highly complex and fragmented, shaped by divergent regulations and differing modes of collaboration among private- and public-sector stakeholders. This complexity is further exacerbated by the need to harmonize regulatory frameworks and standards across international borders.
The regulations governing fair use of AI training datasets differ across regions. For instance, the European Union’s AI Act prohibits the use of copyrighted materials for training AI models without explicit authorization from rights holders. Conversely, Japan’s Text and Data Mining (TDM) law permits the use of copyrighted data for AI model training without distinguishing between legally and illegally accessed materials. China, meanwhile, has introduced several principles and regulations governing the use of AI training datasets that are closer to the EU’s approach in that they require training data to be lawfully obtained. However, those regulations target only AI services accessible to the general public, excluding systems developed and used by enterprises and research institutions.
The regulatory environment often shapes a startup’s trajectory, significantly influencing its ability to innovate and scale. An AI startup focused on training models, whether in the pre-training or post-training phase, will encounter varying regulatory challenges that could affect its long-term success, depending on the region in which it operates. For example, because it would be protected by Japan’s TDM law, a startup in Japan would have an advantage over one in the EU when it comes to crawling copyrighted internet data and using it to train powerful AI models. Given that AI technologies transcend national borders, this necessitates collaborative cross-border solutions and global cooperation among key stakeholders.
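To make the crawling question concrete, here is a minimal sketch of how a startup’s crawler might check for machine-readable opt-out signals before collecting a page as training data. It is an illustration, not compliance advice: the tdm-reservation header is modeled on the draft TDM Reservation Protocol, the "noai" meta tag is an informal convention, and whether honoring either is legally required depends entirely on the jurisdiction, as discussed above.

```python
# Sketch: check common machine-readable opt-out signals before collecting a
# page for AI training data. Illustrative only; not legal or compliance advice.
import urllib.robotparser
from urllib.parse import urlparse

import requests  # third-party HTTP client, assumed installed


def may_collect_for_training(url: str, user_agent: str = "example-training-bot") -> bool:
    """Return True only if none of the checked opt-out signals are present."""
    parsed = urlparse(url)
    robots_url = f"{parsed.scheme}://{parsed.netloc}/robots.txt"

    # 1) Respect robots.txt rules for this crawler's user agent.
    robots = urllib.robotparser.RobotFileParser()
    robots.set_url(robots_url)
    try:
        robots.read()
        if not robots.can_fetch(user_agent, url):
            return False
    except OSError:
        pass  # robots.txt unreachable; continue with the remaining checks

    # 2) Look for a text-and-data-mining reservation header
    #    (header name taken from the draft TDM Reservation Protocol).
    response = requests.get(url, headers={"User-Agent": user_agent}, timeout=10)
    if response.headers.get("tdm-reservation") == "1":
        return False

    # 3) Crude check for an opt-out robots meta tag, e.g. content="noai".
    if 'name="robots"' in response.text and "noai" in response.text.lower():
        return False

    return True


if __name__ == "__main__":
    print(may_collect_for_training("https://example.com/article"))
```

The same checks carry very different legal weight in different jurisdictions, which is exactly the asymmetry between a startup in Japan and one in the EU described above.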
In terms of computational power, a significant disparity exists between large players—whether state-owned or private entities—and startups. Bigger tech companies and state entities have the resources to buy and hoard computational power that would support their future AI development goals, whereas smaller players that do not have those resources depend on the bigger players for AI training and inference infrastructure. The supply chain issues surrounding compute resources have intensified this gap, which is even more pronounced in the global South. For example, out of the top 100 high-performance computing (HPC) clusters in the world capable of training large AI models, not one is hosted in a developing country.
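A rough back-of-envelope calculation shows why this matters. A common approximation is that training a dense transformer once takes on the order of 6 × N × D floating-point operations, where N is the parameter count and D the number of training tokens. The sketch below turns that into GPU-hours; the per-GPU throughput and utilization figures are assumptions chosen purely for illustration, not measurements of any particular accelerator.

```python
# Back-of-envelope training-compute estimate using the common ~6*N*D FLOPs
# approximation for dense transformers. All hardware figures are illustrative
# assumptions, not vendor specifications.

def training_gpu_hours(params: float, tokens: float,
                       peak_flops_per_gpu: float = 1e15,  # assumed ~1 PFLOP/s per GPU
                       utilization: float = 0.4) -> float:  # assumed sustained utilization
    """Estimate GPU-hours for a single training run of a dense transformer."""
    total_flops = 6 * params * tokens
    sustained_flops = peak_flops_per_gpu * utilization
    return total_flops / sustained_flops / 3600


# Example: a 70-billion-parameter model trained on 2 trillion tokens.
print(f"{training_gpu_hours(70e9, 2e12):,.0f} GPU-hours under the assumed figures")
```

Under these illustrative assumptions, the example works out to roughly 580,000 GPU-hours for a single run, before any failed experiments or retraining—a scale of commitment that well-resourced players can absorb but that most startups and institutions in the Global South cannot.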
In October 2023, the UN’s High-Level Advisory Body (HLAB) on AI was formed as part of the UN Secretary-General’s Roadmap for Digital Cooperation and was designed to offer UN member states analysis and recommendations for the international governance of AI. The group is made up of 39 people with diverse backgrounds (by geography, gender, age, and discipline), spanning government, civil society, the private sector, and academia, to ensure that its recommendations for AI governance are both fair and inclusive.
As part of this process, we conducted interviews with experts from startups and small-to-medium enterprises (SMEs) to explore the challenges they face in relation to AI training datasets. Their feedback underscored the importance of a neutral, international body, such as the United Nations, in overseeing the international governance of AI.
The HLAB’s recommendations on AI training dataset standards, covering both pre-training and post-training, are detailed in the new report Governing AI for Humanity and include the following:
1. Establishing a global marketplace for the exchange of anonymized data that standardizes data-related definitions, principles for the global governance of AI training data and AI training data provenance, and transparent, rights-based accountability. This includes introducing data stewardship and exchange processes and standards (a purely illustrative sketch of such a record follows this list).
2. Promoting data commons that incentivize the curation of underrepresented or missing data.
3. Ensuring interoperability for international data access.
4. Creating mechanisms to compensate data creators in a rights-respecting manner.
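As a thought experiment, the snippet below sketches the kind of provenance record the first recommendation gestures at: a single dataset listing that carries its source, licensing, lawful-access attestation, anonymization status, and a pointer to a compensation mechanism. Every field name is hypothetical; the HLAB report calls for standardizing such definitions but does not prescribe a schema.

```python
# Hypothetical provenance record for one dataset listed in a global data-exchange
# marketplace. Field names are illustrative only; the HLAB recommendations do not
# define a concrete schema.
from dataclasses import dataclass, field


@dataclass
class TrainingDataListing:
    dataset_id: str                  # identifier within the marketplace
    source: str                      # where the data was originally collected
    license: str                     # terms under which the data may be used for training
    lawfully_obtained: bool          # attestation of lawful access before listing
    anonymized: bool                 # personal data removed or pseudonymized before exchange
    languages: list[str] = field(default_factory=list)   # helps surface underrepresented data
    compensation_ref: str = ""       # pointer to a rights-respecting payment mechanism


listing = TrainingDataListing(
    dataset_id="example-0001",
    source="community news archive (illustrative)",
    license="CC BY 4.0",
    lawfully_obtained=True,
    anonymized=True,
    languages=["sw", "ha"],
    compensation_ref="royalty-pool:example",
)
```

Fields such as languages and compensation_ref map directly onto recommendations 2 and 4 above.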
To address the compute gap, the HLAB proposes the following recommendations:
1. Developing a network for capacity building under common-benefit frameworks to ensure equitable distribution of AI’s benefits.
2. Establishing a global fund to support access to computational resources for researchers and developers aiming to apply AI to local public-interest use cases.
International governance of AI, particularly of training datasets and computational power, is crucial for startups and developing nations. It provides a robust framework for accessing essential resources and fosters international cooperation, positioning startups to innovate and scale responsibly in the global AI landscape.
Nazneen Rajani, PhD, is the CEO of Collinear AI and a member of the UN’s High-Level Advisory Body on AI.
The opinions expressed in Fortune.com commentary pieces are solely the views of their authors and do not necessarily reflect the opinions and beliefs of Fortune.