{"id":580,"date":"2020-08-27T20:39:41","date_gmt":"2020-08-27T20:39:41","guid":{"rendered":"http:\/\/sigai.acm.org\/aimatters\/blog\/?p=580"},"modified":"2020-08-27T20:40:54","modified_gmt":"2020-08-27T20:40:54","slug":"ai-data","status":"publish","type":"post","link":"https:\/\/sigai.acm.org\/aimatters\/blog\/2020\/08\/27\/ai-data\/","title":{"rendered":"AI Data"},"content":{"rendered":"\n<p>Confusion in the popular media about terms such as algorithm and what constitutes AI technology cause critical misunderstandings among the public and policymakers. More importantly, the role of data is often ignored in ethical and operational considerations. Even if AI systems are perfectly built, low quality and biased data cause unintentional and even intentional hazards.<\/p>\n\n\n\n<p><strong>Language Models and Data<\/strong><\/p>\n\n\n\n<p>A generative pre-trained transformer GPT-3 is currently in the news. For example, James Vincent in the&nbsp;July 30, 2020, <a href=\"https:\/\/www.theverge.com\/21346343\/gpt-3-explainer-openai-examples-errors-agi-potential\">article<\/a> in <em>The Verge<\/em> writes about GPT-3, which was created by <a href=\"https:\/\/www.wired.com\/story\/compete-google-openai-seeks-investorsand-profits\/\">OpenAI<\/a>. Language models,&nbsp;GPT-3 the current ultimate product, have ethics issues on steroids for products being made. Inputs to the system have all the liabilities discussed about Machine Learning and Artificial Neural Network products. The dangers of bias and mistakes are raised in some writings but are likely not a focus among the wide range of enthusiastic product developers using the&nbsp;open-source GPT-3.&nbsp;Language models suggest output sequences of words given an input sequence. Thus, samples of text from social media can be used to produce new text in the same style as the author and potentially can be used to influence public opinion. 
Language models fed poor-quality inputs have been found to promulgate incorrect grammar and misused terms.&nbsp;An <a href=\"https:\/\/towardsdatascience.com\/gpt-3-101-a-brief-introduction-5c9d773a2354\">article<\/a> by David Pereira includes examples of, and commentary on, the use of GPT-3. The <a href=\"https:\/\/www.theguardian.com\/commentisfree\/2020\/aug\/01\/gpt-3-an-ai-game-changer-or-an-environmental-disaster\">article<\/a> \u201cGPT-3: an AI Game-Changer or an Environmental Disaster?\u201d by John Naughton gives examples of, and commentary on, results from GPT-3.<\/p>\n\n\n\n<p><strong>Data Governance<\/strong><\/p>\n\n\n\n<p>A possible meta-solution that could help policymakers keep up with technological advances is <a href=\"https:\/\/www.datanami.com\/2019\/05\/17\/ai-ethics-and-data-governance-a-virtuous-cycle\/\">discussed<\/a> by Alex Woodie in \u201cAI Ethics and Data Governance: A Virtuous Cycle.\u201d<\/p>\n\n\n\n<p>He quotes James Cotton, international director of the Data Management Centre of Excellence at&nbsp;<a href=\"http:\/\/www.informationbuilders.com\/\">Information Builders&#8217;<\/a> Amsterdam office: \u201cas powerful as the AI technology is, it can\u2019t be implemented in an ethical manner if the underlying data is poorly managed and badly governed. It\u2019s critical to understand the relationship between data governance and AI ethics. One is foundational for the other. You can\u2019t preach being ethical or using data in an ethical way if you don\u2019t know what you have, where it came from, how it\u2019s being used, or what it\u2019s being used for.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Confusion in the popular media about terms such as algorithm, and about what constitutes AI technology, causes critical misunderstandings among the public and policymakers. More importantly, the role of data is often ignored in ethical and operational considerations. 
Even if AI systems are perfectly built, low-quality or biased data can cause unintentional and even intentional hazards. &hellip; <a href=\"https:\/\/sigai.acm.org\/aimatters\/blog\/2020\/08\/27\/ai-data\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;AI Data&#8221;<\/span><\/a><\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[4],"tags":[],"_links":{"self":[{"href":"https:\/\/sigai.acm.org\/aimatters\/blog\/wp-json\/wp\/v2\/posts\/580"}],"collection":[{"href":"https:\/\/sigai.acm.org\/aimatters\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sigai.acm.org\/aimatters\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sigai.acm.org\/aimatters\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/sigai.acm.org\/aimatters\/blog\/wp-json\/wp\/v2\/comments?post=580"}],"version-history":[{"count":3,"href":"https:\/\/sigai.acm.org\/aimatters\/blog\/wp-json\/wp\/v2\/posts\/580\/revisions"}],"predecessor-version":[{"id":584,"href":"https:\/\/sigai.acm.org\/aimatters\/blog\/wp-json\/wp\/v2\/posts\/580\/revisions\/584"}],"wp:attachment":[{"href":"https:\/\/sigai.acm.org\/aimatters\/blog\/wp-json\/wp\/v2\/media?parent=580"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sigai.acm.org\/aimatters\/blog\/wp-json\/wp\/v2\/categories?post=580"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sigai.acm.org\/aimatters\/blog\/wp-json\/wp\/v2\/tags?post=580"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}