{"id":221257,"date":"2024-04-06T17:17:40","date_gmt":"2024-04-06T17:17:40","guid":{"rendered":"https:\/\/michigandigitalnews.com\/index.php\/2024\/04\/06\/openai-and-google-reportedly-used-transcriptions-of-youtube-videos-to-train-their-ai-models\/"},"modified":"2025-06-25T17:19:10","modified_gmt":"2025-06-25T17:19:10","slug":"openai-and-google-reportedly-used-transcriptions-of-youtube-videos-to-train-their-ai-models","status":"publish","type":"post","link":"https:\/\/michigandigitalnews.com\/index.php\/2024\/04\/06\/openai-and-google-reportedly-used-transcriptions-of-youtube-videos-to-train-their-ai-models\/","title":{"rendered":"OpenAI and Google reportedly used transcriptions of YouTube videos to train their AI models"},"content":{"rendered":"<div>\n<p>OpenAI and Google trained their AI models on text transcribed from YouTube videos, potentially violating creators\u2019 copyrights, according to <a data-i13n=\"elm:context_link;elmt:doNotAffiliate;cpos:1;pos:1\" class=\"link \" href=\"https:\/\/www.nytimes.com\/2024\/04\/06\/technology\/tech-giants-harvest-data-artificial-intelligence.html\" rel=\"nofollow noopener\" target=\"_blank\" data-ylk=\"slk:The New York Times;elm:context_link;elmt:doNotAffiliate;cpos:1;pos:1;itc:0;sec:content-canvas\"><em>The New York Times<\/em><\/a>. The report, which describes the lengths OpenAI, Google and Meta have gone to in order to maximize the amount of data they can feed to their AIs, cites numerous people with knowledge of the companies\u2019 practices. 
It comes just days after YouTube CEO Neal Mohan said in an interview with <a data-i13n=\"elm:context_link;elmt:doNotAffiliate;cpos:2;pos:1\" class=\"link \" href=\"https:\/\/www.bloomberg.com\/news\/articles\/2024-04-04\/youtube-says-openai-training-sora-with-its-videos-would-break-the-rules?sref=10lNAhZ9&amp;embedded-checkout=true\" rel=\"nofollow noopener\" target=\"_blank\" data-ylk=\"slk:Bloomberg Originals;elm:context_link;elmt:doNotAffiliate;cpos:2;pos:1;itc:0;sec:content-canvas\"><em>Bloomberg Originals<\/em><\/a> that OpenAI\u2019s alleged use of YouTube videos to train its new text-to-video generator, Sora, <a data-i13n=\"elm:context_link;elmt:doNotAffiliate;cpos:3;pos:1\" class=\"link \" href=\"https:\/\/www.engadget.com\/youtube-ceo-warns-openai-that-training-models-on-its-videos-is-against-the-rules-121547513.html\" data-ylk=\"slk:would go against the platform\u2019s policies;elm:context_link;elmt:doNotAffiliate;cpos:3;pos:1;itc:0;sec:content-canvas\">would go against the platform\u2019s policies<\/a>.<\/p>\n<p>According to the <em>NYT<\/em>, OpenAI used its Whisper speech recognition tool to transcribe more than one million hours of YouTube videos, which were then used to train GPT-4. <a data-i13n=\"elm:context_link;elmt:doNotAffiliate;cpos:4;pos:1\" class=\"link \" href=\"https:\/\/www.theinformation.com\/articles\/why-youtube-could-give-google-an-edge-in-ai?rc=whf0fd\" rel=\"nofollow noopener\" target=\"_blank\" data-ylk=\"slk:The Information;elm:context_link;elmt:doNotAffiliate;cpos:4;pos:1;itc:0;sec:content-canvas\"><em>The Information<\/em><\/a> previously reported that OpenAI had used YouTube videos and podcasts to train the two AI systems. OpenAI president Greg Brockman was reportedly among the people on this team. 
Per Google\u2019s rules, \u201cunauthorized scraping or downloading of YouTube content\u201d is not allowed, Matt Bryant, a spokesperson for Google, told <em>NYT<\/em>, also saying that the company was unaware of any such use by OpenAI.<\/p>\n<p>The report, however, claims there were people at Google who knew but did not take action against OpenAI because Google was using YouTube videos to train its own AI models. Google told <em>NYT<\/em> it only does so with videos from creators who have agreed to take part in an experimental program. Engadget has reached out to Google and OpenAI for comment.<\/p>\n<p>The <em>NYT<\/em> report also claims Google tweaked its privacy policy in June 2022 to more broadly cover its use of publicly available content, including Google Docs and Google Sheets, to train its AI models and products. Bryant told <em>NYT<\/em> that this is only done with the permission of users who opt into Google\u2019s experimental features, and that the company \u201cdid not start training on additional types of data based on this language change.\u201d<\/p>\n<\/div>\n<p><a href=\"https:\/\/www.engadget.com\/openai-and-google-reportedly-used-transcriptions-of-youtube-videos-to-train-their-ai-models-163531073.html?src=rss\">Source link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>OpenAI and Google trained their AI models on text transcribed from YouTube videos, potentially violating creators\u2019 copyrights, according to The New York Times. 
The report, which describes<\/p>\n","protected":false},"author":1,"featured_media":221258,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[159],"tags":[],"_links":{"self":[{"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/posts\/221257"}],"collection":[{"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/comments?post=221257"}],"version-history":[{"count":1,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/posts\/221257\/revisions"}],"predecessor-version":[{"id":330135,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/posts\/221257\/revisions\/330135"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/media\/221258"}],"wp:attachment":[{"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/media?parent=221257"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/categories?post=221257"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/tags?post=221257"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}