{"id":1246714,"date":"2023-06-20T06:23:51","date_gmt":"2023-06-20T00:53:51","guid":{"rendered":"https:\/\/trak.in\/stories\/?p=1246714"},"modified":"2023-06-20T06:24:27","modified_gmt":"2023-06-20T00:54:27","slug":"youtube-videos-were-secretly-used-by-openai-to-train-chatgpt","status":"publish","type":"post","link":"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/","title":{"rendered":"Youtube Videos Were Secretly Used By OpenAI To Train ChatGPT?"},"content":{"rendered":"\n<p>According to reports, OpenAI scraped YouTube data to train Whisper, its speech-to-text AI model.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1000\" height=\"465\" src=\"https:\/\/trak.in\/stories\/wp-content\/uploads\/2023\/06\/Untitled-design-4-2-1.png\" alt=\"Youtube Videos Were Secretly Used By OpenAI To Train ChatGPT? \" class=\"wp-image-1246745\" srcset=\"https:\/\/trak.in\/stories\/wp-content\/uploads\/2023\/06\/Untitled-design-4-2-1.png 1000w, https:\/\/trak.in\/stories\/wp-content\/uploads\/2023\/06\/Untitled-design-4-2-1-300x140.png 300w, https:\/\/trak.in\/stories\/wp-content\/uploads\/2023\/06\/Untitled-design-4-2-1-768x357.png 768w\" sizes=\"auto, (max-width: 1000px) 100vw, 1000px\" \/><\/figure>\n\n\n\n<p><strong>Using YouTube<\/strong><\/p>\n\n\n\n<p>Some of the training data derived from Whisper ultimately contributed to the development of GPT-4, which is the language model behind <a href=\"https:\/\/www.businesstoday.in\/technology\/news\/story\/ai-training-goes-visual-openai-and-google-reportedly-trained-their-ai-models-on-youtube-385657-2023-06-15\">ChatGPT<\/a>.<\/p>\n\n\n\n<p>According to a report in The Information, OpenAI &#8220;has secretly used data from the site (YouTube) to train some of its artificial intelligence models&#8221;.<\/p>\n\n\n\n<p>AI models need vast amounts of training data, and YouTube is the single biggest and richest source 
of imagery, audio and text transcripts on the web.<\/p>\n\n\n\n<p><strong>Google\u2019s Gemini<\/strong><\/p>\n\n\n\n<p>Google researchers have also been using YouTube data to train and refine the company\u2019s own large language model, Gemini.<\/p>\n\n\n\n<p>Sundar Pichai, the CEO of Google, noted, \u201cGemini was created from the ground up to be multimodal, highly efficient at tool and API integrations, and built to enable future innovations, like memory and planning.\u201d<\/p>\n\n\n\n<p>He added that it offers \u201c<a href=\"https:\/\/www.financialexpress.com\/life\/technology-did-chatgpt-train-on-data-from-youtube-3129069\/\">impressive<\/a> capabilities not seen in prior models.\u201d<\/p>\n\n\n\n<p>The value of video content for AI training purposes has also been acknowledged by <a href=\"https:\/\/www.businesstoday.in\/technology\/news\/story\/ai-training-goes-visual-openai-and-google-reportedly-trained-their-ai-models-on-youtube-385657-2023-06-15\">Meta<\/a>.<\/p>\n\n\n\n<p><strong>Using video data<\/strong><\/p>\n\n\n\n<p>Yann LeCun, the AI chief at Meta Platforms, has emphasized the significance of video training data in his work.<\/p>\n\n\n\n<p>LeCun stated that a hierarchical Joint Embedding Predictive Architecture could potentially learn about the world by watching videos and interacting with its environment.<\/p>\n\n\n\n<p>His point highlights the importance of video in enabling AI models to &#8220;think&#8221; more like humans, as opposed to relying solely on text data for training.<\/p>\n\n\n\n<p><strong>Violates rules<\/strong><\/p>\n\n\n\n<p>YouTube does not permit the use of its data for such purposes.<\/p>\n\n\n\n<p>Its terms of service ban using content for anything other than &#8220;personal, non-commercial use.&#8221;<\/p>\n\n\n\n<p>Hence, training a commercially oriented AI model using such content could <a 
href=\"https:\/\/www.businesstoday.in\/technology\/news\/story\/ai-training-goes-visual-openai-and-google-reportedly-trained-their-ai-models-on-youtube-385657-2023-06-15\">potentially<\/a> violate the site&#8217;s rules.<\/p>\n\n\n\n<p><strong>Controversy<\/strong><\/p>\n\n\n\n<p>It&#8217;s an <a href=\"https:\/\/www.daijiworld.com\/news\/newsDisplay?newsID=1090297\">open<\/a> secret in the AI industry that everyone scrapes the web, and OpenAI reportedly &#8220;scraped&#8221; YouTube data to train its AI models, which are now all the rage.<\/p>\n\n\n\n<p>This has <a href=\"https:\/\/swarajyamag.com\/insta\/chatgpt-maker-used-youtube-data-to-train-ai-models-report\">provoked<\/a> debates and disputes as major technology companies increasingly move to improve their AI capabilities or offer AI-powered services.<\/p>\n\n\n\n<p>Despite the lawsuits filed against text-to-image generator firms for violating artists&#8217; copyright, large language models continue to be developed in secrecy, with no transparency about their training data.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>According to reports, OpenAI scraped YouTube data to train Whisper, its speech-to-text AI model. Using YouTube&nbsp; Some of the training data derived from Whisper ultimately contributed to the development of GPT-4, which is the language model behind ChatGPT. 
According to a report in The Information, OpenAI &#8220;has secretly used data [&hellip;]<\/p>\n","protected":false},"author":24,"featured_media":1246745,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[22],"tags":[584,1541],"class_list":["post-1246714","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology","tag-chatgpt","tag-openai"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.9 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Youtube Videos Were Secretly Used By OpenAI To Train ChatGPT? - Trak.in - Indian Business of Tech, Mobile &amp; Startups<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Youtube Videos Were Secretly Used By OpenAI To Train ChatGPT? - Trak.in - Indian Business of Tech, Mobile &amp; Startups\" \/>\n<meta property=\"og:description\" content=\"According to reports, OpenAI scraped YouTube data to train Whisper, its speech-to-text AI model. Using YouTube&nbsp; Some of the training data derived from Whisper ultimately contributed to the development of GPT-4, which is the language model behind ChatGPT. 
According to a report in The Information, OpenAI &#8220;has secretly used data [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/\" \/>\n<meta property=\"og:site_name\" content=\"Trak.in - Indian Business of Tech, Mobile &amp; Startups\" \/>\n<meta property=\"article:published_time\" content=\"2023-06-20T00:53:51+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-06-20T00:54:27+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/trak.in\/stories\/wp-content\/uploads\/2023\/06\/Untitled-design-4-2-1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1000\" \/>\n\t<meta property=\"og:image:height\" content=\"465\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Shreya Bose\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Shreya Bose\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/\",\"url\":\"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/\",\"name\":\"Youtube Videos Were Secretly Used By OpenAI To Train ChatGPT? 
- Trak.in - Indian Business of Tech, Mobile &amp; Startups\",\"isPartOf\":{\"@id\":\"https:\/\/trak.in\/stories\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/trak.in\/stories\/wp-content\/uploads\/2023\/06\/Untitled-design-4-2-1.png\",\"datePublished\":\"2023-06-20T00:53:51+00:00\",\"dateModified\":\"2023-06-20T00:54:27+00:00\",\"author\":{\"@id\":\"https:\/\/trak.in\/stories\/#\/schema\/person\/9817221c96a9aadc34f34cc7b8767dc6\"},\"breadcrumb\":{\"@id\":\"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/#primaryimage\",\"url\":\"https:\/\/trak.in\/stories\/wp-content\/uploads\/2023\/06\/Untitled-design-4-2-1.png\",\"contentUrl\":\"https:\/\/trak.in\/stories\/wp-content\/uploads\/2023\/06\/Untitled-design-4-2-1.png\",\"width\":1000,\"height\":465},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/trak.in\/stories\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Youtube Videos Were Secretly Used By OpenAI To Train ChatGPT?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/trak.in\/stories\/#website\",\"url\":\"https:\/\/trak.in\/stories\/\",\"name\":\"Trak.in - Indian Business of Tech, Mobile &amp; 
Startups\",\"description\":\"Trak.in is a popular Indian Business, Technology, Mobile &amp; Startup blog featuring trending News, views and analytical take on Technology, Business, Finance, Telecom, Mobile, startups &amp; Social Media Space\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/trak.in\/stories\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/trak.in\/stories\/#\/schema\/person\/9817221c96a9aadc34f34cc7b8767dc6\",\"name\":\"Shreya Bose\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/trak.in\/stories\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/1404d637610526a87c014a8eac9883a6c61491f1e6553d1946b0dd0b135713be?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/1404d637610526a87c014a8eac9883a6c61491f1e6553d1946b0dd0b135713be?s=96&d=mm&r=g\",\"caption\":\"Shreya Bose\"},\"url\":\"https:\/\/trak.in\/stories\/author\/shreya-bose\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Youtube Videos Were Secretly Used By OpenAI To Train ChatGPT? - Trak.in - Indian Business of Tech, Mobile &amp; Startups","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/","og_locale":"en_US","og_type":"article","og_title":"Youtube Videos Were Secretly Used By OpenAI To Train ChatGPT? - Trak.in - Indian Business of Tech, Mobile &amp; Startups","og_description":"According to reports, OpenAI scraped YouTube data to train Whisper, its speech-to-text AI model. 
Using YouTube&nbsp; Some of the training data derived from Whisper ultimately contributed to the development of GPT-4, which is the language model behind ChatGPT. According to a report in The Information, OpenAI &#8220;has secretly used data [&hellip;]","og_url":"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/","og_site_name":"Trak.in - Indian Business of Tech, Mobile &amp; Startups","article_published_time":"2023-06-20T00:53:51+00:00","article_modified_time":"2023-06-20T00:54:27+00:00","og_image":[{"width":1000,"height":465,"url":"https:\/\/trak.in\/stories\/wp-content\/uploads\/2023\/06\/Untitled-design-4-2-1.png","type":"image\/png"}],"author":"Shreya Bose","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Shreya Bose","Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/","url":"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/","name":"Youtube Videos Were Secretly Used By OpenAI To Train ChatGPT? 
- Trak.in - Indian Business of Tech, Mobile &amp; Startups","isPartOf":{"@id":"https:\/\/trak.in\/stories\/#website"},"primaryImageOfPage":{"@id":"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/#primaryimage"},"image":{"@id":"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/#primaryimage"},"thumbnailUrl":"https:\/\/trak.in\/stories\/wp-content\/uploads\/2023\/06\/Untitled-design-4-2-1.png","datePublished":"2023-06-20T00:53:51+00:00","dateModified":"2023-06-20T00:54:27+00:00","author":{"@id":"https:\/\/trak.in\/stories\/#\/schema\/person\/9817221c96a9aadc34f34cc7b8767dc6"},"breadcrumb":{"@id":"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/#primaryimage","url":"https:\/\/trak.in\/stories\/wp-content\/uploads\/2023\/06\/Untitled-design-4-2-1.png","contentUrl":"https:\/\/trak.in\/stories\/wp-content\/uploads\/2023\/06\/Untitled-design-4-2-1.png","width":1000,"height":465},{"@type":"BreadcrumbList","@id":"https:\/\/trak.in\/stories\/youtube-videos-were-secretly-used-by-openai-to-train-chatgpt\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/trak.in\/stories\/"},{"@type":"ListItem","position":2,"name":"Youtube Videos Were Secretly Used By OpenAI To Train ChatGPT?"}]},{"@type":"WebSite","@id":"https:\/\/trak.in\/stories\/#website","url":"https:\/\/trak.in\/stories\/","name":"Trak.in - Indian Business of Tech, Mobile &amp; Startups","description":"Trak.in is a popular Indian Business, Technology, Mobile &amp; Startup blog featuring trending News, views and analytical take on 
Technology, Business, Finance, Telecom, Mobile, startups &amp; Social Media Space","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/trak.in\/stories\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/trak.in\/stories\/#\/schema\/person\/9817221c96a9aadc34f34cc7b8767dc6","name":"Shreya Bose","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/trak.in\/stories\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/1404d637610526a87c014a8eac9883a6c61491f1e6553d1946b0dd0b135713be?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1404d637610526a87c014a8eac9883a6c61491f1e6553d1946b0dd0b135713be?s=96&d=mm&r=g","caption":"Shreya Bose"},"url":"https:\/\/trak.in\/stories\/author\/shreya-bose\/"}]}},"jetpack_featured_media_url":"https:\/\/trak.in\/stories\/wp-content\/uploads\/2023\/06\/Untitled-design-4-2-1.png","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/trak.in\/stories\/wp-json\/wp\/v2\/posts\/1246714","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/trak.in\/stories\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/trak.in\/stories\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/trak.in\/stories\/wp-json\/wp\/v2\/users\/24"}],"replies":[{"embeddable":true,"href":"https:\/\/trak.in\/stories\/wp-json\/wp\/v2\/comments?post=1246714"}],"version-history":[{"count":3,"href":"https:\/\/trak.in\/stories\/wp-json\/wp\/v2\/posts\/1246714\/revisions"}],"predecessor-version":[{"id":1246746,"href":"https:\/\/trak.in\/stories\/wp-json\/wp\/v2\/posts\/1246714\/revisions\/1246746"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/trak.in\/stories\/wp-json\/wp\/v2\/media\/1246745"}],"wp:attachment":[{"href":"https:\/\/trak.in\/stories\/wp-json\/wp\/v2\/media?p
arent=1246714"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/trak.in\/stories\/wp-json\/wp\/v2\/categories?post=1246714"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/trak.in\/stories\/wp-json\/wp\/v2\/tags?post=1246714"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}