OpenAI trained ChatGPT on millions of hours of YouTube videos without permission

AI companies encountered obstacles in collecting high-quality training data.

Now the New York Times has detailed how companies have addressed this problem. To no one's surprise, the answer involves operating in the gray areas of copyright law as it applies to AI.

OpenAI, desperate for training data, developed an audio transcription model to transcribe more than a million hours of YouTube videos, which it then used to train GPT-4. According to the New York Times, the company knew this was legally questionable but considered it fair use.

OpenAI President Greg Brockman was personally involved in collecting the videos used.

OpenAI spokeswoman Lindsay Held said the company uses "numerous sources, including public domain data and partnerships for non-public data" and is investigating generating its own synthetic data.

The company had apparently exhausted its supplies of useful data by 2021 and discussed transcribing YouTube videos, podcasts, and audiobooks after other resources ran out. Until then, it had trained its models on data including computer code from GitHub, databases of chess moves, and schoolwork content from Quizlet.

Google also collected YouTube transcripts, according to Times sources. Google spokesperson Bryant said the company trained its models "on some YouTube content, pursuant to our agreements with YouTube creators."

Meta also hit the limits of available high-quality training data, and in recordings heard by the Times, its AI team discussed the unauthorized use of copyrighted works as it tried to catch up with OpenAI.
