Dhruv Rathee, Marques Brownlee, PewDiePie YouTube video subtitles used to train AI models

The Hindu Bureau | 07-18 00:20

Dhruv Rathee, Marques Brownlee, and PewDiePie YouTube video subtitles were used to train AI models, according to a tool shared by the Proof News outlet.

Anthropic, Nvidia, Apple, and Salesforce were among the leading tech firms that used a YouTube video subtitle dataset to train their AI models, according to the outlet.

The outlet said it found subtitles from 173,536 YouTube videos that were pulled from over 48,000 channels, but warned that the tool could result in false negatives.

Some of the videos that were used to train AI included uploads by tech reviewer Marques Brownlee, apart from content creators such as PewDiePie and Dhruv Rathee, as well as news publications and talk shows worldwide.

A search using the tool also returned a 2020 video by The Hindu.


Most of the videos were from 2020 or earlier, suggesting the dataset has an approximate cut-off date.

Brownlee criticised companies that scraped video transcripts for AI training content.

“Fun fact, I pay a service (by the minute) for more accurate transcriptions of my own videos, which I then upload to YouTube’s back-end. So companies that scrape transcripts are stealing *paid* work in more than one way. Not great,” posted Brownlee on X on Tuesday.

Anthropic and Salesforce confirmed using training datasets that included the scraped video subtitles, but did not accept any wrongdoing, per the outlet. Nvidia, Apple, Databricks, and Bloomberg did not confirm or deny the allegations.

The question of scraping YouTube videos—or their transcripts—to train AI models is a contentious one.

Earlier in the year, when OpenAI official Mira Murati was asked whether the ChatGPT-maker used YouTube videos for AI training, she struggled with the question and could not answer clearly.


