Half of Top News Sites Blocked OpenAI’s Crawlers in 2023

South Carolina Digital News February 26, 2024


At the end of 2023, nearly one-half (48%) of the top news websites, based on reach, across 10 countries blocked OpenAI’s crawlers, while nearly one-quarter (24%) blocked Google’s AI crawler, according to a study by the Reuters Institute.

Reuters Institute analyzed the robots.txt of the 15 online news sources with the widest reach, including titles like The New York Times, BuzzFeed News, The Wall Street Journal, The Washington Post, CNN and NPR, across countries including Germany, India, Spain, the U.K. and the U.S.

In the absence of clear regulatory frameworks governing generative artificial intelligence’s use of copyrighted material, many large publishers have taken matters into their own hands, taking AI firms to court, updating terms of service, blocking crawlers or making deals to protect premium content, data and revenues.

The study grouped outlets into three categories: legacy print publications, television and radio broadcasters and digital-born outlets.

Over one-half (57%) of the websites of legacy print publications, such as The New York Times, blocked OpenAI’s crawlers by the end of 2023, compared with 48% of television and radio broadcasters and 31% of digital-born outlets.

Similarly, 32% of print outlets blocked Google’s crawlers, while 19% of broadcasters and 17% of digital-born outlets did the same.

“The Reuters study highlights a fundamental challenge for generative AI: its dependence on authentic content generated by real people who see it as a threat to their livelihoods,” said Gartner VP distinguished analyst Andrew Frank.

Meanwhile, a recent study by Cornell University found that when new AI models are trained on data derived from prior models rather than human input, they tend to degenerate, a process known as “model collapse,” leading to increased errors and misinformation in the generated output.

“This suggests that large language model developers need to find ways to compensate people who create or report true content, not just for the sake of society, but also for their own commercial interests,” said Frank.

Website crawlers are deployed for many reasons. Crawlers like Google’s Googlebot index publisher websites so they appear in the tech giant’s search results. Meanwhile, OpenAI’s crawler, GPTBot, collects data across the internet to train the large language models behind products such as ChatGPT. That data helps AI tools generate accurate, current output, and it is material that news publishers are uniquely positioned to provide: LLMs overweight premium publishers’ content by a factor of between 5 and 100. AI-powered solutions are emerging as alternatives to traditional search engines.
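For readers who want to see what such a block looks like in practice, the check can be sketched in a few lines of Python using the standard library's robots.txt parser. This is a minimal illustration, not the Reuters Institute's method: the sample robots.txt below is hypothetical rather than taken from any publisher, `blocked_agents` is a helper name invented here, and `Google-Extended` is assumed to be the token the article means by Google's AI crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, written for illustration only.
# It blocks two AI crawler tokens and allows everyone else.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def blocked_agents(robots_txt, agents, path="/"):
    """Return {agent: True if the agent may NOT fetch `path`}.

    `blocked_agents` is an illustrative helper, not part of any library.
    """
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, path) for agent in agents}


result = blocked_agents(
    SAMPLE_ROBOTS_TXT,
    ["GPTBot", "Google-Extended", "Googlebot"],
)
for agent, blocked in result.items():
    print(f"{agent}: {'blocked' in str(blocked) or ('blocked' if blocked else 'allowed')}")
```

In this sample, the search crawler Googlebot falls through to the catch-all rule and stays allowed, which mirrors how a publisher can opt out of AI training without disappearing from search results. A robots.txt rule is a request that well-behaved crawlers honor, not a technical barrier.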


