241 Websites, Including 23 News Outlets, Deny Access to Internet Archive's Web Crawler
Twenty-three news organizations, including USA Today and the New York Times, are among 241 sites denying the Internet Archive’s web crawler access to their articles, Fortune reported via Wired. The blocks prevent the Wayback Machine from capturing and preserving pages from these outlets, which affects the availability of historical web material. The developments come amid discussions on AI data usage.
Tech Companies Use Wayback Machine for AI Training
Tech companies can use the Wayback Machine as a workaround to train language models on archived web content, a practice that has prompted responses from content providers. The digital archive has controls to limit abuse by AI automation and prevent large-scale data extraction. Mark Graham, director of the Wayback Machine, said these measures are in place to manage usage, according to Fortune.
Prior Actions Against Data Scraping
Last year, Reddit barred the Wayback Machine from scraping its data, restricting the archive's ability to collect content from the platform. The move followed concerns over data usage. Mark Graham is in talks to regain access to the material, Fortune noted. The negotiations aim to restore archiving capabilities for affected sites.
Support from Media Workers
More than 100 media workers signed a letter supporting the Wayback Machine. The letter endorses the Internet Archive's role in preserving digital content, and the backing highlights divided views within the media sector. The blocks by 241 sites, including major outlets, underscore tensions over web archiving and AI data access, according to Fortune's report on the denials.
Story Timeline
4 events
- 2026 (current): 23 news organizations, including USA Today and the New York Times, among 241 sites deny Internet Archive’s web crawler access (1 source: Fortune via Wired)
- 2026 (current): Mark Graham in talks to regain access to the material (1 source: Fortune, unattributed)
- 2026 (current): More than 100 media workers sign letter supporting Wayback (1 source: Fortune, unattributed)
- 2025: Reddit bars the Wayback Machine from data scraping (1 source: Fortune, unattributed)
Potential Impact
- 01: Reduced availability of archived news articles for public access
- 02: Limited workaround options for tech companies training AI models
- 03: Precedent from Reddit's bar may encourage more platforms to restrict scraping
- 04: Ongoing negotiations may restore some access for the archive
- 05: Increased support from media workers could influence publisher decisions