CUPERTINO (dpa-AFX) - A recent report from Wired finds that numerous major news organizations and social media platforms have opted out of having their websites scraped to collect training data for Apple's AI.
For several years, Apple has utilized Applebot to enhance Siri and provide Spotlight suggestions, and it has recently expanded this use to train Apple Intelligence.
Less than three months ago, the company introduced a second web crawler, Applebot-Extended. It lets web publishers opt out of having their site content used to train Apple's generative AI models, which power products including Apple Intelligence, Services, and Developer Tools.
Opting out is straightforward, as it is managed using a publicly available robots.txt file, making it easy to identify sites that have done so. Wired examined several major news and social media platforms and reported that notable companies, such as The New York Times, Facebook, Instagram, Craigslist, Tumblr, Financial Times, The Atlantic, USA Today, and Conde Nast, have opted out of Apple's training program.
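Because robots.txt files are public, anyone can run this kind of check. The following is a minimal sketch in Python, not Wired's actual method. It assumes an opt-out is written as a "User-agent: Applebot-Extended" entry with "Disallow: /". The sample file contents, the example.com URL, and the blocks_agent helper are all hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for a site that opts out of AI training
# but leaves ordinary crawling open (an assumed, illustrative layout).
SAMPLE_ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def blocks_agent(robots_txt: str, agent: str,
                 url: str = "https://example.com/") -> bool:
    """Return True if the given robots.txt text disallows `agent` from `url`."""
    parser = RobotFileParser()
    # parse() accepts the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# Applebot-Extended is disallowed, while the regular Applebot falls through
# to the wildcard entry and is still allowed.
opted_out = blocks_agent(SAMPLE_ROBOTS_TXT, "Applebot-Extended")
regular_blocked = blocks_agent(SAMPLE_ROBOTS_TXT, "Applebot")
print(opted_out, regular_blocked)
```

To check a live site instead, the same standard-library parser can be pointed at the site's robots.txt address with set_url() and read(). The result only shows what the file declares, not how any crawler actually behaves.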
Wired's findings indicate that only about 6% to 7% of high-traffic websites are currently blocking Applebot-Extended, suggesting that most sites either accept Apple's training methods or are unaware of the option to decline.
Additionally, a recent analysis by data journalist Ben Welsh revealed that over a quarter of the news sites he surveyed (294 of 1,167 primarily English-language, U.S.-based publications) are blocking Applebot-Extended.
Copyright(c) 2024 RTTNews.com. All Rights Reserved