Google Says It’ll Scrape Everything You Post Online for AI

Google updated its privacy policy over the weekend, explicitly stating that the company reserves the right to scrape almost anything you post online to build its AI tools. If Google can read your words, assume they belong to the company now, and expect that they’re sitting somewhere in the bowels of a chatbot.

“Google uses information to improve our services and develop new products, features and technologies that benefit our users and the public,” the new Google policy says. “For example, we use publicly available information to train Google’s AI models and develop products and features such as Google Translate, Bard, and Cloud AI capabilities.”

Luckily for history buffs, Google maintains a history of changes to its terms of service. The new wording amends an existing policy and spells out new ways your online thoughts could be used to make the tech giant’s AI tools work.

Previously, Google said the data would be used “for language models,” rather than “AI models,” and where the older policy mentioned only Google Translate, Bard and Cloud AI now appear as well.

This is an unusual clause for a privacy policy. Typically, these policies describe how a company uses the information you post on its own services. Here, Google seems to reserve the right to collect and use data published on the public web, as if the entire internet were the company’s own AI playground. Google did not immediately respond to a request for comment.

The practice raises new and interesting questions about privacy. People generally understand that public posts are public. But today you need a new mental model of what it means to write something online. It’s no longer about who can see the information, but how it might be used. There’s a good chance Bard and ChatGPT picked up your long-forgotten blog posts or 15-year-old restaurant reviews. As you read this, the chatbots could be echoing some homunculoid version of your words in ways that are impossible to predict and difficult to understand.

One of the less obvious complications of the post-ChatGPT world is the question of where data-hungry chatbots get their information. Companies like Google and OpenAI have scraped large parts of the internet to feed their robot habits. It’s not at all clear that this is legal, and for years to come the courts will be grappling with copyright questions that would have seemed like science fiction just a few years ago. The phenomenon is already affecting consumers in unexpected ways.

The overlords of Twitter and Reddit are particularly aggrieved by the AI issue and have made controversial changes to lock down their platforms. Both companies turned off free access to their APIs, which had allowed anyone to download large amounts of posts. Supposedly, this is meant to protect the social media sites from other companies harvesting their intellectual property, but it has had other consequences as well.

Twitter and Reddit’s API changes broke third-party tools that many people used to access these sites. For a moment, it even seemed like Twitter would force public entities such as weather, public transit, and emergency services to pay if they wanted to tweet, a move the company walked back after heavy criticism.

Recently, web scraping has been Elon Musk’s favorite boogeyman. Musk has blamed a number of recent Twitter disasters on the company’s need to stop others from pulling data from its site, even when the problems appear unrelated. Over the weekend, Twitter limited the number of tweets users were allowed to view per day, making the service nearly unusable. Musk said it was a necessary response to “data scraping” and “system manipulation.” However, most IT experts agreed that the rate limiting was more likely a crisis response to technical problems stemming from mismanagement, incompetence, or both. Twitter didn’t answer Gizmodo’s questions on the matter.

On Reddit, the impact of the API changes has been particularly loud. Reddit is essentially run by unpaid moderators who keep the forums healthy. Mods on large subreddits typically rely on third-party tools, built on the now-inaccessible APIs, to do their work. This sparked a mass protest in which moderators essentially shut down Reddit. Although the controversy is still ongoing, it is likely to have lasting consequences as spurned moderators hang up their hats.

Zack Zwiezen

Zack Zwiezen is a USTimesPost U.S. News Reporter based in London. His focus is on U.S. politics and the environment. He has covered climate change extensively, as well as healthcare and crime. Zack Zwiezen joined USTimesPost in 2023 from the Daily Express and previously worked for Chemist and Druggist and the Jewish Chronicle. He is a graduate of Cambridge University. Languages: English. You can get in touch with me by emailing
