“Google uses information to improve our services and develop new products, features and technologies that benefit our users and the public,” says Google’s new policy. “For example, we use publicly available information to train Google’s AI models and develop products and features such as Google Translate, Bard, and Cloud AI capabilities.”
Previously, Google’s policy said the data would be used for “language models” rather than “AI models,” and where the older policy mentioned only Google Translate, Bard and Cloud AI now appear as well.
The practice raises new and interesting privacy questions. People generally understand that public posts are public. But today you need a new mental model of what it means to write something online. It’s no longer just about who can see the information, but how it might be used. There’s a good chance Bard and ChatGPT ingested your long-forgotten blog posts or 15-year-old restaurant reviews. As you read this, the chatbots could be echoing some homunculoid version of your words in ways that are impossible to predict and difficult to understand.
One of the less obvious complications of the post-ChatGPT world is where data-hungry chatbots get their information. Companies like Google and OpenAI have scraped vast swaths of the internet to feed their robotic habits. It’s not at all clear whether this is legal, and for years to come the courts will be grappling with copyright questions that would have seemed like science fiction just a few years ago. The phenomenon is already affecting consumers in unexpected ways.
The overlords of Twitter and Reddit are particularly aggrieved by the AI issue and have made controversial changes to lock down their platforms. Both companies cut off free access to their APIs, which previously let anyone download large quantities of posts. Supposedly, this is to protect the social media sites from other companies stealing their intellectual property, but it has other ramifications as well.
Twitter and Reddit’s API changes broke the third-party tools many people used to access those sites. For a moment, it even seemed Twitter would make public institutions such as weather, transit, and emergency services pay if they wanted to tweet, a move the company walked back after heavy criticism.
Recently, web scraping has been Elon Musk’s favorite bogeyman. Musk blamed a number of recent Twitter disasters on the company’s need to stop others from harvesting data from its site, even when the problems appeared unrelated. Over the weekend, Twitter limited the number of tweets users were allowed to view per day, rendering the service nearly unusable. Musk said it was a necessary response to “data scraping” and “system manipulation.” Most IT professionals, however, agreed that the rate limiting was more likely a crisis response to technical problems born of mismanagement, incompetence, or both. Twitter didn’t answer Gizmodo’s questions on the matter.
On Reddit, the fallout from the API changes has been particularly loud. Reddit essentially runs on unpaid moderators who keep its forums healthy. Mods on large subreddits typically relied on third-party tools, tools built on the now-defunct APIs, to do their work. This sparked a mass protest in which moderators essentially shut down Reddit. Although the controversy is still unfolding, it will likely have lasting consequences as scorned moderators hang up their hats.