Why regulators in Canada and Italy are digging into ChatGPT’s use of personal information

As governments rush to address concerns about the rapidly advancing generative artificial intelligence industry, experts in the field say more oversight is needed over what data is used to train these systems.

Earlier this month, the Italian data protection agency launched an investigation into OpenAI and banned ChatGPT, an AI-powered chatbot. On Tuesday, Canada’s privacy commissioner also announced an OpenAI investigation. Both agencies expressed concerns about data privacy.

“You might say, ‘Oh, maybe that feels a little heavy-handed,’” says Katrina Ingram, founder of Edmonton-based consulting firm Ethically Aligned AI.

“On the other hand, the company decided that it was going to release this technology to the world and let everyone deal with the consequences. So it’s irresponsible as well.”

Concerns about ChatGPT, transparency

Since its release late last year, ChatGPT’s ability to write everything from tweets to computer code has raised questions about its potential use in education and business. Similar AI products have been launched by Microsoft and Google in recent weeks.

These generative systems are trained to provide responses or produce outputs using data available on the internet – and it’s not always clear what kind of information that includes, experts say.

Katrina Ingram is the founder of AI ethics consulting firm Ethically Aligned AI. She believes greater oversight is needed as AI products rapidly advance. (Jani Auto)

“One of the challenges right now is that I don’t think we know enough about what’s going on under the hood. The investigations can help clarify that,” said Teresa Scassa, Canada Research Chair in Information Law and Policy and professor of law at the University of Ottawa.

The lack of transparency has prompted some organizations and governments to slow down – or even pause – the launch of new generative AI projects.

OpenAI complied with Italy’s request, and CEO Sam Altman tweeted, “we think we follow all privacy laws.” European Union countries including France and Ireland said they would review Italy’s findings on the issue, while Germany said it could block the service. Sweden said it has no plans to ban ChatGPT.

OpenAI published a blog post on Wednesday outlining its approach to safety and accuracy. The post also states that “some” of its training data includes personal information. That data is not used to track users or advertise to them, but to make the product more “helpful,” according to the post.

The company said in its post that the measures it has taken “minimize the possibility that our models may generate responses that include private individuals’ personal information.”

Late last month, OpenAI said it had fixed a “significant issue” that exposed the chat histories of some users to a small subset of other users.

WATCH | Experts discuss how AI could disrupt the workforce:

Is ChatGPT coming for your job?

With AI becoming more powerful, disruptive-tech expert Joel Blit and PR executive Dara Kaplan break down how programs like ChatGPT could impact white-collar jobs and disrupt the workforce as we know it.

What data is captured?

Experts say there is a lack of transparency about what data companies use to train the large language models that underpin systems like OpenAI’s ChatGPT.

According to Ingram, these systems are trained on data that users never provided to the company. OpenAI says it uses a “broad corpus” of data, including licensed content, “content created by human creators” and content that is publicly available on the internet.

“We didn’t consent to that,” Ingram said. “But as a byproduct of living in the digital age, we’re caught up in it.”

Information provided directly to OpenAI through ChatGPT can also be used to train the AI, but that is disclosed in the product’s terms of service, she said.

CBC News asked OpenAI what it includes in the data it uses to train its products. In response, the company provided a link to the blog post published Wednesday.

‘A new version of an old controversy’

Philip Dawson is head of policy for Armilla AI and a consultant on AI governance. (Philip Dawson/LinkedIn)

Philip Dawson, head of policy for Armilla AI – a technology company that provides risk-mitigation products for companies that use AI – said emerging concerns about data privacy in AI are a continuation of long-standing concerns about online tracking by social networks and web companies.

“It’s a new version of an old controversy. And it really calls into question some of the building blocks of large language models, which are really all about the vast amounts of data the models are trained on and the computing power that enables that training,” he said.

Dawson noted that companies are beginning to provide more information about the data sets they use to train their AI systems — especially as companies using AI seek to avoid potential risks — but there is no requirement to do so.

Chatbots may provide inaccurate information

It is unclear whether sensitive personal data can appear in the output of generative AI systems. There are, however, concerns about ChatGPT providing inaccurate information in response to inquiries.

In one example, an Australian mayor said on Wednesday that he may sue OpenAI if it does not correct false information ChatGPT shared about him.

Brian Hood, the mayor of Hepburn Shire, became concerned about his reputation after members of the public told him the chatbot had falsely named him as a guilty party in a foreign bribery scandal involving a subsidiary of the Reserve Bank of Australia.

Lawyers representing Hood said that, while working for the subsidiary, he was in fact the person who tipped off authorities about the payment of bribes to foreign officials to win currency-printing contracts.

OpenAI warns that ChatGPT “may generate inaccurate information about people.”

ChatGPT is a chatbot that can respond to written prompts. The artificial intelligence behind the product is trained on publicly available data drawn from the internet. (Nicolas Maeterlinck/Getty Images)

Are AI bans needed?

There is already precedent for internet data-harvesting cases that violate privacy laws, Scassa said. In 2021, Canadian privacy regulators found that U.S. technology company Clearview AI had violated Canadian privacy laws by collecting photos of Canadians without their knowledge or consent.

Part of the challenge for tech companies, regulators and consumers is that laws vary from one jurisdiction to the next. While it may be legal for American companies to scrape online data to train large language models in the U.S., the same rules may not apply in Europe.

“We can have whatever laws we want in Canada, but ultimately we’re dealing with technology that’s coming from other countries and may be operating under different norms,” Scassa said.

Teresa Scassa is Canada Research Chair in Information Law and Policy and a professor of law at the University of Ottawa. (Submitted by Teresa Scassa)

Canada is considering stronger rules on the use of personal data

Proposed Canadian legislation, Bill C-27, now at second reading in the House of Commons, aims to strengthen the rules on how technology companies use personal data. The Artificial Intelligence and Data Act, introduced as part of C-27, would also require technology companies to document how AI systems are developed and to report on compliance with established safeguards.

The EU is also developing a regulatory framework for artificial intelligence that outlines high-risk and unacceptable use scenarios with the aim of protecting users.

But many experts say a ban on generative AI — or a moratorium, as suggested last week in an open letter signed by a group of artificial intelligence experts, industry executives and Tesla CEO Elon Musk — is not always the solution.

“I think a ban is, at best, a short-term solution,” Ingram said, noting that a slowdown in new product releases may be warranted.

“We need to speed up the regulatory process and move faster on that front. And we need to talk more with stakeholders, including just ordinary people who encounter AI in various ways in their daily lives.”

Dealing with AI threats is a challenge

On Thursday, in response to the ban from the Italian privacy regulator, OpenAI said it does not intend to slow the development of AI, but stressed the importance of respecting rules that aim to protect the personal data of citizens in Italy and the EU.

Until stronger regulation is in place, Scassa, the law professor, worries that dealing with the potential threats of AI will be a challenge.

“There’s a role for government to play in crafting a response, so that we set legal parameters that will help us govern AI,” she said.

“I do think this is a really pressing problem. Until we have that framework, it’s going to be very difficult to respond to and shape AI.”
