Japan Declares AI Training Data Fair Game and ‘Will Not Enforce Copyright’

Mount Fuji in Japan.

In the first declaration of its kind, Japan has seemingly asserted that it will not enforce copyrights when it comes to training generative artificial intelligence (AI) programs.

Japan’s minister of education, culture, sports, science, and technology recently said that it is possible to take content from any source and use it for “information analysis.”

According to a Japanese political website, Liberal Democratic Party minister Keiko Nagaoka clearly stated at a committee meeting that AI companies can use whatever data they want to train generative AI programs.

“First of all, when I checked the legal system (copyright law) in Japan regarding information analysis by AI, I found that in Japan, whether it is for non-commercial purposes, commercial purposes, or acts other than duplication, it is obtained from illegal sites, etc. Minister Nagaoka clearly stated that it is possible to use the work for information analysis regardless of the method, regardless of the content.”

Japan’s minister of education, culture, sports, science, and technology Keiko Nagaoka

The above quote is a translation. The comments were picked up by Technomancer.AI, which declared that Japan is going “all in” on AI.

Rights Holders Versus AI

All generative AI programs, whether large language models (LLMs) like ChatGPT or AI image generators such as Midjourney, were built by scraping enormous amounts of data from the internet to train their models.

For photographers, especially those with plenty of pictures online, it means their photos have been used to train products belonging to multimillion-dollar companies without anyone asking for their consent.

This has raised copyright issues that have never existed before, and governments, photographers, and companies are unsure of what to do about them.

If Japan means to press ahead with its no-enforcement policy, it could deal a major blow to copyright holders who have had their work used with zero compensation. Conversely, it represents a major win for the likes of OpenAI.

Getty Images is currently suing Stability AI, which makes the AI image generator Stable Diffusion, for allegedly using 12 million of the photo agency’s images in its training data.

Midjourney founder David Holz admitted that his company did not receive consent for the hundreds of millions of images used to train its AI image generator, outraging photographers and artists.

This is a developing story; PetaPixel will update it when more information becomes available.

Image credits: Header photo licensed via Depositphotos.