Original papers should not become a free tool for training artificial intelligence

2024-08-18


Recently, CNKI warned the AI search startup Mita Technology that presenting the titles, catalogs, and abstracts of academic literature in its AI search results without permission constituted serious infringement. Mita Technology responded with doubt and puzzlement. Separately, Elizabeth Gibney of the internationally renowned journal Nature recently pointed out in an article that more and more academic publishers are licensing research papers to technology companies for training artificial intelligence models. One academic publisher earned $23 million this way, while the authors earned nothing.
Artificial intelligence is fast becoming a household name as a new and advanced technology. Large language models (LLMs) are usually trained on large amounts of data crawled from the Internet. Because academic papers are rich in content and dense in information, they are more valuable than the same volume of ordinary data and are an important source of AI training data.
At first glance, this seems quite normal. After all, citing the research of predecessors makes a paper more persuasive and credible, and references are an indispensable part of any qualified academic paper. However, the practice actually raises serious intellectual property issues. Under the Copyright Law, the copyright owner enjoys the right of publication, the right of authorship, the right of distribution, the right to protect the integrity of the work, the right of dissemination over information networks, the right of adaptation, the right of compilation, and so on. A publisher authorized by the author also enjoys the corresponding rights.
Of course, copyright is also subject to limitations. In certain circumstances, a work may be used without paying remuneration to the author, provided that the author's name or title and the title of the work are indicated. Examples include: using another person's published work for personal study, research, or appreciation; appropriately quoting a published work in one's own work in order to introduce or comment on a work or to explain an issue; unavoidably reproducing or quoting a published work in newspapers, periodicals, radio stations, television stations, and other media in order to report news; translating, adapting, compiling, broadcasting, or making a small number of copies of a published work for use by teaching or research staff in school classroom teaching or scientific research; copying of works in their collections by libraries, archives, and similar institutions for display or preservation; and free performance of a published work, where the public is not charged, the performers are not paid, and the performance is not for profit.
From the above, we can see that fair use of other people's works should serve the public interest and be non-profit. Using other people's works for free in pursuit of profit obviously constitutes infringement. Take the training of artificial intelligence as an example: enterprises train artificial intelligence to increase commercial value and gain an advantage in fierce market competition, which is a profit-making purpose. Of course, if a search service for Internet users indicates the author and links to the original text, thereby increasing the original's visibility, downloads, and citations, that is fair use and does not constitute infringement.
It should be noted that although some journals and publishers have "bought out" the copyright and stated that they have the right to publish and disseminate the work, this does not mean the publisher can completely replace the author. Overall, training artificial intelligence is conducive to scientific and technological progress, but the companies involved cannot use works for free and without restriction. They must still operate within the framework of the Copyright Law and cannot infringe copyright under the banner of scientific and technological innovation.
Text | Shi Hongju