Abstract:
The emergence of ChatGPT marked the beginning of the era of artificial general intelligence. The iterative generalization ability of artificial general intelligence relies on massive data, and the model is deeply coupled with data throughout its operation. This coupling brings new data risks, which may evolve into real threats with serious consequences as the model is widely applied. This paper uses the concept of "full cycle management" to analyze the data risks in the input, processing, and output stages of artificial general intelligence, namely the risk of illegal acquisition of multi-source data, security risks in data utilization and storage, and quality risks from the generation of false data, respectively. On this basis, a data risk governance framework is constructed around three paths: legal regulation, ethical guidance, and administrative supervision. It is proposed that China should improve the legal system for data protection and utilization, establish technical ethical norms guided by science and technology, and design administrative regulatory rules that focus on risk prevention and control, so as to resolve the data risks generated in the model's "input, processing, output" process and thereby promote the healthy development of artificial general intelligence.