The research paper introduces non-fine-tunable learning, a learning paradigm that aims to prevent pre-trained models from being fine-tuned for unethical or illegal tasks while preserving their original performance. The proposed protection framework, SOPHON, reinforces a pre-trained model so that it resists fine-tuning in restricted domains, addressing the misuse of pre-trained models for inappropriate tasks.

The paper formalizes two objectives: intactness, which requires the protected model to retain its original performance, and non-fine-tunability, which requires that fine-tuning the model in a restricted domain incur overhead comparable to or greater than that of training from scratch. The authors address three challenges: designing the optimization framework, ensuring robustness under unpredictable fine-tuning strategies, and accelerating the convergence of fine-tuning suppression in restricted domains.

Extensive experiments on deep learning models and restricted tasks confirm the effectiveness and robustness of SOPHON: fine-tuning protected models in restricted domains costs as much as, or more than, training from scratch. The paper's contributions are the proposal of non-fine-tunable learning, the development of a framework that realizes it, and extensive experiments verifying its effectiveness and robustness. The document also provides background on deep learning tasks, transfer learning, and the key components of fine-tuning.
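The tension between the two objectives can be illustrated with a toy sketch of the underlying optimization structure: keep the loss on the original task low while driving up the loss that a fine-tuner would reach after a few simulated fine-tuning steps in the restricted domain. This is a minimal illustration, not the paper's actual algorithm; the scalar "model", the quadratic losses, and all constants (`lam`, `eta`, `alpha`, `k`) are hypothetical choices for demonstration.

```python
# Toy sketch (hypothetical, not SOPHON's actual algorithm): alternating-style
# optimization that balances "intactness" on the original task against a
# fine-tuning suppression term evaluated on a simulated fine-tuning trajectory.

def loss_original(theta):
    # Original-domain loss: minimized at theta = 2 (the pre-trained optimum).
    return (theta - 2.0) ** 2

def loss_restricted(theta):
    # Restricted-domain loss: a fine-tuner would drive theta toward 5.
    return (theta - 5.0) ** 2

def simulate_fine_tune(theta, alpha=0.1, k=5):
    # Unroll k gradient steps a fine-tuner would take in the restricted domain.
    for _ in range(k):
        theta = theta - alpha * 2.0 * (theta - 5.0)  # grad of loss_restricted
    return theta

def protect(theta, lam=0.5, eta=0.05, steps=200, alpha=0.1, k=5):
    # Minimize: loss_original(theta) - lam * loss_restricted(fine_tuned_theta),
    # i.e. preserve original performance while keeping the restricted-domain
    # loss high even after simulated fine-tuning.
    factor = (1.0 - 2.0 * alpha) ** k  # d(fine_tuned_theta)/d(theta), closed form
    for _ in range(steps):
        ft = simulate_fine_tune(theta, alpha, k)
        grad = 2.0 * (theta - 2.0) - lam * 2.0 * (ft - 5.0) * factor
        theta = theta - eta * grad
    return theta

theta_pretrained = 2.0  # optimum of the original task
theta_protected = protect(theta_pretrained)
```

After protection, `theta_protected` stays close to the original optimum (intactness), while a simulated fine-tuner starting from it ends with a higher restricted-domain loss than one starting from the unprotected model. A convex toy like this cannot make fine-tuning truly as costly as training from scratch; it only shows the shape of the unrolled, two-term objective.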
