Awesome-LLM4Patents

📢 News

[2024-09-01] Project started.

Papers
Repositories

Papers

🔥 for papers with >50 citations or repositories with >200 stars. 📖 for papers accepted by reputable conferences/journals. ❌ for papers that I haven't read yet. 🙋 for our team's papers.

Survey

❌ [March 2024] Natural Language Processing in Patents: A Survey. Lekang Jiang, Stephan Goetz. Arxiv 2024. [paper]
[April 2024] A Comprehensive Survey on AI-based Methods for Patents. Homaira Huda Shomee, Zhu Wang, Sathya N. Ravi, Sourav Medya. Arxiv 2024. [paper]

Patent Model Optimization

❌ [Sep. 2024] PatentGPT: A Large Language Model for Patent Drafting Using Knowledge-based Fine-tuning Method. Runtao Ren, Jian Ma. Arxiv 2024. [paper]
[April 2024] PatentGPT: A Large Language Model for Intellectual Property. Zilong Bai, Ruiji Zhang, Linqing Chen, Qijun Cai et al. Arxiv 2024. [paper]
❌ 📖 [May 2024] InstructPatentGPT: Training patent language models to follow instructions with human feedback. Jieh-Sheng Lee. Artif Intell Law 2024. [paper]

Patent Retrieval

❌ [March 2024] A comparative analysis of embedding models for patent similarity. Grazia Sveva Ascione, Valerio Sterzi. Arxiv 2024. [paper]

Patent Information Extraction and KG

❌ [March 2024] LLM-based Extraction of Contradictions from Patents. Stefan Trapp, Joachim Warschat. Arxiv 2024. [paper]

Patent Generation

Patent Writing

[June 2024] Can Large Language Models Generate High-quality Patent Claims? Lekang Jiang, Caiqi Zhang, Pascal A Scherz, Stephan Goetz. Arxiv 2024. [paper][github]
❌ 📖 [Oct. 2023] Creating a Silver Standard for Patent Simplification. Silvia Casola, Alberto Lavelli, Horacio Saggion. SIGIR 2023. [paper][github]
❌ 📖 [July 2022] PGT: a prompt based generative transformer for the patent domain. Dimitrios Christofidellis, Antonio Berrios Torres, Ashish Dave et al. IBM 2022. [paper]

Long-Context Generation for Patent

[Aug. 2024] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs. Yushi Bai, Jiajie Zhang, Xin Lv et al. Arxiv 2024. [paper][github][huggingface]
❌ 📖 [April 2024] Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models. Yijia Shao, Yucheng Jiang, Theodore A. Kanell. NAACL 2024. [paper]
❌ [Feb. 2024] LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration. Jun Zhao, Can Zu, Hao Xu. Arxiv 2024. [paper]

Patent Agents

❌ [Sep. 2024] Towards Automated Patent Workflows: AI-Orchestrated Multi-Agent Framework for Intellectual Property Management and Analysis. Sakhinana Sagar Srinivas, Vijay Sri Vaikunth, Venkataramana Runkana. Arxiv 2024. [paper]
❌ [Feb. 2024] From PARIS to LE-PARIS: Toward Patent Response Automation with Recommender Systems and Collaborative Large Language Models. Jung-Mei Chu, Hao-Cheng Lo, Jieh Hsiang, Chun-Chieh Cho. Arxiv 2024. [paper]

Patents with Multimodal Data

❌ [April 2024] Large Language Model Informed Patent Image Retrieval. Hao-Cheng Lo, Jung-Mei Chu, Jieh Hsiang, Chun-Chieh Cho. Arxiv 2024. [paper]
❌ 📖 [Sep. 2023] PatFig: Generating Short and Long Captions for Patent Figures. Dana Aubakirova, Kim Gerdes, Lufei Liu. ICCV 2023. [paper][huggingface]

Evaluation and Dataset

❌ [Sep. 2024] Intelligent Innovation Dataset on Scientific Research Outcomes and Patents. Xinran Wu, Hui Zou, Yidan Xing, Jingjing Qu, Qiongxiu Li, Renxia Xue, Xiaoming Fu. Arxiv 2024. [paper][dataset]
❌ [July 2024] A Comparative Study of Quality Evaluation Methods for Text Summarization. Huyen Nguyen, Haihua Chen, Lavanya Pobbathi, Junhua Ding. Arxiv 2024. [paper]
❌ 📖 [June 2024] PatentEval: Understanding Errors in Patent Generation. You Zuo (ALMAnaCH), Kim Gerdes (LISN), Eric Villemonte de La Clergerie (ALMAnaCH), Benoît Sagot (ALMAnaCH). NAACL 2024. [paper]
🙋 [June 2024] IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models. Qiyao Wang, Jianguo Huang, Shule Lu et al. [paper][website][github][huggingface]
📖 [Feb. 2024] MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property. Shiwen Ni, Minghuan Tan, Yuelin Bai, ..., Min Yang et al. LREC-COLING 2024. [paper][github]
📖 [July 2022] The Harvard USPTO Patent Dataset: A Large-Scale, Well-Structured, and Multi-Purpose Corpus of Patent Applications. Mirac Suzgun, Luke Melas-Kyriazi, Suproteem K. Sarkar et al. NeurIPS 2022. [paper][website][github][huggingface]