The LLM Architecture Patent Race: From Transformer to RLHF and Inference Optimization

Large language model (LLM) core technologies—Transformer architectures, reinforcement learning from human feedback (RLHF), and inference optimization—have become one of the most intensely contested patent battlegrounds in the technology sector. Companies are attempting to fence off specific technical implementations through patent claims, while simultaneously confronting the institutional and ethical limits of patenting foundational infrastructure. This article analyzes the principal technical domains in LLM architecture patents and each company’s rights-acquisition strategy.

The Transformer Patent Story: Origins and Aftermath

Virtually every modern LLM is built on the Transformer architecture introduced in the paper “Attention Is All You Need” (Vaswani et al., arXiv:1706.03762), posted to arXiv in June 2017 and presented at NIPS (now NeurIPS) later that year.

Several patents directly related to the Transformer have been filed in Google’s (now Alphabet’s) name. Among the most relevant is U.S. Patent Application No. 16/166,620 (filed October 2018), which postdates the 2017 paper and claims priority to earlier related applications. Multiple continuation applications followed through the early 2020s.

Despite holding these patents, Google was cautious—at least initially—about pursuing licensing revenues. One reason was the reputational risk: asserting Transformer patents would expose the company to criticism for trying to monopolize foundational AI infrastructure. Maintaining productive relationships with the academic research community was also a core part of Google Brain’s institutional culture.

The Transformer architecture itself has undergone numerous modifications and improvements (GPT-family, BERT, T5, and many others), making it unclear whether patents on specific implementations can cover a broad technical domain. The premise that “owning Transformer patents means controlling the industry” does not withstand scrutiny.

Mapping the Technical Domains: Four Categories

LLM-related patent filings fall broadly into four technical categories.

(1) Architecture Design

Architecture design patents target the structure of the model itself. Major filing subjects include variations on attention mechanisms (multi-head attention, Flash Attention, sliding window attention), model parallelization methods, and Mixture of Experts (MoE) architectures.

Flash Attention, introduced by researchers Dao et al. at Stanford in 2022 (arXiv:2205.14135), dramatically improves attention computation efficiency and has been adopted in many LLM implementations. Related patents have been filed under Stanford University’s name and through the researchers’ subsequent startup, Together AI. Major companies have also filed patents on their own variants of efficient attention mechanisms.
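To make the attention variants named above concrete, here is a minimal NumPy sketch of scaled dot-product attention with a causal sliding-window mask (the pattern popularized by models such as Mistral 7B). This is an illustration of the masking idea only, not any party's patented implementation; the function name and shapes are my own.

```python
import numpy as np

def sliding_window_attention(q, k, v, window=4):
    """Scaled dot-product attention with a causal sliding-window mask.

    Each position attends only to itself and the `window - 1`
    preceding positions, bounding per-token attention cost.
    """
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)                 # (seq_len, seq_len)
    idx = np.arange(seq_len)
    causal = idx[None, :] <= idx[:, None]         # no attending to the future
    in_window = idx[:, None] - idx[None, :] < window
    scores = np.where(causal & in_window, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 16))
out, w = sliding_window_attention(q, q, q, window=4)
```

Flash Attention attacks a different axis of the same computation: it tiles the score matrix so it never materializes in full, which is why the claims in that patent family center on memory-access patterns rather than the mask.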

MoE architectures are an area of particular focus for Google, with multiple patent filings related to the Switch Transformer (Fedus et al., 2022). Mistral AI’s release of a MoE model (Mixtral, December 2023) further confirmed the strategic importance of this technical space.
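The routing idea at the core of these MoE filings can be sketched in a few lines. The following is a toy illustration with made-up shapes and simple ReLU "experts," not the Switch Transformer or Mixtral implementation: a gating network scores experts per token, and only the top-k are evaluated.

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Route each token through its top-k experts and mix the outputs.

    Switch Transformer routes each token to one expert (top_k=1);
    Mixtral mixes two (top_k=2). `experts` is a list of (W, b) pairs
    standing in for the expert feed-forward networks.
    """
    logits = x @ gate_w                            # (n_tokens, n_experts)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        chosen = np.argsort(logits[t])[-top_k:]    # indices of top-k experts
        g = np.exp(logits[t, chosen] - logits[t, chosen].max())
        g /= g.sum()                               # softmax over chosen experts only
        for gate, e in zip(g, chosen):
            W, b = experts[e]
            out[t] += gate * np.maximum(x[t] @ W + b, 0.0)  # toy ReLU expert
    return out

rng = np.random.default_rng(0)
d, n_experts, n_tokens = 16, 4, 5
x = rng.standard_normal((n_tokens, d))
gate_w = rng.standard_normal((d, n_experts))
experts = [(rng.standard_normal((d, d)), rng.standard_normal(d))
           for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts, top_k=2)
```

The commercial appeal, and hence the patent interest, is that total parameter count grows with the number of experts while per-token compute stays roughly constant.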

(2) Training Methods

Training method patents cover how models are built. Key filing areas include pre-training efficiency improvements, fine-tuning methods (LoRA, Adapter Layers), knowledge distillation, and—discussed separately below—RLHF-related techniques.

LoRA (Low-Rank Adaptation), introduced by Microsoft Research (Hu et al., 2021), enables high-performance fine-tuning with dramatically reduced parameter counts. Related patents are held in Microsoft’s name and form part of the legal foundation for commercial use through Azure AI Studio.
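The parameter savings LoRA claims are easy to see in a sketch. The frozen weight W0 is augmented with a low-rank product B @ A, and only A and B are trained; dimensions and scaling below follow the paper's convention, but the code itself is illustrative, not Microsoft's implementation.

```python
import numpy as np

def lora_forward(x, W0, A, B, alpha=16):
    """Forward pass with a frozen weight W0 plus a low-rank update.

    Only A (r x d_in) and B (d_out x r) are trained, cutting trainable
    parameters from d_out * d_in down to r * (d_in + d_out).
    """
    r = A.shape[0]
    return x @ W0.T + (alpha / r) * (x @ A.T) @ B.T

d_in, d_out, r = 512, 512, 8
rng = np.random.default_rng(1)
W0 = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))       # B starts at zero, so training begins at W0
x = rng.standard_normal((4, d_in))
y = lora_forward(x, W0, A, B)
```

For this 512x512 layer, the full matrix has 262,144 parameters while A and B together have 8,192 — about 3% — which is what makes per-customer fine-tuning economically viable on shared infrastructure.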

(3) Inference Optimization

Inference optimization—reducing the cost of running trained models in production—is a commercially critical domain. Key targets include quantization, pruning, speculative decoding, and KV cache optimization.
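Of the techniques listed, quantization is the simplest to illustrate. The sketch below shows symmetric per-tensor int8 quantization — one common baseline scheme, chosen here for clarity; production systems typically use finer-grained (per-channel or per-group) variants.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w ~ scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(2)
w = rng.standard_normal((256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
err = np.abs(w - w_hat).max()    # rounding error is bounded by scale / 2
```

The weights shrink to a quarter of their float32 size at a bounded accuracy cost — the cost-per-token arithmetic that makes this domain "commercially critical."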

Speculative decoding, proposed by Google researchers (Leviathan et al., arXiv:2211.17192), with a closely related method from DeepMind (Chen et al., arXiv:2302.01318), achieves major speed improvements by having a small draft model generate candidate tokens that a large model verifies in parallel. The technique has been adopted in Apple’s on-device AI processing, and multiple parties have filed related patents.
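The draft-then-verify loop can be sketched as follows. This is a toy version over abstract next-token distributions, assuming placeholder functions `draft_p` and `target_p` in place of real small and large models; the acceptance rule min(1, p_target / p_draft) is the one from Leviathan et al., which provably leaves the target model's output distribution unchanged.

```python
import numpy as np

def speculative_step(prefix, draft_p, target_p, k=4, rng=None):
    """One round of speculative decoding (after Leviathan et al., 2022).

    `draft_p(seq)` and `target_p(seq)` return next-token distributions
    over a toy vocabulary. The draft proposes k tokens; each is
    accepted with probability min(1, p_target / p_draft).
    """
    rng = rng or np.random.default_rng()
    seq = list(prefix)
    proposals = []
    for _ in range(k):                        # cheap draft pass, k tokens
        p = draft_p(seq)
        tok = int(rng.choice(len(p), p=p))
        proposals.append((tok, p))
        seq.append(tok)
    out = list(prefix)
    for tok, p_d in proposals:                # target verifies the batch
        p_t = target_p(out)
        if rng.random() < min(1.0, p_t[tok] / p_d[tok]):
            out.append(tok)                   # accepted: keep the draft token
        else:
            resid = np.maximum(p_t - p_d, 0.0)
            out.append(int(rng.choice(len(resid), p=resid / resid.sum())))
            break                             # rejected: resample and stop
    return out

# Toy distributions: a sharper target and a flat draft over 4 tokens.
draft = lambda seq: np.array([0.25, 0.25, 0.25, 0.25])
target = lambda seq: np.array([0.7, 0.1, 0.1, 0.1])
result = speculative_step([0], draft, target, k=4, rng=np.random.default_rng(0))
```

The speedup comes from the verification pass: the large model scores all k candidates in one forward call instead of k sequential ones.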

(4) Agent Technologies and Tool Use

Patents related to AI agents—autonomous systems that use tools to complete tasks—have grown sharply since 2024. Tool calling (function calling), multi-agent coordination, long-term memory management, and context window extension are among the primary filing subjects. OpenAI’s Function Calling (introduced June 2023) and Anthropic’s Tool Use capability (2024) have generated related patent filings as both companies seek first-mover rights in the agent technology space.

RLHF Patents: OpenAI vs. Anthropic

Reinforcement learning from human feedback (RLHF), the fine-tuning methodology that was central to ChatGPT’s success, is at the core of a significant technical and IP divergence between OpenAI and Anthropic.

OpenAI’s RLHF Patent Strategy

The academic origins of RLHF trace to 2017 (Christiano et al., arXiv:1706.03741), but it was OpenAI that scaled the technique to LLM dimensions. OpenAI has filed patents covering the concrete implementation of RLHF as applied in InstructGPT (2022)—specifically reward model architecture, modifications to PPO (Proximal Policy Optimization), and human feedback collection methodologies. These patents target specific architectural implementations of RLHF rather than the general theoretical concept.
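The reward-model component of this pipeline reduces to a simple pairwise objective. The sketch below shows the Bradley-Terry preference loss used in InstructGPT-style reward modeling — given scalar rewards for a human-preferred and a rejected response, minimize -log sigmoid of their margin; the function is an illustration, not OpenAI's code.

```python
import numpy as np

def reward_model_loss(r_chosen, r_rejected):
    """Pairwise preference loss for training an RLHF reward model.

    The human-preferred response should score higher than the
    rejected one; -log sigmoid(margin) = log(1 + exp(-margin)).
    """
    margin = np.asarray(r_chosen, dtype=float) - np.asarray(r_rejected, dtype=float)
    return float(np.mean(np.log1p(np.exp(-margin))))  # numerically stable form

# The loss falls as the model separates preferred from rejected answers:
loss_small_margin = reward_model_loss([0.1], [0.0])
loss_large_margin = reward_model_loss([3.0], [0.0])
```

The trained reward model then supplies the scalar signal that PPO maximizes, which is why OpenAI's filings treat reward-model architecture and feedback collection as separately claimable pieces.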

Anthropic’s Constitutional AI

Anthropic developed Constitutional AI (CAI) as an alternative and complement to RLHF (Bai et al., 2022, arXiv:2212.08073). CAI enables an AI system to self-evaluate and self-revise based on a defined set of “constitutional” principles, reducing reliance on direct human feedback. By taking a different technical path from RLHF, Anthropic achieves differentiation while avoiding direct IP conflict with OpenAI’s RLHF patents.
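The critique-and-revise loop at the heart of CAI can be outlined in a few lines. In this sketch, `model(instruction)` is a placeholder for an LLM call (my assumption for illustration); the actual pipeline in Bai et al. (2022) runs this loop to generate supervised fine-tuning data and then distills the behavior via RL from AI feedback (RLAIF).

```python
def constitutional_revision(prompt, draft, principles, model):
    """One critique-and-revise pass in the spirit of Constitutional AI.

    For each constitutional principle, the model critiques its own
    response and then rewrites it to address the critique, replacing
    direct human feedback with principle-guided self-evaluation.
    """
    response = draft
    for principle in principles:
        critique = model(
            f"Critique the response to '{prompt}' against the "
            f"principle: {principle}\nResponse: {response}")
        response = model(
            f"Revise the response to address the critique.\n"
            f"Critique: {critique}\nResponse: {response}")
    return response
```

Note that the "invention," from a patent-drafting perspective, lies less in this loop than in how the principles, critiques, and revisions are sampled, filtered, and turned into training data.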

Anthropic has filed patents on CAI implementation techniques, with filings centered on the process design for AI self-evaluation, automation of safety evaluation, and mechanisms for calibrating the tradeoff between harmlessness and helpfulness.

A notable feature of Anthropic’s approach is its two-layer strategy: publishing the CAI paper to establish academic priority while using patents to protect implementation details. Publishing the paper could create a novelty problem for patent filings, but there is typically a meaningful gap between the conceptual disclosure in a paper and the specific claims in a patent application, leaving room for patents covering particular implementation details.

The Limits of Patenting Foundational Technology

As the LLM architecture patent race intensifies, the fundamental limits of “foundational technology patenting” are coming into focus.

The Disclosure Dilemma

The foundational principle of the patent system is that inventors disclose technology to society in exchange for a time-limited exclusive right (see 35 U.S.C. §112: enablement and written description requirements). In filing LLM-related patents, companies necessarily disclose technical information to competitors. For frontier models like GPT-4 or Claude 3, including architectural details in patent specifications risks publicizing core competitive information. This dilemma pushes companies toward drafting claims that are “as broad as possible while obscuring the technical core”—a pattern that intersects with the Section 101 issues discussed below.

The Industry’s Implicit Non-Aggression Understanding

As of 2026, major AI companies appear to maintain an implicit understanding that foundational LLM patents will not be aggressively litigated against one another. At least two reasons explain this. First, strictly enforcing Google’s Transformer-related patents would expose most industry players to infringement risk, likely triggering a cycle of countersuits and cross-licensing negotiations. Second, the lesson from the smartphone patent wars (Apple v. Samsung and others) has been broadly absorbed: patent attrition wars consume enormous resources and time without necessarily translating into durable competitive advantage.

This implicit understanding is not permanent, however. Changes in market conditions—financial stress, M&A activity—or patent acquisition by NPEs (non-practicing entities) could destabilize the equilibrium.

The Section 101 Problem

U.S. patent law (35 U.S.C. §101) defines patent-eligible subject matter as “any new and useful process, machine, manufacture, or composition of matter,” explicitly excluding “abstract ideas.” In Alice Corp. v. CLS Bank International (2014), the Supreme Court established the Alice test: implementing an abstract idea on a computer does not confer patent eligibility.

Many LLM architectures are fundamentally algorithmic—applications of mathematical principles. A significant portion of LLM-related patent applications are therefore candidates for Section 101 rejection. USPTO examiners have issued Section 101 rejections for AI-related applications with increasing frequency since 2024, raising the bar for patent acquisition.

Patent practitioners respond to this problem by drafting claims that tie the invention to “specific hardware” (particular processor configurations, specific memory management approaches). A claim for “a specific tensor-partitioning method for model parallelization across multiple GPUs” carries enough technical concreteness to clear the Section 101 bar. The tradeoff is a narrower claim scope.

Between 2024 and 2025, the USPTO revised its examination guidance for AI-related patents. The updated guidance identifies presenting a “specific technical solution to an actual technical problem” as a key element of patent eligibility, making it harder to obtain patents based on abstract algorithmic descriptions alone.

Institutional Differences: Europe and Japan

Section 101 is a U.S.-specific provision, but European Patent Convention (EPC) Article 52 excludes “mental acts, mathematical methods, and computer programs as such” from patentability, raising substantively similar issues. The EPO uses “technical character” as the patentability criterion, requiring AI patent applicants to demonstrate a “technical solution to a specific technical problem.”

Under Japanese patent law, an “invention” is defined as “the highly advanced creation of technical ideas utilizing natural laws” (Article 2, Paragraph 1). Software qualifies as patentable subject matter when it is shown to “operate in conjunction with hardware.” Japan’s Patent Office revised its AI-related examination guidelines in 2019, explicitly clarifying that a “trained model” can itself be protected as a product invention. This makes Japan’s environment for AI patent acquisition comparatively permissive relative to the U.S. and Europe.

Next Frontiers: Agent Patents and Reasoning Model Patents

Looking toward 2026 and beyond, the two most closely watched technical domains in LLM patents are (1) AI agents and autonomous execution systems, and (2) reasoning models.

OpenAI’s o1 series (released September 2024), Anthropic’s extended thinking capability, and Google’s Gemini Thinking represent a new category of “reasoning models” with enhanced complex logical inference capabilities. The core technologies of these models—Chain-of-Thought internalization, search tree construction, and process reward models (PRMs)—have generated rapidly growing patent filings since 2024. Reasoning models are currently one of the most competitive technical frontiers in AI, and the ability to establish priority in this domain may prove decisive in future competitive positioning.

LLM architecture patent competition continues in a context where regulatory institutions are struggling to keep pace with technological change. The risk of technology commoditization before patents are granted, the difficulty of clearing Section 101, and the fragility of the implicit non-aggression understanding—these factors interact in complex ways as the second act of the AI IP war unfolds.


This is Part 2 of the “AI Patent War of 2026” series. Part 3 analyzes the current status of copyright lawsuits over training data, including the New York Times v. OpenAI case.
