CVE-2026-44223

Published: May 12, 2026

Modified: May 15, 2026

PUBLISHED

CVSS v3.1

6.5

MEDIUM

Description

vLLM is an inference and serving engine for large language models (LLMs). From 0.18.0 to before 0.20.0, the extract_hidden_states speculative decoding proposer in vLLM returns a tensor with an incorrect shape after the first decode step, causing a RuntimeError that crashes the EngineCore process. The crash is triggered when any request in the batch uses sampling penalty parameters (repetition_penalty, frequency_penalty, or presence_penalty). A single request with a penalty parameter (e.g., "repetition_penalty": 1.1) is sufficient to crash the server. This vulnerability is fixed in 0.20.0.
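To make the trigger condition concrete, the following minimal Python sketch shows how a proxy in front of an affected server might drop the three penalty fields named above from a JSON request body. The helper name strip_penalty_params and the gateway placement are assumptions for illustration and are not part of vLLM. Whether such filtering is sufficient in a given deployment is untested here; the fix stated in the advisory is upgrading to 0.20.0.

```python
from typing import Any

# Sampling penalty fields named in the advisory description.
PENALTY_KEYS = ("repetition_penalty", "frequency_penalty", "presence_penalty")


def strip_penalty_params(payload: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Return a copy of the request body without penalty fields, plus the names removed.

    Hypothetical gateway-side helper; it is not a vLLM API.
    """
    removed = [key for key in PENALTY_KEYS if key in payload]
    cleaned = {k: v for k, v in payload.items() if k not in PENALTY_KEYS}
    return cleaned, removed


if __name__ == "__main__":
    # Example body using the penalty value quoted in the description.
    request = {"model": "example-model", "prompt": "Hello", "repetition_penalty": 1.1}
    cleaned, removed = strip_penalty_params(request)
    print(cleaned)
    print(removed)
```

The function leaves the original dictionary untouched and reports what it removed, so a caller could log the event or reject the request instead of silently altering it.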

Affected Products

Vendor	Product	Versions
vllm-project	vllm	affected `>= 0.18.0, < 0.20.0`
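The affected range can be checked mechanically. The sketch below is one possible way to test a version string against `>= 0.18.0, < 0.20.0` using only the Python standard library. The function names parse_release and is_affected are hypothetical, and the parser deliberately ignores pre-release or local suffixes, so a string such as "0.20.0rc1" is treated as 0.20.0; a stricter check would need a full version parser.

```python
import re

AFFECTED_MIN = (0, 18, 0)  # inclusive lower bound from the affected range
FIXED_IN = (0, 20, 0)      # first fixed release (exclusive upper bound)


def parse_release(version: str) -> tuple[int, int, int]:
    """Extract the leading major.minor[.patch] numbers, ignoring any suffix."""
    match = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def is_affected(version: str) -> bool:
    """True if the version falls inside the affected range of this advisory."""
    return AFFECTED_MIN <= parse_release(version) < FIXED_IN


if __name__ == "__main__":
    for candidate in ("0.17.3", "0.18.0", "0.19.1", "0.20.0"):
        print(candidate, is_affected(candidate))
```

On a host where the package is installed, the version string could be obtained with importlib.metadata.version("vllm") and passed to is_affected.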

Weaknesses (CWE)

CWE-131

CWE-704

CVSS v3.1 Details

CVSS v3.1 Vector

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H

Attack Vector

Network

Attack Complexity

Low

Privileges Required

Low

User Interaction

None

Scope

Unchanged

Confidentiality

None

Integrity

None

Availability

High
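The 6.5 base score follows from the metric values listed above. As a rough check, the sketch below applies the CVSS v3.1 base score equations to the vector string. It is a simplified illustration that handles only Scope: Unchanged vectors; the names WEIGHTS, roundup, and base_score are chosen here and do not come from any particular library.

```python
import math

# Metric weights from the CVSS v3.1 specification.
# The PR values are those used when Scope is Unchanged.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    # Skip the leading "CVSS:3.1" prefix and split the remaining "KEY:VALUE" pairs.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # prints 6.5
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is about 2.84, giving roughly 6.43, which rounds up to 6.5.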

References

https://github.com/vllm-project/vllm/security/advisories/GHSA-83vm-p52w-f9pw

x_refsource_CONFIRM

https://github.com/vllm-project/vllm/pull/38610

x_refsource_MISC
