Proposed New Category: Postural Manipulation — Semantically Benign Behavioral Orientation Attack

**Proposed New Category: Postural Manipulation**

AG Davidson · Independent AI Security Researcher
ORCID: 0009-0004-2758-9035 · shapingrooms.com

---

**Summary**

Postural manipulation is an attack class in which semantically benign inputs -- content indistinguishable from ordinary human expression -- alter a large language model's behavioral orientation before any instruction is issued. The model that acts may not be the model that was deployed.

No adversarial signature is present. No override is issued. The input passes every current filter. The vulnerability is not in the content. It is in the architecture.

---

**Distinction from existing categories**

| Attack Class | Target | Adversarial Signature |
|---|---|---|
| Prompt Injection (LLM01:2025) | Model behavior | Yes |
| Jailbreaking | Safety constraints | Yes |
| Context-Switch Attack | Dialogue direction | Yes (eventual) |
| Postural Manipulation (proposed) | Model orientation | No |

---

**Why current defenses miss it**

Defensive frameworks scan for adversarial intent -- keyword patterns, semantic drift thresholds, instruction override signatures. Postural manipulation carries none of these signals. The content is genuinely benign by every conventional measure.

This is not a gap in any individual defense. It is a structural blind spot: defenses are built to catch things that look dangerous. Postural manipulation doesn't look like anything except ordinary human expression.

---

**Implications for agentic systems**

In agentic pipelines that read emails, documents, webpages, and retrieved content, each input represents a potential postural vector. The model that acts may not be the model that was configured and tested. There is no log entry for this change. There is no detection signal. There is no mitigation in current practice.

Postural manipulation is not a vector sitting alongside the existing taxonomy. It operates underneath it -- changing the conditions under which every other attack lands.

---

**Empirical basis**

Consistent postural effects documented across four frontier LLM architectures (ChatGPT, Claude, Gemini, Grok) using semantically benign pre-task inputs. Bidirectional effects confirmed using ambient literary content with zero operational vocabulary. Architecture-specific expression observed; directional shift consistent across all four systems.

Three-agent propagation confirmed: posture installs in agent one, survives handoff, arrives as confident policy in agent three -- with the original primer absent and no narration of its influence.

---

**Published research**

The Atmosphere Attack v1.0 (full empirical paper, March 30, 2026)
https://doi.org/10.5281/zenodo.19485192 · [shapingrooms.com/research](https://shapingrooms.com/research)

Postural Manipulation v1.1 (original disclosure, March 19, 2026)
https://doi.org/10.5281/zenodo.19484481  · [shapingrooms.com/posture](https://shapingrooms.com/posture)

Shaping the Room v1.0 (constructive methodology, March 30, 2026)
https://doi.org/10.5281/zenodo.19485354 · [shapingrooms.com/shaping-the-room.pdf](https://shapingrooms.com/shaping-the-room.pdf)


**Interactive demos:**  
[shapingrooms.com/demos](https://shapingrooms.com/demos)

---

**Coordinated disclosure**

Responsible disclosure completed March 23, 2026 to Anthropic, OpenAI, Google, xAI, and CERT/CC. Full coordinated public disclosure March 30, 2026.

---

**About the researcher**

32 years in internet infrastructure, identity and access management, privileged identity management, vulnerability management, intrusion detection, network detection and response, AI security.

---

**How I want to contribute**

Available to:

- Write the full entry in standard OWASP format (description, common examples of risk, prevention and mitigation strategies, references)
- Participate in working group discussions on the #project-top10-for-llm Slack channel
- Present findings to the working group
- Provide the demos, probe sets, and empirical captures for community validation -- all reproducible without specialized tools or internal access

---

*The thread below documents ongoing community engagement, independent convergence from adjacent research, and collaboration on production system prompt data. Those exchanges are part of the record.*

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Proposed New Category: Postural Manipulation — Semantically Benign Behavioral Orientation Attack #807

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Attack Class	Target	Adversarial Signature
Prompt Injection (LLM01:2025)	Model behavior	Yes
Jailbreaking	Safety constraints	Yes
Context-Switch Attack	Dialogue direction	Yes (eventual)
Postural Manipulation (proposed)	Model orientation	No

Uh oh!

Proposed New Category: Postural Manipulation — Semantically Benign Behavioral Orientation Attack #807

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions