Proposed New Category: Postural Manipulation
AG Davidson · Independent AI Security Researcher
ORCID: 0009-0004-2758-9035 · shapingrooms.com
Summary
Postural manipulation is an attack class in which semantically benign inputs -- content indistinguishable from ordinary human expression -- alter a large language model's behavioral orientation before any instruction is issued. The model that acts may not be the model that was deployed.
No adversarial signature is present. No override is issued. The input passes every current filter. The vulnerability is not in the content. It is in the architecture.
Distinction from existing categories
| Attack Class |
Target |
Adversarial Signature |
| Prompt Injection (LLM01:2025) |
Model behavior |
Yes |
| Jailbreaking |
Safety constraints |
Yes |
| Context-Switch Attack |
Dialogue direction |
Yes (eventual) |
| Postural Manipulation (proposed) |
Model orientation |
No |
Why current defenses miss it
Defensive frameworks scan for adversarial intent -- keyword patterns, semantic drift thresholds, instruction override signatures. Postural manipulation carries none of these signals. The content is genuinely benign by every conventional measure.
This is not a gap in any individual defense. It is a structural blind spot: defenses are built to catch things that look dangerous. Postural manipulation doesn't look like anything except ordinary human expression.
Implications for agentic systems
In agentic pipelines that read emails, documents, webpages, and retrieved content, each input represents a potential postural vector. The model that acts may not be the model that was configured and tested. There is no log entry for this change. There is no detection signal. There is no mitigation in current practice.
Postural manipulation is not a vector sitting alongside the existing taxonomy. It operates underneath it -- changing the conditions under which every other attack lands.
Empirical basis
Consistent postural effects documented across four frontier LLM architectures (ChatGPT, Claude, Gemini, Grok) using semantically benign pre-task inputs. Bidirectional effects confirmed using ambient literary content with zero operational vocabulary. Architecture-specific expression observed; directional shift consistent across all four systems.
Three-agent propagation confirmed: posture installs in agent one, survives handoff, arrives as confident policy in agent three -- with the original primer absent and no narration of its influence.
Published research
The Atmosphere Attack v1.0 (full empirical paper, March 30, 2026)
https://doi.org/10.5281/zenodo.19485192 · shapingrooms.com/research
Postural Manipulation v1.1 (original disclosure, March 19, 2026)
https://doi.org/10.5281/zenodo.19484481 · shapingrooms.com/posture
Shaping the Room v1.0 (constructive methodology, March 30, 2026)
https://doi.org/10.5281/zenodo.19485354 · shapingrooms.com/shaping-the-room.pdf
Interactive demos:
shapingrooms.com/demos
Coordinated disclosure
Responsible disclosure completed March 23, 2026 to Anthropic, OpenAI, Google, xAI, and CERT/CC. Full coordinated public disclosure March 30, 2026.
About the researcher
32 years in internet infrastructure, identity and access management, privileged identity management, vulnerability management, intrusion detection, network detection and response, AI security.
How I want to contribute
Available to:
- Write the full entry in standard OWASP format (description, common examples of risk, prevention and mitigation strategies, references)
- Participate in working group discussions on the #project-top10-for-llm Slack channel
- Present findings to the working group
- Provide the demos, probe sets, and empirical captures for community validation -- all reproducible without specialized tools or internal access
The thread below documents ongoing community engagement, independent convergence from adjacent research, and collaboration on production system prompt data. Those exchanges are part of the record.
Proposed New Category: Postural Manipulation
AG Davidson · Independent AI Security Researcher
ORCID: 0009-0004-2758-9035 · shapingrooms.com
Summary
Postural manipulation is an attack class in which semantically benign inputs -- content indistinguishable from ordinary human expression -- alter a large language model's behavioral orientation before any instruction is issued. The model that acts may not be the model that was deployed.
No adversarial signature is present. No override is issued. The input passes every current filter. The vulnerability is not in the content. It is in the architecture.
Distinction from existing categories
Why current defenses miss it
Defensive frameworks scan for adversarial intent -- keyword patterns, semantic drift thresholds, instruction override signatures. Postural manipulation carries none of these signals. The content is genuinely benign by every conventional measure.
This is not a gap in any individual defense. It is a structural blind spot: defenses are built to catch things that look dangerous. Postural manipulation doesn't look like anything except ordinary human expression.
Implications for agentic systems
In agentic pipelines that read emails, documents, webpages, and retrieved content, each input represents a potential postural vector. The model that acts may not be the model that was configured and tested. There is no log entry for this change. There is no detection signal. There is no mitigation in current practice.
Postural manipulation is not a vector sitting alongside the existing taxonomy. It operates underneath it -- changing the conditions under which every other attack lands.
Empirical basis
Consistent postural effects documented across four frontier LLM architectures (ChatGPT, Claude, Gemini, Grok) using semantically benign pre-task inputs. Bidirectional effects confirmed using ambient literary content with zero operational vocabulary. Architecture-specific expression observed; directional shift consistent across all four systems.
Three-agent propagation confirmed: posture installs in agent one, survives handoff, arrives as confident policy in agent three -- with the original primer absent and no narration of its influence.
Published research
The Atmosphere Attack v1.0 (full empirical paper, March 30, 2026)
https://doi.org/10.5281/zenodo.19485192 · shapingrooms.com/research
Postural Manipulation v1.1 (original disclosure, March 19, 2026)
https://doi.org/10.5281/zenodo.19484481 · shapingrooms.com/posture
Shaping the Room v1.0 (constructive methodology, March 30, 2026)
https://doi.org/10.5281/zenodo.19485354 · shapingrooms.com/shaping-the-room.pdf
Interactive demos:
shapingrooms.com/demos
Coordinated disclosure
Responsible disclosure completed March 23, 2026 to Anthropic, OpenAI, Google, xAI, and CERT/CC. Full coordinated public disclosure March 30, 2026.
About the researcher
32 years in internet infrastructure, identity and access management, privileged identity management, vulnerability management, intrusion detection, network detection and response, AI security.
How I want to contribute
Available to:
The thread below documents ongoing community engagement, independent convergence from adjacent research, and collaboration on production system prompt data. Those exchanges are part of the record.