Bias-Resilient Credit Analysis Agent Using Prompt Baking / Negative Baking

### Use Case Description

This use case evaluates a Credit Analysis AI Agent that assists financial institutions in reviewing credit approval memorandums while ensuring that protected attributes do not influence the agent’s reasoning or outputs.

In many lending workflows, analysts produce credit memorandums summarizing borrower risk, financial data, and contextual factors before decisions are reviewed by credit committees. AI agents can support this workflow by:

- Extracting relevant information
- Summarizing risk factors
- Generating structured assessments
- Supporting credit committee preparation

However, large language model–based agents may inadvertently consider protected attributes (e.g., race, sexual orientation) if those appear in documents, creating fair lending and regulatory risks.

This use case evaluates whether an agent configured with Prompt Baking / Negative Baking techniques can reliably ignore protected attributes while still completing its credit analysis tasks.

Prompt Baking converts runtime instructions into model behavior so the model behaves as if a prompt were permanently present. Negative Baking extends this concept to suppress the influence of specific attributes at the model-behavior level, reducing reliance on runtime guardrails.

[ControlPlane - FINOS AI SIG Bread Technology presentation.pdf](https://github.com/user-attachments/files/26019867/ControlPlane.-.FINOS.AI.SIG.Bread.Technology.presentation.pdf)

### Relevance & Business Value

This use case is highly relevant to financial institutions deploying agentic AI in regulated credit decisioning workflows.

Key benefits include:
- Fair lending risk mitigation when AI assists credit decisions
- Improved governance defensibility for AI-assisted lending workflows
- Reduced reliance on fragile prompt guardrails and runtime filtering
- More consistent agent behavior across credit analysis tasks
- Ability to produce evidence artifacts for Model Risk Management and compliance review.

It also demonstrates a pattern where bias mitigation is embedded into agent capabilities rather than enforced only at runtime, which may improve robustness against adversarial prompts and workflow changes.

### 1. Key Risks

**Fair Lending Bias:** The agent may use protected attributes (e.g., race or sexual orientation) in reasoning or recommendations.

**Prompt Injection / Adversarial Prompts**: Users may attempt to coerce the agent into considering or revealing sensitive attributes.

**Governance Risk:** Regulators require evidence that automated decision-support systems do not rely on protected characteristics.

**Capability Trade-Offs:** Bias mitigation techniques may reduce general model capability or reasoning performance.

### 2. Proposed Evaluation Metrics/Methods

**Evaluation Scenario:**

The evaluation simulates a credit analysis workflow where an AI agent reviews credit approval memorandums.

The test environment uses synthetically generated credit memos that include protected attributes embedded in narrative text.

The agent must:

- Extract relevant financial information
- Summarize borrower risk
- Produce a structured credit analysis
- Avoid referencing or using protected attributes

The evaluation compares a baseline agent and a bias-mitigated agent configured using Negative Baking.

Important constraints:
- Conducted in a controlled evaluation harness
- Uses synthetic data
- Focused on a single workflow scenario rather than full banking deployment.

**Evaluation Methodology**

The agent is evaluated across several interaction patterns.

1. Test Cases

- Standard Task Execution: The agent receives a credit memo and produces an analysis.
- Adversarial Prompting: Prompts attempt to cause the agent to reveal or rely on protected attributes.
- Jailbreak Scenarios: The agent is challenged with instructions designed to bypass guardrails.
- Benchmark Testing: General benchmarks measure whether bias mitigation reduces overall model capability.

2. Evaluation Objectives
 
- Validate suppression of sensitive attribute reasoning
- Demonstrate robustness to adversarial prompting
- Measure fairness improvements
- Identify any degradation in general performance

**Example Metrics** 

Fairness Metrics
- Sensitive attribute disclosure rate
- Attribute influence score
- Fairness improvement delta

Robustness Metrics
- Jailbreak success rate
- Adversarial prompt resistance
- Agent Performance Metrics

Credit memo analysis accuracy
- Task completion success rate
- Benchmark performance

### Envisioned Agent Components (System-Level)

- [x] Large Language Model (LLM) (e.g., OpenAI, Anthropic, open models)
- [x] Vector DB (e.g., Pinecone, Weaviate, Milvus)
- [x] RAG (Retrieval-Augmented Generation) Pipeline
- [x] Agentic Framework (e.g., LangChain, LlamaIndex, crewAI)
- [ ] Access to external tools/APIs (e.g., web search, market data feeds)
- [ ] Durable Execution (e.g., Temporal)

### Additional Context & Datasets

**Data Requirements**

The evaluation requires:
- Synthetic credit approval memorandums
- Documents containing embedded protected attributes
- Adversarial prompt sets
- Benchmark evaluation datasets

**Implementation Considerations**

A production-grade evaluation harness may include:
- Credit Analysis Agent implementation
- Baseline and baked model configurations
- Adversarial prompt testing suite
- Fairness evaluation tools
- Governance reporting outputs

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bias-Resilient Credit Analysis Agent Using Prompt Baking / Negative Baking #24

Use Case Description

Relevance & Business Value

1. Key Risks

2. Proposed Evaluation Metrics/Methods

Envisioned Agent Components (System-Level)

Additional Context & Datasets

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Bias-Resilient Credit Analysis Agent Using Prompt Baking / Negative Baking #24

Description

Use Case Description

Relevance & Business Value

1. Key Risks

2. Proposed Evaluation Metrics/Methods

Envisioned Agent Components (System-Level)

Additional Context & Datasets

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions