You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This use case evaluates a Credit Analysis AI Agent that assists financial institutions in reviewing credit approval memorandums while ensuring that protected attributes do not influence the agent’s reasoning or outputs.
In many lending workflows, analysts produce credit memorandums summarizing borrower risk, financial data, and contextual factors before decisions are reviewed by credit committees. AI agents can support this workflow by:
Extracting relevant information
Summarizing risk factors
Generating structured assessments
Supporting credit committee preparation
However, large language model–based agents may inadvertently consider protected attributes (e.g., race, sexual orientation) if those appear in documents, creating fair lending and regulatory risks.
This use case evaluates whether an agent configured with Prompt Baking / Negative Baking techniques can reliably ignore protected attributes while still completing its credit analysis tasks.
Prompt Baking converts runtime instructions into model behavior so the model behaves as if a prompt were permanently present. Negative Baking extends this concept to suppress the influence of specific attributes at the model-behavior level, reducing reliance on runtime guardrails.
This use case is highly relevant to financial institutions deploying agentic AI in regulated credit decisioning workflows.
Key benefits include:
Fair lending risk mitigation when AI assists credit decisions
Improved governance defensibility for AI-assisted lending workflows
Reduced reliance on fragile prompt guardrails and runtime filtering
More consistent agent behavior across credit analysis tasks
Ability to produce evidence artifacts for Model Risk Management and compliance review.
It also demonstrates a pattern where bias mitigation is embedded into agent capabilities rather than enforced only at runtime, which may improve robustness against adversarial prompts and workflow changes.
1. Key Risks
Fair Lending Bias: The agent may use protected attributes (e.g., race or sexual orientation) in reasoning or recommendations.
Prompt Injection / Adversarial Prompts: Users may attempt to coerce the agent into considering or revealing sensitive attributes.
Governance Risk: Regulators require evidence that automated decision-support systems do not rely on protected characteristics.
Capability Trade-Offs: Bias mitigation techniques may reduce general model capability or reasoning performance.
2. Proposed Evaluation Metrics/Methods
Evaluation Scenario:
The evaluation simulates a credit analysis workflow where an AI agent reviews credit approval memorandums.
The test environment uses synthetically generated credit memos that include protected attributes embedded in narrative text.
The agent must:
Extract relevant financial information
Summarize borrower risk
Produce a structured credit analysis
Avoid referencing or using protected attributes
The evaluation compares a baseline agent and a bias-mitigated agent configured using Negative Baking.
Important constraints:
Conducted in a controlled evaluation harness
Uses synthetic data
Focused on a single workflow scenario rather than full banking deployment.
Evaluation Methodology
The agent is evaluated across several interaction patterns.
Test Cases
Standard Task Execution: The agent receives a credit memo and produces an analysis.
Adversarial Prompting: Prompts attempt to cause the agent to reveal or rely on protected attributes.
Jailbreak Scenarios: The agent is challenged with instructions designed to bypass guardrails.
Benchmark Testing: General benchmarks measure whether bias mitigation reduces overall model capability.
Evaluation Objectives
Validate suppression of sensitive attribute reasoning
Demonstrate robustness to adversarial prompting
Measure fairness improvements
Identify any degradation in general performance
Example Metrics
Fairness Metrics
Sensitive attribute disclosure rate
Attribute influence score
Fairness improvement delta
Robustness Metrics
Jailbreak success rate
Adversarial prompt resistance
Agent Performance Metrics
Credit memo analysis accuracy
Task completion success rate
Benchmark performance
Envisioned Agent Components (System-Level)
Large Language Model (LLM) (e.g., OpenAI, Anthropic, open models)
Use Case Description
This use case evaluates a Credit Analysis AI Agent that assists financial institutions in reviewing credit approval memorandums while ensuring that protected attributes do not influence the agent’s reasoning or outputs.
In many lending workflows, analysts produce credit memorandums summarizing borrower risk, financial data, and contextual factors before decisions are reviewed by credit committees. AI agents can support this workflow by:
However, large language model–based agents may inadvertently consider protected attributes (e.g., race, sexual orientation) if those appear in documents, creating fair lending and regulatory risks.
This use case evaluates whether an agent configured with Prompt Baking / Negative Baking techniques can reliably ignore protected attributes while still completing its credit analysis tasks.
Prompt Baking converts runtime instructions into model behavior so the model behaves as if a prompt were permanently present. Negative Baking extends this concept to suppress the influence of specific attributes at the model-behavior level, reducing reliance on runtime guardrails.
ControlPlane - FINOS AI SIG Bread Technology presentation.pdf
Relevance & Business Value
This use case is highly relevant to financial institutions deploying agentic AI in regulated credit decisioning workflows.
Key benefits include:
It also demonstrates a pattern where bias mitigation is embedded into agent capabilities rather than enforced only at runtime, which may improve robustness against adversarial prompts and workflow changes.
1. Key Risks
Fair Lending Bias: The agent may use protected attributes (e.g., race or sexual orientation) in reasoning or recommendations.
Prompt Injection / Adversarial Prompts: Users may attempt to coerce the agent into considering or revealing sensitive attributes.
Governance Risk: Regulators require evidence that automated decision-support systems do not rely on protected characteristics.
Capability Trade-Offs: Bias mitigation techniques may reduce general model capability or reasoning performance.
2. Proposed Evaluation Metrics/Methods
Evaluation Scenario:
The evaluation simulates a credit analysis workflow where an AI agent reviews credit approval memorandums.
The test environment uses synthetically generated credit memos that include protected attributes embedded in narrative text.
The agent must:
The evaluation compares a baseline agent and a bias-mitigated agent configured using Negative Baking.
Important constraints:
Evaluation Methodology
The agent is evaluated across several interaction patterns.
Example Metrics
Fairness Metrics
Robustness Metrics
Credit memo analysis accuracy
Envisioned Agent Components (System-Level)
Additional Context & Datasets
Data Requirements
The evaluation requires:
Implementation Considerations
A production-grade evaluation harness may include: