A periodic review must occur at least once a year. This review determines if the model is still working as intended and if the current validation activities are sufficient. This is especially important in stable economic times, when risk estimates can become too optimistic and the available data may not show how the model would perform under stress.

Human-in-the-Loop Validation for Banking Data


Key Takeaways

91% of risk and compliance professionals are aware of AI's role in banking, with 53% actively using or trialing AI systems, up from 30% in 2023.

The Federal Reserve's SR 11-7 guidance mandates three core validation elements: conceptual soundness evaluation, ongoing monitoring, and outcomes analysis.

Human-in-the-loop (HITL) validation addresses the tension between automation efficiency and regulatory accountability, with 42% of professionals believing human oversight is mandatory.

Financial institutions use HITL to annotate transaction patterns, merchant data, and account behaviors for fraud detection while maintaining compliance with privacy regulations and risk management frameworks.
Who is accountable when an AI model flags a transaction as fraudulent, approves a loan, or identifies a compliance violation?
The answer lies in human-in-the-loop (HITL) validation, a framework that integrates human judgment into AI-driven decision-making processes. This approach reflects the regulatory reality that accountability cannot be delegated to algorithms.
Banking institutions must navigate a difficult position. They need AI to manage immense volumes of data, but they must also satisfy strict validation standards from regulators. HITL validation connects these two needs. It places human expertise at specific points in the AI lifecycle, from the initial labeling of data to the final review of a model’s decisions.
The Regulatory Foundation: SR 11-7 and Model Risk Management
The Federal Reserve's SR 11-7 guidance, issued in 2011, is the foundational document for model risk management in American banking. It defines model validation as "the set of processes and activities intended to verify that models are performing as expected, in line with their design objectives and business uses." The guidance covers all models a bank might use, whether built internally or acquired from a third party.
SR 11-7 outlines three main components of a complete validation process:
- An evaluation of the model’s conceptual soundness checks the quality of its design. This includes reviewing documentation and the evidence supporting the chosen methods. This step confirms that the judgments made during model design are sound and aligned with established industry practices.
- Ongoing monitoring confirms the model is implemented correctly and continues to perform well. This part of the process checks if changes in the market, customer behavior, or other conditions require the model to be adjusted or replaced. It often involves benchmarking, which compares the model’s results to those from different analytical methods.
- Outcomes analysis compares the model’s predictions to actual results. A key part of this is back-testing, where model forecasts are compared to outcomes from a time period that was not used to build the model, at a frequency that matches the model’s prediction horizon. (A brief code sketch of this comparison follows the list.)
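To make the outcomes-analysis step concrete, here is a minimal sketch rather than a production validation tool. It assumes a credit model produces quarterly default-rate forecasts and that the bank has the observed default rates for the same holdout quarters; the tolerance threshold and the data are hypothetical.

```python
from statistics import mean

def back_test(predicted_rates, observed_rates, tolerance=0.02):
    """Compare model forecasts to realized outcomes on a holdout period.

    predicted_rates / observed_rates: default rates per quarter (as fractions),
    drawn from quarters that were NOT used to build the model.
    tolerance: illustrative threshold for flagging a quarter for human review.
    """
    if len(predicted_rates) != len(observed_rates):
        raise ValueError("Forecast and outcome series must cover the same quarters")

    errors = [obs - pred for pred, obs in zip(predicted_rates, observed_rates)]
    return {
        "mean_error": mean(errors),                      # bias: persistent over- or under-prediction
        "mean_abs_error": mean(abs(e) for e in errors),  # overall forecast accuracy
        "quarters_breaching_tolerance": [
            i for i, e in enumerate(errors) if abs(e) > tolerance
        ],
    }

# Hypothetical holdout data: four quarters of predicted vs. observed default rates.
predicted = [0.031, 0.029, 0.035, 0.040]
observed  = [0.033, 0.036, 0.052, 0.041]
print(back_test(predicted, observed))
```

In a HITL program, quarters that breach the tolerance would be escalated to a validator for judgment rather than triggering an automatic model change.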
The guidance requires that validation be independent. The staff who validate a model should not be the same people who developed or use it. While developers and users can contribute to validation, their work must be reviewed by an objective party. The quality of this review process, and the actions taken to fix any identified issues, are indicators of a strong validation program.
The Current State of AI Adoption in Banking
A 2025 global study by Moody's of 600 risk and compliance professionals shows how quickly AI adoption is growing in finance. About 91% of those surveyed were aware of AI’s use in their field, with 53% actively using or testing it. This is a notable increase from 30% in 2023. Adoption rates are highest in fintech and asset management.
Larger companies are leading this trend, especially in North America, Europe, and the Asia-Pacific region, though challenges with regulatory uncertainty and system integration persist.
The dominant view, held by 42% of respondents, is that human oversight must be mandatory.
A compliance head from a professional services firm stated, "Ultimately, it is the human beings who must be accountable. You can’t outsource accountability. That’s a principle in regulation that will always stay, so I think human involvement has to be mandatory." A chief financial officer at a North American corporation agreed, adding, "There needs to be a human component because while AI is great, sometimes nothing can beat good old common sense and intuition."
About 5% of respondents said they were comfortable with fully autonomous AI systems. These individuals were found across several sectors, with banking and professional services having the highest representation.
Three Models of Human Involvement
There are three distinct models for integrating human judgment into AI-driven compliance and risk management.
- Human in the loop is the traditional model. An AI system gathers data and flags potential issues, but a compliance professional makes the final decision. This approach reduces risk and meets regulatory expectations for accountability. It aligns with the SR 11-7 requirement for independent validation and human accountability.
- Human out of the loop describes fully autonomous AI. This model is efficient but could create regulatory and operational risks if not managed with care. The small minority (5%) comfortable with this approach suggests that while some are pushing boundaries, the regulatory climate does not yet support it for high-stakes decisions.
- Human on the loop is a middle ground. The AI makes decisions, but professionals monitor the results to confirm they align with the organization’s risk tolerance. This model is developing as institutions and regulators gain more confidence in AI systems.
The division of labor is becoming clear. Humans manage high-risk, complex tasks, while AI systems handle low-risk, repetitive work. Oversight is moving from operational decisions to quality control. This is similar to how banks train new analysts, who start with close supervision and gain independence as they prove their competence. AI models may follow a similar path.
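As a concrete illustration of that division of labor, the sketch below routes alerts by model risk score: clearly low-risk items are closed automatically, while everything else goes to a human review queue. The threshold, field names, and queue structure are hypothetical, not drawn from any regulator's guidance.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Alert:
    alert_id: str
    risk_score: float  # model output in [0, 1]

@dataclass
class ReviewQueue:
    """Work items awaiting a compliance professional's decision."""
    pending: List[Alert] = field(default_factory=list)

def route(alert: Alert, queue: ReviewQueue, auto_threshold: float = 0.2) -> str:
    """Human-in-the-loop routing: the model only closes clearly low-risk alerts;
    anything above the (hypothetical) threshold is escalated to a person."""
    if alert.risk_score < auto_threshold:
        return "auto_closed"          # low-risk, repetitive work handled by the AI
    queue.pending.append(alert)       # a human makes the final call
    return "escalated_to_human"

queue = ReviewQueue()
print(route(Alert("A-1001", 0.05), queue))  # auto_closed
print(route(Alert("A-1002", 0.71), queue))  # escalated_to_human
```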
HITL in Banking Data Annotation
Data annotation is the starting point for HITL validation. Financial institutions process millions of data points daily, from transaction records and market feeds to loan applications and compliance documents. This raw data needs to be structured and labeled before it can be used to train AI models. The quality of this annotation directly affects how well AI systems can perform tasks like fraud detection and risk assessment.
Transaction labeling is a primary annotation task. It involves tagging transactions with attributes like merchant category, transaction type, and risk indicators for fraud detection and compliance monitoring. Annotators label historical fraud cases with specific details, such as unusual spending amounts or locations. These labeled examples help AI systems identify potentially fraudulent activity in real time.
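A labeled transaction record can be as simple as the structure sketched below. The field names are illustrative rather than a standard schema; the point is that each record carries both the raw transaction attributes and the human annotator's judgment, so model training and later audits can trace who labeled what.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

@dataclass
class LabeledTransaction:
    transaction_id: str
    amount: float
    merchant_category: str          # e.g. "electronics", "grocery"
    transaction_type: str           # e.g. "card_present", "online"
    risk_indicators: List[str]      # e.g. ["unusual_amount", "new_geography"]
    is_fraud: bool                  # the annotator's final label
    annotator_id: str               # who applied the label (for audit trails)
    labeled_at: datetime

example = LabeledTransaction(
    transaction_id="TXN-000123",
    amount=4862.50,
    merchant_category="electronics",
    transaction_type="online",
    risk_indicators=["unusual_amount", "new_geography"],
    is_fraud=True,
    annotator_id="analyst-042",
    labeled_at=datetime(2025, 3, 14, 10, 30),
)
```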
The HITL framework is critical for this work because annotation requires domain expertise. An algorithm might flag a transaction that is normal for a particular customer. A compliance professional with knowledge of anti-money laundering (AML) and know-your-customer (KYC) rules provides the context an automated system lacks and can distinguish a genuine warning sign from a statistical anomaly.
Financial data is also protected by strict privacy regulations. Institutions must anonymize sensitive information before using it for model training. HITL validation confirms that this anonymization is done correctly and that the datasets comply with rules like the Gramm-Leach-Bliley Act (GLBA) in the U.S. or the General Data Protection Regulation (GDPR) in Europe. Human reviewers check that personally identifiable information (PII) has been removed.
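The anonymization check itself is often partly automated, with human reviewers sampling the output. The sketch below is a simplified illustration: it scans free-text fields for patterns that resemble U.S. Social Security or card numbers before a record is released for training. Real programs use far broader detection tooling; the two regexes here are illustrative only.

```python
import re

# Illustrative patterns only; production PII detection is considerably broader.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def find_pii(text: str) -> list:
    """Return the names of PII patterns found in a free-text field."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]

def ready_for_training(record: dict) -> bool:
    """A record is released for model training only if no field trips a PII check;
    flagged records go back to a human reviewer."""
    return all(not find_pii(str(value)) for value in record.values())

print(ready_for_training({"memo": "Refund issued to customer"}))   # True
print(ready_for_training({"memo": "SSN on file: 123-45-6789"}))    # False
```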
Validation Challenges and Mitigating Strategies
The Moody’s study noted that safeguards like training programs and governance frameworks are widely used to manage AI risks. These are required under SR 11-7.
- One challenge is that a lack of data limits the use of certain validation tools, which can constrain back-testing and outcomes analysis. In these situations, more attention must be given to the model’s limitations, and senior management needs to be aware of them when using the model for decisions. Possible steps include adjusting model outputs or placing restrictions on the model’s use (a brief sketch of such an adjustment follows this list).
- Another challenge is maintaining validation independence, especially in smaller institutions where the same team might handle development and validation. SR 11-7 permits this, but the work must be reviewed by an independent party. This can be done by hiring a third-party validation service or creating a separate internal function.
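One way to adjust model outputs when outcomes data is too thin for full back-testing is a conservative overlay paired with an explicit usage restriction. The sketch below is a hypothetical illustration of that idea; the margin and the restriction wording are assumptions, not regulatory requirements.

```python
def apply_overlay(model_pd: float, margin: float = 0.25) -> dict:
    """Apply a conservative add-on when outcomes data is too limited for back-testing.

    model_pd: the model's estimated probability of default.
    margin:   a hypothetical, management-approved conservatism factor.
    """
    adjusted = min(1.0, model_pd * (1 + margin))
    return {
        "model_pd": model_pd,
        "adjusted_pd": adjusted,
        "restriction": "Use for monitoring only; not approved for pricing decisions "
                       "until sufficient outcomes data supports back-testing.",
    }

print(apply_overlay(0.04))
```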
The move to more automation also raises staffing questions. If AI handles more low-risk work, the compliance professionals who did those tasks can be redeployed. Their expertise can shift to quality assurance, model validation, and managing complex cases that require human judgment.
Regulatory Perspectives and Future Directions
Regulators are observing AI adoption with interest. The Federal Reserve and other bodies have encouraged new methods while stressing the need for strong governance. Large institutions are experimenting with AI-driven compliance, but they do so with the knowledge that examiners will expect proof of effective validation.
In the future, AI agents might be treated like new employees, requiring review before they are given autonomy. Banks should balance new technology with compliance duties. They can learn from both the majority who require human oversight and the minority who are testing autonomous systems.
Regulatory frameworks are also changing. The European Union’s AI Act, for instance, classifies AI systems used for credit scoring as high-risk, which means they require conformity assessments and continuous monitoring. In the U.S., the Federal Reserve has indicated it will update SR 11-7 to address AI-specific risks like bias and data drift. These updates are expected to reinforce the importance of HITL validation.
Conclusion
Human-in-the-loop validation is not a temporary measure until AI becomes "good enough" to operate autonomously. It is a strategic advantage that balances the efficiency of automation with the accountability required by regulators and the trust demanded by customers. The SR 11-7 framework offers a clear guide for integrating human judgment into model validation.
As AI adoption accelerates in banking, institutions should resist the temptation to view HITL as a delay and instead treat it as a safeguard. It confirms that models work as intended, that risks are managed, and that accountability stays with people. The industry is finding a balance between new technology and responsibility, and that balance is the foundation of trust in a financial system that increasingly relies on AI.
FAQ
Does HITL validation apply only to large banks?
No, the principles of HITL apply to any financial institution using AI models for risk-related decisions. Smaller institutions can implement it by using third-party validation services or by establishing clear, independent review processes within their existing teams.
Doesn’t human review undermine the efficiency gains of AI?
HITL is designed to be applied at critical points, not to every single decision. For high-volume, low-risk decisions, AI can operate with a high degree of autonomy. Human review is focused on model validation, handling exceptions, and assessing high-risk cases, which preserves efficiency while managing risk.
What is the difference between “human in the loop” and “human on the loop”?
“Human in the loop” means a human must approve a decision before it is finalized. “Human on the loop” means the AI makes the decision, but a human monitors the outcomes and can intervene if necessary. The choice between them depends on the risk level of the task.
Will HITL validation become obsolete as AI improves?
It is unlikely to become obsolete in high-stakes financial decisions. Regulatory requirements for accountability mean that a human must remain responsible. As AI advances, the nature of human oversight will evolve from direct decision approval to a focus on model governance, performance monitoring, and ethical considerations.