The AI Assurance Sandbox is a global initiative by IMDA and AI Verify Foundation to create a testing ground for builders or deployers of GenAI applications (not the underlying foundation models) to get them tested by specialist technical testers.
Objectives of Sandbox
Reduce testing-related barriers to GenAI adoption – through • Practical guidance • Access to specialist testing partners
Provide inputs into (eventual) technical testing standards for GenAI applications
Support the growth of a viable AI assurance market
Who should join the sandbox?
Builders or deployers of GenAI applications
You are launching or scaling up a GenAI application, and are looking for:
Guidance on the appropriate testing to build trust in the application (what to test)
Guidance on how to conduct those tests
Introduction to potential partners that have relevant experience in such testing
Opportunity to showcase your effort
(Limited funding to access specialist expertise)
Specialist Technical Testing Vendors
You are building a business (software/ service) around technical testing of GenAI applications, and are seeking an opportunity to:
Validate your testing product/methodology with a real-life use case
Get introduced to potential customers
Contribute to emerging standards in GenAI technical testing
Showcase your capabilities
Scope
For Application to be tested
Involves use of a large language or multi-modal model*
Live in production or intended to be live (not purely experimental)
Focus on technical testing (not process governance) of the application (not the underlying foundation model)
Makes a net new contribution to the AIVF/IMDA “body of knowledge”
* Exception: video/ image/ voice applications using pre-LLM/LMM technology
Specialist Technical Testing Vendors
Offers AI testing as part of product and/or service
Demonstrates technical expertise in designing and scaling AI testing (e.g., benchmarking, red-teaming, automated evaluators, automated test data generation, human calibration)
Able to distinguish between testing of underlying foundation model and the GenAI application
Risk dimensions to consider during testing (not exhaustive)
Impact on Safety & Health/Financial Concerns/ Trust/Reputation Concerns/ Unfair treatment of employees/ customers/ users
Lack of Appropriate level of human oversight / Recourse
(Non-AI) Breach of industry-specific regulatory requirements
(Non-AI) Breach of internal compliance requirements
Timeline
Apply to participate: Ongoing
Technical testing of the Gen AI use case: Up to 3 months
Output
Case Study / Report
What the Sandbox will NOT cover:
A software environment to conduct such testing (can use our open source libraries if relevant)
Regulatory approval for the application, from either IMDA/ AIVF or any other sector regulator
Background
In February 2025, AIVF/IMDA launched the Global AI Assurance Pilot for testing of GenAI applications. The goal was to provide the industry an opportunity to shape good practices in this rapidly evolving space. The Pilot was successful in its mission of bringing together two groups – those who are deploying GenAI in their day to day operations, and those who specialise in testing.
By May 2025, the AI Verify Foundation and IMDA had paired 17 AI deployers with 16 specialist technical testers from around the world.
Together, we learnt a lot from the pilot: not just about how to test, but also what to test (or not to test). Participating companies were able to get their GenAI applications evaluated by specialist testers. The testers in turn got exposure to real-life applications, on which they could refine their testing approaches.
Following the successful pilot, we have turned it into an ongoing Sandbox and launched it on 7 July 2025.
Thank you for completing the form. Your submission was successful.
Preview all the questions
1
Your organisation’s background – Could you briefly share your organisation’s background (e.g. sector, goods/services offered, customers), AI solution(s) that has/have been developed/used/deployed in your organisation, and what it is used for (e.g. product recommendation, improving operation efficiency)?
2
Your AI Verify use case – Could you share the AI model and use case that was tested with AI Verify? Which version of AI Verify did you use?
3
Your reasons for using AI Verify – Why did your organisation decide to use AI Verify?
4
Your experience with AI Verify – Could you share your journey in using AI Verify? For example, preparation work for the testing, any challenges faced, and how were they overcome? How did you find the testing process? Did it take long to complete the testing?
5
Your key learnings and insights – Could you share key learnings and insights from the testing process? For example, 2 to 3 key learnings from the testing process? Any actions you have taken after using AI Verify?
6
Your thoughts on trustworthy AI – Why is demonstrating trustworthy AI important to your organisation and to any other organisations using AI systems? Would you recommend AI Verify? How does AI Verify help you demonstrate trustworthy AI?
Enter your name and email address below to download the Discussion Paper by Aicadium and IMDA.
Disclaimer: By proceeding, you agree that your information will be shared with the authors of the Discussion Paper.