Developing Functionally Equivalent Proxy Systems for AI: A Framework for Code Similarity Analysis, Asynchronous Digital Twin Proxies, and Proxy Repository Implementation
PI: Joanna F. DeFranco (Engineering)
Additional mentors:
Mark Kennedy, D.Eng. candidate. Mark is a fully funded D.Eng. student working on objectives #1 and #2 and would assist in mentoring an RA on objectives #3 and #4.
Sven Bilen, Ph.D. Dr. Bilen is an expert in systems engineering and would contribute valuable use cases for the proxy system.
Safety and trust are essential attributes of any critical system—systems that must not only perform their intended function with high reliability but also ensure no harm to the public. Testing these systems becomes especially complex when they incorporate artificial intelligence (AI), as Critical AI Systems (CAIS) can exhibit unpredictable behaviors that are difficult to replicate or evaluate through traditional testing methods. To address this, non-critical proxy systems are needed to safely test and evaluate CAIS performance under extreme or failure conditions.
With colleagues at NIST and Penn State, the PI created a five-dimensional CAIS framework and an associated weighting system that maps key system characteristics to potential proxies:
https://nvlpubs.nist.gov/nistpubs/CSWP/NIST.CSWP.31.pdf
This approach provides a structured pathway for identifying and developing equivalent proxy systems suitable for rigorous error testing, ultimately enhancing safety and trust in critical AI systems.
The process relies on identifying proxies with high similarity to the CAIS and validating them through use- and misuse-case testing. The concept parallels transfer learning, with the key distinction that in proxy V&V both the model and the environment may differ. To ensure meaningful results, the framework emphasizes quantifying similarity between the CAIS and its proxy using statistical and attribute-based measures adapted from transfer learning research.
This taxonomy enables the development of the non-critical proxy (i.e., stand-in) equivalent by matching physical operational environments, application purpose, operational characteristics, development algorithms, and development techniques. The weighting system provides additional analysis to determine the most critical dimensions of the proxy. Further analysis may determine any mappings between the taxonomy and dimension weighting. Ultimately, the CAIS taxonomy and analysis technique will yield a set of CAIS proxies and the information needed to select the appropriate proxy for critical error testing.
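To make the weighting concrete, the sketch below shows one possible way a weighted, attribute-based similarity score could be computed across the five dimensions named above. The dimension names follow the taxonomy; the attribute tags, weights, and Jaccard-style per-dimension scoring are illustrative assumptions, not the published NIST method.

# Illustrative sketch (not the published method): a weighted, attribute-based
# similarity score across the five CAIS dimensions. Attribute tags, weights,
# and the example systems below are hypothetical placeholders.

DIMENSIONS = [
    "physical_operational_environment",
    "application_purpose",
    "operational_characteristics",
    "development_algorithms",
    "development_techniques",
]

def proxy_similarity(cais_profile: dict, proxy_profile: dict, weights: dict) -> float:
    """Weighted average of per-dimension similarity scores, each in [0, 1]."""
    total_weight = sum(weights[d] for d in DIMENSIONS)
    score = 0.0
    for d in DIMENSIONS:
        # Per-dimension similarity: fraction of shared attribute tags (Jaccard).
        cais_attrs, proxy_attrs = set(cais_profile[d]), set(proxy_profile[d])
        overlap = len(cais_attrs & proxy_attrs) / max(len(cais_attrs | proxy_attrs), 1)
        score += weights[d] * overlap
    return score / total_weight

# Hypothetical example: an autonomous-navigation CAIS vs. a delivery-robot
# proxy, with extra weight on the physical operational environment.
cais = {
    "physical_operational_environment": ["outdoor", "dynamic_obstacles"],
    "application_purpose": ["navigation"],
    "operational_characteristics": ["real_time"],
    "development_algorithms": ["cnn", "sensor_fusion"],
    "development_techniques": ["supervised_learning"],
}
proxy = {
    "physical_operational_environment": ["outdoor", "dynamic_obstacles"],
    "application_purpose": ["navigation"],
    "operational_characteristics": ["real_time", "low_power"],
    "development_algorithms": ["cnn"],
    "development_techniques": ["supervised_learning", "transfer_learning"],
}
weights = {d: 1.0 for d in DIMENSIONS}
weights["physical_operational_environment"] = 2.0

print(f"similarity = {proxy_similarity(cais, proxy, weights):.2f}")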
The research team will investigate targeted techniques for analyzing AI systems to develop functionally equivalent proxy systems. There are four objectives:
1. Create a technique to analyze code similarity. Given the challenge of obtaining code from a critical system, we are looking for ways to determine code similarity using source code metrics (a sketch of one possible metric-based approach appears after this list).
2. Validate the techniques using GitHub repositories (i.e., build and test a proxy system similar to an asynchronous digital twin for a non-critical AI system).
3. Implement a system to categorize potential proxies using the validated technique.
4. Build a database of proxies.
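Objective #1 could be prototyped, for example, by representing each code base as a small vector of source-code metrics and comparing the vectors rather than the code itself. The sketch below is a minimal illustration under that assumption: it uses Python's standard ast module to collect a few structural metrics (lines of code, functions, classes, imports, branch points) per repository and compares two repositories with cosine similarity. The metric set, the comparison function, and the repository paths in the usage comment are illustrative assumptions, not the project's validated technique.

# Hedged sketch of one way objective #1 might be approached: compare two
# Python code bases by a small vector of source-code metrics.
import ast
import math
from pathlib import Path

def metrics_for_file(path: Path) -> dict:
    """Extract simple structural metrics from one Python source file."""
    source = path.read_text(encoding="utf-8", errors="ignore")
    nodes = list(ast.walk(ast.parse(source)))
    return {
        "loc": len(source.splitlines()),
        "functions": sum(isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef)) for n in nodes),
        "classes": sum(isinstance(n, ast.ClassDef) for n in nodes),
        "imports": sum(isinstance(n, (ast.Import, ast.ImportFrom)) for n in nodes),
        "branches": sum(isinstance(n, (ast.If, ast.For, ast.While, ast.Try)) for n in nodes),
    }

def metrics_for_repo(repo_dir: str) -> list:
    """Aggregate metrics over all .py files in a repository checkout."""
    totals = {"loc": 0, "functions": 0, "classes": 0, "imports": 0, "branches": 0}
    for path in Path(repo_dir).rglob("*.py"):
        try:
            for key, value in metrics_for_file(path).items():
                totals[key] += value
        except (SyntaxError, ValueError):
            continue  # skip files that do not parse
    return [totals[k] for k in sorted(totals)]  # fixed key order for both repos

def cosine_similarity(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Usage (hypothetical local checkouts of two GitHub repositories):
# score = cosine_similarity(metrics_for_repo("cais_like_repo"),
#                           metrics_for_repo("candidate_proxy_repo"))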
The D.Eng. candidate is focused on the first two objectives. The RA would focus on objectives #3 and #4. This would provide additional validation by matching a non-critical AI proxy system to a non-critical AI system.
• A list of specific areas of computational and/or data science expertise or skills that the current team is particularly interested in recruiting to support the project.
o Verification and validation planning
o Developing modular, testable, and scalable codebases
o Implementing transfer learning and domain adaptation techniques
o Implementing software testing and version control
o Familiarity with AI/ML frameworks
o Familiarity with Python, C/C++, Java
• A list of specific objectives for work supported by this call
The junior researcher would further validate and scale the techniques for code analysis developed by the D.Eng. student. GitHub repositories can be used to build, test, and categorize potential proxy code using the technique, which would then be stored in the searchable database of proxies created as part of this research.
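As a minimal sketch of what the searchable proxy database could look like, the example below uses SQLite from the Python standard library. The table layout, field names, and the example repository URL are assumptions for illustration only; the actual schema would be driven by the validated categorization technique.

# Minimal sketch of a searchable proxy database using SQLite.
# Schema, field names, and the example entry are illustrative assumptions.
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS proxies (
    id              INTEGER PRIMARY KEY,
    repo_url        TEXT NOT NULL,        -- GitHub repository of the candidate proxy
    domain          TEXT,                 -- e.g. 'autonomous navigation'
    dimension_score REAL,                 -- weighted five-dimension similarity
    code_similarity REAL,                 -- metric-based code similarity
    validated       INTEGER DEFAULT 0     -- 1 once use/misuse-case testing passes
);
"""

def add_proxy(conn, repo_url, domain, dimension_score, code_similarity):
    conn.execute(
        "INSERT INTO proxies (repo_url, domain, dimension_score, code_similarity) "
        "VALUES (?, ?, ?, ?)",
        (repo_url, domain, dimension_score, code_similarity),
    )
    conn.commit()

def search_proxies(conn, domain, min_score=0.7):
    """Return candidate proxies for a domain, best matches first."""
    return conn.execute(
        "SELECT repo_url, dimension_score, code_similarity FROM proxies "
        "WHERE domain = ? AND dimension_score >= ? ORDER BY dimension_score DESC",
        (domain, min_score),
    ).fetchall()

conn = sqlite3.connect("proxy_catalog.db")
conn.executescript(SCHEMA)
# Hypothetical repository URL for illustration only.
add_proxy(conn, "https://github.com/example/delivery-robot", "autonomous navigation", 0.82, 0.74)
print(search_proxies(conn, "autonomous navigation"))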
• At least one medium-to-long-term goal
The medium-term goal is to determine a matching process. The long-term goals are to validate the proxy and develop a searchable database of proxy systems.
• A short statement (1 sentence to 1 paragraph) explaining the connection of the project to ICDS's mission.
This research aligns with Penn State's commitment to advancing trustworthy AI, supported by the Institute for Computational and Data Sciences (ICDS). ICDS fosters interdisciplinary research that integrates domain-specific knowledge with advanced computational methods and offers high-performance computing infrastructure and expert consulting; these capabilities support the proxy-based validation framework and the safe, scalable evaluation of critical AI systems. Leveraging this partnership, the proposed research aims to expand proxy validation methods and apply them to emerging AI technologies across critical infrastructure domains.
• A paragraph summarizing team members' recent and/or planned engagement with ICDS.
As part of this research effort, we plan to engage with ICDS to support key aspects of proxy system development and validation. Collaboration with ICDS-affiliated faculty in artificial intelligence, systems engineering, and cybersecurity will be essential, along with the involvement of domain experts, co-hired faculty, and graduate students. To strengthen interdisciplinary impact, we will explore partnerships with researchers across AI and reliability engineering.