AI researchers and engineers are confronted with different levels of safety and security, different horizontal and vertical regulations, different (ethical) standards (including fairness and privacy), different certification processes, and different degrees of liability. These force them to examine a multitude of trade-offs and alternative solutions to address specific requirements such as fairness, explainability, transparency, accountability, reproducibility, reliability, and acceptance. It is critical to establish objective attributes such as accountability, accuracy, controllability, correctness, data quality, reliability, resilience, robustness, safety, security, transparency, explainability, fairness, and privacy, to map them onto AI processes and their lifecycle, and to provide metrics, measurements, methods, and tools to assess them. This emphasis also needs to be considered at the theoretical level, so that AI process and lifecycle considerations, which are often only addressed after research methods are developed, are incorporated early, bringing full-lifecycle concerns closer to basic AI research approaches.
The focus of this symposium track is on AI trustworthiness broadly and methods that help provide bounds for fairness, reproducibility, reliability and accountability in the context of quantifying AI-system risk, spanning the entire AI lifecycle from theoretical research formulations all the way to system implementation, deployment and operation. This symposium brings together industry, academia, and government researchers and practitioners who are vested stakeholders in addressing these AI-specific and intelligent system challenges in applications where a priori understanding of risk is critical.
The symposium track aims to create a platform for discussions and explorations that are expected to ultimately contribute to the development of innovative solutions for quantitatively trustworthy AI. Potential topics of interest include, but are not limited to:
- Assessment of non-functional requirements such as explainability, transparency, accountability, privacy, unintended discrimination, and legal compliance (e.g., copyright violation in the context of LLMs), ranging from pilot assessments to systematic evaluation and monitoring.
- NeuroSymbolic methods that use data and knowledge to support system reliability requirements, quantify uncertainty, mitigate over-generalization, and improve the trustworthiness of LLFM-enabled critical applications.
- Approaches for verification and validation (V&V) of AI systems; quantitative AI and system performance indicators; links between performance, trustworthiness, and trust.
- Methods and approaches for enhancing reasoning in LLFMs, e.g., causal reasoning techniques and outcome verification approaches. Research focused on assessing and evaluating the reasoning behavior of LLFMs, beyond LLFM benchmark metrics, is also of relevance.
- Links between performance, trustworthiness, and trust, leveraging AI sciences, system and software engineering, metrology, and Social Sciences and Humanities (SSH) methods.
- Research on, and architectures/frameworks for, Mixture-of-Experts (MoE) and multi-agent systems, with an emphasis on robustness, reliability, accountability, and emergent behaviors in risk-averse contexts.
- Evaluation of AI system vulnerabilities, risks, and impact, including adversarial approaches (prompt injection, data poisoning, etc.) and red-teaming approaches (assessing model risk and liabilities, assessing degradation objectives, and automating "attacks" and assessments) targeting LLFMs or multi-agent behaviors.