PhD position: Interactive Explainability of Machine learning applied to language tasks
Job offer posted on 6 July 2022
The DesCartes program (Work Package 5) is looking for a PhD student in Interactive Explainability of Machine Learning applied to language tasks. The program aims to develop disruptive hybrid AI to serve the smart city and to enable optimized decision-making in the complex situations encountered in critical urban systems.
The DesCartes program is developing a hybrid AI that combines Learning, Knowledge and Reasoning, has desirable properties (lower resource and data requirements, security, robustness, fairness, respect for privacy, ethics), and is demonstrated on industrial smart-city applications (digital energy, monitoring of structures, air traffic control).
The program brings together 80 permanent researchers (half from France, half from Singapore), with the support of large industrial groups (Thales SG, EDF SG, ESI group, CETIM Matcor, ARIA etc.).
The research will take place mainly in Singapore, at the premises of CNRS@CREATE, with a competitive salary and generous funding for missions.
The thesis takes place within the DesCartes project, which will generate a lot of data about artificial systems deployed in the wild, part of which will be expressed as textual data (expert reports, user reactions, news coverage, social media conversations). Natural language processing (NLP) models can help access that voluminous information, but operators, policy makers and public institutions need to understand the reasons behind the models' behaviours and the information they extract, in order to evaluate potential issues (accuracy, fairness, biases). This thesis will investigate methods designed to explain machine learning systems typically used in NLP while integrating an interactive process with the system's users.
Modern machine-learning-based AI systems, while achieving good results on many tasks, still appear as “black-box” models, where it is difficult to trace the path from the input (a text, an image, a set of sensor measurements) to the decision (classification of a document, an image, a situation).
The issue of explainability poses two different problems: (1) what is a good explanation, and specifically what is a good explanation in the context of textual models? and (2) how to scale existing explanation methods to the kind of models used in NLP tasks?
Regarding (1), existing methods for image classification or tabular data tend to rely on extracting a set of pixels or features that is sufficient to generate the prediction, or that increases the probability of the prediction. This is less straightforward for textual input, which consists of words whose meanings are inter-related in a given context (for instance, “good” in a review could indicate that the review is positive … unless it is preceded by “not”). So the first problem of this thesis will be to provide humanly acceptable explanations of simple text classifiers, such as those foreseen for the detection tasks in the dedicated sub-project of DesCartes.
Regarding (2), modern NLP models are based on very large and complex architectures, such as the transformer family. Logically sufficient or causally satisfying explanations are difficult to obtain in such cases, as both kinds of methods suffer from scalability problems. So we will explore heuristics, based on our solution to the first problem, that guide an interactive procedure between the explainee (the person requesting the explanation) and the ML system whose predictions should be explained. We will evaluate the procedure with the users targeted by the project's use cases. Brian Lim from NUS Singapore will help design the validation experiments.
EXPERIENCE & QUALIFICATIONS
– A background in Computer Science and/or Machine learning.
– Familiarity, or a willingness to acquire familiarity, with both model-based and model-agnostic explanation paradigms that use either logical or statistical methods.
– Familiarity with NLP / dialogue would be a plus.
– Given the nature of the project, the student should be open to working in a cross-disciplinary environment and have good English communication skills.
REFERENCES
– Marina Danilevsky, Kun Qian, Ranit Aharonov, Yannis Katsis, Ban Kawas, and Prithviraj Sen. A Survey of the State of Explainable AI for Natural Language Processing. AACL-IJCNLP 2020. https://aclanthology.org/2020.aacl-main.46/
– Tim Miller. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence 267:1-38 (2019).
– Christoph Molnar. Interpretable Machine Learning. https://christophm.github.io/interpretable-ml-book/
– Alexey Ignatiev, Nina Narodytska, and Joao Marques-Silva. On Relating Explanations and Adversarial Examples. NeurIPS 2019, 15857–15867.
– A. Shrikumar, P. Greenside, and A. Kundaje. Learning important features through propagating activation differences. Proceedings of the 34th International Conference on Machine Learning (ICML), 2017, 3145–3153.
– M. T. Ribeiro, S. Singh, and C. Guestrin. “Why should I trust you?”: Explaining the predictions of any classifier. ACM SIGKDD 2016.
The thesis will take place within the France-Singapore collaboration, with advisors from both sides. The student will be registered at the University of Toulouse and be part of the IRIT lab, but is expected to spend a good part of the thesis in Singapore at the partner lab, with funding provided by the DesCartes program.
The thesis will be supervised on the French side by Nicholas Asher and Philippe Muller, both NLP experts on text and conversation analysis, and co-advised by Nancy Chen from the A* lab, an expert in NLP and dialogue, and Brian Lim at the National University of Singapore, an expert on human-computer interaction. The French advisors will also spend time at NUS during the thesis.
FURTHER INFORMATION & CONTACT
Workplace Address: CREATE Tower (NUS Campus), 1 Create Way #08-01 Singapore 138602
Please send a short cover letter describing your suitability for the position, a detailed CV with academic ranking (if any) and publication list, a concise description of your research interests and future plans, and academic transcripts to:
– Nicholas Asher
– Philippe Muller
– Nancy Chen
We will begin reviewing applications for the position immediately.