The MLIA team at ISIR conducts research on statistical learning (machine learning), with an emphasis on algorithmic aspects and applications. It studies deep learning in different domains, with a particular focus on computer vision, natural language processing and physics-based deep learning. Applications in language processing include interactive IR, conversational or task-oriented search systems, text generation, abstractive summarisation, information extraction, named entity recognition and machine translation.
François Yvon, Senior researcher
Ziqian Peng, PhD student
Paul Lerner, Postdoctoral student
The ALMAnaCH team (Automatic Language Modelling and Analysis & Computational Humanities) focuses on Natural Language Processing (NLP) and Digital Humanities, at the crossroads between theoretical computer science, machine learning, and linguistics. The team’s work covers a wide variety of topics related to language variation, both in a historical sense and within contemporary language states (developing robust NLP systems for noisy web content and dialectal varieties of language). Our interests also extend to the pre-training of neural networks (e.g. the CamemBERT model), interpretability of neural approaches, language resource development (e.g. the OSCAR corpus, treebanks, parallel datasets, lexicons, and historical corpora built using OCR and HTR applied to archives and other historical documents), evaluation, and information extraction and retrieval (especially from specialised corpora and historical documents).
Éric de la Clergerie, Researcher
Laurent Romary, Senior researcher
Research Unit 3967 CLILLAC-ARP (Centre for Inter-language Linguistics, Lexicology, English Linguistics and Corpus-Workshop for Speech Research) is a unit of Université Paris-Cité, attached to the Language Sciences Doctoral School and supported by the linguistics, applied language and English departments. As part of MaTOS, CLILLAC-ARP provides expertise in specialised discourse, terminology and variation, neology, corpus linguistics, contrastive analysis of bilingual scientific discourse, specialised translation and post-editing, as well as human evaluation of machine translation and post-editing.
Alexandra Mestivier, Associate professor
Lichao Zhu, Associate professor
Maud Bénard, PhD student
José Cornejo Cárcamo, PhD student
The Institut de l'Information Scientifique et Technique (Institute for Scientific and Technical Information, Inist) is a CNRS support and research unit (UAR) specialising in scientific and technical information (STI). Inist's mission is to provide research units and research support services with tools and services for accessing, disseminating, exploiting, analysing, mining, and enriching scientific data in the broadest sense (all information produced by research, including texts, documents, software, and publications). Inist's activities focus on three areas: "access to scientific information", "exploitation of research data", and "information analysis and mining". The unit's project is in line with the institutional and national policy of open science.