𝕄aTOS

MAchine Translation for Open Science


The MLIA team at ISIR conducts research on statistical learning (machine learning), with an emphasis on algorithmic aspects and applications. It studies deep learning in different domains, with a particular focus on computer vision, natural language processing and physics-based deep learning. Applications in language processing include interactive IR, conversational or task-oriented search systems, text generation, abstractive summarisation, information extraction, named entity recognition and machine translation.

François Yvon
Senior researcher
Ziqian Peng
PhD student
Paul Lerner
Postdoctoral researcher

The ALMAnaCH team (Automatic Language Modelling and Analysis & Computational Humanities) focuses on Natural Language Processing (NLP) and Digital Humanities, at the crossroads between theoretical computer science, machine learning, and linguistics. The team’s work covers a wide variety of topics related to language variation, both in a historical sense and within contemporary language states (developing robust NLP systems for noisy web content and dialectal varieties of language). Our interests also extend to the pre-training of neural networks (e.g. the CamemBERT model), interpretability of neural approaches, language resource development (e.g. the OSCAR corpus, treebanks, parallel datasets and lexicons, but also historical corpora built using OCR and HTR applied to archives and other historical documents), evaluation, and information extraction and retrieval (especially from specialised corpora and historical documents).

Rachel Bawden
Researcher
Éric de la Clergerie
Researcher
Laurent Romary
Senior researcher
Nicolas Dahan
PhD student

Research Unit 3967 CLILLAC-ARP (Centre for Inter-language Linguistics, Lexicology, English Linguistics and Corpus-Workshop for Speech Research) is a unit of Université Paris-Cité, attached to the Language Sciences Doctoral School and supported by the linguistics, applied language and English departments. As part of MaTOS, CLILLAC-ARP provides expertise in specialised discourse, terminology and variation, neology, corpus linguistics, contrastive analysis of bilingual scientific discourse, specialised translation and post-editing, as well as human evaluation of machine translation and post-editing.

Nathalie Kübler
Professor
Alexandra Mestivier
Associate professor
Lichao Zhu
Associate professor
Maud Bénard
PhD student
José Cornejo Cárcamo
PhD student

The Institut de l'Information Scientifique et Technique (Institute for Scientific and Technical Information, Inist) is a CNRS support and research unit (UAR) specializing in scientific and technical information (STI). Inist's mission is to provide research units and research support services with tools and services for accessing, disseminating, exploiting, analyzing, mining, and enriching scientific data in the broadest sense (all information produced by research, including texts, documents, software, and publications). Inist's activities focus on three areas: "access to scientific information", "exploitation of research data", and "information analysis and mining". The unit's project is in line with the institutional and national policy of open science.

Jean-François Nominé
Research engineer
Mathilde Huguin
Research engineer
Manon Delorme
Project manager