Joseph Mariani

Born	1 February 1950 (age 74)
Nationality	French
Occupation	Researcher in Computer Science

Joseph Mariani (born Joseph-Jean Mariani; 1 February 1950) is a French computer science researcher and pioneer in the field of speech processing.

Education and career

After obtaining a Doctor of Engineering degree in 1977 from the Pierre and Marie Curie University, Joseph Mariani joined the National Center for Scientific Research (CNRS) as a researcher in the Computer Science Laboratory for Mechanics and Engineering Sciences (LIMSI). He headed the Speech Communication group from 1982 to 1985, then spent 1985–1986 in the United States as an invited researcher at the IBM T.J. Watson Research Center (Yorktown Heights, NY). Back in France, he was in charge of the Human-Machine Communication Department from 1987 to 2001 and was Director of LIMSI from 1989 to 2000. Later, he was named Director of the Department of Information and Communication Technologies at the Ministry of Research. Within the Ministry, he created the Techno-Langue and Techno-Vision programs on the development and evaluation of language and vision technologies.

During this time, he was named President of the European Language Resources Association (ELRA) and was on the boards of several organizations including the ANFr, the IGN, the OST and INRIA. He participated in the creation of many associations and international conferences such as ELSNET, COCOSDA, ESCA/ISCA, ELRA and LREC.

From 2006 through December 2013, he was director of the Institute for Multilingual and Multimedia Information (IMMI), a CNRS Mixed International Unit, part of the Quaero Program, a collaboration between LIMSI, the Karlsruhe Institute of Technology (KIT) and RWTH Aachen University. In February 2016, he was named Emeritus Senior Researcher by the CNRS.

Research areas

Mariani's research activities mainly concern Human-Machine Communication, both spoken and written, within the domain of Natural Language Processing.

Early in his career, he concentrated on automatic speech recognition and signal processing.

In the early 1980s, within the evaluation activities of the NATO RSG-10 working group, Joseph Mariani was already using the term “evaluation paradigm” to denote an open evaluation campaign in which systems are treated as black boxes, their performance is measured quantitatively with common metrics on shared data, and the results are then combined and compared, an approach now referred to as a “shared task”. This evaluation paradigm supported the continuous improvement of speech processing and the eventual appearance of voice assistants such as Siri, Cortana, Echo and Google Voice.

He was involved in NIST² becoming the center of automatic speech and text processing evaluation activities in the US in 1987. In 1994, with Robert Martin, then Director of the Institut National de la Langue Française (INaLF), he organized the first open francophone evaluation of morphosyntactic analyzers for French text, with the support of two CNRS departments, Humanities and Social Sciences and Engineering Sciences. The same year, he helped start a linguistic engineering program run by Aupelf-Uref (now AUF, the Francophone University Association) and coordinated by the Francophone Network on Language Engineering (FRANCIL) to strengthen francophone activities in this area. The program encompassed Concerted Research Actions (CRAs), a major initiative applying the evaluation paradigm to text and speech⁴. In the early 2000s, he contributed to a major publication on automatic speech processing: Spoken Language Processing⁵.

Between 2000 and 2010, his activities focused on multilingualism with the development of language matrices for the 24 languages of the European Union⁶. Later he worked on the publication of the META-NET White Paper Series⁷ in order to establish an inventory of the resources available for French (dictionaries, grammars and programs).

Since 2010, he has worked on the automatic processing of regional languages⁸ and is interested in ethical problems related to the use of computers in daily life.

Since 2013⁹, he has collected and studied articles from the whole field of natural language processing, including speech processing and information retrieval. This work has been carried out within the framework of the NLP4NLP project¹⁰, which began with the ISCA archives, then added those of LREC¹¹, TALN and IEEE, and later other conferences and journals such as TREC. After this collection phase, which for the first time gathered a major part of the publications in the field, the publications were automatically analyzed from several points of view. First, all of the technical terms were extracted and compiled in a lexicon. Second, each lexical entry was attributed to the author who first used it. This is an innovation¹² in scientific publication. The goal was to understand the mechanisms that influence the domain and thus to identify current and future trends. The work covered the creation of technical terms and their evolution (appearance, eventual decay and resurgence), as with the term “neural networks”. A further component was a predictive analysis, which builds a statistical representation of the use of technical terms in order to predict their use over the following four years. The study also examined the influence of one conference on another, as well as plagiarism and reuse in scientific publications¹³.

A full synthesis of NLP4NLP was published in 2019 as a pair of articles in Frontiers in Research Metrics and Analytics.[1][2] Starting from this first 50-year analysis (1965–2015), a follow-up study was then conducted to cover five more years (2016–2020).[3] It identified profound changes in research topics, the emergence of a new generation of authors, and the appearance of new publications around artificial intelligence, neural networks, machine learning, and word embedding.
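The first-use attribution step described above can be illustrated with a minimal sketch. It is not the NLP4NLP implementation: the `Paper` record, the `first_users` function and the sample data are all hypothetical placeholders, and the sketch assumes that technical terms have already been extracted for each paper.

```python
from collections import namedtuple

# Hypothetical record: publication year, author names, extracted technical terms.
Paper = namedtuple("Paper", ["year", "authors", "terms"])

def first_users(papers):
    """Map each term to the (year, authors) of the earliest paper using it."""
    first = {}
    # Process papers chronologically so the first occurrence of a term wins.
    for paper in sorted(papers, key=lambda p: p.year):
        for term in paper.terms:
            first.setdefault(term, (paper.year, paper.authors))
    return first

# Made-up sample data, for illustration only.
papers = [
    Paper(1992, ("B. Author",), {"neural networks", "hidden Markov model"}),
    Paper(1988, ("A. Author",), {"neural networks"}),
    Paper(2014, ("C. Author",), {"word embedding", "neural networks"}),
]
result = first_users(papers)
```

Sorting by year before scanning keeps the logic to a single pass; papers from the same year are credited in input order, which a real study would need to resolve more carefully.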

Distinctions

Joseph Mariani was made a Knight of the French National Order of Merit (1985) and an Officer of the Ordre des Arts et des Lettres (2016). He is an honorary member of the Francophone Association for Speech Communication (AFCP), a fellow and life member of ISCA, where he received the Special Service Medal in 1999, and has been honorary president of ELRA since 2010.

Bibliography

Joseph Mariani is an author, coauthor or editor of over 500 publications.

References

[1] Mariani, Joseph; Francopoulo, Gil; Paroubek, Patrick (2019), "The NLP4NLP Corpus (I): 50 Years of Publication Collaboration and Citation in Speech and Language Processing", Frontiers in Research Metrics and Analytics, 3, doi:10.3389/frma.2018.00036
[2] Mariani, Joseph; Francopoulo, Gil; Paroubek, Patrick; Vernier, Frédéric (2019), "The NLP4NLP Corpus (II): 50 Years of Research in Speech and Language Processing", Frontiers in Research Metrics and Analytics, 3, doi:10.3389/frma.2018.00037
[3] Mariani, Joseph; Francopoulo, Gil; Paroubek, Patrick; Vernier, Frédéric (2022), "NLP4NLP+5: The Deep (R)evolution in Speech and Language Processing", Frontiers in Research Metrics and Analytics, 7, doi:10.3389/frma.2022.863126, PMC 9363593

¹ Jean-Sylvain Liénard, Joseph Mariani, Système de reconnaissance de mots isolés: MOISE, Registered Technical Report ANVAR No 50312, June 1980
² David Pallett, The NIST Role in Automatic Speech Recognition Benchmark Tests, LREC 1998
³ Ralph Grishman, Beth Sundheim, Message Understanding Conference-6: A Brief History, COLING 1996
⁴ Survey of the State of the Art in Human Language Technology
⁵ Spoken Language Processing
⁶ Joseph Mariani, Gil Francopoulo, Language Matrices and the Language Resource Impact, in Language Production, Cognition and the lexicon, edited by Gala, Rapp, Bel-Enguix, Springer
⁷ Joseph Mariani, Patrick Paroubek, Gil Francopoulo, Aurélien Max, François Yvon, Pierre Zweigenbaum, META-NET White Paper Series: French, Springer
⁸ Joseph Mariani, Technologies de la langue: état des lieux, in Les Technologies pour les langues régionales de France, conference of 19 and 20 February 2015 organized by the DGLFLF
⁹ Joseph Mariani, Patrick Paroubek, Gil Francopoulo, Marine Delaborde, Rediscovering 25 Years of Discoveries in Spoken Language Processing: A Preliminary ISCA Archive Analysis
¹⁰ Gil Francopoulo, Joseph Mariani, Patrick Paroubek, NLP4NLP: The Cobbler's Children Won't Go Unshod, D-Lib Magazine: The Magazine of Digital Library Research, November 2015
¹¹ Joseph Mariani, Patrick Paroubek, Gil Francopoulo, Olivier Hamon, Rediscovering 15 Years of Discoveries in Language Resources and Evaluation: The LREC Anthology Analysis, LREC 2014
¹² Gil Francopoulo, Joseph Mariani, Patrick Paroubek, Text Mining for Notability Computation, LREC 2016, Workshop on Cross-Platform Text-Mining and Natural Language Processing Interoperability
¹³ Gil Francopoulo, Joseph Mariani, Patrick Paroubek, A Study of Reuse and Plagiarism in LREC papers, LREC 2016, http://www.lrec-conf.org/proceedings/lrec2016/index.html

External links

Joseph Mariani on the LIMSI website https://perso.limsi.fr/mariani/
