Research interests and current projects

Computer science teaching

In France, computer science has long been seen as a purely technical domain that did not deserve a place in fundamental education. From the early nineties until 2011, there were no computer science classes in either primary or secondary education. In 2011, CS was re-introduced in high-school curricula as an optional topic, and I started training high-school teachers in it. In 2014, I joined the Maison pour la science in Orléans to co-conduct workshops on unplugged computer science. I was puzzled by the prejudices colleagues may have about computer science, which largely impede their learning.

When I moved back to Nancy in 2017, I got a position in the School of Education, where I had the chance to work further on this matter. Since 2018, initially within the PIAF EU Erasmus+ project (2018-2021), I have been working with colleagues in Nancy (Marie Duflot-Kremer), Liège (Brigitte Denis), Luxembourg (Robert Reuter) and Saarbrücken (Armin Weinberger) on the definition and implementation of a new competency framework for computational thinking, which would help to break down the above-mentioned prejudices and bridge the gap between using computers and understanding them (Parmentier et al. 2020).

Generation

Natural Language Generation (NLG) can be described as the task of automatically building natural language texts which verbalise some given input meaning. Nowadays, NLG systems build on machine learning techniques to extract meaningful statistical representations of language from large datasets. The efficiency of these systems relies heavily on the input data, among other factors. While several shared tasks exist in this domain, it is hard to compare these systems, as the main metrics used in the community (e.g. BLEU) have known limitations (e.g. regarding the handling of paraphrases or their precision). Together with colleagues (Claire Gardent and Anastasia Shimorina), I am working on the study of biases in metrics used in NLG (Shimorina et al. 2021).
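
To illustrate why n-gram overlap metrics struggle with paraphrases, here is a minimal sketch of the clipped n-gram precision that BLEU builds on. It is a simplification written for this page, not the metric implementation used in the cited work: full BLEU also combines several n-gram orders and applies a brevity penalty. The function names and example sentences are invented for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against ONE reference.

    Only the precision component of BLEU is computed here (assumption:
    single reference, no brevity penalty, no averaging over orders).
    """
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference ("clipping").
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return matched / total


# Hypothetical example sentences, invented for illustration.
reference = "the cat sat on the mat".split()
verbatim = "the cat sat on the mat".split()
paraphrase = "a feline was sitting on the rug".split()

for name, cand in [("verbatim", verbatim), ("paraphrase", paraphrase)]:
    p1 = modified_precision(cand, reference, 1)
    p2 = modified_precision(cand, reference, 2)
    print(f"{name}: unigram precision={p1:.2f}, bigram precision={p2:.2f}")
```

The paraphrase conveys roughly the same meaning as the reference but shares few surface n-grams with it, so its score drops sharply. This is the kind of limitation that makes system comparison difficult.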

Another topic of interest related to text generation is computer-aided language learning (CALL). Together with my colleague Claire Gardent, I am working on the design and implementation of a learning environment which would use text generation algorithms to automatically provide teachers and learners with suitable grammar exercises.

Grammar engineering

Grammar engineering is the task of designing and implementing linguistically motivated electronic descriptions of natural language (so-called grammars). These grammars are expressed within well-defined theoretical frameworks, and offer a fine-grained description of natural language. While grammars were first used to describe syntax, that is to say, the relations between constituents in a sentence, they often go beyond syntax and include e.g. semantic information.

I have been working with colleagues (see e.g. Crabbé et al. 2013; Petitjean, Duchier, and Parmentier 2016) on the definition and implementation of description languages for grammar engineering, that is, formal languages which help linguists to describe various dimensions of language (syntax, semantics, morphology). We are also interested in applying these description languages to the actual description of natural languages (such as Ikota (Duchier et al. 2012) or Arabic (Ben Khelil et al. 2016)).

Related projects include:

  • eXtensible Meta-Grammar 2 (XMG2)

  • Formal grammars working group led by Dr Nicola Lampitelli at the Laboratoire Ligérien de Linguistique

Parsing

Parsing (aka syntactic analysis) is the task of computing a representation of the relations between words in a string. Parsing usually relies on an (implicit or explicit) formal description of language (a grammar) and produces a tree structure (a constituency tree or a dependency structure, depending on the framework one is working with).
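
As a rough illustration of parsing with an explicit grammar, the sketch below runs the classic CKY algorithm over a toy context-free grammar in Chomsky normal form and returns a constituency tree as nested tuples. Note that this uses a plain context-free grammar, not the Tree-Adjoining Grammars discussed on this page. The grammar, the lexicon and the function name are hypothetical, and each word is assumed to have a single category.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (invented for illustration).
BINARY_RULES = {
    ("NP", "VP"): "S",
    ("Det", "N"): "NP",
    ("V", "NP"): "VP",
}
# Simplifying assumption: one category per word.
LEXICON = {
    "the": "Det",
    "dog": "N",
    "cat": "N",
    "chases": "V",
}


def cky_parse(words):
    """Return one constituency tree for `words` as nested tuples, or None."""
    n = len(words)
    # chart[i][j] maps a category to one tree spanning words[i:j].
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        cat = LEXICON.get(word)
        if cat is not None:
            chart[i][i + 1][cat] = (cat, word)
    # Combine adjacent spans bottom-up, from short spans to long ones.
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                left, right = chart[start][mid], chart[mid][end]
                for (lcat, ltree), (rcat, rtree) in product(left.items(),
                                                            right.items()):
                    parent = BINARY_RULES.get((lcat, rcat))
                    if parent is not None and parent not in chart[start][end]:
                        chart[start][end][parent] = (parent, ltree, rtree)
    return chart[0][n].get("S")


print(cky_parse("the dog chases the cat".split()))
print(cky_parse("dog the chases".split()))
```

The first call yields a tree rooted in S. The second yields None, since a generative grammar of this kind simply rejects strings it cannot derive.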

I have been working on parsing natural language with mildly context-sensitive grammars (namely Tree-Adjoining Grammars, TAG). The objectives are manifold, and include the following:

  • enhance practical TAG parsing, see e.g. (Gardent et al. 2014)

  • syntax-based semantic construction, following Montague’s legacy, see e.g. (Gardent and Parmentier 2005)

I also developed, together with colleagues (Duchier, Dao, and Parmentier 2014), a parsing prototype for Property Grammars, a constraint-based grammar formalism. Property Grammars differ from generative formalisms insofar as they can describe the syntax of ungrammatical or partially grammatical utterances, thus providing a formal framework for grammaticality judgement.
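
The idea of judging an utterance by the properties it satisfies and violates, rather than accepting or rejecting it outright, can be sketched as follows. This is a deliberately simplified illustration loosely inspired by Property Grammars. It is not the prototype mentioned above, and the property inventory, the noun-phrase properties and the function names are all assumptions made for the example.

```python
# Hypothetical properties for a noun phrase, invented for illustration.
# Each property is a (kind, arguments) pair.
NP_PROPERTIES = [
    ("obligation", ("N",)),        # an NP must contain a noun
    ("uniqueness", ("Det",)),      # at most one determiner
    ("linearity", ("Det", "N")),   # a determiner precedes the noun
    ("linearity", ("Adj", "N")),   # an adjective precedes the noun
]


def holds(prop, cats):
    """Check one property against a flat sequence of category labels."""
    kind, args = prop
    if kind == "obligation":
        return args[0] in cats
    if kind == "uniqueness":
        return cats.count(args[0]) <= 1
    if kind == "linearity":
        first, second = args
        firsts = [i for i, c in enumerate(cats) if c == first]
        seconds = [i for i, c in enumerate(cats) if c == second]
        # Vacuously true when either category is absent.
        return all(i < j for i in firsts for j in seconds)
    raise ValueError(f"unknown property kind: {kind}")


def evaluate(cats, properties=NP_PROPERTIES):
    """Split the properties into those satisfied and those violated."""
    satisfied = [p for p in properties if holds(p, cats)]
    violated = [p for p in properties if not holds(p, cats)]
    return satisfied, violated


for cats in (["Det", "Adj", "N"], ["N", "Det"], ["Det", "Det", "N"]):
    satisfied, violated = evaluate(cats)
    print(cats, "satisfied:", len(satisfied), "violated:", violated)
```

An ill-formed sequence such as N Det still receives a description (three properties satisfied, one violated), which is what makes a graded notion of grammaticality possible.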

Past projects

Multiword expressions

Multiword expressions (MWEs) are sequences of words with some unpredictable properties, such as to count somebody in (to include somebody) or to take a haircut (to suffer a loss). Processing such expressions is particularly difficult because of their highly heterogeneous behaviour at the lexical, syntactic and semantic levels.

During my participation in the PARSEME COST Action (Savary et al. 2015) led by Agata Savary, I worked on the representation of these expressions in linguistic resources and their impact on parsing (see e.g. Waszczuk, Savary, and Parmentier 2016). Related projects include:

Syntactic parsing with Range Concatenation Grammar

During a post-doctoral visit to Laura Kallmeyer's group at the University of Tübingen in 2007-2008, I worked on a parsing architecture for mildly context-sensitive formalisms. It uses Range Concatenation Grammar, a formalism which can represent all languages whose parsing complexity is polynomial, as a pivot formalism (Parmentier et al. 2008):

  • Tuebingen Linguistic Parsing Architecture (TuLiPA)

Semantic construction for Tree-Adjoining Grammar

During my PhD (2003-2007) under the supervision of Claire Gardent, I worked on the semi-automatic generation of real-size Tree-Adjoining Grammars, which led to the development of the XMG and SemConst software tools.

The former is a compiler for the XMG description language (a language for describing syntactic trees via reusable tree fragments and flat semantic representations), and the latter is a semantic wrapper for the DyALog system:

  • eXtensible Meta-Grammar (XMG)

  • Semantic Constructor (SemConst)

Supervised students

PhD candidates

Bachelor / Master students

  • Valadis Mastoras - MSc in Natural Language Processing at Université de Lorraine, Nancy, 2021. Topic: semi-automatic generation of grammar exercises. Now NLP engineer at Centre for Research and Technology Hellas (CERTH), Greece.

  • Mathilde Aguiar - 2nd year student at Polytech Grenoble Engineering School (~ BSc), Nancy, 2021. Topic: development of a user interface for a language learning environment with Python and Flask.

  • William Soto - BSc in Natural Language Processing at Université de Lorraine, Nancy, 2020. Topic: language detection and topic modelling from tweets. Co-supervision with Emmanuel Schang and Claire Gardent. Now PhD candidate at Université de Lorraine.

  • Paul Claude - Undergraduate studies in Computer Science (DUT Informatique) at Université de Lorraine, Nancy, 2020. Topic: design and implementation of an online multilingual document editing environment with Flask. Now Master's student at Université de Lorraine (training program for future school teachers).

  • Laurine Jeannot - BSc in Natural Language Processing at Université de Lorraine, Nancy, 2018. Topic: formal representation of multiword expressions. Co-supervision with Claire Gardent. Now PhD candidate at Université de Lorraine.

  • Guilherme Razet - Undergraduate studies in Computer Science for Humanities (Licence MIASHS) at Université de Lorraine, Nancy, 2018. Topic: creation of a test suite for semantic parsing of Arabic.

  • Simon Petitjean - MSc in Computer Science at Université d’Orléans, 2010. Topic: modular development of formal grammars. Co-supervision with Denys Duchier. Now research fellow at University of Düsseldorf, Germany.

  • Kilian Evang - BA in Computational Linguistics at Universität Tübingen, 2008. Topic: development of a plugin for the Eclipse IDE for metagrammar design. Co-supervision with Timm Lichte and Laura Kallmeyer. Now research fellow at University of Düsseldorf, Germany.

  • Johannes Dellert - BA in Computational Linguistics at Universität Tübingen, 2008. Topic: implementation of automata-based lexical selection within the TuLiPA parser. Co-supervision with Wolfgang Maier and Laura Kallmeyer. Now lecturer at University of Tübingen, Germany.

  • Brice Ambrosiak - Undergraduate Studies in Computer Science (Licence Mathématiques-Informatique) at Université Henri Poincaré, Nancy, 2007. Topic: development of a metagrammar explorer using Graphviz. Now senior software developer at Pictet Technologies, Luxembourg.

References

Ben Khelil, Chérifa, Denys Duchier, Yannick Parmentier, Chiraz Zribi, and Fériel Ben Fraj. 2016. “ArabTAG: From a Handcrafted to a Semi-Automatically Generated TAG.” In TAG+12: 12th International Workshop on Tree-Adjoining Grammars and Related Formalisms. Düsseldorf, Germany. https://hal.archives-ouvertes.fr/hal-01320995.

Crabbé, Benoît, Denys Duchier, Claire Gardent, Joseph Le Roux, and Yannick Parmentier. 2013. “XMG: eXtensible MetaGrammar.” Computational Linguistics 39 (3). Massachusetts Institute of Technology Press (MIT Press): 591–629. https://hal.archives-ouvertes.fr/hal-00768224.

Duchier, Denys, Thi-Bich-Hanh Dao, and Yannick Parmentier. 2014. “Model-Theory and Implementation of Property Grammars with Features.” Journal of Logic and Computation 24 (2). Oxford University Press (OUP): 491–509. doi:10.1093/logcom/exs080.

Duchier, Denys, Brunelle Magnana Ekoukou, Yannick Parmentier, Simon Petitjean, and Emmanuel Schang. 2012. “Describing Morphologically-Rich Languages Using Metagrammars: A Look at Verbs in Ikota.” In Workshop on “Language Technology for Normalisation of Less-Resourced Languages”, 8th Saltmil Workshop on Minority Languages and the 4th Workshop on African Language Technology, 55–60. Istanbul, Turkey. https://hal.archives-ouvertes.fr/hal-00688643.

Gardent, Claire, and Yannick Parmentier. 2005. “Large Scale Semantic Construction for Tree Adjoining Grammar.” In Logical Aspects in Computational Linguistics - LACL’05, edited by Philippe Blache, Edward Stabler, Joan Busquets, and Richard Moot, 3492:131–46. Lecture Notes in Computer Science. Springer. https://hal.archives-ouvertes.fr/inria-00000251.

Gardent, Claire, Yannick Parmentier, Guy Perrier, and Sylvain Schmitz. 2014. “Lexical Disambiguation in LTAG Using Left Context.” In Human Language Technology. Challenges for Computer Science and Linguistics. 5th Language and Technology Conference, LTC 2011, Poznan, Poland, November 25-27, 2011, Revised Selected Papers, edited by Zygmunt Vetulani and Joseph Mariani, 8387:67–79. Lecture Notes in Computer Science (LNCS) Series / Lecture Notes in Artificial Intelligence (LNAI) Subseries. Springer. doi:10.1007/978-3-319-08958-4_6.

Parmentier, Yannick, Laura Kallmeyer, Timm Lichte, Wolfgang Maier, and Johannes Dellert. 2008. “TuLiPA: A Syntax-Semantics Parsing Environment for Mildly Context-Sensitive Formalisms.” In 9th International Workshop on Tree-Adjoining Grammar and Related Formalisms (TAG+9), 121–28. Tübingen, Germany. https://hal.archives-ouvertes.fr/inria-00288429.

Parmentier, Y., R. Reuter, S. Higuet, L. Kataja, Y. Kreis, M. Duflot-Kremer, … B. Denis. 2020. “PIAF: Developing Computational and Algorithmic Thinking in Fundamental Education.” In AACE 2020 - EdMedia + Innovate Learning, 1:315–22. Amsterdam / Virtual, Netherlands: Association for the Advancement of Computing in Education (AACE), Waynesville, NC. https://hal.archives-ouvertes.fr/hal-02888504.

Petitjean, Simon, Denys Duchier, and Yannick Parmentier. 2016. “XMG2: Describing Description Languages.” In Logical Aspects of Computational Linguistics (LACL 2016), edited by Maxime Amblard, Philippe de Groote, Sylvain Pogodalla, and Christian Rétoré, 10054:255–72. Lecture Notes in Computer Science. Nancy, France: Springer-Verlag. doi:10.1007/978-3-662-53826-5_16.

Savary, Agata, Manfred Sailer, Yannick Parmentier, Michael Rosner, Victoria Rosén, Adam Przepiórkowski, Cvetana Krstev, et al. 2015. “PARSEME – PARSing and Multiword Expressions Within a European Multilingual Network.” In 7th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics (LTC 2015). Poznań, Poland. https://hal.archives-ouvertes.fr/hal-01223349.

Shimorina, Anastasia, Yannick Parmentier, and Claire Gardent. 2021. “An Error Analysis Framework for Shallow Surface Realisation.” Transactions of the Association for Computational Linguistics 9. https://hal.archives-ouvertes.fr/hal-03159422.

Waszczuk, Jakub, Agata Savary, and Yannick Parmentier. 2016. “Promoting Multiword Expressions in A* TAG Parsing.” In 26th International Conference on Computational Linguistics (COLING 2016). Osaka, Japan. https://hal.archives-ouvertes.fr/hal-01378903.

Last update: 2022-02-08