SEMCONST - SEMantic CONSTructor
TALARIS project -
SemConst is a program that builds semantic representations
for sentences. As input, it takes three kinds of linguistic resources:
- a metagrammar describing how to build a Tree Adjoining Grammar
whose trees are annotated with semantic information using the XMG formalism,
- a lexicon of lemmas indicating how to anchor the trees of the grammar,
- a lexicon of morphological items (these lexicons are defined using
a simple text format whose definition is given here).
As output, SemConst produces a flat semantic representation of the parsed sentence.
Note that this construction can also be performed on a whole corpus (i.e. a set of sentences).
The SEMCONST documentation is available as a wiki here
or as a [.pdf]
Moreover we suggest that you read the following papers:
- Large scale semantic construction for TAG - LACL 2005 [.pdf]
- SemTAG: a platform for specifying Tree Adjoining Grammars and performing TAG-based Semantic Construction - ACL 2007 (Demo session) [.pdf]
- SemTAG, une architecture pour le développement et l'utilisation de grammaires d'arbres adjoints à portée sémantique - TALN 2007 [.pdf]
The SEMCONST program is developed in Oz/Mozart and relies on several tools
for Natural Language Processing (such as the DyALog system). It is
available only for Linux platforms. All the tools it uses are freely
available; links are given below.
To use SEMCONST, you first need the Oz/Mozart system, which you can
download at the following address:
You also need to install the following software (please have a look at the SemConst wiki page
for information about installation):
- A Haskell compiler
(GHC version 6.4.1 or higher; note that from GHC 6.6 onwards, parsec is no longer in the standard library, so under e.g. Ubuntu or Debian you will need to install both the ghc and libghc6-parsec-dev packages)
- A Perl interpreter (part of most Linux platforms)
- An XSLT processor (such as xsltproc)
- SWI Prolog (NB: please use SWI Prolog, since the command-line interface may differ from one Prolog interpreter to another)
- The GNU-build tools (i.e. GNU make, aclocal, automake, autoconf)
- The XMG system (metagrammar compiler)
- The VIEWER (TAG explorer, part of the XMG-TOOLS)
- The Mathweb Xmlparser (XML parser written in Oz, available as a MOGUL package)
- The Inputsource Oz library (a class for reading data from an arbitrary sequence of heterogeneous sources such as files, URLs, strings and virtual strings, used by the XML parser)
- The lexConverter (lexicon conversion program written in Haskell)
- The DyALog system (version 1.12.0 or higher)
- The tag_utils (version 1.12 or higher, Perl utilities for DyALog-like TAG)
- The forest_utils (version 0.1 or higher, Perl utilities for DyALog-like parse forests)
Note that the Alpage tools (namely DyALog and forest_utils) can be installed using the alpi
installation tool (recommended).
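Before installing, it can help to check which of the prerequisites listed above are already on your PATH. The following is only a minimal sketch: the helper name missing_tools is hypothetical, and the executable names (ghc, perl, xsltproc, swipl, make, aclocal, automake, autoconf) are assumptions about how these tools are usually named on a Linux system. Your distribution may use different names, and the Oz, XMG and DyALog tools are not covered.

```shell
#!/bin/sh
# Hypothetical helper (not part of SemConst): print the name of every
# command in the argument list that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed executable names for some of the prerequisites listed above;
# adjust them to match your distribution.
missing=$(missing_tools ghc perl xsltproc swipl make aclocal automake autoconf)

if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All checked tools were found."
fi
```

The script only reports what it finds and does not install anything; minimum versions (for instance GHC 6.4.1 or DyALog 1.12.0) still have to be checked by hand.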
You can download the latest version of the sources from a Subversion repository with anonymous access. The command is the following (make sure you are in the directory where you want to put the sources):
svn checkout svn://scm.gforge.inria.fr/svn/paule/trunk/SemConst
Alternatively, you can download a tarball of the sources here.
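If you prefer to name the target directory explicitly rather than changing into it first, the checkout can be wrapped as in the sketch below. The function name fetch_semconst and the DEST and DRY_RUN variables are hypothetical and are not part of SemConst; only the repository URL comes from the command above. By default the sketch just prints the command it would run.

```shell
#!/bin/sh
# Repository URL taken from the checkout command given above.
REPO_URL="svn://scm.gforge.inria.fr/svn/paule/trunk/SemConst"

# Hypothetical helper: check out SemConst into <dest>/SemConst.
# With DRY_RUN=1 (the default here) it only prints the command.
fetch_semconst() {
    dest=$1
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "svn checkout $REPO_URL $dest/SemConst"
    else
        mkdir -p "$dest" && svn checkout "$REPO_URL" "$dest/SemConst"
    fi
}

# DEST is an illustrative variable; it defaults to $HOME/src.
fetch_semconst "${DEST:-$HOME/src}"
```

Run it with DRY_RUN=0 to perform the checkout for real; this assumes the svn client is installed.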
Once you have imported the sources, invoke:
chmod a+x install.sh
Opens SEMCONST's GUI for semantic construction in interactive mode.
Opens SEMCONST's GUI for semantic construction in corpus mode, i.e. to
perform semantic construction on a whole corpus (default).
Performs semantic construction on a directory of corpora (batch
processing, no GUI).
to set the metagrammar to use (if the metagrammar is split across
several files, give the valuation file)
to set the lemmas
to set the morphological lexicon
to set the corpus (corp or batch mode, in the latter it must be a directory)
to set the output (corp or batch mode, in the latter it must be a directory)
to print help
to activate semantic construction (default is only syntactic parsing)
To maintain a single lexicon format for both parsing and generation, we have developed a converter which takes as input a relatively intuitive text format and produces XML lexicons for DyALog-made TAG parsers and text lexicons for the GenI surface realiser.
This converter, namely lexConverter (written in Haskell), is available via a Subversion repository with anonymous access using the following command:
svn checkout svn://scm.gforge.inria.fr/svn/paule/trunk/lex2all
Alternatively, a tarball of the sources is available here.
Once you have the sources, you can compile and install it (assuming you
have a Haskell compiler and make) via:
sudo make install
Designing real scale TAGs
Generation with TAG
If you have any questions or comments, please send an email to:
claire . gardent _AT_ loria . fr
yannick . parmentier _AT_ univ-orleans . fr