Chalmers tekniska högskola, Institutionen för data- och informationsteknik, Datavetenskap (Chalmers), Chalmers University of Technology, Department of Computer Science and Engineering, Computing Science (Chalmers)
Natural languages have been a subject of study for centuries and remain a hot topic today. The demand for computer systems that can communicate directly in natural language poses new challenges: computational resources such as grammars and lexicons are needed, along with efficient processing tools. Grammars are written as computer programs in declarative domain-specific languages. Like any other programming language, these languages require mature compilers and efficient runtime systems. The topic of this thesis is the runtime system for the Grammatical Framework (GF) language. The first part of the thesis describes the semantics of the Portable Grammar Format (PGF). This is a low-level format which plays the same role for GF as JVM bytecode does for Java. The representation is designed to be as simple as possible, so that interpreters for it are easy to write. The second part presents the incremental parsing algorithm for PGF. Until now, parser performance has always been a bottleneck in GF. The new parser is orders of magnitude faster than any of the previous implementations. The last contribution of the thesis is the development of the Bulgarian resource grammar. Bulgarian is the first Balkan language in the resource library, and this is the first open-source grammar for Bulgarian. The grammar development also served as an important benchmark for the development of the parsing algorithm.