The first Danish CLARIN project
The Danish CLARIN consortium (Centre for Danish Language resources and Technology Infrastructure for the Humanities) received a grant of 15 million DKK (approx. 2 million €) from the research infrastructure programme of the Danish Agency for Science, Technology and Innovation. The grant was for the construction of a Danish research infrastructure for the humanities that integrates written, spoken, and visual records into a coherent and systematic digital repository. The project ran from January 2008 until the end of 2010.
The acronym CLARIN stands for Common Language Resources and Technology Infrastructure. CLARIN is also the name of an EU project (see EU-CLARIN's website) with which the Danish CLARIN project is connected.
The project is continued in the DigHumLab project, theme 1. Read more at info.clarin.dk.
The vision is to create a researcher’s toolbox by establishing a number of digital Danish text, speech and visual resources with associated tools, and to integrate these resources into a web-based research environment. This will provide much-needed support for the Danish humanities and enhance their possibilities for European collaboration. The Danish CLARIN project will also improve the conditions for Danish language technology research and development by starting a structured approach to a Danish BLARK (Basic Language Resource Kit).
The Danish CLARIN project will follow standards and recommendations developed in the preparatory phase of the European CLARIN project (see www.clarin.eu), but it is important to realize that this project has not been granted as a preparatory phase project in parallel with the European project. It involves an independent Danish investment in the construction of a national infrastructure that will stand alone as a vital contribution to the Danish research enterprise. For this reason it was essential for the consortium to design the work packages so that they deliver not only the technical infrastructure but also as many types of content as possible.
- Semantiske sprogressourcer for computere [Semantic language resources for computers], Bolette S. Pedersen. Talk at Dansk Selskab for Datalogi, 4 May 2010.
- Encoding Attitude and Connotation in Wordnets, Braasch, A. & B.S. Pedersen. In: The 14th EURALEX International Congress, Leeuwarden, The Netherlands, 2010.
- Merging specialist taxonomies and folk taxonomies in wordnets - a case study of plants, animals and foods in the Danish wordnet, B. S. Pedersen, S. Nimb, A. Braasch. In: Proceedings from the Seventh International Conference on Language Resources and Evaluation, pp. 3181-3186. Malta, 2010.
- Quality indicators of LSP texts - selection and measurements. Measuring the terminological usefulness of documents for an LSP corpus, Jakob Halskov, Anna Braasch, Dorte Haltrup Hansen and Sussi Olsen. Proceedings of LREC 2010, pp. 2614-2620. Malta, 2010.
- Compiling, annotating and publishing corpora in DK-CLARIN, the Danish incarnation of the pan-European initiative for a common research infrastructure, Jakob Halskov and Jørg Asmussen, 2009.
- CLARIN in Denmark - European and Nordic perspectives, Hanne Fersøe and Bente Maegaard. Paper for the NODALIDA conference, 2009.
- DanNet - the Challenge of Compiling a WordNet for Danish by Reusing a Monolingual Dictionary. Article by B.S. Pedersen, S. Nimb, J. Asmussen, N. H. Sørensen, L. Trap-Jensen, H. Lorentzen in Language Resources and Evaluation Journal, September 2009.
- Hearing loss, perception and annotated corpora. Article by Hanne Fersøe on the spoken PAROLE corpus and its possible applications, EU-CLARIN newsletter no. 7, September 2009.
- The Knowledge for Everyman corpus in the Danish CLARIN project. Article by Hanne Fersøe in EU-CLARIN newsletter no. 4, March 2009.
- Talk on the Danish CLARIN project by Hanne Fersøe, 15 December 2008, at a Norwegian CLARIN meeting in Bergen.
- Distinguishing the communicative functions of gestures. K. Jokinen, C. Navarretta and P. Paggio. In A. Popescu-Belis and R. Stiefelhagen (eds.) Proceedings of 5th Joint Workshop on Machine Learning and Multimodal Interaction, Utrecht, September 2008, Springer, 38-49.
- Fri og bunden forskning om CLARIN-DK [Free and bound research on CLARIN-DK]. WP2.3 Knowledge for Everyman. Talk by Hanne Ruus, 9 September 2008, at the MUDS12 conference.
The work is organized in five thematically defined main work packages, three of which are dedicated to making content available, while one focuses on the technical infrastructure and one on project management.
Work package 1 (WP1) covers the overall coordination and management of the project. It will also address subjects such as copyright issues and the financing of the deployment phase that follows construction.
The project manager will ensure communication internally between the partners and between the centre and the Danish Agency for Science, Technology and Innovation.
The project manager, in cooperation with the work package managers, will ensure the quality of the data and follow up on project planning.
Work package 2 (WP2) will collect and annotate existing written language resources: contemporary and old, general language and specialised sublanguages, literary and professional, as well as parallel corpora with Danish as one of the languages. WP2 has six sub-work packages, and seven of the ten consortium members collaborate in these in different combinations.
Work package 3 (WP3) will collect and annotate three different spoken language corpora and develop associated tools. WP3 has three sub-work packages, which involve four of the ten consortium members.
Work package 4 (WP4) develops new technological resources and modifies existing ones; these are defined here as constructed data resources. The work comprises traditional and electronic dictionaries, as well as dictionaries and semantic wordnets meant for computer systems. It also comprises the linking between different dictionaries and between dictionaries and corpora. WP4 has two sub-work packages in which three of the ten consortium members collaborate.
The task of work package 5 (WP5) is to provide the technical framework for the infrastructure, including a single web user interface to serve as the Danish CLARIN platform. This platform will give access to all the tools and text resources of the infrastructure, as well as a personal workspace, communication facilities, user authentication and rights management, and search and retrieval facilities. WP5 has two sub-work packages to which all ten consortium members contribute. The three main technological centres responsible for the interoperability and functionality of the technical platform are the Royal Library (KB), the Society for Danish Language and Literature (DSL), and the Centre for Language Technology (CST).
University of Southern Denmark
Johannes Wagner, Professor, dr.phil.
University of Aarhus
Viggo Sørensen, Associate professor, mag.art.
Copenhagen Business School
Peter Juel Henrichsen, Associate professor
The Royal Library
Anders Sparre Conrad, Senior consultant
The National Museum of Denmark
Birgit Rønne, Coordinator
Society for Danish Language and Literature
Lars Trap-Jensen, Managing editor
Danish Language Council
Sabine Kirchmeier-Andersen, Director