Slovene Dependency Treebank SDT V0.1
2006-01-06

This is the proto-release of the Slovene Dependency Treebank, SDT V0.1,
which contains the Prague Dependency Treebank-like annotation of the
first part of the Slovene translation of Orwell's "1984", taken from the
MULTEXT-East parallel corpus, V3.0, cf.
http://nl.ijs.si/ME/V3/ and
http://nl.ijs.si/ME/V3/doc/index.html#mtev3-doc-div2-id2305296

This release was made specifically for participation in the
CoNLL-X Shared Task: Multi-lingual Dependency Parsing
http://nextens.uvt.nl/~conll/

CoNLL-X
Tenth Conference on Computational Natural Language Learning
New York City, June 8-9, 2006
http://www.cnts.ua.ac.be/conll/

Please see sdtCoNLL-licence.html for licensing details.

This directory contains:

sdt-.xml           Source XML/TEI corpus with header and libraries
sdt--conll.tbl     Derived CoNLL-X shared task tabular format
sdt--report.txt    Some counts on the CoNLL-X file
tei2.dtd           TEI P4 DTD for the XML corpus
tei2conll.xsl      Script to convert TEI to CoNLL-X
w_under_c.xsl      Script to demote punctuation
Makefile           Convert with various options

===============================================================
Tomaž Erjavec                | Dept. of Knowledge Technologies
email: tomaz.erjavec@ijs.si  | Jozef Stefan Institute
www: http://nl.ijs.si/et/    | Jamova 39
fax: (+386 1) 477-3131       | SI-1000 Ljubljana, Slovenia
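
Reading the CoNLL-X tabular file (illustrative sketch)

The derived .tbl file follows the CoNLL-X shared task layout: one token per
line, ten tab-separated columns, and a blank line between sentences. The
Python sketch below shows one possible way to read such a file. It is not
part of this release; the function name read_conll_x and the two-token
sample are hypothetical and serve only as illustration.

```python
import io

# Column names as defined by the CoNLL-X shared task format.
CONLL_X_FIELDS = ["id", "form", "lemma", "cpostag", "postag",
                  "feats", "head", "deprel", "phead", "pdeprel"]


def read_conll_x(lines):
    """Yield one sentence at a time as a list of token dicts.

    `lines` is any iterable of text lines, e.g. an open file handle.
    (Hypothetical helper, not shipped with the treebank.)
    """
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        values = line.split("\t")
        sentence.append(dict(zip(CONLL_X_FIELDS, values)))
    # Emit the last sentence if the input lacks a trailing blank line.
    if sentence:
        yield sentence


# Made-up two-token sample; NOT taken from the treebank.
SAMPLE = (
    "1\tA\ta\tX\tX\t_\t2\tAtr\t_\t_\n"
    "2\tB\tb\tX\tX\t_\t0\tPred\t_\t_\n"
    "\n"
)
sentences = list(read_conll_x(io.StringIO(SAMPLE)))
```

To read the real file, pass an open file handle instead of the sample; the
appropriate character encoding should be checked against the file itself.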