PCEDT 1.0
Table of Contents
Documentation
  Overview
  Data
    Czech-English Penn Treebank
    Reader's Digest Parallel Corpus
    Czech Monolingual Corpus
    Dictionaries
    Data Sizes
  Tools
  References

Definitions of data types
  CSTS document type and csts.doctype
  FS format description

Licensing information
  PCEDT_license.html

Main README file
  README

Structure of the CD

PCEDT_CD_1.0/
|-- data
|   |-- PTB_corpus
|   |   |-- original [README]
|   |   |   |-- En_development (259)
|   |   |   |-- En_evaluation (256)
|   |   |   `-- En_training (48693)
|   |   |-- raw [README]
|   |   |   |-- Cz_development (259)
|   |   |   |-- Cz_evaluation (256)
|   |   |   |-- Cz_training (21113)
|   |   |   |-- En_development (259)
|   |   |   |-- En_evaluation (256)
|   |   |   `-- En_training (48693)
|   |   |-- reference_translations [README]
|   |   |   |-- En_development (4 x 259)
|   |   |   `-- En_evaluation (4 x 256)
|   |   |-- NIST_format [README]
|   |   |   |-- Cz_development (259)
|   |   |   |-- Cz_evaluation (256)
|   |   |   |-- Cz_training (21141)
|   |   |   |-- En_development (259)
|   |   |   |-- En_evaluation (256)
|   |   |   `-- En_training (21141)
|   |   |-- automatic_tagged [README]
|   |   |   |-- Cz_development (259)
|   |   |   |-- Cz_evaluation (256)
|   |   |   `-- Cz_training (21113)
|   |   |-- automatic_AR [README]
|   |   |   |-- Cz_development (259 / 256)
|   |   |   |-- Cz_evaluation (256 / 256)
|   |   |   |-- Cz_training (21113 / 21022)
|   |   |   |-- En_development (259)
|   |   |   |-- En_evaluation (256)
|   |   |   `-- En_training (48693)
|   |   |-- automatic_TR [README]
|   |   |   |-- Cz_development (259 / 256)
|   |   |   |-- Cz_evaluation (256 / 256)
|   |   |   |-- Cz_training (21113 / 21022)
|   |   |   |-- En_development (259)
|   |   |   |-- En_evaluation (256)
|   |   |   `-- En_training (48693)
|   |   `-- manual_TR [README]
|   |       |-- Cz_development (233)
|   |       |-- Cz_evaluation (239)
|   |       |-- En_development (248)
|   |       |-- En_evaluation (249)
|   |       `-- En_training (760)
|   |-- RD_corpus
|   |   `-- raw [README]
|   |       |-- Align (54091)
|   |       |-- Cz (59041)
|   |       `-- En (58656)
|   |-- Czech_raw_texts [README]
|   |   `-- PureData (2,385,000)
|   `-- Dictionaries [README]
|       |-- CzechEnglishProbDict.txt (46150 pairs)
|       |-- CzechEnglishFormsDict.txt (496673 pairs)
|       `-- slovnik_data.txt (115929 pairs)
|-- doc
|   |-- README
|   |-- PCEDT_main.html (this file)
|   |-- csts.html
|   |-- fs.html
|   `-- papers
|-- dtd
|   |-- csts.doctype
|   `-- mteval-v1.1.dtd
`-- tools [README]
    |-- SMT_QuickRun
    |   |-- SMT_QuickRun1.2.tgz
    |   `-- Doc
    |       `-- SMT_QuickRun.html
    |-- TrEd
    |   |-- tred-current.tar.gz
    |   |-- tred-dep-unix.tar.gz
    |   |-- tred_wininst_en.zip
    |   `-- Doc
    |       `-- TrEd.html
    |-- NetGraph
    |   |-- netgraph_client_application_bin_1.68.zip
    |   |-- netgraph_server_linux_i386.zip
    |   `-- Doc
    |       |-- netgraph_manual.html
    |       `-- netgraph_server_install.html
    `-- misc
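
The layout above can be checked against a local copy of the CD. The following is a minimal sketch, not part of the PCEDT distribution: the function name `missing_dirs` and the mount path in the usage note are hypothetical, and only the directory names are taken from the tree above.

```python
from pathlib import Path

# Relative directory paths copied from the tree above.
EXPECTED_DIRS = [
    "data/PTB_corpus",
    "data/RD_corpus",
    "data/Czech_raw_texts",
    "data/Dictionaries",
    "doc",
    "dtd",
    "tools",
]


def missing_dirs(cd_root):
    """Return the expected directories that are absent under cd_root.

    cd_root is assumed to point at the PCEDT_CD_1.0/ directory itself.
    """
    root = Path(cd_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

For example, `missing_dirs("/mnt/cdrom/PCEDT_CD_1.0")` (an assumed mount point) would return an empty list if every listed directory is present.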