
This page pertains to UD version 2.

UD Tatar NMCTT

Language: Tatar (code: tt)
Family: Turkic, Northwestern

This treebank has been part of Universal Dependencies since the UD v2.9 release.

The following people have contributed to making this treebank part of UD: Chihiro Taguchi.

Repository: UD_Tatar-NMCTT
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.13

License: CC BY-SA 4.0

Genre: nonfiction, news

Questions, comments? General annotation questions (either Tatar-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on Github. If you want to collaborate, please contact [c • taguchi (æt) sms • ed • ac • uk]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.

Annotation Source
Lemmas annotated manually
UPOS annotated manually, natively in UD style
XPOS not available
Features annotated manually, natively in UD style
Relations annotated manually, natively in UD style

Description

UD Tatar-NMCTT is a manually annotated corpus of the Tatar language based on the text from Tatar-Inform (tatar-inform.tatar), an online news website.

UD Tatar-NMCTT is a corpus of the Tatar language, manually annotated by Chihiro Taguchi under the project “NAIST Multilingual Corpus” at Nara Institute of Science and Technology, Japan. The text is taken from the online news website Tatar-Inform. The articles cover a wide variety of topics, including politics, health, and incidents. When citing the text, it is recommended to give the source link of the article, because Russian federal law stipulates that any mass media outlet citing an article must show the link to the corresponding source article. The link is available as metadata in the corpus.
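Because the source link is stored as sentence-level metadata, it can be recovered from the comment lines of the CoNLL-U files. The following is a minimal sketch, not part of the treebank tooling. It relies only on the general CoNLL-U convention that comment lines start with `#` and usually take the form `# key = value`. The name of the key that holds the link is not stated on this page, so the sketch does not assume one and instead returns any metadata value that looks like a URL. The function names are hypothetical.

```python
from typing import Dict, Iterable, List


def read_sentence_metadata(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Collect `# key = value` comment lines for each sentence in CoNLL-U text."""
    sentences: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    has_tokens = False
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence block.
            if current or has_tokens:
                sentences.append(current)
            current, has_tokens = {}, False
        elif line.startswith("#"):
            # Split on the first "=" only, so values may contain "=".
            key, sep, value = line[1:].partition("=")
            if sep:
                current[key.strip()] = value.strip()
        else:
            has_tokens = True
    # Handle a final sentence that is not followed by a blank line.
    if current or has_tokens:
        sentences.append(current)
    return sentences


def source_links(metadata: Dict[str, str]) -> List[str]:
    """Return metadata values that look like URLs, whatever their key is called."""
    return [v for v in metadata.values() if v.startswith(("http://", "https://"))]
```

A typical use would be to open one of the treebank's `.conllu` files, pass the file object to `read_sentence_metadata`, and call `source_links` on each resulting dictionary to obtain the link to cite alongside the sentence.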

Acknowledgments

This contribution to UD would never have been possible without the generous cooperation of Zilya Mubarakshina (Tatar-Inform). The building of this corpus is funded by CICP of Nara Institute of Science and Technology.

References

Statistics of UD Tatar NMCTT

POS Tags

ADJ – ADP – ADV – AUX – CCONJ – DET – NOUN – NUM – PRON – PROPN – PUNCT – SCONJ – SYM – VERB

Features

Aspect – Case – Degree – Foreign – Mood – Number – Number[psor] – NumType – Person – Person[psor] – Polarity – PronType – Reflex – Tense – VerbForm – Voice

Relations

acl – advcl – advmod – advmod:emph – amod – appos – aux – case – cc – ccomp – compound – compound:lvc – conj – dep – det – discourse – fixed – flat – mark – nmod – nsubj – nummod – obj – obl – parataxis – punct – root – xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
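The restriction to verb parents and noun or pronoun children can be reproduced on any CoNLL-U file. The sketch below is illustrative and is not the script used to generate these statistics. It assumes the standard ten-column CoNLL-U layout (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The function name and the default child set `{"NOUN", "PRON"}` are assumptions; `PROPN` can be added to the set if proper nouns should count as nominal children.

```python
from collections import Counter
from typing import Iterable, List


def verb_nominal_relations(
    lines: Iterable[str],
    child_upos: frozenset = frozenset({"NOUN", "PRON"}),
) -> Counter:
    """Count DEPREL labels on edges whose parent is a VERB and whose child is nominal."""
    counts: Counter = Counter()
    sentence: List[List[str]] = []

    def flush() -> None:
        # Map token ID -> UPOS so each child's head can be looked up.
        upos_by_id = {row[0]: row[3] for row in sentence}
        for row in sentence:
            head, deprel = row[6], row[7]
            if row[3] in child_upos and upos_by_id.get(head) == "VERB":
                counts[deprel] += 1
        sentence.clear()

    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            flush()  # a blank line ends the sentence
        elif not line.startswith("#"):
            cols = line.split("\t")
            # Keep plain tokens only; skip multiword ranges (1-2) and empty nodes (1.1).
            if len(cols) == 10 and cols[0].isdigit():
                sentence.append(cols)
    flush()  # last sentence may lack a trailing blank line
    return counts
```

Passing an open `.conllu` file to `verb_nominal_relations` yields a `Counter` keyed by relation label, which can then be compared against the core and oblique relations listed for the treebank.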

Verbs with Reflexive Core Objects

Relations Overview