
This page pertains to UD version 2.

UD for Georgian

This is a work-in-progress overview of the UD annotation for Georgian.

Tokenization and Word Segmentation

Nominal Multiword expressions (MWEs)

Georgian multiword expressions (MWEs) are continuous or discontinuous sequences of words with the following compulsory properties:

ერთგვარი ჯაჭვური რეაქცია მოხდა.
A kind of chain reaction happened.
nsubj(მოხდა, რეაქცია)

Modifier Dependents

A nominal head does not take any core arguments but may be associated with different types of modifiers:

  1. An nmod is a nominal phrase modifying the head of another nominal phrase.
  2. An amod is an adjective modifying the head of a nominal phrase.
  3. A nummod is a numeral modifying the head of a nominal phrase.

ფაიფურის თიხა
nmod(თიხა, ფაიფურის)

არ არსებობს გამოუვალი მდგომარეობა
amod(მდგომარეობა-4, გამოუვალი-3)

მეცხრე ცა
nummod(ცა, მეცხრე)
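
The relation lines in the examples above follow the pattern relation(head, dependent), with an optional -N word index on each token. As a minimal sketch, assuming that pattern holds for every annotation line, they can be read into structured triples as follows. The names `Arc`, `split_index` and `parse_relation` are hypothetical helpers introduced for illustration; they are not part of any UD tool.

```python
import re
from typing import NamedTuple, Optional

# Matches lines such as "amod(მდგომარეობა-4, გამოუვალი-3)" or "nmod(თიხა, ფაიფურის)".
REL_LINE = re.compile(r"^\s*([a-z:]+)\(\s*(.+?)\s*,\s*(.+?)\s*\)\s*$")
# Matches a token that carries a trailing word index, e.g. "ცა-2".
INDEXED = re.compile(r"^(.*)-(\d+)$")


class Arc(NamedTuple):
    """One dependency arc (hypothetical container, for illustration only)."""
    relation: str
    head: str
    head_index: Optional[int]
    dependent: str
    dependent_index: Optional[int]


def split_index(token: str):
    """Split "form-N" into (form, N); return (form, None) if there is no index."""
    match = INDEXED.match(token)
    if match:
        return match.group(1), int(match.group(2))
    return token, None


def parse_relation(line: str) -> Arc:
    """Parse one "relation(head, dependent)" line into an Arc."""
    match = REL_LINE.match(line)
    if match is None:
        raise ValueError(f"not a relation line: {line!r}")
    relation, head, dependent = match.groups()
    head_form, head_index = split_index(head)
    dep_form, dep_index = split_index(dependent)
    return Arc(relation, head_form, head_index, dep_form, dep_index)


arc = parse_relation("amod(მდგომარეობა-4, გამოუვალი-3)")
print(arc.relation, arc.head, arc.dependent)
```

Keeping the head first in the triple mirrors the notation of the examples, so a reversed pair is easy to spot when the head turns out to be the modifier.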

Function Word Dependents

Nominals may also contain the following typical function word dependents:

მეცხრე ცაზე
nummod(ცა-2, მეცხრე-1)
case(ცა-2, ზე-3)
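
The indices in the example above treat ზე as a separate syntactic word (index 3) even though it is written together with ცა. One way this could be encoded in CoNLL-U is with a multiword token range line. The sketch below is an illustration only: whether Georgian UD data actually uses a multiword token here is an assumption, as is treating ცა as the root of this fragment, and LEMMA, UPOS, XPOS and FEATS are left unspecified (`_`) because the page does not give them. The function `conllu_rows` is a hypothetical helper.

```python
def conllu_rows(surface_tokens, splits, arcs, root_index):
    """Build tab-separated CoNLL-U rows for one sentence fragment.

    surface_tokens: orthographic tokens in order
    splits: maps a surface token to its syntactic words (assumed segmentation)
    arcs: maps a 1-based dependent index to (head index, relation)
    root_index: 1-based index of the word treated as root (an assumption here)
    """
    rows = []
    word_id = 0
    for token in surface_tokens:
        parts = splits.get(token, [token])
        if len(parts) > 1:
            # Range line for the multiword token; all other columns stay "_".
            first, last = word_id + 1, word_id + len(parts)
            rows.append("\t".join([f"{first}-{last}", token] + ["_"] * 8))
        for form in parts:
            word_id += 1
            if word_id == root_index:
                head, deprel = "0", "root"
            else:
                head_id, deprel = arcs[word_id]
                head = str(head_id)
            # Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
            rows.append("\t".join(
                [str(word_id), form, "_", "_", "_", "_", head, deprel, "_", "_"]
            ))
    return rows


rows = conllu_rows(
    surface_tokens=["მეცხრე", "ცაზე"],
    splits={"ცაზე": ["ცა", "ზე"]},            # assumed segmentation, taken from the example indices
    arcs={1: (2, "nummod"), 3: (2, "case")},  # relations as given in the example
    root_index=2,                             # assumption: ცა heads the fragment
)
print("\n".join(rows))
```

The range line keeps the written form ცაზე recoverable, while the dependency relations attach only to the syntactic words.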

Instruction: Describe the general rules for delimiting words (for example, based on whitespace and punctuation) and exceptions to these rules. Specify whether words with spaces and/or multiword tokens occur. Include links to further language-specific documentation if available.


Morphology

Tags


Instruction: Specify any unused tags. Explain what words are tagged as PART. Describe how the AUX-VERB and DET-PRON distinctions are drawn, and specify whether there are (de)verbal forms tagged as ADJ, ADV or NOUN. Include links to language-specific tag definitions if any.


Features

Lexical Features

Inflectional Features

Nominal Features
Verbal Features

Instruction: Describe inherent and inflectional features for major word classes (at least NOUN and VERB). Describe other noteworthy features. Include links to language-specific feature definitions if any.


Syntax

v-type —————— m-type ——————
NOM NOM (v-set)    
NOM NOM (v-set) + DAT DAT (m-set)  
NOM ERG (v-set) + DAT NOM (m-set)  
NOM ERG (v-set) + DAT NOM (m-set) + DAT DAT ( -a)
NOM ERG (v-set) + DAT NOM (-set) + DAT DAT (m- -a)

Instruction: Give criteria for identifying core arguments (subjects and objects), and describe the range of copula constructions in nonverbal clauses. List all subtype relations used. Include links to language-specific relations definitions if any.


Treebanks

There are no UD treebanks for Georgian.


Instruction: Treebank-specific pages are generated automatically from the README file in the treebank repository and from the data in the latest release. Link to the respective *-index.html page in the treebanks folder, using the language code and the treebank code in the file name.