
This page pertains to UD version 2.

Treebank Statistics: UD_Veps-VWT: POS Tags: NOUN

There are 144 NOUN lemmas (37%), 221 NOUN types (37%) and 310 NOUN tokens (24%). Out of 13 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
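As a rough sketch of how such counts could be derived, the snippet below tallies distinct lemmas, distinct word forms (types) and tokens per UPOS tag from CoNLL-U text. The function name `pos_counts` and the three-token sample are hypothetical: the sample rows are made up for illustration and only reuse word forms listed on this page. Whether the official statistics compare forms case-sensitively is an assumption here.

```python
from collections import defaultdict

def pos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for CoNLL-U text."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)  # assumption: forms compared case-sensitively
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}

# Illustrative rows only; heads and relations are placeholders.
SAMPLE = "\n".join("\t".join(row) for row in [
    ["1", "lapsed", "laps’", "NOUN", "_", "Case=Nom|Number=Plur", "0", "root", "_", "_"],
    ["2", "elo", "elo", "NOUN", "_", "Case=Nom|Number=Sing", "1", "conj", "_", "_"],
    ["3", "elod", "elo", "NOUN", "_", "Case=Par|Number=Sing", "1", "conj", "_", "_"],
    ["4", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"],
])

counts = pos_counts(SAMPLE)
```

Dividing each NOUN count by the corresponding total over all tags would give the shares reported in parentheses above.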

The 10 most frequent NOUN lemmas: elo, kel’, külä, ristit, rad, rahvaz, aig, laps’, škol, kanz

The 10 most frequent NOUN types: kelel, lapsed, ristitud, külän, rad, elo, jurid, kel’t, elod, kanzan

The most frequent ambiguous lemmas (only 2 observed): vepsläine (NOUN 7, ADJ 5), pol’ (ADP 1, NOUN 1)

The most frequent ambiguous types (only 2 observed): kerdan (ADV 1, NOUN 1), vepsläižid (ADJ 1, NOUN 1)

Morphology

The form / lemma ratio of NOUN is 1.534722 (the average of all parts of speech is 1.526854).
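The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given earlier on this page. A minimal check, assuming that is how the figure is defined:

```python
noun_types = 221   # distinct NOUN word forms reported on this page
noun_lemmas = 144  # distinct NOUN lemmas reported on this page

# Assumption: form / lemma ratio = types / lemmas.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 1.534722
```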

The 1st highest number of forms (6) was observed with the lemma “rahvaz”: rahvahad, rahvahan, rahvahaze, rahvahid, rahvast, rahvaz.

The 2nd highest number of forms (5) was observed with the lemma “elo”: elo, elod, eloho, elon, elos.

The 3rd highest number of forms (5) was observed with the lemma “kodima”: kodimad, kodimaha, kodimal, kodimale, kodiman.

NOUN occurs with 3 features: Case (310; 100% instances), Number (310; 100% instances), Clitic (1; 0% instances)

NOUN occurs with 16 feature-value pairs: Case=Abl, Case=Ade, Case=All, Case=Com, Case=Ela, Case=Ess, Case=Gen, Case=Ill, Case=Ine, Case=Nom, Case=Par, Case=Ter, Case=Tra, Clitic=Ki, Number=Plur, Number=Sing

NOUN occurs with 23 feature combinations. The most frequent feature combination is Case=Nom|Number=Plur (45 tokens). Examples: lapsed, ristitud, vanhembad, vepsläižed, Päžar’laižed, adivod, aigad, aldod, astjad, avtobusad
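The three kinds of counts above (features, feature-value pairs, feature combinations) can all be read off the FEATS column of a CoNLL-U file. The following is a small sketch under that assumption; `feature_stats` is a hypothetical helper, and the input list is illustrative rather than taken from the treebank.

```python
from collections import Counter

def feature_stats(feats_values):
    """Count features, feature-value pairs and full combinations.

    feats_values: iterable of FEATS strings such as "Case=Nom|Number=Plur",
    with "_" standing for a token that has no features.
    """
    features = Counter()
    pairs = Counter()
    combos = Counter()
    for feats in feats_values:
        combos[feats] += 1  # the whole string is one combination
        if feats == "_":
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos

# Illustrative input only.
features, pairs, combos = feature_stats([
    "Case=Nom|Number=Plur",
    "Case=Nom|Number=Plur",
    "Case=Gen|Number=Sing",
])
```

With real data, `len(features)`, `len(pairs)` and `len(combos)` would correspond to the three totals reported above, and `combos.most_common(1)` to the most frequent combination.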

Relations

NOUN nodes are attached to their parents using 13 different relations: obl (115; 37% instances), obj (50; 16% instances), nsubj (47; 15% instances), conj (31; 10% instances), nmod (27; 9% instances), nsubj:cop (19; 6% instances), root (14; 5% instances), xcomp (2; 1% instances), acl:relcl (1; 0% instances), appos (1; 0% instances), ccomp (1; 0% instances), csubj (1; 0% instances), parataxis (1; 0% instances)

Parents of NOUN nodes belong to 8 different parts of speech: VERB (207; 67% instances), NOUN (69; 22% instances), the artificial root, which has no tag (14; 5% instances), PRON (8; 3% instances), ADJ (4; 1% instances), PROPN (4; 1% instances), ADV (3; 1% instances), AUX (1; 0% instances)

89 (29%) NOUN nodes are leaves.

148 (48%) NOUN nodes have one child.

39 (13%) NOUN nodes have two children.

34 (11%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 7.
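The child-degree figures above can be sketched as follows: count how many tokens name each node as their head, then bucket the NOUN nodes by that count. The helper names and the toy tree are hypothetical and serve only to illustrate the computation.

```python
from collections import Counter

def child_degrees(sentence, upos="NOUN"):
    """Return the number of children of each node tagged `upos`.

    sentence: list of (token_id, upos, head_id) tuples for one tree,
    where head_id 0 denotes the artificial root.
    """
    n_children = Counter(head for _, _, head in sentence)
    return [n_children[tid] for tid, tag, _ in sentence if tag == upos]

def bucket(degrees):
    """Group degrees into 0, 1, 2 and 3 (meaning three or more)."""
    buckets = Counter()
    for d in degrees:
        buckets[min(d, 3)] += 1
    return buckets

# Toy tree: token 3 is the root; token 1 modifies the noun at token 2.
TOY = [
    (1, "ADJ", 2),
    (2, "NOUN", 3),
    (3, "VERB", 0),
    (4, "NOUN", 3),
    (5, "PUNCT", 3),
]

degrees = child_degrees(TOY)
```

Applied to every sentence of a treebank, `bucket` would yield the leaf, one-child, two-children and three-or-more counts, and `max` over all degrees the highest child degree.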

Children of NOUN nodes are attached using 19 different relations: nmod (109; 29% instances), amod (62; 17% instances), punct (43; 12% instances), conj (30; 8% instances), cc (22; 6% instances), cop (19; 5% instances), nsubj:cop (17; 5% instances), advmod (13; 4% instances), case (12; 3% instances), acl:relcl (11; 3% instances), nummod (10; 3% instances), obl (8; 2% instances), mark (3; 1% instances), advcl (2; 1% instances), appos (2; 1% instances), det (2; 1% instances), nsubj (2; 1% instances), parataxis (2; 1% instances), aux (1; 0% instances)

Children of NOUN nodes belong to 12 different parts of speech: PRON (78; 21% instances), NOUN (69; 19% instances), ADJ (63; 17% instances), PUNCT (43; 12% instances), CCONJ (22; 6% instances), PROPN (22; 6% instances), AUX (20; 5% instances), VERB (15; 4% instances), ADV (13; 4% instances), ADP (12; 3% instances), NUM (10; 3% instances), SCONJ (3; 1% instances)