This page pertains to UD version 2.

Treebank Statistics: UD_Old_East_Slavic-TOROT: POS Tags: NUM

There are 74 NUM lemmas (1%), 484 NUM types (1%) and 5060 NUM tokens (2%). Out of 14 observed tags, the rank of NUM is: 7 in number of lemmas, 8 in number of types and 11 in number of tokens.
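As an illustration of how the three counts above relate, the following minimal sketch counts lemmas, types and tokens for one tag in CoNLL-U text. It is not the script that generated this page. The function name `upos_counts`, the helper `make_row` and the toy sentence are hypothetical, and it assumes that a type is a distinct, case-sensitive word form.

```python
def upos_counts(conllu_text, upos="NUM"):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip malformed lines, multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:          # column 4 is UPOS
            tokens += 1
            types.add(cols[1])       # column 2 is FORM
            lemmas.add(cols[2])      # column 3 is LEMMA
    return len(lemmas), len(types), tokens


def make_row(tok_id, form, lemma, tag):
    """Build one 10-column CoNLL-U token line (toy values for unused columns)."""
    return "\t".join([str(tok_id), form, lemma, tag, "_", "_", "0", "root", "_", "_"])


# Hypothetical toy data, not taken from the treebank.
sample = "\n".join([
    "# sent_id = toy-1",
    make_row(1, "два", "дъва", "NUM"),
    make_row(2, "двѣ", "дъва", "NUM"),
    make_row(3, "единъ", "единъ", "NUM"),
    make_row(4, "два", "дъва", "NUM"),
    make_row(5, "и", "и", "CCONJ"),
])

counts = upos_counts(sample)
print(counts)  # (2, 3, 4): 2 lemmas, 3 types, 4 tokens
```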

The 10 most frequent NUM lemmas: шесть.тысячь, единъ, дъва, трие, дъвадесяти, шестьсътъ, четыре, пять, тридесяте, шесть

The 10 most frequent NUM types: ҂ѕ҃, к҃, х҃, ѕ҃, г҃, ф҃, в҃, л҃, м҃, единъ

The 10 most frequent ambiguous lemmas: съто (NUM 72, NOUN 1), пятьнадесять (NUM 47, ADJ 1), тысяща (NUM 34, ADJ 1), осмьсътъ (NUM 30, ADP 5), четырьнадесять (NUM 28, ADJ 2), осмьнадесять (NUM 22, ADJ 2), седмьнадесять (NUM 22, ADJ 1), шестьнадесять (NUM 19, ADJ 1), сорокъ (NUM 7, ADJ 1), другыи (ADJ 303, NUM 1)

The 10 most frequent ambiguous types: х҃ (NUM 195, PROPN 1), три (NUM 76, ADV 2), и҃ (NUM 67, CCONJ 1, PRON 1), а (CCONJ 3651, NUM 63, SCONJ 7, ADP 3, NOUN 3, PRON 2, ADV 1, VERB 1), и (CCONJ 16366, ADV 1250, PRON 569, NUM 60, ADP 38, VERB 4, ADJ 2, DET 1, NOUN 1), д (NUM 56, NOUN 5, CCONJ 3, ADP 1, ADV 1), е (NUM 56, PRON 40, AUX 6, NOUN 2), г (NUM 55, ADP 52), з (ADP 133, NUM 53), ѕ (NUM 53, ADP 2)

Morphology

The form / lemma ratio of NUM is 6.540541 (the average of all parts of speech is 3.947827).
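The ratio can be reproduced from the counts reported above, assuming it is simply the number of NUM types divided by the number of NUM lemmas (the variable names are illustrative):

```python
num_types = 484   # NUM types reported on this page
num_lemmas = 74   # NUM lemmas reported on this page

# Assumption: form / lemma ratio = distinct forms (types) per lemma.
ratio = num_types / num_lemmas
print(round(ratio, 6))  # 6.540541
```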

The 1st highest number of forms (68) was observed with the lemma “единъ”: а, а҃, дино, еди, един, едина, единаго, единаго], едине, единем, единема, единемъ, едино, единого, единои, едином, единомоу, единому, единомъ, единомь, единомѹ, единомꙋ, единою, единоѧ, едину, единъ, единым, единымъ, единыхъ, единыя, единыѧ, единѣ, единѣмъ, единѣм꙽, единѣхъ, единѹ, единꙋ, един꙽, едіному, едіномѹ, едінъ, едїнаго, едїне, едїного, едїнъ, єдино, єдинъ, ѥдин, ѥдина, ѥдини, ѥдино, ѥдиного, ѥдинои, ѥдином, ѥдиномȣ, ѥдиному, ѥдиномь, ѥдиномѹ, ѥдиноꙗ, ѥдину, ѥдинъ, ѥдины, ѥдиныи, ѥдинѣмъ, ѥдинѣмь, ѥдинѣхъ, ѥдинѹ, ѥдін.

The 2nd highest number of forms (47) was observed with the lemma “одинъ”: а, а҃, дним, одиного, одинои, одиномь, одиноѣ, одинъ, одинѡ, одинѹ, одно, одново, одного, однои, одномъ, одны, однѹ, одого, одїного, одїнъ, ѡдин, ѡдина, ѡдини, ѡдино, ѡдинова, ѡдиного, ѡдинои, ѡдином, ѡдиномоу, ѡдиною, ѡдинъ, ѡдинѣхъ, ѡдна, ѡднемь, ѡдно, ѡдново, ѡдного, ѡдное, ѡдному, ѡдну, ѡдны, ѡдным, ѡднѣ, ѡднѹ, ѡдіными, ѡдїну, ҃а.

The 3rd highest number of forms (25) was observed with the lemma “дъва”: .в҃, в, в҃, два, две, двем, двема, двоих, двоихъ, двоу, двою, дву, двух, двухъ, двѣ, двѣма, двѹ, двꙋ, дова, довѣ, дъва, дъвою, дъвѣ, дъвѣма, дъвѹ.

NUM occurs with 3 features: Case (1311; 26% instances), Gender (1311; 26% instances), Number (1311; 26% instances)

NUM occurs with 13 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Fem,Masc, Gender=Masc, Gender=Neut, Number=Dual, Number=Plur, Number=Sing

NUM occurs with 54 feature combinations. The most frequent feature combination is _ (3749 tokens). Examples: ҂ѕ҃, к҃, х҃, ѕ҃, г҃, ф҃, в҃, л҃, м҃, у҃

Relations

NUM nodes are attached to their parents using 21 different relations: conj (1882; 37% instances), nummod (1715; 34% instances), obl (455; 9% instances), root (259; 5% instances), nsubj (223; 4% instances), obj (176; 3% instances), nmod (107; 2% instances), appos (81; 2% instances), orphan (61; 1% instances), xcomp (50; 1% instances), dislocated (15; 0% instances), obl:arg (11; 0% instances), nsubj:pass (8; 0% instances), advcl (5; 0% instances), ccomp (3; 0% instances), parataxis (3; 0% instances), advcl:cmp (2; 0% instances), dep (1; 0% instances), fixed (1; 0% instances), obl:agent (1; 0% instances), vocative (1; 0% instances)
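The percentages above appear to be each relation's share of all NUM tokens, rounded to a whole percent. The sketch below checks that reading against the listed counts; the dictionary simply restates the figures from this page, and the rounding rule is an assumption.

```python
# Incoming relation counts for NUM nodes, as listed on this page.
relation_counts = {
    "conj": 1882, "nummod": 1715, "obl": 455, "root": 259, "nsubj": 223,
    "obj": 176, "nmod": 107, "appos": 81, "orphan": 61, "xcomp": 50,
    "dislocated": 15, "obl:arg": 11, "nsubj:pass": 8, "advcl": 5,
    "ccomp": 3, "parataxis": 3, "advcl:cmp": 2, "dep": 1, "fixed": 1,
    "obl:agent": 1, "vocative": 1,
}

# Every NUM token has exactly one incoming relation, so the counts
# should add up to the total number of NUM tokens.
total = sum(relation_counts.values())

# Assumption: percentages are rounded to the nearest whole percent.
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)                   # 5060
print(percentages["conj"])     # 37
print(percentages["nummod"])   # 34
```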

Parents of NUM nodes belong to 11 different parts of speech: NUM (1950; 39% instances), NOUN (1763; 35% instances), VERB (864; 17% instances), [root, no parent] (259; 5% instances), PROPN (67; 1% instances), PRON (50; 1% instances), AUX (49; 1% instances), ADJ (42; 1% instances), ADV (11; 0% instances), ADP (4; 0% instances), DET (1; 0% instances)

2938 (58%) NUM nodes are leaves.

721 (14%) NUM nodes have one child.

647 (13%) NUM nodes have two children.

754 (15%) NUM nodes have three or more children.

The highest child degree of a NUM node is 14.
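The four child-degree figures above partition the NUM tokens (2938 + 721 + 647 + 754 = 5060). A minimal sketch of how such a distribution could be derived from the HEAD column of one sentence follows. The function name `child_degree_histogram` and the toy sentence are hypothetical, and this is not the script that generated the page.

```python
from collections import Counter


def child_degree_histogram(rows, upos="NUM"):
    """Bucket nodes of one UPOS by number of children.

    rows: list of (id, upos, head) tuples for ONE sentence.
    Buckets are 0, 1, 2 and 3, where 3 stands for "three or more".
    """
    # Count how many tokens point to each head id.
    children = Counter(head for _, _, head in rows)
    hist = Counter()
    for tok_id, tag, _ in rows:
        if tag == upos:
            degree = children[tok_id]      # 0 if no token has this head
            hist[min(degree, 3)] += 1
    return hist


# Hypothetical toy sentence: (id, upos, head); head 0 is the artificial root.
toy = [
    (1, "NOUN", 2),
    (2, "NUM", 0),
    (3, "CCONJ", 4),
    (4, "NUM", 2),
    (5, "NOUN", 4),
    (6, "NUM", 2),
]

hist = child_degree_histogram(toy)
print(dict(hist))
```

In the toy sentence, token 2 has three children, token 4 has two and token 6 is a leaf, so each of the buckets 3, 2 and 0 receives one node.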

Children of NUM nodes are attached using 19 different relations: conj (1897; 42% instances), nmod (1230; 27% instances), case (411; 9% instances), cc (326; 7% instances), orphan (190; 4% instances), appos (131; 3% instances), advmod (90; 2% instances), obl (35; 1% instances), discourse (33; 1% instances), dislocated (33; 1% instances), nummod (29; 1% instances), nsubj (22; 0% instances), acl (20; 0% instances), cop (18; 0% instances), amod (15; 0% instances), det (11; 0% instances), mark (8; 0% instances), advcl (4; 0% instances), fixed (1; 0% instances)

Children of NUM nodes belong to 12 different parts of speech: NUM (1950; 43% instances), NOUN (1352; 30% instances), ADP (420; 9% instances), CCONJ (327; 7% instances), ADV (159; 4% instances), PROPN (88; 2% instances), ADJ (68; 2% instances), PRON (64; 1% instances), VERB (42; 1% instances), AUX (21; 0% instances), DET (10; 0% instances), SCONJ (3; 0% instances)