
This page pertains to UD version 2.

Treebank Statistics: UD_Romanian-SiMoNERo: Features: Gender

This feature is universal. It occurs with 2 different values: Fem, Masc.

69360 tokens (48%) have a non-empty value of Gender. 14781 types (82%) occur at least once with a non-empty value of Gender. 7672 lemmas (72%) occur at least once with a non-empty value of Gender. The feature is used with 8 part-of-speech tags: NOUN (39982; 27% instances), ADJ (16686; 11% instances), DET (6907; 5% instances), VERB (3889; 3% instances), PRON (1065; 1% instances), NUM (416; 0% instances), AUX (402; 0% instances), PROPN (13; 0% instances).
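Counts like the ones above can be reproduced from a treebank's CoNLL-U file. The following is a minimal sketch, not the script that generated this page. It assumes the standard ten-column CoNLL-U layout (UPOS in column 4, FEATS in column 6). The names `parse_feats`, `gender_by_upos` and `SAMPLE` are hypothetical, and the three-token sample is toy data, not taken from the treebank.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def gender_by_upos(conllu_text):
    """Count, per UPOS tag, the tokens that carry a non-empty Gender value."""
    counts = Counter()
    total = 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        total += 1
        if "Gender" in parse_feats(cols[5]):
            counts[cols[3]] += 1
    return counts, total

# Toy sample (illustrative annotations only).
SAMPLE = "\n".join([
    "# text = cazul mare .",
    "1\tcazul\tcaz\tNOUN\t_\tCase=Nom|Definite=Def|Gender=Masc|Number=Sing\t0\troot\t_\t_",
    "2\tmare\tmare\tADJ\t_\tDefinite=Ind|Gender=Masc|Number=Sing\t1\tamod\t_\t_",
    "3\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_",
])

counts, total = gender_by_upos(SAMPLE)
for upos, n in counts.most_common():
    print(f"{upos}: {n} of {total} tokens")
```

Run over the whole treebank file, the per-tag counts would correspond to the figures listed per part-of-speech tag above.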

NOUN

39982 NOUN tokens (94% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (29256; 73%), Definite=Def (21800; 55%), Case=Nom (20629; 52%).
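The percentages in this list are consistent with using the Gender-bearing NOUN tokens as the denominator: 29256 / 39982 rounds to 73%. A minimal sketch of that computation follows. The function name `cooccurring_values` and the toy FEATS strings are hypothetical, and the sketch does not reproduce the EMPTY pseudo-values that appear for other tags.

```python
from collections import Counter

def cooccurring_values(feats_list):
    """Given FEATS strings for tokens of one UPOS tag, return
    ({'Feature=Value': (count, percent)}, number_of_tokens_with_gender).
    Percentages are relative to the tokens that have a Gender value."""
    counts = Counter()
    with_gender = 0
    for feats in feats_list:
        pairs = [] if feats == "_" else feats.split("|")
        if not any(p.startswith("Gender=") for p in pairs):
            continue  # token has no Gender, so it is not counted
        with_gender += 1
        counts.update(p for p in pairs if not p.startswith("Gender="))
    table = {fv: (n, round(100 * n / with_gender)) for fv, n in counts.most_common()}
    return table, with_gender

# Toy input (illustrative only).
toy = [
    "Case=Nom|Definite=Def|Gender=Masc|Number=Sing",
    "Definite=Ind|Gender=Masc|Number=Sing",
    "Definite=Ind|Gender=Fem|Number=Plur",
    "_",
]
table, with_gender = cooccurring_values(toy)
for fv, (n, pct) in table.items():
    print(f"{fv} ({n}; {pct}%)")
```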

NOUN tokens may have the following values of Gender:

Paradigm caz (values: Masc, Fem)
Case=Gen|Definite=Def|Number=Sing: cazului
Case=Gen|Definite=Def|Number=Plur: cazurilor
Case=Nom|Definite=Def|Number=Sing: cazul
Case=Nom|Definite=Def|Number=Plur: cazurile
Definite=Ind|Number=Sing: caz
Definite=Ind|Number=Plur: cazuri

Gender seems to be a lexical feature of NOUN. 94% of lemmas (4053) occur with only one value of Gender.
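The share of single-valued lemmas can be computed by collecting the set of Gender values seen for each lemma. The sketch below assumes the input is a list of (lemma, Gender) pairs for tokens of one part-of-speech tag. The name `single_gender_share` and the toy pairs are hypothetical, and the page does not state what share is required before a feature is called lexical.

```python
from collections import defaultdict

def single_gender_share(pairs):
    """Return (share, count) of lemmas observed with exactly one Gender value."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single / len(values), single

# Toy data (illustrative only): 'caz' appears with both values,
# the other two lemmas with one value each.
toy_pairs = [
    ("caz", "Masc"), ("caz", "Fem"),
    ("casă", "Fem"), ("casă", "Fem"),
    ("om", "Masc"),
]
share, single = single_gender_share(toy_pairs)
print(f"{share:.0%} of lemmas ({single}) occur with only one value of Gender")
```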

ADJ

16686 ADJ tokens (98% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (16647; 100%), Definite=Ind (16460; 99%), Number=Sing (11573; 69%), Case=EMPTY (9861; 59%).

ADJ tokens may have the following values of Gender:

Paradigm mare (values: Masc, Fem)
Case=Gen|Definite=Def|Number=Sing: marii
Case=Gen|Definite=Ind|Number=Sing: mari
Case=Nom|Definite=Def|Number=Sing: Marele (Masc), marea (Fem)
Case=Nom|Definite=Def|Number=Plur: marile
Case=Nom|Definite=Ind|Number=Sing: mare
Definite=Ind|Number=Sing: mare

DET

6907 DET tokens (93% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Position=EMPTY (5899; 85%), Number=Sing (5606; 81%), Person=EMPTY (5475; 79%), Poss=EMPTY (3896; 56%).

DET tokens may have the following values of Gender:

Paradigm al (values: Masc, Fem)
Number=Sing: al (Masc), a (Fem)
Number=Plur: ai (Masc), ale (Fem)

VERB

3889 VERB tokens (38% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (3889; 100%), Person=EMPTY (3889; 100%), Tense=EMPTY (3889; 100%), VerbForm=Part (3889; 100%), Number=Sing (2715; 70%).

VERB tokens may have the following values of Gender:

Paradigm avea (values: Masc, Fem)
Number=Sing: avut (Masc), avută (Fem)
Number=Plur: avute

PRON

1065 PRON tokens (25% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Person=3 (1065; 100%), Reflex=EMPTY (1065; 100%), Case=Nom (879; 83%), Strength=EMPTY (858; 81%), PronType=Dem (691; 65%), Number=Sing (648; 61%).

PRON tokens may have the following values of Gender:

Paradigm care (values: Masc, Fem)
căruia (Masc), căreia (Fem)

NUM

416 NUM tokens (9% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (381; 92%), NumType=Ord (257; 62%), Number=Plur (210; 50%).

NUM tokens may have the following values of Gender:

Paradigm doi (values: Masc, Fem)
doi (Masc), două (Fem)

Gender seems to be a lexical feature of NUM. 92% of lemmas (36) occur with only one value of Gender.

AUX

402 AUX tokens (8% of all AUX tokens) have a non-empty value of Gender.

The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (402; 100%), Number=Sing (402; 100%), Person=EMPTY (402; 100%), Tense=EMPTY (402; 100%), VerbForm=Part (402; 100%).

AUX tokens may have the following values of Gender:

PROPN

13 PROPN tokens (2% of all PROPN tokens) have a non-empty value of Gender.

PROPN tokens may have the following values of Gender:

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[amod]–> ADJ (13586; 96%), NOUN –[nmod]–> NOUN (8534; 50%), NOUN –[det]–> DET (5035; 74%), NOUN –[conj]–> NOUN (2752; 64%), VERB –[nsubj:pass]–> NOUN (753; 62%), ADJ –[nsubj]–> NOUN (633; 91%), ADJ –[conj]–> ADJ (630; 93%), NOUN –[acl]–> ADJ (376; 89%), VERB –[obl:agent]–> NOUN (237; 54%), ADJ –[det]–> DET (169; 93%).
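Agreement counts of this kind can be sketched by walking the dependency edges of each sentence and comparing the Gender values of parent and child. The code below is an illustration under stated assumptions, not the procedure used to build this page. It assumes the percentage is taken over edges of the same type where both nodes have a Gender value, which the page does not spell out. The names `feats_dict`, `gender_agreement` and the toy rows are hypothetical.

```python
from collections import Counter

def feats_dict(feats):
    """Parse a CoNLL-U FEATS string into a dict ('_' means no features)."""
    return {} if feats == "_" else dict(p.split("=", 1) for p in feats.split("|"))

def gender_agreement(rows):
    """rows: (id, upos, feats, head, deprel) tuples for one sentence.
    Returns two Counters keyed by (parent UPOS, deprel, child UPOS):
    edges that agree in Gender, and edges where both nodes have Gender."""
    by_id = {row[0]: row for row in rows}
    agree, both = Counter(), Counter()
    for tid, upos, feats, head, deprel in rows:
        parent = by_id.get(head)
        if parent is None:
            continue  # head 0 is the artificial root
        child_gender = feats_dict(feats).get("Gender")
        parent_gender = feats_dict(parent[2]).get("Gender")
        if child_gender is None or parent_gender is None:
            continue  # assumption: only edges with Gender on both ends count
        key = (parent[1], deprel, upos)
        both[key] += 1
        if child_gender == parent_gender:
            agree[key] += 1
    return agree, both

# Toy sentence (illustrative only): one agreeing amod edge, one disagreeing nmod edge.
toy_rows = [
    (1, "NOUN", "Gender=Fem|Number=Sing", 0, "root"),
    (2, "ADJ", "Gender=Fem|Number=Sing", 1, "amod"),
    (3, "NOUN", "Gender=Masc|Number=Sing", 1, "nmod"),
]
agree, both = gender_agreement(toy_rows)
for (p, rel, c), n in agree.most_common():
    print(f"{p} -[{rel}]-> {c} ({n}; {round(100 * n / both[(p, rel, c)])}%)")
```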