
This page pertains to UD version 2.

Treebank Statistics: UD_Javanese-CSUI: POS Tags: NUM

There is 1 NUM lemma (6%), and there are 150 NUM types (4%) and 362 NUM tokens (3%). Out of 17 observed tags, the rank of NUM is: 9 in number of lemmas, 6 in number of types and 10 in number of tokens.
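As a rough illustration of how lemma, type and token counts of this kind can be derived from a CoNLL-U file, here is a minimal sketch. The `SAMPLE` fragment and the `pos_counts` helper are hypothetical and not taken from the treebank or its tooling, and the sketch assumes that types are counted as distinct, case-sensitive word forms.

```python
# Toy CoNLL-U fragment (hypothetical; NOT taken from UD_Javanese-CSUI).
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
SAMPLE = (
    "1\trong\t_\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tbuku\t_\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "3\tsiji\t_\tNUM\t_\tNumType=Card\t2\tconj\t_\t_\n"
    "\n"
    "1\tsiji\t_\tNUM\t_\tNumType=Card\t0\troot\t_\t_\n"
)


def pos_counts(conllu_text, upos):
    """Return (lemma count, type count, token count) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # distinct word forms
            lemmas.add(cols[2])  # distinct lemmas ("_" when unannotated)
    return len(lemmas), len(types), tokens


print(pos_counts(SAMPLE, "NUM"))
```

In the toy fragment all lemmas are `_`, so the lemma count is 1 regardless of how many distinct forms occur, which mirrors the single NUM lemma reported above.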

The 10 most frequent NUM lemmas: _

The 10 most frequent NUM types: siji, rong, sak, 15, 16, 2022, 1, 3, 6, telung

The 10 most frequent ambiguous lemmas: _ (NOUN 2867, PUNCT 2233, VERB 1952, PROPN 1573, PRON 961, ADV 798, ADP 748, ADJ 736, DET 701, NUM 362, AUX 340, SCONJ 314, CCONJ 306, PART 234, X 175, INTJ 32, SYM 12)

The 10 most frequent ambiguous types: 3 (NUM 9, PROPN 1), loro (ADJ 1, NUM 1), setengah (ADV 2, NUM 1)

Morphology

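The form / lemma ratio of NUM is 150.000000 (the average of all parts of speech is 238.235294). Because every NUM token carries the unspecified lemma “_”, this ratio is simply the number of NUM types.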
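The ratio can be sketched as the number of distinct forms divided by the number of distinct lemmas. This is an assumption about how the figure is computed, but it is consistent with 150 types over 1 lemma. The `form_lemma_ratio` helper is hypothetical, and the pairs below are a small illustrative subset of the NUM forms listed on this page.

```python
def form_lemma_ratio(pairs):
    """Distinct forms divided by distinct lemmas for (form, lemma) pairs."""
    forms = {form for form, _ in pairs}
    lemmas = {lemma for _, lemma in pairs}
    return len(forms) / len(lemmas)


# Illustrative subset only: three distinct forms sharing the lemma "_".
pairs = [("siji", "_"), ("rong", "_"), ("telung", "_"), ("siji", "_")]
print(f"{form_lemma_ratio(pairs):.6f}")
```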

The 1st highest number of forms (150) was observed with the lemma “_”: 002, 003, 004, 006, 007, 010, 1, 1.000, 10, 10.34, 11, 1100, 1101, 111, 12, 13, 130, 14, 1442, 1450, 15, 150, 16, 17, 1750, 18, 18.000, 1864, 1882, 19, 19.271, 1909, 1916, 1920, 1920-an, 1922, 1924, 1934, 1945, 1946, 1948, 1958, 1964, 1966, 1967, 1970-an, 1972, 1974, 1980, 1982, 1985, 1986, 1987, 1989, 1990, 1992, 1994, 1995, 1996, 1998, 1999, 2, 2.396, 20, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2010, 2011, 2012, 2015, 2019, 2021, 2022, 22, 23, 24, 26, 27, 3, 30, 300, 31, 33, 33.750.000, 36, 4, 40, 41,8, 42, 45, 46, 5, 50, 6, 63, 647,5, 7, 70, 705, 71, 75, 8, 80, 9, 90, 99, 99942, IV, Kapindho, Rp30.000, Tetelu, XVIII, atusan, enem, ewu, ewunan, juta, kupiya, las-lasan, lima, limang, loro, miliar, milyar, papat, paraga, patang, pitu, puluhan, rong, rusak, sak, sakloron, sepisan, seprapat, sepuluh, setengah, siji, siji-siji, sijining, telu, telung, tiga, yuta.

NUM occurs with 2 features: NumType (361; 100% instances), Polite (65; 18% instances)

NUM occurs with 3 feature-value pairs: NumType=Card, Polite=Form, Polite=Infm

NUM occurs with 4 feature combinations. The most frequent feature combination is NumType=Card (297 tokens). Examples: sak, siji, 15, 16, 2022, 1, 3, 6, 1946, rong

Relations

NUM nodes are attached to their parents using 14 different relations: nummod (250; 69% instances), flat (58; 16% instances), appos (10; 3% instances), conj (9; 2% instances), obl:tmod (7; 2% instances), nmod (6; 2% instances), obl (6; 2% instances), nsubj (5; 1% instances), root (5; 1% instances), xcomp (2; 1% instances), acl:relcl (1; 0% instances), nmod:tmod (1; 0% instances), nsubj:pass (1; 0% instances), obj (1; 0% instances)
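The percentages above appear to be each relation's count divided by the 362 NUM tokens, rounded to the nearest integer. The sketch below reproduces them from the counts listed here; the rounding rule and the `share` helper are assumptions, not a description of the actual generator.

```python
# Counts copied from the list of relations above.
deprel_counts = {
    "nummod": 250, "flat": 58, "appos": 10, "conj": 9, "obl:tmod": 7,
    "nmod": 6, "obl": 6, "nsubj": 5, "root": 5, "xcomp": 2,
    "acl:relcl": 1, "nmod:tmod": 1, "nsubj:pass": 1, "obj": 1,
}

total = sum(deprel_counts.values())  # should equal the number of NUM tokens


def share(count, total):
    """Percentage of `total`, rounded to the nearest integer."""
    return round(100 * count / total)


for rel, n in deprel_counts.items():
    print(f"{rel} ({n}; {share(n, total)}% instances)")
```

Relations with a single instance come out as 0% because 1/362 is below half a percent.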

Parents of NUM nodes belong to 9 different parts of speech: NOUN (187; 52% instances), NUM (70; 19% instances), PROPN (50; 14% instances), VERB (21; 6% instances), X (12; 3% instances), SYM (10; 3% instances), ADJ (6; 2% instances), no parent, i.e. the NUM node is the sentence root (5; 1% instances), PRON (1; 0% instances)

206 (57%) NUM nodes are leaves.

96 (27%) NUM nodes have one child.

34 (9%) NUM nodes have two children.

26 (7%) NUM nodes have three or more children.

The highest child degree of a NUM node is 8.
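The child-degree figures above can be sketched by counting, for each NUM node, how many other nodes in the same sentence name it in their HEAD column. The `SAMPLE` fragment and the `child_degrees` helper are hypothetical and only illustrate the idea on toy data.

```python
from collections import Counter

# Toy CoNLL-U fragment (hypothetical; NOT taken from UD_Javanese-CSUI).
SAMPLE = (
    "1\ttelung\t_\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tbuku\t_\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "1\tsiji\t_\tNUM\t_\tNumType=Card\t0\troot\t_\t_\n"
    "2\t,\t_\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
    "3\tloro\t_\tNUM\t_\tNumType=Card\t1\tconj\t_\t_\n"
)


def child_degrees(conllu_text, upos):
    """Map child degree -> number of nodes with the given UPOS tag."""
    degrees = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Keep only regular word lines (skip ranges and empty nodes).
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        n_children = Counter(r[6] for r in rows)  # HEAD id -> child count
        for r in rows:
            if r[3] == upos:
                degrees[n_children[r[0]]] += 1
    return degrees


print(dict(child_degrees(SAMPLE, "NUM")))
```

A degree of 0 corresponds to a leaf, so the four figures reported above (206, 96, 34 and 26) partition the 362 NUM nodes.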

Children of NUM nodes are attached using 19 different relations: punct (91; 34% instances), flat (82; 30% instances), advmod (21; 8% instances), det (15; 6% instances), case (10; 4% instances), conj (9; 3% instances), cc (8; 3% instances), nmod (8; 3% instances), clf (7; 3% instances), nsubj (6; 2% instances), nmod:lmod (3; 1% instances), nummod (2; 1% instances), parataxis (2; 1% instances), acl (1; 0% instances), advcl (1; 0% instances), advmod:emph (1; 0% instances), amod (1; 0% instances), nmod:poss (1; 0% instances), obl (1; 0% instances)

Children of NUM nodes belong to 12 different parts of speech: PUNCT (91; 34% instances), NUM (70; 26% instances), PROPN (25; 9% instances), NOUN (21; 8% instances), DET (15; 6% instances), ADJ (14; 5% instances), ADP (10; 4% instances), ADV (8; 3% instances), CCONJ (8; 3% instances), PART (3; 1% instances), VERB (3; 1% instances), PRON (2; 1% instances)