This page pertains to UD version 2.

Treebank Statistics: UD_Latvian-LVTB: POS Tags: NUM

There are 674 NUM lemmas (3%), 745 NUM types (1%) and 4160 NUM tokens (1%). Out of 17 observed tags, the rank of NUM is: 7 in number of lemmas, 7 in number of types and 14 in number of tokens.
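The three counts above can be reproduced from a CoNLL-U file along the following lines. This is a minimal sketch, not the script that generated this page: the function name `upos_counts`, the helper `_row` and the three-token sample are made up for illustration, and it assumes that types are counted as case-sensitive word forms.

```python
from collections import Counter

def upos_counts(conllu_text, upos="NUM"):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag."""
    lemmas, types = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:          # column 4 is UPOS
            types[cols[1]] += 1      # column 2 is FORM
            lemmas[cols[2]] += 1     # column 3 is LEMMA
    return len(lemmas), len(types), sum(types.values())

# Hypothetical sample rows, not taken from the treebank.
def _row(idx, form, lemma, upos):
    return "\t".join([str(idx), form, lemma, upos, "_", "_", "_", "_", "_", "_"])

SAMPLE = "\n".join([
    "# text = (made-up sample)",
    _row(1, "vienu", "viens", "NUM"),
    _row(2, "viens", "viens", "NUM"),
    _row(3, "divas", "divi", "NUM"),
    _row(4, "grāmatas", "grāmata", "NOUN"),
])

print(upos_counts(SAMPLE))  # (2, 3, 3)
```

Running the same function once per tag and sorting the results would give the ranks reported above.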

The 10 most frequent NUM lemmas: viens, divi, trīs, otrs, pieci, četri, seši, desmit, 20, septiņi

The 10 most frequent NUM types: viens, trīs, vienu, viena, divas, vienā, divi, otru, desmit, 20

The most frequent ambiguous lemmas (6 listed): otrs (NUM 193, ADJ 2), i (PART 6, CCONJ 2, NUM 1, SYM 1), V (PROPN 6, NUM 3), den (NUM 2, X 1), XVIII (NUM 1, X 1), otrais (ADJ 176, NUM 1)

The 10 most frequent ambiguous types: vienu (NUM 167, X 1), otrā (ADJ 31, NUM 19), 8 (NUM 18, X 1), I (NUM 12, CCONJ 2, X 1), 2008 (NUM 7, ADJ 1), V (PROPN 6, NUM 3), den (NUM 2, X 1), XVIII (NUM 1, X 1), l (NOUN 1, NUM 1), otrās (ADJ 23, NUM 1)

Morphology

The form / lemma ratio of NUM is 1.105341 (the average of all parts of speech is 2.339090).
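The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given earlier on this page (745 and 674). The check below is only a sketch of that arithmetic; whether the generating script computes the ratio in exactly this way is an assumption.

```python
# Counts taken from this page.
num_types = 745
num_lemmas = 674

ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.105341
```

A ratio close to 1 means most NUM lemmas appear in a single form, which fits the many digit-only numerals among the examples.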

The 1st highest number of forms (12) was observed with the lemma “viens”: Vienās, viena, vienai, vienam, vienas, vieni, vieniem, vienos, viens, vienu, vienā, vienām.

The 2nd highest number of forms (10) was observed with the lemma “otrs”: otra, otrai, otram, otras, otriem, otrs, otru, otrā, otrām, otrās.

The 3rd highest number of forms (8) was observed with the lemma “divi”: divas, divi, diviem, divos, divu, divus, divām, divās.

NUM occurs with 5 features: NumType (4160; 100% instances), Number (2129; 51% instances), Case (1956; 47% instances), Gender (1955; 47% instances), Typo (5; 0% instances)

NUM occurs with 12 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, NumType=Card, NumType=Frac, Number=Plur, Number=Sing, Typo=Yes

NUM occurs with 32 feature combinations. The most frequent feature combination is NumType=Card (2028 tokens). Examples: viens, trīs, vienu, viena, 20, divas, 3, 10, 30, 2
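The three feature statistics above (features, feature-value pairs, feature combinations) can all be derived from the FEATS column. The sketch below assumes the FEATS strings of the NUM tokens have already been collected into a list; `feature_stats` and the sample values are hypothetical and only illustrate the counting.

```python
from collections import Counter

def feature_stats(feats_values):
    """Count features, feature-value pairs and whole FEATS combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_values:
        combos[feats] += 1           # the full FEATS string is one combination
        if feats == "_":             # "_" means no features
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos

# Hypothetical FEATS values, not taken from the treebank.
sample = [
    "NumType=Card",
    "Case=Acc|Gender=Masc|NumType=Card|Number=Sing",
    "NumType=Card",
]
features, pairs, combos = feature_stats(sample)
print(features["NumType"])    # 3
print(len(pairs))             # 4
print(combos.most_common(1))  # [('NumType=Card', 2)]
```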

Relations

NUM nodes are attached to their parents using 24 different relations: nummod (3007; 72% instances), conj (226; 5% instances), parataxis (124; 3% instances), nsubj (109; 3% instances), root (100; 2% instances), nmod (90; 2% instances), dep (85; 2% instances), flat:name (76; 2% instances), compound (66; 2% instances), iobj (52; 1% instances), obj (48; 1% instances), obl (46; 1% instances), xcomp (38; 1% instances), flat (31; 1% instances), discourse (12; 0% instances), acl (9; 0% instances), ccomp (9; 0% instances), advcl (8; 0% instances), nsubj:pass (8; 0% instances), appos (5; 0% instances), orphan (4; 0% instances), amod (3; 0% instances), csubj (2; 0% instances), flat:foreign (2; 0% instances)

Parents of NUM nodes belong to 13 different parts of speech: NOUN (2778; 67% instances), VERB (527; 13% instances), NUM (314; 8% instances), SYM (227; 5% instances), [no parent: root] (100; 2% instances), PROPN (99; 2% instances), X (50; 1% instances), ADJ (42; 1% instances), ADV (13; 0% instances), DET (4; 0% instances), PRON (4; 0% instances), AUX (1; 0% instances), CCONJ (1; 0% instances)
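Both distributions above come from the HEAD and DEPREL columns: the relation is read directly, and the parent's part of speech is looked up through the head ID. The sketch below assumes each sentence has been reduced to `(id, upos, head, deprel)` tuples; that representation, the function `attachment_stats` and the `"(none)"` placeholder for root nodes (head 0, no parent) are illustrative choices, not part of the original tooling.

```python
from collections import Counter

def attachment_stats(sentences, upos="NUM"):
    """Count incoming relations and parent UPOS tags for nodes with `upos`."""
    relations, parent_pos = Counter(), Counter()
    for sent in sentences:
        pos_by_id = {node_id: node_upos for node_id, node_upos, _, _ in sent}
        for _, node_upos, head, deprel in sent:
            if node_upos != upos:
                continue
            relations[deprel] += 1
            # Head 0 is the artificial root, so it has no UPOS of its own.
            parent_pos[pos_by_id.get(head, "(none)")] += 1
    return relations, parent_pos

# Hypothetical trees, not taken from the treebank.
sentences = [
    [(1, "NUM", 2, "nummod"), (2, "NOUN", 0, "root")],
    [(1, "NUM", 0, "root"), (2, "NUM", 1, "conj")],
]
relations, parent_pos = attachment_stats(sentences)
print(sorted(relations.items()))   # [('conj', 1), ('nummod', 1), ('root', 1)]
print(sorted(parent_pos.items()))  # [('(none)', 1), ('NOUN', 1), ('NUM', 1)]
```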

2699 (65%) NUM nodes are leaves.

938 (23%) NUM nodes have one child.

291 (7%) NUM nodes have two children.

232 (6%) NUM nodes have three or more children.

The highest child degree of a NUM node is 11.
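The four child-count groups above add up to the 4160 NUM tokens (2699 + 938 + 291 + 232). A breakdown of this kind could be computed as sketched below, assuming each sentence is available as `(id, upos, head)` tuples; the function `child_degree_stats` and the sample trees are hypothetical.

```python
from collections import Counter

def child_degree_stats(sentences, upos="NUM"):
    """Bucket nodes with `upos` by number of children (0, 1, 2, 3 or more)."""
    buckets = Counter()
    max_degree = 0
    for sent in sentences:
        # Each occurrence of an ID in the HEAD position is one child.
        n_children = Counter(head for _, _, head in sent)
        for node_id, node_upos, _ in sent:
            if node_upos != upos:
                continue
            degree = n_children[node_id]
            max_degree = max(max_degree, degree)
            buckets[min(degree, 3)] += 1  # key 3 stands for "three or more"
    return buckets, max_degree

# Hypothetical trees, not taken from the treebank.
sentences = [
    [(1, "NUM", 2), (2, "NOUN", 0)],                   # NUM is a leaf
    [(1, "NUM", 0), (2, "ADV", 1), (3, "PUNCT", 1)],   # NUM has two children
]
buckets, max_degree = child_degree_stats(sentences)
print(sorted(buckets.items()))  # [(0, 1), (2, 1)]
print(max_degree)               # 2
```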

Children of NUM nodes are attached using 29 different relations: punct (653; 26% instances), advmod (293; 11% instances), nmod (288; 11% instances), conj (245; 10% instances), case (199; 8% instances), cop (106; 4% instances), cc (104; 4% instances), discourse (99; 4% instances), nsubj (88; 3% instances), advcl (69; 3% instances), compound (66; 3% instances), flat:name (48; 2% instances), dep (38; 1% instances), det (37; 1% instances), flat (36; 1% instances), amod (35; 1% instances), acl (26; 1% instances), obl (26; 1% instances), orphan (21; 1% instances), mark (19; 1% instances), parataxis (16; 1% instances), csubj (10; 0% instances), advmod:neg (7; 0% instances), nummod (6; 0% instances), advmod:emph (5; 0% instances), fixed (5; 0% instances), appos (4; 0% instances), iobj (2; 0% instances), goeswith (1; 0% instances)

Children of NUM nodes belong to 16 different parts of speech: PUNCT (653; 26% instances), NOUN (392; 15% instances), ADV (376; 15% instances), NUM (314; 12% instances), ADP (202; 8% instances), AUX (106; 4% instances), PART (98; 4% instances), DET (96; 4% instances), CCONJ (85; 3% instances), VERB (63; 2% instances), ADJ (44; 2% instances), PROPN (41; 2% instances), SCONJ (35; 1% instances), PRON (22; 1% instances), SYM (20; 1% instances), X (5; 0% instances)