
This page pertains to UD version 2.

Treebank Statistics: UD_Old_East_Slavic-Birchbark: POS Tags: NUM

There are 103 NUM lemmas (2%), 516 NUM types (4%) and 1258 NUM tokens (5%). Out of 17 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 8 in number of tokens.
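As a rough illustration of how such counts can be derived, the sketch below tallies lemmas, types and tokens for one tag directly from CoNLL-U text. It assumes that a "type" is a distinct word form compared as written (no case folding), which may differ from the script that generated this page. The function name `pos_counts` and the toy rows are hypothetical; the toy annotations are invented for illustration and are not taken from the treebank.

```python
def pos_counts(conllu_text, upos):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only regular word lines: 10 columns and a plain integer ID.
        # This skips comments, blank lines, multiword ranges (1-2)
        # and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            lemmas.add(lemma)
            types.add(form)  # assumption: forms are compared as written
            tokens += 1
    return len(lemmas), len(types), tokens


# Toy sentence (invented annotations, placeholder noun).
TOY_ROWS = [
    ("1", "дова", "два", "NUM", "_", "NumType=Card", "2", "nummod:gov", "_", "_"),
    ("2", "NOUN_FORM", "NOUN_LEMMA", "NOUN", "_", "_", "0", "root", "_", "_"),
    ("3", "два", "два", "NUM", "_", "NumType=Card", "2", "conj", "_", "_"),
    ("4", "три", "триѥ", "NUM", "_", "NumType=Card", "3", "conj", "_", "_"),
]
TOY_CONLLU = "# sent_id = toy-1\n" + "\n".join("\t".join(r) for r in TOY_ROWS) + "\n"

print(pos_counts(TOY_CONLLU, "NUM"))  # (2, 3, 3)
```

Ranking NUM against the other tags would then amount to calling the same function for each of the 17 observed tags and sorting by the respective count.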

The 10 most frequent NUM lemmas: ·в҃·, полъ, ·г҃·, два, ·е҃·, десѧть, ·д҃·, ·ѕ҃·, ·ӏ҃·, триѥ

The 10 most frequent NUM types: поло, ·в҃·, полъ, три, ·г҃·, :в҃:, :в:, ·г·, в҃, ·ӏ҃·

The 10 most frequent ambiguous lemmas: ·в҃· (NUM 157, ADJ 2), полъ (NUM 142, NOUN 2), ·г҃· (NUM 109, ADV 3, ADJ 1), ·е҃· (NUM 70, ADJ 1), десѧть (NUM 68, NOUN 1, X 1), ·д҃· (NUM 56, ADJ 2), ·ѕ҃· (NUM 48, ADJ 3), ·з҃· (NUM 30, ADJ 1), ·и҃· (NUM 16, ADJ 1), двои (NUM 8, ADJ 2)

The 10 most frequent ambiguous types: поло (NUM 48, NOUN 1), ·г҃· (NUM 30, ADV 1), в҃ (NUM 18, ADJ 1), :г҃: (NUM 15, ADV 2), пѧть (NUM 13, ADJ 2), десѧте (NUM 8, ADJ 1), дова (NUM 8, X 1), осмь (NUM 6, ADJ 1), шесть (NUM 6, ADJ 1), ·ӏ· (NUM 5, CCONJ 1)

Morphology

The form / lemma ratio of NUM is 5.009709 (the average of all parts of speech is 2.412613).
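The reported ratio is consistent with dividing the 516 NUM types by the 103 NUM lemmas given at the top of this page. A minimal check, assuming that this is indeed how the figure was obtained (the helper name `form_lemma_ratio` is hypothetical):

```python
def form_lemma_ratio(n_forms, n_lemmas):
    """Average number of distinct forms per lemma."""
    return n_forms / n_lemmas


# Counts taken from the summary at the top of this page.
ratio = form_lemma_ratio(516, 103)
print(f"{ratio:.6f}")  # 5.009709
```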

The 1st highest number of forms (38) was observed with the lemma “десѧть”: (д)есѧти, (де)сѧте, [д]—, [д]-сѧтъ, [д]ьсѧть, [десѧ]ти, [дьс]ѧ[ть, ·ӏ·[сть, д-сѧ]ть, д[е], дес, десѧть, дес(ѧте), дес[ѧть, десѧ, десѧть, десѧте, десѧти, десѧто, десѧтъ, десѧть, десѧтьма, десѧтѣ, дьс[ѧ]…, дьсѧ, дьсѧте, дьсѧти, дьсѧто, дьсѧть, дьсѧт…, надсѧте, натцѧ, наца, наца(те, нацате, нацтѧте, …[д–ѧть], 「десѧть.

The 2nd highest number of forms (31) was observed with the lemma “полъ”: (п)[ол]о, (п)[оло, (п)олъ, (полъ, пл, п[ло], п[ол]ъ, п[оло], пло, по, по:ло, по:лꙑ, по(л)[ъ, поло, полъ, по[л](ъ, по[ло], по[лъ], пол, пол(ѹ, пол[ъ], поло, полу, полъ, поль, полѹ, полѹ], пол, пъло, пълъ, …л[ъ].

The 3rd highest number of forms (29) was observed with the lemma “два”: (д)[в]е, Вуо, дви, д[ъ]ва, д[ъв]…, д{в}овѣ, два, две, двема, дви, дво, дву, двь, двѣ, девѣ, дова, дове, дови, довѣ, довѹ, дъвѣ, дъв(а), дъв[ѣ], дъва, дъвь, дъвѣ, дъвѣма, дъвѹ, дѹвѹ.

NUM occurs with 6 features: NumType (1090; 87% instances), NumForm (691; 55% instances), Case (539; 43% instances), Gender (286; 23% instances), Number (203; 16% instances), Degree (1; 0% instances)

NUM occurs with 16 feature-value pairs: Case=Acc, Case=Acc,Nom, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Degree=Cmp, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Digit, NumType=Card, Number=Dual, Number=Plur, Number=Sing

NUM occurs with 59 feature combinations. The most frequent feature combination is NumForm=Digit|NumType=Card (680 tokens). Examples: ·в҃·, ·г҃·, :в҃:, :в:, ·г·, ·ӏ҃·, в҃, г҃, :г҃:, ·в·
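The three feature statistics above (features, feature-value pairs, feature combinations) can all be read off the FEATS column of a CoNLL-U file. The following is a sketch under stated assumptions: the names `parse_feats` and `feature_stats` are hypothetical, the input list is toy data, and a token without features (`_`) is counted here as its own combination, which may not match how this page counts.

```python
from collections import Counter


def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_stats(feats_column):
    """Count features, feature-value pairs and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1  # assumption: "_" counts as a combination too
        for name, value in parse_feats(feats).items():
            features[name] += 1
            # A multi-value such as Acc,Nom stays one pair, as on this page.
            pairs[f"{name}={value}"] += 1
    return features, pairs, combos


# Toy FEATS values for four NUM tokens (illustrative only).
TOY_FEATS = [
    "NumForm=Digit|NumType=Card",
    "NumForm=Digit|NumType=Card",
    "Case=Acc,Nom|NumType=Card",
    "_",
]
features, pairs, combos = feature_stats(TOY_FEATS)
print(features["NumType"])    # 3
print(combos.most_common(1))  # [('NumForm=Digit|NumType=Card', 2)]
```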

Relations

NUM nodes are attached to their parents using 17 different relations: nummod:gov (882; 70% instances), conj (78; 6% instances), nsubj (76; 6% instances), nummod (73; 6% instances), flat (43; 3% instances), nmod (35; 3% instances), root (29; 2% instances), obj (16; 1% instances), dep (7; 1% instances), obl (6; 0% instances), orphan (4; 0% instances), advcl (2; 0% instances), list (2; 0% instances), parataxis (2; 0% instances), appos (1; 0% instances), dislocated (1; 0% instances), nsubj:pass (1; 0% instances)

Parents of NUM nodes belong to 10 different parts of speech: NOUN (953; 76% instances), NUM (134; 11% instances), PROPN (60; 5% instances), VERB (36; 3% instances), the artificial root node, which has no part of speech (29; 2% instances), X (22; 2% instances), ADJ (19; 2% instances), ADP (2; 0% instances), PRON (2; 0% instances), DET (1; 0% instances)

955 (76%) NUM nodes are leaves.

209 (17%) NUM nodes have one child.

58 (5%) NUM nodes have two children.

36 (3%) NUM nodes have three or more children.

The highest child degree of a NUM node is 37.
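The child-degree figures above can be sketched as a histogram over the HEAD column: each node's degree is the number of other nodes in the same sentence that name it as their head. The function name `child_degree_histogram` and the toy sentence are hypothetical; rows are assumed to be the ten CoNLL-U columns of regular word lines from a single sentence.

```python
from collections import Counter


def child_degree_histogram(sentence_rows, upos):
    """Count nodes of one UPOS with 0, 1, 2 or 3+ children (3 means '3 or more')."""
    # HEAD is column 7 (index 6); counting its values gives children per node ID.
    children = Counter(row[6] for row in sentence_rows)
    hist = Counter()
    for row in sentence_rows:
        if row[3] == upos:
            hist[min(children[row[0]], 3)] += 1
    return hist


# Toy sentence (invented annotations): node 3 has one child, nodes 1 and 4 are leaves.
TOY_SENTENCE = [
    ("1", "f1", "l1", "NUM", "_", "_", "2", "nummod:gov", "_", "_"),
    ("2", "f2", "l2", "NOUN", "_", "_", "0", "root", "_", "_"),
    ("3", "f3", "l3", "NUM", "_", "_", "2", "conj", "_", "_"),
    ("4", "f4", "l4", "NUM", "_", "_", "3", "flat", "_", "_"),
]
hist = child_degree_histogram(TOY_SENTENCE, "NUM")
print(hist[0], hist[1], hist[2], hist[3])  # 2 1 0 0

# The percentages on this page follow from the counts and the 1258 NUM tokens.
shares = [round(100 * n / 1258) for n in (955, 209, 58, 36)]
print(shares)  # [76, 17, 5, 3]
```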

Children of NUM nodes are attached using 25 different relations: nmod (113; 23% instances), punct (100; 20% instances), conj (85; 17% instances), flat (60; 12% instances), case (39; 8% instances), cc (31; 6% instances), dep (17; 3% instances), nsubj (12; 2% instances), advmod (9; 2% instances), mark (4; 1% instances), advcl (3; 1% instances), cop (3; 1% instances), nummod:gov (3; 1% instances), orphan (3; 1% instances), acl:relcl (2; 0% instances), det (2; 0% instances), nummod (2; 0% instances), obl (2; 0% instances), acl (1; 0% instances), amod (1; 0% instances), appos (1; 0% instances), iobj (1; 0% instances), list (1; 0% instances), parataxis (1; 0% instances), reparandum (1; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: NUM (134; 27% instances), PUNCT (100; 20% instances), ADJ (67; 13% instances), ADP (57; 11% instances), NOUN (40; 8% instances), CCONJ (32; 6% instances), X (23; 5% instances), PROPN (13; 3% instances), PART (8; 2% instances), VERB (7; 1% instances), DET (5; 1% instances), SCONJ (4; 1% instances), AUX (3; 1% instances), PRON (3; 1% instances), ADV (1; 0% instances)