
This page pertains to UD version 2.

Treebank Statistics: UD_Ruuli-RDT: POS Tags: NUM

There are 22 NUM lemmas (2%), 33 NUM types (1%) and 52 NUM tokens (1%). Out of 16 observed tags, the rank of NUM is: 8 in number of lemmas, 10 in number of types and 15 in number of tokens.
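The three counts above can be reproduced in outline from the FORM, LEMMA and UPOS columns of a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function name `pos_counts` and the toy rows are hypothetical, and it assumes that word forms are compared exactly as written (no case folding).

```python
from collections import Counter

def pos_counts(rows, upos):
    """Count distinct lemmas, distinct word forms (types) and tokens for one UPOS tag.

    rows: iterable of (form, lemma, upos) triples, e.g. columns 2-4 of a CoNLL-U file.
    Assumption: forms are compared as-is, without lowercasing.
    """
    forms = Counter()
    lemmas = Counter()
    for form, lemma, tag in rows:
        if tag == upos:
            forms[form] += 1
            lemmas[lemma] += 1
    return {"lemmas": len(lemmas), "types": len(forms), "tokens": sum(forms.values())}

# Toy rows for illustration only; they are not sentences from the treebank.
toy_rows = [
    ("abiri", "biri", "NUM"),
    ("bibiri", "biri", "NUM"),
    ("abiri", "biri", "NUM"),
    ("emwe", "mwe", "NUM"),
    ("x", "x", "NOUN"),  # placeholder non-NUM token, ignored by the NUM count
]
toy_counts = pos_counts(toy_rows, "NUM")
```

On the toy rows this yields 2 lemmas, 3 types and 4 tokens; run over the full treebank, the same logic would be expected to give the 22 / 33 / 52 figures reported above, provided the assumptions hold.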

The 10 most frequent NUM lemmas: mwe, ibiri, mukaaga, musanju, biri, ikumi, inai, kanai, kataanu, kyendai

The 10 most frequent NUM types: ibiri, mukaaga, ikumi, musanju, abiri, emwe, inai, kanai, kataanu, kimwei

The 10 most frequent ambiguous lemmas: mwe (NUM 10, DET 2)

The 10 most frequent ambiguous types: kimwei (ADV 2, NUM 2), amwei (ADV 3, NUM 1)

Morphology

The form / lemma ratio of NUM is 1.500000 (the average of all parts of speech is 2.036596).
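The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given earlier on this page (33 and 22). A short check, assuming that this is how the figure is derived:

```python
# Figures taken from this page; the derivation itself is an assumption.
num_types = 33
num_lemmas = 22

ratio = num_types / num_lemmas
formatted = f"{ratio:.6f}"  # six decimal places, matching the page's formatting
```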

The 1st highest number of forms (8) was observed with the lemma “mwe”: amwei, emwe, gamwei, gimwei, kimwei, ogumwei, omwe, omwei.

The 2nd highest number of forms (2) was observed with the lemma “biri”: abiri, bibiri.

The 3rd highest number of forms (2) was observed with the lemma “inai”: einai, inai.

NUM occurs with 4 features: NumForm (50; 96% instances), NumType (50; 96% instances), NounClass (10; 19% instances), Referent (8; 15% instances)

NUM occurs with 12 feature-value pairs: NounClass=Bantu1, NounClass=Bantu14, NounClass=Bantu3, NounClass=Bantu4, NounClass=Bantu6, NounClass=Bantu7, NounClass=Bantu8, NounClass=Bantu9, NumForm=Digit, NumForm=Word, NumType=Card, Referent=Yes

NUM occurs with 11 feature combinations. The most frequent feature combination is NumForm=Word|NumType=Card (34 tokens). Examples: ibiri, mukaaga, ikumi, musanju, inai, kanai, kataanu, kyendai, munaanai, einai

Relations

NUM nodes are attached to their parents using 10 different relations: nummod (14; 27% instances), obj (9; 17% instances), nmod (7; 13% instances), conj (6; 12% instances), flat:num (5; 10% instances), obl (4; 8% instances), root (4; 8% instances), acl:relcl (1; 2% instances), nmod:poss (1; 2% instances), nsubj (1; 2% instances)
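The percentages in this list can be checked against the raw counts. The sketch below assumes that each share is the count divided by the total number of NUM tokens, rounded half up to a whole percent; the helper name `percent_half_up` is hypothetical.

```python
def percent_half_up(count, total):
    """Whole-number percentage of count in total, rounding halves up (integer arithmetic)."""
    return (200 * count + total) // (2 * total)

# Relation counts copied from the list above.
parent_relations = {
    "nummod": 14, "obj": 9, "nmod": 7, "conj": 6, "flat:num": 5,
    "obl": 4, "root": 4, "acl:relcl": 1, "nmod:poss": 1, "nsubj": 1,
}

total = sum(parent_relations.values())
shares = {rel: percent_half_up(n, total) for rel, n in parent_relations.items()}
```

The counts sum to 52, the number of NUM tokens, and the computed shares match the percentages printed in the list.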

Parents of NUM nodes belong to 5 different parts of speech: NOUN (19; 37% instances), NUM (14; 27% instances), VERB (14; 27% instances), no part of speech, i.e. the artificial root (4; 8% instances), ADV (1; 2% instances)

24 (46%) NUM nodes are leaves.

15 (29%) NUM nodes have one child.

8 (15%) NUM nodes have two children.

5 (10%) NUM nodes have three or more children.

The highest child degree of a NUM node is 4.
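The child-degree figures above (leaves, one child, two children, and so on) can be derived from the ID, UPOS and HEAD columns of a CoNLL-U file. This is a minimal sketch under stated assumptions: the function name `child_degrees` and the toy sentence are hypothetical, and multiword-token ranges and empty nodes are skipped.

```python
from collections import Counter

def child_degrees(conllu_text, upos):
    """Map 'number of children' to 'number of nodes with the given UPOS tag'."""
    degrees = Counter()
    for block in conllu_text.strip().split("\n\n"):  # sentences are separated by blank lines
        tags = {}             # token ID -> UPOS
        children = Counter()  # token ID -> number of dependents
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue
            cols = line.split("\t")
            if not cols[0].isdigit():  # skip multiword ranges (1-2) and empty nodes (1.1)
                continue
            tags[cols[0]] = cols[3]
            children[cols[6]] += 1     # HEAD column; "0" marks the root
        for tok_id, tag in tags.items():
            if tag == upos:
                degrees[children[tok_id]] += 1
    return degrees

# Toy sentence with placeholder forms; not taken from the treebank.
toy = "\n".join("\t".join(row) for row in [
    ("1", "a", "a", "NOUN", "_", "_", "0", "root", "_", "_"),
    ("2", "b", "b", "NUM", "_", "_", "1", "nummod", "_", "_"),
    ("3", "c", "c", "NUM", "_", "_", "2", "conj", "_", "_"),
])
toy_degrees = child_degrees(toy, "NUM")
```

In the toy sentence one NUM node is a leaf and one has a single child. The figures on this page are internally consistent in the same way: 24 + 15 + 8 + 5 = 52, the total number of NUM tokens.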

Children of NUM nodes are attached using 13 different relations: case (15; 31% instances), conj (6; 13% instances), cc (5; 10% instances), flat:num (5; 10% instances), cop (3; 6% instances), nummod (3; 6% instances), punct (3; 6% instances), det (2; 4% instances), parataxis (2; 4% instances), acl:relcl (1; 2% instances), advmod (1; 2% instances), csubj (1; 2% instances), nsubj (1; 2% instances)

Children of NUM nodes belong to 10 different parts of speech: NUM (14; 29% instances), PART (12; 25% instances), CCONJ (5; 10% instances), ADP (3; 6% instances), AUX (3; 6% instances), PUNCT (3; 6% instances), VERB (3; 6% instances), DET (2; 4% instances), NOUN (2; 4% instances), ADV (1; 2% instances)