This page pertains to UD version 2.

Treebank Statistics: UD_Komi_Permyak-UH: POS Tags: NUM

There are 6 NUM lemmas (1%), 7 NUM types (1%) and 9 NUM tokens (1%). Out of 15 observed tags, the rank of NUM is: 12 in number of lemmas, 10 in number of types and 11 in number of tokens.

The 10 most frequent NUM lemmas: кык, куим, кыка, кыкӧнӧсь, нёляӧсь, ӧтік

The 10 most frequent NUM types: кык, öтiк, Кыкӧн, куим, кыка, кыкӧнӧсь, нёляӧсь

The 10 most frequent ambiguous lemmas: кык (NUM 4, DET 1)

The 10 most frequent ambiguous types: (none listed)

Morphology

The form / lemma ratio of NUM is 1.166667 (the average of all parts of speech is 1.249476).
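The ratio above can be reproduced from the counts on this page. The following is a minimal sketch, assuming the ratio is simply the number of distinct NUM types (7) divided by the number of distinct NUM lemmas (6); the function name `form_lemma_ratio` is hypothetical and not part of any UD tool.

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Return the number of distinct word forms per distinct lemma."""
    if n_lemmas == 0:
        raise ValueError("cannot compute a ratio with zero lemmas")
    return n_types / n_lemmas


# Counts taken from this page: 7 NUM types and 6 NUM lemmas.
ratio = form_lemma_ratio(7, 6)
print(f"{ratio:.6f}")  # 1.166667
```

A ratio close to 1 means that most NUM lemmas appear in a single form in this treebank; only “кык” is observed with two forms.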

The 1st highest number of forms (2) was observed with the lemma “кык”: Кыкӧн, кык.

The 2nd highest number of forms (1) was observed with the lemma “куим”: куим.

The 3rd highest number of forms (1) was observed with the lemma “кыка”: кыка.

NUM occurs with 5 features: NumType (7; 78% instances), Number (5; 56% instances), Case (2; 22% instances), Person (1; 11% instances), Tense (1; 11% instances)

NUM occurs with 7 feature-value pairs: Case=Nom, NumType=Card, NumType=Dist, Number=Plur, Number=Sing, Person=3, Tense=Pres

NUM occurs with 6 feature combinations. The most frequent feature combination is NumType=Card (3 tokens). Examples: кык, куим, öтiк

Relations

NUM nodes are attached to their parents using 4 different relations: nummod (5; 56% instances), root (2; 22% instances), advcl (1; 11% instances), obl (1; 11% instances)
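The percentages in lines like the one above are shares of the 9 NUM tokens, rounded to whole numbers. A minimal sketch of that arithmetic, assuming plain rounding to the nearest integer (the helper `share` is hypothetical):

```python
from collections import Counter

# Relation counts for NUM nodes, copied from this page.
relation_counts = Counter({"nummod": 5, "root": 2, "advcl": 1, "obl": 1})
total = sum(relation_counts.values())  # 9 NUM tokens


def share(count: int, total: int) -> int:
    """Percentage of `count` out of `total`, rounded to the nearest integer."""
    return round(100 * count / total)


# most_common() sorts by count, keeping insertion order for ties.
summary = ", ".join(
    f"{rel} ({n}; {share(n, total)}% instances)"
    for rel, n in relation_counts.most_common()
)
print(summary)
# nummod (5; 56% instances), root (2; 22% instances), advcl (1; 11% instances), obl (1; 11% instances)
```

Because each share is rounded independently, the percentages need not add up to exactly 100.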

Parents of NUM nodes belong to 3 different parts of speech: NOUN (5; 56% instances), [root, no parent] (2; 22% instances), VERB (2; 22% instances)

7 (78%) NUM nodes are leaves.

0 (0%) NUM nodes have one child.

1 (11%) NUM nodes have two children.

1 (11%) NUM nodes have three or more children.

The highest child degree of a NUM node is 3.
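The child-degree figures can be cross-checked against the other counts on this page. The sketch below assumes the per-node degrees implied by those figures: seven leaves, one node with two children, and one node with three children (the highest degree reported). The list `degrees` is reconstructed from the summary, not read from the treebank.

```python
from collections import Counter

# Child degree of each of the 9 NUM nodes, reconstructed from the summary.
degrees = [0] * 7 + [2] + [3]

# Bucket degrees the way the page reports them: 0, 1, 2, and "3 or more".
buckets = Counter(min(d, 3) for d in degrees)

print(len(degrees))   # 9 NUM nodes in total
print(buckets[0])     # 7 leaves
print(buckets[1])     # 0 nodes with one child
print(buckets[2])     # 1 node with two children
print(buckets[3])     # 1 node with three or more children
print(sum(degrees))   # 5 children overall
print(max(degrees))   # 3, the highest child degree
```

The total of 5 children agrees with the child relation counts reported on this page (punct 2, advmod:tmod 1, nsubj 1, nsubj:cop 1).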

Children of NUM nodes are attached using 4 different relations: punct (2; 40% instances), advmod:tmod (1; 20% instances), nsubj (1; 20% instances), nsubj:cop (1; 20% instances)

Children of NUM nodes belong to 4 different parts of speech: PUNCT (2; 40% instances), ADV (1; 20% instances), NOUN (1; 20% instances), PRON (1; 20% instances)