This page pertains to UD version 2.

Treebank Statistics: UD_Romanian-SiMoNERo: POS Tags: NUM

There are 181 NUM lemmas (6%), 182 NUM types (4%) and 358 NUM tokens (2%). Out of 15 observed tags, the rank of NUM is: 4 in number of lemmas, 4 in number of types and 11 in number of tokens.

The 10 most frequent NUM lemmas: 2, 30, 7, doi, 15, 5, 60, 9, 10, 14

The 10 most frequent NUM types: 2, 30, 7, 15, 5, 60, 9, două, 10, 14

The most frequent ambiguous lemmas (only 2 observed): I (NOUN 1, NUM 1), primul (ADJ 1, NUM 1)

The most frequent ambiguous types (only 2 observed): I (NOUN 1, NUM 1), primul (ADJ 1, NUM 1)

Morphology

The form / lemma ratio of NUM is 1.005525 (the average of all parts of speech is 1.477080).
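
The reported ratio appears consistent with dividing the 182 NUM types by the 181 NUM lemmas (182 / 181 ≈ 1.005525). The following is a minimal sketch of that calculation, assuming the ratio counts distinct form-lemma pairs per tag. The function name `form_lemma_ratio` and the toy rows are illustrative only and are not taken from the treebank tooling.

```python
from collections import defaultdict

def form_lemma_ratio(rows, upos="NUM"):
    """Average number of distinct forms per lemma for one UPOS tag.

    rows: iterable of (form, lemma, upos) triples (hypothetical input shape).
    """
    forms_by_lemma = defaultdict(set)
    for form, lemma, tag in rows:
        if tag == upos:
            forms_by_lemma[lemma].add(form)
    n_pairs = sum(len(forms) for forms in forms_by_lemma.values())
    return n_pairs / len(forms_by_lemma)

# Toy data (not the real treebank): "doi" has two forms, the others one each.
rows = [
    ("doi", "doi", "NUM"),
    ("două", "doi", "NUM"),
    ("2", "2", "NUM"),
    ("30", "30", "NUM"),
    ("casa", "casă", "NOUN"),  # ignored: different tag
]
ratio = form_lemma_ratio(rows)

# Consistency check against the figures reported on this page.
reported = round(182 / 181, 6)
```

Because only the lemma "doi" has more than one form here, the ratio stays very close to 1, well below the average reported for all parts of speech.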

The 1st highest number of forms (2) was observed with the lemma “doi”: doi, două.

The 2nd highest number of forms (1) was observed with the lemma “0”: 0.

The 3rd highest number of forms (1) was observed with the lemma “0,5”: 0,5.

NUM occurs with 7 features: NumType (357; 100% instances), Number (357; 100% instances), NumForm (355; 99% instances), Gender (25; 7% instances), Case (15; 4% instances), Definite (13; 4% instances), PronType (2; 1% instances)

NUM occurs with 15 feature-value pairs: Case=Gen, Case=Nom, Definite=Def, Definite=Ind, Gender=Fem, Gender=Masc, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Frac, NumType=Ord, Number=Plur, Number=Sing, PronType=Tot

NUM occurs with 17 feature combinations. The most frequent feature combination is Number=Plur|NumForm=Digit|NumType=Card (326 tokens). Examples: 2, 30, 7, 15, 5, 60, 9, 10, 14, 4
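
Feature combinations such as Number=Plur|NumForm=Digit|NumType=Card follow the CoNLL-U FEATS convention: Name=Value pairs joined by `|`, with `_` for an empty feature set. A minimal sketch of how such strings could be split and tallied is shown below. The helper `parse_feats` and the toy column are hypothetical; the second combination in particular is made up for illustration and is not claimed to occur in the treebank.

```python
from collections import Counter

def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {name: value} dict."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

# Toy FEATS column (illustrative values only).
feats_column = [
    "Number=Plur|NumForm=Digit|NumType=Card",
    "Number=Plur|NumForm=Digit|NumType=Card",
    "Case=Nom|Gender=Fem|Number=Plur|NumForm=Word|NumType=Card",
    "_",
]

# How often each full combination occurs.
combos = Counter(feats_column)

# How many tokens carry each individual feature.
feature_counts = Counter(
    name for feats in feats_column for name in parse_feats(feats)
)

most_frequent_combo, combo_count = combos.most_common(1)[0]
```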

Relations

NUM nodes are attached to their parents using 10 different relations: nummod (223; 62% instances), parataxis (90; 25% instances), conj (28; 8% instances), appos (4; 1% instances), nsubj (4; 1% instances), root (4; 1% instances), nmod (2; 1% instances), advcl (1; 0% instances), obl (1; 0% instances), xcomp (1; 0% instances)

Parents of NUM nodes belong to 9 different parts of speech: NOUN (206; 58% instances), VERB (83; 23% instances), NUM (41; 11% instances), ADJ (17; 5% instances), none, i.e. the artificial root (4; 1% instances), X (3; 1% instances), AUX (2; 1% instances), ADP (1; 0% instances), ADV (1; 0% instances)

126 (35%) NUM nodes are leaves.

72 (20%) NUM nodes have one child.

118 (33%) NUM nodes have two children.

42 (12%) NUM nodes have three or more children.

The highest child degree of a NUM node is 6.
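
The child-degree figures above (leaves, one child, two children, three or more) can be reproduced by counting, for every NUM node, how many tokens name it as their head. The sketch below assumes each sentence is given as a list of (id, head, upos) triples; that input shape and the name `child_degree_buckets` are assumptions made for illustration, not the method used to generate this page.

```python
from collections import Counter

def child_degree_buckets(sentences, upos="NUM"):
    """Bucket nodes of one UPOS tag by number of children (3 means 3+).

    sentences: list of sentences, each a list of (id, head, upos) triples.
    Returns (buckets, max_degree).
    """
    buckets = Counter()
    max_degree = 0
    for sent in sentences:
        # Number of dependents per head id within this sentence.
        n_children = Counter(head for _id, head, _tag in sent)
        for tok_id, _head, tag in sent:
            if tag == upos:
                degree = n_children[tok_id]
                buckets[min(degree, 3)] += 1
                max_degree = max(max_degree, degree)
    return buckets, max_degree

# Toy sentences (not from the treebank).
sentences = [
    [(1, 2, "NUM"), (2, 0, "NOUN"), (3, 2, "PUNCT")],                 # NUM is a leaf
    [(1, 0, "NUM"), (2, 1, "PUNCT"), (3, 1, "ADP"), (4, 1, "NOUN")],  # NUM has 3 children
]
buckets, max_degree = child_degree_buckets(sentences)

# The four counts reported on this page add up to the 358 NUM tokens.
reported = {0: 126, 1: 72, 2: 118, 3: 42}
total = sum(reported.values())
percentages = {k: round(100 * v / total) for k, v in reported.items()}
```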

Children of NUM nodes are attached using 13 different relations: punct (278; 61% instances), case (72; 16% instances), conj (26; 6% instances), nmod (25; 5% instances), advmod (20; 4% instances), nummod (13; 3% instances), cc (7; 2% instances), cop (5; 1% instances), nsubj (4; 1% instances), det (3; 1% instances), parataxis (3; 1% instances), advcl (2; 0% instances), aux (1; 0% instances)

Children of NUM nodes belong to 10 different parts of speech: PUNCT (278; 61% instances), ADP (70; 15% instances), NUM (41; 9% instances), NOUN (30; 7% instances), ADV (19; 4% instances), CCONJ (7; 2% instances), AUX (6; 1% instances), DET (4; 1% instances), PRON (2; 0% instances), VERB (2; 0% instances)