This page pertains to UD version 2.

Treebank Statistics: UD_Romanian-RRT: POS Tags: NUM

There are 946 NUM lemmas (5%), 1010 NUM types (3%) and 5552 NUM tokens (3%). Out of 16 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 12 in number of tokens.
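
As an illustration of how counts like these can be derived, here is a minimal sketch that tallies lemmas, types and tokens for one UPOS tag from CoNLL-U lines. It is not the script that generated this page. The names `count_upos`, `row` and `SAMPLE` are hypothetical, the sample sentence is a toy example rather than treebank data, and the sketch assumes that types are case-sensitive word forms, which may differ from the convention used for the figures above.

```python
def count_upos(conllu_lines, upos="NUM"):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # blank sentence separator or comment line
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed CoNLL-U token line
        if "-" in cols[0] or "." in cols[0]:
            continue  # multiword token range (1-2) or empty node (1.1)
        # CoNLL-U columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])
            lemmas.add(cols[2])
    return len(lemmas), len(types), tokens


def row(idx, form, lemma, upos, head, deprel):
    """Build one tab-separated CoNLL-U token line (hypothetical helper)."""
    return "\t".join([str(idx), form, lemma, upos, "_", "_", str(head), deprel, "_", "_"])


# Toy input, invented for illustration only.
SAMPLE = [
    "# text = toy example",
    row(1, "două", "doi", "NUM", 2, "nummod"),
    row(2, "case", "casă", "NOUN", 0, "root"),
    row(3, "doi", "doi", "NUM", 2, "conj"),
]

n_lemmas, n_types, n_tokens = count_upos(SAMPLE)
print(n_lemmas, n_types, n_tokens)
```

In the toy input both NUM tokens share the lemma "doi" but have different forms, so the lemma count is lower than the type count, which is the same pattern the page reports at scale.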

The 10 most frequent NUM lemmas: doi, 1, 2, prim, trei, 3, 4, unu, 5, 6

The 10 most frequent NUM types: 1, 2, 3, două, 4, trei, 5, 6, primul, doi

The most frequent ambiguous lemmas (only 7 are listed): prim (NUM 245, ADJ 2), întâi (NUM 15, ADV 13), dintâi (NUM 10, ADV 2), zero (NUM 10, NOUN 1), _ (X 72, NUM 2, PUNCT 1), 5a (ADV 3, X 2, NUM 1, PROPN 1), V. (NOUN 3, NUM 1)

The 10 most frequent ambiguous types: 2 (NUM 279, X 2), 3 (NUM 199, X 1), i (PRON 87, NOUN 3, NUM 1), primele (NUM 30, NOUN 1), 9 (NUM 39, X 1), o (DET 1815, PRON 187, NUM 27, AUX 9, PART 8, ADV 1), 0 (NUM 21, X 2), 100 (NUM 21, X 3), un (DET 1616, NUM 16, X 2), V (NOUN 12, NUM 10)

Morphology

The form / lemma ratio of NUM is 1.067653 (the average of all parts of speech is 1.819791).
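
The ratio can be reproduced from the counts given at the top of the page, assuming it is the number of NUM types divided by the number of NUM lemmas (an assumption, but one that matches the reported figure). The variable names are illustrative.

```python
num_types = 1010   # NUM types reported on this page
num_lemmas = 946   # NUM lemmas reported on this page

form_lemma_ratio = num_types / num_lemmas
print(f"{form_lemma_ratio:.6f}")  # 1.067653
```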

The highest number of forms (12) was observed with the lemma “prim”: prim, prim-, prima, prime, primei, primele, primelor, primii, primilor, primul, primului, primă.

The second-highest number of forms (10) was observed with the lemma “ultim”: ultim, ultima, ultime, ultimei, ultimele, ultimelor, ultimii, ultimilor, ultimul, ultimului.

The third-highest number of forms (6) was observed with the lemma “doi”: doi, doilea, doua, două, ii, secund.

NUM occurs with 10 features: NumType (5552; 100% instances), Number (5536; 100% instances), NumForm (5501; 99% instances), Gender (937; 17% instances), Case (495; 9% instances), Definite (450; 8% instances), PronType (48; 1% instances), Typo (43; 1% instances), Variant (8; 0% instances), Foreign (1; 0% instances)

NUM occurs with 18 feature-value pairs: Case=Acc,Nom, Case=Dat,Gen, Definite=Def, Definite=Ind, Foreign=Yes, Gender=Fem, Gender=Masc, NumForm=Combi, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Ord, Number=Plur, Number=Sing, PronType=Tot, Typo=Yes, Variant=Short

NUM occurs with 43 feature combinations. The most frequent feature combination is Number=Sing|NumForm=Digit|NumType=Card (3517 tokens). Examples: 1, 2, 3, 4, 5, 6, 7, 8, 2004, 10
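
The three levels of counting used above (features, feature-value pairs, full feature combinations) can be sketched as follows. This is a hedged illustration, not the page's generator. `feature_stats` and the toy input are hypothetical. The sketch keeps a multi-valued entry such as Case=Acc,Nom as a single pair, which is consistent with how the 18 pairs are listed above.

```python
from collections import Counter


def feature_stats(feats_values):
    """Count features, feature-value pairs and whole combinations.

    feats_values: iterable of CoNLL-U FEATS strings, e.g.
    "Number=Sing|NumForm=Digit|NumType=Card", or "_" for no features.
    """
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_values:
        combos[feats] += 1
        if feats == "_":
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1   # e.g. "NumType"
            pairs[pair] += 1      # e.g. "NumType=Card"
    return features, pairs, combos


# Toy input, invented for illustration only.
toy_feats = [
    "Number=Sing|NumForm=Digit|NumType=Card",
    "Number=Sing|NumForm=Digit|NumType=Card",
    "Case=Acc,Nom|Gender=Fem|Number=Plur|NumForm=Word|NumType=Card",
]

features, pairs, combos = feature_stats(toy_feats)
print(features["NumType"], pairs["Case=Acc,Nom"], combos.most_common(1))
```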

Relations

NUM nodes are attached to their parents using 21 different relations: nummod (4245; 76% instances), parataxis (751; 14% instances), conj (292; 5% instances), nsubj (75; 1% instances), compound (49; 1% instances), obj (29; 1% instances), root (22; 0% instances), appos (14; 0% instances), nsubj:pass (13; 0% instances), fixed (12; 0% instances), nmod (10; 0% instances), flat (8; 0% instances), xcomp (8; 0% instances), advcl (5; 0% instances), obl (4; 0% instances), orphan (4; 0% instances), acl (3; 0% instances), csubj (3; 0% instances), amod (2; 0% instances), dep (2; 0% instances), iobj (1; 0% instances)
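
The many "0% instances" entries follow from rounding small shares of the 5552 NUM tokens. The sketch below assumes that percentages are rounded to the nearest integer, which is consistent with the figures shown but is not confirmed by the page. `share` is a hypothetical helper, and only a few of the relations are included.

```python
TOTAL_NUM_TOKENS = 5552  # NUM tokens reported on this page

# A subset of the relation counts listed above.
relation_counts = {
    "nummod": 4245,
    "parataxis": 751,
    "conj": 292,
    "nsubj": 75,
    "root": 22,
    "iobj": 1,
}


def share(count, total=TOTAL_NUM_TOKENS):
    """Percentage of `total`, rounded to the nearest integer (assumed convention)."""
    return round(100 * count / total)


for relation, count in relation_counts.items():
    print(f"{relation} ({count}; {share(count)}% instances)")
```

Under this assumption, any relation with fewer than about 28 occurrences is shown as 0%, so the raw counts are the more informative figure for rare relations.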

Parents of NUM nodes belong to 13 different parts of speech, counting the artificial root, which has no part of speech, as one of them: NOUN (3760; 68% instances), VERB (1119; 20% instances), NUM (440; 8% instances), PROPN (81; 1% instances), ADJ (57; 1% instances), ADV (39; 1% instances), artificial root (22; 0% instances), PRON (17; 0% instances), ADP (10; 0% instances), AUX (2; 0% instances), DET (2; 0% instances), X (2; 0% instances), INTJ (1; 0% instances)

2919 (53%) NUM nodes are leaves.

1290 (23%) NUM nodes have one child.

1090 (20%) NUM nodes have two children.

253 (5%) NUM nodes have three or more children.

The highest child degree of a NUM node is 10.
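
A child-degree distribution like the one above can be sketched by counting, for every node, how many other nodes name it as their head. `child_degree_histogram` and the toy tree are hypothetical and only illustrate the idea. The final check uses the page's own figures to confirm that the four buckets cover all 5552 NUM tokens.

```python
from collections import Counter


def child_degree_histogram(sentence, upos="NUM"):
    """Map child count -> number of nodes with the given UPOS tag.

    sentence: list of (id, head, upos) tuples for one dependency tree.
    """
    children_per_head = Counter(head for _id, head, _tag in sentence)
    # Counter returns 0 for ids that never occur as a head, i.e. leaves.
    return Counter(
        children_per_head[node_id]
        for node_id, _head, tag in sentence
        if tag == upos
    )


# Toy tree, invented for illustration: node 2 is the root,
# node 1 is a NUM leaf, node 5 is a NUM with two children.
toy_sentence = [
    (1, 2, "NUM"),
    (2, 0, "NOUN"),
    (3, 5, "ADP"),
    (4, 5, "DET"),
    (5, 2, "NUM"),
]

histogram = child_degree_histogram(toy_sentence)
print(dict(histogram))

# Figures from this page: leaves, one child, two children, three or more.
page_buckets = [2919, 1290, 1090, 253]
print(sum(page_buckets) == 5552)
```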

Children of NUM nodes are attached using 22 different relations: punct (1864; 43% instances), case (883; 20% instances), det (395; 9% instances), conj (302; 7% instances), advmod (228; 5% instances), cc (167; 4% instances), nmod (149; 3% instances), nummod (110; 3% instances), compound (65; 1% instances), goeswith (54; 1% instances), cop (30; 1% instances), nsubj (28; 1% instances), appos (23; 1% instances), acl (22; 1% instances), amod (17; 0% instances), parataxis (9; 0% instances), mark (8; 0% instances), advcl (6; 0% instances), aux (6; 0% instances), dep (6; 0% instances), obl:pmod (2; 0% instances), flat (1; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: PUNCT (1864; 43% instances), ADP (882; 20% instances), NUM (440; 10% instances), DET (436; 10% instances), NOUN (197; 5% instances), ADV (189; 4% instances), CCONJ (178; 4% instances), X (56; 1% instances), AUX (36; 1% instances), VERB (30; 1% instances), PRON (24; 1% instances), ADJ (18; 0% instances), SCONJ (13; 0% instances), PROPN (10; 0% instances), PART (2; 0% instances)