
This page pertains to UD version 2.

Treebank Statistics: UD_Romanian-Nonstandard: POS Tags: DET

There are 104 DET lemmas (1%), 434 DET types (1%) and 23858 DET tokens (4%). Out of 16 observed tags, DET ranks 9th in number of lemmas, 8th in number of types and 9th in number of tokens.
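As a rough illustration of how lemma, type and token counts like these could be derived from a CoNLL-U file, here is a minimal sketch. The helper names `row` and `upos_counts` and the toy sentence are hypothetical; the sketch also assumes that types are distinct surface forms compared case-sensitively, which may differ from how this page was generated.

```python
def row(idx, form, lemma, upos):
    """Build one 10-column CoNLL-U token line (unused columns are placeholders)."""
    return "\t".join([str(idx), form, lemma, upos, "_", "_", "0", "dep", "_", "_"])

# Invented toy data, not taken from the treebank.
SAMPLE = "\n".join([
    "# text = un om o un",
    row(1, "un", "un", "DET"),
    row(2, "om", "om", "NOUN"),
    row(3, "o", "un", "DET"),
    row(4, "un", "un", "DET"),
])

def upos_counts(conllu_text, upos="DET"):
    """Return (number of lemmas, number of types, number of tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip malformed lines, multiword-token ranges ("1-2") and empty nodes ("1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens

print(upos_counts(SAMPLE))  # (1, 2, 3)
```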

The 10 most frequent DET lemmas: -ul, un, al, tot, cel, tău, meu, acela, său, acesta

The 10 most frequent DET types: lui, a, un, o, toată, ta, toate, tot, al, cel

The 10 most frequent ambiguous lemmas: -ul (DET 3214, PRON 7), un (DET 3004, NUM 45, PRON 2), al (DET 2877, ADP 9, PART 7), tot (DET 2499, PRON 1429, ADV 638, NOUN 5, ADJ 1), cel (DET 1660, PRON 117, ADV 1), tău (DET 1515, PRON 58, NOUN 4), meu (DET 1338, PRON 49), acela (PRON 2737, DET 1332, NOUN 6, ADV 4, VERB 3), său (DET 1242, PRON 22, NOUN 1), acesta (PRON 1362, DET 1107, NOUN 1)

The 10 most frequent ambiguous types: lui (DET 3023, PRON 2255), a (DET 1752, PART 1448, AUX 1105, ADP 166, VERB 2, NOUN 1, PRON 1), un (DET 1367, NUM 38), o (DET 1139, PRON 945, AUX 158, INTJ 26, NUM 3), toată (DET 702, PRON 17), ta (DET 601, PRON 31, NOUN 2), toate (DET 574, PRON 419), tot (DET 579, ADV 518, PRON 124, ADJ 1, NOUN 1), cel (DET 478, PRON 153, ADV 1), cea (DET 467, PRON 37)

Morphology

The form / lemma ratio of DET is 4.173077 (the average of all parts of speech is 2.492163).
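The reported ratio is numerically consistent with dividing the number of DET types by the number of DET lemmas given earlier on this page. A quick check, assuming that this is indeed how the figure is computed:

```python
det_types = 434   # DET types reported above
det_lemmas = 104  # DET lemmas reported above

ratio = det_types / det_lemmas
print(f"{ratio:.6f}")  # 4.173077
```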

The highest number of forms (63) was observed with the lemma “acela”: -acea, -aceaia, -acealia, -aceea, -acel, -acela, -cee, -ceie, -cie, ace, acea, aceae, aceaea, aceaia, aceale, acealea, acealia, acee, aceea, acei, aceia, aceie, aceii, aceiia, acel, acela, acele, acelea, aceli, acelor, acelora, acelui, aceluia, acia, acie, aciia, aciie, cea, ceaea, ceaia, ceal, ceale, cealea, cee, ceea, cei, ceia, ceie, ceii, cel, cela, cele, celea, celi, celor, celora, celui, celuia, cia, ciale, cie, cieia, ciie.

The 2nd highest number of forms (57) was observed with the lemma “acesta”: -această, -acesta, -cestu, aceaiia, aceasta, aceaste, aceastea, aceastia, această, aceaștii, acesstea, acest, acesta, acestași, aceste, acestea, acestei, acesti, acestor, acestora, acestoru, acestu, acestui, acestuia, acestuie, aceşti, aceştia, aceştii, acește, aceștea, acești, aceștia, aceștie, aceștii, aceștiia, aciaste, astă, ceasta, ceaste, ceastă, cest, cesta, ceste, cestor, cestu, cestui, cești, ceștii, ceștiia, ciaste, ciastă, iasta, ist, ista, istor, Ăst, șesti.

The 3rd highest number of forms (27) was observed with the lemma “celălalt”: aceilalți, alalți, ceaialaltă, cealalte, cealaltă, cealealalte, cealelalte, ceelaltă, ceialaltă, ceialalți, ceielalți, ceiialalte, ceiialalți, ceilalte, ceilalți, celalalt, celalaltu, celelalte, cellalt, celoralalți, celoralalții, celorlalte, celorlalți, cielaltă, cielalți, cielalții, ciielalți.

DET occurs with 9 features: PronType (23847; 100% instances), Number (23251; 97% instances), Case (20988; 88% instances), Gender (19701; 83% instances), Person (13053; 55% instances), Number[psor] (4753; 20% instances), Definite (3246; 14% instances), Poss (3156; 13% instances), Position (2055; 9% instances)

DET occurs with 22 feature-value pairs: Case=Acc,Nom, Case=Dat,Gen, Definite=Def, Gender=Fem, Gender=Masc, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Person=1, Person=2, Person=3, Position=Postnom, Position=Prenom, Poss=Yes, PronType=Art, PronType=Dem, PronType=Emp, PronType=Ind, PronType=Int,Rel, PronType=Neg, PronType=Prs

DET occurs with 125 feature combinations. The most frequent feature combination is Case=Dat,Gen|Definite=Def|Number=Sing|PronType=Art (3101 tokens). Examples: lui, lu, iui
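A feature combination such as the one above is a `|`-separated list of `Feature=Value` pairs, as found in the FEATS column of CoNLL-U. The following sketch shows one way to split it; `parse_feats` is a hypothetical helper, and it keeps a multi-valued entry such as `Case=Dat,Gen` as a single value, matching how this page counts feature-value pairs.

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if not feats or feats == "_":
        return {}  # "_" marks an empty feature set
    # Split each pair on the first "=" only; comma-joined values stay intact.
    return dict(pair.split("=", 1) for pair in feats.split("|"))

combo = "Case=Dat,Gen|Definite=Def|Number=Sing|PronType=Art"
print(parse_feats(combo))
# {'Case': 'Dat,Gen', 'Definite': 'Def', 'Number': 'Sing', 'PronType': 'Art'}
```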

Relations

DET nodes are attached to their parents using 25 different relations: det (23424; 98% instances), nsubj (105; 0% instances), obj (58; 0% instances), nmod (56; 0% instances), obl (43; 0% instances), compound (32; 0% instances), conj (30; 0% instances), xcomp (19; 0% instances), root (18; 0% instances), nsubj:pass (12; 0% instances), obl:pmod (11; 0% instances), acl (9; 0% instances), advmod (8; 0% instances), advcl (6; 0% instances), appos (5; 0% instances), parataxis (5; 0% instances), ccomp (4; 0% instances), nmod:tmod (4; 0% instances), amod (2; 0% instances), fixed (2; 0% instances), dep (1; 0% instances), discourse (1; 0% instances), expl (1; 0% instances), iobj (1; 0% instances), obl:agent (1; 0% instances)

Parents of DET nodes belong to 12 different parts of speech (counting the artificial root, which has no part of speech, as one): NOUN (16268; 68% instances), PROPN (3690; 15% instances), NUM (1180; 5% instances), PRON (954; 4% instances), ADJ (857; 4% instances), VERB (471; 2% instances), DET (349; 1% instances), ADV (63; 0% instances), no part of speech, i.e. DET attached as root (18; 0% instances), ADP (5; 0% instances), AUX (2; 0% instances), CCONJ (1; 0% instances)

23252 (97%) DET nodes are leaves.

497 (2%) DET nodes have one child.

41 (0%) DET nodes have two children.

68 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 8.
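The four child-count figures above add up to the total number of DET tokens (23252 + 497 + 41 + 68 = 23858). A minimal sketch of how such a child-degree distribution could be computed for one sentence follows; `child_degree_histogram` and the toy sentence are hypothetical, and the input is assumed to be a list of `(id, head, upos)` triples already extracted from CoNLL-U.

```python
from collections import Counter

def child_degree_histogram(rows, upos="DET"):
    """Map number of children -> count of nodes with the given UPOS tag.

    rows: list of (id, head, upos) triples for a single sentence.
    """
    # Count how many nodes point to each head id.
    children = Counter(head for _, head, _ in rows)
    # A node that never appears as a head has 0 children (Counter default).
    return Counter(children[node_id] for node_id, _, tag in rows if tag == upos)

# Invented toy sentence: node 1 is a leaf DET, node 4 is a DET with one child.
sentence = [(1, 2, "DET"), (2, 0, "NOUN"), (3, 4, "ADV"), (4, 2, "DET")]
hist = child_degree_histogram(sentence)
print(hist[0], hist[1])  # 1 1

# Consistency check on the figures reported on this page.
print(23252 + 497 + 41 + 68)  # 23858
```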

Children of DET nodes are attached using 27 different relations: det (345; 39% instances), advmod (131; 15% instances), case (70; 8% instances), punct (70; 8% instances), cop (44; 5% instances), nsubj (39; 4% instances), cc (32; 4% instances), conj (32; 4% instances), acl (24; 3% instances), obl (23; 3% instances), nmod (15; 2% instances), mark (12; 1% instances), advcl (11; 1% instances), amod (5; 1% instances), appos (5; 1% instances), csubj (5; 1% instances), vocative (4; 0% instances), aux (3; 0% instances), compound (3; 0% instances), discourse (3; 0% instances), fixed (2; 0% instances), iobj (2; 0% instances), obj (2; 0% instances), advcl:tcl (1; 0% instances), cc:preconj (1; 0% instances), nummod (1; 0% instances), parataxis (1; 0% instances)

Children of DET nodes belong to 16 different parts of speech: DET (349; 39% instances), ADV (135; 15% instances), ADP (71; 8% instances), PUNCT (70; 8% instances), NOUN (68; 8% instances), AUX (47; 5% instances), VERB (39; 4% instances), CCONJ (35; 4% instances), PRON (32; 4% instances), ADJ (13; 1% instances), SCONJ (11; 1% instances), NUM (6; 1% instances), PROPN (4; 0% instances), INTJ (3; 0% instances), PART (2; 0% instances), X (1; 0% instances)