
This page pertains to UD version 2.

Treebank Statistics: UD_Romanian-Nonstandard: POS Tags: NOUN

There are 6417 NOUN lemmas (46%), 14523 NOUN types (42%) and 96783 NOUN tokens (17%). Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.

The 10 most frequent NOUN lemmas: domn, vodă, om, țară, zi, cuvânt, lucru, turc, oaste, frate

The 10 most frequent NOUN types: vodă, domnul, doamne, țara, țară, omul, om, domnului, oaste, cuvîntul

The 10 most frequent ambiguous lemmas: domn (NOUN 2499, PROPN 18, VERB 1), vodă (NOUN 1962, VERB 1), om (NOUN 1728, PROPN 3, NUM 1), țară (NOUN 1431, PROPN 6, ADJ 5, ADV 1, VERB 1), zi (NOUN 1004, ADV 17, PROPN 4, VERB 3, ADP 1), turc (NOUN 694, PROPN 8), împărat (NOUN 667, ADV 4, PROPN 3), boier (NOUN 664, PROPN 5), vreme (NOUN 637, PROPN 1), lume (NOUN 623, PRON 1, PROPN 1, VERB 1)

The 10 most frequent ambiguous types: vodă (NOUN 1941, PROPN 1, VERB 1), omul (NOUN 411, PROPN 1), om (NOUN 390, AUX 42, NUM 1), parte (NOUN 293, ADV 1), numele (NOUN 261, ADP 1), fiiul (NOUN 78, PROPN 1), domnu (NOUN 244, PROPN 8), credință (NOUN 251, VERB 1), lucru (NOUN 248, VERB 2), duhul (NOUN 74, ADP 1)

Morphology

The form / lemma ratio of NOUN is 2.263207 (the average of all parts of speech is 2.491875).
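The reported ratio appears to be the number of distinct NOUN types divided by the number of distinct NOUN lemmas, using the counts quoted at the top of this page. The following minimal sketch checks that reading; the function name is illustrative and is not part of any UD tool.

```python
# Counts quoted from the statistics on this page.
NOUN_TYPES = 14523   # distinct NOUN word forms
NOUN_LEMMAS = 6417   # distinct NOUN lemmas


def form_lemma_ratio(n_forms: int, n_lemmas: int) -> float:
    """Average number of distinct forms per lemma (assumed definition)."""
    if n_lemmas <= 0:
        raise ValueError("n_lemmas must be positive")
    return n_forms / n_lemmas


ratio = form_lemma_ratio(NOUN_TYPES, NOUN_LEMMAS)
print(f"{ratio:.6f}")
```

Under that assumption the result agrees with the figure above to six decimal places.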

The highest number of forms (43) was observed with the lemma “zi”: dni, dza, dze, dzi, dzii, dzile, dzilele, dzileli, dzili, dzilile, dzio, dzioa, dziua, dzua, dzuei, dzuo, dzuoa, dzuîi, dzuă, dzî, dzîle, dzîlele, dzîli, dzîlile, zi, zile, zileei, zilei, zilele, zilelor, zio, zioa, ziua, ziuă, zoa, zua, zuo, zuoa, zuă, zâua, zî, zîle, zîlele.

The 2nd highest number of forms (38) was observed with the lemma “împărăție”: -mpărațîi, -mpărăţia, -mpărăţie, -mpărăție, -mpărățiia, -mpărățîia, -mpărățîie, Impărățiia, mpărățiia, mpărățîie, npărăție, părățiia, împărăţia, împărăția, împărăție, împărăției, împărății, împărățiia, împărățiii, împărățiile, împărățiilor, împărățâe, împărățâei, împărățâia, împărățâiei, împărățîe, împărățîi, împărățîia, împărățîiei, înpărăție, înpărăției, înpărățiia, înpărățiie, înpărățiii, înpărățâe, înpărățâei, înpărățâia, înpărățâie.

The 3rd highest number of forms (30) was observed with the lemma “drept”: Derepțîi, dereapte, derep, derept, dereptul, dereptului, derepți, derepții, derepților, direapta, dirept, direptul, direptului, direpț, direpți, direpții, direpților, direpțîi, direpțîlor, dreapta, drept, drepte, dreptul, drepturi, drepturile, drepturilor, drepț, drepți, drepțî, dritul.

NOUN occurs with 5 features: Case (96782; 100% instances), Definite (96782; 100% instances), Gender (96782; 100% instances), Number (96782; 100% instances), Degree (6; 0% instances)

NOUN occurs with 10 feature-value pairs: Case=Acc,Nom, Case=Dat,Gen, Case=Voc, Definite=Def, Definite=Ind, Degree=Pos, Gender=Fem, Gender=Masc, Number=Plur, Number=Sing

NOUN occurs with 26 feature combinations. The most frequent feature combination is Case=Acc,Nom|Definite=Ind|Gender=Fem|Number=Sing (20856 tokens). Examples: țară, oaste, lume, pace, parte, credință, vreme, casă, milă, gură
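A feature combination such as the one above follows the CoNLL-U FEATS layout: Name=Value pairs joined by "|". The sketch below splits such a string into a dictionary. It keeps a comma-joined value like "Acc,Nom" as a single string, which matches how this page counts Case=Acc,Nom as one feature-value pair. The helper name parse_feats is an assumption made for illustration.

```python
def parse_feats(feats: str) -> dict:
    """Split a CoNLL-U FEATS string into {feature_name: value}.

    A multi-value such as "Acc,Nom" is kept as one string, mirroring the
    way this page lists Case=Acc,Nom as a single feature-value pair.
    """
    if not feats or feats == "_":   # "_" marks an empty FEATS column
        return {}
    parsed = {}
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        parsed[name] = value
    return parsed


combo = "Case=Acc,Nom|Definite=Ind|Gender=Fem|Number=Sing"
parsed = parse_feats(combo)
print(parsed)
```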

Relations

NOUN nodes are attached to their parents using 30 different relations: obl (23185; 24% instances), nmod (15435; 16% instances), nsubj (15341; 16% instances), obj (14919; 15% instances), conj (8383; 9% instances), obl:pmod (3858; 4% instances), nmod:tmod (2927; 3% instances), appos (2326; 2% instances), root (2276; 2% instances), vocative (2030; 2% instances), iobj (1838; 2% instances), xcomp (1310; 1% instances), acl (563; 1% instances), advcl (467; 0% instances), parataxis (365; 0% instances), nsubj:pass (358; 0% instances), ccomp (324; 0% instances), obl:agent (316; 0% instances), compound (142; 0% instances), flat (134; 0% instances), csubj (82; 0% instances), orphan (81; 0% instances), case (48; 0% instances), advcl:tcl (30; 0% instances), fixed (18; 0% instances), amod (12; 0% instances), ccomp:pmod (11; 0% instances), nummod (2; 0% instances), advmod (1; 0% instances), discourse (1; 0% instances)
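The percentages in this list look like each count divided by the total number of NOUN tokens (96783), rounded to a whole percent, which is why relations with a few hundred instances show as 0%. The sketch below reproduces a few of the figures under that assumption; the rounding rule is inferred from the numbers and is not documented on this page.

```python
TOTAL_NOUN_TOKENS = 96783  # NOUN token count quoted on this page


def share(count: int, total: int = TOTAL_NOUN_TOKENS) -> int:
    """Whole-number percentage of `count` within `total` (assumed rounding)."""
    return round(100 * count / total)


# Counts copied from the relation list above.
examples = {"obl": 23185, "nmod": 15435, "acl": 563, "advcl": 467}
shares = {rel: share(n) for rel, n in examples.items()}
print(shares)
```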

Parents of NOUN nodes belong to 16 different parts of speech: VERB (62273; 64% instances), NOUN (21928; 23% instances), PROPN (4850; 5% instances), [root] (2276; 2% instances), ADJ (1923; 2% instances), ADV (1465; 2% instances), PRON (1426; 1% instances), AUX (214; 0% instances), NUM (177; 0% instances), INTJ (132; 0% instances), DET (68; 0% instances), ADP (29; 0% instances), SCONJ (11; 0% instances), X (6; 0% instances), CCONJ (4; 0% instances), PUNCT (1; 0% instances)

22704 (23%) NOUN nodes are leaves.

34218 (35%) NOUN nodes have one child.

22319 (23%) NOUN nodes have two children.

17542 (18%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 19.
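The child-degree figures above can be recomputed from the HEAD column of a CoNLL-U file by counting how many tokens point at each NOUN. The sketch below uses a made-up five-token sentence, not taken from the treebank, and a hypothetical helper name. It then checks that the four counts reported on this page add up to the NOUN token total.

```python
from collections import Counter


def child_degree_buckets(heads, upos, target="NOUN"):
    """Bucket `target` tokens by number of children: 0, 1, 2, or 3 (= 3+).

    heads[i] is the 1-based head ID of token i+1 (0 means root);
    upos[i] is that token's UPOS tag.
    """
    n_children = Counter(h for h in heads if h > 0)
    buckets = Counter()
    for token_id, tag in enumerate(upos, start=1):
        if tag == target:
            buckets[min(n_children[token_id], 3)] += 1
    return buckets


# Toy sentence (illustrative only): NOUN VERB ADP DET NOUN
upos = ["NOUN", "VERB", "ADP", "DET", "NOUN"]
heads = [2, 0, 5, 5, 2]
buckets = child_degree_buckets(heads, upos)

# Counts reported on this page: leaves, one child, two children, three or more.
page_counts = {0: 22704, 1: 34218, 2: 22319, 3: 17542}
total = sum(page_counts.values())
print(dict(buckets), total)
```

In the toy sentence the first NOUN is a leaf and the second has two children (the ADP and the DET).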

Children of NOUN nodes are attached using 44 different relations: case (40724; 28% instances), nmod (20649; 14% instances), punct (18538; 13% instances), det (16230; 11% instances), amod (8885; 6% instances), conj (8397; 6% instances), cc (7057; 5% instances), advmod (4316; 3% instances), acl (4171; 3% instances), cop (3793; 3% instances), nummod (3424; 2% instances), nsubj (2310; 2% instances), appos (1733; 1% instances), mark (1716; 1% instances), iobj (823; 1% instances), obl (776; 1% instances), advcl (504; 0% instances), aux (471; 0% instances), obl:pmod (194; 0% instances), parataxis (189; 0% instances), advmod:tmod (170; 0% instances), vocative (163; 0% instances), csubj (161; 0% instances), nmod:tmod (160; 0% instances), compound (114; 0% instances), flat (107; 0% instances), discourse (98; 0% instances), cc:preconj (80; 0% instances), obj (76; 0% instances), orphan (73; 0% instances), xcomp (57; 0% instances), expl (50; 0% instances), advcl:tcl (46; 0% instances), expl:pv (24; 0% instances), ccomp (15; 0% instances), aux:pass (12; 0% instances), nsubj:pass (7; 0% instances), obl:agent (7; 0% instances), expl:poss (3; 0% instances), ccomp:pmod (1; 0% instances), clf (1; 0% instances), dep (1; 0% instances), fixed (1; 0% instances), list (1; 0% instances)

Children of NOUN nodes belong to 16 different parts of speech: ADP (41272; 28% instances), NOUN (21928; 15% instances), PUNCT (18538; 13% instances), DET (16268; 11% instances), ADJ (8015; 5% instances), CCONJ (7241; 5% instances), PRON (6613; 5% instances), VERB (6325; 4% instances), PROPN (6081; 4% instances), ADV (4729; 3% instances), AUX (4310; 3% instances), NUM (3582; 2% instances), SCONJ (977; 1% instances), PART (343; 0% instances), INTJ (101; 0% instances), X (5; 0% instances)