
This page pertains to UD version 2.

Treebank Statistics: UD_Beja-NSC: POS Tags: NOUN

There is 1 NOUN lemma (6%), 94 NOUN types (22%) and 168 NOUN tokens (14%). Out of 16 observed tags, the rank of NOUN is: 8 in number of lemmas, 2 in number of types and 4 in number of tokens.

The 10 most frequent NOUN lemmas: _ (the only NOUN lemma; in CoNLL-U, an underscore marks an unspecified value)

The 10 most frequent NOUN types: kaːm, tak, biri, jhaːm, mhiːn, doːr, meːk, na, ʃartijaː, finʤaːn

The 10 most frequent ambiguous lemmas: _ (VERB 242, PUNCT 241, DET 176, NOUN 168, PRON 106, SCONJ 68, ADP 39, AUX 38, CCONJ 34, PART 28, ADV 18, ADJ 17, NUM 12, INTJ 8, X 7, PROPN 4)

The 10 most frequent ambiguous types: na (NOUN 5, PART 1), dhaj (ADP 1, NOUN 1), hoː (PRON 2, NOUN 1), naː (NOUN 1, SCONJ 1)

Morphology

The form / lemma ratio of NOUN is 94.000000 (the average of all parts of speech is 26.312500).
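The ratio above is consistent with dividing the 94 NOUN types by the single NOUN lemma "_". The following is a minimal sketch of that computation over CoNLL-U text, not the script that generated this page. The function name `form_lemma_ratio` and the toy input are hypothetical, and the sketch assumes forms are compared case-sensitively.

```python
from collections import defaultdict


def form_lemma_ratio(conllu_text: str, upos: str) -> float:
    """Average number of distinct word forms per lemma for one UPOS tag."""
    forms_by_lemma = defaultdict(set)
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only ordinary token lines: 10 columns and an integer ID
        # (this skips comments, blank lines, multiword ranges, empty nodes).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            forms_by_lemma[lemma].add(form)
    if not forms_by_lemma:
        return 0.0
    total_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return total_forms / len(forms_by_lemma)


# Toy input (NOT taken from the treebank; "xxx" is a made-up verb form):
# three NOUN tokens, two distinct forms, all sharing the lemma "_".
toy = "\n".join([
    "# text = toy example",
    "1\ttak\t_\tNOUN\t_\t_\t3\tnsubj\t_\t_",
    "2\tbiri\t_\tNOUN\t_\t_\t3\tobj\t_\t_",
    "3\txxx\t_\tVERB\t_\t_\t0\troot\t_\t_",
    "4\ttak\t_\tNOUN\t_\t_\t3\tdislocated\t_\t_",
])
print(form_lemma_ratio(toy, "NOUN"))  # 2 distinct forms / 1 lemma = 2.0
```

When every token carries the same placeholder lemma, as here, the ratio simply equals the number of distinct forms, so it says nothing about inflectional richness.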

The 1st highest number of forms (94) was observed with the lemma “_”: bhalijeː, biri, buːn, bʔaɖ, bʔaɖaɖ, bʔeː, da, dar, dhaj, doːr, doːri, dʔiti, findikʷ, finʤan, finʤaːn, gahwat, gaw, gaɖʔa, girmai, halaka, hamoː, handii, hanʤar, hawat, haːʃ, hi, his, hoː, hoːb-ej, iːjʔaː, i̠ːjʔaː, jam, jhaːm, kam, kaːm, kaːmi, koːba, koːlej, kʷinha, liːlaːwi, maːl, mbʔaɖ, mbʔaɖi, meːk, meːki, meːs, mhiːn, mijʔat, mindikʷijaːj, mittia, mʔam, na, nafara, naː, nda, rabameːk, samaːr, saroːj, siganfoːj, sitoːboːj, suːfa, suːg, tak, takat, taki, talga, tami, tarʤimaːl, tʔiit, wanas, waʤʤai, xawaːʤa, xawaːʤai, ʃa, ʃamateː, ʃartijaː, ʃaː, ʃaːk, ʔabaː, ʔajaːj, ʔalaːma, ʔamaːr, ʔamuːl, ʔarabijaːj, ʔaraːw, ʔaːmanaːj, ʔeːgirim, ʔiːbaːb, ʔiːd, ʔoːr, ʔoːrej, ʔoːt, ʔoːti, ʤabanaː.

NOUN occurs with 4 features: Gender (118; 70% instances), Number (17; 10% instances), Foreign (3; 2% instances), ExtPos (2; 1% instances)

NOUN occurs with 5 feature-value pairs: ExtPos=ADV, Foreign=Yes, Gender=Fem, Gender=Masc, Number=Plur

NOUN occurs with 9 feature combinations. The most frequent feature combination is Gender=Masc (76 tokens). Examples: tak, jhaːm, mhiːn, biri, doːr, finʤaːn, gaw, buːn, finʤan, hanʤar

Relations

NOUN nodes are attached to their parents using 19 different relations: obj (57; 34% instances), nsubj (38; 23% instances), nmod (14; 8% instances), obl:arg (12; 7% instances), dep:comp (8; 5% instances), dislocated (8; 5% instances), fixed (5; 3% instances), obl:mod (5; 3% instances), dep:conj (3; 2% instances), parataxis (3; 2% instances), reparandum (3; 2% instances), advmod (2; 1% instances), appos (2; 1% instances), vocative (2; 1% instances), xcomp (2; 1% instances), acl:relcl (1; 1% instances), dislocated:subj (1; 1% instances), parataxis:conj (1; 1% instances), root (1; 1% instances)
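The percentages in this list appear to be each relation's count divided by the 168 NOUN tokens, rounded to the nearest whole percent (for example, 57 / 168 gives 34%). Below is a minimal sketch of how such a breakdown could be derived from CoNLL-U text. The function name `parent_relation_shares` and the toy input are hypothetical, and the rounding rule is an assumption inferred from the figures on this page.

```python
from collections import Counter


def parent_relation_shares(conllu_text: str, upos: str):
    """Return (relation, count, rounded percent) for nodes tagged `upos`."""
    counts = Counter()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Ordinary token lines only: 10 columns and an integer ID.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            counts[cols[7]] += 1  # column 8 (DEPREL): relation to the parent
    total = sum(counts.values())
    # most_common() sorts by descending count; an empty Counter yields [].
    return [(rel, n, round(100 * n / total)) for rel, n in counts.most_common()]


# Toy input (NOT taken from the treebank; "xxx" is a made-up verb form).
toy_relations = "\n".join([
    "1\ttak\t_\tNOUN\t_\t_\t3\tnsubj\t_\t_",
    "2\tbiri\t_\tNOUN\t_\t_\t3\tobj\t_\t_",
    "3\txxx\t_\tVERB\t_\t_\t0\troot\t_\t_",
    "4\tdoːr\t_\tNOUN\t_\t_\t3\tobj\t_\t_",
])
for relation, count, percent in parent_relation_shares(toy_relations, "NOUN"):
    print(f"{relation} ({count}; {percent}% instances)")
```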

Parents of NOUN nodes belong to 7 different parts of speech: VERB (128; 76% instances), NOUN (22; 13% instances), ADP (6; 4% instances), SCONJ (6; 4% instances), ADJ (3; 2% instances), AUX (2; 1% instances), no part of speech, i.e. the artificial root (1; 1% instances)

21 (13%) NOUN nodes are leaves.

61 (36%) NOUN nodes have one child.

45 (27%) NOUN nodes have two children.

41 (24%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 15.
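The four child-degree counts above (21, 61, 45 and 41) add up to the 168 NOUN tokens. As an illustration, the sketch below shows one way to bucket nodes by number of children from CoNLL-U text. The function name `child_degree_buckets` and the toy input are hypothetical, and the sketch assumes every dependent counts as a child, punctuation included.

```python
from collections import Counter


def child_degree_buckets(conllu_text: str, upos: str) -> Counter:
    """Count `upos` nodes by number of children; key 3 means 'three or more'."""
    buckets = Counter()
    sentence = []  # (id, upos, head) triples of the current sentence

    def flush():
        # Number of dependents per head ID, computed within one sentence only.
        children = Counter(head for _, _, head in sentence)
        for node_id, tag, _ in sentence:
            if tag == upos:
                buckets[min(children[node_id], 3)] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sentence.append((cols[0], cols[3], cols[6]))
        elif not line.strip():
            flush()  # a blank line ends the sentence
    flush()  # handle a final sentence without a trailing blank line
    return buckets


# Toy input (NOT taken from the treebank; "xxx" and "yyy" are made-up forms).
toy_degrees = "\n".join([
    "1\txxx\t_\tDET\t_\t_\t2\tdet\t_\t_",
    "2\ttak\t_\tNOUN\t_\t_\t3\tobj\t_\t_",
    "3\tyyy\t_\tVERB\t_\t_\t0\troot\t_\t_",
    "",
    "1\tbiri\t_\tNOUN\t_\t_\t2\tnsubj\t_\t_",
    "2\tyyy\t_\tVERB\t_\t_\t0\troot\t_\t_",
])
print(dict(child_degree_buckets(toy_degrees, "NOUN")))
```

Token IDs restart in every sentence, which is why the child counts are reset at each blank line rather than accumulated over the whole file.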

Children of NOUN nodes are attached using 28 different relations: det (117; 38% instances), punct (58; 19% instances), acl:relcl (21; 7% instances), nmod (21; 7% instances), nmod:poss (19; 6% instances), case (17; 6% instances), cc (9; 3% instances), reparandum (8; 3% instances), advmod (5; 2% instances), amod (5; 2% instances), appos (3; 1% instances), cop (2; 1% instances), dep:conj (2; 1% instances), discourse (2; 1% instances), fixed (2; 1% instances), nummod (2; 1% instances), acl (1; 0% instances), advcl (1; 0% instances), aux (1; 0% instances), dep (1; 0% instances), dep:comp (1; 0% instances), dislocated:det (1; 0% instances), flat:foreign (1; 0% instances), iobj (1; 0% instances), mark (1; 0% instances), nummod:det (1; 0% instances), obj (1; 0% instances), obl:arg (1; 0% instances)

Children of NOUN nodes belong to 15 different parts of speech: DET (118; 39% instances), PUNCT (58; 19% instances), NOUN (22; 7% instances), PRON (21; 7% instances), VERB (17; 6% instances), ADP (16; 5% instances), ADJ (9; 3% instances), CCONJ (9; 3% instances), SCONJ (9; 3% instances), NUM (7; 2% instances), PART (6; 2% instances), ADV (4; 1% instances), X (4; 1% instances), AUX (3; 1% instances), PROPN (2; 1% instances)