
This page pertains to UD version 2.

Treebank Statistics: UD_Beja-NSC: POS Tags: NOUN

There is 1 NOUN lemma (6%), 285 NOUN types (23%), and 894 NOUN tokens (15%). Out of 16 observed tags, NOUN ranks 8th in number of lemmas, 2nd in number of types, and 4th in number of tokens.
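The lemma, type, and token counts above can be reproduced from a CoNLL-U file along the following lines. This is a minimal sketch, not the script that generated this page: the function name `pos_counts` and the toy sentences are assumptions made for illustration (the forms `V1` and `V2` are placeholders, not Beja words), and it treats types as case-sensitive word forms, which may differ from the official statistics.

```python
from collections import defaultdict

def pos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {p: (len(lemmas[p]), len(types[p]), tokens[p]) for p in tokens}

# Toy data for illustration only; not sentences from the treebank.
TOY = "\n".join([
    "1\ttak\t_\tNOUN\t_\t_\t2\tnsubj\t_\t_",
    "2\tV1\t_\tVERB\t_\t_\t0\troot\t_\t_",
    "",
    "1\ttak\t_\tNOUN\t_\t_\t3\tnsubj\t_\t_",
    "2\tdoːr\t_\tNOUN\t_\t_\t3\tobj\t_\t_",
    "3\tV2\t_\tVERB\t_\t_\t0\troot\t_\t_",
])

counts = pos_counts(TOY)
print(counts["NOUN"])  # (lemmas, types, tokens)
```

Because every token in the toy data carries the placeholder lemma `_`, each part of speech ends up with exactly one lemma, which mirrors the single NOUN lemma reported above.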

The 10 most frequent NOUN lemmas: _

The 10 most frequent NOUN types: tak, mhiːn, doːr, meːk, ʔoːr, jam, kaːm, naː, na, dhaj

The 10 most frequent ambiguous lemmas: _ (PUNCT 1126, VERB 1097, DET 933, NOUN 894, ADP 408, PRON 395, SCONJ 298, PART 167, CCONJ 160, AUX 125, ADV 104, ADJ 77, PROPN 32, INTJ 28, NUM 26, X 18)

The 10 most frequent ambiguous types: naː (NOUN 21, PRON 1), na (NOUN 20, SCONJ 1), dhaj (NOUN 17, ADP 1), =na (NOUN 14, PART 1), raw (NOUN 4, ADJ 2), deː (NOUN 3, DET 1), nafs (NOUN 2, PRON 2), suːr (ADV 9, ADP 3, NOUN 2), wari (ADJ 2, NOUN 2), ʔeːgrim (ADJ 2, NOUN 2)

Morphology

The form / lemma ratio of NOUN is 285.000000 (the average of all parts of speech is 76.500000).
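The ratio above is consistent with dividing the number of distinct NOUN forms (285) by the number of distinct NOUN lemmas (1). The sketch below works through that arithmetic; the helper name `form_lemma_ratio` is hypothetical, and the exact definition used by the page generator is an assumption.

```python
def form_lemma_ratio(n_forms, n_lemmas):
    """Distinct forms per distinct lemma for one part of speech."""
    if n_lemmas == 0:
        raise ValueError("no lemmas observed")
    return n_forms / n_lemmas

# Figures reported on this page: 285 NOUN types, 1 NOUN lemma ("_").
noun_ratio = form_lemma_ratio(285, 1)
print(f"{noun_ratio:.6f}")  # 285.000000
```

With a single placeholder lemma, the ratio simply equals the number of NOUN types, so it says little about inflectional richness in this treebank.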

The 1st highest number of forms (285) was observed with the lemma “_”: =na, =naː, alla, allaː, allaːji, baji, balad, balami, bani, baraːm, baxit, baʃar, baːb, baːgi, bhali, bhar, bilbil, biri, bissa, bri, buːn, bʔaɖ, bʔaɖaɖ, bʔeː, da, daba, damaːn, dammʔara, dar, dara, darab, dawaːhi, deː, deːm, dhaj, dikʷkʷaːn, dirʔa, dirʔaː, diwaːn, doːr, doːra, duːr, dwaːn, dʔawaː, faras, findikʷ, finʤan, finʤaːn, firha, firkaːk, firʔa, gab, gabal, gabaːteː, gahwat, gamiːs, ganaːj, garb, gaw, gaɖʔa, ginh, ginha, ginʔ, girma, gʷargʷadi, gʷaːb, gʷbi, gʷʔanaːti, hajʔa, halak, halaka, halla, hamoː, handi, hanʤar, harnaːjeːt, harroː, hasir, hawat, hawlijaːj, haˈwaːd, haːda~doːjeː, haːl, haːʃ, heːlaj, hi, his, hoː, hoːb, hus, iːjʔa, iːjʔaː, i̠ːjʔaː, jad, jaf, jam, jhaːm, jinaː, jiwaːʃi, kam, kantuːr, karaːj, karaːma, karaːmaː, kaːm, kiraːj, kiʃja, kjas, koːba, koːlaj, koːma, kʷinha, liːlaːw, liːli, luːbja, madar, magʷal, majʔa, mana, manan, mangaːj, manniimti, maːl, mbaːba, mbiɖeːj, mbʔaɖ, mbʔi, meːk, meːs, mha, mhallaga, mhawaj, mhijeː, mhiːn, mijaːd, mijʔat, mindikʷijaːj, mirkʷaːj, mirʔafi, misuːs, mittia, miʃʔari, miːmaʃa, mʔakʷara, mʔam, mʔari, mʔariː, mʔaːdami, na, nafara, nafs, naweː, naː, nda, ndeː, ndi, nfʔa, ngirab, nifri, nihaːs, noːs, nʔandaː, rabameːkaːji, ragad, ragada, raw, raːw, rba, reːr, reːw, rhat, riba, rifkaːk, sak, samaːr, sana, saraːt, saroːj, sasuːbajaː, siganfoːj, sijaːm, sikka, sitoːboːj, siʤin, sjaːm, suːfa, suːg, suːr, tak, takat, taktʔi, taktʔiː, talga, tam, tarabeː, tarʤimaːl, tijoː, tirig, tji, trig, tʔiit, wanas, wari, wast, waʤʤa, waːw, weːnaː, wjaː, xadaːra, xaddam, xawaːʤa, zirʔa, ɖa~ɖib, ɖiwa, ʃa, ʃabaka, ʃakeː, ʃaki, ʃakʷiːn, ʃamat, ʃanha, ʃartija, ʃawweː, ʃawwia, ʃaː, ʃaːk, ʃibibat, ʃinhat, ʃkaːm, ʃuːk, ʃʔaː, ʈibin, ʈiːn, ʈʔa, ʔaba, ʔabaː, ʔadeː, ʔadi, ʔagja, ʔaj, ʔajajdhaja, ʔajaːj, ʔalaːma, ʔalba, ʔamaːr, ʔamuːl, ʔangʷil, ʔani, ʔannuːr, ʔanoː, ʔar, ʔarabijaːj, ʔaraw, ʔaraːw, ʔarːbi, ʔasir, ʔaweː, ʔawi, ʔaːda, ʔaːdeː, ʔaːmanaːj, ʔaːrbi, ʔaːrbiː, 
ʔaːʃoː, ʔeːga, ʔeːgirim, ʔeːgrim, ʔeːtrig, ʔidda, ʔimir, ʔiʃa, ʔiʤir, ʔiːbaːb, ʔiːbaːbkina, ʔiːd, ʔoːr, ʔoːraj, ʔoːt, ʤabanaː, ʤaːntaːji, ʤimʔa, ʤineːnaː, ʤinsa, ʤoːharaaːji, ʤoːharajaːj.

NOUN occurs with 4 features: Gender (685; 77% instances), Number (89; 10% instances), ExtPos (2; 0% instances), Foreign (2; 0% instances)

NOUN occurs with 6 feature-value pairs: ExtPos=ADV, Foreign=Yes, Gender=Fem, Gender=Masc, Number=Coll, Number=Plur

NOUN occurs with 10 feature combinations. The most frequent feature combination is Gender=Masc (443 tokens). Examples: tak, mhiːn, doːr, jhaːm, mijʔat, jam, bhar, heːlaj, gaw, handi

Relations

NOUN nodes are attached to their parents using 21 different relations: obj (292; 33% instances), nsubj (186; 21% instances), dep:comp (185; 21% instances), dep:conj (34; 4% instances), obl:mod (33; 4% instances), nmod (28; 3% instances), dislocated:obj (26; 3% instances), obl:arg (21; 2% instances), dislocated:subj (20; 2% instances), reparandum (17; 2% instances), root (11; 1% instances), xcomp (10; 1% instances), parataxis (7; 1% instances), dislocated (5; 1% instances), vocative (5; 1% instances), acl:relcl (4; 0% instances), fixed (4; 0% instances), advmod (2; 0% instances), appos (2; 0% instances), discourse (1; 0% instances), parataxis:parenth (1; 0% instances)
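The percentages in this list appear to be each relation's count divided by the 894 NOUN tokens, rounded to a whole number. A minimal sketch of that arithmetic for the three most frequent relations follows; the helper `share` is hypothetical and the rounding rule is an assumption.

```python
def share(count, total):
    """Percentage of `total` represented by `count`, rounded to an integer."""
    return round(100 * count / total)

NOUN_TOKENS = 894  # total NOUN tokens reported on this page

# Counts copied from the list above (top three relations only).
top_relations = {"obj": 292, "nsubj": 186, "dep:comp": 185}

shares = {rel: share(n, NOUN_TOKENS) for rel, n in top_relations.items()}
for rel, pct in shares.items():
    print(f"{rel}: {pct}%")
```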

Parents of NOUN nodes belong to 11 different parts of speech: VERB (599; 67% instances), ADP (170; 19% instances), NOUN (72; 8% instances), SCONJ (16; 2% instances), [no part of speech; the artificial root] (11; 1% instances), ADJ (9; 1% instances), X (9; 1% instances), AUX (3; 0% instances), PROPN (3; 0% instances), INTJ (1; 0% instances), PRON (1; 0% instances)

89 (10%) NOUN nodes are leaves.

334 (37%) NOUN nodes have one child.

261 (29%) NOUN nodes have two children.

210 (23%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 9.
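The leaf, one-child, two-children, and three-or-more figures above can be derived from the HEAD column by counting how many tokens point at each NOUN. The sketch below assumes sentences have already been reduced to `(id, upos, head)` tuples; the function name and the toy sentence are illustrative and not taken from the treebank.

```python
from collections import Counter

def child_degree_buckets(sentences, upos="NOUN"):
    """Count nodes of the given UPOS by number of children (3 = three or more)."""
    buckets = Counter()
    for sent in sentences:
        # Number of dependents attached to each head id in this sentence.
        n_children = Counter(head for _id, _upos, head in sent)
        for tok_id, tok_upos, _head in sent:
            if tok_upos == upos:
                degree = n_children[tok_id]
                buckets[min(degree, 3)] += 1
    return buckets

# Toy sentence: DET -> NOUN -> VERB (root) <- NOUN
TOY_SENTENCES = [
    [(1, "DET", 2), (2, "NOUN", 3), (3, "VERB", 0), (4, "NOUN", 3)],
]
toy = child_degree_buckets(TOY_SENTENCES)
print(dict(toy))

# Sanity check on the figures reported on this page: the four buckets
# should add up to the 894 NOUN tokens.
reported = {0: 89, 1: 334, 2: 261, 3: 210}
print(sum(reported.values()))
```

In the toy sentence, the first NOUN has one child (the determiner) and the second is a leaf.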

Children of NOUN nodes are attached using 30 different relations: det (734; 47% instances), punct (213; 14% instances), nmod:poss (127; 8% instances), acl:relcl (111; 7% instances), nmod (73; 5% instances), discourse (40; 3% instances), amod (38; 2% instances), dep (36; 2% instances), cc (32; 2% instances), advmod (30; 2% instances), dep:conj (27; 2% instances), case (16; 1% instances), cop (16; 1% instances), reparandum (13; 1% instances), dep:comp (12; 1% instances), nummod (8; 1% instances), acl (7; 0% instances), nsubj (7; 0% instances), appos (6; 0% instances), aux (4; 0% instances), dislocated:mod (4; 0% instances), obj (4; 0% instances), obl:arg (4; 0% instances), vocative (4; 0% instances), fixed (2; 0% instances), csubj (1; 0% instances), dislocated:subj (1; 0% instances), obl:mod (1; 0% instances), parataxis (1; 0% instances), parataxis:parenth (1; 0% instances)

Children of NOUN nodes belong to 16 different parts of speech: DET (738; 47% instances), PUNCT (213; 14% instances), PRON (150; 10% instances), NOUN (72; 5% instances), SCONJ (68; 4% instances), VERB (62; 4% instances), ADP (59; 4% instances), ADJ (54; 3% instances), PART (45; 3% instances), CCONJ (30; 2% instances), ADV (22; 1% instances), AUX (21; 1% instances), NUM (17; 1% instances), INTJ (9; 1% instances), PROPN (8; 1% instances), X (5; 0% instances)