Treebank Statistics: UD_Naija-NSC: POS Tags: X
There are 115 X lemmas (3%), 322 X types (6%) and 40230 X tokens (29%).
Out of 15 observed tags, the rank of X is: 6 in number of lemmas, 5 in number of types and 1 in number of tokens.
The 10 most frequent X lemmas: #, //, <, {, }, [, |c, ||, ], >+
The 10 most frequent X types: #, //, <, {, }, [, |c, ||, ], >+
The 10 most frequent ambiguous lemmas: X (X 410, INTJ 6, DET 3, NUM 1), ma (PART 28, X 3), per (ADP 6, X 3), cup (NOUN 3, X 2), dem (PRON 2221, PART 42, X 2), na (AUX 2025, X 2, ADP 1), di (DET 2815, X 1), gbogbo (ADV 1, X 1), husband (NOUN 33, X 1), ka (PRON 1, X 1)
The 10 most frequent ambiguous types: ma (PART 28, X 8), b~ (X 7, DET 1), ba (PART 15, X 5), e (PRON 1564, X 5, DET 1), a (DET 127, PRON 22, X 3, NOUN 1), o~ (X 3, NUM 1), per (ADP 6, X 3), wa (INTJ 12, X 3), da (X 2, DET 1), de (PRON 1225, X 2)
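The ambiguity lists above can be reproduced from (lemma, tag) pairs. The following is a minimal sketch, assuming that "ambiguous" means a lemma observed with X and at least one other tag, and that the ranking is by X count with ties broken alphabetically (which matches the order shown, but is an inference rather than documented behaviour). The function name `ambiguous_lemmas` is hypothetical.

```python
from collections import Counter, defaultdict

def ambiguous_lemmas(pairs, upos="X"):
    """Return lemmas seen with `upos` and at least one other tag.

    `pairs` is an iterable of (lemma, tag) tuples, one per token.
    Sorting by descending `upos` count, then by lemma, is an assumption
    inferred from the order of the list in this report.
    """
    by_lemma = defaultdict(Counter)
    for lemma, tag in pairs:
        by_lemma[lemma][tag] += 1
    hits = [(lemma, tags) for lemma, tags in by_lemma.items()
            if upos in tags and len(tags) > 1]
    return sorted(hits, key=lambda item: (-item[1][upos], item[0]))

# Token counts for three lemmas, taken from the list above.
pairs = ([("ma", "PART")] * 28 + [("ma", "X")] * 3
         + [("per", "ADP")] * 6 + [("per", "X")] * 3
         + [("husband", "NOUN")] * 33 + [("husband", "X")] * 1)
ranked = ambiguous_lemmas(pairs)
print([lemma for lemma, _ in ranked])
```

The same function applies to types if word forms are passed in place of lemmas.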
Morphology
The form / lemma ratio of X is 2.800000 (the average of all parts of speech is 1.162960).
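The ratio is consistent with the counts reported at the top of this page: 322 types divided by 115 lemmas. A minimal sketch of the computation follows, assuming forms and lemmas are compared as exact, case-sensitive strings (the report does not say how they are normalised). The function name `form_lemma_ratio` is hypothetical.

```python
def form_lemma_ratio(rows, upos):
    """Distinct forms divided by distinct lemmas for one POS tag.

    `rows` is an iterable of (form, lemma, tag) triples, one per token.
    Exact string comparison is an assumption, not documented behaviour.
    """
    forms, lemmas = set(), set()
    for form, lemma, tag in rows:
        if tag == upos:
            forms.add(form)
            lemmas.add(lemma)
    return len(forms) / len(lemmas) if lemmas else 0.0

# Check against the counts reported above: 322 types over 115 lemmas.
reported = f"{322 / 115:.6f}"
print(reported)  # 2.800000

# Toy rows (invented for illustration): two forms share the lemma "X".
toy = [("fo~", "X", "X"), ("wa~", "X", "X"), ("go", "go", "VERB")]
print(form_lemma_ratio(toy, "X"))  # 2.0
```

The high ratio for X is driven almost entirely by the single lemma “X”, which collects the truncated forms listed below.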
The 1st highest number of forms (210) was observed with the lemma “X”: Adigo, Agwan~, Alap~, Ala~, Ea~, Fren~, Fr~, Had~, Kw~, Lil~, Max~, Om~, Oria~, RI~, STP~, X, a, abf, ab~, ak~, ala, almo~, al~, anyb~, ar~, avera~, aw~, a~, ba, ban~, ba~, be~, bi~, brin~, bro~, br~, bu~, b~, ca, ca~, chea~, checkli~, chi~, ch~, cle~, conne~, con~, coun~, co~, cr~, cu~, c~, da, de~, di~, do~, du~, d~, e, eh, en~, epurutepu, etin~, ev, everyti~, everyt~, ev~, exa~, e~, fa, fai~, fe~, fini~, fin~, fi~, fore~, for~, fo~, f~, ga~, gbu~, ge, gene~, ge~, ghet~, giti, gi~, gm~, gover~, go~, gu~, g~, hav~, hel~, hip~, ho~, hub~, huma~, h~, im~, inf~, ingred~, insi~, inst~, i~, kambia, k~, lafs~, la~, le~, lit~, li~, ma, mad~, mana~, ma~, med~, me~, mil~, mir~, mon~, mor~, mow~, mo~, mu~, m~, nai~, ne, nikan, norm~, not~, no~, nso, nu, num~, n~, ogbeni, origi~, ori~, oro~, over~, o~, pala~, pa~, pelu, peo~, pers~, pe~, pik~, pi~, pla~, pol~, post, pre~, profe~, pro~, pur~, pu~, p~, re, reach, repre~, res~, re~, r~, sab~, sa~, se~, shere, sh~, sin~, sis~, si~, sle~, sm~, som~, so~, spe~, spu~, sp~, st~, su~, swe~, sy~, s~, tawon, ta~, thirt~, thou~, ti~, traffi~, tre~, tri~, tu, t~, una, under~, un~, wa, wa~, wet~, we~, wit~, wi~, wom~, wor~, wo~, wu~, w~, zaga.
The 2nd highest number of forms (2) was observed with the lemma “//”: //, }.
The 3rd highest number of forms (2) was observed with the lemma “>”: <, >.
X occurs with 6 features: PronType (7; 0% instances), Case (6; 0% instances), Person (6; 0% instances), PartType (4; 0% instances), Number (3; 0% instances), NumType (1; 0% instances)
X occurs with 9 feature-value pairs: Case=Nom, NumType=Card, Number=Plur, Number=Sing, PartType=Cop, Person=2, Person=3, PronType=Dem, PronType=Prs
X occurs with 7 feature combinations.
The most frequent feature combination is _ (40218 tokens).
Examples: #, //, <, {, }, [, |c, ||, ], >+
Relations
X nodes are attached to their parents using 33 different relations: dep (39698; 99% instances), flat:foreign (97; 0% instances), root (84; 0% instances), obj (57; 0% instances), reparandum (46; 0% instances), nmod (34; 0% instances), obl:mod (30; 0% instances), nsubj (25; 0% instances), flat (17; 0% instances), obl:arg (17; 0% instances), acl:relcl (13; 0% instances), conj (13; 0% instances), discourse (11; 0% instances), xcomp (10; 0% instances), advcl (8; 0% instances), compound (8; 0% instances), dislocated (8; 0% instances), compound:svc (7; 0% instances), iobj (7; 0% instances), parataxis (7; 0% instances), advcl:cleft (6; 0% instances), ccomp (5; 0% instances), fixed (4; 0% instances), parataxis:conj (4; 0% instances), parataxis:parenth (4; 0% instances), compound:redup (2; 0% instances), vocative (2; 0% instances), advmod (1; 0% instances), appos (1; 0% instances), cc (1; 0% instances), compound:prt (1; 0% instances), nmod:poss (1; 0% instances), obl:agent (1; 0% instances)
Parents of X nodes belong to 16 different parts of speech: VERB (20538; 51% instances), NOUN (7167; 18% instances), ADV (2597; 6% instances), PRON (2559; 6% instances), ADJ (1542; 4% instances), AUX (1103; 3% instances), PROPN (1004; 2% instances), X (935; 2% instances), INTJ (933; 2% instances), NUM (453; 1% instances), DET (397; 1% instances), ADP (373; 1% instances), PART (287; 1% instances), SCONJ (182; 0% instances), (84; 0% instances), CCONJ (76; 0% instances)
39852 (99%) X nodes are leaves.
27 (0%) X nodes have one child.
49 (0%) X nodes have two children.
302 (1%) X nodes have three or more children.
The highest child degree of an X node is 26.
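The four buckets above sum to the 40230 X tokens reported at the top of the page (39852 + 27 + 49 + 302). A minimal sketch of how such a histogram could be derived from CoNLL-U text is shown here, assuming the standard ten tab-separated columns and skipping multiword-token ranges and empty nodes. The function name `child_degree_histogram` and the toy sentence are invented for illustration and are not taken from the treebank.

```python
from collections import Counter

def child_degree_histogram(conllu_text, upos="X"):
    """Count nodes tagged `upos` by number of children (0, 1, 2, 3+).

    Bucket 3 stands for "three or more". Sentences are assumed to be
    separated by blank lines, as in the CoNLL-U format.
    """
    hist = Counter()
    for block in conllu_text.strip().split("\n\n"):
        tags = {}               # token ID -> UPOS
        n_children = Counter()  # head ID -> number of dependents
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue
            cols = line.split("\t")
            if "-" in cols[0] or "." in cols[0]:
                continue  # multiword-token range or empty node
            tags[cols[0]] = cols[3]
            n_children[cols[6]] += 1
        for tok_id, tag in tags.items():
            if tag == upos:
                hist[min(n_children[tok_id], 3)] += 1
    return hist

# Toy sentence (invented): token 3 is an X node with one X child.
sample = (
    "# text = toy example\n"
    "1\tgo\tgo\tVERB\t_\t_\t0\troot\t_\t_\n"
    "2\t#\t#\tX\t_\t_\t1\tdep\t_\t_\n"
    "3\tfo~\tX\tX\t_\t_\t1\tdep\t_\t_\n"
    "4\t//\t//\tX\t_\t_\t3\tdep\t_\t_\n"
)
hist = child_degree_histogram(sample)
print(dict(hist))
```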
Children of X nodes are attached using 37 different relations: dep (814; 51% instances), reparandum (260; 16% instances), flat:foreign (112; 7% instances), nsubj (61; 4% instances), aux (55; 3% instances), case (49; 3% instances), det (44; 3% instances), discourse (36; 2% instances), cop (26; 2% instances), advmod (18; 1% instances), nmod (17; 1% instances), mark (15; 1% instances), flat (13; 1% instances), amod (11; 1% instances), conj (10; 1% instances), cc (7; 0% instances), dislocated (7; 0% instances), advcl (5; 0% instances), acl (4; 0% instances), nmod:poss (4; 0% instances), obl:mod (4; 0% instances), advcl:cleft (3; 0% instances), compound (3; 0% instances), parataxis:conj (3; 0% instances), acl:relcl (2; 0% instances), compound:redup (2; 0% instances), nummod (2; 0% instances), appos (1; 0% instances), ccomp (1; 0% instances), compound:svc (1; 0% instances), csubj (1; 0% instances), expl:subj (1; 0% instances), fixed (1; 0% instances), obj (1; 0% instances), parataxis:discourse (1; 0% instances), vocative (1; 0% instances), xcomp (1; 0% instances)
Children of X nodes belong to 15 different parts of speech: X (935; 59% instances), NOUN (130; 8% instances), VERB (87; 5% instances), AUX (83; 5% instances), PRON (83; 5% instances), DET (48; 3% instances), ADP (44; 3% instances), SCONJ (35; 2% instances), ADJ (33; 2% instances), PROPN (29; 2% instances), ADV (26; 2% instances), INTJ (26; 2% instances), PART (20; 1% instances), CCONJ (13; 1% instances), NUM (5; 0% instances)