
This page pertains to UD version 2.

Treebank Statistics: UD_Arabic-NYUAD: POS Tags: X

There are 36 X lemmas (1%), 1 X type (6%), and 927 X tokens (0%). Out of 16 observed tags, the rank of X is: 4 in number of lemmas, 16 in number of types and 15 in number of tokens.

The 10 most frequent X lemmas: _، typo، TBupdate، the، EADS، b، in، w، &Cx0b، &QC

The 10 most frequent X types: _

The 10 most frequent ambiguous lemmas: _ (NOUN 221327, PUNCT 71973, ADJ 68841, ADP 62617, VERB 55127, PROPN 48391, ADV 23955, SCONJ 15652, NUM 15105, PRON 12926, AUX 6881, DET 6354, CCONJ 3889, PART 1501, X 380, INTJ 56), typo (X 317, ADP 1), TBupdate (NOUN 408, ADJ 340, VERB 268, X 190, PROPN 69, PUNCT 15, ADP 1, SCONJ 1), b (ADP 12334, NOUN 21, DET 2, PRON 2, SCONJ 2, X 2, ADJ 1, VERB 1), w (CCONJ 43819, SCONJ 235, ADP 42, NOUN 41, VERB 40, ADJ 14, PRON 12, PROPN 9, DET 4, PART 3, NUM 2, PUNCT 2, X 2), F (NOUN 2, X 1), l (ADP 15628, PART 165, NOUN 29, SCONJ 28, ADV 2, VERB 2, ADJ 1, DET 1, NUM 1, PROPN 1, PUNCT 1, X 1), s (AUX 2274, VERB 7, NOUN 1, X 1)

The 10 most frequent ambiguous types: _ (NOUN 221899, ADP 91743, PUNCT 75266, ADJ 69355, PROPN 57421, VERB 55469, CCONJ 49161, PRON 43495, ADV 24067, SCONJ 16614, NUM 15377, AUX 9155, DET 6363, PART 2521, X 927, INTJ 56)

Morphology

The form / lemma ratio of X is 0.027778 (the average of all parts of speech is 0.003044).
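The reported ratio is consistent with dividing the number of distinct X forms (types) by the number of distinct X lemmas given at the top of this page. The following is a minimal sketch of that arithmetic, assuming this is how the figure is derived; the function name `form_lemma_ratio` is hypothetical.

```python
def form_lemma_ratio(n_forms: int, n_lemmas: int) -> float:
    """Distinct word forms divided by distinct lemmas (assumed definition)."""
    return n_forms / n_lemmas

# Figures from this page: 1 X type and 36 X lemmas.
ratio = form_lemma_ratio(1, 36)
print(f"{ratio:.6f}")  # 0.027778
```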

The 1st highest number of forms (1) was observed with the lemma “&Cx0b”: _.

The 2nd highest number of forms (1) was observed with the lemma “&QC”: _.

The 3rd highest number of forms (1) was observed with the lemma “&UR”: _.

X occurs with 9 features: Gender (482; 52% instances), Number (482; 52% instances), Definite (281; 30% instances), Person (205; 22% instances), Voice (205; 22% instances), Mood (197; 21% instances), Case (38; 4% instances), AdpType (2; 0% instances), Polarity (2; 0% instances)

X occurs with 21 feature-value pairs: AdpType=Prep, Case=Acc, Case=Gen, Case=Nom, Definite=Com, Definite=Def, Definite=Ind, Gender=Fem, Gender=Masc, Mood=Ind, Mood=Jus, Mood=Sub, Number=Dual, Number=Plur, Number=Sing, Person=1, Person=2, Person=3, Polarity=Neg, Voice=Act, Voice=Pass

X occurs with 47 feature combinations. The most frequent feature combination is _ (439 tokens). Examples: _
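Feature statistics of this kind can be reproduced from the FEATS column of a CoNLL-U file, where features are written as `Name=Value` pairs joined by `|` and an empty feature set is written as `_`. The sketch below is illustrative only: the helper `parse_feats` and the sample strings are hypothetical, not taken from the treebank.

```python
from collections import Counter

def parse_feats(feats: str) -> dict:
    """Turn a CoNLL-U FEATS string into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

# Hypothetical FEATS values for three tokens (not real treebank data).
sample_feats = [
    "Gender=Masc|Number=Sing",
    "_",
    "Case=Gen|Definite=Def|Gender=Fem|Number=Plur",
]

# Number of tokens carrying each feature name.
feature_counts = Counter(name for f in sample_feats for name in parse_feats(f))

# Number of tokens per full feature combination ('_' is the empty combination).
combination_counts = Counter(sample_feats)

# Percentage as reported on this page: Gender occurs on 482 of 927 X tokens.
gender_share = round(100 * 482 / 927)  # 52
```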

Relations

X nodes are attached to their parents using 9 different relations: nmod (653; 70% instances), obj (106; 11% instances), nmod:poss (93; 10% instances), iobj (22; 2% instances), nsubj (21; 2% instances), mark (19; 2% instances), root (11; 1% instances), acl (1; 0% instances), xcomp (1; 0% instances)

Parents of X nodes belong to 11 different parts of speech: NOUN (347; 37% instances), VERB (346; 37% instances), ADV (50; 5% instances), ADJ (49; 5% instances), PROPN (35; 4% instances), PRON (33; 4% instances), X (30; 3% instances), NUM (15; 2% instances), [no parent: root] (11; 1% instances), DET (7; 1% instances), CCONJ (4; 0% instances)
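Counts of incoming relations and parent parts of speech can be gathered from the UPOS, HEAD and DEPREL columns of a CoNLL-U file. The sketch below works on a tiny made-up sentence; the names `SAMPLE` and `read_tokens` and the `(root)` label for HEAD 0 are assumptions for illustration, not part of the tool that generated this page.

```python
from collections import Counter

# Hypothetical four-token sentence in CoNLL-U (10 tab-separated columns).
SAMPLE = (
    "1\tA\ta\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "2\tB\tb\tX\t_\t_\t1\tnmod\t_\t_\n"
    "3\tC\tc\tX\t_\t_\t2\tnmod\t_\t_\n"
    "4\tD\td\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
)

def read_tokens(block: str) -> dict:
    """Map token ID to its UPOS, HEAD and DEPREL for one sentence."""
    tokens = {}
    for line in block.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        tokens[int(cols[0])] = {
            "upos": cols[3],
            "head": int(cols[6]),
            "deprel": cols[7],
        }
    return tokens

tokens = read_tokens(SAMPLE)
x_nodes = [t for t in tokens.values() if t["upos"] == "X"]

# Relations that attach X nodes to their parents.
relations = Counter(t["deprel"] for t in x_nodes)

# Part of speech of each X node's parent; HEAD 0 means the node has no parent.
parent_pos = Counter(
    tokens[t["head"]]["upos"] if t["head"] != 0 else "(root)" for t in x_nodes
)
```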

607 (65%) X nodes are leaves.

156 (17%) X nodes have one child.

100 (11%) X nodes have two children.

64 (7%) X nodes have three or more children.

The highest child degree of an X node is 19.
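The child-degree figures above partition the 927 X tokens into four groups. As a rough sketch of how such a distribution might be computed, the example below counts children per node from `(id, upos, head)` triples; the sample data and the `bucket` helper are hypothetical.

```python
from collections import Counter

# Hypothetical sentence as (token id, UPOS, head id); head 0 marks the root.
sample = [
    (1, "NOUN", 0),
    (2, "X", 1),
    (3, "X", 2),
    (4, "PUNCT", 3),
    (5, "NOUN", 3),
    (6, "X", 1),
]

# Number of children per head id.
children = Counter(head for _, _, head in sample)

# Child degree of every X node (0 when the node never appears as a head).
degrees = {tid: children.get(tid, 0) for tid, upos, _ in sample if upos == "X"}

def bucket(degree: int) -> str:
    """Group child degrees the way this page reports them."""
    if degree == 0:
        return "leaf"
    if degree == 1:
        return "one child"
    if degree == 2:
        return "two children"
    return "three or more children"

distribution = Counter(bucket(d) for d in degrees.values())
highest_degree = max(degrees.values())

# Sanity check on the figures reported on this page: the groups cover all X tokens.
reported_total = 607 + 156 + 100 + 64  # 927
```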

Children of X nodes are attached using 17 different relations: nmod (203; 32% instances), obj (114; 18% instances), case (76; 12% instances), punct (45; 7% instances), amod (31; 5% instances), ccomp (26; 4% instances), nummod (24; 4% instances), cc (21; 3% instances), xcomp (17; 3% instances), mark (16; 3% instances), advmod (15; 2% instances), iobj (15; 2% instances), cop (11; 2% instances), nsubj (9; 1% instances), nmod:poss (5; 1% instances), aux (1; 0% instances), det (1; 0% instances)

Children of X nodes belong to 15 different parts of speech: NOUN (228; 36% instances), ADP (76; 12% instances), PROPN (45; 7% instances), PUNCT (45; 7% instances), VERB (43; 7% instances), ADJ (32; 5% instances), PRON (32; 5% instances), X (30; 5% instances), NUM (25; 4% instances), ADV (22; 3% instances), CCONJ (21; 3% instances), SCONJ (13; 2% instances), AUX (12; 2% instances), PART (4; 1% instances), DET (2; 0% instances)