
This page pertains to UD version 2.

Treebank Statistics: UD_Slovenian-SSJ: POS Tags: X

There are 1081 X lemmas (4%), 1093 X types (2%) and 2092 X tokens (1%). Out of 17 observed tags, the rank of X is: 7 in number of lemmas, 7 in number of types and 15 in number of tokens.
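For readers who want to reproduce counts of this kind, here is a minimal sketch of how lemma, type and token totals for one tag could be derived from CoNLL-U data. It assumes that a "type" is a distinct, case-sensitive word form and that multiword-token ranges and empty nodes are skipped. The helper names `parse_conllu` and `tag_counts` and the toy sentence in `SAMPLE` are illustrative only and are not taken from the treebank or from the tool that generated this page.

```python
def parse_conllu(text):
    """Yield (form, lemma, upos) for each syntactic word in CoNLL-U text."""
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3]


def tag_counts(text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for form, lemma, tag in parse_conllu(text):
        if tag == upos:
            lemmas.add(lemma)
            types.add(form)  # assumption: types are case-sensitive forms
            tokens += 1
    return len(lemmas), len(types), tokens


# Hypothetical toy sentence (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).
SAMPLE = "\n".join("\t".join(row) for row in [
    ("1", "New", "New", "X", "_", "Foreign=Yes", "0", "root", "_", "_"),
    ("2", "York", "York", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"),
    ("3", "Yorku", "York", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"),
    ("4", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"),
])

print(tag_counts(SAMPLE, "X"))  # (2, 3, 3)
```

Ranking a tag among all observed tags would then amount to calling `tag_counts` once per tag and sorting the results by each of the three totals.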

The 10 most frequent X lemmas: dr., de, the, št., d., t., m., of, o., p.

The 10 most frequent X types: dr., de, the, št., d., t., m., of, o., p.

The 10 most frequent ambiguous lemmas: a (CCONJ 216, X 12, ADV 5, NOUN 2), York (PROPN 14, X 10), on (PRON 2656, X 8), Al (X 7, PROPN 4), - (PUNCT 1119, X 6, SYM 1), i (X 4, NOUN 1), in (CCONJ 6119, ADV 5, X 5, NOUN 1), master (X 5, NOUN 1), Berkeley (PROPN 5, X 4), San (PROPN 6, X 4)

The 10 most frequent ambiguous types: de (X 32, VERB 1), a (CCONJ 161, X 11, NOUN 2, ADV 1), New (PROPN 20, X 12), on (PRON 22, X 8), to (DET 612, X 6), Al (X 7, PROPN 4), - (PUNCT 1119, X 6, SYM 1), York (X 6, PROPN 4), sta (AUX 339, VERB 14, X 6), Windows (PROPN 13, X 5)

Morphology

The form / lemma ratio of X is 1.011101 (the average of all parts of speech is 1.932008).

The 1st highest number of forms (4) was observed with the lemma “York”: YORK, York, Yorka, Yorku.

The 2nd highest number of forms (2) was observed with the lemma “Janeiro”: JANEIRO, Janeiro.

The 3rd highest number of forms (2) was observed with the lemma “New”: NEW, New.
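The form / lemma ratio can be illustrated with a short sketch. It assumes the ratio is the total number of distinct forms, summed over lemmas, divided by the number of lemmas; that reading is consistent with the reported value, since 1093 types / 1081 lemmas rounds to 1.011101. The function name `form_lemma_ratio` is hypothetical, and the input pairs are just the forms listed above for "York" and "New".

```python
from collections import defaultdict


def form_lemma_ratio(pairs):
    """pairs: iterable of (form, lemma). Returns distinct forms per lemma, averaged."""
    forms_by_lemma = defaultdict(set)
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
    total_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return total_forms / len(forms_by_lemma)


# Forms reported on this page for two X lemmas.
pairs = [
    ("YORK", "York"), ("York", "York"), ("Yorka", "York"), ("Yorku", "York"),
    ("NEW", "New"), ("New", "New"),
]
ratio = form_lemma_ratio(pairs)      # 6 forms over 2 lemmas
reported = round(1093 / 1081, 6)     # X types / X lemmas from the counts above

print(ratio)     # 3.0
print(reported)  # 1.011101
```

A ratio this close to 1 indicates that almost every X lemma appears in a single form, in contrast with the treebank-wide average of 1.932008.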

X occurs with 2 features: Foreign (1296; 62% instances), Abbr (655; 31% instances)

X occurs with 2 feature-value pairs: Abbr=Yes, Foreign=Yes

X occurs with 3 feature combinations. The most frequent feature combination is Foreign=Yes (1296 tokens). Examples: de, the, of, a, la, New, and, to, Al, on

Relations

X nodes are attached to their parents using 25 different relations: flat:foreign (784; 37% instances), nmod (366; 17% instances), appos (144; 7% instances), flat:name (122; 6% instances), root (108; 5% instances), conj (100; 5% instances), fixed (92; 4% instances), nsubj (77; 4% instances), list (58; 3% instances), amod (55; 3% instances), obl (46; 2% instances), dep (39; 2% instances), cc (35; 2% instances), parataxis (24; 1% instances), obj (23; 1% instances), advmod (6; 0% instances), orphan (3; 0% instances), case (2; 0% instances), vocative (2; 0% instances), acl (1; 0% instances), flat (1; 0% instances), goeswith (1; 0% instances), iobj (1; 0% instances), mark (1; 0% instances), xcomp (1; 0% instances)

Parents of X nodes belong to 12 different parts of speech: X (1036; 50% instances), NOUN (430; 21% instances), PROPN (284; 14% instances), VERB (151; 7% instances), [no parent; the X node is the root] (108; 5% instances), ADJ (35; 2% instances), NUM (21; 1% instances), ADV (14; 1% instances), DET (7; 0% instances), SCONJ (3; 0% instances), SYM (2; 0% instances), ADP (1; 0% instances)
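The relation and parent-tag distributions above could be tallied roughly as follows. This is a sketch under stated assumptions, not the generator's actual code: it reads the HEAD and DEPREL columns of CoNLL-U, and it counts a root node (HEAD 0) under an empty parent tag, which would explain why one entry in the parent list has a count of 108 matching the `root` relation. The names `read_sentences` and `attachment_stats` and the toy sentence are hypothetical.

```python
from collections import Counter


def read_sentences(text):
    """Split CoNLL-U text into sentences, each a list of 10-column rows."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            if cols[0].isdigit():  # skip multiword ranges and empty nodes
                sentence.append(cols)
    if sentence:
        yield sentence


def attachment_stats(text, upos):
    """Count incoming relations and parent UPOS tags for nodes tagged `upos`."""
    relations, parent_tags = Counter(), Counter()
    for sentence in read_sentences(text):
        tag_of = {cols[0]: cols[3] for cols in sentence}
        for cols in sentence:
            if cols[3] != upos:
                continue
            relations[cols[7]] += 1
            # HEAD "0" has no entry in tag_of, so roots get the empty tag "".
            parent_tags[tag_of.get(cols[6], "")] += 1
    return relations, parent_tags


# Hypothetical toy sentence, not taken from the treebank.
SAMPLE = "\n".join("\t".join(row) for row in [
    ("1", "New", "New", "X", "_", "Foreign=Yes", "0", "root", "_", "_"),
    ("2", "York", "York", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"),
    ("3", "Yorku", "York", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"),
    ("4", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"),
])

relations, parent_tags = attachment_stats(SAMPLE, "X")
print(relations.most_common())
print(parent_tags.most_common())
```

Dividing each count by the total number of X tokens gives percentages of the kind shown in the lists above.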

1292 (62%) X nodes are leaves.

218 (10%) X nodes have one child.

270 (13%) X nodes have two children.

312 (15%) X nodes have three or more children.

The highest child degree of an X node is 35.
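The leaf and child-degree figures can be sketched as a histogram over the number of children per node. The example below assumes each sentence is available as a list of `(id, upos, head)` tuples; the function names `child_degree_histogram` and `bucket` and the four-node toy sentence are illustrative assumptions.

```python
from collections import Counter


def child_degree_histogram(rows, upos):
    """rows: (id, upos, head) tuples for one sentence.

    Returns a Counter mapping child degree -> number of `upos` nodes.
    """
    # Number of children per head id; a missing key reads as 0 in a Counter.
    children = Counter(head for _, _, head in rows)
    return Counter(children[node_id] for node_id, tag, _ in rows if tag == upos)


def bucket(hist):
    """Group a degree histogram into the four classes used on this page."""
    return {
        "leaves": hist[0],
        "one": hist[1],
        "two": hist[2],
        "three_or_more": sum(n for degree, n in hist.items() if degree >= 3),
    }


# Hypothetical sentence: node 1 is the root and heads nodes 2, 3 and 4.
rows = [(1, "X", 0), (2, "X", 1), (3, "X", 1), (4, "PUNCT", 1)]
hist = child_degree_histogram(rows, "X")

print(bucket(hist))  # {'leaves': 2, 'one': 0, 'two': 0, 'three_or_more': 1}
print(max(hist))     # 3
```

For a whole treebank, the per-sentence histograms would be summed before bucketing; the four reported classes for X (1292, 218, 270 and 312) add up to the 2092 X tokens.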

Children of X nodes are attached using 27 different relations: flat:foreign (784; 36% instances), punct (623; 28% instances), nummod (109; 5% instances), appos (103; 5% instances), flat:name (100; 5% instances), fixed (88; 4% instances), nmod (78; 4% instances), conj (67; 3% instances), case (56; 3% instances), list (33; 2% instances), cc (32; 1% instances), dep (30; 1% instances), amod (29; 1% instances), advmod (15; 1% instances), acl (9; 0% instances), orphan (8; 0% instances), parataxis (8; 0% instances), nsubj (5; 0% instances), obj (5; 0% instances), cop (4; 0% instances), det (2; 0% instances), cc:preconj (1; 0% instances), flat (1; 0% instances), mark (1; 0% instances), obl (1; 0% instances), vocative (1; 0% instances), xcomp (1; 0% instances)

Children of X nodes belong to 15 different parts of speech: X (1036; 47% instances), PUNCT (623; 28% instances), NUM (148; 7% instances), PROPN (122; 6% instances), NOUN (94; 4% instances), ADP (46; 2% instances), CCONJ (33; 2% instances), ADJ (27; 1% instances), SYM (14; 1% instances), VERB (14; 1% instances), ADV (11; 1% instances), SCONJ (11; 1% instances), PART (8; 0% instances), AUX (4; 0% instances), DET (3; 0% instances)