This page pertains to UD version 2.

Treebank Statistics: UD_Slovenian-SST: POS Tags: X

There are 75 X lemmas (2%), 226 X types (4%) and 466 X tokens (2%). Out of 16 observed tags, the rank of X is: 6 in number of lemmas, 6 in number of types and 16 in number of tokens.
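As a rough illustration of how the three counts above can be obtained, the sketch below tallies lemmas, types and tokens for one tag from CoNLL-U text. The sample sentence and the function name pos_counts are hypothetical, and treating a "type" as a case-sensitive distinct word form is an assumption, not something this page states.

```python
# Hypothetical five-token sample in CoNLL-U format (10 tab-separated columns).
SAMPLE = """\
# text = invented example, not taken from the treebank
1\ts-\t_\tX\t_\t_\t2\treparandum\t_\t_
2\tsem\tbiti\tAUX\t_\t_\t0\troot\t_\t_
3\tgreen\tgreen\tX\t_\tForeign=Yes\t2\tobl\t_\t_
4\tgrass\tgrass\tX\t_\tForeign=Yes\t3\tflat:foreign\t_\t_
5\ts-\t_\tX\t_\t_\t2\treparandum\t_\t_
"""


def pos_counts(conllu_text, upos):
    """Return (lemmas, types, tokens) for one UPOS tag.

    Assumption: a type is a distinct, case-sensitive word form.
    """
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


print(pos_counts(SAMPLE, "X"))  # (3, 3, 4)
```

In the sample, the two interrupted words share the lemma "_" and the form "s-", so four X tokens yield three lemmas and three types.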

The 10 most frequent X lemmas: _, green, of, grass, home, non, stop, beautiful, day, fa

The 10 most frequent X types: s-, j-, n-, m-, p-, z-, t-, k-, v-, d-

The 10 most frequent ambiguous lemmas: _ (X 367, PUNCT 1), ka (SCONJ 22, X 2), on (PRON 308, X 2), a (ADV 137, INTJ 16, NOUN 6, CCONJ 3, X 1), da (SCONJ 533, PART 16, X 1), in (CCONJ 414, ADV 1, X 1), kaj (PRON 197, ADV 43, X 1), la (ADV 1, X 1), minus (NOUN 6, X 1), od (ADP 83, X 1)

The 10 most frequent ambiguous types: ka (SCONJ 22, X 2), on (PRON 24, X 2), a (ADV 137, INTJ 16, NOUN 6, CCONJ 3, X 1), bi (AUX 134, VERB 15, X 1), da (SCONJ 533, PART 16, VERB 15, X 1), di (INTJ 2, X 1), ga (PRON 60, X 1), i (PART 1, X 1), imam (VERB 19, X 1), in (CCONJ 414, ADV 1, X 1)

Morphology

The form / lemma ratio of X is 3.013333 (the average of all parts of speech is 1.573353).
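The reported ratio is consistent with dividing the 226 X types by the 75 X lemmas given earlier on this page. A minimal check, with form_lemma_ratio as a hypothetical helper name:

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# Counts for X taken from this page: 226 types, 75 lemmas.
print(f"{form_lemma_ratio(226, 75):.6f}")  # 3.013333
```

The high value is driven almost entirely by the placeholder lemma "_", which alone accounts for 152 forms.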

The 1st highest number of forms (152) was observed with the lemma “_”: Bel-, Franc-, Oma-, Slove-, a-, am-, an-, anal-, avto-, avtomob-, b-, ba-, ce-, ci, d-, dej, dela-, des-, deves-, devetinos-, di, do-, dovoli-, dovre-, e-, fe-, fizi-, g-, gos-, gospo-, gre-, grn-, hotl-, i, i-, ins-, ist-, istrija-, j-, jabol-, k-, ka-, km-, knji-, kolo-, kom-, kompliciraš, ku-, kur-, l-, le-, lu-, m-, ma-, mar-, mat-, mest-, mi-, mid-, mis-, mišani, moš-, n-, na-, naj-, napi-, nar-, naslednj-, nek-, ni-, nih-, nikak-, nje-, njegov-, o-, od-, ojo, om, on-, op-, opaz-, orož-, ose-, p-, pet-, petnš-, po-, pok-, ponava-, pos-, posr-, pr-, pre-, pred-, prelis-, preve-, prevo-, pri-, psi-, r-, ra-, raz-, razu-, raču-, re-, rec-, s-, sa-, se, se-, sko-, sla-, slovarj-, so-, spa-, spe-, spla-, sprašva-, st-, t-, tist-, to-, trans-, u-, usta-, uze-, v-, va-, ver-, vs-, vslak-, vzmeti-, z-, zaba-, zac-, zag-, zar-, zdaj-, zl-, zmišlajo, zob-, zve-, č-, čak-, š-, še-, špi-, šte-, štir-, ž-, žens-, žul-.

The 2nd highest number of forms (1) was observed with the lemma “Bewegung”: bewegung.

The 3rd highest number of forms (1) was observed with the lemma “Mission”: mission.

X occurs with 1 feature: Foreign (96; 21% instances)

X occurs with 1 feature-value pair: Foreign=Yes

X occurs with 2 feature combinations. The most frequent feature combination is _ (370 tokens). Examples: s-, j-, n-, m-, p-, z-, t-, k-, v-, d-

Relations

X nodes are attached to their parents using 21 different relations: reparandum (320; 69% instances), flat:foreign (59; 13% instances), root (22; 5% instances), conj (9; 2% instances), obl (8; 2% instances), nmod (7; 2% instances), dep (6; 1% instances), parataxis (6; 1% instances), fixed (5; 1% instances), advcl (4; 1% instances), advmod (3; 1% instances), nsubj (3; 1% instances), obj (3; 1% instances), vocative (3; 1% instances), cc (2; 0% instances), acl (1; 0% instances), amod (1; 0% instances), appos (1; 0% instances), case (1; 0% instances), ccomp (1; 0% instances), dislocated (1; 0% instances)

Parents of X nodes belong to 16 different parts of speech: VERB (107; 23% instances), X (83; 18% instances), NOUN (67; 14% instances), ADJ (34; 7% instances), ADV (31; 7% instances), DET (29; 6% instances), PART (22; 5% instances), [no tag; the artificial root, matching the 22 root attachments above] (22; 5% instances), PRON (15; 3% instances), PROPN (13; 3% instances), CCONJ (11; 2% instances), SCONJ (10; 2% instances), AUX (9; 2% instances), ADP (7; 2% instances), NUM (5; 1% instances), INTJ (1; 0% instances)

345 (74%) X nodes are leaves.

61 (13%) X nodes have one child.

26 (6%) X nodes have two children.

34 (7%) X nodes have three or more children.

The highest child degree of an X node is 8.
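The child-degree figures above can be cross-checked against the 466 X tokens reported at the top of the page. The sketch below sums the four buckets and recomputes the rounded percentages; the function name degree_shares and the bucket labels are hypothetical.

```python
def degree_shares(buckets):
    """Return the total node count and each bucket's share as a whole percent."""
    total = sum(buckets.values())
    shares = {label: round(100 * n / total) for label, n in buckets.items()}
    return total, shares


# Counts of X nodes by number of children, as listed on this page.
buckets = {"0": 345, "1": 61, "2": 26, "3+": 34}
total, shares = degree_shares(buckets)
print(total)   # 466
print(shares)  # {'0': 74, '1': 13, '2': 6, '3+': 7}
```

The buckets sum to 466, matching the X token count, and the recomputed shares agree with the percentages listed.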

Children of X nodes are attached using 27 different relations: flat:foreign (56; 20% instances), punct (55; 20% instances), reparandum (27; 10% instances), case (19; 7% instances), advmod (18; 6% instances), nsubj (13; 5% instances), aux (10; 4% instances), discourse (10; 4% instances), cop (9; 3% instances), cc (8; 3% instances), mark (8; 3% instances), obj (8; 3% instances), parataxis (5; 2% instances), det (4; 1% instances), discourse:filler (4; 1% instances), orphan (4; 1% instances), expl (3; 1% instances), fixed (3; 1% instances), nmod (3; 1% instances), amod (2; 1% instances), dep (2; 1% instances), parataxis:restart (2; 1% instances), acl (1; 0% instances), appos (1; 0% instances), conj (1; 0% instances), iobj (1; 0% instances), parataxis:discourse (1; 0% instances)

Children of X nodes belong to 15 different parts of speech: X (83; 30% instances), PUNCT (55; 20% instances), ADP (20; 7% instances), AUX (19; 7% instances), PRON (17; 6% instances), PART (16; 6% instances), CCONJ (12; 4% instances), DET (12; 4% instances), SCONJ (12; 4% instances), ADV (8; 3% instances), NOUN (8; 3% instances), VERB (8; 3% instances), INTJ (5; 2% instances), ADJ (2; 1% instances), NUM (1; 0% instances)