Treebank Statistics: UD_Slovenian-SST: POS Tags: X
There are 75 X lemmas (2%), 226 X types (4%) and 466 X tokens (2%).
Out of 16 observed tags, the rank of X is: 6 in number of lemmas, 6 in number of types and 16 in number of tokens.
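Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, assuming that a "type" is a distinct word form and that lemmas and forms are compared case-sensitively; the sample sentences and the function name `count_by_upos` are hypothetical and are not taken from the treebank or its tooling.

```python
from collections import defaultdict

# Hypothetical two-sentence sample in CoNLL-U format (tab-separated columns:
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).
SAMPLE = "\n".join([
    "# text = s- se green",
    "1\ts-\t_\tX\t_\t_\t2\treparandum\t_\t_",
    "2\tse\tse\tPRON\t_\t_\t0\troot\t_\t_",
    "3\tgreen\tgreen\tX\t_\tForeign=Yes\t2\tflat:foreign\t_\t_",
    "",
    "1\ts-\t_\tX\t_\t_\t0\troot\t_\t_",
])


def count_by_upos(conllu_text):
    """Return {UPOS: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}


counts = count_by_upos(SAMPLE)
print(counts["X"])  # (lemmas, types, tokens) for the tag X
```

In the sample, the form "s-" occurs twice, so it contributes two tokens but only one type, and both occurrences share the placeholder lemma "_".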
The 10 most frequent X lemmas: _, green, of, grass, home, non, stop, beautiful, day, fa
The 10 most frequent X types: s-, j-, n-, m-, p-, z-, t-, k-, v-, d-
The 10 most frequent ambiguous lemmas: _ (X 367, PUNCT 1), ka (SCONJ 22, X 2), on (PRON 308, X 2), a (ADV 137, INTJ 16, NOUN 6, CCONJ 3, X 1), da (SCONJ 533, PART 16, X 1), in (CCONJ 414, ADV 1, X 1), kaj (PRON 197, ADV 43, X 1), la (ADV 1, X 1), minus (NOUN 6, X 1), od (ADP 83, X 1)
The 10 most frequent ambiguous types: ka (SCONJ 22, X 2), on (PRON 24, X 2), a (ADV 137, INTJ 16, NOUN 6, CCONJ 3, X 1), bi (AUX 134, VERB 15, X 1), da (SCONJ 533, PART 16, VERB 15, X 1), di (INTJ 2, X 1), ga (PRON 60, X 1), i (PART 1, X 1), imam (VERB 19, X 1), in (CCONJ 414, ADV 1, X 1)
- ka
- on
- a
- bi
- da
- di
- INTJ 2: pa če noče zaspati ko je še mala pa ji špilaš pred posteljico veš [audience:laughter] tako delaš ti di di pa še igraš
- X 1: v povprečju moško telo nar- [gap] [gap] di dvajset do trideset procentov več kot eee testosterona kot žensko ampak eee p- [gap] igra pomembno vlogo v zdravju tako moškega tudi eee žensk
- ga
- i
- PART 1: eee da b- [gap] in dodatno nih- [gap] nobenemu ni dala dopusta tako da so one to vse i [name:personal] je delala celi dan in [name:personal] so delale celi dan
- X 1: in eee dejstvo je da i v slovenski vojski s poklicnimi vojaki ne moremo čakati da jih bomo imeli ves čas doma temveč je prav da jih pošljemo na preizkušnjo v tujino
- imam
- in
Morphology
The form / lemma ratio of X is 3.013333 (the average of all parts of speech is 1.573353).
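The reported ratio matches the number of X types divided by the number of X lemmas given at the top of the page. A small check, using only those two figures (the variable names are illustrative):

```python
# Figures reported above for the tag X.
x_types = 226   # distinct word forms
x_lemmas = 75   # distinct lemmas

ratio = x_types / x_lemmas
print(f"{ratio:.6f}")
```

The high value is driven mostly by the placeholder lemma "_", which alone covers 152 different forms.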
The 1st highest number of forms (152) was observed with the lemma “_”: Bel-, Franc-, Oma-, Slove-, a-, am-, an-, anal-, avto-, avtomob-, b-, ba-, ce-, ci, d-, dej, dela-, des-, deves-, devetinos-, di, do-, dovoli-, dovre-, e-, fe-, fizi-, g-, gos-, gospo-, gre-, grn-, hotl-, i, i-, ins-, ist-, istrija-, j-, jabol-, k-, ka-, km-, knji-, kolo-, kom-, kompliciraš, ku-, kur-, l-, le-, lu-, m-, ma-, mar-, mat-, mest-, mi-, mid-, mis-, mišani, moš-, n-, na-, naj-, napi-, nar-, naslednj-, nek-, ni-, nih-, nikak-, nje-, njegov-, o-, od-, ojo, om, on-, op-, opaz-, orož-, ose-, p-, pet-, petnš-, po-, pok-, ponava-, pos-, posr-, pr-, pre-, pred-, prelis-, preve-, prevo-, pri-, psi-, r-, ra-, raz-, razu-, raču-, re-, rec-, s-, sa-, se, se-, sko-, sla-, slovarj-, so-, spa-, spe-, spla-, sprašva-, st-, t-, tist-, to-, trans-, u-, usta-, uze-, v-, va-, ver-, vs-, vslak-, vzmeti-, z-, zaba-, zac-, zag-, zar-, zdaj-, zl-, zmišlajo, zob-, zve-, č-, čak-, š-, še-, špi-, šte-, štir-, ž-, žens-, žul-.
The 2nd highest number of forms (1) was observed with the lemma “Bewegung”: bewegung.
The 3rd highest number of forms (1) was observed with the lemma “Mission”: mission.
X occurs with 1 feature: Foreign (96; 21% instances)
X occurs with 1 feature-value pair: Foreign=Yes
X occurs with 2 feature combinations.
The most frequent feature combination is _ (370 tokens).
Examples: s-, j-, n-, m-, p-, z-, t-, k-, v-, d-
Relations
X nodes are attached to their parents using 21 different relations: reparandum (320; 69% instances), flat:foreign (59; 13% instances), root (22; 5% instances), conj (9; 2% instances), obl (8; 2% instances), nmod (7; 2% instances), dep (6; 1% instances), parataxis (6; 1% instances), fixed (5; 1% instances), advcl (4; 1% instances), advmod (3; 1% instances), nsubj (3; 1% instances), obj (3; 1% instances), vocative (3; 1% instances), cc (2; 0% instances), acl (1; 0% instances), amod (1; 0% instances), appos (1; 0% instances), case (1; 0% instances), ccomp (1; 0% instances), dislocated (1; 0% instances)
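The percentages in this list are shares of the 466 X tokens, rounded to whole numbers, which is why relations seen only once or twice are shown as 0%. A brief sketch for a few of the relations above, assuming ordinary rounding to the nearest integer (the exact rounding rule used by the page generator is an assumption):

```python
x_tokens = 466  # total number of X tokens reported above

# A subset of the relation counts listed above.
relation_counts = {
    "reparandum": 320,
    "flat:foreign": 59,
    "root": 22,
    "cc": 2,
    "acl": 1,
}

# Share of each relation among all X tokens, as a whole-number percentage.
shares = {rel: round(100 * n / x_tokens) for rel, n in relation_counts.items()}
for rel, pct in shares.items():
    print(f"{rel}: {pct}%")
```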
Parents of X nodes belong to 16 different parts of speech: VERB (107; 23% instances), X (83; 18% instances), NOUN (67; 14% instances), ADJ (34; 7% instances), ADV (31; 7% instances), DET (29; 6% instances), PART (22; 5% instances), (22; 5% instances), PRON (15; 3% instances), PROPN (13; 3% instances), CCONJ (11; 2% instances), SCONJ (10; 2% instances), AUX (9; 2% instances), ADP (7; 2% instances), NUM (5; 1% instances), INTJ (1; 0% instances)
345 (74%) X nodes are leaves.
61 (13%) X nodes have one child.
26 (6%) X nodes have two children.
34 (7%) X nodes have three or more children.
The highest child degree of an X node is 8.
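The leaf and child counts above describe the child degree of each X node, that is, how many other nodes name it as their head. A minimal sketch of how such a histogram could be computed for one sentence, assuming nodes are given as (id, head, upos) triples; the sample sentence and the function name `child_degree_histogram` are hypothetical.

```python
from collections import Counter

# Hypothetical sentence as (id, head, upos) triples; head 0 is the artificial root.
nodes = [
    (1, 2, "X"),
    (2, 0, "VERB"),
    (3, 2, "X"),
    (4, 3, "X"),
    (5, 3, "PUNCT"),
]


def child_degree_histogram(nodes, upos="X"):
    """Count nodes with the given tag by number of children (3 means 3 or more)."""
    # Number of children per node id: how often each id appears as a head.
    n_children = Counter(head for _, head, _ in nodes)
    buckets = Counter()
    for node_id, _, tag in nodes:
        if tag != upos:
            continue
        # Cap the degree at 3 so the last bucket collects "three or more".
        buckets[min(n_children[node_id], 3)] += 1
    return buckets


hist = child_degree_histogram(nodes)
print(hist[0], hist[1], hist[2], hist[3])  # leaves, one child, two, three or more
```

Node ids restart in every sentence, so for a whole treebank the function would be applied per sentence and the resulting counters summed.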
Children of X nodes are attached using 27 different relations: flat:foreign (56; 20% instances), punct (55; 20% instances), reparandum (27; 10% instances), case (19; 7% instances), advmod (18; 6% instances), nsubj (13; 5% instances), aux (10; 4% instances), discourse (10; 4% instances), cop (9; 3% instances), cc (8; 3% instances), mark (8; 3% instances), obj (8; 3% instances), parataxis (5; 2% instances), det (4; 1% instances), discourse:filler (4; 1% instances), orphan (4; 1% instances), expl (3; 1% instances), fixed (3; 1% instances), nmod (3; 1% instances), amod (2; 1% instances), dep (2; 1% instances), parataxis:restart (2; 1% instances), acl (1; 0% instances), appos (1; 0% instances), conj (1; 0% instances), iobj (1; 0% instances), parataxis:discourse (1; 0% instances)
Children of X nodes belong to 15 different parts of speech: X (83; 30% instances), PUNCT (55; 20% instances), ADP (20; 7% instances), AUX (19; 7% instances), PRON (17; 6% instances), PART (16; 6% instances), CCONJ (12; 4% instances), DET (12; 4% instances), SCONJ (12; 4% instances), ADV (8; 3% instances), NOUN (8; 3% instances), VERB (8; 3% instances), INTJ (5; 2% instances), ADJ (2; 1% instances), NUM (1; 0% instances)