Treebank Statistics: UD_Slovenian-SSJ: POS Tags: X
There are 787 X lemmas (3%), 792 X types (2%) and 1684 X tokens (1%).
Out of 17 observed tags, the rank of X is: 7 in number of lemmas, 7 in number of types and 15 in number of tokens.
The 10 most frequent X lemmas: dr., de, the, št., d., t., m., of, o., p.
The 10 most frequent X types: dr., de, the, št., d., t., m., of, o., p.
The 10 most frequent ambiguous lemmas: a (CCONJ 216, X 12, ADV 5, NOUN 2), on (PRON 2656, X 8), i (X 4, NOUN 1), in (CCONJ 6119, ADV 5, X 5, NOUN 1), master (X 5, NOUN 1), Al (PROPN 7, X 4), se (PRON 4048, X 4), S (NOUN 4, X 3), art (X 3, NOUN 1), da (SCONJ 2973, PART 13, X 3)
The 10 most frequent ambiguous types: de (X 32, VERB 1), a (CCONJ 161, X 11, NOUN 2, ADV 1), on (PRON 22, X 8), to (DET 612, X 6), sta (AUX 339, VERB 14, X 6), i (X 4, NOUN 1), in (CCONJ 5908, ADV 5, X 5, NOUN 1), Al (PROPN 7, X 4), se (PRON 3466, X 4), New (PROPN 29, X 3)
- de
- a
  - CCONJ 161: Moj oče je bil rojen pred 105 leti , a ni bil najmlajši otrok .
  - X 11: Skupna stopnja [ a joint degree ] se lahko izda kot :
  - NOUN 2: Ti diši moje delovno mesto , a !
  - ADV 1: Ne bi bilo pravično zahtevati , naj Nemčija še naprej podpira tri evropske vogale , v zameno pa bi morala Berlin in Pariz prenehati razglašati , da je njuna politika že a priori dobra za vse .
- on
- to
- sta
- i
  - X 4: » Pa pukla je Avstrija - preskupo je , ljudi nemaju para za Prater i gađanje . «
  - NOUN 1: V celjskem klubu so ugotovili nesmiselnost vlaganja velikih sredstev v drage tuje trenerje in igralce ( čast izjemam , kot so Perić , Kokšarov in še nekateri ) in se osredotočili na odličen podmladek iz lastnih vrst , prihod treh mladih Ljubljančanov ( Zorman , Brumen , Natek ) pa je bil pika na i .
- in
  - CCONJ 5908: Zakonodaja in trg delovne sile sta med seboj tesno povezana .
  - ADV 5: Geto je out , črnci so in .
  - X 5: Gerhard Giesemann : Farben und Farbsymbole in der frühen Lyrik von Oton Župančič .
  - NOUN 1: Absolutne enote so pt ( točke ( angleško : points ) ) , in ( palci ) , cm ( centimetri ) , mm ( milimetri ) in pc ( pike ( angleško : picas ) ) .
- Al
- se
- New
Morphology
The form / lemma ratio of X is 1.006353 (the average of all parts of speech is 1.935546).
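The reported ratio appears consistent with dividing the number of X types by the number of X lemmas given at the top of this page (792 and 787). The following is a minimal sketch under that assumption; the function name `form_lemma_ratio` is hypothetical and is not taken from the tool that generated these statistics.

```python
def form_lemma_ratio(num_types: int, num_lemmas: int) -> float:
    """Return the number of distinct word forms per lemma.

    Assumption: the ratio is simply types / lemmas, which matches
    the figures reported on this page.
    """
    if num_lemmas <= 0:
        raise ValueError("num_lemmas must be positive")
    return num_types / num_lemmas


# Counts for the X tag as reported above.
ratio = form_lemma_ratio(num_types=792, num_lemmas=787)
print(f"{ratio:.6f}")
```

A value this close to 1 means that almost every X lemma occurs in a single surface form, which fits a tag dominated by abbreviations and foreign words.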
The 1st highest number of forms (2) was observed with the lemma “european”: EUROPEAN, European.
The 2nd highest number of forms (2) was observed with the lemma “more”: MORE, More.
The 3rd highest number of forms (2) was observed with the lemma “rana”: RANA, RÁNA.
X occurs with 2 features: Foreign (874; 52% instances), Abbr (666; 40% instances)
X occurs with 2 feature-value pairs: Abbr=Yes, Foreign=Yes
X occurs with 3 feature combinations.
The most frequent feature combination is Foreign=Yes (874 tokens).
Examples: de, the, of, a, la, and, to, on, el, von
Relations
X nodes are attached to their parents using 24 different relations: flat:foreign (535; 32% instances), nmod (304; 18% instances), appos (129; 8% instances), flat:name (123; 7% instances), root (100; 6% instances), fixed (92; 5% instances), conj (86; 5% instances), amod (55; 3% instances), nsubj (50; 3% instances), list (47; 3% instances), dep (38; 2% instances), cc (36; 2% instances), obl (35; 2% instances), obj (19; 1% instances), parataxis (18; 1% instances), advmod (6; 0% instances), acl (2; 0% instances), case (2; 0% instances), orphan (2; 0% instances), discourse (1; 0% instances), flat (1; 0% instances), iobj (1; 0% instances), mark (1; 0% instances), xcomp (1; 0% instances)
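The percentages above appear to be each relation's share of all 1684 X tokens, rounded to a whole number, which is why rare relations show as 0%. The sketch below illustrates that reading with a few of the counts listed; the helper `percent` is a hypothetical name, and the rounding rule is an assumption rather than a documented property of the generating tool.

```python
TOTAL_X_TOKENS = 1684  # X token count reported at the top of this page

# A subset of the relation counts listed above.
counts = {"flat:foreign": 535, "nmod": 304, "advmod": 6, "xcomp": 1}


def percent(count: int, total: int) -> int:
    """Share of `count` in `total`, rounded to a whole percent (assumed rule)."""
    return round(100 * count / total)


shares = {rel: percent(n, TOTAL_X_TOKENS) for rel, n in counts.items()}
for rel, share in shares.items():
    print(f"{rel}: {counts[rel]} tokens, {share}%")
```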
Parents of X nodes belong to 12 different parts of speech: X (695; 41% instances), PROPN (360; 21% instances), NOUN (342; 20% instances), VERB (115; 7% instances), (100; 6% instances), ADJ (28; 2% instances), NUM (20; 1% instances), ADV (13; 1% instances), DET (5; 0% instances), SCONJ (3; 0% instances), SYM (2; 0% instances), ADP (1; 0% instances)
1068 (63%) X nodes are leaves.
164 (10%) X nodes have one child.
213 (13%) X nodes have two children.
239 (14%) X nodes have three or more children.
The highest child degree of an X node is 35.
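The leaf, one-child, two-children and three-or-more counts above can be derived by counting, for every X node, how many tokens in the same sentence name it as their head. The following is a rough sketch of that computation, not the code used to produce this page: the `(id, head, upos)` tuple layout, the function name `child_degree_buckets`, and the toy sentence are all assumptions made for illustration.

```python
from collections import Counter


def child_degree_buckets(sentences, upos="X"):
    """Bucket nodes of one POS tag by their number of children.

    `sentences` is a list of sentences; each sentence is a list of
    (id, head, upos) tuples, with head 0 marking the root.
    Returns (bucket counts, highest child degree).
    """
    buckets = Counter()
    max_degree = 0
    for sent in sentences:
        # Number of children per token id within this sentence.
        degree = Counter(head for _, head, _ in sent)
        for tok_id, _, tag in sent:
            if tag != upos:
                continue
            d = degree[tok_id]  # Counter yields 0 for tokens without children
            max_degree = max(max_degree, d)
            if d == 0:
                buckets["leaf"] += 1
            elif d == 1:
                buckets["one"] += 1
            elif d == 2:
                buckets["two"] += 1
            else:
                buckets["three_or_more"] += 1
    return buckets, max_degree


# Toy sentence (invented): token 1 is an X root with three children.
toy = [[(1, 0, "X"), (2, 1, "X"), (3, 1, "X"), (4, 1, "PUNCT")]]
buckets, highest = child_degree_buckets(toy)
print(dict(buckets), highest)
```

Applied to the full treebank, the four bucket counts would be expected to sum to the total number of X tokens, as 1068 + 164 + 213 + 239 = 1684 does here.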
Children of X nodes are attached using 27 different relations: punct (545; 32% instances), flat:foreign (477; 28% instances), nummod (99; 6% instances), flat:name (94; 6% instances), appos (88; 5% instances), fixed (88; 5% instances), nmod (70; 4% instances), conj (51; 3% instances), case (40; 2% instances), dep (29; 2% instances), list (28; 2% instances), amod (27; 2% instances), cc (22; 1% instances), acl (8; 0% instances), advmod (8; 0% instances), parataxis (8; 0% instances), obj (5; 0% instances), orphan (5; 0% instances), nsubj (4; 0% instances), cop (3; 0% instances), flat (2; 0% instances), cc:preconj (1; 0% instances), det (1; 0% instances), mark (1; 0% instances), obl (1; 0% instances), vocative (1; 0% instances), xcomp (1; 0% instances)
Children of X nodes belong to 15 different parts of speech: X (695; 41% instances), PUNCT (545; 32% instances), NUM (138; 8% instances), PROPN (122; 7% instances), NOUN (75; 4% instances), ADP (31; 2% instances), ADJ (26; 2% instances), CCONJ (22; 1% instances), SYM (13; 1% instances), VERB (13; 1% instances), SCONJ (10; 1% instances), ADV (8; 0% instances), PART (4; 0% instances), AUX (3; 0% instances), DET (2; 0% instances)