Treebank Statistics: UD_Slovenian-SSJ: POS Tags: NUM
There are 1123 NUM lemmas (4%), 1166 NUM types (2%) and 5595 NUM tokens (2%).
Out of 17 observed tags, the rank of NUM is: 5 in number of lemmas, 6 in number of types and 14 in number of tokens.
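The lemma, type and token counts and the per-tag ranks can be reproduced from a list of annotated tokens. The following is a minimal sketch, assuming tokens are available as (form, lemma, UPOS) triples. The sample data and the helper names `tag_stats` and `rank` are hypothetical, and the rank definition (1 plus the number of tags with a strictly larger count) is an assumption about how the page ranks tags.

```python
from collections import defaultdict

# Hypothetical toy data: (form, lemma, upos) triples, not taken from the treebank.
tokens = [
    ("dva", "dva", "NUM"),
    ("dveh", "dva", "NUM"),
    ("dva", "dva", "NUM"),
    ("pet", "pet", "NUM"),
    ("hiša", "hiša", "NOUN"),
    ("hiše", "hiša", "NOUN"),
    ("je", "biti", "AUX"),
]

def tag_stats(tokens):
    """Return {upos: (distinct lemmas, distinct forms, token count)}."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    counts = defaultdict(int)
    for form, lemma, upos in tokens:
        lemmas[upos].add(lemma)
        types[upos].add(form)
        counts[upos] += 1
    return {t: (len(lemmas[t]), len(types[t]), counts[t]) for t in counts}

def rank(stats, tag, index):
    """Assumed rank: 1 + number of tags with a strictly larger value."""
    return 1 + sum(1 for v in stats.values() if v[index] > stats[tag][index])

stats = tag_stats(tokens)
print(stats["NUM"])            # (lemmas, types, tokens) for NUM
print(rank(stats, "NUM", 2))   # rank of NUM by token count
```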
The 10 most frequent NUM lemmas: en, dva, trije, 2, štirje, 1, pet, eden, 10, 3
The 10 most frequent NUM types: 2, eno, 1, dve, dva, dveh, tri, ena, eden, 10
The 10 most frequent ambiguous lemmas: dva (NUM 278, X 1), pet (NUM 82, ADJ 1, NOUN 1), I. (NUM 18, X 3), I (NUM 3, NOUN 1, X 1), V. (X 5, NUM 3), X (NOUN 2, NUM 2), V (NOUN 3, NUM 1)
The 10 most frequent ambiguous types: pet (NUM 43, NOUN 1), sedem (NUM 24, VERB 1), I. (NUM 18, X 3), tridesetih (NUM 5, ADJ 2), I (NUM 3, NOUN 1, X 1), V. (X 5, NUM 3), X (NOUN 2, NUM 2), dvajsetih (ADJ 2, NUM 2), V (ADP 731, NOUN 3, NUM 1)
- pet
  - NUM 43: Zemljanka številka pet bi bila lahko grobnica .
  - NOUN 1: Včasih željo povsem ali začasno poteši že elektronski robotski pes , kot na primer Sonyjev Aibo ter celo bolj bazični elektronski ljubljenček , kakšen Nano - pet , GigaPet ali daleč najbolj popularen Tamagotchi , ki se je pojavil konec devetdesetih let prejšnjega stoletja .
- sedem
- I.
- tridesetih
- I
- V.
- X
- dvajsetih
  - ADJ 2: Snovanje iger ga je pritegovalo že v dijaških letih , zatem v začetku stoletja , v času prve vojne in v dvajsetih letih , misel nanje ga je obletavala še v tridesetih .
  - NUM 2: Pri dvajsetih letih se le redko katera ženska sprašuje o svoji plodnosti , večina se zanositvi prav v teh letih skuša čim bolj izogniti .
- V
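An ambiguous type is a word form that occurs with more than one tag. The ordering in the lists above is consistent with sorting by the NUM count in descending order. The following is a minimal sketch of that procedure, assuming (form, UPOS) pairs as input; the sample data and variable names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (form, upos) pairs, not taken from the treebank.
tokens = [
    ("pet", "NUM"),
    ("pet", "NUM"),
    ("pet", "NOUN"),
    ("sedem", "NUM"),
    ("sedem", "VERB"),
    ("dva", "NUM"),
]

# Count how often each form occurs with each tag.
tags_by_form = defaultdict(Counter)
for form, upos in tokens:
    tags_by_form[form][upos] += 1

# Keep forms that are tagged NUM at least once and carry more than one tag.
ambiguous = {
    form: tags
    for form, tags in tags_by_form.items()
    if "NUM" in tags and len(tags) > 1
}

# Assumed ordering: by NUM frequency, highest first.
ranked = sorted(ambiguous.items(), key=lambda item: -item[1]["NUM"])
for form, tags in ranked:
    print(form, dict(tags))
```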
Morphology
The form / lemma ratio of NUM is 1.038290 (the average of all parts of speech is 1.935546).
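The reported ratio is consistent with dividing the number of NUM types (1166) by the number of NUM lemmas (1123) given at the top of this page. A short check, assuming that "forms" here means distinct word types:

```python
# Counts reported on this page for NUM.
num_types = 1166
num_lemmas = 1123

# Assumption: the form / lemma ratio is distinct types divided by distinct lemmas.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.038290
```

A value close to 1 means that most numeral lemmas appear in a single form, which fits the large share of digit-form numerals in the data.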
The highest number of forms (11) was observed with the lemma “en”: en, ena, ene, enega, enem, enemu, enga, eni, enih, enim, eno.
The 2nd highest number of forms (5) was observed with the lemma “trije”: treh, trem, tremi, tri, trije.
The 3rd highest number of forms (5) was observed with the lemma “štirje”: štiri, štirih, štirim, štirimi, štirje.
NUM occurs with 5 features: NumForm (5594; 100% instances), NumType (5594; 100% instances), Case (1468; 26% instances), Number (1468; 26% instances), Gender (1013; 18% instances)
NUM occurs with 18 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Ord, NumType=Sets, Number=Dual, Number=Plur, Number=Sing
NUM occurs with 64 feature combinations.
The most frequent feature combination is NumForm=Digit|NumType=Card (3413 tokens).
Examples: 2, 1, 10, 3, 6, 30, 20, 4, 2000, 15
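Feature, feature-value and feature-combination counts of this kind can be derived from the FEATS column of a CoNLL-U file, where pairs are separated by `|` and an empty feature set is written as `_`. The following is a minimal sketch; the sample strings and the helper name `parse_feats` are hypothetical.

```python
from collections import Counter

# Hypothetical FEATS values for four NUM tokens, not taken from the treebank.
feats_column = [
    "NumForm=Digit|NumType=Card",
    "NumForm=Digit|NumType=Card",
    "Case=Nom|Gender=Masc|NumForm=Word|NumType=Card|Number=Dual",
    "_",
]

def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

feature_counts = Counter()  # e.g. NumForm
pair_counts = Counter()     # e.g. NumForm=Digit
combo_counts = Counter()    # the full FEATS string

for feats in feats_column:
    parsed = parse_feats(feats)
    feature_counts.update(parsed.keys())
    pair_counts.update(f"{name}={value}" for name, value in parsed.items())
    combo_counts[feats] += 1

print(feature_counts.most_common())
print(combo_counts.most_common(1))
```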
Relations
NUM nodes are attached to their parents using 18 different relations: nummod (4307; 77% instances), conj (447; 8% instances), obl (166; 3% instances), flat (165; 3% instances), nmod (96; 2% instances), appos (77; 1% instances), list (74; 1% instances), nsubj (64; 1% instances), dep (53; 1% instances), root (44; 1% instances), parataxis (42; 1% instances), orphan (24; 0% instances), obj (19; 0% instances), acl (6; 0% instances), ccomp (5; 0% instances), xcomp (3; 0% instances), iobj (2; 0% instances), advcl (1; 0% instances)
Parents of NUM nodes belong to 11 different parts of speech: NOUN (3607; 64% instances), NUM (650; 12% instances), PROPN (592; 11% instances), VERB (272; 5% instances), ADJ (172; 3% instances), X (138; 2% instances), SYM (94; 2% instances), (44; 1% instances), ADV (16; 0% instances), DET (8; 0% instances), PRON (2; 0% instances)
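The relation and parent-tag distributions can be tallied from the HEAD and DEPREL columns. The sketch below assumes one sentence given as (id, form, UPOS, head, deprel) tuples with head 0 marking the root; the sample annotation is hypothetical and not meant as a reference analysis. A root node has no parent, so its parent tag is recorded as an empty string here. That is one plausible reading of the unlabeled entry with 44 instances above, which matches the 44 `root` attachments, but the page itself does not confirm it.

```python
from collections import Counter

# Hypothetical sentence: (id, form, upos, head, deprel); head 0 means root.
sentence = [
    (1, "2", "NUM", 0, "root"),
    (2, ":", "PUNCT", 3, "punct"),
    (3, "1", "NUM", 1, "conj"),
    (4, "pet", "NUM", 5, "nummod"),
    (5, "minut", "NOUN", 1, "obl"),
]

upos_by_id = {tid: upos for tid, _, upos, _, _ in sentence}

relations = Counter()   # how NUM nodes attach to their parents
parent_pos = Counter()  # tag of the parent of each NUM node

for tid, form, upos, head, deprel in sentence:
    if upos != "NUM":
        continue
    relations[deprel] += 1
    # The root has no parent token, so fall back to an empty label.
    parent_pos[upos_by_id.get(head, "")] += 1

print(relations.most_common())
print(parent_pos.most_common())
```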
3697 (66%) NUM nodes are leaves.
1358 (24%) NUM nodes have one child.
311 (6%) NUM nodes have two children.
229 (4%) NUM nodes have three or more children.
The highest child degree of a NUM node is 9.
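The four groups above sum to 3697 + 1358 + 311 + 229 = 5595, the total number of NUM tokens, so every NUM node falls into exactly one group. The child degree of a node is the number of tokens whose HEAD points to it. The following is a minimal sketch of that count, assuming a sentence given as (id, form, UPOS, head, deprel) tuples; the sample annotation and the helper name `bucket` are hypothetical.

```python
from collections import Counter

# Hypothetical sentence: (id, form, upos, head, deprel); head 0 means root.
sentence = [
    (1, "2", "NUM", 0, "root"),
    (2, ":", "PUNCT", 3, "punct"),
    (3, "1", "NUM", 1, "conj"),
    (4, "pet", "NUM", 5, "nummod"),
    (5, "minut", "NOUN", 1, "obl"),
]

# Number of children per token id (how often each id appears as a head).
child_count = Counter(head for _, _, _, head, _ in sentence)

# Child degree of every NUM node; ids that never appear as a head count as 0.
degrees = [child_count[tid] for tid, _, upos, _, _ in sentence if upos == "NUM"]

def bucket(degree):
    """Group degrees as on this page: 0, 1, 2, or three and more."""
    return str(degree) if degree < 3 else "3+"

histogram = Counter(bucket(d) for d in degrees)
print(histogram)
print(max(degrees))  # highest child degree among NUM nodes
```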
Children of NUM nodes are attached using 25 different relations: punct (983; 34% instances), conj (463; 16% instances), advmod (381; 13% instances), case (272; 10% instances), nmod (175; 6% instances), flat (167; 6% instances), cc (115; 4% instances), cop (51; 2% instances), list (37; 1% instances), appos (35; 1% instances), nsubj (35; 1% instances), orphan (23; 1% instances), det (19; 1% instances), amod (17; 1% instances), parataxis (14; 0% instances), dep (13; 0% instances), nummod (12; 0% instances), mark (11; 0% instances), acl (9; 0% instances), aux (9; 0% instances), obl (6; 0% instances), csubj (5; 0% instances), advcl (2; 0% instances), cc:preconj (1; 0% instances), vocative (1; 0% instances)
Children of NUM nodes belong to 16 different parts of speech: PUNCT (983; 34% instances), NUM (650; 23% instances), ADP (265; 9% instances), NOUN (218; 8% instances), ADV (193; 7% instances), DET (128; 4% instances), PART (115; 4% instances), CCONJ (110; 4% instances), AUX (60; 2% instances), ADJ (34; 1% instances), VERB (27; 1% instances), X (20; 1% instances), PROPN (15; 1% instances), SYM (15; 1% instances), SCONJ (14; 0% instances), PRON (9; 0% instances)