
This page pertains to UD version 2.

Treebank Statistics: UD_Georgian-GLC: POS Tags: NUM

There are 312 NUM lemmas (3%), 405 NUM types (3%) and 1025 NUM tokens (2%). Out of 16 observed tags, the rank of NUM is: 7 in number of lemmas, 6 in number of types and 11 in number of tokens.
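Counts of this kind can be reproduced from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the function name `count_upos` and the toy rows are hypothetical, and it assumes that a "type" is a distinct word form compared case-sensitively.

```python
def count_upos(conllu_text, upos="NUM"):
    """Return (lemmas, types, tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # a CoNLL-U token line has exactly 10 columns
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        if cols[3] == upos:  # column 4 is UPOS
            tokens += 1
            types.add(cols[1])   # column 2 is FORM
            lemmas.add(cols[2])  # column 3 is LEMMA
    return len(lemmas), len(types), tokens


# Toy rows for illustration only; they are not taken from the treebank.
rows = [
    ["1", "ორი", "ორი", "NUM", "_", "NumType=Card", "2", "nummod", "_", "_"],
    ["2", "წიგნი", "წიგნი", "NOUN", "_", "_", "0", "root", "_", "_"],
    ["1", "ორ", "ორი", "NUM", "_", "NumType=Card", "3", "nummod", "_", "_"],
    ["2", "XX", "20", "NUM", "_", "NumForm=Roman", "3", "nummod", "_", "_"],
    ["3", "საუკუნე", "საუკუნე", "NOUN", "_", "_", "0", "root", "_", "_"],
]
SAMPLE = "\n".join("\t".join(r) for r in rows)
print(count_upos(SAMPLE))
```

On the toy rows this counts two lemmas, three types and three tokens.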

The 10 most frequent NUM lemmas: ერთი, ორი, 19, 20, სამი, 2, ბევრი, 1, 18, 17

The 10 most frequent NUM types: პირველი, XIX, მეორე, XX, ორი, ერთი, სამი, პირველ, ერთ, ბევრი

The most frequent ambiguous lemmas (only 4 are listed): ერთი (NUM 157, PRON 27), 2 (NUM 18, X 1), მილიონი (NUM 7, NOUN 1), 15 (NUM 2, NOUN 1)

The most frequent ambiguous types (only 5 are listed): ერთი (NUM 33, PRON 20), ერთ (NUM 16, PRON 5), პირველად (ADV 24, NUM 2, ADJ 1), 15 (NOUN 1, NUM 1), C (NUM 1, X 1)

Morphology

The form / lemma ratio of NUM is 1.298077 (the average of all parts of speech is 1.677821).
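The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given above. A quick check, assuming that is how the figure is derived (the variable names are illustrative):

```python
num_lemmas = 312  # NUM lemmas reported on this page
num_types = 405   # NUM types (distinct word forms) reported on this page

# Form / lemma ratio: average number of distinct forms per lemma.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")
```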

The 1st highest number of forms (11) was observed with the lemma “ერთი”: ერთ, ერთი, ერთიცა, ერთ–, პირველ, პირველად, პირველთაგანი, პირველი, პირველივე, პირველმა, პირველსა.

The 2nd highest number of forms (11) was observed with the lemma “ორი”: მეორე, მეორეს, მეორეც, მეორის, მე‐2, ორ, ორად, ორი, ორივე, ორივეს, ორჯერ.

The 3rd highest number of forms (5) was observed with the lemma “20”: 20, 20-, 20-იან, XX, მე-20.

NUM occurs with 5 features: NumType (1024; 100% instances), NumForm (660; 64% instances), Case (388; 38% instances), Number (366; 36% instances), PartType (5; 0% instances)

NUM occurs with 14 feature-value pairs: Case=Dat, Case=Erg, Case=Ess, Case=Gen, Case=Ins, Case=Nom, NumForm=Digit, NumForm=Roman, NumType=Card, NumType=Mult, NumType=Ord, Number=Plur, Number=Sing, PartType=Emp

NUM occurs with 29 feature combinations. The most frequent feature combination is NumForm=Digit|NumType=Card (469 tokens). Examples: 1992, 1999, 2, 1, 2008, 30-იან, 11, 20, 2001, 2005
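A feature combination such as NumForm=Digit|NumType=Card follows the CoNLL-U FEATS convention: pairs are separated by `|`, name and value by `=`, and `_` marks an empty feature set. A small sketch of splitting such a string (the helper name `parse_feats` is hypothetical):

```python
def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a {feature: value} dict."""
    if not feats or feats == "_":
        return {}  # "_" means the token carries no features
    # Split on the first "=" only, so the value is kept whole.
    return dict(pair.split("=", 1) for pair in feats.split("|"))


print(parse_feats("NumForm=Digit|NumType=Card"))
```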

Relations

NUM nodes are attached to their parents using 13 different relations: nummod (904; 88% instances), parataxis (51; 5% instances), obl (23; 2% instances), conj (16; 2% instances), root (7; 1% instances), amod (5; 0% instances), nmod (5; 0% instances), appos (4; 0% instances), obj (3; 0% instances), obl:tmod (3; 0% instances), nsubj (2; 0% instances), acl (1; 0% instances), orphan (1; 0% instances)
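The percentages above match each count divided by the 1025 NUM tokens and rounded to the nearest whole number, which is why relations with five or fewer instances show as 0%. A sketch of that arithmetic, assuming round-to-nearest is what the page uses:

```python
# Relation counts for NUM nodes, copied from the list above.
relation_counts = {
    "nummod": 904, "parataxis": 51, "obl": 23, "conj": 16, "root": 7,
    "amod": 5, "nmod": 5, "appos": 4, "obj": 3, "obl:tmod": 3,
    "nsubj": 2, "acl": 1, "orphan": 1,
}

total = sum(relation_counts.values())  # should equal the NUM token count

# Share of each relation as a whole-number percentage.
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

for rel, n in relation_counts.items():
    print(f"{rel}: {n} ({shares[rel]}%)")
```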

Parents of NUM nodes belong to 10 different parts of speech: NOUN (806; 79% instances), ADJ (104; 10% instances), NUM (30; 3% instances), VERB (26; 3% instances), PROPN (21; 2% instances), SYM (9; 1% instances), ADV (8; 1% instances), PRON (8; 1% instances), [no parent; root nodes] (7; 1% instances), X (6; 1% instances)

800 (78%) NUM nodes are leaves.

147 (14%) NUM nodes have one child.

58 (6%) NUM nodes have two children.

20 (2%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.

Children of NUM nodes are attached using 20 different relations: punct (136; 40% instances), nmod (79; 24% instances), case (24; 7% instances), nummod (20; 6% instances), conj (18; 5% instances), nsubj (10; 3% instances), obl (9; 3% instances), cc (8; 2% instances), cop (7; 2% instances), amod (5; 1% instances), advmod (4; 1% instances), appos (3; 1% instances), obl:tmod (3; 1% instances), parataxis (3; 1% instances), orphan (2; 1% instances), advcl (1; 0% instances), advmod:lmod (1; 0% instances), dep (1; 0% instances), det (1; 0% instances), det:poss (1; 0% instances)

Children of NUM nodes belong to 12 different parts of speech: PUNCT (136; 40% instances), NOUN (94; 28% instances), NUM (30; 9% instances), ADP (24; 7% instances), ADJ (12; 4% instances), CCONJ (8; 2% instances), PROPN (8; 2% instances), AUX (7; 2% instances), PRON (7; 2% instances), ADV (5; 1% instances), VERB (3; 1% instances), SYM (2; 1% instances)
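The two breakdowns of children, by relation and by part of speech, describe the same set of nodes, so their totals should agree. A short consistency check using the counts listed above (variable names are illustrative):

```python
# Children of NUM nodes by relation.
child_relations = {
    "punct": 136, "nmod": 79, "case": 24, "nummod": 20, "conj": 18,
    "nsubj": 10, "obl": 9, "cc": 8, "cop": 7, "amod": 5, "advmod": 4,
    "appos": 3, "obl:tmod": 3, "parataxis": 3, "orphan": 2, "advcl": 1,
    "advmod:lmod": 1, "dep": 1, "det": 1, "det:poss": 1,
}

# Children of NUM nodes by part of speech.
child_upos = {
    "PUNCT": 136, "NOUN": 94, "NUM": 30, "ADP": 24, "ADJ": 12, "CCONJ": 8,
    "PROPN": 8, "AUX": 7, "PRON": 7, "ADV": 5, "VERB": 3, "SYM": 2,
}

total_by_relation = sum(child_relations.values())
total_by_upos = sum(child_upos.values())
print(total_by_relation, total_by_upos)
```

Both totals come to 336 children.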