Treebank Statistics: UD_Sanskrit-Vedic: POS Tags: NOUN
There are 6396 NOUN lemmas (46%), 15108 NOUN types (41%) and 72315 NOUN tokens (35%).
Out of 13 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
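The lemma, type and token counts and the ranks above can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function names `upos_counts` and `rank` are hypothetical, the sample sentences are invented toy data, and it assumes types are counted as distinct word forms per tag without case folding.

```python
from collections import defaultdict

def upos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank lines separate sentences, '#' lines are comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}

def rank(counts, upos, index):
    """Rank of a tag by lemmas (index 0), types (1) or tokens (2); 1 is highest."""
    ordered = sorted(counts, key=lambda u: counts[u][index], reverse=True)
    return ordered.index(upos) + 1

# Invented toy sample, not taken from the treebank.
SAMPLE = "\n".join([
    "1\tagniḥ\tagni\tNOUN\t_\tCase=Nom|Gender=Masc|Number=Sing\t3\tnsubj\t_\t_",
    "2\tdevān\tdeva\tNOUN\t_\tCase=Acc|Gender=Masc|Number=Plur\t3\tobj\t_\t_",
    "3\tyajati\tyaj\tVERB\t_\t_\t0\troot\t_\t_",
    "",
    "1\tdevāḥ\tdeva\tNOUN\t_\tCase=Nom|Gender=Masc|Number=Plur\t0\troot\t_\t_",
])

counts = upos_counts(SAMPLE)
print(counts["NOUN"])            # (lemmas, types, tokens) for NOUN
print(rank(counts, "NOUN", 2))   # rank of NOUN by token count
```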
The 10 most frequent NOUN lemmas: agni, deva, indra, yajña, brahman, loka, ap, paśu, prāṇa, soma
The 10 most frequent NOUN types: _, agniḥ, devāḥ, agnim, agne, brahma, indra, indraḥ, deva, āpaḥ
The 10 most frequent ambiguous lemmas: deva (NOUN 1583, ADJ 25), yajña (NOUN 792, ADV 1), kāma (NOUN 419, ADV 5), go (NOUN 350, ADV 8), brāhmaṇa (NOUN 318, ADJ 2), āditya (NOUN 305, ADJ 20), agra (NOUN 264, ADV 2, ADJ 1), ahar (NOUN 244, ADV 1), saṃvatsara (NOUN 235, ADV 2), ratha (NOUN 218, ADV 2)
The 10 most frequent ambiguous types: _ (NOUN 2255, ADJ 407, CCONJ 331, SCONJ 216, NUM 186, PRON 124, VERB 106, INTJ 86, ADP 73, ADV 54, DET 14), devāḥ (NOUN 466, ADJ 1), deva (NOUN 253, ADJ 1), yajñam (NOUN 186, ADV 1), namaḥ (NOUN 162, VERB 1), devān (NOUN 131, ADJ 1), karma (NOUN 127, VERB 1), āyuḥ (NOUN 127, ADJ 1), jyotiḥ (NOUN 119, ADV 2), agre (NOUN 107, ADV 16)
- _
  - NOUN 2255: prete _ gṛhapatau _ karaṇam
  - ADJ 407: apavarge _ bhojanam yathāśakti
  - CCONJ 331: prete _ gṛhapatau _ karaṇam
  - SCONJ 216: saḥ _ _ _ kāmaḥ bhavati saṃkalpāt eva asya pitaraḥ samuttiṣṭhanti
  - NUM 186: tāni _ śatam saṃpeduḥ
  - PRON 124: prajāpatiḥ prāyacchat iḍām agne iti sviṣṭakṛtam _ _ _ ardhe juhuyāt
  - VERB 106: _ indriyaḥ ca vasudhām prāpsyasi iti ca mām bravīt
  - INTJ 86: _ iti
  - ADP 73: ākāśam _ astam yanti
  - ADV 54: tasmāt prajāḥ daśa māsaḥ garbham bhṛtvā ekādaśam _ prajāyante
  - DET 14: tasmāt u _ paśuḥ
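An ambiguous lemma or type in the lists above is one that occurs with more than one part-of-speech tag. A minimal sketch of that check follows, assuming the input has already been reduced to (item, tag) pairs; the function name `ambiguous_items` and the toy rows are illustrative and not taken from the treebank tooling.

```python
from collections import Counter, defaultdict

def ambiguous_items(rows):
    """rows: iterable of (item, upos) pairs, where item is a lemma or a form.

    Returns {item: [(upos, count), ...]} for items seen with two or more tags,
    with tags ordered from most to least frequent.
    """
    by_item = defaultdict(Counter)
    for item, upos in rows:
        by_item[item][upos] += 1
    return {item: tags.most_common()
            for item, tags in by_item.items() if len(tags) > 1}

# Toy rows for illustration only.
rows = [("deva", "NOUN")] * 3 + [("deva", "ADJ")] + [("agni", "NOUN")] * 2
result = ambiguous_items(rows)
print(result)
```

Passing (lemma, tag) pairs gives the ambiguous lemmas; passing (form, tag) pairs gives the ambiguous types.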
Morphology
The form / lemma ratio of NOUN is 2.362101 (the average of all parts of speech is 2.674382).
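The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. The short check below uses only those two figures; reading the ratio as types over lemmas is an inference from the numbers, not a documented definition.

```python
noun_types = 15108   # NOUN types reported on this page
noun_lemmas = 6396   # NOUN lemmas reported on this page

ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 2.362101
```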
The highest number of forms (22) was observed with the lemma “brahman”: _, brahma, brahmabhiḥ, brahmabhyaḥ, brahman, brahmanaḥ, brahmane, brahmani, brahmanā, brahmaṇas, brahmaṇaḥ, brahmaṇe, brahmaṇi, brahmaṇā, brahmaṇām, brahmā, brahmānam, brahmāṇam, brahmāṇau, brahmāṇaḥ, brahmāṇi, brahmāṇā.
The 2nd highest number of forms (22) was observed with the lemma “deva”: _, deva, devaiḥ, devam, devasya, devau, devayoḥ, devaḥ, deve, devebhiḥ, devebhyaḥ, devena, deveṣu, devā, devān, devānt, devānām, devāsaḥ, devāt, devāya, devāḥ, devāṁ.
The 3rd highest number of forms (21) was observed with the lemma “anta”: _, anta, antaiḥ, antam, antataḥ, antau, antayoḥ, antayā, antaḥ, ante, antebhyaḥ, antena, anteṣu, antā, antābhiḥ, antām, antān, antāni, antāt, antāḥ, antāṁ.
NOUN occurs with 4 features: Case (64911; 90% instances), Gender (64911; 90% instances), Number (64911; 90% instances), Compound (7365; 10% instances)
NOUN occurs with 15 feature-value pairs: Case=Abl, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Compound=Yes, Gender=Fem, Gender=Masc, Gender=Neut, Number=Dual, Number=Plur, Number=Sing
NOUN occurs with 73 feature combinations.
The most frequent feature combination is Case=Nom|Gender=Masc|Number=Sing (9869 tokens).
Examples: agniḥ, indraḥ, prajāpatiḥ, prāṇaḥ, yajñaḥ, kāmaḥ, ātmā, devaḥ, somaḥ, savitā
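The three levels counted above (features, feature-value pairs, and full combinations) can all be derived from the FEATS column of a CoNLL-U file. The sketch below assumes the FEATS strings of the NOUN tokens have already been collected into a list; `parse_feats` and `feature_stats` are hypothetical helper names and the input is toy data.

```python
from collections import Counter

def parse_feats(feats):
    """Turn 'Case=Nom|Number=Sing' into {'Case': 'Nom', 'Number': 'Sing'}."""
    if feats == "_":
        return {}  # '_' marks a token without features
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def feature_stats(feats_column):
    """Count features, feature-value pairs and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1
        for name, value in parse_feats(feats).items():
            features[name] += 1
            pairs[f"{name}={value}"] += 1
    return features, pairs, combos

# Toy FEATS values for illustration only.
toy = [
    "Case=Nom|Gender=Masc|Number=Sing",
    "Case=Nom|Gender=Masc|Number=Sing",
    "Case=Acc|Gender=Masc|Number=Plur",
    "Compound=Yes",
]
features, pairs, combos = feature_stats(toy)
print(len(features), len(pairs), len(combos))
print(combos.most_common(1))
```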
Relations
NOUN nodes are attached to their parents using 54 different relations: obj (10396; 14% instances), nsubj (9910; 14% instances), conj (7426; 10% instances), nmod (6686; 9% instances), flat (6050; 8% instances), root (4688; 6% instances), obl (4465; 6% instances), orphan (2104; 3% instances), obl:instr (1857; 3% instances), vocative (1686; 2% instances), obl:goal (1659; 2% instances), compound:coord (1575; 2% instances), obl:lmod (1340; 2% instances), acl (1165; 2% instances), nmod:appos (1029; 1% instances), iobj (967; 1% instances), obl:tmod (895; 1% instances), ccomp (818; 1% instances), obl:manner (728; 1% instances), advcl:ccomp (680; 1% instances), obl:source (675; 1% instances), xcomp (597; 1% instances), advcl (567; 1% instances), acl:relcl (532; 1% instances), advcl:fin (433; 1% instances), obl:soc (427; 1% instances), advcl:manner (416; 1% instances), appos (340; 0% instances), parataxis (286; 0% instances), xcomp:result (274; 0% instances), obl:agent (221; 0% instances), obl:path (190; 0% instances), obl:grad (159; 0% instances), obl:benef (156; 0% instances), amod (125; 0% instances), advcl:cond (118; 0% instances), acl:attr (111; 0% instances), compound (99; 0% instances), csubj (96; 0% instances), advcl:dpct (88; 0% instances), acl:dpct (79; 0% instances), advcl:caus (65; 0% instances), dislocated (59; 0% instances), advcl:tcl (22; 0% instances), nmod:pred (18; 0% instances), compound:name (9; 0% instances), discourse (8; 0% instances), acl:crel (4; 0% instances), advcl:concess (4; 0% instances), case (4; 0% instances), ccomp:rel (4; 0% instances), acl:ptcp (2; 0% instances), advcl:lcl (2; 0% instances), fixed (1; 0% instances)
Parents of NOUN nodes belong to 12 different parts of speech: VERB (35168; 49% instances), NOUN (22956; 32% instances), ADJ (4917; 7% instances), [root, no tag] (4688; 6% instances), PRON (3000; 4% instances), ADV (597; 1% instances), NUM (538; 1% instances), ADP (178; 0% instances), PART (123; 0% instances), INTJ (59; 0% instances), CCONJ (53; 0% instances), SCONJ (38; 0% instances)
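Relation and parent-tag counts like those above come from the HEAD and DEPREL columns. The following sketch assumes each sentence is already reduced to (id, upos, head, deprel) tuples; `attachment_stats` is a hypothetical name and the sentences are toy data. It also shows why one parent entry carries no tag: a node whose HEAD is 0 hangs from the artificial root, which has no part of speech. That reading matches the untagged count (4688) being equal to the count of the root relation.

```python
from collections import Counter

def attachment_stats(sentence, upos="NOUN"):
    """sentence: list of (id, upos, head, deprel) tuples for one tree.

    Returns (relations, parent_tags) counted over nodes tagged `upos`.
    """
    tag_of = {tid: tag for tid, tag, _, _ in sentence}
    relations, parent_tags = Counter(), Counter()
    for tid, tag, head, deprel in sentence:
        if tag != upos:
            continue
        relations[deprel] += 1
        # HEAD 0 is the artificial root and has no tag, so it is counted as "".
        parent_tags[tag_of.get(head, "")] += 1
    return relations, parent_tags

# Toy trees for illustration only.
s1 = [(1, "NOUN", 3, "nsubj"), (2, "NOUN", 3, "obj"), (3, "VERB", 0, "root")]
s2 = [(1, "NOUN", 0, "root")]

rel1, par1 = attachment_stats(s1)
rel2, par2 = attachment_stats(s2)
print(rel1 + rel2)
print(par1 + par2)
```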
31340 (43%) NOUN nodes are leaves.
28307 (39%) NOUN nodes have one child.
8109 (11%) NOUN nodes have two children.
4559 (6%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 30.
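The four buckets above partition the NOUN tokens by number of children, and their counts sum to the 72315 NOUN tokens reported at the top of the page. A minimal sketch of how such a histogram can be computed from HEAD values follows; `child_degree_histogram` is a hypothetical name and the tree is toy data.

```python
from collections import Counter

def child_degree_histogram(sentence, upos="NOUN"):
    """sentence: list of (id, upos, head) tuples for one tree.

    Returns Counter {number_of_children: number_of_nodes} for nodes tagged `upos`.
    """
    # Each occurrence of an id in the HEAD column is one child of that node.
    n_children = Counter(head for _, _, head in sentence)
    return Counter(n_children[tid] for tid, tag, _ in sentence if tag == upos)

# Toy tree: node 1 has one child (2), node 4 has two children (5 and 6).
tree = [
    (1, "NOUN", 3), (2, "ADJ", 1), (3, "VERB", 0),
    (4, "NOUN", 3), (5, "PRON", 4), (6, "ADJ", 4),
]
hist = child_degree_histogram(tree)
print(hist)

# Consistency check on the figures reported on this page.
reported_total = 31340 + 28307 + 8109 + 4559
print(reported_total == 72315)
```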
Children of NOUN nodes are attached using 62 different relations: nmod (8886; 14% instances), conj (6685; 11% instances), flat (6494; 11% instances), det (4698; 8% instances), nsubj (4490; 7% instances), amod (4473; 7% instances), acl (3770; 6% instances), orphan (2674; 4% instances), discourse (2661; 4% instances), cc (2424; 4% instances), mark (1929; 3% instances), compound:coord (1588; 3% instances), nummod (1383; 2% instances), advmod (1370; 2% instances), case (1267; 2% instances), case:sim (844; 1% instances), nmod:appos (813; 1% instances), cop (637; 1% instances), obj (473; 1% instances), mark:sim (365; 1% instances), appos (350; 1% instances), acl:relcl (340; 1% instances), parataxis (328; 1% instances), acl:dpct (282; 0% instances), ccomp (272; 0% instances), acl:attr (229; 0% instances), obl (190; 0% instances), vocative (180; 0% instances), advcl (165; 0% instances), acl:ptcp (163; 0% instances), advcl:cond (143; 0% instances), csubj (123; 0% instances), compound (100; 0% instances), iobj (81; 0% instances), obl:tmod (80; 0% instances), obl:lmod (79; 0% instances), compound:name (47; 0% instances), advcl:tcl (45; 0% instances), obl:soc (43; 0% instances), advcl:caus (40; 0% instances), obl:manner (39; 0% instances), obl:benef (37; 0% instances), advcl:fin (36; 0% instances), obl:source (34; 0% instances), obl:instr (32; 0% instances), obl:goal (25; 0% instances), advcl:manner (21; 0% instances), dislocated (18; 0% instances), nmod:pred (18; 0% instances), obl:agent (13; 0% instances), obl:grad (8; 0% instances), advcl:ccomp (6; 0% instances), advcl:dpct (6; 0% instances), xcomp (5; 0% instances), acl:crel (4; 0% instances), acl:cont (3; 0% instances), ccomp:rel (3; 0% instances), obl:path (3; 0% instances), advcl:lcl (2; 0% instances), acl:pred (1; 0% instances), aux (1; 0% instances), fixed (1; 0% instances)
Children of NOUN nodes belong to 13 different parts of speech: NOUN (22956; 37% instances), PRON (10350; 17% instances), ADJ (7782; 13% instances), PART (6467; 11% instances), VERB (5182; 8% instances), ADV (2475; 4% instances), CCONJ (2325; 4% instances), NUM (1557; 3% instances), ADP (872; 1% instances), AUX (638; 1% instances), DET (526; 1% instances), INTJ (200; 0% instances), SCONJ (190; 0% instances)