
This page pertains to UD version 2.

Treebank Statistics: UD_Marathi-CMUPAN: POS Tags: NOUN

There are 7246 NOUN lemmas (41%), 14763 NOUN types (49%) and 36980 NOUN tokens (31%). Out of 14 observed tags, NOUN ranks 1st in number of lemmas, 1st in number of types and 1st in number of tokens.
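The three counts above can be reproduced from a CoNLL-U file along the following lines. This is a minimal sketch, not the script that generated this page: the function name `pos_counts` and the sample rows are hypothetical, and it assumes that types are distinct word forms compared without case folding.

```python
def pos_counts(conllu_text, upos):
    """Return (lemmas, types, tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        if cols[3] == upos:      # column 4 is UPOS
            tokens += 1
            types.add(cols[1])   # column 2 is FORM
            lemmas.add(cols[2])  # column 3 is LEMMA
    return len(lemmas), len(types), tokens


# Toy rows for illustration only; not real annotation from the treebank.
SAMPLE = "\n".join([
    "# text = hypothetical example",
    "1\tमंदिरात\tमंदिर\tNOUN\t_\t_\t3\tobl\t_\t_",
    "2\tमंदिर\tमंदिर\tNOUN\t_\t_\t3\tnsubj\t_\t_",
    "3\tआहे\tअस\tAUX\t_\t_\t0\troot\t_\t_",
    "",
    "1\tमंदिर\tमंदिर\tNOUN\t_\t_\t0\troot\t_\t_",
])

print(pos_counts(SAMPLE, "NOUN"))  # (1, 2, 3)
```

In the toy input, one lemma appears in two distinct forms across three tokens.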

The 10 most frequent NOUN lemmas: मंदिर, वर्ष, आज, काम, भाग, वेळ, पाणी, शहर, किलोमीटर, घर

The 10 most frequent NOUN types: आज, वेळी, किलोमीटर, अंतरावर, काम, मंदिर, किमी., पीक, पाणी, ता

The 10 most frequent ambiguous lemmas: मंदिर (NOUN 346, PROPN 15), वर्ष (NOUN 245, ADV 1, PROPN 1), आज (NOUN 244, ADV 4), काम (NOUN 223, VERB 1), भाग (NOUN 201, PROPN 2), पाणी (NOUN 181, PROPN 1), शहर (NOUN 179, PROPN 4, ADJ 1), घर (NOUN 175, ADV 1, PROPN 1), नाव (NOUN 161, ADV 1), दिवस (NOUN 154, PROPN 1)

The 10 most frequent ambiguous types: आज (NOUN 182, ADV 2), मंदिर (NOUN 129, PROPN 6), पीक (NOUN 112, PROPN 1), पाणी (NOUN 98, PROPN 1), निर्माण (NOUN 86, ADJ 12), माहिती (NOUN 86, PROPN 1), केंद्र (NOUN 67, PROPN 2), पोलिस (NOUN 65, PROPN 11), समावेश (NOUN 62, ADJ 1, PROPN 1), पर्यटन (NOUN 59, PROPN 1)
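An "ambiguous" lemma or type here is one that occurs with NOUN and with at least one other tag. The sketch below shows one way such a list could be computed; the helper `ambiguous_items` and the toy pairs are hypothetical and are not the generator's actual code.

```python
from collections import Counter, defaultdict


def ambiguous_items(pairs, upos="NOUN"):
    """Map each item seen with `upos` and some other tag to its tag counts.

    `pairs` is an iterable of (item, tag), where item is a lemma or a form.
    """
    tags_by_item = defaultdict(Counter)
    for item, tag in pairs:
        tags_by_item[item][tag] += 1
    return {
        item: dict(tags)
        for item, tags in tags_by_item.items()
        if upos in tags and len(tags) > 1
    }


# Toy data for illustration only.
toy_pairs = [("आज", "NOUN"), ("आज", "NOUN"), ("आज", "ADV"), ("काम", "NOUN")]
print(ambiguous_items(toy_pairs))  # {'आज': {'NOUN': 2, 'ADV': 1}}
```

Sorting the result by the NOUN count would give a ranking of the kind shown above.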

Morphology

The form / lemma ratio of NOUN is 2.037400 (the average of all parts of speech is 1.720393).
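The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given earlier on this page. That reading is an assumption based on the figures, not a documented definition:

```python
noun_types = 14763   # distinct NOUN word forms reported on this page
noun_lemmas = 7246   # distinct NOUN lemmas reported on this page

ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 2.037400
```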

The highest number of forms (46) was observed with the lemma “वर्ष”: वर्ष, वर्षं, वर्षअखेरपर्यंत, वर्षभर, वर्षभरात, वर्षही, वर्षांचा, वर्षांची, वर्षांचे, वर्षांच्या, वर्षांत, वर्षांनंतर, वर्षांनी, वर्षांपर्यंत, वर्षांपर्यंतच्या, वर्षांपासून, वर्षांपूर्वी, वर्षांपूर्वीच, वर्षांपूर्वीची, वर्षांपेक्षा, वर्षांपेक्षाही, वर्षांप्रमाणे, वर्षांमध्ये, वर्षांसाठी, वर्षाचा, वर्षाचे, वर्षाच्या, वर्षात, वर्षातील, वर्षानंतर, वर्षानिमित्त, वर्षापासून, वर्षापूर्वीही, वर्षापेक्षा, वर्षाला, वर्षासमान, वर्षासाठी, वर्षी, वर्षीच, वर्षीच्या, वर्षीपर्यंत, वर्षीपासून, वर्षीपेक्षा, वर्षीसुद्धा, वर्षीही, वर्षे.

The 2nd highest number of forms (42) was observed with the lemma “मंदिर”: मंदिर, मंदिरं, मंदिरसुद्धा, मंदिरा, मंदिरांचा, मंदिरांची, मंदिरांचे, मंदिरांच्या, मंदिरांना, मंदिरांनी, मंदिरांपर्यंत, मंदिरांपासून, मंदिरांपैकी, मंदिरांमध्ये, मंदिरांमध्येही, मंदिरांसाठी, मंदिराकडे, मंदिराचा, मंदिराची, मंदिराचीसुद्धा, मंदिराचे, मंदिराच्या, मंदिराच्याआतून, मंदिराच्यामधून, मंदिरात, मंदिरातील, मंदिरातून, मंदिरापर्यंत, मंदिरापासून, मंदिराबरोबर, मंदिराबरोबरच, मंदिरामधून, मंदिरामध्ये, मंदिराला, मंदिरावर, मंदिरावरून, मंदिराशी, मंदिरास, मंदिरासमोरच, मंदिरासमोरील, मंदिरासाठी, मंदिरे.

The 3rd highest number of forms (35) was observed with the lemma “गाव”: गवामधून, गाव, गावच्या, गावजवळच, गावांकडे, गावांचा, गावांच्या, गावांच्यामध्ये, गावांत, गावांतील, गावांतून, गावांना, गावांमधील, गावांमध्ये, गावाकडे, गावाचा, गावाची, गावाचे, गावाच्या, गावाच्याआतून, गावाच्यामधून, गावाजवळ, गावाजवळील, गावात, गावातील, गावातून, गावापर्यंत, गावापासून, गावाबाहेर, गावाबाहेरील, गावामध्ये, गावाला, गावाशेजारी, गावी, गावे.

NOUN occurs with 3 features: Number (33066; 89% instances), Case (32287; 87% instances), Gender (30814; 83% instances)

NOUN occurs with 10 feature-value pairs: Case=Acc, Case=Dat, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, Number=Plur, Number=Sing

NOUN occurs with 45 feature combinations. The most frequent feature combination is Case=Nom|Gender=Masc|Number=Sing (4837 tokens). Examples: वापर, निर्णय, समावेश, प्रयत्न, भाग, आनंद, धबधबा, कार्यक्रम, जिल्हा, विकास
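Feature statistics of this kind come from the FEATS column of CoNLL-U, where pairs are written as `Name=Value` and joined with `|`, and `_` marks an empty feature set. The following is a rough sketch of the counting. The function `feature_stats` and the toy values are hypothetical, and whether the page counts the empty set as a combination is not stated, so that choice is an assumption here.

```python
from collections import Counter


def feature_stats(feats_values):
    """Count tokens per feature name and per full feature combination."""
    per_feature = Counter()
    combos = Counter()
    for feats in feats_values:
        combos[feats] += 1  # assumption: "_" (no features) counts as a combination
        if feats == "_":
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            per_feature[name] += 1
    return per_feature, combos


# Toy FEATS values for illustration only.
toy_feats = [
    "Case=Nom|Gender=Masc|Number=Sing",
    "Case=Nom|Gender=Masc|Number=Sing",
    "Number=Plur",
    "_",
]
per_feature, combos = feature_stats(toy_feats)
print(dict(per_feature))      # {'Case': 2, 'Gender': 2, 'Number': 3}
print(combos.most_common(1))  # [('Case=Nom|Gender=Masc|Number=Sing', 2)]
```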

Relations

NOUN nodes are attached to their parents using 19 different relations: nmod (8452; 23% instances), obl (8185; 22% instances), compound (6133; 17% instances), nsubj (5226; 14% instances), obj (3723; 10% instances), conj (2051; 6% instances), root (1442; 4% instances), amod (691; 2% instances), case (411; 1% instances), iobj (299; 1% instances), acl (105; 0% instances), xcomp (96; 0% instances), nsubj:pass (75; 0% instances), advcl (40; 0% instances), acl:relcl (35; 0% instances), nmod:poss (7; 0% instances), dep (6; 0% instances), vocative (2; 0% instances), compound:lvc (1; 0% instances)
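The percentages in the list above appear to be shares of all 36980 NOUN tokens, rounded to the nearest whole number, which is why rare relations show as 0%. The check below uses the counts from this page; the rounding rule is an inference from the figures rather than a documented fact.

```python
# Relation counts for NOUN nodes, copied from the list above.
relation_counts = {
    "nmod": 8452, "obl": 8185, "compound": 6133, "nsubj": 5226, "obj": 3723,
    "conj": 2051, "root": 1442, "amod": 691, "case": 411, "iobj": 299,
    "acl": 105, "xcomp": 96, "nsubj:pass": 75, "advcl": 40, "acl:relcl": 35,
    "nmod:poss": 7, "dep": 6, "vocative": 2, "compound:lvc": 1,
}

total = sum(relation_counts.values())
print(total)  # 36980, the number of NOUN tokens

# Assumed rounding: nearest integer percentage of the total.
percent = {rel: round(100 * n / total) for rel, n in relation_counts.items()}
print(percent["nmod"], percent["obl"], percent["acl"])  # 23 22 0
```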

Parents of NOUN nodes belong to 13 different parts of speech: VERB (19686; 53% instances), NOUN (12535; 34% instances), [no parent: sentence root] (1442; 4% instances), PROPN (1382; 4% instances), ADJ (1013; 3% instances), NUM (306; 1% instances), AUX (178; 0% instances), ADV (169; 0% instances), PRON (117; 0% instances), ADP (63; 0% instances), DET (56; 0% instances), SCONJ (17; 0% instances), PART (16; 0% instances)

12894 (35%) NOUN nodes are leaves.

15132 (41%) NOUN nodes have one child.

5526 (15%) NOUN nodes have two children.

3428 (9%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 19.
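The child-degree figures above can be derived from the HEAD column by counting how many nodes point at each NOUN. Below is a minimal sketch on a toy sentence; the function `child_degree_histogram` and the tuple layout are hypothetical choices for illustration, not the generator's code.

```python
from collections import Counter


def child_degree_histogram(sentence, upos="NOUN", cap=3):
    """Histogram of child counts for nodes tagged `upos`.

    `sentence` is a list of (node_id, upos, head_id) tuples. Degrees of
    `cap` or more share one bucket ("three or more" on this page).
    """
    n_children = Counter(head for _, _, head in sentence)
    hist = Counter()
    for node_id, tag, _ in sentence:
        if tag == upos:
            hist[min(n_children[node_id], cap)] += 1
    return hist


# Toy sentence: node 1 modifies node 2; nodes 2 and 3 attach to the verb 4.
toy = [(1, "ADJ", 2), (2, "NOUN", 4), (3, "NOUN", 4), (4, "VERB", 0)]
hist = child_degree_histogram(toy)
print(dict(hist))  # {1: 1, 0: 1}

# The four buckets reported on this page add up to the NOUN token count.
bucket_total = 12894 + 15132 + 5526 + 3428
print(bucket_total)  # 36980
```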

Children of NOUN nodes are attached using 27 different relations: nmod (12834; 32% instances), amod (6784; 17% instances), compound (4166; 10% instances), punct (3330; 8% instances), det (2735; 7% instances), nummod (2475; 6% instances), conj (2010; 5% instances), cop (1304; 3% instances), cc (1274; 3% instances), nsubj (884; 2% instances), acl (715; 2% instances), case (679; 2% instances), dep (303; 1% instances), acl:relcl (220; 1% instances), obl (92; 0% instances), advmod (76; 0% instances), obj (51; 0% instances), xcomp (43; 0% instances), advcl (34; 0% instances), mark (28; 0% instances), dislocated (16; 0% instances), nmod:poss (15; 0% instances), aux (10; 0% instances), aux:pass (4; 0% instances), det:poss (2; 0% instances), ccomp (1; 0% instances), discourse (1; 0% instances)

Children of NOUN nodes belong to 13 different parts of speech: NOUN (12535; 31% instances), ADJ (5524; 14% instances), PROPN (4552; 11% instances), PRON (3774; 9% instances), PUNCT (3330; 8% instances), NUM (2672; 7% instances), VERB (2520; 6% instances), SCONJ (1472; 4% instances), DET (1383; 3% instances), AUX (1362; 3% instances), ADV (608; 2% instances), ADP (254; 1% instances), PART (100; 0% instances)