
This page pertains to UD version 2.

Treebank Statistics: UD_Persian-PerDT: POS Tags: NOUN

There are 12664 NOUN lemmas (49%), 19428 NOUN types (51%) and 168216 NOUN tokens (34%). Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.

The 10 most frequent NOUN lemmas: کس، سال، مردم، کار، روز، دست، کشور، همه، خدا، وقت

The 10 most frequent NOUN types: سال، مردم، کار، کسی، دست، روز، سر، خدا، صورت، کشور

The 10 most frequent ambiguous lemmas: کس (NOUN 1226, PRON 2), سال (NOUN 1222, PROPN 1), مردم (NOUN 1022, PROPN 2), کار (NOUN 961, PROPN 8), روز (NOUN 938, PROPN 7, ADJ 2), کشور (NOUN 779, PROPN 69), همه (NOUN 777, DET 118, PRON 7), خدا (NOUN 742, PROPN 55), وقت (NOUN 733, SCONJ 1), جا (NOUN 649, INTJ 1)

The 10 most frequent ambiguous types: سال (NOUN 984, PROPN 1), مردم (NOUN 961, PROPN 2), کار (NOUN 735, PROPN 8), کسی (NOUN 718, PRON 1), روز (NOUN 672, PROPN 7, ADJ 2), سر (NOUN 600, ADP 66, PROPN 3, ADJ 1), خدا (NOUN 598, PROPN 53), کشور (NOUN 545, PROPN 66), راه (NOUN 483, PROPN 18), وقتی (NOUN 477, SCONJ 3)

Morphology

The form / lemma ratio of NOUN is 1.534112 (the average of all parts of speech is 1.486663).
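The NOUN ratio above matches the type and lemma counts reported at the top of this page (19428 types divided by 12664 lemmas). A minimal sketch of that arithmetic follows; the variable names are illustrative only and are not taken from the tool that generated these statistics.

```python
# Counts copied from the summary line of this page.
noun_types = 19428   # distinct NOUN word forms
noun_lemmas = 12664  # distinct NOUN lemmas

# Form / lemma ratio: the average number of distinct forms per lemma.
form_lemma_ratio = round(noun_types / noun_lemmas, 6)
print(form_lemma_ratio)
```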

The 1st highest number of forms (11) was observed with the lemma “هدف”: اهداف, اهدافتان, اهدافش, اهدافم, اهدافمان, اهدافی, هدف, هدفی, هدف‌ها, هدف‌های, هدف‌هایی.

The 2nd highest number of forms (10) was observed with the lemma “اثر”: آثار, آثارش, آثارشان, آثارم, آثاری, اثر, اثرات, اثرها, اثرهای, اثری.

The 3rd highest number of forms (10) was observed with the lemma “دلیل”: ادله, ادلهٔ, دلائل, دلائلی, دلایل, دلایلی, دلیل, دلیلی, دلیل‌های, دلیل‌هایی.

NOUN occurs with 1 feature: Number (167690; 100% instances)

NOUN occurs with 2 feature-value pairs: Number=Plur, Number=Sing

NOUN occurs with 3 feature combinations. The most frequent feature combination is Number=Sing (139430 tokens). Examples: سال، کار، کسی، دست، روز، خدا، سر، صورت، کشور، بار

Relations

NOUN nodes are attached to their parents using 19 different relations: nmod (41210; 24% instances), compound:lvc (30807; 18% instances), obl (27344; 16% instances), nsubj (18818; 11% instances), obj (17110; 10% instances), obl:arg (16978; 10% instances), conj (10156; 6% instances), xcomp (1776; 1% instances), root (1472; 1% instances), appos (733; 0% instances), nsubj:pass (484; 0% instances), amod (419; 0% instances), acl (363; 0% instances), ccomp (266; 0% instances), advcl (166; 0% instances), vocative (97; 0% instances), csubj (9; 0% instances), iobj (6; 0% instances), goeswith (2; 0% instances)
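Counts like these can be reproduced by tallying the DEPREL column of every token whose UPOS is NOUN in the treebank's CoNLL-U files. The sketch below assumes the standard ten-column CoNLL-U layout; the three-token sample sentence and the function name `noun_relation_counts` are hypothetical and serve only as illustration.

```python
from collections import Counter

# Hypothetical three-token sentence, built column by column so the
# tab separators required by CoNLL-U are explicit.
SAMPLE = "\n".join("\t".join(row) for row in [
    ["1", "مردم", "مردم", "NOUN", "_", "Number=Sing", "3", "nsubj", "_", "_"],
    ["2", "کار", "کار", "NOUN", "_", "Number=Sing", "3", "compound:lvc", "_", "_"],
    ["3", "کردند", "کرد", "VERB", "_", "_", "0", "root", "_", "_"],
])

def noun_relation_counts(conllu_text):
    """Count the relations (DEPREL) that attach NOUN tokens to their parents."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence boundaries and comment lines
        cols = line.split("\t")
        # Skip malformed lines, multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == "NOUN":      # column 4 is UPOS
            counts[cols[7]] += 1   # column 8 is DEPREL
    return counts

relation_counts = noun_relation_counts(SAMPLE)
for relation, count in relation_counts.most_common():
    print(relation, count)
```

To run this on a real file, read its contents as UTF-8 text and pass the string to the function in place of `SAMPLE`.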

Parents of NOUN nodes belong to 15 different parts of speech (counting the empty category of root nodes, which have no parent): VERB (103943; 62% instances), NOUN (53611; 32% instances), ADJ (5143; 3% instances), PROPN (1611; 1% instances), no parent (1472; 1% instances), PRON (812; 0% instances), AUX (570; 0% instances), ADP (306; 0% instances), ADV (265; 0% instances), CCONJ (143; 0% instances), INTJ (130; 0% instances), SCONJ (118; 0% instances), NUM (77; 0% instances), DET (13; 0% instances), PART (2; 0% instances)

52981 (31%) NOUN nodes are leaves.

53137 (32%) NOUN nodes have one child.

46484 (28%) NOUN nodes have two children.

15614 (9%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 16.
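As a consistency check, the four child-degree counts above add up to the 168216 NOUN tokens reported at the top of this page, and the percentages are each count's share of that total, rounded to a whole number. The following sketch reproduces that arithmetic; the dictionary keys are illustrative labels only.

```python
# Counts copied from this page.
noun_tokens = 168216
degree_counts = {
    "leaves": 52981,
    "one child": 53137,
    "two children": 46484,
    "three or more children": 15614,
}

# Every NOUN node falls into exactly one of the four groups.
total = sum(degree_counts.values())

# Share of each group, rounded to a whole percent as on this page.
shares = {label: round(100 * count / noun_tokens)
          for label, count in degree_counts.items()}

for label, share in shares.items():
    print(f"{label}: {share}%")
```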

Children of NOUN nodes are attached using 28 different relations: case (59043; 30% instances), nmod (58694; 29% instances), amod (22218; 11% instances), conj (10356; 5% instances), det (10138; 5% instances), cc (9142; 5% instances), acl (7624; 4% instances), punct (7399; 4% instances), nummod (5048; 3% instances), cop (2382; 1% instances), nsubj (1701; 1% instances), obl (1698; 1% instances), advmod (1300; 1% instances), dep (1214; 1% instances), appos (538; 0% instances), mark (490; 0% instances), compound:lvc (458; 0% instances), advcl (166; 0% instances), csubj (163; 0% instances), obl:arg (153; 0% instances), ccomp (84; 0% instances), xcomp (33; 0% instances), obj (25; 0% instances), aux (21; 0% instances), vocative (10; 0% instances), compound (3; 0% instances), goeswith (2; 0% instances), flat:name (1; 0% instances)

Children of NOUN nodes belong to 16 different parts of speech: ADP (58653; 29% instances), NOUN (53611; 27% instances), ADJ (22797; 11% instances), PRON (12224; 6% instances), DET (10235; 5% instances), CCONJ (9668; 5% instances), PUNCT (7399; 4% instances), PROPN (7316; 4% instances), VERB (7116; 4% instances), NUM (5067; 3% instances), AUX (2506; 1% instances), ADV (1905; 1% instances), SCONJ (1373; 1% instances), INTJ (200; 0% instances), PART (33; 0% instances), X (1; 0% instances)