
This page pertains to UD version 2.

Treebank Statistics: UD_Persian-PerDT: POS Tags: PRON

There are 55 PRON lemmas (0%), 86 PRON types (0%) and 24140 PRON tokens (5%). Out of 16 observed tags, the rank of PRON is: 10 in number of lemmas, 9 in number of types and 6 in number of tokens.

The 10 most frequent PRON lemmas: او، من، خود، ما، آن، آنها، شما، تو، این، هم

The 10 most frequent PRON types: خود، او، آن، ش، ما، من، م، شما، آنها، این

The 10 most frequent ambiguous lemmas: او (PRON 4408, NOUN 3, PROPN 1), من (PRON 3482, NOUN 11, ADP 1), خود (PRON 2856, NOUN 3), ما (PRON 2343, NOUN 1), آن (PRON 2019, DET 1140, NOUN 16, ADJ 1), آنها (PRON 2007, NOUN 1), تو (PRON 1215, NOUN 11, ADP 10, PROPN 1), این (DET 4859, PRON 904, NOUN 2), هم (ADV 1295, PRON 465, ADJ 1), وی (PRON 396, PROPN 2)

The 10 most frequent ambiguous types: خود (PRON 3472, NOUN 1), او (PRON 2051, NOUN 3, PROPN 1), آن (PRON 1991, DET 1130, NOUN 10, ADJ 1), ش (PRON 1949, ADJ 2, NOUN 1), ما (PRON 1867, NOUN 1), من (PRON 1696, NOUN 8, ADP 1), م (PRON 1571, VERB 109, AUX 4, ADJ 1, PROPN 1), آنها (PRON 1187, NOUN 1), این (DET 4814, PRON 902, NOUN 2), تو (PRON 734, ADP 11, NOUN 5)

Morphology

The form / lemma ratio of PRON is 1.563636 (the average of all parts of speech is 1.486663).
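The reported ratio matches the type and lemma counts given earlier on this page (86 PRON types over 55 PRON lemmas). The sketch below reproduces that arithmetic and shows how the same quantity could be computed from a list of (form, lemma) pairs. The function name `form_lemma_ratio` and the toy pairs are illustrative assumptions, not part of the tooling that generated this page.

```python
def form_lemma_ratio(pairs):
    """Distinct word forms divided by distinct lemmas.

    `pairs` is an iterable of (form, lemma) tuples, e.g. all PRON tokens.
    """
    pairs = list(pairs)
    forms = {form for form, _ in pairs}
    lemmas = {lemma for _, lemma in pairs}
    return len(forms) / len(lemmas)


# Check against the counts reported on this page: 86 types, 55 lemmas.
print(round(86 / 55, 6))  # 1.563636

# Toy input (hypothetical selection of forms listed above for two lemmas).
toy = [("تو", "تو"), ("ت", "تو"), ("ات", "تو"), ("او", "او"), ("ش", "او")]
print(form_lemma_ratio(toy))  # 5 forms / 2 lemmas
```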

The 1st highest number of forms (7) was observed with the lemma “آنها”: آنها, آنهایی, آن‌ها, آن‌هایی, انها, شان, یشان.

The 2nd highest number of forms (6) was observed with the lemma “او”: اش, او, اویی, ش, و, یش.

The 3rd highest number of forms (6) was observed with the lemma “تو”: ات, ت, تو, توی, تو‌, یت.

PRON occurs with 3 features: Number (20256; 84% instances), Person (16853; 70% instances), PronType (5697; 24% instances)

PRON occurs with 6 feature-value pairs: Number=Plur, Number=Sing, Person=1, Person=2, Person=3, PronType=Prs

PRON occurs with 15 feature combinations. The most frequent feature combination is _ (3884 tokens). Examples: خود، هم، خویش، یکدیگر، کجا، همدیگر، چه، آن، چنین، خویشتن

Relations

PRON nodes are attached to their parents using 17 different relations: nmod (12618; 52% instances), nsubj (4412; 18% instances), obl:arg (2923; 12% instances), obj (2218; 9% instances), obl (1300; 5% instances), root (248; 1% instances), conj (103; 0% instances), xcomp (91; 0% instances), compound:lvc (62; 0% instances), appos (58; 0% instances), amod (32; 0% instances), ccomp (24; 0% instances), dep (16; 0% instances), acl (13; 0% instances), nsubj:pass (12; 0% instances), advcl (9; 0% instances), csubj (1; 0% instances)
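A relation distribution like the one above could be tallied from a CoNLL-U file roughly as follows. This is a minimal sketch: the sample sentence and its annotation are made up for illustration and are not taken from UD_Persian-PerDT, and `deprel_counts` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical one-sentence sample in CoNLL-U format (10 tab-separated columns).
SAMPLE = (
    "# text = او کتاب خود را خواند\n"
    "1\tاو\tاو\tPRON\t_\tNumber=Sing|Person=3|PronType=Prs\t5\tnsubj\t_\t_\n"
    "2\tکتاب\tکتاب\tNOUN\t_\tNumber=Sing\t5\tobj\t_\t_\n"
    "3\tخود\tخود\tPRON\t_\t_\t2\tnmod\t_\t_\n"
    "4\tرا\tرا\tADP\t_\t_\t2\tcase\t_\t_\n"
    "5\tخواند\tخواند\tVERB\t_\t_\t0\troot\t_\t_\n"
)


def deprel_counts(conllu, upos="PRON"):
    """Count the DEPREL values of all nodes with the given UPOS tag."""
    counts = Counter()
    for line in conllu.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            counts[cols[7]] += 1
    return counts


counts = deprel_counts(SAMPLE)
total = sum(counts.values())
for rel, n in counts.most_common():
    print(f"{rel} ({n}; {round(100 * n / total)}% instances)")
```

Applied to the figures on this page, the same rounding gives 12618 / 24140 ≈ 52% for nmod.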

Parents of PRON nodes belong to 12 different parts of speech: NOUN (12224; 51% instances), VERB (10069; 42% instances), PRON (784; 3% instances), ADJ (562; 2% instances), none, i.e. the artificial root (248; 1% instances), PROPN (103; 0% instances), AUX (50; 0% instances), ADP (33; 0% instances), INTJ (26; 0% instances), ADV (23; 0% instances), CCONJ (12; 0% instances), SCONJ (6; 0% instances)

15281 (63%) PRON nodes are leaves.

7258 (30%) PRON nodes have one child.

1111 (5%) PRON nodes have two children.

490 (2%) PRON nodes have three or more children.

The highest child degree of a PRON node is 7.
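The four child-degree counts above add up to the 24140 PRON tokens reported at the top of the page (15281 + 7258 + 1111 + 490). A histogram of this kind could be derived from CoNLL-U data along the following lines. The sample sentence and its annotation are invented for illustration, and `child_degree_histogram` is a hypothetical helper, not the script that produced these statistics.

```python
from collections import Counter

# Hypothetical sample: a pronoun with a case-marking child, the most common
# child relation of PRON nodes according to this page.
SAMPLE = (
    "# text = او به من گفت\n"
    "1\tاو\tاو\tPRON\t_\tNumber=Sing|Person=3|PronType=Prs\t4\tnsubj\t_\t_\n"
    "2\tبه\tبه\tADP\t_\t_\t3\tcase\t_\t_\n"
    "3\tمن\tمن\tPRON\t_\tNumber=Sing|Person=1|PronType=Prs\t4\tobl:arg\t_\t_\n"
    "4\tگفت\tگفت\tVERB\t_\t_\t0\troot\t_\t_\n"
)


def child_degree_histogram(conllu, upos="PRON"):
    """Map number-of-children -> how many nodes with the given UPOS have it."""
    hist = Counter()
    # Sentences in CoNLL-U are separated by blank lines.
    for block in conllu.strip().split("\n\n"):
        rows = [
            line.split("\t")
            for line in block.splitlines()
            if line and not line.startswith("#")
        ]
        # Keep only regular word lines (integer ID, 10 columns).
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        # Count how many words point to each head ID within this sentence.
        children = Counter(r[6] for r in rows)
        for r in rows:
            if r[3] == upos:
                hist[children[r[0]]] += 1
    return hist


hist = child_degree_histogram(SAMPLE)
for degree in sorted(hist):
    print(degree, hist[degree])
```

In the sample, او is a leaf and من has one child (the adposition به), so each degree is counted once.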

Children of PRON nodes are attached using 22 different relations: case (7032; 62% instances), acl (1185; 10% instances), nmod (904; 8% instances), punct (641; 6% instances), cop (313; 3% instances), nsubj (295; 3% instances), dep (281; 2% instances), conj (251; 2% instances), appos (130; 1% instances), obl (103; 1% instances), cc (81; 1% instances), advmod (61; 1% instances), advcl (26; 0% instances), mark (25; 0% instances), det (17; 0% instances), amod (6; 0% instances), obl:arg (6; 0% instances), csubj (4; 0% instances), xcomp (3; 0% instances), ccomp (2; 0% instances), nummod (1; 0% instances), obj (1; 0% instances)

Children of PRON nodes belong to 15 different parts of speech: ADP (7026; 62% instances), VERB (957; 8% instances), NOUN (812; 7% instances), PRON (784; 7% instances), PUNCT (641; 6% instances), ADV (342; 3% instances), AUX (321; 3% instances), SCONJ (234; 2% instances), CCONJ (99; 1% instances), PROPN (52; 0% instances), ADJ (46; 0% instances), PART (26; 0% instances), DET (18; 0% instances), INTJ (9; 0% instances), NUM (1; 0% instances)