
This page pertains to UD version 2.

Treebank Statistics: UD_Urdu-UDTB: POS Tags: NUM

There are 333 NUM lemmas (3%), 325 NUM types (3%) and 2461 NUM tokens (2%). Out of 16 observed tags, NUM ranks 5th in number of lemmas, 5th in number of types and 13th in number of tokens.
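The three counts above can be reproduced from a CoNLL-U file by collecting distinct lemmas, distinct word forms (types) and total occurrences (tokens) for one UPOS tag. The following is a minimal sketch, not the script that generated this page: the function name `pos_counts` and the transliterated toy sentences are made up for illustration, and it assumes that types are compared case-sensitively.

```python
# Toy CoNLL-U fragment (made-up data, not taken from UD_Urdu-UDTB).
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
SAMPLE = (
    "1\tek\tek\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tkitab\tkitab\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "1\tdo\tdo\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tlog\tlog\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "3\tdonon\tdo\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_\n"
)


def pos_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct types, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])
            lemmas.add(cols[2])
    return len(lemmas), len(types), tokens


print(pos_counts(SAMPLE, "NUM"))  # (2, 3, 3)
```

In the toy data, the forms "do" and "donon" share one lemma, so there are three types but only two lemmas.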

The 10 most frequent NUM lemmas: اےک، ایک، دو، تین، لاکھ، ہزار، چھ، چار، کروڑ، پانچ

The 10 most frequent NUM types: اےک، ایک، دو، تین، دونوں، لاکھ، ہزار، کروڑ، پانچ، چار

The 10 most frequent ambiguous lemmas: اےک (NUM 530, PRON 9, NOUN 2, PART 1), ایک (NUM 256, NOUN 5, PROPN 3, ADJ 1, ADV 1, PRON 1), دو (NUM 198, NOUN 20, ADJ 16, DET 3, PROPN 2, PRON 1), تین (NUM 79, ADJ 2, NOUN 2), لاکھ (NUM 53, NOUN 3), ہزار (NUM 53, NOUN 12, ADJ 2, PROPN 1), چار (NUM 45, NOUN 1), کروڑ (NUM 44, NOUN 5, ADJ 1), پانچ (NUM 42, ADJ 1), دس (NUM 34, ADJ 1, NOUN 1, PROPN 1)

The 10 most frequent ambiguous types: اےک (NUM 535, PRON 8, NOUN 3, PART 1), ایک (NUM 281, NOUN 5, PROPN 3, ADV 1, PRON 1), دو (NUM 139, PROPN 2, NOUN 1), دونوں (NUM 70, NOUN 22, ADJ 13, DET 3, PRON 1), لاکھ (NUM 48, NOUN 2), ہزار (NUM 47, NOUN 9, PROPN 1), کروڑ (NUM 45, NOUN 4), پانچ (NUM 44, ADJ 1), چار (NUM 44, NOUN 1), دس (NUM 34, NOUN 1, PROPN 1)

Morphology

The form / lemma ratio of NUM is 0.975976 (the average of all parts of speech is 1.103404).
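The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given earlier on this page (325 and 333). A quick check, assuming that is how the ratio is defined:

```python
# Counts taken from the overview statistics on this page.
num_types = 325
num_lemmas = 333

# Assumption: form / lemma ratio = distinct forms / distinct lemmas.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 0.975976
```

A ratio below 1 means there are fewer distinct word forms than distinct lemmas for this tag.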

The highest number of forms (6) was observed with the lemma “چھ”: 2, 3, 4, 5, 6, چھ.

The second highest number of forms (3) was observed with the lemma “ایک”: 15, 2.4, ایک.

The third highest number of forms (2) was observed with the lemma “1”: 1, 124361.

NUM occurs with 6 features: NumType (2461; 100% instances), Case (94; 4% instances), Number (59; 2% instances), Gender (45; 2% instances), Person (25; 1% instances), Echo (6; 0% instances)

NUM occurs with 9 feature-value pairs: Case=Acc, Case=Nom, Echo=Rdp, Gender=Fem, Gender=Masc, NumType=Card, Number=Plur, Number=Sing, Person=3

NUM occurs with 19 feature combinations. The most frequent feature combination is NumType=Card (2353 tokens). Examples: اےک، ایک، دو، تین، دونوں، لاکھ، ہزار، پانچ، کروڑ، چار

Relations

NUM nodes are attached to their parents using 15 different relations: nummod (2126; 86% instances), compound (139; 6% instances), obl (46; 2% instances), nmod (36; 1% instances), dep (29; 1% instances), nsubj (21; 1% instances), conj (17; 1% instances), obj (16; 1% instances), amod (8; 0% instances), root (7; 0% instances), xcomp (7; 0% instances), acl:relcl (5; 0% instances), acl (2; 0% instances), dislocated (1; 0% instances), iobj (1; 0% instances)
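The percentages in this list are consistent with each relation count divided by the total number of NUM tokens (2461), rounded to the nearest integer. The sketch below recomputes them from the counts quoted above; the helper name `shares` is hypothetical, and the rounding rule is an assumption inferred from the figures.

```python
# Relation counts for NUM nodes, copied from the list above.
RELATION_COUNTS = {
    "nummod": 2126, "compound": 139, "obl": 46, "nmod": 36, "dep": 29,
    "nsubj": 21, "conj": 17, "obj": 16, "amod": 8, "root": 7, "xcomp": 7,
    "acl:relcl": 5, "acl": 2, "dislocated": 1, "iobj": 1,
}
NUM_TOKENS = 2461  # total NUM tokens reported on this page


def shares(counts, total):
    """Return each count as a whole-number percentage of total."""
    return {rel: round(100 * n / total) for rel, n in counts.items()}


percent = shares(RELATION_COUNTS, NUM_TOKENS)

# Every NUM node has exactly one incoming relation, so the counts
# should add up to the token total.
assert sum(RELATION_COUNTS.values()) == NUM_TOKENS

print(percent["nummod"], percent["compound"], percent["amod"])  # 86 6 0
```

Relations shown with 0% are not absent; their share is simply below 0.5%.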

Parents of NUM nodes fall into 8 different categories, 7 parts of speech plus the artificial root, which has no tag: NOUN (2001; 81% instances), NUM (242; 10% instances), PROPN (92; 4% instances), VERB (79; 3% instances), ADJ (29; 1% instances), PRON (7; 0% instances), no parent, i.e. root (7; 0% instances), DET (4; 0% instances)

1987 (81%) NUM nodes are leaves.

329 (13%) NUM nodes have one child.

94 (4%) NUM nodes have two children.

51 (2%) NUM nodes have three or more children.

The highest child degree of a NUM node is 9.
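The child-degree figures above (leaves, one child, two children, three or more) amount to a histogram of how many dependents each NUM node has. The following is a minimal sketch of that computation over CoNLL-U text; the function name `child_degree_histogram` and the transliterated toy sentences are invented for illustration and are not from the treebank.

```python
from collections import Counter

# Toy CoNLL-U fragment (made-up data, not taken from UD_Urdu-UDTB).
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
SAMPLE = (
    "1\tdas\tdas\tNUM\t_\tNumType=Card\t2\tcompound\t_\t_\n"
    "2\thazar\thazar\tNUM\t_\tNumType=Card\t3\tnummod\t_\t_\n"
    "3\tlog\tlog\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "1\tek\tek\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tkitab\tkitab\tNOUN\t_\t_\t0\troot\t_\t_\n"
)


def child_degree_histogram(conllu_text, upos):
    """Map number of children -> number of nodes with that UPOS tag."""
    hist = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Keep basic word lines only (skip ranges like 1-2 and empty nodes).
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        # Count how often each ID appears in the HEAD column.
        children = Counter(r[6] for r in rows)
        for r in rows:
            if r[3] == upos:
                hist[children[r[0]]] += 1
    return hist


print(dict(child_degree_histogram(SAMPLE, "NUM")))  # {0: 2, 1: 1}
```

As a consistency check on the reported figures, the four groups add up to the NUM token total: 1987 + 329 + 94 + 51 = 2461.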

Children of NUM nodes are attached using 19 different relations: dep (147; 21% instances), compound (132; 19% instances), case (114; 16% instances), punct (65; 9% instances), nummod (57; 8% instances), nmod (53; 7% instances), amod (26; 4% instances), conj (22; 3% instances), cop (20; 3% instances), nsubj (20; 3% instances), cc (17; 2% instances), obl (12; 2% instances), det (8; 1% instances), mark (4; 1% instances), obj (4; 1% instances), acl:relcl (3; 0% instances), advcl (1; 0% instances), advmod (1; 0% instances), iobj (1; 0% instances)

Children of NUM nodes belong to 14 different parts of speech: NUM (242; 34% instances), PART (129; 18% instances), ADP (122; 17% instances), PUNCT (65; 9% instances), NOUN (53; 7% instances), AUX (20; 3% instances), CCONJ (17; 2% instances), PROPN (15; 2% instances), DET (11; 2% instances), ADJ (10; 1% instances), ADV (8; 1% instances), PRON (7; 1% instances), SCONJ (4; 1% instances), VERB (4; 1% instances)