
This page pertains to UD version 2.

Treebank Statistics: UD_Hebrew-PostRab: POS Tags: NOUN

There are 600 NOUN lemmas (42%), 788 NOUN types (38%) and 1819 NOUN tokens (23%). Out of 13 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
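The three counts above (lemmas, types, tokens) can be approximated from a CoNLL-U file along the following lines. This is a minimal sketch, not the script that generated this page: the function name `pos_counts` and the four-token sample are hypothetical, and it assumes that a "type" is a distinct word form exactly as written.

```python
def pos_counts(conllu_text, upos="NOUN"):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:  # column 4 is UPOS
            tokens += 1
            types.add(cols[1])   # column 2 is FORM
            lemmas.add(cols[2])  # column 3 is LEMMA
    return len(lemmas), len(types), tokens


# Hypothetical four-token sample (not a real treebank sentence).
rows = [
    ["1", "שנה", "שנה", "NOUN", "_", "Gender=Fem|Number=Sing", "0", "root", "_", "_"],
    ["2", "שנים", "שנה", "NOUN", "_", "Gender=Fem|Number=Plur", "1", "conj", "_", "_"],
    ["3", "שנה", "שנה", "NOUN", "_", "Gender=Fem|Number=Sing", "1", "conj", "_", "_"],
    ["4", "שם", "שם", "ADV", "_", "_", "1", "advmod", "_", "_"],
]
sample = "\n".join("\t".join(r) for r in rows)
print(pos_counts(sample))  # (1, 2, 3)
```

In the sample, the lemma שנה appears as a NOUN three times in two distinct forms, so the result is one lemma, two types and three tokens.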

The 10 most frequent NOUN lemmas: יום, מקום, בית, יד, דבר, גט, שם, ראש, זמן, פה

The 10 most frequent NOUN types: בית, מקום, יום, גט, ראש, יד, פי, זמן, ארץ, בעל

The 10 most frequent ambiguous lemmas: יד (NOUN 33, NUM 1), דבר (NOUN 28, VERB 1), שם (NOUN 26, ADV 25, PROPN 4, VERB 3), שנה (NOUN 23, VERB 1), בן (NOUN 21, VERB 2), בעל (NOUN 19, VERB 1), עד (ADP 21, NOUN 16), דרך (NOUN 15, VERB 1), אדם (NOUN 12, PROPN 3), חלק (NOUN 6, VERB 2)

The 10 most frequent ambiguous types: יד (NOUN 21, NUM 1), בעל (NOUN 18, VERB 1), שם (ADV 25, NOUN 14, PROPN 4, VERB 1), דבר (NOUN 13, VERB 1), אדם (NOUN 12, PROPN 3), מצוה (NOUN 9, VERB 1), מוכר (NOUN 8, VERB 1), הן (PRON 11, NOUN 5), חלק (NOUN 5, VERB 2), כלים (NOUN 4, VERB 1)

Morphology

The form / lemma ratio of NOUN is 1.313333 (the average of all parts of speech is 1.440168).
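The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given earlier on this page. The short check below assumes that this is how the ratio is defined; the variable names are illustrative.

```python
noun_types = 788   # distinct NOUN word forms reported on this page
noun_lemmas = 600  # distinct NOUN lemmas reported on this page

ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 1.313333
```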

The highest number of forms (5) was observed with the lemma “שנה”: שנה, שנותי, שנים, שנת, שנתים.

The 2nd highest number of forms (4) was observed with the lemma “אשה”: אשה, אשת, נשי, נשים.

The 3rd highest number of forms (4) was observed with the lemma “בן”: בן, בנ, בני, בנים.

NOUN occurs with 3 features: Number (1766; 97% instances), Gender (1763; 97% instances), ExtPos (2; 0% instances)

NOUN occurs with 7 feature-value pairs: ExtPos=ADV, Gender=Fem, Gender=Fem,Masc, Gender=Masc, Number=Dual, Number=Plur, Number=Sing

NOUN occurs with 11 feature combinations. The most frequent feature combination is Gender=Masc|Number=Sing (849 tokens). Examples: בית, מקום, יום, גט, ראש, זמן, בעל, שליח, עולם, דבר
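The three feature statistics above (features, feature-value pairs, feature combinations) differ only in how the FEATS column is split. The sketch below illustrates that distinction under the assumption that a multi-valued entry such as `Gender=Fem,Masc` is counted as a single value, as the list of pairs above suggests. The helper names and the sample values are hypothetical.

```python
from collections import Counter


def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":  # "_" means no features
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_stats(feats_column):
    """Count features, feature-value pairs and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1  # the full string is one combination
        for name, value in parse_feats(feats).items():
            features[name] += 1
            pairs[f"{name}={value}"] += 1
    return features, pairs, combos


# Hypothetical FEATS values for four NOUN tokens.
sample = [
    "Gender=Masc|Number=Sing",
    "Gender=Masc|Number=Sing",
    "Gender=Fem|Number=Plur",
    "_",
]
features, pairs, combos = feature_stats(sample)
print(features["Number"], pairs["Gender=Masc"], combos.most_common(1))
```

Here a token without features contributes to the combination count (as `_`) but not to any feature or pair count, which is why the per-feature totals on this page can be lower than the token total.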

Relations

NOUN nodes are attached to their parents using 31 different relations: obl (458; 25% instances), compound:smixut (312; 17% instances), obj (233; 13% instances), nsubj (223; 12% instances), conj (166; 9% instances), nmod (90; 5% instances), nsubj:cop (84; 5% instances), fixed (38; 2% instances), appos (35; 2% instances), acl:relcl (25; 1% instances), obl:unmarked (24; 1% instances), root (24; 1% instances), advcl (22; 1% instances), dislocated (19; 1% instances), ccomp (11; 1% instances), nsubj:outer (7; 0% instances), orphan (7; 0% instances), xcomp (7; 0% instances), nmod:poss (5; 0% instances), obl:tmod (5; 0% instances), flat (4; 0% instances), case (3; 0% instances), csubj (3; 0% instances), parataxis (3; 0% instances), acl (2; 0% instances), advmod (2; 0% instances), dep (2; 0% instances), nummod (2; 0% instances), amod (1; 0% instances), nmod:tmod (1; 0% instances), nmod:unmarked (1; 0% instances)

Parents of NOUN nodes belong to 12 different parts of speech (counting the artificial root, which has no tag): VERB (971; 53% instances), NOUN (605; 33% instances), ADJ (83; 5% instances), ADV (37; 2% instances), PRON (28; 2% instances), no tag, i.e. the artificial root (24; 1% instances), AUX (18; 1% instances), ADP (17; 1% instances), NUM (13; 1% instances), PROPN (12; 1% instances), DET (9; 0% instances), CCONJ (2; 0% instances)

255 (14%) NOUN nodes are leaves.

694 (38%) NOUN nodes have one child.

565 (31%) NOUN nodes have two children.

305 (17%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 12.
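The child-degree breakdown above can be derived from the HEAD column by counting how many tokens point at each NOUN. The following is a minimal sketch for a single sentence; the function name, the bucket labels and the five-token sample are hypothetical. The final lines check that the four counts reported on this page add up to the 1819 NOUN tokens.

```python
from collections import Counter


def child_degree_buckets(heads, upos, target="NOUN"):
    """Bucket tokens tagged `target` by their number of children.

    heads[i] is the 1-based HEAD of token i+1 (0 means root);
    upos[i] is the UPOS tag of token i+1.
    """
    n_children = Counter(h for h in heads if h != 0)
    labels = {0: "leaf", 1: "one", 2: "two"}
    buckets = Counter()
    for idx, tag in enumerate(upos, start=1):
        if tag == target:
            buckets[labels.get(n_children[idx], "three_or_more")] += 1
    return buckets


# Hypothetical sentence: token 2 is the root verb, token 4 is a NOUN with two children.
heads = [2, 0, 4, 2, 4]
upos = ["NOUN", "VERB", "ADP", "NOUN", "DET"]
buckets = child_degree_buckets(heads, upos)
print(buckets["leaf"], buckets["two"])  # 1 1

# Counts reported on this page: leaves, one child, two children, three or more.
page_counts = [255, 694, 565, 305]
print(sum(page_counts))  # 1819
```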

Children of NOUN nodes are attached using 33 different relations: case (673; 23% instances), det (525; 18% instances), compound:smixut (362; 12% instances), nmod:poss (256; 9% instances), conj (184; 6% instances), cc (174; 6% instances), acl:relcl (138; 5% instances), amod (91; 3% instances), nmod (89; 3% instances), mark (80; 3% instances), nummod (73; 2% instances), advmod (56; 2% instances), nsubj:cop (48; 2% instances), appos (36; 1% instances), cop (33; 1% instances), nsubj (29; 1% instances), advcl (24; 1% instances), acl (11; 0% instances), obl (7; 0% instances), fixed (5; 0% instances), parataxis (5; 0% instances), csubj (4; 0% instances), flat (4; 0% instances), case:gen (3; 0% instances), dep (3; 0% instances), dislocated (3; 0% instances), case:acc (2; 0% instances), ccomp (2; 0% instances), discourse (1; 0% instances), nmod:tmod (1; 0% instances), nmod:unmarked (1; 0% instances), nsubj:outer (1; 0% instances), orphan (1; 0% instances)

Children of NOUN nodes belong to 13 different parts of speech: ADP (680; 23% instances), NOUN (605; 21% instances), DET (490; 17% instances), PRON (372; 13% instances), CCONJ (187; 6% instances), VERB (162; 6% instances), ADJ (120; 4% instances), NUM (93; 3% instances), PROPN (67; 2% instances), ADV (64; 2% instances), SCONJ (61; 2% instances), AUX (21; 1% instances), INTJ (3; 0% instances)