
This page pertains to UD version 2.

Treebank Statistics: UD_Hebrew-HTB: POS Tags: ADJ

There are 1310 ADJ lemmas (12%), 2526 ADJ types (13%) and 8416 ADJ tokens (5%). Out of 15 observed tags, the rank of ADJ is: 4 in number of lemmas, 4 in number of types and 6 in number of tokens.

The 10 most frequent ADJ lemmas: _, אחר, רב, חדש, גדול, לאומי, אחרון, ישראלי, אמריקני, טוב

The 10 most frequent ADJ types: אחרים, ראשון, גדול, לאומי, חדש, אחר, קשה, ראשונה, צריך, רבים

The 10 most frequent ambiguous lemmas: _ (NOUN 365, VERB 326, ADJ 230, ADV 192, AUX 169, CCONJ 109, X 76, PRON 57, SCONJ 46, DET 33), אחר (ADJ 212, ADP 47), רב (ADJ 177, NOUN 83, ADV 3), גדול (ADJ 147, NOUN 1), לאומי (ADJ 129, PROPN 1), ישראלי (ADJ 106, NOUN 12), אמריקני (ADJ 98, NOUN 16), טוב (ADJ 95, NOUN 11, ADV 3), נוסף (ADJ 92, VERB 18, ADP 2), שונה (ADJ 81, VERB 3)

The 10 most frequent ambiguous types: ראשון (ADJ 75, PROPN 19, NUM 5), גדול (ADJ 70, NOUN 1), לאומי (ADJ 68, PROPN 1), אחר (ADJ 54, ADP 47, ADV 2), קשה (ADJ 54, ADV 11), רבים (ADJ 49, NOUN 35), רב (ADJ 47, NOUN 39, ADV 3), טוב (ADJ 44, NOUN 7, ADV 3), ישראלי (ADJ 43, NOUN 5), אחרת (ADJ 38, ADV 9)

Morphology

The form / lemma ratio of ADJ is 1.928244 (the average of all parts of speech is 1.702584).
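The reported ratio is consistent with dividing the number of ADJ types by the number of ADJ lemmas given at the top of this page. A minimal check, assuming that is how the figure is defined (the variable names are illustrative only):

```python
# Counts quoted in the summary paragraph of this page.
adj_types = 2526
adj_lemmas = 1310

# Assumption: "form / lemma ratio" means distinct word forms (types) per lemma.
form_lemma_ratio = adj_types / adj_lemmas

print(f"{form_lemma_ratio:.6f}")  # prints 1.928244
```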

The 1st highest number of forms (57) was observed with the lemma “_”: אזוטרי, אידאליסטי, אלמנטרי, אקספרסיווי, אשמה, בית”ריות, בכורה, בלטיות, בלטית, בנאליות, גרמניה, דומה, דרמאטי, הבעתי, המוצעת, ויזואלית, זקוק, חובבניות, חוליגאנים, טכסיים, טרפים, יהודיה, יורקית, ייצוגי, יתר, כווייתים, כולל, כוללת, לבדך, מגוייס, מגוררת, מדוייקות, מהולל, מהוללת, מהפכני, נ”ל, סבאי, סיטרי, סימבולי, סימבולית, סיעודי, ספורטיווית, עיצוביים, עלוב, פונדמליסטיים, פלורנטיני, קולינארית, קולינרי, קיים, ראשון, ראשונה, ראשונות, ראשונים, רצוי, רשמית, שיפוטיות, תיאמן.

The 2nd highest number of forms (7) was observed with the lemma “ותיק”: וותיק, וותיקה, וותיקות, וותיקים, ותיק, ותיקה, ותיקים.

The 3rd highest number of forms (7) was observed with the lemma “יהודי”: יהודי, יהודיה, יהודיות, יהודייה, יהודיים, יהודים, יהודית.

ADJ occurs with 4 features: Gender (8289; 98% instances), Number (8289; 98% instances), Definite (104; 1% instances), Abbr (10; 0% instances)

ADJ occurs with 6 feature-value pairs: Abbr=Yes, Definite=Cons, Gender=Fem, Gender=Masc, Number=Plur, Number=Sing

ADJ occurs with 13 feature combinations. The most frequent feature combination is Gender=Masc|Number=Sing (3546 tokens). Examples: ראשון, לאומי, גדול, חדש, אחר, צריך, קשה, טוב, ישראלי, אמריקאי

Relations

ADJ nodes are attached to their parents using 22 different relations: amod (6449; 77% instances), conj (487; 6% instances), root (429; 5% instances), acl:relcl (226; 3% instances), dep (172; 2% instances), obl (107; 1% instances), ccomp (103; 1% instances), xcomp (87; 1% instances), nmod (77; 1% instances), advcl (73; 1% instances), parataxis (33; 0% instances), appos (32; 0% instances), nsubj (30; 0% instances), advmod (28; 0% instances), fixed (27; 0% instances), flat:name (15; 0% instances), obj (11; 0% instances), compound:smixut (10; 0% instances), nmod:poss (9; 0% instances), acl (6; 0% instances), nsubj:outer (3; 0% instances), nsubj:cop (2; 0% instances)
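Counts like these can be reproduced by tallying the DEPREL column for every token whose UPOS is ADJ. The sketch below is not the script that generated this page. It assumes standard 10-column CoNLL-U input and runs on a small hypothetical English fragment, not on data from UD_Hebrew-HTB. The function name `count_relations` is made up for illustration.

```python
from collections import Counter

# Hypothetical two-sentence CoNLL-U fragment (NOT taken from UD_Hebrew-HTB).
SAMPLE = """\
1\tbig\tbig\tADJ\t_\t_\t2\tamod\t_\t_
2\thouse\thouse\tNOUN\t_\t_\t0\troot\t_\t_

1\tit\tit\tPRON\t_\t_\t3\tnsubj\t_\t_
2\tis\tbe\tAUX\t_\t_\t3\tcop\t_\t_
3\tnew\tnew\tADJ\t_\t_\t0\troot\t_\t_
4\tand\tand\tCCONJ\t_\t_\t5\tcc\t_\t_
5\tgood\tgood\tADJ\t_\t_\t3\tconj\t_\t_
"""


def count_relations(conllu_text, upos="ADJ"):
    """Count the relation (DEPREL) that attaches each node with the given UPOS to its parent."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank sentence separators and comment lines
        cols = line.split("\t")
        # Keep only regular word lines: skip multiword-token ranges (e.g. "1-2")
        # and empty nodes (e.g. "1.1"), whose IDs are not plain integers.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:        # column 4 is UPOS
            counts[cols[7]] += 1   # column 8 is DEPREL
    return counts


relation_counts = count_relations(SAMPLE)
for relation, n in relation_counts.most_common():
    print(relation, n)
```

Skipping multiword-token range lines matters for Hebrew in particular, where a single surface token is often split into several syntactic words.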

Parents of ADJ nodes belong to 14 different parts of speech (the entry with no tag covers ADJ nodes attached as root, which have no parent): NOUN (6756; 80% instances), VERB (523; 6% instances), ADJ (437; 5% instances), [no parent: root] (429; 5% instances), PROPN (100; 1% instances), ADV (73; 1% instances), AUX (35; 0% instances), PRON (26; 0% instances), ADP (10; 0% instances), X (7; 0% instances), CCONJ (6; 0% instances), DET (6; 0% instances), NUM (6; 0% instances), SCONJ (2; 0% instances)

3381 (40%) ADJ nodes are leaves.

3295 (39%) ADJ nodes have one child.

602 (7%) ADJ nodes have two children.

1138 (14%) ADJ nodes have three or more children.

The highest child degree of an ADJ node is 14.
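As a sanity check, the four child-count figures above should add up to the 8416 ADJ tokens reported at the top of the page, and the percentages should follow from rounding each share. A short sketch using only the numbers quoted on this page (the dictionary keys are illustrative labels):

```python
# Number of ADJ nodes by number of children, as quoted above.
child_counts = {"0": 3381, "1": 3295, "2": 602, "3+": 1138}
total_adj_tokens = 8416  # ADJ token count from the summary paragraph

total = sum(child_counts.values())

# Share of each bucket, rounded to a whole percent.
percentages = {k: round(100 * v / total) for k, v in child_counts.items()}

print(total == total_adj_tokens)
print(percentages)
```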

Children of ADJ nodes are attached using 33 different relations: det (3024; 32% instances), punct (1230; 13% instances), advmod (731; 8% instances), obl (689; 7% instances), nsubj (684; 7% instances), conj (491; 5% instances), cc (462; 5% instances), mark (453; 5% instances), cop (372; 4% instances), xcomp (301; 3% instances), dep (190; 2% instances), case (149; 2% instances), advcl (129; 1% instances), compound:affix (115; 1% instances), compound:smixut (86; 1% instances), csubj (68; 1% instances), nsubj:cop (40; 0% instances), parataxis (21; 0% instances), ccomp (19; 0% instances), amod (18; 0% instances), obj (18; 0% instances), acl:relcl (17; 0% instances), case:gen (17; 0% instances), nummod (10; 0% instances), appos (9; 0% instances), nsubj:outer (9; 0% instances), fixed (8; 0% instances), mark:q (6; 0% instances), case:acc (4; 0% instances), nmod:poss (4; 0% instances), dislocated (3; 0% instances), acl (2; 0% instances), nmod (2; 0% instances)

Children of ADJ nodes belong to 14 different parts of speech: DET (3060; 33% instances), NOUN (1253; 13% instances), PUNCT (1230; 13% instances), ADV (852; 9% instances), VERB (657; 7% instances), CCONJ (483; 5% instances), SCONJ (451; 5% instances), ADJ (437; 5% instances), AUX (367; 4% instances), PRON (291; 3% instances), ADP (173; 2% instances), PROPN (98; 1% instances), NUM (24; 0% instances), X (5; 0% instances)