
This page pertains to UD version 2.

Treebank Statistics: UD_Hebrew: POS Tags: DET

There are 20 DET lemmas (0%), 26 DET types (0%) and 17424 DET tokens (11%). Out of 16 observed tags, the rank of DET is: 12 in number of lemmas, 12 in number of types and 4 in number of tokens.
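Counts of this kind can be derived from a CoNLL-U file, where each token line has ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The following is a minimal sketch, not the script that produced this page: the function name `pos_counts` and the two-word toy sentence are illustrative assumptions, and the real statistics may normalize forms differently.

```python
def pos_counts(conllu_text, upos="DET"):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Keep only ordinary tokens: multiword ranges ("1-2") and
        # empty nodes ("1.1") have non-numeric IDs and are skipped.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Invented toy data (not taken from the treebank), built with explicit tabs.
rows = [
    ["1", "ה", "ה", "DET", "_", "PronType=Art", "2", "det:def", "_", "_"],
    ["2", "ילד", "ילד", "NOUN", "_", "_", "0", "root", "_", "_"],
    ["3", "ה_", "ה", "DET", "_", "PronType=Art", "4", "det:def", "_", "_"],
    ["4", "בית", "בית", "NOUN", "_", "_", "2", "nmod", "_", "_"],
]
sample = "\n".join("\t".join(row) for row in rows)

print(pos_counts(sample))  # (1, 2, 2): one lemma, two types, two tokens
```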

The 10 most frequent DET lemmas: ה, כול, כמה, הרבה, רוב, _, שום, מספר, אף, שאר

The 10 most frequent DET types: ה, ה_, כל, כמה, הרבה, רוב, שום, מספר, אף, שאר

The 10 most frequent ambiguous lemmas: ה (DET 16515, SCONJ 745, X 28), כול (DET 524, NOUN 31, ADV 1), הרבה (DET 35, ADV 22, VERB 13), רוב (DET 34, NOUN 15), _ (VERB 420, NOUN 368, ADJ 231, ADP 190, ADV 174, PRON 130, CCONJ 113, AUX 99, X 86, SCONJ 47, PART 34, DET 33), שום (DET 33, NOUN 4, PROPN 1), מספר (NOUN 38, DET 31), אף (CCONJ 99, DET 20, NOUN 13), שאר (NOUN 20, DET 16), מרבית (DET 14, NOUN 2)

The 10 most frequent ambiguous types: ה (DET 13596, SCONJ 745, X 21), ה_ (DET 2935, X 8), הרבה (DET 35, ADV 22, VERB 3, X 1), רוב (DET 34, NOUN 10), שום (DET 33, NOUN 4, PROPN 1), מספר (DET 31, NOUN 30, VERB 6), אף (CCONJ 99, DET 20, ADV 14, NOUN 12), שאר (NOUN 19, DET 16), מרבית (DET 14, NOUN 1), מחצית (NOUN 22, DET 11, X 1)

Morphology

The form / lemma ratio of DET is 1.300000 (the average of all parts of speech is 1.709692).

The 1st highest number of forms (5) was observed with the lemma “_”: אילו, ה, מחצית, מירב, מרבה.

The 2nd highest number of forms (2) was observed with the lemma “איזה”: איזה, איזו.

The 3rd highest number of forms (2) was observed with the lemma “ה”: ה, ה_.
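The reported ratio of 1.3 is consistent with dividing the 26 DET types by the 20 DET lemmas. The sketch below computes the average number of distinct forms per lemma from (form, lemma) pairs; this definition is an assumption about how the statistic is derived, and `form_lemma_ratio` is a hypothetical helper name.

```python
from collections import defaultdict


def form_lemma_ratio(pairs):
    """Average number of distinct word forms per lemma."""
    forms_by_lemma = defaultdict(set)
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
    if not forms_by_lemma:
        return 0.0
    total_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return total_forms / len(forms_by_lemma)


# Toy input using the two multi-form lemmas listed above;
# the repeated pair shows that duplicates are counted once.
pairs = [
    ("איזה", "איזה"),
    ("איזו", "איזה"),
    ("ה", "ה"),
    ("ה_", "ה"),
    ("ה", "ה"),
]
print(form_lemma_ratio(pairs))  # 2.0

# The figure reported for DET as a whole:
print(26 / 20)  # 1.3
```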

DET occurs with 4 features: PronType (16531; 95% instances), Definite (893; 5% instances), Gender (18; 0% instances), HebSource (9; 0% instances)

DET occurs with 5 feature-value pairs: Definite=Cons, Gender=Masc, HebSource=ConvUncertainHead, HebSource=ConvUncertainLabel, PronType=Art

DET occurs with 5 feature combinations. The most frequent feature combination is PronType=Art (16528 tokens). Examples: ה, ה_

Relations

DET nodes are attached to their parents using 21 different relations: det:def (16346; 94% instances), det (766; 4% instances), dep (144; 1% instances), advmod (39; 0% instances), fixed (36; 0% instances), mark (31; 0% instances), advcl (9; 0% instances), obl (9; 0% instances), compound:smixut (8; 0% instances), nsubj (7; 0% instances), obj (6; 0% instances), root (4; 0% instances), amod (3; 0% instances), aux:q (3; 0% instances), nmod:poss (3; 0% instances), advmod:phrase (2; 0% instances), appos (2; 0% instances), conj (2; 0% instances), nsubj:cop (2; 0% instances), iobj (1; 0% instances), parataxis (1; 0% instances)

Parents of DET nodes belong to 12 different parts of speech: NOUN (13173; 76% instances), ADJ (3077; 18% instances), NUM (357; 2% instances), VERB (295; 2% instances), PRON (236; 1% instances), PROPN (180; 1% instances), ADV (58; 0% instances), ADP (19; 0% instances), AUX (17; 0% instances), DET (5; 0% instances), no tag, i.e. the artificial root (4; 0% instances), PUNCT (3; 0% instances)

17345 (100%) DET nodes are leaves.

30 (0%) DET nodes have one child.

31 (0%) DET nodes have two children.

18 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 20.
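The four child-count figures above partition all DET tokens (17345 + 30 + 31 + 18 = 17424). As a rough sketch of how such a breakdown could be computed for one sentence, assuming the heads are given as 1-based indices with 0 for the root (as in the CoNLL-U HEAD column); the function name, the bucket labels and the toy sentence are illustrative only.

```python
from collections import Counter


def child_degree_buckets(heads, is_det):
    """Bucket DET tokens of one sentence by their number of children.

    heads[i]  : 1-based index of the head of token i+1 (0 means root)
    is_det[i] : True if token i+1 is tagged DET
    """
    n_children = Counter(h for h in heads if h > 0)
    buckets = Counter()
    for token_id, det in enumerate(is_det, start=1):
        if not det:
            continue
        degree = n_children[token_id]  # 0 if the token heads nothing
        if degree == 0:
            buckets["leaf"] += 1
        elif degree == 1:
            buckets["one"] += 1
        elif degree == 2:
            buckets["two"] += 1
        else:
            buckets["three_or_more"] += 1
    return buckets


# Toy sentence: token 2 is the root, token 3 (a DET) heads token 4.
heads = [2, 0, 2, 3]
is_det = [True, False, True, False]
result = child_degree_buckets(heads, is_det)
print(result["leaf"], result["one"])  # 1 1

# Sanity check on the figures reported in this section.
assert 17345 + 30 + 31 + 18 == 17424
```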

Children of DET nodes are attached using 17 different relations: dep (94; 48% instances), fixed (49; 25% instances), punct (11; 6% instances), case (8; 4% instances), flat:name (8; 4% instances), obl (5; 3% instances), advmod (4; 2% instances), case:gen (3; 2% instances), det:def (3; 2% instances), acl:relcl (2; 1% instances), case:acc (2; 1% instances), cc (2; 1% instances), amod (1; 1% instances), appos (1; 1% instances), compound:smixut (1; 1% instances), conj (1; 1% instances), nsubj (1; 1% instances)

Children of DET nodes belong to 14 different parts of speech: ADV (51; 26% instances), PUNCT (45; 23% instances), PROPN (21; 11% instances), NOUN (14; 7% instances), NUM (14; 7% instances), VERB (14; 7% instances), ADP (12; 6% instances), ADJ (5; 3% instances), DET (5; 3% instances), PART (5; 3% instances), PRON (3; 2% instances), SCONJ (3; 2% instances), AUX (2; 1% instances), CCONJ (2; 1% instances)