
This page pertains to UD version 2.

Treebank Statistics: UD_French-Sequoia: POS Tags: NUM

There are 376 NUM lemmas (5%), 376 NUM types (4%) and 1732 NUM tokens (2%). Out of 16 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 11 in number of tokens.
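As a rough illustration of how counts like these could be reproduced, the sketch below tallies distinct lemmas, distinct word forms (types) and tokens for one tag in CoNLL-U data. The function name `count_upos` and the sample sentence are hypothetical, and the sketch assumes types are compared case-sensitively, which may differ from how this page was generated.

```python
def count_upos(conllu_text, upos):
    """Return (distinct lemmas, distinct word forms, tokens) for one UPOS tag."""
    lemmas, forms, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "3-4") and empty nodes (e.g. "5.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:  # column 4 is UPOS
            lemmas.add(cols[2])  # column 3 is LEMMA
            forms.add(cols[1])   # column 2 is FORM
            tokens += 1
    return len(lemmas), len(forms), tokens


# Made-up sentence for illustration only; it is not taken from the treebank.
SAMPLE = "\n".join([
    "# text = deux chats et trois chiens",
    "1\tdeux\tdeux\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_",
    "2\tchats\tchat\tNOUN\t_\t_\t0\troot\t_\t_",
    "3\tet\tet\tCCONJ\t_\t_\t5\tcc\t_\t_",
    "4\ttrois\ttrois\tNUM\t_\tNumType=Card\t5\tnummod\t_\t_",
    "5\tchiens\tchien\tNOUN\t_\t_\t2\tconj\t_\t_",
])

print(count_upos(SAMPLE, "NUM"))  # (2, 2, 2)
```

Run over the full treebank files, the same three numbers would correspond to the lemma, type and token counts reported above, provided the counting conventions match.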

The 10 most frequent NUM lemmas: deux, 5, trois, 2, 2006, 10, 1, 30, 3, 4

The 10 most frequent NUM types: deux, 5, trois, 2, 2006, 10, 1, 30, 3, 4

The 10 most frequent ambiguous lemmas: 1 000 (DET 2, NUM 2), 1/10 (NOUN 6, NUM 2), 1/100 (NOUN 4, NUM 2), 10 000 (DET 2, NUM 2), neuf (ADJ 2, NUM 2), 1/1000 (NOUN 2, NUM 1), 15 000 (DET 1, NUM 1), II (ADJ 1, NUM 1)

The 10 most frequent ambiguous types: 1 000 (DET 2, NUM 2), 1/10 (NOUN 6, NUM 2), 1/100 (NOUN 4, NUM 2), 10 000 (DET 2, NUM 2), neuf (NUM 2, ADJ 1), 1/1000 (NOUN 2, NUM 1), 15 000 (DET 1, NUM 1), II (ADJ 1, NUM 1)

Morphology

The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 1.407211).

The 1st highest number of forms (1) was observed with the lemma “-6”: -6.

The 2nd highest number of forms (1) was observed with the lemma “0,0001”: 0,0001.

The 3rd highest number of forms (1) was observed with the lemma “0,001”: 0,001.
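The ratio above can be read as the average number of distinct forms per lemma. A minimal sketch under that assumption follows; the helper names and the toy word lists are hypothetical, and the tool that produced this page may normalize forms differently.

```python
from collections import defaultdict


def forms_per_lemma(pairs):
    """Map each lemma to the set of distinct forms seen with it."""
    forms = defaultdict(set)
    for form, lemma in pairs:
        forms[lemma].add(form)
    return forms


def form_lemma_ratio(pairs):
    """Average number of distinct forms per lemma (0.0 for empty input)."""
    forms = forms_per_lemma(pairs)
    if not forms:
        return 0.0
    return sum(len(f) for f in forms.values()) / len(forms)


# Toy (form, lemma) pairs for illustration; not treebank data.
num_pairs = [("deux", "deux"), ("5", "5"), ("deux", "deux")]
noun_pairs = [("chat", "chat"), ("chats", "chat"), ("chien", "chien")]

print(form_lemma_ratio(num_pairs))   # 1.0
print(form_lemma_ratio(noun_pairs))  # 1.5
```

A ratio of exactly 1.0, as reported for NUM, means every lemma was seen with a single form, which is why each of the top-ranked lemmas above has a form count of 1.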

NUM occurs with 3 features: NumType (1726; 100% instances), ExtPos (1; 0% instances), Gender (1; 0% instances)

NUM occurs with 3 feature-value pairs: ExtPos=ADJ, Gender=Masc, NumType=Card

NUM occurs with 3 feature combinations. The most frequent feature combination is NumType=Card (1725 tokens). Examples: deux, 5, trois, 2, 2006, 10, 1, 30, 3, 4
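The three feature statistics above (features, feature-value pairs, feature combinations) can all be derived from the FEATS column of the CoNLL-U files. The sketch below shows one way to do it; `feature_stats` is a hypothetical helper, the input strings are invented, and whether this page counts the empty combination `_` is not known, so the sketch simply keeps it as its own combination.

```python
from collections import Counter


def feature_stats(feats_column):
    """Count feature names, feature=value pairs and full combinations.

    `feats_column` is an iterable of CoNLL-U FEATS strings such as
    "Gender=Masc|NumType=Card"; "_" marks a token without features.
    """
    names, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1  # the whole FEATS string is one combination
        if feats == "_":
            continue
        for pair in feats.split("|"):
            name = pair.partition("=")[0]
            names[name] += 1
            pairs[pair] += 1
    return names, pairs, combos


# Invented FEATS values for illustration; not treebank data.
names, pairs, combos = feature_stats(
    ["NumType=Card", "NumType=Card", "Gender=Masc|NumType=Card", "_"]
)
print(names.most_common())
print(combos.most_common(1))
```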

Relations

NUM nodes are attached to their parents using 16 different relations: nummod (878; 51% instances), nmod (511; 30% instances), obl:mod (186; 11% instances), conj (46; 3% instances), obl:arg (36; 2% instances), obj (14; 1% instances), parataxis (14; 1% instances), parataxis:insert (14; 1% instances), appos (9; 1% instances), nsubj:pass (5; 0% instances), orphan (5; 0% instances), acl:relcl (4; 0% instances), nsubj (4; 0% instances), root (3; 0% instances), fixed (2; 0% instances), amod (1; 0% instances)

Parents of NUM nodes belong to 12 different parts of speech: NOUN (1346; 78% instances), VERB (212; 12% instances), NUM (76; 4% instances), PROPN (57; 3% instances), ADJ (15; 1% instances), X (9; 1% instances), ADP (4; 0% instances), ADV (4; 0% instances), DET (3; 0% instances), no tag (3; 0% instances; presumably the artificial root, which matches the 3 root relations above), SYM (2; 0% instances), PRON (1; 0% instances)
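The two distributions above (incoming relation and parent part of speech) could be gathered along the following lines. This is a sketch, not the tool that generated the page: `parse_sentences` and `attachment_stats` are hypothetical names, the sample sentences are invented, and a node whose head is 0 is counted under an empty parent tag because the artificial root carries no part of speech.

```python
from collections import Counter


def parse_sentences(conllu_text):
    """Yield each sentence as a list of (id, upos, head, deprel) tuples."""
    sentence = []
    for line in conllu_text.splitlines() + [""]:
        if not line.strip():  # a blank line ends the current sentence
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep only regular tokens (integer IDs), not ranges or empty nodes.
        if len(cols) == 10 and cols[0].isdigit():
            sentence.append((int(cols[0]), cols[3], int(cols[6]), cols[7]))


def attachment_stats(conllu_text, upos):
    """Count incoming relations and parent tags for nodes tagged `upos`."""
    relations, parent_tags = Counter(), Counter()
    for sentence in parse_sentences(conllu_text):
        tag_of = {node_id: tag for node_id, tag, _, _ in sentence}
        for _, tag, head, deprel in sentence:
            if tag == upos:
                relations[deprel] += 1
                # head 0 is the artificial root, which has no tag
                parent_tags[tag_of.get(head, "")] += 1
    return relations, parent_tags


# Two made-up sentences for illustration only.
SAMPLE = "\n".join([
    "1\tdeux\tdeux\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_",
    "2\tchats\tchat\tNOUN\t_\t_\t0\troot\t_\t_",
    "",
    "1\t2006\t2006\tNUM\t_\tNumType=Card\t0\troot\t_\t_",
])

relations, parent_tags = attachment_stats(SAMPLE, "NUM")
print(relations)
print(parent_tags)
```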

1206 (70%) NUM nodes are leaves.

277 (16%) NUM nodes have one child.

145 (8%) NUM nodes have two children.

104 (6%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.
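The child-degree figures above partition all 1732 NUM tokens (1206 + 277 + 145 + 104 = 1732). A minimal sketch of how such a breakdown could be computed for one sentence is shown below; the function names and the toy tree are hypothetical and do not come from the treebank.

```python
from collections import Counter


def child_degrees(sentence, upos):
    """Return the number of children of every node tagged `upos`.

    `sentence` is a list of (id, upos, head) tuples for one tree.
    """
    n_children = Counter(head for _, _, head in sentence)
    return [n_children[node_id] for node_id, tag, _ in sentence if tag == upos]


def degree_buckets(degrees):
    """Group child counts into leaf / one / two / three-or-more."""
    buckets = {"leaf": 0, "one": 0, "two": 0, "three_or_more": 0}
    for degree in degrees:
        key = ("leaf", "one", "two")[degree] if degree < 3 else "three_or_more"
        buckets[key] += 1
    return buckets


# Toy tree: node 2 (NUM) is the root with children 1 and 4; node 4 (NUM) has child 3.
TOY = [(1, "ADP", 2), (2, "NUM", 0), (3, "ADP", 4), (4, "NUM", 2)]

degrees = child_degrees(TOY, "NUM")
print(degrees)                  # [2, 1]
print(degree_buckets(degrees))
print(max(degrees))             # 2
```

Applied to every sentence of the treebank and summed, the bucket counts would give the four figures above, and the maximum over all sentences would give the highest child degree.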

Children of NUM nodes are attached using 20 different relations: punct (259; 28% instances), nmod (207; 23% instances), case (203; 22% instances), det (97; 11% instances), conj (41; 4% instances), cc (32; 3% instances), obl:arg (21; 2% instances), obl:mod (14; 2% instances), advmod (13; 1% instances), dep (8; 1% instances), amod (7; 1% instances), appos (4; 0% instances), parataxis (3; 0% instances), acl (2; 0% instances), fixed (2; 0% instances), orphan (2; 0% instances), acl:relcl (1; 0% instances), cop (1; 0% instances), nsubj (1; 0% instances), nummod (1; 0% instances)

Children of NUM nodes belong to 14 different parts of speech: PUNCT (259; 28% instances), ADP (209; 23% instances), NOUN (202; 22% instances), DET (100; 11% instances), NUM (76; 8% instances), CCONJ (33; 4% instances), ADV (11; 1% instances), PRON (8; 1% instances), ADJ (6; 1% instances), PROPN (5; 1% instances), VERB (4; 0% instances), X (3; 0% instances), SYM (2; 0% instances), AUX (1; 0% instances)