
This page pertains to UD version 2.

Treebank Statistics: UD_French-GSD: POS Tags: NUM

There are 1951 NUM lemmas (5%), 1951 NUM types (4%) and 10490 NUM tokens (3%). Out of 17 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 11 in number of tokens.

The 10 most frequent NUM lemmas: deux, trois, 2, 3, 5, quatre, 2010, 4, 2009, 20

The 10 most frequent NUM types: deux, trois, 2, 3, 5, quatre, 2010, 4, 2009, 20

The 10 most frequent ambiguous lemmas: deux (NUM 599, PRON 10, NOUN 4), trois (NUM 227, PRON 2), 2 (NUM 154, PROPN 5, PRON 1), quatre (NUM 115, PRON 1), 2010 (NUM 122, PROPN 1), 4 (NUM 120, PROPN 2), 1 (NUM 103, PROPN 2), 15 (NUM 93, PROPN 1), cinq (NUM 68, NOUN 3, PRON 1), 18 (NUM 74, PROPN 1)

The 10 most frequent ambiguous types: deux (NUM 598, PRON 6, NOUN 4), trois (NUM 227, PRON 2), 2 (NUM 154, PROPN 5, PRON 2), quatre (NUM 115, PRON 1), 2010 (NUM 122, PROPN 1), 4 (NUM 120, PROPN 2), 7 (NUM 105, ADJ 1), 1 (NUM 103, PROPN 2), 15 (NUM 93, PROPN 1), cinq (NUM 67, NOUN 3, PRON 1)

Morphology

The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 1.305352).
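As an illustration, the ratio can be reproduced from a CoNLL-U file with a few lines of Python. This is a minimal sketch, assuming the ratio is the number of distinct word forms divided by the number of distinct lemmas for the tag (consistent with 1951 types / 1951 lemmas = 1.000000 above) and that forms are compared as written, without case folding. The sample sentences and the function name `form_lemma_ratio` are hypothetical and are not taken from UD_French-GSD or from the tool that generated this page.

```python
# Toy CoNLL-U fragment (hypothetical sentences, not from UD_French-GSD).
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
RATIO_SAMPLE = """\
1\tDeux\tdeux\tNUM\t_\t_\t2\tnummod\t_\t_
2\tchats\tchat\tNOUN\t_\t_\t6\tnsubj\t_\t_
3\tet\tet\tCCONJ\t_\t_\t5\tcc\t_\t_
4\ttrois\ttrois\tNUM\t_\t_\t5\tnummod\t_\t_
5\tchiens\tchien\tNOUN\t_\t_\t2\tconj\t_\t_
6\tdorment\tdormir\tVERB\t_\t_\t0\troot\t_\t_

1\tLe\tle\tDET\t_\t_\t2\tdet\t_\t_
2\tchat\tchat\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tdort\tdormir\tVERB\t_\t_\t0\troot\t_\t_
"""


def form_lemma_ratio(conllu_text, upos):
    """Return distinct forms / distinct lemmas for one UPOS tag (assumed definition)."""
    forms, lemmas = set(), set()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            forms.add(cols[1])
            lemmas.add(cols[2])
    return len(forms) / len(lemmas) if lemmas else 0.0


for tag in ("NUM", "NOUN", "VERB"):
    print(tag, form_lemma_ratio(RATIO_SAMPLE, tag))
```

In the toy sample every NUM lemma has a single form, so the ratio is 1.0, while NOUN and VERB come out higher because singular and plural forms share a lemma.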

The 1st highest number of forms (2) was observed with the lemma “1963”: 1963, 1969.

The 2nd highest number of forms (1) was observed with the lemma “’06”: ‘06.

The 3rd highest number of forms (1) was observed with the lemma “’900”: ‘900.

NUM occurs with 2 features: Gender (10; 0% instances), Number (10; 0% instances)

NUM occurs with 2 feature-value pairs: Gender=Fem, Number=Plur

NUM occurs with 2 feature combinations. The most frequent feature combination is _ (10480 tokens). Examples: deux, trois, 2, 3, 5, quatre, 2010, 4, 2009, 20

Relations

NUM nodes are attached to their parents using 24 different relations: nummod (3727; 36% instances), nmod (3263; 31% instances), obl (1406; 13% instances), obl:mod (872; 8% instances), obl:arg (526; 5% instances), conj (389; 4% instances), appos (112; 1% instances), compound (58; 1% instances), nsubj (28; 0% instances), obj (26; 0% instances), root (19; 0% instances), flat:name (15; 0% instances), xcomp (14; 0% instances), nsubj:pass (11; 0% instances), dep (6; 0% instances), parataxis (6; 0% instances), advcl (3; 0% instances), fixed (2; 0% instances), obl:agent (2; 0% instances), acl:relcl (1; 0% instances), advmod (1; 0% instances), amod (1; 0% instances), ccomp (1; 0% instances), orphan (1; 0% instances)

Parents of NUM nodes belong to 14 different parts of speech: NOUN (5837; 56% instances), VERB (2758; 26% instances), PROPN (712; 7% instances), NUM (580; 6% instances), SYM (429; 4% instances), ADJ (63; 1% instances), X (43; 0% instances), PRON (33; 0% instances), [no parent: root] (19; 0% instances), ADV (9; 0% instances), INTJ (3; 0% instances), ADP (2; 0% instances), PUNCT (1; 0% instances), SCONJ (1; 0% instances)
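Counts of this kind (incoming relation and part of speech of the parent) can be derived from the DEPREL and HEAD columns of a CoNLL-U file. The following is a minimal sketch on a hypothetical three-sentence sample, not the script that produced this page. The function name `attachment_stats` and the label `"(none)"` for nodes whose HEAD is 0 are assumptions made for illustration.

```python
from collections import Counter

# Toy CoNLL-U fragment (hypothetical, not from UD_French-GSD).
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
ATTACH_SAMPLE = """\
1\tDeux\tdeux\tNUM\t_\t_\t2\tnummod\t_\t_
2\tchats\tchat\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tdorment\tdormir\tVERB\t_\t_\t0\troot\t_\t_

1\tIl\til\tPRON\t_\t_\t2\tnsubj\t_\t_
2\tarrive\tarriver\tVERB\t_\t_\t0\troot\t_\t_
3\ten\ten\tADP\t_\t_\t4\tcase\t_\t_
4\t2010\t2010\tNUM\t_\t_\t2\tobl\t_\t_

1\tentre\tentre\tADP\t_\t_\t2\tcase\t_\t_
2\t2009\t2009\tNUM\t_\t_\t0\troot\t_\t_
3\tet\tet\tCCONJ\t_\t_\t4\tcc\t_\t_
4\t2010\t2010\tNUM\t_\t_\t2\tconj\t_\t_
"""


def attachment_stats(conllu_text, upos):
    """Count incoming relations and parent UPOS tags for nodes with the given tag."""
    relations, parent_pos = Counter(), Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        rows = [cols for cols in rows if cols[0].isdigit()]  # basic tokens only
        pos_by_id = {cols[0]: cols[3] for cols in rows}
        for cols in rows:
            if cols[3] != upos:
                continue
            relations[cols[7]] += 1
            # HEAD 0 marks the root, which has no parent token and thus no UPOS.
            parent_pos[pos_by_id.get(cols[6], "(none)")] += 1
    return relations, parent_pos


rels, parents = attachment_stats(ATTACH_SAMPLE, "NUM")
total = sum(rels.values())
for rel, count in rels.most_common():
    print(f"{rel} ({count}; {100 * count / total:.0f}% instances)")
```

Token IDs restart in every sentence, so the ID-to-tag map is rebuilt per sentence rather than once for the whole file.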

5606 (53%) NUM nodes are leaves.

2752 (26%) NUM nodes have one child.

1611 (15%) NUM nodes have two children.

521 (5%) NUM nodes have three or more children.

The highest child degree of a NUM node is 11.
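The child-degree figures above (leaves, one child, two children, and so on) amount to a histogram over the number of tokens whose HEAD points at each NUM node. A minimal sketch, assuming a hypothetical CoNLL-U sample and a hypothetical helper name `child_degree_histogram`:

```python
from collections import Counter

# Toy CoNLL-U fragment (hypothetical, not from UD_French-GSD).
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
DEGREE_SAMPLE = """\
1\tDeux\tdeux\tNUM\t_\t_\t2\tnummod\t_\t_
2\tchats\tchat\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tdorment\tdormir\tVERB\t_\t_\t0\troot\t_\t_

1\tIl\til\tPRON\t_\t_\t2\tnsubj\t_\t_
2\tarrive\tarriver\tVERB\t_\t_\t0\troot\t_\t_
3\ten\ten\tADP\t_\t_\t4\tcase\t_\t_
4\t2010\t2010\tNUM\t_\t_\t2\tobl\t_\t_

1\tentre\tentre\tADP\t_\t_\t2\tcase\t_\t_
2\t2009\t2009\tNUM\t_\t_\t0\troot\t_\t_
3\tet\tet\tCCONJ\t_\t_\t4\tcc\t_\t_
4\t2010\t2010\tNUM\t_\t_\t2\tconj\t_\t_
"""


def child_degree_histogram(conllu_text, upos):
    """Map number of children -> number of nodes with the given UPOS tag."""
    histogram = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = []
        for line in block.splitlines():
            cols = line.split("\t")
            if line.startswith("#") or not cols[0].isdigit():
                continue  # skip comments, multiword ranges and empty nodes
            rows.append(cols)
        # Number of tokens pointing at each head ID within this sentence.
        children = Counter(cols[6] for cols in rows)
        for cols in rows:
            if cols[3] == upos:
                histogram[children[cols[0]]] += 1  # missing key counts as 0
    return histogram


hist = child_degree_histogram(DEGREE_SAMPLE, "NUM")
print("leaves:", hist[0])
print("highest child degree:", max(hist))
```

In this sample "Deux" is a leaf, each "2010" has one child, and "2009" has two, so the highest child degree is 2.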

Children of NUM nodes are attached using 28 different relations: case (2879; 37% instances), nmod (1442; 19% instances), punct (1228; 16% instances), det (1090; 14% instances), conj (402; 5% instances), advmod (261; 3% instances), cc (257; 3% instances), appos (44; 1% instances), amod (41; 1% instances), cop (28; 0% instances), nummod (28; 0% instances), nsubj (26; 0% instances), acl (18; 0% instances), compound (11; 0% instances), acl:relcl (7; 0% instances), obl:mod (6; 0% instances), advcl:cleft (4; 0% instances), mark (3; 0% instances), advcl (2; 0% instances), flat:name (2; 0% instances), obl (2; 0% instances), parataxis (2; 0% instances), aux (1; 0% instances), ccomp (1; 0% instances), expl (1; 0% instances), fixed (1; 0% instances), obl:arg (1; 0% instances), orphan (1; 0% instances)

Children of NUM nodes belong to 16 different parts of speech: ADP (2895; 37% instances), NOUN (1250; 16% instances), PUNCT (1226; 16% instances), DET (1093; 14% instances), NUM (580; 7% instances), CCONJ (246; 3% instances), ADV (242; 3% instances), ADJ (55; 1% instances), PROPN (46; 1% instances), PRON (41; 1% instances), VERB (38; 0% instances), AUX (29; 0% instances), X (22; 0% instances), SYM (13; 0% instances), SCONJ (12; 0% instances), PART (1; 0% instances)