
This page pertains to UD version 2.

Treebank Statistics: UD_French-GSD: POS Tags: NUM

There are 1877 NUM lemmas (5%), 1878 NUM types (4%) and 10438 NUM tokens (3%). Out of 16 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 12 in number of tokens.
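The three counts above can be reproduced from a CoNLL-U file with a short script. The following is a minimal sketch, not the tool that generated this page: the function name `pos_counts` and the toy `SAMPLE` sentence are hypothetical, and it assumes that types are distinct word forms compared case-sensitively.

```python
# Toy CoNLL-U fragment (invented for illustration, not taken from UD_French-GSD).
SAMPLE = """\
# text = deux chats et trois chiens
1\tdeux\tdeux\tNUM\t_\tNumber=Plur\t2\tnummod\t_\t_
2\tchats\tchat\tNOUN\t_\tGender=Masc|Number=Plur\t0\troot\t_\t_
3\tet\tet\tCCONJ\t_\t_\t5\tcc\t_\t_
4\ttrois\ttrois\tNUM\t_\tNumber=Plur\t5\tnummod\t_\t_
5\tchiens\tchien\tNOUN\t_\tGender=Masc|Number=Plur\t2\tconj\t_\t_
"""


def pos_counts(conllu_text, upos):
    """Return (lemmas, types, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


print(pos_counts(SAMPLE, "NUM"))
```

Run over the full treebank, the same loop would be expected to yield figures comparable to the 1877 lemmas, 1878 types and 10438 tokens reported above, provided the counting conventions match.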

The 10 most frequent NUM lemmas: deux, trois, 2, 3, 5, quatre, 2010, un, 4, 2009

The 10 most frequent NUM types: deux, trois, 2, 3, 5, quatre, 2010, 4, 2009, 20

The 10 most frequent ambiguous lemmas: un (DET 10060, PRON 319, NUM 122, X 1), 4 (NUM 119, PROPN 2, X 1), 7 (NUM 104, X 1), II (NUM 78, PROPN 5), cinq (NUM 77, NOUN 3), 1er (NUM 44, ADJ 18), 50 (NUM 39, X 1), I (NUM 29, PROPN 10), cent (NUM 20, NOUN 1), neuf (NUM 17, ADJ 8, NOUN 1)

The 10 most frequent ambiguous types: 4 (NUM 119, PROPN 2, X 1), 7 (NUM 104, ADJ 1, X 1), II (NUM 78, PROPN 5), cinq (NUM 68, NOUN 3), une (DET 3373, PRON 114, NUM 58, NOUN 1), un (DET 3929, PRON 182, NUM 58, X 1), 1er (NUM 44, ADJ 16), 50 (NUM 39, X 1), I (NUM 29, PROPN 10), IV (NUM 23, ADJ 1)

Morphology

The form / lemma ratio of NUM is 1.000533 (the average of all parts of speech is 1.307295).
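The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given earlier on this page. The sketch below checks that arithmetic; the function name `form_lemma_ratio` is hypothetical, and the formula is an assumption inferred from the published numbers rather than a documented definition.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Distinct word forms per distinct lemma (assumed definition)."""
    return n_types / n_lemmas


ratio = form_lemma_ratio(1878, 1877)
print(f"{ratio:.6f}")  # 1.000533
```

A value this close to 1 means that almost every NUM lemma appears in a single form, which matches the observation that only "cent" and "un" have two forms.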

The highest number of forms (2) was observed with the lemma “cent”: cent, cents.

The 2nd highest number of forms (2) was observed with the lemma “un”: un, une.

The 3rd highest number of forms (1) was observed with the lemma “’06”: ‘06.

NUM occurs with 4 features: Number (10422; 100% instances), Gender (61; 1% instances), Typo (2; 0% instances), ExtPos (1; 0% instances)

NUM occurs with 5 feature-value pairs: ExtPos=PROPN, Gender=Fem, Number=Plur, Number=Sing, Typo=Yes

NUM occurs with 6 feature combinations. The most frequent feature combination is Number=Plur (9197 tokens). Examples: deux, trois, 2, quatre, 2010, 2009, 2008, 2011, 3, 5

Relations

NUM nodes are attached to their parents using 21 different relations: nummod (3579; 34% instances), nmod (3227; 31% instances), obl:mod (2213; 21% instances), obl:arg (741; 7% instances), conj (373; 4% instances), appos (71; 1% instances), orphan (40; 0% instances), nsubj (32; 0% instances), obj (32; 0% instances), root (24; 0% instances), flat:name (20; 0% instances), parataxis (19; 0% instances), acl:relcl (18; 0% instances), flat (16; 0% instances), nsubj:pass (12; 0% instances), xcomp (11; 0% instances), advcl (3; 0% instances), ccomp (3; 0% instances), obl:agent (2; 0% instances), fixed (1; 0% instances), obl (1; 0% instances)
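The percentages in this list appear to be each relation's count divided by the 10438 NUM tokens, rounded to a whole percent, so "0% instances" means a share below 0.5%, not zero occurrences. A minimal sketch of that calculation follows; the names `RELATION_COUNTS`, `TOTAL_NUM_TOKENS` and `share` are hypothetical, and the rounding rule is an assumption that happens to reproduce the listed values.

```python
# Counts copied from the relation list above (most frequent relations only).
RELATION_COUNTS = {
    "nummod": 3579,
    "nmod": 3227,
    "obl:mod": 2213,
    "obl:arg": 741,
    "conj": 373,
    "appos": 71,
    "orphan": 40,
}
TOTAL_NUM_TOKENS = 10438  # total NUM tokens reported for the treebank


def share(count, total=TOTAL_NUM_TOKENS):
    """Percentage of NUM tokens, rounded to the nearest whole percent."""
    return round(100 * count / total)


shares = {rel: share(n) for rel, n in RELATION_COUNTS.items()}
for rel, pct in shares.items():
    print(f"{rel}: {pct}%")
```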

Parents of NUM nodes belong to 14 different parts of speech (counting the artificial root, which has no tag, as one): NOUN (5749; 55% instances), VERB (2693; 26% instances), PROPN (651; 6% instances), NUM (642; 6% instances), SYM (445; 4% instances), X (65; 1% instances), ADJ (62; 1% instances), ADP (57; 1% instances), PRON (36; 0% instances), no parent, i.e. the root (24; 0% instances), ADV (9; 0% instances), INTJ (3; 0% instances), AUX (1; 0% instances), DET (1; 0% instances)

5528 (53%) NUM nodes are leaves.

2429 (23%) NUM nodes have one child.

1742 (17%) NUM nodes have two children.

739 (7%) NUM nodes have three or more children.

The highest child degree of a NUM node is 10.
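The child-degree figures above count, for each NUM node, how many other nodes name it as their head. The sketch below shows one way to compute such a histogram from CoNLL-U text; the function name `child_degrees` and the toy `SAMPLE` sentence are hypothetical and are not taken from the treebank or from the tool that produced this page.

```python
from collections import Counter

# Toy sentence (invented): "Il arrive en 2010 avec deux amis"
SAMPLE = """\
# text = Il arrive en 2010 avec deux amis
1\tIl\til\tPRON\t_\t_\t2\tnsubj\t_\t_
2\tarrive\tarriver\tVERB\t_\t_\t0\troot\t_\t_
3\ten\ten\tADP\t_\t_\t4\tcase\t_\t_
4\t2010\t2010\tNUM\t_\tNumber=Plur\t2\tobl:mod\t_\t_
5\tavec\tavec\tADP\t_\t_\t7\tcase\t_\t_
6\tdeux\tdeux\tNUM\t_\tNumber=Plur\t7\tnummod\t_\t_
7\tamis\tami\tNOUN\t_\t_\t2\tobl:mod\t_\t_
"""


def child_degrees(conllu_text, upos):
    """Map child count -> number of nodes with that UPOS tag."""
    hist = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        rows = [r for r in rows if r[0].isdigit()]  # basic tokens only
        n_children = Counter(r[6] for r in rows)    # HEAD column -> child count
        for r in rows:
            if r[3] == upos:
                hist[n_children[r[0]]] += 1
    return hist


print(child_degrees(SAMPLE, "NUM"))
```

In the toy sentence, "2010" has one child (the preposition "en") and "deux" is a leaf, so the histogram has one node of degree 1 and one of degree 0.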

Children of NUM nodes are attached using 26 different relations: case (2952; 35% instances), punct (1725; 20% instances), nmod (1418; 17% instances), det (1091; 13% instances), conj (379; 4% instances), cc (279; 3% instances), advmod (186; 2% instances), obl:arg (174; 2% instances), amod (57; 1% instances), appos (38; 0% instances), cop (31; 0% instances), nsubj (26; 0% instances), flat (16; 0% instances), acl (15; 0% instances), obl:mod (9; 0% instances), orphan (9; 0% instances), parataxis (9; 0% instances), flat:name (8; 0% instances), acl:relcl (6; 0% instances), mark (6; 0% instances), advcl:cleft (5; 0% instances), expl:subj (5; 0% instances), advcl (3; 0% instances), flat:foreign (2; 0% instances), aux:tense (1; 0% instances), discourse (1; 0% instances)

Children of NUM nodes belong to 16 different parts of speech: ADP (2961; 35% instances), PUNCT (1725; 20% instances), NOUN (1286; 15% instances), DET (1090; 13% instances), NUM (642; 8% instances), CCONJ (256; 3% instances), ADV (173; 2% instances), ADJ (63; 1% instances), PROPN (63; 1% instances), PRON (55; 1% instances), SYM (44; 1% instances), VERB (39; 0% instances), AUX (32; 0% instances), X (17; 0% instances), SCONJ (4; 0% instances), INTJ (1; 0% instances)