This page pertains to UD version 2.

Treebank Statistics: UD_French-ParisStories: POS Tags: NUM

There are 42 NUM lemmas (2% of all lemmas), 45 NUM types (1% of all types) and 243 NUM tokens (1% of all tokens). Out of 15 observed tags, NUM ranks 7th in number of lemmas, 8th in number of types and 14th in number of tokens.
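Counts of this kind can be approximated directly from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function names `iter_words` and `count_upos` and the hand-made `SAMPLE` are illustrative assumptions, and forms are compared exactly as written (whether the official statistics normalize case is not stated here).

```python
def iter_words(conllu_text):
    """Yield (form, lemma, upos) for each syntactic word in CoNLL-U text."""
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        yield cols[1], cols[2], cols[3]


def count_upos(conllu_text, upos):
    """Return (number of lemmas, number of types, number of tokens) for one tag."""
    lemmas, types, tokens = set(), set(), 0
    for form, lemma, tag in iter_words(conllu_text):
        if tag == upos:
            lemmas.add(lemma)
            types.add(form)
            tokens += 1
    return len(lemmas), len(types), tokens


# Hand-made sample for illustration only (not taken from the treebank).
SAMPLE = "\n".join([
    "# text = trois chats et 3 chiens",
    "1\ttrois\ttrois\tNUM\t_\t_\t2\tnummod\t_\t_",
    "2\tchats\tchat\tNOUN\t_\t_\t0\troot\t_\t_",
    "3\tet\tet\tCCONJ\t_\t_\t5\tcc\t_\t_",
    "4\t3\ttrois\tNUM\t_\t_\t5\tnummod\t_\t_",
    "5\tchiens\tchien\tNOUN\t_\t_\t2\tconj\t_\t_",
    "",
])

print(count_upos(SAMPLE, "NUM"))  # (1, 2, 2): one lemma, two types, two tokens
```

In the sample, the forms "trois" and "3" share the lemma "trois", so the tag has fewer lemmas than types.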

The 10 most frequent NUM lemmas: deux, trois, six, dix, cinq, mille, quatre, huit, quatorze, sept

The 10 most frequent NUM types: deux, trois, six, dix, cinq, mille, quatre, huit, quatorze, sept

The 10 most frequent ambiguous lemmas: cinq (NUM 12, PROPN 1), un (DET 871, PRON 24, NUM 5, ADP 1)

The 10 most frequent ambiguous types: une (DET 236, PRON 9, NUM 3), un (DET 426, PRON 15, NUM 2)

Morphology

The form / lemma ratio of NUM is 1.071429 (the average of all parts of speech is 1.379119).
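The reported ratio is consistent with dividing the 45 NUM types by the 42 NUM lemmas given above. A short check, assuming that this is how the figure is derived (the helper name `form_lemma_ratio` is illustrative):

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct forms per lemma."""
    return n_types / n_lemmas


ratio = form_lemma_ratio(45, 42)
print(f"{ratio:.6f}")  # 1.071429
```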

The highest number of forms per lemma is 2, observed with the lemmas “cent” (cent, cents), “trois” (3, trois) and “un” (un, une).

NUM occurs with 2 features: Number (16; 7% of instances), Gender (6; 2% of instances)

NUM occurs with 3 feature-value pairs: Gender=Fem, Gender=Masc, Number=Plur

NUM occurs with 4 feature combinations. The most frequent feature combination is _, i.e. no features (221 tokens). Examples: deux, trois, six, cinq, mille, quatre, dix, huit, quatorze, sept

Relations

NUM nodes are attached to their parents using 14 different relations: nummod (159; 65% of instances), flat (22; 9% of instances), nmod (19; 8% of instances), obl:mod (18; 7% of instances), conj (6; 2% of instances), obj (5; 2% of instances), reparandum (5; 2% of instances), obl:arg (2; 1% of instances), root (2; 1% of instances), acl:relcl (1; 0% of instances), advcl (1; 0% of instances), ccomp (1; 0% of instances), dislocated (1; 0% of instances), flat:name (1; 0% of instances)
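A distribution like the one above can be tallied from the DEPREL column of a CoNLL-U file. This is a sketch under assumptions: `deprel_distribution` and the hand-made `SAMPLE` are illustrative, the sample annotation is not taken from the treebank, and Python's `round` may differ from the rounding used on this page for values ending in exactly .5.

```python
from collections import Counter


def deprel_distribution(conllu_text, upos):
    """Return [(relation, count, rounded percent)] for nodes tagged `upos`."""
    counts = Counter()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only regular word lines: 10 columns and a plain integer ID.
        if len(cols) == 10 and cols[0].isdigit() and cols[3] == upos:
            counts[cols[7]] += 1  # column 8 (index 7) is DEPREL
    total = sum(counts.values())
    return [(rel, n, round(100 * n / total)) for rel, n in counts.most_common()]


# Hand-made sample for illustration only.
SAMPLE = "\n".join([
    "1\tdeux\tdeux\tNUM\t_\t_\t2\tnummod\t_\t_",
    "2\tchats\tchat\tNOUN\t_\t_\t0\troot\t_\t_",
    "",
    "1\ttrois\ttrois\tNUM\t_\t_\t2\tnummod\t_\t_",
    "2\tchiens\tchien\tNOUN\t_\t_\t0\troot\t_\t_",
    "",
    "1\tdix\tdix\tNUM\t_\t_\t3\tnummod\t_\t_",
    "2\tmille\tmille\tNUM\t_\t_\t1\tflat\t_\t_",
    "3\teuros\teuro\tNOUN\t_\t_\t0\troot\t_\t_",
    "",
])

for rel, n, pct in deprel_distribution(SAMPLE, "NUM"):
    print(f"{rel} ({n}; {pct}% of instances)")
```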

Parents of NUM nodes belong to 9 different parts of speech: NOUN (171; 70% of instances), NUM (32; 13% of instances), VERB (20; 8% of instances), ADJ (6; 2% of instances), PRON (5; 2% of instances), AUX (4; 2% of instances), PROPN (2; 1% of instances), none, i.e. the artificial root node (2; 1% of instances), X (1; 0% of instances)

182 (75%) NUM nodes are leaves.

27 (11%) NUM nodes have one child.

21 (9%) NUM nodes have two children.

13 (5%) NUM nodes have three or more children.

The highest child degree of a NUM node is 8.
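The child-degree figures above (leaves, one child, two children, three or more, maximum) can be derived by counting, for each node, how many other nodes in the same sentence name it as HEAD. A minimal sketch, assuming sentences are separated by blank lines as in CoNLL-U; `child_degrees` and `SAMPLE` are illustrative names and the sample is hand-made:

```python
from collections import Counter


def child_degrees(conllu_text, upos):
    """Return the number of children of every node tagged `upos`."""
    degrees = []
    sentence = []  # (id, upos, head) triples of the current sentence

    def flush():
        # Count how often each ID appears as a HEAD within the sentence.
        n_children = Counter(head for _, _, head in sentence)
        degrees.extend(n_children[i] for i, tag, _ in sentence if tag == upos)
        sentence.clear()

    for line in conllu_text.splitlines():
        if not line.strip():
            flush()  # a blank line ends the current sentence
            continue
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sentence.append((cols[0], cols[3], cols[6]))
    flush()  # handle a final sentence without a trailing blank line
    return degrees


# Hand-made sample for illustration only.
SAMPLE = "\n".join([
    "1\tdeux\tdeux\tNUM\t_\t_\t2\tnummod\t_\t_",
    "2\tchats\tchat\tNOUN\t_\t_\t0\troot\t_\t_",
    "",
    "1\ttrois\ttrois\tNUM\t_\t_\t2\tnummod\t_\t_",
    "2\tchiens\tchien\tNOUN\t_\t_\t0\troot\t_\t_",
    "",
    "1\tdix\tdix\tNUM\t_\t_\t3\tnummod\t_\t_",
    "2\tmille\tmille\tNUM\t_\t_\t1\tflat\t_\t_",
    "3\teuros\teuro\tNOUN\t_\t_\t0\troot\t_\t_",
])

degrees = child_degrees(SAMPLE, "NUM")
leaves = sum(1 for d in degrees if d == 0)
print(f"{leaves} of {len(degrees)} NUM nodes are leaves; highest degree: {max(degrees)}")
```

In the sample, only "dix" has a child ("mille"), so three of the four NUM nodes are leaves.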

Children of NUM nodes are attached using 18 different relations: flat (22; 17% of instances), det (18; 14% of instances), punct (17; 13% of instances), amod (11; 8% of instances), case (7; 5% of instances), conj (7; 5% of instances), reparandum (7; 5% of instances), discourse (6; 5% of instances), nmod (6; 5% of instances), advmod (5; 4% of instances), cc (5; 4% of instances), cop (5; 4% of instances), nsubj (5; 4% of instances), obl:mod (4; 3% of instances), acl:relcl (2; 2% of instances), dislocated (2; 2% of instances), mark (2; 2% of instances), dep (1; 1% of instances)

Children of NUM nodes belong to 13 different parts of speech: NUM (32; 24% of instances), DET (19; 14% of instances), PUNCT (17; 13% of instances), ADJ (11; 8% of instances), NOUN (11; 8% of instances), PRON (9; 7% of instances), ADP (8; 6% of instances), ADV (7; 5% of instances), CCONJ (6; 5% of instances), AUX (5; 4% of instances), INTJ (3; 2% of instances), VERB (3; 2% of instances), SCONJ (1; 1% of instances)