This page pertains to UD version 2.

Treebank Statistics: UD_French-PUD: POS Tags: NUM

There are 229 NUM lemmas (5%), 230 NUM types (4%) and 451 NUM tokens (2%). Out of 15 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 12 in number of tokens.
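The three counts above can be reproduced from a treebank file in CoNLL-U format. The following is a minimal sketch, not the script that generated this page: the sample sentence is invented for illustration, the function names are hypothetical, and it assumes that types are distinct word forms compared case-sensitively.

```python
# Hypothetical CoNLL-U fragment, invented for illustration only.
SAMPLE = """\
# text = deux milliards, trois milliards
1\tdeux\tdeux\tNUM\t_\t_\t2\tnummod\t_\t_
2\tmilliards\tmilliard\tNUM\t_\t_\t0\troot\t_\t_
3\t,\t,\tPUNCT\t_\t_\t5\tpunct\t_\t_
4\ttrois\ttrois\tNUM\t_\t_\t5\tnummod\t_\t_
5\tmilliards\tmilliard\tNUM\t_\t_\t2\tconj\t_\t_
"""


def read_tokens(conllu_text):
    """Yield (form, lemma, upos) for each basic token line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3]


def pos_summary(conllu_text, upos):
    """Count distinct lemmas, distinct forms (types) and tokens for one tag."""
    tokens = [(form, lemma) for form, lemma, pos in read_tokens(conllu_text)
              if pos == upos]
    return {
        "lemmas": len({lemma for _, lemma in tokens}),
        "types": len({form for form, _ in tokens}),
        "tokens": len(tokens),
    }


summary = pos_summary(SAMPLE, "NUM")
print(summary)
```

Run on the full UD_French-PUD file instead of the sample, the same function would be expected to return figures comparable to the 229 lemmas, 230 types and 451 tokens reported here, provided the counting conventions match.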

The 10 most frequent NUM lemmas: deux, trois, quatre, 1, 3, milliard, 10, II, III, dix

The 10 most frequent NUM types: deux, trois, quatre, 1, 3, 10, II, III, dix, milliards

The most frequent ambiguous lemmas: milliard (NUM 7, NOUN 2), un (DET 660, PRON 11, NUM 5), 1er (NUM 3, ADJ 1), premier (ADJ 36, NUM 1)

The most frequent ambiguous types: milliards (NUM 6, NOUN 1), un (DET 225, PRON 10, NUM 5), 1er (NUM 3, ADJ 1), milliard (NOUN 1, NUM 1), premier (ADJ 6, NUM 1)

Morphology

The form / lemma ratio of NUM is 1.004367 (the average of all parts of speech is 1.300944).

The 1st highest number of forms (2) was observed with the lemma “milliard”: milliard, milliards.

The 2nd highest number of forms (1) was observed with the lemma “0”: 0.

The 3rd highest number of forms (1) was observed with the lemma “06h30”: 06h30.
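The form / lemma ratio reported above is consistent with dividing the 230 NUM types by the 229 NUM lemmas. The sketch below checks that arithmetic and ranks lemmas by their number of forms, using only the three lemmas named in this section. It assumes that each form belongs to exactly one lemma; if a form were shared by two lemmas, the number of form-lemma pairs would exceed the number of types.

```python
from collections import defaultdict

# Counts taken from this page: 230 NUM types and 229 NUM lemmas.
page_ratio = 230 / 229
print(f"{page_ratio:.6f}")

# (form, lemma) pairs for the three lemmas listed in this section.
pairs = [
    ("milliard", "milliard"),
    ("milliards", "milliard"),
    ("0", "0"),
    ("06h30", "06h30"),
]

forms_by_lemma = defaultdict(set)
for form, lemma in pairs:
    forms_by_lemma[lemma].add(form)

# Rank by number of forms (descending), breaking ties alphabetically by lemma.
ranked = sorted(forms_by_lemma.items(), key=lambda kv: (-len(kv[1]), kv[0]))
for lemma, forms in ranked:
    print(lemma, len(forms), sorted(forms))

# Ratio for this small subset only: 4 forms over 3 lemmas.
subset_ratio = sum(len(f) for f in forms_by_lemma.values()) / len(forms_by_lemma)
```

Because only "milliard" has more than one form, the ratio for the whole tag stays very close to 1.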

NUM occurs with 3 features: Gender (4; 1% instances), Number (4; 1% instances), Typo (1; 0% instances)

NUM occurs with 3 feature-value pairs: Gender=Masc, Number=Sing, Typo=Yes

NUM occurs with 3 feature combinations. The most frequent feature combination is _ (446 tokens). Examples: deux, trois, quatre, 1, 3, 10, II, III, dix, milliards

Relations

NUM nodes are attached to their parents using 10 different relations: nummod (218; 48% instances), obl (86; 19% instances), nmod (83; 18% instances), appos (25; 6% instances), obl:mod (18; 4% instances), conj (9; 2% instances), nsubj (4; 1% instances), obl:arg (4; 1% instances), obj (2; 0% instances), orphan (2; 0% instances)
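As a consistency check, the relation counts above should sum to the 451 NUM tokens, and the percentages should follow from them. The sketch below copies the counts from the list and assumes that percentages are rounded to the nearest integer, which is a guess about how the page was generated.

```python
# Counts copied from the relation list above.
PARENT_RELATIONS = {
    "nummod": 218, "obl": 86, "nmod": 83, "appos": 25, "obl:mod": 18,
    "conj": 9, "nsubj": 4, "obl:arg": 4, "obj": 2, "orphan": 2,
}


def percentages(counts):
    """Return each count as a share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


total = sum(PARENT_RELATIONS.values())
shares = percentages(PARENT_RELATIONS)

print(total)
for relation, share in shares.items():
    print(f"{relation}: {share}%")
```

Under that rounding assumption the computed shares agree with the listed ones, including the 0% shown for relations that occur only twice.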

Parents of NUM nodes belong to 9 different parts of speech: NOUN (257; 57% instances), VERB (103; 23% instances), SYM (35; 8% instances), NUM (23; 5% instances), PROPN (22; 5% instances), ADV (6; 1% instances), ADJ (2; 0% instances), PRON (2; 0% instances), ADP (1; 0% instances)

258 (57%) NUM nodes are leaves.

100 (22%) NUM nodes have one child.

61 (14%) NUM nodes have two children.

32 (7%) NUM nodes have three or more children.

The highest child degree of a NUM node is 5.
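The child-degree figures above can be derived by counting, for every NUM node, how many tokens name it as their head. The following is a minimal sketch on a hypothetical five-token sentence; the `(id, upos, head)` triples and the function name are invented for illustration and are not taken from the treebank.

```python
from collections import Counter

# Hypothetical sentence as (id, upos, head) triples; head 0 marks the root.
SENTENCE = [
    (1, "ADP", 3),
    (2, "DET", 3),
    (3, "NUM", 5),
    (4, "NUM", 5),
    (5, "NOUN", 0),
]


def num_degree_buckets(sentence):
    """Bucket NUM nodes by number of children: '0', '1', '2' or '3+'."""
    children = Counter(head for _, _, head in sentence)
    buckets = Counter()
    for tok_id, upos, _ in sentence:
        if upos != "NUM":
            continue
        degree = children[tok_id]  # Counter returns 0 for leaves
        buckets["3+" if degree >= 3 else str(degree)] += 1
    return dict(buckets)


buckets = num_degree_buckets(SENTENCE)
print(buckets)

# Figures reported on this page; together they should cover all NUM tokens.
PAGE_BUCKETS = {"0": 258, "1": 100, "2": 61, "3+": 32}
page_total = sum(PAGE_BUCKETS.values())
```

In the sample, token 3 has two children and token 4 is a leaf. The four buckets reported on this page add up to 451, the total number of NUM tokens.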

Children of NUM nodes are attached using 13 different relations: case (134; 41% instances), punct (65; 20% instances), nmod (45; 14% instances), det (30; 9% instances), advmod (19; 6% instances), cc (9; 3% instances), conj (9; 3% instances), nummod (8; 2% instances), appos (2; 1% instances), amod (1; 0% instances), goeswith (1; 0% instances), orphan (1; 0% instances), parataxis (1; 0% instances)

Children of NUM nodes belong to 11 different parts of speech: ADP (135; 42% instances), PUNCT (65; 20% instances), NOUN (34; 10% instances), DET (30; 9% instances), NUM (23; 7% instances), ADV (18; 6% instances), CCONJ (9; 3% instances), PROPN (8; 2% instances), ADJ (1; 0% instances), PRON (1; 0% instances), X (1; 0% instances)
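The two child distributions above describe the same set of nodes, once by relation and once by part of speech, so their totals should agree. The sketch below checks this with the counts copied from this section, and also checks that the number of NUM children equals the number of NUM nodes whose parent is NUM (23 in the parent list).

```python
# Counts copied from the two lists of children above.
CHILD_RELATIONS = {
    "case": 134, "punct": 65, "nmod": 45, "det": 30, "advmod": 19,
    "cc": 9, "conj": 9, "nummod": 8, "appos": 2, "amod": 1,
    "goeswith": 1, "orphan": 1, "parataxis": 1,
}
CHILD_POS = {
    "ADP": 135, "PUNCT": 65, "NOUN": 34, "DET": 30, "NUM": 23, "ADV": 18,
    "CCONJ": 9, "PROPN": 8, "ADJ": 1, "PRON": 1, "X": 1,
}

# Number of NUM nodes attached to a NUM parent, from the parent list.
NUM_WITH_NUM_PARENT = 23

relation_total = sum(CHILD_RELATIONS.values())
pos_total = sum(CHILD_POS.values())

print(relation_total, pos_total)
print(CHILD_POS["NUM"] == NUM_WITH_NUM_PARENT)
```

Both totals come to 325 children, and the NUM-to-NUM count is the same whether it is read from the parent side or the child side.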