
This page pertains to UD version 2.

Treebank Statistics: UD_French-FQB: POS Tags: NOUN

There are 1403 NOUN lemmas (37%), 1577 NOUN types (36%) and 4051 NOUN tokens (17%). Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
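The three counts above (lemmas, types, tokens) can be illustrated with a minimal sketch. It assumes the treebank is read as CoNLL-U text and that types are distinct word forms compared case-sensitively; the function name `pos_counts` and the toy rows are hypothetical and are not taken from UD_French-FQB.

```python
def pos_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        if cols[3] == upos:  # column 4 of CoNLL-U is UPOS
            tokens += 1
            types.add(cols[1])   # column 2 is FORM
            lemmas.add(cols[2])  # column 3 is LEMMA
    return len(lemmas), len(types), tokens


# Toy data (hypothetical): ID, FORM, LEMMA, UPOS, padded to 10 columns.
rows = [
    ("1", "Les", "le", "DET"),
    ("2", "villes", "ville", "NOUN"),
    ("3", "et", "et", "CCONJ"),
    ("4", "la", "le", "DET"),
    ("5", "ville", "ville", "NOUN"),
]
sample = "\n".join("\t".join(r + ("_",) * 6) for r in rows)

print(pos_counts(sample, "NOUN"))  # one lemma, two forms, two tokens
```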

The 10 most frequent NOUN lemmas: nom, année, ville, président, état, lieu, logement, film, _, pays

The 10 most frequent NOUN types: nom, année, ville, président, état, lieu, logement, pays, film, compagnie

The 10 most frequent ambiguous lemmas: _ (NOUN 33, ADP 31, DET 18, SCONJ 10, ADV 9, VERB 7, ADJ 5, PRON 4, CCONJ 1, SYM 1, X 1), acide (NOUN 7, ADJ 2), animal (NOUN 5, ADJ 3), général (ADJ 8, NOUN 5), maison (NOUN 5, PROPN 1), or (NOUN 5, X 1), être (AUX 1313, VERB 69, NOUN 4), anglais (ADJ 12, NOUN 3), cent (NOUN 3, NUM 1), espagnol (ADJ 4, NOUN 3)

The 10 most frequent ambiguous types: aide (NOUN 27, VERB 1), mort (VERB 20, NOUN 9), général (ADJ 2, NOUN 2), maison (NOUN 5, PROPN 1), or (NOUN 5, X 1), Pôle (NOUN 4, PROPN 2), voyage (NOUN 4, VERB 1), été (AUX 61, NOUN 4), anglais (ADJ 6, NOUN 3), cause (NOUN 3, VERB 1)

Morphology

The form / lemma ratio of NOUN is 1.124020 (the average of all parts of speech is 1.164044).
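The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. The following check is only a sketch under that assumption; the variable names are illustrative.

```python
noun_types = 1577   # NOUN types reported on this page
noun_lemmas = 1403  # NOUN lemmas reported on this page

ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # prints 1.124020
```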

The highest number of forms (19) was observed with the lemma “_”: bord, cas, cause, chose, compte, cours, fin, fois, milieu, moment, moyenne, occasion, rapport, sujet, titres, travers, tête, vigueur, étranger.

The 2nd highest number of forms (3) was observed with the lemma “dollar”: $, dollar, dollars.

The 3rd highest number of forms (2) was observed with the lemma “Monsieur”: M, Mr..

NOUN occurs with 6 features: Number (3734; 92% instances), Gender (3674; 91% instances), NumType (12; 0% instances), Poss (3; 0% instances), ExtPos (1; 0% instances), Typo (1; 0% instances)

NOUN occurs with 8 feature-value pairs: ExtPos=ADP, Gender=Fem, Gender=Masc, NumType=Card, Number=Plur, Number=Sing, Poss=Yes, Typo=Yes

NOUN occurs with 16 feature combinations. The most frequent feature combination is Gender=Masc|Number=Sing (1699 tokens). Examples: nom, président, état, lieu, logement, film, monde, baseball, âge, aéroport
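The feature, feature-value and feature-combination counts above can be tallied from the FEATS column of each NOUN token. A minimal sketch, assuming FEATS strings in the usual `Name=Value|Name=Value` form with `_` for no features; `feature_stats` and the toy list are hypothetical.

```python
from collections import Counter


def feature_stats(feats_values):
    """Tally feature names, feature=value pairs and full combinations."""
    names, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_values:
        combos[feats] += 1  # the whole FEATS string is one combination
        if feats == "_":
            continue  # token carries no features
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            names[name] += 1
            pairs[pair] += 1
    return names, pairs, combos


# Toy FEATS values (hypothetical, not drawn from the treebank).
toy = [
    "Gender=Masc|Number=Sing",
    "Gender=Fem|Number=Plur",
    "Gender=Masc|Number=Sing",
    "_",
]
names, pairs, combos = feature_stats(toy)
print(names["Number"], len(pairs), combos.most_common(1))
```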

Relations

NOUN nodes are attached to their parents using 17 different relations: nmod (1114; 27% instances), nsubj (1033; 25% instances), obj (601; 15% instances), obl:mod (377; 9% instances), obl:arg (331; 8% instances), root (187; 5% instances), nsubj:pass (148; 4% instances), dislocated (123; 3% instances), conj (42; 1% instances), xcomp (34; 1% instances), fixed (33; 1% instances), obl:agent (13; 0% instances), advcl (8; 0% instances), acl:relcl (4; 0% instances), advcl:cleft (1; 0% instances), case (1; 0% instances), dep (1; 0% instances)
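As a consistency check, the relation counts listed above can be summed and converted to percentages. This sketch assumes the percentages are rounded to the nearest integer, which matches the figures shown; the dictionary simply restates the counts from this page.

```python
relation_counts = {
    "nmod": 1114, "nsubj": 1033, "obj": 601, "obl:mod": 377, "obl:arg": 331,
    "root": 187, "nsubj:pass": 148, "dislocated": 123, "conj": 42, "xcomp": 34,
    "fixed": 33, "obl:agent": 13, "advcl": 8, "acl:relcl": 4,
    "advcl:cleft": 1, "case": 1, "dep": 1,
}

total = sum(relation_counts.values())  # should equal the 4051 NOUN tokens
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)
for rel in ("nmod", "nsubj", "obj"):
    print(rel, percentages[rel])
```

The total equals the number of NOUN tokens because every node has exactly one incoming relation.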

Parents of NOUN nodes fall into 10 categories, 9 parts of speech plus the artificial root, which has no part of speech: VERB (1820; 45% instances), NOUN (1159; 29% instances), ADJ (575; 14% instances), root (187; 5% instances), PRON (149; 4% instances), ADV (95; 2% instances), ADP (32; 1% instances), PROPN (31; 1% instances), NUM (2; 0% instances), DET (1; 0% instances)

183 (5%) NOUN nodes are leaves.

843 (21%) NOUN nodes have one child.

1672 (41%) NOUN nodes have two children.

1353 (33%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 10.
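The child-degree figures above count, for each NOUN node, how many other nodes name it as their head. A minimal sketch of that computation for a single tree, assuming each node is given as an (ID, UPOS, HEAD) triple; `child_degrees` and the toy tree are hypothetical. The last lines restate the four buckets reported on this page and confirm that they cover all NOUN tokens.

```python
from collections import Counter


def child_degrees(sentence, upos):
    """Return the number of children of every node tagged `upos`.

    `sentence` is a list of (id, upos, head) triples for one tree.
    """
    n_children = Counter(head for _, _, head in sentence)
    return [n_children[node_id] for node_id, tag, _ in sentence if tag == upos]


# Toy tree (hypothetical) for "Le président de la ville parle".
toy_sentence = [
    (1, "DET", 2), (2, "NOUN", 6), (3, "ADP", 5),
    (4, "DET", 5), (5, "NOUN", 2), (6, "VERB", 0),
]
degrees = child_degrees(toy_sentence, "NOUN")
buckets = Counter(min(d, 3) for d in degrees)  # key 3 stands for "three or more"
print(degrees, dict(buckets))

# Buckets as reported on this page: leaves, one, two, three or more children.
reported = {0: 183, 1: 843, 2: 1672, 3: 1353}
print(sum(reported.values()))  # equals the 4051 NOUN tokens
```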

Children of NOUN nodes are attached using 25 different relations: det (3200; 35% instances), case (1793; 20% instances), nmod (1744; 19% instances), amod (911; 10% instances), punct (360; 4% instances), cop (201; 2% instances), nsubj (194; 2% instances), acl (139; 2% instances), mark (135; 1% instances), acl:relcl (74; 1% instances), appos (55; 1% instances), conj (53; 1% instances), nummod (47; 1% instances), cc (36; 0% instances), dep (36; 0% instances), obl:mod (20; 0% instances), advmod (12; 0% instances), flat:name (12; 0% instances), advcl (7; 0% instances), expl:subj (7; 0% instances), aux:tense (5; 0% instances), dislocated (1; 0% instances), fixed (1; 0% instances), goeswith (1; 0% instances), parataxis (1; 0% instances)

Children of NOUN nodes belong to 14 different parts of speech: DET (3194; 35% instances), ADP (1796; 20% instances), NOUN (1159; 13% instances), ADJ (916; 10% instances), PROPN (747; 8% instances), PUNCT (360; 4% instances), VERB (221; 2% instances), AUX (206; 2% instances), PRON (159; 2% instances), SCONJ (128; 1% instances), NUM (59; 1% instances), X (42; 0% instances), CCONJ (35; 0% instances), ADV (23; 0% instances)