This page pertains to UD version 2.

Treebank Statistics: UD_French-GSD: POS Tags: PRON

There are 44 PRON lemmas (0%), 117 PRON types (0%) and 17687 PRON tokens (4%). Out of 16 observed tags, the rank of PRON is: 10 in number of lemmas, 8 in number of types and 8 in number of tokens.

The 10 most frequent PRON lemmas: lui, soi, qui, ce, moi, on, y, que, nous, un

The 10 most frequent PRON types: il, qui, se, s’, elle, c’, on, y, ils, lui

The 10 most frequent ambiguous lemmas: ce (DET 2214, PRON 975, X 1), moi (PRON 728, X 5), on (PRON 630, X 5), y (PRON 536, PROPN 2, SYM 1, X 1), que (SCONJ 2260, PRON 483, ADV 245), un (DET 10065, PRON 319, NUM 122, X 1), dont (PRON 283, CCONJ 96), en (ADP 5860, PRON 282), lequel (PRON 221, ADJ 1, DET 1), autre (ADJ 386, PRON 159)

The 10 most frequent ambiguous types: se (PRON 1341, DET 1, X 1), s’ (PRON 984, SCONJ 47), on (PRON 329, X 5, AUX 2), y (PRON 527, PROPN 2, SYM 1, X 1), ce (DET 541, PRON 328, X 1), dont (PRON 284, CCONJ 96), le (DET 13756, PRON 281, PROPN 12, X 2), en (ADP 5073, PRON 281), vous (PRON 249, X 1), qu’ (SCONJ 675, PRON 248, ADV 92)
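The two ambiguity lists above can be reproduced in spirit with a short script. The following is a minimal sketch, not the tool that generated this page: it assumes tokens are available as (lemma, UPOS) pairs, the function name `ambiguous_lemmas` and the toy data are hypothetical, and ranking by the PRON count is inferred from the order of the lemma list.

```python
from collections import Counter, defaultdict

def ambiguous_lemmas(tokens, upos="PRON"):
    """Return lemmas seen with `upos` and at least one other tag,
    ranked by how often they occur with `upos` (assumed ordering)."""
    by_lemma = defaultdict(Counter)
    for lemma, tag in tokens:
        by_lemma[lemma][tag] += 1
    ambiguous = {
        lemma: tags
        for lemma, tags in by_lemma.items()
        if upos in tags and len(tags) > 1
    }
    return sorted(ambiguous.items(), key=lambda item: -item[1][upos])

# Toy data for illustration only; these are not the treebank counts.
toy_tokens = [
    ("ce", "DET"), ("ce", "DET"), ("ce", "PRON"),
    ("on", "PRON"),
    ("en", "ADP"), ("en", "PRON"), ("en", "PRON"),
]

ranked = ambiguous_lemmas(toy_tokens)
for lemma, tags in ranked:
    print(lemma, dict(tags))
```

The same function works for ambiguous types if word forms are passed in place of lemmas.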

Morphology

The form / lemma ratio of PRON is 2.659091 (the average of all parts of speech is 1.309093).
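The reported ratio is consistent with dividing the number of PRON types by the number of PRON lemmas given at the top of this page (117 and 44). A quick check, assuming that this is how the figure is derived (the helper name `form_lemma_ratio` is hypothetical):

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas

# Counts taken from the summary line for PRON on this page.
pron_ratio = form_lemma_ratio(117, 44)
print(f"{pron_ratio:.6f}")
```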

The highest number of forms (21) was observed with the lemma “lui”: -elle, -elles, -eux, -il, -ils, -le, -t-elle, -t-il, elle, elles, eux, il, ils, l’, la, le, les, leur, lui, t’il, t-il.

The second highest number of forms (8) was observed with the lemma “moi”: -je, -moi, J, j’, je, m’, me, moi.

The third highest number of forms (6) was observed with the lemma “toi”: -toi, -tu, t’, te, toi, tu.

PRON occurs with 11 features: PronType (17687; 100% instances), Person (14413; 81% instances), Number (11255; 64% instances), Emph (10298; 58% instances), Gender (9298; 53% instances), Case (8065; 46% instances), Reflex (2490; 14% instances), ExtPos (51; 0% instances), Typo (31; 0% instances), Number[psor] (3; 0% instances), Person[psor] (3; 0% instances)

PRON occurs with 26 feature-value pairs: Case=Acc, Case=Dat, Case=Nom, Emph=No, Emph=Yes, ExtPos=ADP, ExtPos=ADV, ExtPos=CCONJ, ExtPos=PROPN, Gender=Fem, Gender=Masc, Number=Plur, Number=Sing, Number[psor]=Plur, Person=1, Person=2, Person=3, Person[psor]=1, PronType=Dem, PronType=Ind, PronType=Int, PronType=Neg, PronType=Prs, PronType=Rel, Reflex=Yes, Typo=Yes

PRON occurs with 118 feature combinations. The most frequent feature combination is Case=Nom|Emph=No|Gender=Masc|Number=Sing|Person=3|PronType=Prs (3606 tokens). Examples: il, -il

Relations

PRON nodes are attached to their parents using 38 different relations: nsubj (8192; 46% instances), obj (2093; 12% instances), nsubj:pass (1054; 6% instances), expl:pv (1017; 6% instances), expl:subj (931; 5% instances), iobj (870; 5% instances), obl:mod (789; 4% instances), expl:pass (686; 4% instances), nmod (607; 3% instances), obl:arg (307; 2% instances), expl:comp (211; 1% instances), root (176; 1% instances), conj (174; 1% instances), parataxis (120; 1% instances), appos (92; 1% instances), nsubj:caus (63; 0% instances), fixed (58; 0% instances), obj:agent (31; 0% instances), acl:relcl (30; 0% instances), xcomp (30; 0% instances), iobj:agent (24; 0% instances), case (20; 0% instances), obl:agent (15; 0% instances), advmod (13; 0% instances), cc (12; 0% instances), dislocated (12; 0% instances), dep:comp (11; 0% instances), ccomp (9; 0% instances), nsubj:outer (9; 0% instances), dislocated:subj (7; 0% instances), advcl (6; 0% instances), orphan (6; 0% instances), mark (4; 0% instances), dep (3; 0% instances), acl (2; 0% instances), dislocated:mod (1; 0% instances), dislocated:obj (1; 0% instances), parataxis:insert (1; 0% instances)
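The percentages in lists like the one above appear to be each count's share of all 17687 PRON tokens, rounded to the nearest whole percent, which is why rare relations show as 0%. A minimal sketch under that assumption, using the first four counts from the list:

```python
TOTAL_PRON_TOKENS = 17687

# First four relation counts copied from the list above.
relation_counts = {
    "nsubj": 8192,
    "obj": 2093,
    "nsubj:pass": 1054,
    "expl:pv": 1017,
}

# Share of all PRON tokens, rounded to a whole percent (assumed rounding rule).
shares = {
    rel: round(100 * count / TOTAL_PRON_TOKENS)
    for rel, count in relation_counts.items()
}

for rel, pct in shares.items():
    print(f"{rel} ({relation_counts[rel]}; {pct}% instances)")
```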

Parents of PRON nodes belong to 15 different parts of speech: VERB (14650; 83% instances), NOUN (1669; 9% instances), ADJ (637; 4% instances), PRON (230; 1% instances), [no parent: root] (176; 1% instances), PROPN (106; 1% instances), ADV (95; 1% instances), ADP (47; 0% instances), NUM (36; 0% instances), SYM (11; 0% instances), X (9; 0% instances), DET (8; 0% instances), INTJ (5; 0% instances), AUX (4; 0% instances), SCONJ (4; 0% instances)

15516 (88%) PRON nodes are leaves.

974 (6%) PRON nodes have one child.

660 (4%) PRON nodes have two children.

537 (3%) PRON nodes have three or more children.

The highest child degree of a PRON node is 9.
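The leaf and child-degree figures above can be derived by counting, for every PRON node, how many other nodes name it as their head. The sketch below shows one way to do this with the standard library only. It assumes the standard 10-column CoNLL-U layout (ID, FORM, LEMMA, UPOS, ..., HEAD in column 7); the function names and the two sample sentences are hypothetical and are not taken from the treebank.

```python
from collections import Counter

# Two toy sentences in CoNLL-U format (illustration only).
SAMPLE = """\
1\tIl\tlui\tPRON\t_\t_\t2\tnsubj\t_\t_
2\tdort\tdormir\tVERB\t_\t_\t0\troot\t_\t_
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_

1\tCelui\tcelui\tPRON\t_\t_\t4\tnsubj\t_\t_
2\tde\tde\tADP\t_\t_\t3\tcase\t_\t_
3\tMarie\tMarie\tPROPN\t_\t_\t1\tnmod\t_\t_
4\tchante\tchanter\tVERB\t_\t_\t0\troot\t_\t_
"""

def read_sentences(text):
    """Yield each sentence as a list of column lists."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        sentence.append(cols)
    if sentence:
        yield sentence

def child_degree_histogram(text, upos="PRON"):
    """Map number of children -> number of `upos` nodes with that many."""
    hist = Counter()
    for sent in read_sentences(text):
        children_of = Counter(cols[6] for cols in sent)  # HEAD column
        for cols in sent:
            if cols[3] == upos:
                hist[children_of[cols[0]]] += 1
    return hist

hist = child_degree_histogram(SAMPLE)
print("leaves:", hist[0])
print("one child:", hist[1])
print("highest child degree:", max(hist))
```

In the sample, "Il" has no dependents and "Celui" has one (the nmod "Marie"), so the histogram contains one leaf and one node with a single child.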

Children of PRON nodes are attached using 33 different relations: case (1093; 25% instances), punct (716; 16% instances), nmod (648; 15% instances), acl:relcl (340; 8% instances), det (306; 7% instances), cop (224; 5% instances), nsubj (187; 4% instances), cc (169; 4% instances), advmod (150; 3% instances), conj (115; 3% instances), acl (113; 3% instances), amod (86; 2% instances), fixed (84; 2% instances), appos (38; 1% instances), obl:mod (28; 1% instances), expl:subj (26; 1% instances), advcl:cleft (25; 1% instances), orphan (17; 0% instances), mark (15; 0% instances), advcl (12; 0% instances), aux:tense (9; 0% instances), parataxis (8; 0% instances), nummod (6; 0% instances), goeswith (3; 0% instances), discourse (2; 0% instances), dislocated:subj (2; 0% instances), obl:agent (2; 0% instances), aux:pass (1; 0% instances), dep (1; 0% instances), dislocated (1; 0% instances), dislocated:obj (1; 0% instances), nsubj:pass (1; 0% instances), parataxis:insert (1; 0% instances)

Children of PRON nodes belong to 15 different parts of speech: ADP (1063; 24% instances), PUNCT (716; 16% instances), NOUN (643; 15% instances), VERB (508; 11% instances), DET (306; 7% instances), AUX (241; 5% instances), PRON (230; 5% instances), ADV (191; 4% instances), CCONJ (163; 4% instances), PROPN (147; 3% instances), ADJ (128; 3% instances), SCONJ (41; 1% instances), NUM (36; 1% instances), X (15; 0% instances), INTJ (2; 0% instances)