
PRON: pronoun

Definition

Pronouns are words that substitute for nouns or noun phrases, whose meaning is recoverable from the linguistic or extralinguistic context.

Pronouns under this definition function like nouns. Note that Czech grammar traditionally extends the term pronoun to words that substitute for adjectives. Such words are not tagged PRON under our universal scheme. They are tagged as determiners so that the same phenomenon is annotated the same way across languages.

For instance, tohle “this” is traditionally called a pronoun in Czech grammar regardless of context (the notion of determiners does not exist in Czech grammar). To keep the annotation parallel across languages, it should now be tagged PRON in Tohle jsem viděl včera. “I saw this yesterday.” and DET in Tohle auto jsem viděl včera. “I saw this car yesterday.”
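The contrast between the two sentences can be sketched in a few lines of Python. This is only an illustration of the distinction, not the procedure used to annotate the treebank: the `Token` type, the `tag_pronominal` function, the hand-written head indices and the rule "modifies a noun, therefore DET" are all assumptions made for this example, and the real data contains pronouns attached to nouns that this heuristic would mislabel.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    form: str
    head: int       # 1-based index of the parent token, 0 for the root
    is_noun: bool   # hypothetical flag: True if the token is a noun


def tag_pronominal(tokens: List[Token], index: int) -> str:
    """Return 'DET' if the word at `index` attaches to a noun, else 'PRON'.

    Simplified heuristic for illustration only.
    """
    head = tokens[index].head
    if head > 0 and tokens[head - 1].is_noun:
        return "DET"
    return "PRON"


# "Tohle jsem viděl včera." (heads assigned by hand for this sketch)
standalone = [
    Token("Tohle", 3, False),
    Token("jsem", 3, False),
    Token("viděl", 0, False),
    Token("včera", 3, False),
]

# "Tohle auto jsem viděl včera." (heads assigned by hand for this sketch)
modifying = [
    Token("Tohle", 2, False),
    Token("auto", 4, True),
    Token("jsem", 4, False),
    Token("viděl", 0, False),
    Token("včera", 4, False),
]

standalone_tag = tag_pronominal(standalone, 0)
modifying_tag = tag_pronominal(modifying, 0)
```

In the first sentence tohle stands in for a whole noun phrase and depends directly on the verb, so the sketch returns PRON; in the second it depends on auto and the sketch returns DET.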

Examples

References


Treebank Statistics (UD_Czech)

There are 115 PRON lemmas (0%), 424 PRON types (0%) and 72548 PRON tokens (5%). Out of 17 observed tags, the rank of PRON is: 7 in number of lemmas, 7 in number of types and 8 in number of tokens.

The 10 most frequent PRON lemmas: se, ten, který, on, já, všechen, jenž, co, kdo, což

The 10 most frequent PRON types: se, to, si, které, který, která, co, tím, kteří, tom

The 10 most frequent ambiguous lemmas: ten (PRON 11968, DET 1312), který (PRON 10604, DET 143), on (PRON 7262, ADP 9, PART 1), (PRON 3373, NOUN 1), jenž (PRON 2211, DET 648), co (PRON 1859, ADV 239, SCONJ 210, PART 21), což (PRON 748, INTJ 3, PART 1), něco (PRON 541, DET 1), jaký (DET 391, PRON 337), někdo (PRON 321, DET 3)

The 10 most frequent ambiguous types: se (PRON 21370, ADP 1901), to (PRON 5916, DET 101, PART 30, ADP 5), si (PRON 3737, VERB 1, AUX 1), které (PRON 3205, DET 43), který (PRON 2886, DET 20), která (PRON 1993, DET 11), co (PRON 1187, ADV 233, SCONJ 207, PART 7), tím (PRON 1059, DET 55), kteří (PRON 1156, DET 6), tom (PRON 1023, DET 66, PROPN 5)

Morphology

The form / lemma ratio of PRON is 3.686957 (the average of all parts of speech is 2.195970).
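The reported ratio is consistent with dividing the number of PRON types by the number of PRON lemmas given above (424 and 115). The snippet below reproduces that arithmetic; the variable names are chosen for this example only, and reading the ratio as types per lemma is an inference from the numbers rather than something the page states.

```python
pron_types = 424    # distinct PRON word forms reported above
pron_lemmas = 115   # distinct PRON lemmas reported above

# Average number of distinct forms per lemma.
form_lemma_ratio = pron_types / pron_lemmas
rounded_ratio = round(form_lemma_ratio, 6)
```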

The 1st highest number of forms (28) was observed with the lemma “on”: ho, je, jeho, jej, jemu, ji, jich, jim, jimi, jí, jím, mu, ni, nich, nim, nimi, ní, ním, ně, něho, něj, něm, němu, on, ona, oni, ono, ony

The 2nd highest number of forms (25) was observed with the lemma “jenž”: jehož, jejichž, jejímž, jejíž, jejž, jemuž, jenž, jež, jichž, jimiž, jimž, již, jímž, jíž, nichž, nimiž, nimž, niž, nímž, níž, něhož, nějž, němuž, němž, něž

The 3rd highest number of forms (19) was observed with the lemma “můj”: moje, mou, má, mého, mému, mých, mým, můj, naše, našeho, našem, našemu, naši, našich, našim, našimi, naší, naším, náš

PRON occurs with 18 features: cs-feat/PronType (72548; 100% instances), cs-feat/Case (72284; 100% instances), cs-feat/Number (40779; 56% instances), cs-feat/Gender (33610; 46% instances), cs-feat/Variant (27181; 37% instances), cs-feat/Reflex (25901; 36% instances), cs-feat/Person (11271; 16% instances), cs-feat/Animacy (7235; 10% instances), cs-feat/PrepCase (4926; 7% instances), cs-feat/Negative (1102; 2% instances), cs-feat/NumType (419; 1% instances), cs-feat/Style (316; 0% instances), cs-feat/Poss (256; 0% instances), cs-feat/Number[psor] (139; 0% instances), cs-feat/Foreign (79; 0% instances), cs-feat/Gender[psor] (41; 0% instances), cs-feat/NameType (15; 0% instances), cs-feat/Abbr (6; 0% instances)

PRON occurs with 47 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Foreign=Foreign, Gender=Fem, Gender=Fem,Neut, Gender=Masc, Gender=Masc,Neut, Gender=Neut, Gender[psor]=Fem, Gender[psor]=Masc,Neut, NameType=Com, NameType=Oth, NameType=Pro, Negative=Neg, NumType=Card, NumType=Mult, NumType=Ord, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Person=1, Person=2, Person=3, Poss=Yes, PrepCase=Npr, PrepCase=Pre, PronType=Dem, PronType=Dem,Ind, PronType=Ind, PronType=Int,Rel, PronType=Neg, PronType=Prs, PronType=Rel, PronType=Tot, Reflex=Yes, Style=Arch, Style=Coll, Variant=Short

PRON occurs with 416 feature combinations. The most frequent feature combination is Case=Acc|PronType=Prs|Reflex=Yes|Variant=Short (21419 tokens). Examples: se, sa
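A feature combination such as the one above is a list of Feature=Value pairs joined by `|`, and a feature with several values lists them separated by commas (as in Gender=Fem,Neut). A minimal sketch of splitting such a string into a dictionary follows; the function name `parse_feats` is made up for this example and is not part of any particular library.

```python
from typing import Dict, List


def parse_feats(feats: str) -> Dict[str, List[str]]:
    """Split a 'Feature=Value|Feature=Value' string into a dict.

    Each feature maps to a list of values, because a feature may carry
    several comma-separated values. The placeholder "_" (no features)
    yields an empty dict.
    """
    if feats in ("", "_"):
        return {}
    parsed: Dict[str, List[str]] = {}
    for pair in feats.split("|"):
        name, _, values = pair.partition("=")
        parsed[name] = values.split(",")
    return parsed


most_frequent = parse_feats("Case=Acc|PronType=Prs|Reflex=Yes|Variant=Short")
multi_valued = parse_feats("Gender=Fem,Neut|Number=Plur")
```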

Relations

PRON nodes are attached to their parents using 25 different relations: cs-dep/expl (17180; 24% instances), cs-dep/nsubj (16386; 23% instances), cs-dep/dobj (14338; 20% instances), cs-dep/nmod (9362; 13% instances), cs-dep/auxpass:reflex (4906; 7% instances), cs-dep/advmod (2922; 4% instances), cs-dep/iobj (2658; 4% instances), cs-dep/xcomp (841; 1% instances), cs-dep/conj (832; 1% instances), cs-dep/nsubjpass (559; 1% instances), cs-dep/root (556; 1% instances), cs-dep/cc (426; 1% instances), cs-dep/dep (368; 1% instances), cs-dep/amod (341; 0% instances), cs-dep/discourse (319; 0% instances), cs-dep/appos (151; 0% instances), cs-dep/acl (127; 0% instances), cs-dep/advcl (96; 0% instances), cs-dep/ccomp (88; 0% instances), cs-dep/foreign (52; 0% instances), cs-dep/parataxis (25; 0% instances), cs-dep/csubj (8; 0% instances), cs-dep/case (5; 0% instances), cs-dep/cop (1; 0% instances), cs-dep/mark (1; 0% instances)

Parents of PRON nodes belong to 15 different parts of speech: VERB (60721; 84% instances), NOUN (6317; 9% instances), ADJ (2820; 4% instances), PRON (855; 1% instances), ADV (592; 1% instances), ROOT (556; 1% instances), NUM (342; 0% instances), PROPN (198; 0% instances), DET (67; 0% instances), PART (23; 0% instances), CONJ (19; 0% instances), SYM (17; 0% instances), ADP (15; 0% instances), INTJ (4; 0% instances), SCONJ (2; 0% instances)

56588 (78%) PRON nodes are leaves.

11839 (16%) PRON nodes have one child.

2704 (4%) PRON nodes have two children.

1417 (2%) PRON nodes have three or more children.

The highest child degree of a PRON node is 24.
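The four counts above (leaves, one child, two children, three or more children) partition the PRON tokens, so they should add up to the 72548 tokens reported earlier, and the percentages follow from dividing each count by that total. The snippet below checks this; the dictionary keys are labels invented for this example.

```python
# Number of PRON nodes by number of children, as reported above.
child_degree_counts = {"0": 56588, "1": 11839, "2": 2704, "3+": 1417}

total_pron_nodes = sum(child_degree_counts.values())

# Share of each group, rounded to whole percent as on this page.
child_degree_shares = {
    degree: round(100 * count / total_pron_nodes)
    for degree, count in child_degree_counts.items()
}
```

The same check can be applied to the relation and parent part-of-speech counts in this section, which also sum to the PRON token total.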

Children of PRON nodes are attached using 31 different relations: cs-dep/case (10593; 46% instances), cs-dep/acl (3251; 14% instances), cs-dep/punct (1659; 7% instances), cs-dep/advmod:emph (1008; 4% instances), cs-dep/cc (875; 4% instances), cs-dep/nmod (836; 4% instances), cs-dep/amod (831; 4% instances), cs-dep/cop (738; 3% instances), cs-dep/conj (633; 3% instances), cs-dep/nsubj (619; 3% instances), cs-dep/xcomp (570; 2% instances), cs-dep/dep (267; 1% instances), cs-dep/mark (227; 1% instances), cs-dep/appos (225; 1% instances), cs-dep/advmod (196; 1% instances), cs-dep/nummod:gov (132; 1% instances), cs-dep/advcl (87; 0% instances), cs-dep/det (75; 0% instances), cs-dep/foreign (39; 0% instances), cs-dep/neg (36; 0% instances), cs-dep/nummod (30; 0% instances), cs-dep/ccomp (26; 0% instances), cs-dep/discourse (23; 0% instances), cs-dep/csubj (22; 0% instances), cs-dep/aux (19; 0% instances), cs-dep/det:numgov (18; 0% instances), cs-dep/parataxis (9; 0% instances), cs-dep/dobj (7; 0% instances), cs-dep/auxpass:reflex (1; 0% instances), cs-dep/det:nummod (1; 0% instances), cs-dep/vocative (1; 0% instances)

Children of PRON nodes belong to 16 different parts of speech: ADP (10549; 46% instances), VERB (3983; 17% instances), NOUN (1733; 8% instances), PUNCT (1659; 7% instances), ADJ (1201; 5% instances), CONJ (1140; 5% instances), ADV (961; 4% instances), PRON (855; 4% instances), PART (254; 1% instances), SCONJ (226; 1% instances), NUM (196; 1% instances), PROPN (179; 1% instances), DET (97; 0% instances), AUX (19; 0% instances), INTJ (1; 0% instances), SYM (1; 0% instances)


PRON in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fi] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]