PRON

`PRON`: pronoun

Definition

Pronouns are words that substitute for nouns or noun phrases, whose meaning is recoverable from the linguistic or extralinguistic context.

Pronouns under this definition function like nouns. Czech grammar traditionally extends the term pronoun to words that substitute for adjectives. Such words are not tagged PRON under our universal scheme; they are tagged as determiners (DET) so that the same phenomenon is annotated the same way across languages.

For instance, tohle “this” is traditionally called a pronoun in Czech grammar regardless of context (the notion of determiners does not exist in Czech grammar). To keep the annotation parallel across languages, it should now be tagged PRON in Tohle jsem viděl včera. “I saw this yesterday.” and DET in Tohle auto jsem viděl včera. “I saw this car yesterday.”
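The distinction for the demonstrative example can be sketched as a small decision rule. This is only an illustration of the tohle case described above, not the procedure used to convert the treebank; the `Token` class and its `modifies_noun` flag are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Token:
    form: str
    # Hypothetical flag: True when the word is an attribute of a noun
    # (as in "Tohle auto"), False when it stands for a noun phrase itself.
    modifies_noun: bool


def upos_for_demonstrative(token: Token) -> str:
    """Return DET for an attributive demonstrative, PRON for a standalone one."""
    return "DET" if token.modifies_noun else "PRON"


# "Tohle jsem viděl včera." (I saw this yesterday.)
standalone = Token("Tohle", modifies_noun=False)
# "Tohle auto jsem viděl včera." (I saw this car yesterday.)
attributive = Token("Tohle", modifies_noun=True)

print(upos_for_demonstrative(standalone))
print(upos_for_demonstrative(attributive))
```

In practice the value of such a flag would have to come from the syntactic annotation; the sketch leaves that step out.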

Examples

- personal pronouns: já, ty, on, ona, ono, my, vy, oni, ony “I, you, he, she, it, we, you, they, they”
- reflexive pronouns: sebe, se, sobě, si, sebou “oneself”
- demonstrative pronouns: tohle as in Tohle jsem viděl včera. “I saw this yesterday.”
- interrogative pronouns: kdo, co “who, what” as in Co si myslíš? “What do you think?”
- relative pronouns: kdo, co “who, what” as in Zajímalo by mě, co si myslíš. “I wonder what you think.”
- indefinite pronouns: někdo, něco “somebody, something”
- total pronouns: každý, všichni “everybody, all”
- negative pronouns: nikdo, nic “nobody, nothing”

Treebank Statistics (UD_Czech)

There are 115 PRON lemmas (0%), 424 PRON types (0%) and 72548 PRON tokens (5%). Out of 17 observed tags, the rank of PRON is: 7 in number of lemmas, 7 in number of types and 8 in number of tokens.

The 10 most frequent PRON lemmas: se, ten, který, on, já, všechen, jenž, co, kdo, což

The 10 most frequent PRON types: se, to, si, které, který, která, co, tím, kteří, tom

The 10 most frequent ambiguous lemmas: ten (PRON 11968, DET 1312), který (PRON 10604, DET 143), on (PRON 7262, ADP 9, PART 1), já (PRON 3373, NOUN 1), jenž (PRON 2211, DET 648), co (PRON 1859, ADV 239, SCONJ 210, PART 21), což (PRON 748, INTJ 3, PART 1), něco (PRON 541, DET 1), jaký (DET 391, PRON 337), někdo (PRON 321, DET 3)

The 10 most frequent ambiguous types: se (PRON 21370, ADP 1901), to (PRON 5916, DET 101, PART 30, ADP 5), si (PRON 3737, VERB 1, AUX 1), které (PRON 3205, DET 43), který (PRON 2886, DET 20), která (PRON 1993, DET 11), co (PRON 1187, ADV 233, SCONJ 207, PART 7), tím (PRON 1059, DET 55), kteří (PRON 1156, DET 6), tom (PRON 1023, DET 66, PROPN 5)

se
- PRON 21370: Z kukly se vyklubal motýl
- ADP 1901: Mohou zde porovnat svůj vývoj , záměry se světovými trendy .
to
- PRON 5916: Nebude to poprvé ani naposledy .
- DET 101: Na to neznám odpověď .
- PART 30: Vedle toho musí být s to zastoupit i chybějícího kolegu .
- ADP 5: Obálka s adresou to professor Servít from Prague .
si
- PRON 3737: Firma , která si je vyžádá , platí pouze náklady na jejich pobyt .
- VERB 1: Nebylo by žádným uměním dospět ke sporu , když bychom si během téže úvahy vykládali chování světelných paprsků různými způsoby .
- AUX 1: Bollettieri tenkrát ke mně přišel a povídá : , Porazil si jednoho z nejlepších tenistů budoucnosti . ‘ .
které
- PRON 3205: Máme zaměstnance , které občas vysíláme na služební cestu .
- DET 43: Na které žánry lze dnes v českém filmu vsadit s jistotou ?
který
- PRON 2886: * Počet zásobovaných bytů 136725 , který stále roste .
- DET 20: ” Nevím , který úředník vypočítal , že kilo králíka mě stojí sedmnáct korun .
která
- PRON 1993: Firma , která si je vyžádá , platí pouze náklady na jejich pobyt .
- DET 11: O která rádia půjde , je těžké předem hovořit .
co
- PRON 1187: Nevíte , co kam započítat ?
- ADV 233: Samozřejmě nejen co do efektu , ale i co do nákladů .
- SCONJ 207: Učinil tak 24 hodin poté , co nabídku zprvu odmítl .
- PART 7: A co když by Maastricht v neděli neprošel ?
tím
- PRON 1059: Nebyly s tím u nás dosud žádné zkušenosti .
- DET 55: Párky však zřejmě nejsou tím správným teplým předkrmem před utkáním .
kteří
- PRON 1156: Platí to především o těch , kteří si vybrali za svého favorita hotely .
- DET 6: Je možno už předem odhadnout , kteří lidé nebudou v grémiu chybět .
tom
- PRON 1023: V tom s vámi nesouhlasím .
- DET 66: A v tom případě by se mohla 14 . srpna 2126 srazit s naší Zemí .
- PROPN 5: Praha ( tom ) -
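Counts of ambiguous types like those above could be gathered from a CoNLL-U file roughly as follows. This is a minimal sketch that assumes the standard ten-column CoNLL-U layout (FORM in column 2, UPOS in column 4); the sample rows are toy data built for the illustration, not sentences from the treebank, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def tag_counts_by_form(conllu_text):
    """Map each lowercased word form to a Counter of the UPOS tags it occurs with."""
    counts = defaultdict(Counter)
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip malformed lines, multiword ranges and empty nodes
        counts[cols[1].lower()][cols[3]] += 1
    return counts


def ambiguous_forms(counts):
    """Keep only the forms observed with more than one tag."""
    return {form: tags.most_common() for form, tags in counts.items() if len(tags) > 1}


def row(idx, form, lemma, upos):
    # Toy helper: the remaining columns are left as "_" placeholders.
    return "\t".join([str(idx), form, lemma, upos, "_", "_", "_", "_", "_", "_"])


SAMPLE = "\n".join([
    "# toy data, not taken from the treebank",
    row(1, "V", "v", "ADP"),
    row(2, "tom", "ten", "PRON"),
    "",
    row(1, "v", "v", "ADP"),
    row(2, "tom", "ten", "DET"),
    row(3, "případě", "případ", "NOUN"),
    "",
    row(1, "tom", "ten", "PRON"),
])

print(ambiguous_forms(tag_counts_by_form(SAMPLE)))
```

Lowercasing the form is a design choice made for this sketch; whether the published statistics fold case is not stated in the text.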

Morphology

The form / lemma ratio of PRON is 3.686957 (the average of all parts of speech is 2.195970).
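The reported ratio is consistent with dividing the number of PRON types (424) by the number of PRON lemmas (115) given in the treebank statistics. That this is how the figure was derived is an assumption; the short check below only shows that the arithmetic agrees.

```python
pron_types = 424   # PRON types reported in the treebank statistics
pron_lemmas = 115  # PRON lemmas reported in the treebank statistics

# Assumed definition: average number of distinct forms per lemma.
form_lemma_ratio = pron_types / pron_lemmas

print(f"{form_lemma_ratio:.6f}")
```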

The 1st highest number of forms (28) was observed with the lemma “on”: ho, je, jeho, jej, jemu, ji, jich, jim, jimi, jí, jím, mu, ni, nich, nim, nimi, ní, ním, ně, něho, něj, něm, němu, on, ona, oni, ono, ony

The 2nd highest number of forms (25) was observed with the lemma “jenž”: jehož, jejichž, jejímž, jejíž, jejž, jemuž, jenž, jež, jichž, jimiž, jimž, již, jímž, jíž, nichž, nimiž, nimž, niž, nímž, níž, něhož, nějž, němuž, němž, něž

The 3rd highest number of forms (19) was observed with the lemma “můj”: moje, mou, má, mého, mému, mých, mým, můj, naše, našeho, našem, našemu, naši, našich, našim, našimi, naší, naším, náš

PRON occurs with 18 features: cs-feat/PronType (72548; 100% instances), cs-feat/Case (72284; 100% instances), cs-feat/Number (40779; 56% instances), cs-feat/Gender (33610; 46% instances), cs-feat/Variant (27181; 37% instances), cs-feat/Reflex (25901; 36% instances), cs-feat/Person (11271; 16% instances), cs-feat/Animacy (7235; 10% instances), cs-feat/PrepCase (4926; 7% instances), cs-feat/Negative (1102; 2% instances), cs-feat/NumType (419; 1% instances), cs-feat/Style (316; 0% instances), cs-feat/Poss (256; 0% instances), cs-feat/Number[psor] (139; 0% instances), cs-feat/Foreign (79; 0% instances), cs-feat/Gender[psor] (41; 0% instances), cs-feat/NameType (15; 0% instances), cs-feat/Abbr (6; 0% instances)

PRON occurs with 47 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Foreign=Foreign, Gender=Fem, Gender=Fem,Neut, Gender=Masc, Gender=Masc,Neut, Gender=Neut, Gender[psor]=Fem, Gender[psor]=Masc,Neut, NameType=Com, NameType=Oth, NameType=Pro, Negative=Neg, NumType=Card, NumType=Mult, NumType=Ord, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Person=1, Person=2, Person=3, Poss=Yes, PrepCase=Npr, PrepCase=Pre, PronType=Dem, PronType=Dem,Ind, PronType=Ind, PronType=Int,Rel, PronType=Neg, PronType=Prs, PronType=Rel, PronType=Tot, Reflex=Yes, Style=Arch, Style=Coll, Variant=Short

PRON occurs with 416 feature combinations. The most frequent feature combination is Case=Acc|PronType=Prs|Reflex=Yes|Variant=Short (21419 tokens). Examples: se, sa
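A feature combination such as the one above is a `|`-separated list of `Feature=Value` pairs, where a value may itself list several alternatives separated by commas (as in Gender=Fem,Neut). A minimal sketch of splitting such a string follows; the function name `parse_feats` is hypothetical.

```python
def parse_feats(feats):
    """Split 'A=x|B=y,z' into {'A': ['x'], 'B': ['y', 'z']}; '_' means no features."""
    if feats == "_":
        return {}
    parsed = {}
    for pair in feats.split("|"):
        name, value = pair.split("=", 1)
        parsed[name] = value.split(",")
    return parsed


most_frequent = parse_feats("Case=Acc|PronType=Prs|Reflex=Yes|Variant=Short")
print(most_frequent)
print(parse_feats("Gender=Fem,Neut|PronType=Rel"))
```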

Relations

PRON nodes are attached to their parents using 25 different relations: cs-dep/expl (17180; 24% instances), cs-dep/nsubj (16386; 23% instances), cs-dep/dobj (14338; 20% instances), cs-dep/nmod (9362; 13% instances), cs-dep/auxpass:reflex (4906; 7% instances), cs-dep/advmod (2922; 4% instances), cs-dep/iobj (2658; 4% instances), cs-dep/xcomp (841; 1% instances), cs-dep/conj (832; 1% instances), cs-dep/nsubjpass (559; 1% instances), cs-dep/root (556; 1% instances), cs-dep/cc (426; 1% instances), cs-dep/dep (368; 1% instances), cs-dep/amod (341; 0% instances), cs-dep/discourse (319; 0% instances), cs-dep/appos (151; 0% instances), cs-dep/acl (127; 0% instances), cs-dep/advcl (96; 0% instances), cs-dep/ccomp (88; 0% instances), cs-dep/foreign (52; 0% instances), cs-dep/parataxis (25; 0% instances), cs-dep/csubj (8; 0% instances), cs-dep/case (5; 0% instances), cs-dep/cop (1; 0% instances), cs-dep/mark (1; 0% instances)
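The percentages in this list appear to be each relation's count divided by the total number of PRON tokens (72548), rounded to a whole percent. The sketch below checks that reading against the five most frequent relations; the rounding rule is an assumption, and `share` is a hypothetical helper name.

```python
PRON_TOKENS = 72548  # total PRON tokens reported in the treebank statistics

# Counts copied from the list above (five most frequent relations only).
relation_counts = {
    "expl": 17180,
    "nsubj": 16386,
    "dobj": 14338,
    "nmod": 9362,
    "auxpass:reflex": 4906,
}


def share(count, total=PRON_TOKENS):
    """Percentage of PRON tokens, rounded to the nearest whole percent."""
    return round(100 * count / total)


shares = {rel: share(n) for rel, n in relation_counts.items()}
for rel, pct in shares.items():
    print(f"{rel}: {pct}%")
```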

Parents of PRON nodes belong to 15 different parts of speech: VERB (60721; 84% instances), NOUN (6317; 9% instances), ADJ (2820; 4% instances), PRON (855; 1% instances), ADV (592; 1% instances), ROOT (556; 1% instances), NUM (342; 0% instances), PROPN (198; 0% instances), DET (67; 0% instances), PART (23; 0% instances), CONJ (19; 0% instances), SYM (17; 0% instances), ADP (15; 0% instances), INTJ (4; 0% instances), SCONJ (2; 0% instances)

56588 (78%) PRON nodes are leaves.

11839 (16%) PRON nodes have one child.

2704 (4%) PRON nodes have two children.

1417 (2%) PRON nodes have three or more children.

The highest child degree of a PRON node is 24.
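The child-degree figures above can be reproduced in principle by counting, for every PRON node, how many tokens name it as their head. The sketch below does this for one hand-built sentence; the `(id, head, upos)` tuples and the analysis of the sentence are assumptions made for the illustration, not copied from the treebank. It also checks that the four reported counts add up to the total number of PRON tokens.

```python
from collections import Counter


def child_degree_histogram(sentence, upos="PRON"):
    """sentence: list of (id, head, upos) tuples; head 0 marks the root.

    Returns a Counter mapping number-of-children to number of nodes with tag `upos`.
    """
    n_children = Counter(head for _, head, _ in sentence)
    return Counter(n_children[tok_id] for tok_id, _, tag in sentence if tag == upos)


# "V tom s vámi nesouhlasím ." with an assumed analysis:
# each preposition attaches to the following pronoun.
toy_sentence = [
    (1, 2, "ADP"),    # V
    (2, 5, "PRON"),   # tom
    (3, 4, "ADP"),    # s
    (4, 5, "PRON"),   # vámi
    (5, 0, "VERB"),   # nesouhlasím
    (6, 5, "PUNCT"),  # .
]
histogram = child_degree_histogram(toy_sentence)
print(histogram)

# Reported distribution: leaves, one child, two children, three or more.
reported = {"0": 56588, "1": 11839, "2": 2704, "3+": 1417}
print(sum(reported.values()))
```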

Children of PRON nodes are attached using 31 different relations: cs-dep/case (10593; 46% instances), cs-dep/acl (3251; 14% instances), cs-dep/punct (1659; 7% instances), cs-dep/advmod:emph (1008; 4% instances), cs-dep/cc (875; 4% instances), cs-dep/nmod (836; 4% instances), cs-dep/amod (831; 4% instances), cs-dep/cop (738; 3% instances), cs-dep/conj (633; 3% instances), cs-dep/nsubj (619; 3% instances), cs-dep/xcomp (570; 2% instances), cs-dep/dep (267; 1% instances), cs-dep/mark (227; 1% instances), cs-dep/appos (225; 1% instances), cs-dep/advmod (196; 1% instances), cs-dep/nummod:gov (132; 1% instances), cs-dep/advcl (87; 0% instances), cs-dep/det (75; 0% instances), cs-dep/foreign (39; 0% instances), cs-dep/neg (36; 0% instances), cs-dep/nummod (30; 0% instances), cs-dep/ccomp (26; 0% instances), cs-dep/discourse (23; 0% instances), cs-dep/csubj (22; 0% instances), cs-dep/aux (19; 0% instances), cs-dep/det:numgov (18; 0% instances), cs-dep/parataxis (9; 0% instances), cs-dep/dobj (7; 0% instances), cs-dep/auxpass:reflex (1; 0% instances), cs-dep/det:nummod (1; 0% instances), cs-dep/vocative (1; 0% instances)

Children of PRON nodes belong to 16 different parts of speech: ADP (10549; 46% instances), VERB (3983; 17% instances), NOUN (1733; 8% instances), PUNCT (1659; 7% instances), ADJ (1201; 5% instances), CONJ (1140; 5% instances), ADV (961; 4% instances), PRON (855; 4% instances), PART (254; 1% instances), SCONJ (226; 1% instances), NUM (196; 1% instances), PROPN (179; 1% instances), DET (97; 0% instances), AUX (19; 0% instances), INTJ (1; 0% instances), SYM (1; 0% instances)
