This page pertains to UD version 2.

Treebank Statistics: UD_Latvian-LVTB: POS Tags: CCONJ

There are 37 CCONJ lemmas (0%), 41 CCONJ types (0%) and 11497 CCONJ tokens (4%). Out of 17 observed tags, the rank of CCONJ is: 14 in number of lemmas, 15 in number of types and 9 in number of tokens.
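
As a rough illustration of how counts like these can be derived, the sketch below tallies lemmas, types and tokens for one tag in CoNLL-U text. The function name `upos_counts` and the toy sentence are invented for this example, and it assumes that types are distinct word forms compared case-sensitively, which may differ from how the page's own generator works.

```python
def upos_counts(conllu_text, upos):
    """Return (lemma count, type count, token count) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank lines and sentence-level comments.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep only regular word lines: 10 columns and a plain integer ID
        # (this skips multiword ranges such as 1-2 and empty nodes such as 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            lemmas.add(lemma)
            types.add(form)  # assumption: case-sensitive word forms
            tokens += 1
    return len(lemmas), len(types), tokens


# Invented toy sentence, not taken from UD_Latvian-LVTB.
SAMPLE = "\n".join("\t".join(row) for row in [
    ("1", "Un", "un", "CCONJ", "_", "_", "3", "cc", "_", "_"),
    ("2", "viņš", "viņš", "PRON", "_", "_", "3", "nsubj", "_", "_"),
    ("3", "nāk", "nākt", "VERB", "_", "_", "0", "root", "_", "_"),
    ("4", "un", "un", "CCONJ", "_", "_", "5", "cc", "_", "_"),
    ("5", "iet", "iet", "VERB", "_", "_", "3", "conj", "_", "_"),
    ("6", "bet", "bet", "CCONJ", "_", "_", "7", "cc", "_", "_"),
    ("7", "paliek", "palikt", "VERB", "_", "_", "3", "conj", "_", "_"),
])

print(upos_counts(SAMPLE, "CCONJ"))  # (2, 3, 3)
```

In the toy data "Un" and "un" count as two types but share one lemma, which is why the number of types can exceed the number of lemmas.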

The 10 most frequent CCONJ lemmas: un, bet, vai, gan, taču, arī, ne, kā, jo, tomēr

The 10 most frequent CCONJ types: un, bet, vai, gan, taču, arī, ne, jo, kā, tomēr

The 10 most frequent ambiguous lemmas: un (CCONJ 7478, X 1), bet (CCONJ 1371, SCONJ 1), vai (SCONJ 482, CCONJ 448, PART 258, INTJ 2), gan (CCONJ 430, PART 178, SCONJ 57), taču (CCONJ 384, PART 54), arī (PART 1494, CCONJ 266, SCONJ 30), ne (PART 251, CCONJ 218, SCONJ 1), kā (SCONJ 768, ADV 533, CCONJ 197, PART 80, PRON 3, DET 1), jo (SCONJ 417, CCONJ 195, PART 26), tomēr (CCONJ 183, PART 81, SCONJ 1)

The 10 most frequent ambiguous types: un (CCONJ 7096, ADP 1, X 1), bet (CCONJ 1090, SCONJ 1), vai (SCONJ 478, CCONJ 422, PART 101, INTJ 1), gan (CCONJ 414, PART 175, SCONJ 57), taču (CCONJ 242, PART 52), arī (PART 1360, CCONJ 265, SCONJ 30), ne (PART 231, CCONJ 212, DET 1, SCONJ 1), jo (SCONJ 384, CCONJ 187, PART 24), kā (SCONJ 714, ADV 379, CCONJ 189, PART 80, PRON 25, DET 7), tomēr (CCONJ 91, PART 75)

Morphology

The form / lemma ratio of CCONJ is 1.108108 (the average of all parts of speech is 2.244795).
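
The reported ratio is consistent with dividing the 41 CCONJ types by the 37 CCONJ lemmas given at the top of this page. The short check below reproduces the figure under that assumption; the variable names are illustrative only.

```python
# Counts taken from the summary line of this page.
cconj_types = 41
cconj_lemmas = 37

# Assumption: the form / lemma ratio is simply types divided by lemmas.
ratio = cconj_types / cconj_lemmas
print(f"{ratio:.6f}")  # 1.108108
```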

The 1st highest number of forms (2) was observed with the lemma “arī”: ari, arī.

The 2nd highest number of forms (2) was observed with the lemma “kā”: ka, kā.

The 3rd highest number of forms (2) was observed with the lemma “nevis”: nevis, nevīs.

CCONJ occurs with 2 features: Polarity (218; 2% instances), Typo (6; 0% instances)

CCONJ occurs with 2 feature-value pairs: Polarity=Neg, Typo=Yes

CCONJ occurs with 3 feature combinations. The most frequent feature combination is _ (11273 tokens). Examples: un, bet, vai, gan, taču, arī, jo, kā, tomēr, nevis

Relations

CCONJ nodes are attached to their parents using 12 different relations: cc (10873; 95% instances), fixed (312; 3% instances), mark (253; 2% instances), case (33; 0% instances), conj (6; 0% instances), discourse (6; 0% instances), root (5; 0% instances), dep (3; 0% instances), nsubj (3; 0% instances), flat:name (1; 0% instances), iobj (1; 0% instances), reparandum (1; 0% instances)
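
The percentages in the list above can be recomputed from the raw counts. The following sketch does so, assuming each share is the count divided by the total number of CCONJ tokens and rounded to the nearest whole percent; the dictionary simply copies the figures from this page.

```python
# Relation counts for CCONJ nodes, copied from the list above.
relation_counts = {
    "cc": 10873, "fixed": 312, "mark": 253, "case": 33,
    "conj": 6, "discourse": 6, "root": 5, "dep": 3, "nsubj": 3,
    "flat:name": 1, "iobj": 1, "reparandum": 1,
}

# The counts should add up to the number of CCONJ tokens.
total = sum(relation_counts.values())

# Assumption: shares are rounded to whole percentages.
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)
for rel, n in relation_counts.items():
    print(f"{rel} ({n}; {shares[rel]}% instances)")
```

The total comes to 11497, matching the CCONJ token count, so every CCONJ node is covered by exactly one of the 12 relations.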

Parents of CCONJ nodes belong to 17 different parts of speech: VERB (5633; 49% instances), NOUN (3651; 32% instances), ADJ (727; 6% instances), PROPN (512; 4% instances), ADV (332; 3% instances), CCONJ (266; 2% instances), PRON (173; 2% instances), NUM (66; 1% instances), SCONJ (37; 0% instances), X (30; 0% instances), SYM (21; 0% instances), PART (20; 0% instances), AUX (12; 0% instances), ADP (6; 0% instances), INTJ (5; 0% instances), untagged (5; 0% instances), DET (1; 0% instances)

11030 (96%) CCONJ nodes are leaves.

458 (4%) CCONJ nodes have one child.

5 (0%) CCONJ nodes have two children.

4 (0%) CCONJ nodes have three or more children.

The highest child degree of a CCONJ node is 5.
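
Child-degree figures like those above come from counting, for each node, how many other nodes name it as their head. The sketch below shows one way to do that for a single sentence. The helper `child_degrees` and the toy rows are invented for this example and are not taken from the treebank or from the tool that generated this page.

```python
from collections import Counter


def child_degrees(rows, upos):
    """Return the number of children of every node tagged `upos`.

    rows: (id, upos, head) triples for one sentence; head 0 is the root.
    """
    # Each occurrence of an ID in the head column is one child of that node.
    children_per_head = Counter(head for _, _, head in rows)
    return [children_per_head[node_id] for node_id, tag, _ in rows if tag == upos]


# Invented toy sentence: node 2 is a CCONJ with one CCONJ child (node 3).
ROWS = [
    (1, "NOUN", 0),
    (2, "CCONJ", 4),
    (3, "CCONJ", 2),
    (4, "NOUN", 1),
]

degrees = child_degrees(ROWS, "CCONJ")
leaves = sum(1 for d in degrees if d == 0)
print(degrees, leaves, max(degrees))  # [1, 0] 1 1
```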

Children of CCONJ nodes are attached using 9 different relations: fixed (439; 91% instances), punct (26; 5% instances), conj (7; 1% instances), discourse (3; 1% instances), flat:name (3; 1% instances), nmod (2; 0% instances), nummod (1; 0% instances), parataxis (1; 0% instances), reparandum (1; 0% instances)

Children of CCONJ nodes belong to 9 different parts of speech: CCONJ (266; 55% instances), PART (179; 37% instances), PUNCT (26; 5% instances), NOUN (4; 1% instances), SCONJ (4; 1% instances), ADV (1; 0% instances), NUM (1; 0% instances), PRON (1; 0% instances), PROPN (1; 0% instances)