This page pertains to UD version 2.

Treebank Statistics: UD_English-EWT: POS Tags: SCONJ

There are 47 SCONJ lemmas (0%), 71 SCONJ types (0%) and 4603 SCONJ tokens (2%). Out of 17 observed tags, the rank of SCONJ is: 13 in number of lemmas, 13 in number of types and 14 in number of tokens.
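
The three counts above could be reproduced from a CoNLL-U file roughly as follows. This is a minimal sketch under stated assumptions, not the script that generated this page: the helper name `count_upos` and the inline two-sentence sample are hypothetical, types are assumed to be case-folded word forms, and multiword-token ranges and empty nodes are assumed to be skipped.

```python
def count_upos(conllu_text, upos):
    """Return (lemmas, types, tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # sentence boundary or comment line
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            lemmas.add(cols[2])
            types.add(cols[1].lower())  # assumption: types are case-folded
    return len(lemmas), len(types), tokens


# Hypothetical sample (not taken from the treebank); columns are
# ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC.
ROWS = [
    "1 I I PRON _ _ 2 nsubj _ _",
    "2 left leave VERB _ _ 0 root _ _",
    "3 because because SCONJ _ _ 5 mark _ _",
    "4 it it PRON _ _ 5 nsubj _ _",
    "5 rained rain VERB _ _ 2 advcl _ _",
    "",
    "1 Bc because SCONJ _ _ 3 mark _ _",
    "2 it it PRON _ _ 3 nsubj _ _",
    "3 rained rain VERB _ _ 5 advcl _ _",
    "4 I I PRON _ _ 5 nsubj _ _",
    "5 left leave VERB _ _ 0 root _ _",
]
SAMPLE = "\n".join("\t".join(row.split()) for row in ROWS)

print(count_upos(SAMPLE, "SCONJ"))  # (1, 2, 2)
```

In the sample, "because" and "Bc" share one lemma but count as two types, which is the same effect that makes the type count on this page (71) exceed the lemma count (47).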

The 10 most frequent SCONJ lemmas: that, if, as, because, for, of, since, before, while, like

The 10 most frequent SCONJ types: that, if, as, because, for, of, since, before, like, while

The 10 most frequent ambiguous lemmas: that (SCONJ 1166, PRON 1112, DET 297, ADV 18), as (ADP 475, SCONJ 424, ADV 235), because (SCONJ 212, ADP 51), for (ADP 2098, SCONJ 185, CCONJ 6, X 3), of (ADP 4229, SCONJ 158, ADV 2), since (SCONJ 111, ADP 38, ADV 5), before (SCONJ 108, ADP 64, ADV 43), while (SCONJ 106, NOUN 21), like (ADP 214, VERB 212, SCONJ 104, INTJ 27, ADJ 11, ADV 1, NOUN 1), after (ADP 164, SCONJ 103, ADV 6)

The 10 most frequent ambiguous types: that (SCONJ 1157, PRON 968, DET 192, ADV 18, ADP 1), as (ADP 426, SCONJ 365, ADV 222, AUX 1), because (SCONJ 186, ADP 44), for (ADP 2029, SCONJ 183, CCONJ 5, X 5), of (ADP 4177, SCONJ 156, ADV 3, AUX 1, CCONJ 1), since (SCONJ 92, ADP 31, ADV 5), before (SCONJ 100, ADP 60, ADV 41), like (ADP 202, VERB 182, SCONJ 102, INTJ 21, ADJ 11, ADV 1), while (SCONJ 82, NOUN 21), with (ADP 1394, SCONJ 87, ADV 3, AUX 1)

Morphology

The form / lemma ratio of SCONJ is 1.510638 (the average of all parts of speech is 1.237686).
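
The reported ratio is consistent with dividing the 71 SCONJ types by the 47 SCONJ lemmas given at the top of this page. The short check below assumes that this is how the figure is derived; the function name `form_lemma_ratio` is hypothetical.

```python
def form_lemma_ratio(n_forms, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_forms / n_lemmas


# 71 types and 47 lemmas are the SCONJ counts reported on this page.
ratio = form_lemma_ratio(71, 47)
print(f"{ratio:.6f}")  # 1.510638
```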

The 1st highest number of forms (7) was observed with the lemma “because”: b/c, bc, beacuse, because, becouse, becuse, coz.

The 2nd highest number of forms (4) was observed with the lemma “if”: I’d, if, ig, it.

The 3rd highest number of forms (3) was observed with the lemma “cause”: cause, cos, cus.

SCONJ occurs with 4 features: ExtPos (52; 1% instances), Typo (20; 0% instances), Abbr (11; 0% instances), Style (1; 0% instances)

SCONJ occurs with 5 feature-value pairs: Abbr=Yes, ExtPos=ADP, ExtPos=SCONJ, Style=Vrnc, Typo=Yes

SCONJ occurs with 6 feature combinations. The most frequent feature combination is _ (4520 tokens). Examples: that, if, as, because, for, of, since, before, like, while

Relations

SCONJ nodes are attached to their parents using 6 different relations: mark (4533; 98% instances), fixed (63; 1% instances), case (2; 0% instances), conj (2; 0% instances), reparandum (2; 0% instances), ccomp (1; 0% instances)
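
The percentages above can be recomputed from the raw counts. The sketch below assumes the page rounds each share to the nearest whole percent, which reproduces the listed values; the actual rounding rule used by the generator is not stated here. It also shows why relations with only one or two instances appear as 0%.

```python
from collections import Counter

# Relation counts for SCONJ nodes, copied from the line above.
deprels = Counter({
    "mark": 4533,
    "fixed": 63,
    "case": 2,
    "conj": 2,
    "reparandum": 2,
    "ccomp": 1,
})

total = sum(deprels.values())  # equals the 4603 SCONJ tokens
# Assumption: shares are rounded to the nearest whole percent.
shares = {rel: round(100 * n / total) for rel, n in deprels.items()}

for rel, n in deprels.most_common():
    print(f"{rel}: {n} ({shares[rel]}%)")
```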

Parents of SCONJ nodes belong to 14 different parts of speech: VERB (3559; 77% instances), ADJ (413; 9% instances), NOUN (323; 7% instances), AUX (101; 2% instances), ADV (65; 1% instances), SCONJ (38; 1% instances), PRON (36; 1% instances), PROPN (34; 1% instances), PART (10; 0% instances), NUM (8; 0% instances), ADP (7; 0% instances), INTJ (4; 0% instances), DET (3; 0% instances), SYM (2; 0% instances)

4522 (98%) SCONJ nodes are leaves.

64 (1%) SCONJ nodes have one child.

16 (0%) SCONJ nodes have two children.

1 (0%) SCONJ node has three or more children.

The highest child degree of a SCONJ node is 3.
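
The child degree of a node is the number of tokens in the same sentence whose HEAD column points to it. A minimal sketch of that computation follows; the function name `child_degrees` and the example sentence are hypothetical and not taken from the treebank.

```python
from collections import Counter


def child_degrees(heads):
    """heads[i] is the 1-based head ID of token i+1 (0 marks the root).

    Returns the number of children of each token, in token order.
    """
    counts = Counter(heads)
    return [counts.get(i, 0) for i in range(1, len(heads) + 1)]


# "I left because it rained": I -> left, left -> root,
# because -> rained, it -> rained, rained -> left.
heads = [2, 0, 5, 5, 2]
degrees = child_degrees(heads)
print(degrees)  # [0, 2, 0, 0, 2]
```

Here "because" (token 3) has degree 0, so it is a leaf, as are 98% of the SCONJ nodes in this treebank.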

Children of SCONJ nodes are attached using 7 different relations: fixed (67; 68% instances), punct (18; 18% instances), obl:unmarked (6; 6% instances), cc (3; 3% instances), conj (3; 3% instances), advmod (1; 1% instances), goeswith (1; 1% instances)

Children of SCONJ nodes belong to 9 different parts of speech: SCONJ (38; 38% instances), PUNCT (18; 18% instances), CCONJ (13; 13% instances), PART (12; 12% instances), NOUN (10; 10% instances), VERB (3; 3% instances), ADP (2; 2% instances), ADV (2; 2% instances), X (1; 1% instances)
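
The figures in this section can be cross-checked against each other: the number of child edges implied by the degree distribution should equal the totals of the child relation counts and the child part-of-speech counts. The sketch below uses only numbers reported on this page and treats the "three or more" bucket as exactly three children, which is justified by the stated maximum degree of 3.

```python
# Number of SCONJ nodes by child degree, as reported above.
degree_counts = {0: 4522, 1: 64, 2: 16, 3: 1}

# Child relation and child part-of-speech counts, as reported above.
child_deprels = {"fixed": 67, "punct": 18, "obl:unmarked": 6, "cc": 3,
                 "conj": 3, "advmod": 1, "goeswith": 1}
child_upos = {"SCONJ": 38, "PUNCT": 18, "CCONJ": 13, "PART": 12, "NOUN": 10,
              "VERB": 3, "ADP": 2, "ADV": 2, "X": 1}

nodes = sum(degree_counts.values())
edges = sum(degree * n for degree, n in degree_counts.items())

print(nodes)                        # 4603
print(edges)                        # 99
print(sum(child_deprels.values()))  # 99
print(sum(child_upos.values()))     # 99
```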