
This page pertains to UD version 2.

Treebank Statistics: UD_Turkish-Penn: POS Tags: X

There are 210 X lemmas (1%), 270 X types (1%) and 473 X tokens (0%). Out of 15 observed tags, the rank of X is: 7 in number of lemmas, 7 in number of types and 13 in number of tokens.
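Counts of this kind can be reproduced from a CoNLL-U file. The following is only a minimal sketch: the function name `upos_counts` and the toy `SAMPLE` sentence are invented for illustration, and it assumes that types are counted as distinct word forms without case folding, which may differ from how the page was actually generated.

```python
def upos_counts(conllu_text, upos="X"):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:          # column 4 is UPOS
            lemmas.add(cols[2])      # column 3 is LEMMA
            types.add(cols[1])       # column 2 is FORM
            tokens += 1
    return len(lemmas), len(types), tokens


# Toy sentence, invented for this example (not taken from the treebank).
SAMPLE = "\n".join([
    "# text = toy example",
    "\t".join(["1", "%15", "%15", "X", "_", "_", "2", "nmod", "_", "_"]),
    "\t".join(["2", "%15'e", "%15", "X", "_", "_", "0", "root", "_", "_"]),
    "\t".join(["3", "tefek", "tefek", "X", "_", "_", "2", "conj", "_", "_"]),
    "\t".join(["4", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"]),
])

print(upos_counts(SAMPLE))  # (2, 3, 3)
```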

The 10 most frequent X lemmas: _, ne, biri, %15, %9, itibaren, tefek, %70, %8, gibi

The 10 most frequent X types: biridir, nedir, neler, itibarendir, %15, %9, tefek, %8, %2, %5

The 10 most frequent ambiguous lemmas: ne (PRON 74, ADV 51, CCONJ 36, ADJ 30, X 29), biri (PRON 120, X 21), itibaren (ADP 29, X 11), gibi (ADP 280, X 8), ait (ADP 28, X 7), bu (DET 1505, PRON 791, X 6), şu (DET 108, PRON 17, X 4), biz (PRON 90, NOUN 25, X 3), değiş (VERB 112, X 3), fazla (ADJ 224, ADV 121, ADP 7, X 3)

The 10 most frequent ambiguous types: biridir (X 14, NUM 1), de (CCONJ 635, X 6, ADV 1, VERB 1), fazlaydı (ADJ 3, X 3), ki (SCONJ 104, X 3), çok (ADV 283, ADJ 182, DET 16, X 3, ADP 1, NOUN 1), Benim (PRON 5, NOUN 2, X 2), arası (NOUN 20, X 2), bizim (PRON 13, NOUN 5, X 1), da (CCONJ 641, X 2, ADV 1), den (X 2, VERB 1)

Morphology

The form / lemma ratio of X is 1.285714 (the average of all parts of speech is 2.343544).
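The reported ratio is consistent with dividing the number of X types by the number of X lemmas given earlier on this page (270 and 210). The sketch below only checks that arithmetic; the helper name `form_lemma_ratio` is hypothetical, and the assumption that the ratio is defined as types over lemmas is inferred from the numbers rather than stated by the page.

```python
def form_lemma_ratio(n_forms, n_lemmas):
    """Average number of distinct forms per lemma."""
    return n_forms / n_lemmas


ratio = form_lemma_ratio(270, 210)
print(f"{ratio:.6f}")  # 1.285714
```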

The 1st highest number of forms (23) was observed with the lemma “_”: ‘daki, ‘de, ‘e, ‘nin, ‘yu, ‘ın, .’e, .’un, 3.4, arası, da, dan, de, den, dışı, ki, nun, nın, s, sonu, un, çok, çoğu.

The 2nd highest number of forms (7) was observed with the lemma “%70”: %70, %70’e, %70’inden, %70’ine, %70’ini, %70’lik, %70’ten.

The 3rd highest number of forms (3) was observed with the lemma “%15”: %15, %15’e, %15’ten.

X occurs with 1 feature: Number (146; 31% instances)

X occurs with 2 feature-value pairs: Number=Plur, Number=Sing

X occurs with 3 feature combinations. The most frequent feature combination is _ (327 tokens). Examples: %15, %9, tefek, %8, %2, %5, %7.875, de, %1, %10

Relations

X nodes are attached to their parents using 17 different relations: amod (95; 20% instances), nmod (92; 19% instances), obl (82; 17% instances), root (74; 16% instances), goeswith (42; 9% instances), obj (17; 4% instances), compound (14; 3% instances), case (13; 3% instances), conj (13; 3% instances), nsubj (11; 2% instances), list (5; 1% instances), advcl (4; 1% instances), appos (4; 1% instances), fixed (4; 1% instances), ccomp (1; 0% instances), mark (1; 0% instances), parataxis (1; 0% instances)
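The percentages above appear to be each relation's count divided by the 473 X tokens, rounded to the nearest whole percent. As a rough check under that assumption, the sketch below recomputes them from the counts listed on this page; the names `RELATION_COUNTS` and `percentages` are illustrative only.

```python
# Counts of incoming relations for X nodes, copied from the list above.
RELATION_COUNTS = {
    "amod": 95, "nmod": 92, "obl": 82, "root": 74, "goeswith": 42,
    "obj": 17, "compound": 14, "case": 13, "conj": 13, "nsubj": 11,
    "list": 5, "advcl": 4, "appos": 4, "fixed": 4,
    "ccomp": 1, "mark": 1, "parataxis": 1,
}


def percentages(counts):
    """Map each key to its share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {rel: round(100 * n / total) for rel, n in counts.items()}


total = sum(RELATION_COUNTS.values())
shares = percentages(RELATION_COUNTS)
print(total)           # 473
print(shares["amod"])  # 20
print(shares["root"])  # 16
```

The total of 473 matches the number of X tokens, so every X node is accounted for by exactly one incoming relation.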

Parents of X nodes belong to 10 different parts of speech: NOUN (179; 38% instances), VERB (120; 25% instances), no parent, i.e. the X node is the root (74; 16% instances), PROPN (25; 5% instances), NUM (22; 5% instances), X (21; 4% instances), ADJ (18; 4% instances), ADV (8; 2% instances), DET (4; 1% instances), ADP (2; 0% instances)

302 (64%) X nodes are leaves.

77 (16%) X nodes have one child.

28 (6%) X nodes have two children.

66 (14%) X nodes have three or more children.

The highest child degree of an X node is 8.
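The four child-count figures above sum to 302 + 77 + 28 + 66 = 473, i.e. every X token falls into exactly one bucket. One way such a breakdown could be computed is sketched below. It assumes each sentence is already available as a list of `(id, upos, head)` tuples; the function name `child_degree_buckets` and the toy sentence are invented for this example.

```python
from collections import Counter


def child_degree_buckets(sentences, upos="X"):
    """Bucket nodes of one UPOS tag by how many children they have.

    sentences: list of sentences, each a list of (id, upos, head) tuples,
    where head is the id of the parent (0 for the root).
    """
    buckets = Counter()
    for sent in sentences:
        # Number of children per node id: count how often each id is a head.
        n_children = Counter(head for _, _, head in sent)
        for tok_id, tag, _ in sent:
            if tag == upos:
                degree = n_children[tok_id]
                key = {0: "leaf", 1: "one", 2: "two"}.get(degree, "three_or_more")
                buckets[key] += 1
    return buckets


# Toy sentence (invented): node 2 is the root and has three children.
toy = [[(1, "X", 2), (2, "X", 0), (3, "X", 2), (4, "PUNCT", 2)]]
result = child_degree_buckets(toy)
print(result["leaf"], result["three_or_more"])  # 2 1
```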

Children of X nodes are attached using 23 different relations: punct (123; 31% instances), nsubj (65; 17% instances), nmod (55; 14% instances), obl (25; 6% instances), amod (17; 4% instances), cc (17; 4% instances), conj (17; 4% instances), obj (13; 3% instances), advmod (11; 3% instances), compound (10; 3% instances), advcl (7; 2% instances), case (7; 2% instances), list (5; 1% instances), csubj (4; 1% instances), discourse (4; 1% instances), mark (2; 1% instances), parataxis (2; 1% instances), xcomp (2; 1% instances), appos (1; 0% instances), det (1; 0% instances), fixed (1; 0% instances), flat (1; 0% instances), nummod (1; 0% instances)

Children of X nodes belong to 12 different parts of speech: PUNCT (123; 31% instances), NOUN (113; 29% instances), PROPN (31; 8% instances), VERB (24; 6% instances), ADJ (22; 6% instances), X (21; 5% instances), ADV (16; 4% instances), CCONJ (16; 4% instances), NUM (13; 3% instances), ADP (8; 2% instances), DET (3; 1% instances), PRON (1; 0% instances)