X: other

The tag X is used for words that for some reason cannot be assigned a real part-of-speech category.

Foreign words appearing inside native text are tagged X (see also Foreign).

Examples


Treebank Statistics (UD_Finnish)

There are 228 X lemmas (1%), 237 X types (0%) and 285 X tokens (0%). Out of 15 observed tags, the rank of X is: 7 in number of lemmas, 8 in number of types and 14 in number of tokens.
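As an illustration of how lemma, type and token counts of this kind could be obtained, here is a minimal sketch that scans CoNLL-U text for a given tag. It assumes the standard ten-column CoNLL-U layout (FORM, LEMMA and UPOS in columns 2 to 4) and that a "type" is a distinct, case-sensitive word form; the page does not state how its own counts were produced. The names `make_row`, `count_tag` and `SAMPLE` are hypothetical, and the toy sentence is made up.

```python
def make_row(idx, form, lemma, upos, head, deprel):
    """Build one tab-separated CoNLL-U token line (unused columns set to '_')."""
    return "\t".join(
        [str(idx), form, lemma, upos, "_", "_", str(head), deprel, "_", "_"]
    )


def count_tag(conllu_text, tag="X"):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip malformed lines, multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        if upos == tag:
            tokens += 1
            types.add(form)
            lemmas.add(lemma)
    return len(lemmas), len(types), tokens


# Made-up toy data, not taken from the treebank.
SAMPLE = "\n".join([
    "# text = (made-up toy sentence)",
    make_row(1, "dollyn", "dolly", "X", 2, "nmod"),
    make_row(2, "talo", "talo", "NOUN", 0, "root"),
    make_row(3, "dolly", "dolly", "X", 2, "appos"),
    make_row(4, "APIn", "API", "X", 2, "appos"),
])

print(count_tag(SAMPLE))  # (lemmas, types, tokens) for X
```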

The 10 most frequent X lemmas: metal, common, death, dolly, eHealth, a, and, api, fun, it

The 10 most frequent X types: metal, common, death, a, and, eHealth, fun, it, pic, DIY

The 10 most frequent ambiguous lemmas: a (NOUN 30, X 3, PROPN 1), and (PROPN 10, X 3), I (ADJ 34, PROPN 5, X 2), Diàoyútái (PROPN 1, X 1), Do (PROPN 5, X 1), Don’t (PROPN 1, X 1), Finnish (PROPN 2, X 1), Grey (PROPN 1, X 1), Life (PROPN 2, X 1), Yourself (PROPN 1, X 1)

The 10 most frequent ambiguous types: a (NOUN 30, X 3, PROPN 1), and (PROPN 10, X 3), I (ADJ 26, PROPN 3, X 2), Do (PROPN 5, X 1), Don’t (PROPN 1, X 1), Finnish (PROPN 2, X 1), Life (X 1, PROPN 1), On (VERB 92, AUX 13, PROPN 5, X 1), Yourself (PROPN 1, X 1), by (X 1, PROPN 1)

Morphology

The form / lemma ratio of X is 1.039474 (the average of all parts of speech is 2.036755).
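The reported ratio appears to be the number of X types divided by the number of X lemmas given above (237 / 228). The short check below reproduces the figure under that assumption; the function name `form_lemma_ratio` is hypothetical.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# Counts reported on this page for X in UD_Finnish: 237 types, 228 lemmas.
ratio = form_lemma_ratio(237, 228)
print(f"{ratio:.6f}")  # 1.039474
```

A value close to 1 means that most X lemmas occur in a single form, which fits a tag used mainly for uninflected foreign words.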

The 1st highest number of forms (3) was observed with the lemma “dolly”: dolly, dollyja, dollyn.

The 2nd highest number of forms (2) was observed with the lemma “API”: API, APIn.

The 3rd highest number of forms (2) was observed with the lemma “eHealth”: eHealth, eHealthin.

X occurs with 1 feature: fi-feat/Foreign (276; 97% instances)

X occurs with 2 feature-value pairs: Foreign=Foreign, Foreign=Fscript

X occurs with 3 feature combinations. The most frequent feature combination is Foreign=Foreign (249 tokens). Examples: metal, common, death, a, and, eHealth, fun, it, pic, DIY

Relations

X nodes are attached to their parents using 22 different relations: fi-dep/foreign (93; 33% instances), fi-dep/appos (48; 17% instances), fi-dep/compound:nn (32; 11% instances), fi-dep/name (21; 7% instances), fi-dep/conj (14; 5% instances), fi-dep/root (14; 5% instances), fi-dep/nmod (13; 5% instances), fi-dep/nsubj (11; 4% instances), fi-dep/discourse (10; 4% instances), fi-dep/dobj (10; 4% instances), fi-dep/nmod:gobj (3; 1% instances), fi-dep/amod (2; 1% instances), fi-dep/cc (2; 1% instances), fi-dep/nmod:poss (2; 1% instances), fi-dep/parataxis (2; 1% instances), fi-dep/remnant (2; 1% instances), fi-dep/advmod (1; 0% instances), fi-dep/ccomp (1; 0% instances), fi-dep/csubj:cop (1; 0% instances), fi-dep/goeswith (1; 0% instances), fi-dep/mwe (1; 0% instances), fi-dep/nsubj:cop (1; 0% instances)

Parents of X nodes belong to 6 different parts of speech: X (127; 45% instances), NOUN (80; 28% instances), VERB (37; 13% instances), PROPN (24; 8% instances), ROOT (14; 5% instances), ADJ (3; 1% instances)

160 (56%) X nodes are leaves.

39 (14%) X nodes have one child.

25 (9%) X nodes have two children.

61 (21%) X nodes have three or more children.

The highest child degree of an X node is 11.
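The child-degree figures above can be read as a histogram over the number of dependents of each X node. A minimal sketch of how such a histogram might be computed for one sentence follows; it assumes each token is given as an `(id, head, upos)` tuple taken from the ID, HEAD and UPOS columns of a CoNLL-U file. The function name `child_degree_buckets` and the toy tree are hypothetical.

```python
from collections import Counter


def child_degree_buckets(sentence, tag="X"):
    """Bucket nodes with the given tag by number of children: '0', '1', '2', '3+'.

    sentence: list of (id, head, upos) tuples for a single dependency tree.
    """
    # Every token contributes one child to its head.
    n_children = Counter(head for _, head, _ in sentence)
    buckets = Counter()
    for node_id, _, upos in sentence:
        if upos == tag:
            degree = n_children[node_id]
            buckets["3+" if degree >= 3 else str(degree)] += 1
    return buckets


# Made-up toy tree: node 1 (X) has children 2, 3 and 4; nodes 2 and 5 are leaves.
toy = [(1, 0, "X"), (2, 1, "X"), (3, 1, "PUNCT"), (4, 1, "NOUN"), (5, 4, "X")]
print(child_degree_buckets(toy))

# Sanity check on the reported figures: the four buckets cover all 285 X tokens.
assert 160 + 39 + 25 + 61 == 285
```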

Children of X nodes are attached using 25 different relations: fi-dep/punct (119; 32% instances), fi-dep/foreign (93; 25% instances), fi-dep/conj (26; 7% instances), fi-dep/name (26; 7% instances), fi-dep/nmod (20; 5% instances), fi-dep/amod (17; 5% instances), fi-dep/cc (14; 4% instances), fi-dep/appos (13; 4% instances), fi-dep/advmod (6; 2% instances), fi-dep/nmod:poss (5; 1% instances), fi-dep/cop (4; 1% instances), fi-dep/nsubj:cop (4; 1% instances), fi-dep/acl:relcl (3; 1% instances), fi-dep/advcl (2; 1% instances), fi-dep/compound:nn (2; 1% instances), fi-dep/det (2; 1% instances), fi-dep/nummod (2; 1% instances), fi-dep/parataxis (2; 1% instances), fi-dep/remnant (2; 1% instances), fi-dep/acl (1; 0% instances), fi-dep/aux (1; 0% instances), fi-dep/cc:preconj (1; 0% instances), fi-dep/discourse (1; 0% instances), fi-dep/mark (1; 0% instances), fi-dep/nsubj (1; 0% instances)

Children of X nodes belong to 12 different parts of speech: PUNCT (127; 35% instances), X (127; 35% instances), NOUN (41; 11% instances), ADJ (20; 5% instances), VERB (17; 5% instances), CONJ (12; 3% instances), PROPN (10; 3% instances), ADV (6; 2% instances), PRON (3; 1% instances), SYM (3; 1% instances), AUX (1; 0% instances), SCONJ (1; 0% instances)


Treebank Statistics (UD_Finnish-FTB)

There are 272 X lemmas (1%), 271 X types (1%) and 306 X tokens (0%). Out of 14 observed tags, the rank of X is: 8 in number of lemmas, 10 in number of types and 14 in number of tokens.

The 10 most frequent X lemmas: the, 70-, ala-, in, keng-, maa-, sosiaali-, 50-, aquis, beat

The 10 most frequent X types: the, 70-, ala-, in, keng-, maa-, sosiaali-, 50-, Lilla, Pretty

The 10 most frequent ambiguous lemmas: blues (PROPN 1, X 1), dementia (NOUN 1, X 1), home (NOUN 3, X 1), is (NOUN 2, X 1), me (PRON 487, DET 74, X 1), termi (NOUN 4, X 1)

The 10 most frequent ambiguous types: m- (PRON 1, X 1), me (PRON 124, VERB 1, X 1), se- (PRON 1, X 1), termi (NOUN 2, X 1)

Morphology

The form / lemma ratio of X is 0.996324 (the average of all parts of speech is 2.041153).

The 1st highest number of forms (1) was observed with the lemma “10-”: 10-.

The 2nd highest number of forms (1) was observed with the lemma “100-”: 100-.

The 3rd highest number of forms (1) was observed with the lemma “150-”: 150-.

X does not occur with any features.

Relations

X nodes are attached to their parents using 4 different relations: fi-dep/conj (148; 48% instances), fi-dep/dep (142; 46% instances), fi-dep/root (15; 5% instances), fi-dep/vocative (1; 0% instances)

Parents of X nodes belong to 9 different parts of speech: NOUN (148; 48% instances), X (65; 21% instances), VERB (34; 11% instances), ADJ (23; 8% instances), PROPN (17; 6% instances), ROOT (15; 5% instances), PRON (2; 1% instances), ADV (1; 0% instances), SCONJ (1; 0% instances)

214 (70%) X nodes are leaves.

55 (18%) X nodes have one child.

19 (6%) X nodes have two children.

18 (6%) X nodes have three or more children.

The highest child degree of an X node is 7.

Children of X nodes are attached using 17 different relations: fi-dep/dep (61; 35% instances), fi-dep/punct (37; 21% instances), fi-dep/nmod (17; 10% instances), fi-dep/conj (10; 6% instances), fi-dep/amod (9; 5% instances), fi-dep/nsubj (8; 5% instances), fi-dep/cop (7; 4% instances), fi-dep/advmod (5; 3% instances), fi-dep/cc (5; 3% instances), fi-dep/acl (4; 2% instances), fi-dep/aux (2; 1% instances), fi-dep/case (2; 1% instances), fi-dep/vocative (2; 1% instances), fi-dep/csubj (1; 1% instances), fi-dep/det (1; 1% instances), fi-dep/mark (1; 1% instances), fi-dep/neg (1; 1% instances)

Children of X nodes belong to 12 different parts of speech: X (65; 38% instances), PUNCT (37; 21% instances), VERB (22; 13% instances), NOUN (14; 8% instances), PROPN (8; 5% instances), ADJ (7; 4% instances), ADV (6; 3% instances), CONJ (5; 3% instances), PRON (5; 3% instances), ADP (2; 1% instances), DET (1; 1% instances), SCONJ (1; 1% instances)


X in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fi] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]