This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

X: other

Definition

The tag X is used for words that for some reason cannot be assigned a real part-of-speech category. In the Norwegian data, this tag is used mostly for foreign words.


Treebank Statistics (UD_Norwegian)

There are 438 X lemmas (2%), 438 X types (1%) and 726 X tokens (0%). Out of 17 observed tags, the rank of X is: 6 in number of lemmas, 6 in number of types and 15 in number of tokens.
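Counts of this kind can be reproduced from a treebank file in CoNLL-U format, where each token line has ten tab-separated columns (ID, FORM, LEMMA, UPOSTAG, and so on). The following is a minimal sketch, not the script that generated this page. The sample sentence and the function name `count_tag` are hypothetical, and the sketch assumes that "types" means distinct case-sensitive word forms.

```python
# Hypothetical CoNLL-U fragment for illustration; it is not taken from UD_Norwegian.
SAMPLE = "\n".join([
    "# text = Han sang the end of the road",
    "1\tHan\than\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "2\tsang\tsynge\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tthe\tthe\tX\t_\t_\t2\tdobj\t_\t_",
    "4\tend\tend\tX\t_\t_\t3\tforeign\t_\t_",
    "5\tof\tof\tX\t_\t_\t3\tforeign\t_\t_",
    "6\tthe\tthe\tX\t_\t_\t3\tforeign\t_\t_",
    "7\troad\troad\tX\t_\t_\t3\tforeign\t_\t_",
])


def count_tag(conllu_text, tag):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        form, lemma, upos = cols[1], cols[2], cols[3]
        if upos == tag:
            tokens += 1
            types.add(form)    # assumption: types are case-sensitive forms
            lemmas.add(lemma)
    return len(lemmas), len(types), tokens


print(count_tag(SAMPLE, "X"))  # (4, 4, 5)
```

In the sample, "the" occurs twice, so five X tokens collapse to four types and four lemmas.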

The 10 most frequent X lemmas: the, of, and, in, to, you, a, is, for, i

The 10 most frequent X types: the, of, and, in, to, you, a, is, for, i

The 10 most frequent ambiguous lemmas: the (X 31, DET 4), of (X 25, ADP 1), and (X 20, CONJ 1, NOUN 1), in (X 16, ADJ 2), to (NUM 356, X 11), you (X 9, PRON 1), a (X 8, NOUN 7, INTJ 1), is (NOUN 13, X 7, VERB 1), for (ADP 3710, ADV 148, CONJ 99, X 7), i (ADP 8631, X 3, PROPN 1)

The 10 most frequent ambiguous types: the (X 31, DET 4), of (X 25, ADP 1), and (X 20, NOUN 1, CONJ 1), in (X 16, ADJ 2), to (NUM 331, X 11), you (X 9, PRON 1), a (X 8, ADJ 5, NOUN 2, INTJ 1), is (X 7, NOUN 4, VERB 1), for (ADP 3543, ADV 143, CONJ 44, X 7), i (ADP 7853, X 3)
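An ambiguous lemma or type here is one that occurs with X and with at least one other tag. As an illustration, the sketch below groups tag counts per word and keeps only the words that meet this condition. The function name `ambiguous` and the input pairs are hypothetical; only the counts for "the" mirror the list above.

```python
from collections import Counter, defaultdict


def ambiguous(pairs, tag):
    """Map each word seen with `tag` and at least one other tag to its tag counts."""
    tags_by_word = defaultdict(Counter)
    for word, upos in pairs:
        tags_by_word[word][upos] += 1
    return {
        word: dict(counts)
        for word, counts in tags_by_word.items()
        if tag in counts and len(counts) > 1
    }


# Hypothetical (word, tag) observations; "the" is built to match "the (X 31, DET 4)".
pairs = [("the", "X")] * 31 + [("the", "DET")] * 4 + [("road", "X"), ("for", "ADP")]
print(ambiguous(pairs, "X"))  # {'the': {'X': 31, 'DET': 4}}
```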

Morphology

The form / lemma ratio of X is 1.000000 (the average of all parts of speech is 1.382778).
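The ratio of 1.000000 is consistent with the 438 X types and 438 X lemmas reported in the treebank statistics (438 / 438 = 1). The sketch below assumes the ratio is the average number of distinct forms per lemma; this page does not state the exact formula, and the function name `form_lemma_ratio` and the example pairs are hypothetical.

```python
from collections import defaultdict


def form_lemma_ratio(pairs):
    """Average number of distinct forms per lemma, given (form, lemma) pairs."""
    forms_by_lemma = defaultdict(set)
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
    if not forms_by_lemma:
        return 0.0
    total_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return total_forms / len(forms_by_lemma)


# Foreign words tagged X are left uninflected, so each lemma has a single form.
print(form_lemma_ratio([("the", "the"), ("of", "of"), ("the", "the")]))  # 1.0
# A lemma with two forms raises the average.
print(form_lemma_ratio([("sang", "synge"), ("synger", "synge"), ("the", "the")]))  # 1.5
```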

No X lemma was observed with more than one form, so the ranking by number of forms is a tie. The first three lemmas listed are “32” (form: 32), “34” (form: 34) and “Annan” (form: Annan).

X does not occur with any features.

Relations

X nodes are attached to their parents using 11 different relations (count; share of instances): foreign (478; 66%), name (152; 21%), root (51; 7%), nmod (12; 2%), compound (9; 1%), dobj (9; 1%), appos (5; 1%), xcomp (5; 1%), conj (2; 0%), nsubj (2; 0%), acl (1; 0%).
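The percentages are each relation's share of all X tokens, rounded to the nearest whole number. As a quick check, the relation counts listed above sum to the 726 X tokens reported in the treebank statistics, and rounding reproduces the published percentages. The variable names are illustrative.

```python
# Relation counts copied from the list above.
relation_counts = {
    "foreign": 478, "name": 152, "root": 51, "nmod": 12, "compound": 9,
    "dobj": 9, "appos": 5, "xcomp": 5, "conj": 2, "nsubj": 2, "acl": 1,
}

total = sum(relation_counts.values())
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)                   # 726
print(percentages["foreign"])  # 66
print(percentages["acl"])      # 0
```

Shares below 0.5% round down to 0%, which is why relations with one or two instances are shown as 0%.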

Parents of X nodes belong to 9 different parts of speech (count; share of instances): X (478; 66%), PROPN (156; 21%), ROOT (51; 7%), VERB (20; 3%), NOUN (17; 2%), ADJ (1; 0%), ADP (1; 0%), DET (1; 0%), PRON (1; 0%).

641 (88%) X nodes are leaves.

3 (0%) X nodes have one child.

9 (1%) X nodes have two children.

73 (10%) X nodes have three or more children.

The highest child degree of an X node is 46.
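The child degree of a node is the number of tokens whose HEAD column points to it. The four groups above (641 + 3 + 9 + 73) add up to the 726 X tokens. A minimal sketch of the computation follows, using a hypothetical sentence in which a run of foreign words is attached to its first word; the function names `child_degrees` and `bucket` are illustrative.

```python
from collections import Counter

# Hypothetical sentence as (id, upos, head) triples; head 0 marks the root.
SENTENCE = [
    (1, "PRON", 2),
    (2, "VERB", 0),
    (3, "X", 2),  # first foreign word, head of the foreign run
    (4, "X", 3),
    (5, "X", 3),
    (6, "X", 3),
    (7, "X", 3),
]


def child_degrees(sentence, tag):
    """Map the id of each token with `tag` to its number of children."""
    children = Counter(head for _, _, head in sentence)
    return {tok_id: children[tok_id] for tok_id, upos, _ in sentence if upos == tag}


def bucket(degrees):
    """Group child degrees into the four classes used on this page."""
    names = {0: "leaf", 1: "one", 2: "two"}
    return dict(Counter(names.get(d, "three_or_more") for d in degrees.values()))


degrees = child_degrees(SENTENCE, "X")
print(bucket(degrees))        # {'three_or_more': 1, 'leaf': 4}
print(max(degrees.values()))  # 4
```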

Children of X nodes are attached using 13 different relations (count; share of instances): foreign (481; 65%), punct (218; 30%), case (10; 1%), parataxis (10; 1%), advmod (3; 0%), appos (3; 0%), conj (3; 0%), cc (2; 0%), det (2; 0%), mark (2; 0%), nmod (2; 0%), acl:relcl (1; 0%), amod (1; 0%).

Children of X nodes belong to 10 different parts of speech (count; share of instances): X (478; 65%), PUNCT (218; 30%), VERB (12; 2%), ADP (11; 1%), NOUN (7; 1%), PROPN (5; 1%), ADV (3; 0%), CONJ (2; 0%), ADJ (1; 0%), SCONJ (1; 0%).


X in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fi] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]