This page pertains to UD version 2.

Treebank Statistics: UD_Norwegian-Bokmaal: POS Tags: PROPN

There are 4640 PROPN lemmas (19%), 5008 PROPN types (15%) and 18260 PROPN tokens (6%). Out of 17 observed tags, the rank of PROPN is: 2 in number of lemmas, 2 in number of types and 7 in number of tokens.

The 10 most frequent PROPN lemmas: Norge, Regjeringen, Obama, USA, Oslo, Jan, Cathrine, Stortinget, Svalbard, Den

The 10 most frequent PROPN types: Norge, Obama, Regjeringen, Jan, Oslo, USA, Den, Svalbard, Mayen, Stortinget

The 10 most frequent ambiguous lemmas: Oslo (PROPN 153, X 2), Den (PROPN 116, DET 1, X 1), The (PROPN 53, X 1), Fashanu (PROPN 44, X 2), Det (PROPN 27, X 3), Annan (PROPN 25, X 1), norsk (ADJ 498, NOUN 18, PROPN 1), de (PRON 1636, DET 1349, PROPN 11, X 6, ADV 1), Haram (PROPN 16, X 1), ©NTB (PROPN 14, X 1)

The 10 most frequent ambiguous types: Regjeringen (PROPN 168, NOUN 38), Oslo (PROPN 147, X 2), Den (DET 222, PROPN 116, PRON 82, X 1), The (PROPN 53, X 1), Regjeringens (PROPN 44, NOUN 5), Fashanu (PROPN 36, X 2), Arbeiderpartiet (PROPN 34, NOUN 1), Mitt (PROPN 28, PRON 9), Det (PRON 1652, DET 159, PROPN 27, X 3), Annan (PROPN 25, X 1)

Morphology

The form / lemma ratio of PROPN is 1.079310 (the average of all parts of speech is 1.381903).
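The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas given at the top of this page. The following minimal sketch reproduces that arithmetic; the function name `form_lemma_ratio` is hypothetical, and the assumption that the ratio is computed exactly this way is inferred from the numbers, not stated by the page.

```python
# Counts copied from the statistics at the top of this page.
propn_types = 5008   # distinct word forms tagged PROPN
propn_lemmas = 4640  # distinct lemmas tagged PROPN


def form_lemma_ratio(n_forms: int, n_lemmas: int) -> float:
    """Average number of distinct forms per lemma (assumed definition)."""
    return n_forms / n_lemmas


ratio = form_lemma_ratio(propn_types, propn_lemmas)
print(f"{ratio:.6f}")  # prints 1.079310
```

A value close to 1 means that most proper nouns appear in a single form, which fits the low share of tokens carrying Case=Gen.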

The lemmas listed with the highest number of forms, at 3 forms each, are “Demokratene” (Demokratene, demokratenes, demokretane), “EU” (EU, EU’s, EUs) and “FN” (FN, FN’s, FNs).

PROPN occurs with 3 features: Gender (2689; 15% instances), Case (1214; 7% instances), Abbr (653; 4% instances)

PROPN occurs with 5 feature-value pairs: Abbr=Yes, Case=Gen, Gender=Fem, Gender=Masc, Gender=Neut

PROPN occurs with 10 feature combinations. The most frequent feature combination is _ (13916 tokens). Examples: Norge, Obama, Regjeringen, Oslo, Den, Svalbard, Mayen, Cathrine, Bertelsen, Bergen
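Feature combinations such as the ones counted above come from the FEATS column of a CoNLL-U file, where pairs are joined with `|` and an underscore marks the empty combination. A minimal sketch of splitting that column, assuming the standard CoNLL-U encoding; the helper name `parse_feats` is hypothetical:

```python
def parse_feats(feats: str) -> dict:
    """Split a CoNLL-U FEATS string into a feature -> value mapping."""
    if feats == "_":  # underscore means "no features"
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


print(parse_feats("Case=Gen|Gender=Masc"))
print(parse_feats("_"))
```

Counting the distinct FEATS strings over all PROPN tokens would give the number of feature combinations, while counting the keys of the parsed mappings would give the per-feature totals.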

Relations

PROPN nodes are attached to their parents using 20 different relations: nsubj (4681; 26% instances), flat:name (4206; 23% instances), nmod (4107; 22% instances), obl (2297; 13% instances), conj (1184; 6% instances), obj (570; 3% instances), root (520; 3% instances), nsubj:pass (188; 1% instances), parataxis (120; 1% instances), appos (115; 1% instances), compound (110; 1% instances), xcomp (57; 0% instances), iobj (48; 0% instances), orphan (26; 0% instances), advcl (10; 0% instances), acl (6; 0% instances), acl:relcl (5; 0% instances), ccomp (5; 0% instances), flat:foreign (3; 0% instances), csubj (2; 0% instances)
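Relation counts like those above can be recomputed from the treebank's CoNLL-U files. The sketch below is one possible way to do it with the standard library only; it assumes the ten-column CoNLL-U layout (UPOS in column 4, DEPREL in column 8), and both the function name `propn_relations` and the sample sentence are made up for illustration, not taken from the treebank.

```python
from collections import Counter


def propn_relations(conllu_text: str) -> Counter:
    """Count the DEPREL of every token whose UPOS is PROPN."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip malformed lines, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == "PROPN":
            counts[cols[7]] += 1
    return counts


# Hypothetical sample sentence: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
rows = [
    ("1", "Obama", "Obama", "PROPN", "_", "_", "2", "nsubj", "_", "_"),
    ("2", "besøkte", "besøke", "VERB", "_", "_", "0", "root", "_", "_"),
    ("3", "Oslo", "Oslo", "PROPN", "_", "_", "2", "obj", "_", "_"),
    ("4", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"),
]
sample = "\n".join("\t".join(row) for row in rows)

counts = propn_relations(sample)
total = sum(counts.values())
for rel, n in counts.most_common():
    # Same "(count; percent instances)" layout as the list above.
    print(f"{rel} ({n}; {round(100 * n / total)}% instances)")
```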

Parents of PROPN nodes belong to 13 different parts of speech: VERB (6877; 38% instances), PROPN (5865; 32% instances), NOUN (4193; 23% instances), [no parent: root] (520; 3% instances), ADJ (478; 3% instances), PRON (94; 1% instances), ADV (65; 0% instances), DET (58; 0% instances), NUM (49; 0% instances), ADP (44; 0% instances), INTJ (8; 0% instances), X (5; 0% instances), SYM (4; 0% instances)

8396 (46%) PROPN nodes are leaves.

4809 (26%) PROPN nodes have one child.

2630 (14%) PROPN nodes have two children.

2425 (13%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 18.
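The four child-degree figures above sum to the 18260 PROPN tokens (8396 + 4809 + 2630 + 2425). A distribution of this kind could be computed per sentence by counting how many tokens name each PROPN node as their HEAD. The sketch below assumes the ten-column CoNLL-U layout with sentences separated by blank lines; `propn_child_degrees` and the sample sentence are hypothetical.

```python
from collections import Counter


def propn_child_degrees(conllu_text: str) -> Counter:
    """Bucket PROPN nodes by number of children: '0', '1', '2' or '3+'."""
    buckets = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        upos = {}             # token ID -> UPOS tag
        children = Counter()  # head ID -> number of dependents
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue
            cols = line.split("\t")
            if len(cols) != 10 or not cols[0].isdigit():
                continue  # skip multiword ranges and empty nodes
            upos[cols[0]] = cols[3]
            children[cols[6]] += 1
        for tok_id, tag in upos.items():
            if tag == "PROPN":
                n = children[tok_id]  # 0 for leaves
                buckets["3+" if n >= 3 else str(n)] += 1
    return buckets


# Hypothetical sample: "Barack" heads "Obama" via flat:name, so it has one child.
rows = [
    ("1", "Barack", "Barack", "PROPN", "_", "_", "3", "nsubj", "_", "_"),
    ("2", "Obama", "Obama", "PROPN", "_", "_", "1", "flat:name", "_", "_"),
    ("3", "besøkte", "besøke", "VERB", "_", "_", "0", "root", "_", "_"),
    ("4", "Oslo", "Oslo", "PROPN", "_", "_", "3", "obj", "_", "_"),
    ("5", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"),
]
sample = "\n".join("\t".join(row) for row in rows)

degrees = propn_child_degrees(sample)
print(dict(degrees))
```

In the sample, "Obama" and "Oslo" are leaves and "Barack" has one child, mirroring how multi-word names are chained with flat:name in the statistics above.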

Children of PROPN nodes are attached using 28 different relations: case (5271; 27% instances), flat:name (5227; 27% instances), punct (2555; 13% instances), nmod (2369; 12% instances), conj (1236; 6% instances), cc (861; 4% instances), acl:relcl (280; 1% instances), advmod (271; 1% instances), appos (251; 1% instances), amod (232; 1% instances), cop (161; 1% instances), det (153; 1% instances), obl (145; 1% instances), acl (112; 1% instances), nsubj (98; 1% instances), expl (59; 0% instances), mark (50; 0% instances), xcomp (32; 0% instances), acl:cleft (31; 0% instances), parataxis (30; 0% instances), orphan (28; 0% instances), advcl (24; 0% instances), nummod (20; 0% instances), aux (10; 0% instances), compound (4; 0% instances), csubj (2; 0% instances), discourse (1; 0% instances), reparandum (1; 0% instances)

Children of PROPN nodes belong to 17 different parts of speech: PROPN (5865; 30% instances), ADP (5367; 28% instances), NOUN (2728; 14% instances), PUNCT (2555; 13% instances), CCONJ (909; 5% instances), ADJ (527; 3% instances), VERB (354; 2% instances), NUM (255; 1% instances), ADV (214; 1% instances), DET (199; 1% instances), AUX (171; 1% instances), X (156; 1% instances), PRON (144; 1% instances), SCONJ (48; 0% instances), PART (20; 0% instances), INTJ (1; 0% instances), SYM (1; 0% instances)