
This page pertains to UD version 2.

Treebank Statistics: UD_Welsh-CCG: POS Tags: VERB

There are 243 VERB lemmas (5%), 612 VERB types (8%) and 2807 VERB tokens (6%). Out of 15 observed tags, the rank of VERB is: 4 in number of lemmas, 4 in number of types and 8 in number of tokens.

The 10 most frequent VERB lemmas: bod, cael, gallu, dod, gwneud, gweld, dylu, mynd, dweud, darfod

The 10 most frequent VERB types: mae, oedd, sy, sydd, bydd, fydd, ydw, ydych, ydym, yw

The 10 most frequent ambiguous lemmas: bod (VERB 1664, AUX 1021, NOUN 286), cael (NOUN 238, VERB 124), gallu (VERB 107, NOUN 44), dod (NOUN 118, VERB 80), gwneud (NOUN 101, VERB 68), gweld (NOUN 65, VERB 51), mynd (NOUN 116, VERB 36), dweud (NOUN 52, VERB 30), dechrau (NOUN 55, VERB 18), gwybod (NOUN 33, VERB 18)

The 10 most frequent ambiguous types: mae (VERB 240, AUX 83, NOUN 2), oedd (AUX 140, VERB 129), sy (VERB 118, AUX 30), sydd (VERB 101, AUX 20), bydd (VERB 45, AUX 11), fydd (VERB 51, AUX 19), ydw (VERB 42, AUX 7), ydych (VERB 35, AUX 18), ydym (VERB 42, AUX 6), yw (AUX 169, VERB 36)

Morphology

The form / lemma ratio of VERB is 2.518519 (the average of all parts of speech is 1.444620).
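The reported ratio matches the type and lemma counts given at the top of the page (612 / 243 ≈ 2.518519). As a minimal sketch, assuming tokens are available as (form, lemma, UPOS) triples and that a "type" is a distinct word form within the tag, the ratio could be computed as follows. The function name `form_lemma_ratio` and the toy sample are illustrative and not part of the UD tooling.

```python
def form_lemma_ratio(tokens, upos="VERB"):
    """Return (#distinct forms) / (#distinct lemmas) for one UPOS tag.

    `tokens` is an iterable of (form, lemma, upos) triples (assumed layout).
    """
    forms = set()
    lemmas = set()
    for form, lemma, tag in tokens:
        if tag == upos:
            forms.add(form)
            lemmas.add(lemma)
    if not lemmas:
        raise ValueError(f"no tokens tagged {upos}")
    return len(forms) / len(lemmas)


# Toy sample for illustration only; not drawn from the treebank.
sample = [
    ("mae", "bod", "VERB"),
    ("oedd", "bod", "VERB"),
    ("mae", "bod", "VERB"),   # repeated form, counted once
    ("caf", "cael", "VERB"),
    ("oedd", "bod", "AUX"),   # other tag, ignored
]
ratio = form_lemma_ratio(sample)   # 3 distinct forms / 2 distinct lemmas

# Cross-check against the counts reported on this page.
page_ratio = round(612 / 243, 6)
```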

The highest number of forms (90) was observed with the lemma “bod”: ‘S, ‘dyn, Basai, Byddan, Dw’, Fasai, O’dd, Roeddent, Rwdy, Rwi, Rydw, Sai, Wi, baent, bai, baswn, bo’, bof, bu, buaswn, buodd, buom, bydd, bydda, byddaf, byddai, byddech, byddem, byddent, byddet, byddwch, byddwn, bûm, dw, dwi, dwy, fasat, fo, fu, fuasen, fues, fydd, fyddaf, fyddai, fyddan, fyddant, fydden, fyddet, fyddi, fyddwch, fyddwn, fyswch, ma’, mae, maen, maent, oedd, oedda, oeddach, oeddan, oeddech, oeddem, oedden, oeddent, oeddet, oeddwn, oes, ro’n, rwyf, ry’n, sy, sydd, tydi, wy, wyf, wyt, y’n, ydach, ydan, ydi, ydw, ydw’, ydwyf, ydy, ydych, ydym, ydyn, ydynt, ydyw, yw.

The 2nd highest number of forms (26) was observed with the lemma “cael”: Cawsom, Gafon, Gest, caf, cafodd, cafwyd, caiff, cawn, cefais, ceir, ces, cewch, chafodd, chafwyd, chaiff, cheir, ga’, gaent, gafodd, gaiff, gawn, gawsoch, gawson, gefais, geir, ges.

The 3rd highest number of forms (25) was observed with the lemma “gallu”: Allech, Allet, Gallaf, Gallasai, Gallet, Gallu, Gelli, all, allaf, allai, allem, allwch, allwn, ellid, ellir, gall, gallai, gallant, gallech, gallodd, gallwch, gallwn, gellid, gellir, gellwch.

VERB occurs with 6 features: Mood (2807; 100% instances), Person (2807; 100% instances), VerbForm (2807; 100% instances), Tense (2735; 97% instances), Number (2544; 91% instances), Mutation (478; 17% instances)

VERB occurs with 20 feature-value pairs: Mood=Cnd, Mood=Imp, Mood=Ind, Mood=Sub, Mutation=AM, Mutation=NM, Mutation=SM, Number=Plur, Number=Sing, Person=0, Person=1, Person=2, Person=3, Tense=Fut, Tense=Imp, Tense=Past, Tense=Pqp, Tense=Pres, VerbForm=Fin, VerbForm=FinRel

VERB occurs with 103 feature combinations. The most frequent feature combination is Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin (613 tokens). Examples: mae, yw, oes, ydy, ydi, ydyw, ‘S, ma’, rwyf, tydi
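The three feature statistics above (features, feature-value pairs, and full combinations) can all be derived from the FEATS column of a CoNLL-U file. The following is a rough sketch under that assumption; `parse_feats` and `feature_stats` are hypothetical helper names, and the sample strings are illustrative rather than taken from the treebank.

```python
from collections import Counter


def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a dict; '_' means no features."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_stats(feats_column):
    """Count features, feature-value pairs and whole combinations."""
    feature_counts = Counter()
    pair_counts = Counter()
    combo_counts = Counter()
    for feats in feats_column:
        parsed = parse_feats(feats)
        feature_counts.update(parsed.keys())
        pair_counts.update(f"{name}={value}" for name, value in parsed.items())
        combo_counts[feats] += 1
    return feature_counts, pair_counts, combo_counts


# Illustrative FEATS values for three VERB tokens.
common = "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin"
sample_feats = [
    common,
    common,
    "Mood=Ind|Mutation=SM|Number=Sing|Person=3|Tense=Fut|VerbForm=Fin",
]
features, pairs, combos = feature_stats(sample_feats)
top_combo, top_count = combos.most_common(1)[0]
```

In this toy input, `features["Mutation"]` is lower than `features["Mood"]` for the same reason Mutation covers only 17% of VERB tokens on this page: the feature is present only on some tokens.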

Relations

VERB nodes are attached to their parents using 12 different relations: root (1678; 60% instances), acl:relcl (471; 17% instances), conj (274; 10% instances), advcl (114; 4% instances), ccomp (111; 4% instances), acl (105; 4% instances), parataxis (35; 1% instances), appos (10; 0% instances), obl (3; 0% instances), csubj (2; 0% instances), nmod (2; 0% instances), obj (2; 0% instances)
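As a quick consistency check, the relation counts above sum to the 2807 VERB tokens reported earlier, and the percentages can be reproduced by rounding each share to the nearest integer. The rounding rule is an assumption that happens to match the listed figures; `percent_shares` is an illustrative helper name.

```python
# Counts copied from the relation list above.
relation_counts = {
    "root": 1678, "acl:relcl": 471, "conj": 274, "advcl": 114,
    "ccomp": 111, "acl": 105, "parataxis": 35, "appos": 10,
    "obl": 3, "csubj": 2, "nmod": 2, "obj": 2,
}


def percent_shares(counts):
    """Return each count as an integer percentage of the total."""
    total = sum(counts.values())
    return {rel: round(100 * n / total) for rel, n in counts.items()}


total = sum(relation_counts.values())
shares = percent_shares(relation_counts)
```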

Parents of VERB nodes belong to 8 different parts of speech, counting the artificial root, which has no part-of-speech tag: root (1678; 60% instances), NOUN (694; 25% instances), VERB (302; 11% instances), ADJ (52; 2% instances), PRON (42; 1% instances), PROPN (29; 1% instances), NUM (7; 0% instances), AUX (3; 0% instances)

4 (0%) VERB nodes are leaves.

228 (8%) VERB nodes have one child.

510 (18%) VERB nodes have two children.

2065 (74%) VERB nodes have three or more children.

The highest child degree of a VERB node is 9.
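The four child-count buckets above add up to the 2807 VERB tokens (4 + 228 + 510 + 2065). A minimal sketch of how such a distribution might be computed, assuming each sentence is a list of (id, head, UPOS) triples with head 0 marking the root; the function name and the toy sentences are illustrative only.

```python
from collections import Counter


def child_degree_buckets(sentences, upos="VERB"):
    """Bucket nodes of one UPOS tag by number of children.

    Buckets are 0, 1, 2 and 3 (meaning three or more).
    Returns (buckets, highest child degree seen).
    """
    buckets = Counter()
    max_degree = 0
    for sentence in sentences:
        # Number of dependents per head id within this sentence.
        children = Counter(head for _, head, _ in sentence)
        for node_id, _, tag in sentence:
            if tag != upos:
                continue
            degree = children[node_id]
            max_degree = max(max_degree, degree)
            buckets[min(degree, 3)] += 1
    return buckets, max_degree


# Toy sentences: (id, head, upos); illustrative, not from the treebank.
toy = [
    [(1, 0, "VERB"), (2, 1, "PRON"), (3, 1, "NOUN"), (4, 1, "PUNCT")],
    [(1, 0, "NOUN"), (2, 1, "VERB")],
]
buckets, max_degree = child_degree_buckets(toy)

# Cross-check of the figures reported on this page.
page_total = 4 + 228 + 510 + 2065
```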

Children of VERB nodes are attached using 24 different relations: punct (2189; 24% instances), xcomp (1638; 18% instances), nsubj (1615; 17% instances), advmod (1149; 12% instances), obl (914; 10% instances), obj (440; 5% instances), conj (277; 3% instances), advcl (270; 3% instances), cc (260; 3% instances), ccomp (218; 2% instances), mark (138; 1% instances), parataxis (41; 0% instances), case (33; 0% instances), det (33; 0% instances), obl:agent (27; 0% instances), appos (10; 0% instances), csubj (6; 0% instances), amod (4; 0% instances), discourse (4; 0% instances), acl (1; 0% instances), aux (1; 0% instances), cop (1; 0% instances), iobj (1; 0% instances), nummod (1; 0% instances)

Children of VERB nodes belong to 15 different parts of speech: NOUN (3912; 42% instances), PUNCT (2189; 24% instances), PRON (847; 9% instances), PART (742; 8% instances), ADV (318; 3% instances), VERB (302; 3% instances), CCONJ (264; 3% instances), PROPN (197; 2% instances), ADJ (181; 2% instances), SCONJ (122; 1% instances), NUM (84; 1% instances), ADP (54; 1% instances), DET (35; 0% instances), AUX (17; 0% instances), SYM (7; 0% instances)