Treebank Statistics: UD_Irish-TwittIrish: POS Tags: X
There are 732 X lemmas (7%), 780 X types (6%) and 1197 X tokens (3%).
Out of 17 observed tags, the rank of X is: 4 in number of lemmas, 6 in number of types and 11 in number of tokens.
The 10 most frequent X lemmas: #gaeilge, _, #gaa, #tg4, #clg, pm, #lágaeilge, #snag, #gaelic, #gaeltacht
The 10 most frequent X types: #gaeilge, #gaa, #tg4, #clg, pm, #lágaeilge, #snag, #gaelic, #gaeltacht, #
The 10 most frequent ambiguous lemmas: #gaeilge (X 92, PROPN 3, ADJ 1, NOUN 1), _ (X 51, PROPN 1), #tg4 (X 8, PROPN 1), pm (NOUN 18, X 12, NUM 1, PROPN 1), #Irish (X 4, PROPN 1), i.n. (NOUN 5, X 4, ADP 2), #SeachtainnaGaeilge (X 3, PROPN 1), #aimsir (NOUN 3, X 2), #brodclub (X 3, NOUN 1), #ceol (X 3, NOUN 1)
The 10 most frequent ambiguous types: #gaeilge (X 85, NOUN 2, PROPN 2, ADJ 1), pm (NOUN 19, X 12, NUM 1, PROPN 1), #Ireland (X 5, PROPN 1), #Irish (X 4, PROPN 1), #SeachtainnaGaeilge (X 3, PROPN 1), #aimsir (NOUN 3, X 2), #brodclub (X 3, NOUN 1), #ceol (X 3, NOUN 1), mhaith (ADJ 46, NOUN 6, X 3), srl (X 3, ADV 2, NOUN 1)
- #gaeilge
  - X 85: Tá an chuallacht ar twitter !! :) #gaeilge #mánuad #anshiftabea ??
  - NOUN 2: @user635 grúpa an-tábhachtach ó thaobh seirbhísí #gaeilge #achtgtcd #fss #hse #sláinte
  - PROPN 2: ( & solas dar ndoigh !! ) Nil se in aon fhocloir #gaeilge ata agam .
  - ADJ 1: RT @user259 : Trendáil #gaeilge #SnaG do seachtain na Gaeilge @user197 @user1493 @user619
- pm
  - NOUN 19: Oíche Scannán an Aoine seo , ar a seacht a chlog ( 7 pm ) i Chevy Chase !! http://t.co/aCo9Lw7lmx
  - X 12: Ceol @user1280 anocht 7 pm ó #tradfest an Chlocháin i mí Aibreáin . @user397 #irishmusic
  - NUM 1: Rugbaí Beo ar @user997 inniú ag 15.15 pm ! Aironi V Connachta agus ansin Cardiff Blues V Laighin ! :) Bigí Linn
  - PROPN 1: An scéal is déanaí maidir le seirbhís aeir oileáin Árann AGUS a bhfuil i ndán do Thuaisceart Éireann , anocht ar Nuacht TG4 @ 7 pm .
- #Ireland
- #Irish
- #SeachtainnaGaeilge
- #aimsir
- #brodclub
- #ceol
- mhaith
- srl
  - X 3: RT @user54 : Uisce Éireann chun $ 86 a chaitheamh ar chomhairleoireacht srl http://t.co/Eso95UykLJ
  - ADV 2: RT @user1058 : @user1639 @user412 @user27 agus tá an teanga ag fáil bháis agua níl ach uaireanta chloig fágtha aici srl srl e …
  - NOUN 1: @user1575 Thriall mise an Bikram Íoga don 1 uair mí ó shin . Bhí na poses srl iontach simplí i gcomparáid le Ashtanga íoga ach an TEAS !!
Morphology
The form / lemma ratio of X is 1.065574 (the average of all parts of speech is 1.212231).
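The reported ratio is consistent with dividing the number of X types (780) by the number of X lemmas (732), both given at the top of this page. The following is a minimal sketch under that assumption; the function name `form_lemma_ratio` is hypothetical and is not part of any UD tooling.

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# Counts for the X tag as reported on this page.
ratio = form_lemma_ratio(n_types=780, n_lemmas=732)
print(f"{ratio:.6f}")
```

A ratio close to 1 means that most X lemmas appear in a single form, which fits a tag dominated by hashtags.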
The 1st highest number of forms (46) was observed with the lemma “_”: #CosaSásta, #Gaeilge, -dúil, 15, Bhreitheamh, Ghaeilge, Jobs.ie, Mheara, a, bhrón, bhuíoch, cheoil, chinnte, chosúil, chóip, cit, dea-thoil, deachair, dearg, fhreagra, g, ghaelach, ghean, gleoite, ionraic, l, lár, mhaith, neills, o, obair, oíche, pa, phracticiul, rtha, scanrúil, scéal, seo, shuimúil, standing, suimuil, sásta, tapaigh, teannas, thógtha, éireoidh.
The 2nd highest number of forms (3) was observed with the lemma “i.n.”: i.n, i.n., in.
The 3rd highest number of forms (2) was observed with the lemma “#CruinniunaComhairle”: #CouncilMeeting, #CruinniunaComhairle.
X does not occur with any features.
Relations
X nodes are attached to their parents using 24 different relations: parataxis:hashtag (933; 78% instances), nmod (71; 6% instances), obl (56; 5% instances), goeswith (49; 4% instances), obj (12; 1% instances), root (11; 1% instances), parataxis:sentence (9; 1% instances), conj (8; 1% instances), compound (7; 1% instances), amod (6; 1% instances), flat (5; 0% instances), nsubj (5; 0% instances), parataxis (5; 0% instances), appos (3; 0% instances), parataxis:url (3; 0% instances), ccomp (2; 0% instances), dep (2; 0% instances), discourse (2; 0% instances), flat:foreign (2; 0% instances), flat:name (2; 0% instances), case (1; 0% instances), list (1; 0% instances), vocative (1; 0% instances), vocative:mention (1; 0% instances)
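The percentages in this list appear to be each relation's count divided by the 1197 X tokens, rounded to the nearest whole percent. The sketch below checks that reading against the counts above; the variable names are illustrative only.

```python
# Incoming relation counts for X nodes, copied from the list above.
relation_counts = {
    "parataxis:hashtag": 933, "nmod": 71, "obl": 56, "goeswith": 49,
    "obj": 12, "root": 11, "parataxis:sentence": 9, "conj": 8,
    "compound": 7, "amod": 6, "flat": 5, "nsubj": 5, "parataxis": 5,
    "appos": 3, "parataxis:url": 3, "ccomp": 2, "dep": 2, "discourse": 2,
    "flat:foreign": 2, "flat:name": 2, "case": 1, "list": 1,
    "vocative": 1, "vocative:mention": 1,
}

# The counts should add up to the number of X tokens.
total = sum(relation_counts.values())

# Share of each relation as a whole-number percentage.
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

for rel, n in relation_counts.items():
    print(f"{rel} ({n}; {shares[rel]}% instances)")
```

Relations shown as 0% are not absent; their share simply rounds down from less than half a percent.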
Parents of X nodes belong to 14 different parts of speech: NOUN (572; 48% instances), VERB (330; 28% instances), PROPN (115; 10% instances), ADJ (72; 6% instances), PRON (33; 3% instances), X (19; 2% instances), NUM (17; 1% instances), (11; 1% instances), ADV (8; 1% instances), ADP (7; 1% instances), INTJ (4; 0% instances), SYM (4; 0% instances), PART (3; 0% instances), DET (2; 0% instances)
1028 (86%) X nodes are leaves.
120 (10%) X nodes have one child.
21 (2%) X nodes have two children.
28 (2%) X nodes have three or more children.
The highest child degree of an X node is 9.
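The child-degree figures above can be reproduced by counting, for every X node, how many tokens name it as their head. The sketch below shows one way to do this on a simplified token list; the `(id, head, upos)` tuples and the toy sentence are hypothetical stand-ins for rows of a CoNLL-U file, not data from the treebank.

```python
from collections import Counter


def child_degree_buckets(tokens, upos="X"):
    """Bucket nodes with the given tag by number of children: 0, 1, 2 or 3+."""
    # Number of children per node id, derived from the head column.
    n_children = Counter(head for _, head, _ in tokens)
    buckets = Counter()
    for tok_id, _, tag in tokens:
        if tag != upos:
            continue
        degree = n_children[tok_id]
        buckets["3+" if degree >= 3 else str(degree)] += 1
    return buckets


# Hypothetical toy sentence: (id, head, upos); head 0 marks the root.
toy = [
    (1, 0, "VERB"),
    (2, 1, "X"),      # two children: tokens 3 and 4
    (3, 2, "ADP"),
    (4, 2, "PUNCT"),
    (5, 1, "X"),      # leaf
]
buckets = child_degree_buckets(toy)
print(dict(buckets))

# The four buckets reported on this page cover every X token.
page_counts = {"0": 1028, "1": 120, "2": 21, "3+": 28}
print(sum(page_counts.values()))
```

Token ids restart in every sentence, so a real treebank would need this count run per sentence and the buckets summed.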
Children of X nodes are attached using 28 different relations: case (83; 29% instances), punct (53; 19% instances), det (29; 10% instances), nmod (21; 7% instances), vocative:mention (11; 4% instances), advmod (9; 3% instances), parataxis:sentence (9; 3% instances), parataxis:hashtag (8; 3% instances), parataxis:url (7; 2% instances), obl (6; 2% instances), compound (5; 2% instances), conj (5; 2% instances), flat (5; 2% instances), parataxis (5; 2% instances), parataxis:rt (5; 2% instances), amod (4; 1% instances), cc (4; 1% instances), obl:tmod (4; 1% instances), nsubj (3; 1% instances), appos (2; 1% instances), aux (1; 0% instances), cop (1; 0% instances), discourse (1; 0% instances), discourse:emo (1; 0% instances), flat:foreign (1; 0% instances), flat:name (1; 0% instances), nummod (1; 0% instances), obj (1; 0% instances)
Children of X nodes belong to 14 different parts of speech: ADP (82; 29% instances), PUNCT (53; 19% instances), PROPN (33; 12% instances), DET (28; 10% instances), NOUN (23; 8% instances), X (19; 7% instances), SYM (15; 5% instances), ADV (8; 3% instances), NUM (7; 2% instances), ADJ (4; 1% instances), CCONJ (4; 1% instances), PRON (4; 1% instances), VERB (4; 1% instances), AUX (2; 1% instances)