UD_Haitian_Creole-Adolphe
|
UD_Haitian_Creole-Autogramm
|
Tokenization and Word Segmentation
|
Tokenization and Word Segmentation
|
- This corpus contains 3314 sentences and 71734 tokens.
|
- This corpus contains 144 sentences and 3279 tokens.
|
- This corpus contains 6790 tokens (9%) that are not followed by a space.
|
- This corpus contains 279 tokens (9%) that are not followed by a space.
|
- This corpus does not contain words with spaces.
|
- This corpus does not contain words with spaces.
|
- This corpus contains 14 types of words that contain both letters and punctuation. Examples: Ing-wen, Jean-Dickens, jw.org, l', n', 'dèt', 'peche', Chia-lung, Jean-Pierre, Jr., kè-sote, sere-sere, tèt-chaje, wo-nivo
|
- This corpus contains 9 types of words that contain both letters and punctuation. Examples: Ing-wen, Jean-Dickens, Ayiti', Chia-lung, Jean-Pierre, kè-sote, sere-sere, tèt-chaje, wo-nivo
|
|
|
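The token and spacing counts above can be approximated directly from a CoNLL-U file. The sketch below is illustrative only: it assumes the standard 10-column CoNLL-U layout with `SpaceAfter=No` recorded in the MISC column, it skips multiword-token ranges and empty nodes as a simplification, and the helper names (`read_sentences`, `space_after_stats`, `make_row`) and the sample sentence are hypothetical, not taken from either treebank.

```python
def read_sentences(conllu_text):
    """Yield each sentence as a list of 10-column token rows, skipping comments."""
    rows = []
    for line in conllu_text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if rows:
                yield rows
                rows = []
        elif not line.startswith("#"):
            rows.append(line.split("\t"))
    if rows:
        yield rows


def space_after_stats(conllu_text):
    """Return (sentences, tokens, tokens_not_followed_by_space)."""
    sentences = tokens = no_space = 0
    for rows in read_sentences(conllu_text):
        sentences += 1
        for cols in rows:
            # Simplification: ignore multiword-token ranges (1-2) and empty nodes (1.1).
            if "-" in cols[0] or "." in cols[0]:
                continue
            tokens += 1
            if "SpaceAfter=No" in cols[9].split("|"):
                no_space += 1
    return sentences, tokens, no_space


def make_row(idx, form, upos, misc="_"):
    """Build one toy CoNLL-U row; HEAD and DEPREL are placeholders."""
    return "\t".join([str(idx), form, form.lower(), upos, "_", "_", "0", "dep", "_", misc])


# Hypothetical one-sentence sample, used only to exercise the functions.
SAMPLE = "\n".join([
    "# text = Li la.",
    make_row(1, "Li", "PRON"),
    make_row(2, "la", "ADV", "SpaceAfter=No"),
    make_row(3, ".", "PUNCT"),
    "",
])

print(space_after_stats(SAMPLE))  # (1, 3, 1)
```

Dividing the third value by the second gives the percentage reported in the bullets above.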
Morphology
Tags
- This corpus uses 17 UPOS tags out of 17 possible: ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, SYM, VERB, X
|
Morphology
Tags
- This corpus uses 16 UPOS tags out of 17 possible: ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PRON, PROPN, PUNCT, SCONJ, SYM, VERB, X
- This corpus does not use the following tags: PART
|
- This corpus contains 9 word types tagged as particles (PART): a, ann, men, non, pa, pou, t, wi, èske
|
|
- This corpus contains 29 lemmas tagged as pronouns (PRON): a, anyen, de, ke, ki, kiyès, kote, kwa, kèlkeswa, ladan, li, lui, lòt, menm, mwen, nou, noumenm, ou, pa, pèsonn, sa, sila, tan, te, toulede, tout, wou, yo, youn
|
- This corpus contains 10 lemmas tagged as pronouns (PRON): anyen, ke, ki, kwa, li, mwen, nou, ou, sa, yo
|
- This corpus contains 32 lemmas tagged as determiners (DET): anpil, chak, de, kelkeswa, ki, konbyen, konsa, kèk, kèlkeswa, la, lòt, menm, nenpòt, non, oken, okenn, pa, pifò, plizyè, pyès, sa, sila, sèt, ti, toude, toule, tout, twòp, tèl, yo, yon, youn
|
- This corpus contains 14 lemmas tagged as determiners (DET): chak, ki, kèk, la, lòt, nempòt, non, oken, okenn, plizyè, sa, tout, yo, yon
|
- Out of the above, 11 lemmas occurred sometimes as PRON and sometimes as DET: de, ki, kèlkeswa, lòt, menm, pa, sa, sila, tout, yo, youn
|
- Out of the above, 3 lemmas occurred sometimes as PRON and sometimes as DET: ki, sa, yo
|
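The PRON/DET overlap reported above amounts to a set intersection over the lemmas seen with each tag. A minimal sketch, assuming the annotation has already been reduced to `(lemma, upos)` pairs; the function names and the toy pairs are hypothetical and only mimic the shape of the data:

```python
from collections import defaultdict


def lemmas_by_upos(pairs):
    """Map each UPOS tag to the set of lemmas that occur with it."""
    lemmas = defaultdict(set)
    for lemma, upos in pairs:
        lemmas[upos].add(lemma)
    return lemmas


def shared_lemmas(pairs, tag_a, tag_b):
    """Return the sorted lemmas that occur with both tags."""
    lemmas = lemmas_by_upos(pairs)
    return sorted(lemmas[tag_a] & lemmas[tag_b])


# Toy input, not real corpus counts.
PAIRS = [
    ("ki", "PRON"), ("ki", "DET"),
    ("sa", "PRON"), ("sa", "DET"),
    ("yo", "PRON"), ("yo", "DET"),
    ("li", "PRON"), ("yon", "DET"),
]

print(shared_lemmas(PAIRS, "PRON", "DET"))  # ['ki', 'sa', 'yo']
```

The same call with `"AUX"` and `"VERB"` would give the auxiliary/verb overlap.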
- This corpus contains 7 lemmas tagged as auxiliaries (AUX): ap, dwe, ka, pral, se, ta, te
|
- This corpus contains 7 lemmas tagged as auxiliaries (AUX): ap, dwe, ka, pral, se, ta, te
|
- Out of the above, 4 lemmas occurred sometimes as AUX and sometimes as VERB: dwe, ka, pral, te
|
- Out of the above, 1 lemma occurred sometimes as AUX and sometimes as VERB: ka
|
- This corpus does not use the VerbForm feature.
|
- This corpus does not use the VerbForm feature.
|
Nominal Features
|
Nominal Features
|
|
|
|
|
|
|
- Plur
- DET: yo, kèk, plizyè, anpil, de, sa, Toule, toude
- PRON: yo, nou, n, y, tout, n', de, noumenm, toulede, yon
|
- Plur
- DET: yo, kèk, plizyè
- PRON: yo, nou, n, y, yon
|
- Sing
- DET: a, yon, la, an, nan, sa, lan, chak, youn, lòt
- PRON: l, li, w, ou, m, mwen, sa, n, t, youn
|
- Sing
- DET: yon, a, la, an, sa, nan, chak, lòt, yo, yoon
- PRON: m, li, mwen, l, sa, w, ni, Ou, nou
|
|
|
|
|
|
|
- Def
- DET: a, la, an, nan, yo, lan, sa
|
- Def
- DET: a, yo, la, an, nan, sa
|
- Ind
- DET: yon, anpil, a, plizyè, kèk, youn, chak, de, sa, yo
|
|
Degree and Polarity
|
Degree and Polarity
|
|
|
|
|
- Neg
- ADV: pa, pap, t, ap, non
- DET: okenn, pa, oken
- PRON: anyen, pèsonn
|
- Neg
- ADV: pa, p
- DET: okenn, oken
|
|
|
Verbal Features
|
Verbal Features
|
|
|
|
|
|
|
|
|
|
|
|
|
- Fut
- AUX: pral, apral, pwal, ap
|
|
|
|
|
|
|
|
Pronouns, Determiners, Quantifiers
|
Pronouns, Determiners, Quantifiers
|
|
|
- Art
- DET: a, yon, la, an, yo, nan, lan, youn, anpil, tout
|
- Art
- DET: yon, yo, a, la, an, nan, yoon
|
- Dem
- DET: sa, konsa, sila, a
- PRON: sa, sila
|
|
- Neg
- ADV: anyen
- PRON: anyen, pèsonn
|
|
- Prs
- DET: pa, Chak
- PRON: yo, l, nou, n, li, w, ou, m, mwen, y
|
- Prs
- PRON: m, li, yo, l, mwen, nou, n, w, y, ni
|
- Rel
- ADV: kote, ki
- PRON: ki, k, ke, kote, kiyès, sa
- SCONJ: ki, k, ke
|
- Rel
- ADV: kote
- DET: ki
- PRON: ki, ke, k
- SCONJ: ke
|
|
|
- Card
- NUM: de, 12, 14, 144000, 1914, 200,000, 2013, 28, 3, 3000
|
|
|
|
- Yes
- DET: pa, yo
- PRON: li, nou, l, m
|
- Yes
- DET: yo
- PRON: li, l, m, nou
|
|
|
|
|
- 1
- PRON: nou, n, m, mwen, n', noumenm, pa, t
|
|
|
|
- 3
- DET: yo
- PRON: yo, l, li, y, t, sa, l', menm, ni, a
|
- 3
- PRON: li, yo, l, y, ni, sa, yon
|
|
|
|
|
|
|
Other Features
|
Other Features
|
|
|
- Typo
- Yes
- ADJ: o
- NOUN: d, jennon
- PROPN: d
|
- Typo
- Yes
- ADJ: o
- NOUN: d, jennon
- PROPN: d
|
Syntax
Auxiliary Verbs and Copula
- This corpus uses 1 lemma as a copula (cop): se.
|
Syntax
Auxiliary Verbs and Copula
- This corpus uses 1 lemma as a copula (cop): se.
|
- This corpus uses 6 lemmas as auxiliaries (aux). Examples: te, ap, ka, dwe, pral, ta.
|
- This corpus uses 6 lemmas as auxiliaries (aux). Examples: te, ap, ka, ta, pral, dwe.
|
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
- VERB--NOUN (866)
- VERB--NOUN-ADP(ak) (1)
- VERB--NOUN-ADP(an) (1)
- VERB--NOUN-ADP(nan) (6)
- VERB--NOUN-ADP(pou) (1)
- VERB--PRON (4540)
- VERB--PRON-ADP(ak) (2)
- VERB--PRON-ADP(nan) (3)
- VERB--PRON-ADP(pou) (2)
- VERB--PRON-ADP(sou) (2)
|
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
- VERB--NOUN (111)
- VERB--PRON (219)
|
- obj
- VERB--NOUN (3485)
- VERB--NOUN-ADP(ak) (5)
- VERB--NOUN-ADP(anrapò) (1)
- VERB--NOUN-ADP(bay) (1)
- VERB--NOUN-ADP(de) (8)
- VERB--NOUN-ADP(kijanm) (1)
- VERB--NOUN-ADP(konsenan) (1)
- VERB--NOUN-ADP(kont) (1)
- VERB--NOUN-ADP(nan) (11)
- VERB--NOUN-ADP(pou) (1)
- VERB--NOUN-ADP(sou) (2)
- VERB--PRON (1633)
- VERB--PRON-ADP(ak) (2)
- VERB--PRON-ADP(de) (1)
- VERB--PRON-ADP(konsenan) (1)
- VERB--PRON-ADP(kont) (1)
- VERB--PRON-ADP(nan) (8)
- VERB--PRON-ADP(pou) (2)
|
- obj
- VERB--NOUN (158)
- VERB--NOUN-ADP(ak) (1)
- VERB--NOUN-ADP(de) (2)
- VERB--NOUN-ADP(kijanm) (1)
- VERB--NOUN-ADP(sou) (1)
- VERB--PRON (34)
|
- iobj
- VERB--NOUN (12)
- VERB--PRON (73)
|
- iobj
- VERB--NOUN (5)
- VERB--PRON (8)
|
|
|
|
|
|
|
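Patterns such as `VERB--NOUN-ADP(nan)` in the lists above can be reproduced roughly as follows. This is a sketch under stated assumptions, not the procedure that generated these statistics: it assumes the `ADP(...)` part names adposition lemmas attached as dependents of the noun or pronoun, it matches relation labels exactly (no subtypes), and the token dictionaries, the `tok` helper, and the toy sentence are hypothetical.

```python
from collections import Counter

CORE_RELATIONS = ("nsubj", "obj", "iobj")


def argument_patterns(sentence):
    """Count (relation, pattern) pairs for verb parents with NOUN/PRON children."""
    by_id = {t["id"]: t for t in sentence}
    counts = Counter()
    for child in sentence:
        if child["deprel"] not in CORE_RELATIONS:
            continue
        if child["upos"] not in ("NOUN", "PRON"):
            continue
        parent = by_id.get(child["head"])
        if parent is None or parent["upos"] != "VERB":
            continue
        # Assumption: adpositions are dependents of the nominal itself.
        adps = sorted(t["lemma"] for t in sentence
                      if t["head"] == child["id"] and t["upos"] == "ADP")
        pattern = "VERB--" + child["upos"]
        if adps:
            pattern += "-ADP(" + "-".join(adps) + ")"
        counts[(child["deprel"], pattern)] += 1
    return counts


def tok(idx, lemma, upos, head, deprel):
    """Build one toy token record."""
    return {"id": idx, "lemma": lemma, "upos": upos, "head": head, "deprel": deprel}


# Artificial annotation, chosen only to exercise both branches.
SENTENCE = [
    tok(1, "li", "PRON", 2, "nsubj"),
    tok(2, "wè", "VERB", 0, "root"),
    tok(3, "nan", "ADP", 4, "case"),
    tok(4, "kay", "NOUN", 2, "obj"),
    tok(5, "la", "DET", 4, "det"),
]

for key, n in sorted(argument_patterns(SENTENCE).items()):
    print(key, n)
```

Summing the counters over all sentences of a treebank would give per-relation totals in the same shape as the lists above.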
Relations Overview
- This corpus uses 7 relation subtypes: acl:relcl, advcl:cleft, compound:svc, flat:name, obl:arg, obl:mod, parataxis:insert
- The following 6 relation types are not used in this corpus at all: csubj, expl, clf, list, orphan, reparandum
|
Relations Overview
- This corpus uses 7 relation subtypes: acl:relcl, advcl:cleft, compound:svc, flat:name, obl:arg, obl:mod, parataxis:insert
- The following 2 main types are never used alone; they are always subtyped: compound, flat
- The following 6 relation types are not used in this corpus at all: csubj, xcomp, expl, clf, list, orphan
|