Getting a repository
Make sure you have a GitHub account, then contact a member of our release and documentation task force: firstname.lastname@example.org, email@example.com, firstname.lastname@example.org. They will get you going.
Coding metadata for the front page
The table on the front page is generated automatically from special lines in the README.md file of each language. To add a new language, you must therefore first create its repository, containing at least the README file. Here is an example of the language metadata block from the Finnish README file:
Documentation status: complete
Data source: semi-automatic
Data available since: UD v1.0
License: CC BY-SA 4.0
Genre: blog wiki legal news fiction
Contributors: Ginter, Filip; Kanerva, Jenna; Laippala, Veronika; Missilä, Anna; Pyysalo, Sampo
Contact: email@example.com, firstname.lastname@example.org
This block can be anywhere in the readme file. The properties are as follows:
Documentation status can be, for example, complete (as in the Finnish example above).
Data source can be, for example, semi-automatic or manual. As rough guidance, manual means that every word of every sentence has been manually checked, whereas semi-automatic means an automatic conversion with major manual checks of various types of constructions.
Data available since can be UD v1.0 (as in the example above), UD v1.1, or UD v1.2. As the current release is 1.1, new languages that will first be included in 1.2 should set this property to UD v1.2, so that they are included in the upcoming automatic validation runs.
License: anything containing the string BY-NC-SA will be given the CC non-commercial logo, anything containing BY the CC logo, and anything containing GNU the GNU logo. To add any other license, please provide a suitable icon to email@example.com and firstname.lastname@example.org.
Genre: this is simply a space-separated list of genres which gets mapped into symbols in the table. The possible genres are listed in this file in the repository. If you don’t see yours, just edit the file on GitHub and add your genre, choosing one of the symbols from the FontAwesome list. Please make sure you get the syntax right, since this is a machine-readable JSON file. It is also possible not to add the genre to the genre_symbols.json file, in which case the default symbol will be used automatically. The genre name will still remain visible in the mouse-over tooltip.
Contributors: the list of contributors to be included with the data release and on the LINDAT download page. This is a semicolon-separated list in which every name is in the Last, First form. The README file should be UTF-8 encoded so that special characters are preserved correctly.
Contact: e-mail address(es) of contact person(s) for the treebank (typically a subset of the contributors). The address may be used for inquiries about the treebank, such as questions that are not answered in documentation and/or issue trackers. More importantly, it may be used by people who want to contribute to the treebank and need to coordinate with the maintainers, and by the UD release task force to discuss issues that need to be fixed before the release. Warning: the addresses listed here may be exposed to spamming robots.
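To check a metadata block before committing it, it can help to read it the way a script might. The following is only a minimal sketch in Python, not the code that actually builds the front page. It assumes one property per line in the README, and the function names, the logo names (cc-noncommercial, cc, gnu) and the default symbol name are hypothetical. Testing BY-NC-SA before BY is also an assumption, made because the longer string contains the shorter one.

```python
# Property names as listed in the section above.
KEYS = (
    "Documentation status",
    "Data source",
    "Data available since",
    "License",
    "Genre",
    "Contributors",
    "Contact",
)


def parse_metadata(readme_text):
    """Collect 'Key: value' lines for the known properties, wherever they occur."""
    meta = {}
    for line in readme_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in KEYS:
            meta[key.strip()] = value.strip()
    return meta


def license_logo(license_string):
    """Return a (hypothetical) logo name, or None if no known string matches."""
    # Assumption: BY-NC-SA is checked first, since it also contains "BY".
    if "BY-NC-SA" in license_string:
        return "cc-noncommercial"
    if "BY" in license_string:
        return "cc"
    if "GNU" in license_string:
        return "gnu"
    return None


def split_contributors(value):
    """Split the semicolon-separated 'Last, First' list into names."""
    return [name.strip() for name in value.split(";") if name.strip()]


def genre_symbols(value, symbol_table, default="default"):
    """Pair each space-separated genre with its symbol, or with the default."""
    return [(genre, symbol_table.get(genre, default)) for genre in value.split()]


# Usage with a shortened copy of the Finnish example.
sample = """# Finnish
Documentation status: complete
Data source: semi-automatic
Data available since: UD v1.0
License: CC BY-SA 4.0
Genre: blog wiki legal news fiction
Contributors: Ginter, Filip; Kanerva, Jenna
Contact: email@example.com
"""
meta = parse_metadata(sample)
logo = license_logo(meta["License"])
names = split_contributors(meta["Contributors"])
```

Because the lines are matched by property name rather than by position, the sketch finds the block anywhere in the file, which mirrors the statement that the block can be placed anywhere in the README.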
Making a release
When you are ready to contribute to a release, please read the release checklist.