%0 Journal Article
%A Forkel, Robert
%A List, Johann-Mattis
%A Greenhill, Simon J.
%A Rzymski, Christoph
%A Bank, Sebastian
%A Cysouw, Michael
%A Hammarström, Harald
%A Haspelmath, Martin
%A Kaiping, Gereon A.
%A Gray, Russell D.
%+ Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society
%+ CALC, Max Planck Institute for the Science of Human History, Max Planck Society
%+ Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society
%+ Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society
%+ Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society
%+ Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society
%+ Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society
%+ Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society
%T Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics
%G eng
%U https://hdl.handle.net/21.11116/0000-0007-91A6-9
%R 10.1038/sdata.2018.205
%F OTHER: shh1098
%7 2018-10-16
%D 2018
%8 16.10.2018
%* Review method: peer-reviewed
%X The amount of available digital data for the languages of the world is constantly increasing. Unfortunately, most of the digital data are provided in a large variety of formats and therefore not amenable for comparison and re-use. The Cross-Linguistic Data Formats initiative proposes new standards for two basic types of data in historical and typological language comparison (word lists, structural datasets) and a framework to incorporate more data types (e.g. parallel texts, and dictionaries). The new specification for cross-linguistic data formats comes along with a software package for validation and manipulation, a basic ontology which links to more general frameworks, and usage examples of best practices.
%J Scientific Data
%O Sci. Data
%V 5
%] 180205
%I Nature Publishing Group
%C London, United Kingdom
%@ 2052-4463