In this posting I will try to give some advice on how to choose the right spatial data sources. I will focus on legal issues. As you will see, the conclusion has to rest on more than one factor.
As far as I know very few people, if anyone at all, have been imprisoned for infringing copyright on spatial data. We have often heard how sharing digital movies may lead to high fines and imprisonment. Even linking to copyrighted material might get you into such trouble.
Most countries have regulations on intellectual property, and this is nothing new. In a 1777 court case involving the composers Johann Christian Bach and Karl Friedrich Abel, the dispute was whether sheet music fell under the copyright law of the time. Lord Mansfield concluded the case with these words:
A person may use the copy by playing it, but he has no right to rob the author of the profit, by multiplying copies and disposing of them for his own use.
(Source: From the book “Free Culture” by Lawrence Lessig)
This verdict is just one of many pointing to the challenges facing owners of writings, media, computer programs, maps and other items that fall under intellectual property regulations. Many of us have enjoyed a TV show we missed thanks to a VHS copy a friend made for us. Others have received digital copies. Some of this would fall under what is called fair use; the rest is, at the end of the day, part of robbing the owner of the profit. The first is OK, the latter is not.
While I am by no means trained in law, I should still be able to distinguish right from wrong by reading the law or having it explained to me in layman's terms. If the law fails to leave me with such an understanding, I of course stand to be corrected. But the regulators should at the same time consider which is the bigger problem: institutions that fail to communicate the essence of the law, or the citizen who for some reason cannot tell right from wrong even after being informed.
Sharing a roads data set may not be a big deal, certainly not when its origin is dubious and the originator has failed to state its legal status. Or maybe they did, but the information got lost along the way? However unproblematic sharing the roads data may seem, it still poses a challenge for the communities relying on these data.
These are some of the potential consequences:
- As the roads data set gets shared, the original metadata is stripped, the data gets reprojected, even “corrected” with the best of intentions, and so on. Consequence: Corrupted data.
- If the original contributor of the data relied on funding, this funding might dry up, since everyone knows that there is a “similar” data set available at the same quality as the one sold. Consequence: The data contributor goes out of business or loses its funding.
- The roads data was meant to be distributed freely, but the practice of copying leads to confusion about its origin. At the end of the day nobody knows the actual source. Consequence: A good source of data is not used and the funding party loses interest.
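One way to reduce the risk of metadata getting stripped is to keep a small provenance record next to the data file and update it whenever the data is changed. The following is only a minimal sketch of that idea, not an established standard: the function name `write_provenance`, the `.provenance.json` suffix and the field names are all assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def write_provenance(data_path, source, license_name, attribution, changes=()):
    """Write a JSON sidecar file describing where a data file came from.

    All field names are hypothetical; adapt them to whatever metadata
    convention your community already uses.
    """
    data_path = Path(data_path)
    record = {
        "file": data_path.name,
        # The checksum lets a recipient verify that the file is unchanged.
        "sha256": hashlib.sha256(data_path.read_bytes()).hexdigest(),
        "source": source,
        "license": license_name,
        "attribution": attribution,
        # Free-text notes on what was done to the data (reprojection, edits).
        "changes": list(changes),
    }
    sidecar = data_path.with_name(data_path.name + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Anyone passing the data on would then share the sidecar file together with the data set, adding a line to `changes` for every reprojection or correction.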
Today we see a movement towards openly sharing data: governments, organizations, Web sites, individuals, and devices. Data are being copied and used. Given the right tools, data are also refined, remixed, corrected and properly credited. Open data finds its parallel in the development of open-source software. The latter is what has given us GIS tools like QGIS, GeoServer, MapServer, GDAL and many others.
In this posting I will look at how some sources contribute to such processes. I will use the Tanzania roads data set as an example. For Tanzania roads data we are left with four choices:
- Official government data
- The tz_roads.shp dataset of unclear origin
- OpenStreetMap data
- Tracks4Africa data
The last three of them are available for use. Only the OpenStreetMap data has a clear license, the Open Database License (ODbL). Under the ODbL you are free:
- To Share: To copy, distribute and use the database.
- To Create: To produce works from the database.
- To Adapt: To modify, transform and build upon the database.
As long as you:
- Attribute: You must attribute any public use of the database, or works produced from the database, in the manner specified in the ODbL. For any use or redistribution of the database, or works produced from it, you must make clear to others the license of the database and keep intact any notices on the original database.
- Share-Alike: If you publicly use any adapted version of this database, or works produced from an adapted database, you must also offer that adapted database under the ODbL.
- Keep open: If you redistribute the database, or an adapted version of it, then you may use technological measures that restrict the work (such as DRM) as long as you also redistribute a version without such measures.
Not knowing too much about the OpenStreetMap data in Tanzania, I ran some statistics on a copy of the roads data set from OpenStreetMap, as provided to me by Howard Frederick in early January this year. I used this data set in my Tanzania Protected areas map published earlier on this site. OpenStreetMap has mapped a total of 48,363 km divided among 10,253 segments in all of Tanzania. Highlights according to the well-documented categories of OSM are:
- primary: 4,401 km
- roads: 24,622 km
- secondary: 5,524 km
- tertiary: 4,033 km
- tracks: 1,658 km
- steps: 150 meters (!)
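For readers who want to produce similar per-category totals themselves, here is a minimal sketch using only the Python standard library. It is not the procedure used for the numbers above. It assumes each road segment is already available as a category name plus a list of longitude/latitude pairs in degrees, and it approximates lengths with the haversine formula on a spherical Earth. The sample segments are made up for illustration.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, spherical approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def segment_length_km(coords):
    """Length of one polyline given as [(lon, lat), ...]."""
    return sum(haversine_km(*a, *b) for a, b in zip(coords, coords[1:]))


def length_by_category(segments):
    """Sum length and count segments per category.

    segments: iterable of (category, [(lon, lat), ...]) tuples.
    Returns (totals_km, segment_counts) as two plain dicts.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for category, coords in segments:
        totals[category] += segment_length_km(coords)
        counts[category] += 1
    return dict(totals), dict(counts)


# Hypothetical sample data, not taken from the actual OSM extract.
SAMPLE_SEGMENTS = [
    ("primary", [(39.0, -6.0), (39.0, -7.0)]),
    ("primary", [(35.0, -3.0), (35.0, -3.5)]),
    ("track", [(36.0, -2.0), (36.0, -2.0)]),
]

if __name__ == "__main__":
    totals, counts = length_by_category(SAMPLE_SEGMENTS)
    for category in sorted(totals):
        print(f"{category}: {totals[category]:.1f} km in {counts[category]} segments")
```

In practice the segments would be read from the data set with a GIS library, and lengths measured in a suitable projected coordinate system will differ slightly from this spherical approximation.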
Tracks4Africa relies on the T4A Data Community to contribute to their data. The following paragraphs are what I would call their social contract with their users.
The T4A Data Community is the very foundation of Tracks4Africa. It is the source of GPS data, information and professional advice, guidance and inspiration. Embedded in the T4A Community is an eco-consciousness and a value system that practically rules everything.
Membership to T4A is open, free and voluntary. Recording GPS field data is a by-product of outdoor/leisure or business travel. Data must be collected and prepared according to T4A community standards.
T4A data members actively collect and submit field data and information, and in return receive a monthly updated GPS map called “T4A Maps QA”. This map is just a 2D map (only roads and points) and it is about a 10Mb download. It is NOT a free map and it is NOT a navigational toolkit. It serves only as a quality correction tool and it is distributed ONLY to active Data Members.
At least up until 2009 they provided the community with detailed overviews of submitted and processed data. Their website has a page dedicated to the history of Tracks4Africa. It tells an impressive tale of 1,100 submitters contributing a total of 10,000,000 km of tracks and 600,000 km of qualified African roads by 2010. They did not provide any further details about their licensing on their website.
The choice is made – for roads data
Considering the alternatives above, my choice is clear. OpenStreetMap does the job. Ideally the Tanzanian Government should provide their data as well. It would be useful to have access to the official roads data, if not for use, then at least to check that the OpenStreetMap data is not wrong. The OpenStreetMap data set is not a perfect one. It is the result of many individuals contributing on their own terms. Some contributors are excellent, others are not. Still, the presence of a data set with a clear license makes for a good basis for research and navigation.
Choosing OpenStreetMap over government, private and random data for roads data is a no-brainer. The case is, however, quite different when it comes to authoritative data like district boundaries, protected areas, land use and others. In those cases a mix of “fair use”, opportunism and less accurate sources might be a better choice. I hope to come back to those issues in a later posting.
- Lawrence Lessig (2004). Free culture: how big media uses technology and the law to lock down culture and control creativity.