Japan Nuclear Crowd Map Unfolded
Last Friday I posted about Southampton University’s website, which mashes up post-Fukushima contamination data from the government, from the independent Safecast network and, crucially, data collated from anyone at all by Pachube/Cosm/Xively.
I had contacted Southampton about the problem before 5pm and only posted about it before packing it in for the night. Matteo Venanzi from Southampton emailed me at 4.08 am, so researchers at Southampton clearly work a strange shift pattern.
His email said: “As you correctly spotted, the Xively sensor is reporting wrong readings. After your and other emails, we temporally stopped the system and we are now working to remove broken sensors from our database. To make it clear, our system is not responsible of the quality of the Xively sensors, it simply collects crowdsourced, publicly available radiation data streamed by Xively and interpolates them using a Gaussian process creating the heatmap. The “intelligence” of this fusion refers to the use of a Gaussian process to interpolate the values over the entire space. You understand that this process does not excludes that some broken sensors, which often occurs many crowdsoucing project, may produce wrong input values and, in turns, wrong interpolations. However, it would have been much appreciated that you would have been discussed this issue with us (I just saw your email now) before publishing your article on the safecast mailing list. It is our interest to provide useful services to people. Thanks.”
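The email describes the fusion step as Gaussian-process interpolation of crowdsourced readings into a heatmap. A minimal sketch of that idea (my own illustrative code, not Southampton’s; the sensor positions and values are invented) shows why the method on its own cannot rescue bad input, which is the point at issue: one broken sensor’s value is smoothly spread over its whole neighbourhood.

```python
import numpy as np

def gp_heatmap(sensor_xy, readings, grid_xy, length_scale=0.5, noise=1e-2):
    """Interpolate point readings onto a grid with GP regression (RBF kernel).

    Hypothetical stand-in for the fusion described in the email: nothing
    here notices that a sensor last reported years ago, so a single stale
    outlier is interpolated across the surrounding area.
    """
    def rbf(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / length_scale**2)

    K = rbf(sensor_xy, sensor_xy) + noise * np.eye(len(sensor_xy))
    Ks = rbf(grid_xy, sensor_xy)
    return Ks @ np.linalg.solve(K, readings)   # posterior mean on the grid

# Three plausible sensors reading ~0.1 uSv/h and one broken one stuck at 50:
sensors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([0.1, 0.12, 0.11, 50.0])
grid = np.array([[0.9, 0.9]])                 # a point near the broken sensor
print(gp_heatmap(sensors, values, grid))      # inflated far above 0.1
```

The Gaussian process is doing exactly what it is designed to do; the “intelligence” is in the interpolation, not in vetting the inputs.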
This email demonstrates a certain touchiness about what a lay reader might understand “intelligent” crowdsourcing to involve. The English-language website was not taken down in the evening, but had been taken offline by the morning. Unfortunately the Japanese-language version was still online, which I pointed out to Matteo. Both maps have now been taken offline and the sites display a message:
These messages show an interesting transference of responsibility for the errors in the map to the sensors and the algorithm. To be blunt, the point I was making in my post last week was not that the “sensor is reporting wrong readings”; the sensor had not reported anything for nearly two years, so it is not, in the present tense, reporting anything. My first criticism was that if you are creating a map of current data, checking that the data you are using are current is a pretty basic place to start. All the data used were time-stamped. My second criticism was that if you really were motivated by wanting to help people in Japan, you would review the areas of high radiation identified, to check that you were not needlessly spreading anxiety. There is no evidence that any review was carried out at all.

The map they produced was so far out of line with the maps produced by either the Japanese authorities or Safecast that anyone with a modicum of awareness would have spotted that the map was either identifying something very significant or was wrong; it would then have taken them ten minutes to establish that it was the latter. There is a piquant irony in that the creators of this website are also, I learn, involved in very valuable research on trustworthiness in crowdsourcing: Trust-based fusion of untrustworthy information in crowdsourcing applications. This is a very interesting technical paper which, behind the detail, explains why a data source providing a single outlier data point is potentially untrustworthy. The one unequivocally true statement on the website is its advice: “Please check the Safecast map for the correct radiation levels.” My suspicion that Southampton saw producing their map as an easy way to demonstrate the “impact” of their research on data visualisation was not addressed.
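Since all the data were time-stamped, the basic check I am describing is a few lines of code. The sketch below is mine, not Southampton’s; the feed list, field layout and 24-hour freshness threshold are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(hours=24)   # assumed freshness threshold

def live_feeds(feeds, now=None):
    """Drop any feed whose last time-stamped reading is stale.

    `feeds` is a hypothetical list of (feed_id, last_reading_utc, value)
    tuples; the data streams carry timestamps, so this check is cheap.
    """
    now = now or datetime.now(timezone.utc)
    return [f for f in feeds if now - f[1] <= MAX_AGE]

now = datetime(2013, 7, 1, tzinfo=timezone.utc)
feeds = [
    ("tokyo-7", datetime(2013, 6, 30, 22, tzinfo=timezone.utc), 0.11),
    ("dead-42", datetime(2011, 8, 15, tzinfo=timezone.utc), 48.0),  # silent for ~2 years
]
print(live_feeds(feeds, now))   # only "tokyo-7" survives
```

Filtering like this before interpolation would have removed the dead sensor at the source, rather than leaving the heatmap to smooth its stale reading over the map.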
It is clear that the group at Southampton do not think that they are in any way at fault, though I expect the “aggregation algorithm” has been taken out for a damn good thrashing, as would the “faulty” sensor have been, except that it is in Japan and dead as the ドードー (the dodo).