Oh, I didn’t know about this. Fantastic!!
I see that the initiative is about promoting CEA data sharing for researchers in general. However, to facilitate data consumption by machines (e.g. for autonomous applications) as well as cross-dataset research, it seems like it would be more valuable if the data guidelines defined exact field names and data types for well-known columns. For example:
Timestamp must be a string in RFC 3339 format;
AmbientTemp_<SensorID> must be a float in Celsius.
Otherwise, the data guidelines are more about data packaging than about the data itself.
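To make the idea concrete, here is a rough Python sketch of how a consumer could check a CSV file against rules like the two above. It is only an illustration: the function names (`is_rfc3339`, `validate`) are made up for this comment, I am assuming SensorID is alphanumeric, and the RFC 3339 check is simplified (for instance, it rejects leap seconds). A real spec would need to pin these details down.

```python
import csv
import io
import re
from datetime import datetime

# Assumption: SensorID is alphanumeric. The guidelines do not define this yet.
AMBIENT_TEMP = re.compile(r"AmbientTemp_[A-Za-z0-9]+")

# Shape of an RFC 3339 timestamp: date, 'T', time, optional fraction, offset.
RFC3339 = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})[Tt](\d{2}):(\d{2}):(\d{2})(\.\d+)?"
    r"([Zz]|[+-]\d{2}:\d{2})"
)


def is_rfc3339(value):
    """Simplified RFC 3339 check (does not accept leap seconds)."""
    m = RFC3339.fullmatch(value)
    if not m:
        return False
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    try:
        # Let datetime reject impossible values such as February 30.
        datetime(year, month, day, hour, minute, second)
    except ValueError:
        return False
    offset = m.group(8)
    if offset not in ("Z", "z"):
        if int(offset[1:3]) > 23 or int(offset[4:6]) > 59:
            return False
    return True


def validate(csv_text):
    """Return a list of problems found; an empty list means the data passed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = reader.fieldnames or []
    problems = []
    if "Timestamp" not in fields:
        problems.append("missing required column: Timestamp")
    temp_columns = [f for f in fields if AMBIENT_TEMP.fullmatch(f)]
    # Data starts on line 2 because line 1 is the header row.
    for line_no, row in enumerate(reader, start=2):
        ts = row.get("Timestamp")
        if ts is not None and not is_rfc3339(ts):
            problems.append(f"line {line_no}: Timestamp {ts!r} is not RFC 3339")
        for col in temp_columns:
            try:
                float(row[col])
            except (TypeError, ValueError):
                problems.append(f"line {line_no}: {col} {row[col]!r} is not a float")
    return problems
```

Note that a check like this can only verify the type of AmbientTemp_<SensorID>, not that the value is actually in Celsius, which is exactly why the unit has to be fixed by the guidelines themselves.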
Also, since we’re already in the business of issuing data guidelines, I would simply insist on CSV format, for all the reasons already given on the webpage.
Perhaps after collecting more data from different sources, it will be possible to determine the most common fields and create a more formal specification. Though I suspect we already know enough to do this on some level. Should I go ahead and take a stab at this and do a PR?