International Polar Year Data Challenge
Given my attempt to run the North Pole Marathon, I’ve gotten involved in a project called International Polar Year. While this is a major worldwide scientific research project that’s occurring over the next two years, there’s a concentration of scientists in Boulder – including several of the people at the National Snow and Ice Data Center at CU Boulder.
Mark Parsons – one of the guys at NSIDC – is deeply involved in figuring out how to manage all the data associated with IPY. A few weeks ago, we talked about the massive challenge associated with this project. I asked Mark to write up an overview so that we could start to think about who in the tech business might be able to help us deal with the massive amount of data IPY is going to try to organize. Following is his summary:
IPY is perhaps the most multidisciplinary, integrative, international science project ever conceived. The polar regions have a large and increasingly apparent influence on global systems. These influences range from global ocean circulation, which does much to define current climate, to local human adaptations that help us understand human knowledge in fundamental ways. Understanding these processes and influences is a bold challenge. The fact that more than 50,000 investigators from more than 60 countries are seeking to meet that challenge shows that the scope of the inquiry is huge. Yet at a more fundamental level, it is necessary to identify, integrate, and interpret physical, life, and social science data in new ways. This is a science challenge, but it is also a challenge for data management, information science, computer science, and basic human communication.
New paradigms and practical methods are necessary to explore, discover, visualize, and synthesize data and information. This includes technical solutions that allow us to derive new knowledge from the growing mountain of data, but it also includes social solutions, abetted by new technology, that allow us to better share knowledge, coordinate resources, and educate the next generation.
The challenge is all the greater because the data collected during IPY, plus the ongoing data supporting IPY, will be highly distributed. There will be no single central archive, or even a small number of them. Furthermore, the data will be extremely variable in nature, including multi-spectral remote sensing imagery, detailed in-situ measurements of polar flora and fauna, and native-language interviews of Inuit hunters and elders. IPY is actively promoting the use of international data description and transfer standards, but there is a limit to how broadly these can apply to such diverse, distributed, and multilingual data.
In short, the challenges of IPY provide a unique opportunity to test and implement new technologies and methods for interactive data access and human communication. This is essential to sustain the legacy of IPY and ensure an educated populace able to address increasingly complex world problems.
So – Mark and I are on a quest – we are looking for technology companies that are interested in engaging with IPY to try to figure out how to deal with this massive project. Anyone out there (including folks that know how to deal with massive amounts of distributed data – hints to my friends at Google, Microsoft, and Yahoo) interested?