Well before the invention of the computer and the flood of digital data that followed, S.R. Ranganathan described in detail how a library catalog should work and be prepared in his 1939 book, “Theory of Library Catalogue.” Since then, the way we develop and maintain information databases and archives has changed dramatically. The format has expanded beyond traditional books and journals to include large amounts of digital data.
The Office of Coast Survey alone produces digital data on the order of several terabytes a year, much of it collected by NOAA hydrographic survey vessels. To make the data easily accessible to the public, it is formatted and submitted to the National Centers for Environmental Information (NCEI) ocean data archive. Here, the basic principles of library cataloging apply: information should be stored where it belongs so that it can be found quickly.
In 2014, while seeking oceanographic data profiles in the Arctic, Coast Survey made a discovery. Some of the data collected by NOAA hydrographic survey vessels were missing from the oceanographic database at the National Oceanographic Data Center, now the NCEI ocean data archive. A follow-up investigation revealed a breakdown in data stewardship protocols: the oceanographic data had been stored only in the multibeam sonar data archive, which prevented the general public from accessing them directly.
An effort was initiated to recover the data and translate them into a format suitable for searchable databases. Data from a total of 466 hydrographic surveys (acquired from 2005 to 2013) were recovered. The recovered data had been acquired by a host of sensors in various digital formats, with digital metadata in varying stages of completeness, which made data validation a challenge.
Chen Zhang and Mashkoor Malik, along with two interns from the University of Maryland, Walther Rodriguez and Heather Lam, started by processing the profiles manually but soon developed several automated tools to validate and convert the data. When the work was complete, a total of 35,373 oceanographic profiles had been sent to NCEI’s ocean data archive and cataloged appropriately, where researchers from all over the world can easily access them for their own scientific applications.
Coast Survey is grateful to NCEI staff David Fischman, Thomas Carey, and Christopher Paver for their support in recovering, processing, and archiving the oceanographic data.