IsoGenie Database: IsoGenieDB

IsoGenieDB: Overview

Due to the nature of the data collected at Stordalen Mire in Abisko, Sweden, a standard one-size-fits-all database system is insufficient. The data is not only large but also highly complex, spanning data types that include, but are not limited to, soil organic matter chemical spectra; DNA and RNA sequencing data; metaproteomic data; geochemistry; climate; and vegetation surveys. Therefore, IsoGenie postdoctoral fellow Dr. Benjamin Bolduc led the effort to design a database management system, the IsoGenieDB, which allows easy storage and accessibility of data collected by members of the IsoGenie project.

Use and Capacity:

The IsoGenieDB allows data to be connected within and between different biological and non-biological datatypes, a structure fundamental to all types of biological data. Given this “meta”-level awareness, data within the IsoGenieDB can be queried on numerous properties, e.g., retrieving radioisotope data associated with metatranscriptomic samples that show a correlation between cloud cover and population genome abundance. The results of these analyses can reveal underlying or overarching patterns of interaction that often remain invisible within ecosystems because the relationships among data types are incompletely integrated.
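The idea of querying one datatype through its association with another can be sketched in a few lines of Python. This is only a toy illustration, not the IsoGenieDB query interface: the record layout, the field names (`sample_id`, `datatypes`, `value`), the helper `isotope_for_datatype`, and all values are hypothetical placeholders.

```python
# Toy records. Every field name and value is an illustrative placeholder,
# not real IsoGenie data and not the IsoGenieDB schema.
samples = [
    {"sample_id": "S1", "datatypes": {"metatranscriptome", "isotope"}},
    {"sample_id": "S2", "datatypes": {"isotope"}},
    {"sample_id": "S3", "datatypes": {"metatranscriptome"}},
]
isotope_measurements = [
    {"sample_id": "S1", "value": 1.0},
    {"sample_id": "S2", "value": 2.0},
]


def isotope_for_datatype(samples, measurements, datatype):
    """Return isotope measurements whose sample also has data of `datatype`."""
    # Collect the IDs of samples that carry the requested datatype.
    wanted = {s["sample_id"] for s in samples if datatype in s["datatypes"]}
    # Keep only measurements linked to one of those samples.
    return [m for m in measurements if m["sample_id"] in wanted]


result = isotope_for_datatype(samples, isotope_measurements, "metatranscriptome")
```

Here `result` contains only the measurement for `S1`, the one sample that has both isotope and metatranscriptomic data. A real query would add further conditions, such as the correlation filter mentioned above.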

Structure and Design:

The IsoGenieDB is a full software stack that provides storage of raw and processed data, an efficient means of querying the data, and a web-based user interface for private and public dissemination of collected data. It consists of a Neo4j-powered graph database that uses a property graph model to represent and store data generated by the IsoGenie consortium. The schema-free, property graph model design of graph databases allows new datatypes to be included without a priori knowledge of future data.
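To make the property graph model more concrete, the following is a minimal in-memory sketch in Python. It does not use Neo4j and is not the IsoGenieDB implementation. The class names (`Node`, `Relationship`, `PropertyGraph`), the labels (`Sample`, `Metagenome`, `Spectrum`), the relationship type `HAS_DATA`, and the property values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Node:
    node_id: int
    labels: Set[str]
    # Arbitrary key/value properties: no fixed schema is declared up front.
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    start: int
    end: int
    rel_type: str
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Toy property graph: labeled nodes and typed, directed relationships."""

    def __init__(self) -> None:
        self.nodes: Dict[int, Node] = {}
        self.relationships: List[Relationship] = []

    def add_node(self, *labels: str, **properties: Any) -> int:
        node_id = len(self.nodes)
        self.nodes[node_id] = Node(node_id, set(labels), properties)
        return node_id

    def relate(self, start: int, end: int, rel_type: str, **properties: Any) -> None:
        self.relationships.append(Relationship(start, end, rel_type, properties))

    def neighbors(self, node_id: int, rel_type: str) -> List[Node]:
        """Nodes reachable from `node_id` over outgoing `rel_type` relationships."""
        return [
            self.nodes[r.end]
            for r in self.relationships
            if r.start == node_id and r.rel_type == rel_type
        ]


graph = PropertyGraph()
core = graph.add_node("Sample", name="placeholder-core")
seq = graph.add_node("Metagenome", reads="placeholder")
graph.relate(core, seq, "HAS_DATA")

# A datatype that was not anticipated is added later with its own
# properties; no existing node or relationship has to change.
spec = graph.add_node("Spectrum", instrument="placeholder")
graph.relate(core, spec, "HAS_DATA")

linked_labels = sorted(
    label for node in graph.neighbors(core, "HAS_DATA") for label in node.labels
)
```

Because each node carries its own labels and properties, the `Spectrum` node is attached to the existing sample without any migration step, which is the flexibility the schema-free design provides.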