Infrastructure for managing and analyzing biological networks derived from collections of plant images
Abhiram Das1, Alexander Bucksch1,2, Joshua S. Weitz1,3
Goals
• Online storage of cleared leaf images and their metadata.
• Managing and sharing image collections with a trusted community.
• Batch upload and download of images.
• Ingesting processed images from third-party image processing software.
Findings
• Hosted on iPlant and accessible at http://www.clearedleavesdb.org/
• Image collections from the Smithsonian and the University of Western Australia.
[Architecture diagram. Component labels: Islandora; User and Roles; RDF Repository; High-throughput computing nodes; Image processing module; Fedora Commons; Graph Database; Drupal; Cytoscape Web]
Goals
• Online storage and trait computation of plant root images.
• Share root images and computation results with colleagues.
1. Store and manage image data of biological spatial networks
2. Extract and store networks from image data using existing algorithms
3. Aggregate, analyze and visualize networks
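As a rough illustration of the third objective, the sketch below stores an extracted spatial network as an adjacency map and computes a few structural statistics. It is a minimal example under stated assumptions, not the platform's implementation: the function names `build_network` and `summarize_network`, the use of (x, y) pixel coordinates as node identifiers, and the toy edge list are all hypothetical.

```python
from collections import defaultdict
from math import hypot


def build_network(edges):
    """Build an undirected adjacency map from (node_a, node_b) pairs.

    Nodes are assumed to be (x, y) pixel coordinates (an illustrative choice).
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def summarize_network(adjacency):
    """Return simple structural statistics for an undirected spatial network."""
    degrees = {node: len(neighbors) for node, neighbors in adjacency.items()}
    # Every undirected edge is visited from both endpoints, so halve the sum.
    total_length = sum(
        hypot(a[0] - b[0], a[1] - b[1])
        for a, neighbors in adjacency.items()
        for b in neighbors
    ) / 2
    return {
        "nodes": len(adjacency),
        "edges": sum(degrees.values()) // 2,
        "tips": sum(1 for d in degrees.values() if d == 1),
        "branch_points": sum(1 for d in degrees.values() if d >= 3),
        "total_length": total_length,
    }


# Hypothetical toy network: one main axis with a single side branch.
toy_edges = [
    ((0, 0), (0, 3)),
    ((0, 3), (0, 6)),
    ((0, 3), (4, 6)),
]
summary = summarize_network(build_network(toy_edges))
print(summary)
```

The same summary could be computed for leaf venation or root networks alike, which is the sense in which a common analysis framework can serve different problem domains.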
Findings
• Currently being used to investigate the physiological and genetic relevance of computed traits.
• For example, water-stressed monocot roots have many shallow-angle roots.
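A trait such as the number of shallow-angle roots can be sketched as a simple count over per-root angles. The example below is only an illustration: the angle convention (degrees from the horizontal), the 30-degree threshold, the function names and the sample values are all assumptions made here, not DIRT's actual trait definition or measured data.

```python
def count_shallow_roots(angles_deg, threshold_deg=30.0):
    """Count roots whose angle from the horizontal is below threshold_deg.

    Both the angle convention and the default threshold are assumptions.
    """
    return sum(1 for angle in angles_deg if abs(angle) < threshold_deg)


def shallow_fraction(angles_deg, threshold_deg=30.0):
    """Return the fraction of roots classified as shallow (0.0 if no roots)."""
    if not angles_deg:
        return 0.0
    return count_shallow_roots(angles_deg, threshold_deg) / len(angles_deg)


# Made-up angle lists for two hypothetical root images (not measurements).
sample_a = [10.0, 15.0, 22.0, 28.0, 55.0]
sample_b = [12.0, 48.0, 60.0, 71.0, 80.0]

print(count_shallow_roots(sample_a), shallow_fraction(sample_a))
print(count_shallow_roots(sample_b), shallow_fraction(sample_b))
```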
Background
This work is supported by:
iRODS data grid (File System)
Application 2: DIRT – Digging Into Root Traits
Objectives
Fewer shallow-angle roots
[Architecture diagram. Component labels: MySQL; PHP; Web Services; Custom Modules; Image Processing; Leaf and Root Image Database; iRODS Client; Web Client]
Goals
1. Build an integrated platform to store heterogeneous plant image data, extract phenotypic data from images, process and analyze the data, and map it to genotype.
2. Design an ontology content model to store heterogeneous and evolving plant phenotypic data.
3. Provide a platform to plug in new image processing modules and allow end users to build their own processing pipelines from existing modules.
4. Analyze and visualize biological networks.
5. Integrate QTL and GWAS analysis pipelines to map phenotype to genotype.

1. The platform should scale to high volumes of image data and metadata at terabyte scale.
2. It should scale to computation over a large number of data points.
3. It should have intuitive user interfaces.
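The goal of pluggable image processing modules that users chain into their own pipelines can be sketched with a small registry. This is a hedged illustration only: the `MODULES` registry, the `register` decorator, `build_pipeline`, and the two toy modules are hypothetical names introduced here, and images are represented as plain nested lists of grayscale values rather than any format the platform actually uses.

```python
from typing import Callable, Dict, List

# An "image" is a 2D list of grayscale values in 0..255 (illustrative only).
Image = List[List[int]]
Module = Callable[[Image], Image]

# Hypothetical registry mapping module names to processing functions.
MODULES: Dict[str, Module] = {}


def register(name: str) -> Callable[[Module], Module]:
    """Decorator that plugs a processing function into the registry."""
    def decorator(func: Module) -> Module:
        MODULES[name] = func
        return func
    return decorator


@register("invert")
def invert(image: Image) -> Image:
    """Invert grayscale intensities."""
    return [[255 - pixel for pixel in row] for row in image]


@register("threshold")
def threshold(image: Image) -> Image:
    """Binarize the image at a fixed cutoff of 128."""
    return [[255 if pixel >= 128 else 0 for pixel in row] for row in image]


def build_pipeline(names: List[str]) -> Module:
    """Compose registered modules, in order, into a single callable."""
    steps = [MODULES[name] for name in names]  # raises KeyError if unknown

    def run(image: Image) -> Image:
        for step in steps:
            image = step(image)
        return image

    return run


# A user-defined pipeline built from existing modules.
pipeline = build_pipeline(["invert", "threshold"])
result = pipeline([[0, 100], [200, 255]])
print(result)
```

Keeping modules behind a name-based registry means a new module only needs to be registered once to become available to every user-built pipeline.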
Apache Web Server
Drupal Core
Image Processing Workflows
Challenges
Many shallow-angle roots
System Architecture
Drupal
Recent advances in imaging techniques have produced large collections of images and metadata in different fields of science. In particular, plant phenomics is generating large datasets that warrant approaches for high-throughput data collection, management, processing and analysis. Quantifying plant phenotypes requires the development of novel informatics tools that span image analysis, trait estimation, workflow design and data integration. A number of software tools have been developed recently to estimate plant traits [1-4]. A central concept unifying many of these tools is that estimates of plant traits are derived from the reconstruction of networks. However, significant work remains to (i) develop trait estimation algorithms and workflows suitable for field studies; and (ii) integrate trait estimation with formal data integration methods to enable discovery of trends and patterns spanning multiple experiments.
Integrated platform for plant image analysis
Application 1: Cleared Leaf Image Database
STORAGE
Currently available imaging technologies permit rapid acquisition of high-resolution images of spatial networks in biology, e.g., leaf venation networks, cardiovascular networks, cortical networks, root networks and ant trails. Hence, scientists interested in spatial networks are increasingly analysis-limited rather than data-limited. The challenges in analyzing these data include how to: (i) identify a network from image data (whether 2D, 3D or point clouds); (ii) visualize structural properties of large, complex biological networks together with their metadata; (iii) utilize a common analysis framework for different networks in different problem domains; and (iv) distribute the results and raw data of spatial network analysis to the community. At present, many systems are available for biological image management [5], but they lack automation and analysis capabilities designed for spatial networks. We therefore bring together several state-of-the-art tools and technologies to build a web-based platform specifically designed to meet the following objectives.
Proposed Architecture
PROCESSING
Introduction
Current Platform and Applications for Plant Image Analysis
CLIENT
Abstract
1School of Biology, 2School of Interactive Computing and 3School of Physics, Georgia Institute of Technology
Plant Ontology Content Model
References
Web Service Client
JAVA
Standalone Client
• College of Sciences Research Development Grant
• Center for Data Analytics Seed Grant
1. Le Bot J, et al. (2010) DART: a software to analyse root system architecture and development from captured images. Plant and Soil 326(1-2):261-273.
2. Clark RT, et al. (2011) Three-dimensional root phenotyping with a novel imaging and software platform. Plant Physiology 156(2):455-465.
3. Galkovskyi T, et al. (2012) GiA Roots: Software for the high throughput analysis of plant root system architecture. BMC Plant Biology 12(116).
4. Trachsel S, Kaeppler SM, Brown KM, & Lynch JP (2011) Shovelomics: high throughput phenotyping of maize (Zea mays L.) root architecture in the field. Plant and Soil 341(1-2):75-87.
5. Kvilekval K, Fedorov D, Obara B, Singh A, & Manjunath BS (2010) Bisque: a platform for bioimage analysis and management. Bioinformatics 26(4):544-552.

http://ecotheory.biology.gatech.edu