
Infrastructure for managing and analyzing biological networks derived from collections of plant images

Abhiram Das1, Alexander Bucksch1,2, Joshua S. Weitz1,3

Goals
•  Online storage of cleared leaf images and their metadata.
•  Managing and sharing image collections with a trusted community.
•  Batch upload and download of images.
•  Ingesting processed images from third-party image processing software.

Findings
•  Hosted on iPlant and accessible at http://www.clearedleavesdb.org/
•  Image collections from the Smithsonian and the University of Western Australia.

Islandora

User and Roles

RDF Repository

High throughput computing nodes Image processing module

Fedora Commons

Graph Database

Drupal

Cytoscape web

Goals
•  Online storage and trait computation of plant root images.
•  Share root images and computation results with colleagues.

1.  Store and manage image data of biological spatial networks
2.  Extract and store networks from image data using existing algorithms
3.  Aggregate, analyze and visualize networks
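As a rough illustration of objectives 2 and 3, the following minimal sketch turns a binary skeleton image into a graph and classifies its nodes. It is not the platform's extraction algorithm: the function names, the nested-list image representation and the use of 4-connectivity are all assumptions made to keep the example small.

```python
from collections import defaultdict

# Assumption: 4-connectivity (up, down, left, right) keeps junctions unambiguous.
NEIGHBOURS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def skeleton_to_graph(skeleton):
    """Build an adjacency map from a binary grid (1 = network pixel)."""
    rows, cols = len(skeleton), len(skeleton[0])
    graph = defaultdict(set)
    for r in range(rows):
        for c in range(cols):
            if not skeleton[r][c]:
                continue
            graph[(r, c)]  # register the pixel even if it has no neighbours
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and skeleton[nr][nc]:
                    graph[(r, c)].add((nr, nc))
    return graph


def classify_nodes(graph):
    """Return (tips, junctions): nodes with one neighbour and with three or more."""
    tips = sorted(n for n, nb in graph.items() if len(nb) == 1)
    junctions = sorted(n for n, nb in graph.items() if len(nb) >= 3)
    return tips, junctions


# A small T-shaped skeleton: one junction and three tips.
T_SHAPE = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
]
graph = skeleton_to_graph(T_SHAPE)
tips, junctions = classify_nodes(graph)
edge_count = sum(len(nb) for nb in graph.values()) // 2
```

Real skeletons usually need 8-connectivity plus pruning of spurious branches, which this sketch deliberately leaves out.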

Findings
•  Currently being used to investigate the physiological and genetic relevance of computed traits.
•  For example, water-stressed monocot roots have many shallow-angle roots.
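To make the notion of a shallow-angle root concrete, here is a minimal sketch that measures the angle of each root segment against the horizontal and counts those below a cutoff. The function names, the endpoint-based angle definition and the 30-degree threshold are hypothetical; DIRT's own trait definitions are not given here and may differ.

```python
import math


def root_angle_deg(start, end):
    """Angle between a root segment and the horizontal (0 = flat, 90 = vertical)."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    return math.degrees(math.atan2(abs(dy), abs(dx)))


def count_shallow(roots, threshold_deg=30.0):
    """Count segments whose angle is below the (assumed) shallow threshold."""
    return sum(1 for start, end in roots if root_angle_deg(start, end) < threshold_deg)


# Hypothetical root segments as ((x0, y0), (x1, y1)) in image coordinates.
roots = [
    ((0, 0), (10, 1)),   # nearly horizontal
    ((0, 0), (1, 10)),   # nearly vertical
    ((0, 0), (-10, 2)),  # shallow, growing to the left
    ((0, 0), (5, 5)),    # 45 degrees
]
shallow = count_shallow(roots)
```

Taking absolute values of both offsets makes the angle independent of whether a root grows left or right of the stem.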

Background

This work is supported by:

iRODS data grid (File System)

Application 2: DIRT – Digging Into Root Traits

Objectives

Fewer shallow-angle roots

MySQL

PHP

Web Services

Custom Modules

Image Processing

Leaf and Root Image Database

iRODS Client

Web Client

Goals
1.  Build an integrated platform to store heterogeneous plant image data, extract phenotypic data from images, process and analyze the data, and map it to genotype.
2.  Design an ontology content model to store heterogeneous and evolving plant phenotypic data.
3.  Provide a platform to plug in new image processing modules and allow end users to build their own processing pipelines from existing modules.
4.  Analyze and visualize biological networks.
5.  Integrate QTL and GWAS analysis pipelines to map phenotype to genotype.

Apache Web Server Drupal Core

Image Processing Workflows

Challenges
1.  The platform should scale to high volumes of image data and metadata at the terabyte scale.
2.  It should scale to computation over a large number of data points.
3.  It should have intuitive user interfaces.
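The goal of letting end users assemble their own processing pipelines from pluggable modules can be sketched as a small registry. This is only an illustration in Python, not the platform's Drupal/PHP implementation; `MODULES`, `register`, `build_pipeline` and the two toy modules are hypothetical names.

```python
# Hypothetical registry mapping module names to processing functions.
MODULES = {}


def register(name):
    """Decorator that adds a processing function to the registry under `name`."""
    def decorator(func):
        MODULES[name] = func
        return func
    return decorator


@register("threshold")
def threshold(image, level=128):
    """Binarize a grayscale image given as nested lists of intensities."""
    return [[1 if px >= level else 0 for px in row] for row in image]


@register("invert")
def invert(image):
    """Swap foreground and background of a binary image."""
    return [[1 - px for px in row] for row in image]


def build_pipeline(step_names):
    """Compose registered modules, in order, into a single callable."""
    steps = [MODULES[name] for name in step_names]  # KeyError on unknown module

    def run(image):
        for step in steps:
            image = step(image)
        return image

    return run


pipeline = build_pipeline(["threshold", "invert"])
result = pipeline([[0, 200], [130, 90]])
```

A new module only needs to be registered under a name; existing pipelines are unaffected, which is the property the plug-in goal asks for.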

Many shallow-angle roots

System Architecture

Drupal

Recent advances in imaging techniques have produced large collections of images and metadata in different fields of science. In particular, plant phenomics is generating large datasets that warrant approaches for high-throughput data collection, management, processing and analysis. Quantifying plant phenotypes requires the development of novel informatics tools that span image analysis, trait estimation, workflow design and data integration. A number of software tools have been developed recently to estimate plant traits [1-4]. A central concept unifying many of these tools is that estimates of plant traits are derived from reconstruction of networks. However, significant work remains to (i) develop trait estimation algorithms and workflows suitable for field studies; (ii) integrate trait-estimation with formal data integration methods to enable discovery of trends and patterns spanning multiple experiments.

Integrated platform for plant image analysis

Application 1: Cleared Leaf Image Database

STORAGE

Currently available imaging technologies permit rapid acquisition of high-resolution images of spatial networks in biology, e.g., leaf venation networks, cardiovascular networks, cortical networks, root networks and ant trails. Hence, scientists interested in spatial networks are increasingly analysis-limited rather than data-limited. The challenges in analyzing these data include how to: (i) identify a network from image data (whether 2D, 3D or point clouds); (ii) visualize structural properties of large, complex biological networks with metadata; (iii) utilize a common analysis framework for different networks in different problem domains; (iv) distribute the results and raw data of spatial network analysis to the community. At present, many systems are available for biological image management [5], but they lack automation and analysis capabilities designed for spatial networks. We therefore bring together a few state-of-the-art tools and technologies to build a web-based platform specifically designed to meet the following objectives.

Proposed Architecture

PROCESSING

Introduction

Current Platform and Applications for Plant Image Analysis

CLIENT

Abstract

1School of Biology, 2School of Interactive Computing and 3School of Physics, Georgia Institute of Technology

Plant Ontology Content Model

Web Service Client

JAVA

Standalone Client

College of Sciences Research Development Grant
Center for Data Analytics Seed Grant

References
1. Le Bot J, et al. (2010) DART: a software to analyse root system architecture and development from captured images. Plant and Soil 326(1-2):261-273.
2. Clark RT, et al. (2011) Three-dimensional root phenotyping with a novel imaging and software platform. Plant Physiology 156(2):455-465.
3. Galkovskyi T, et al. (2012) GiA Roots: Software for the high throughput analysis of plant root system architecture. BMC Plant Biology 12(116).
4. Trachsel S, Kaeppler SM, Brown KM, & Lynch JP (2011) Shovelomics: high throughput phenotyping of maize (Zea mays L.) root architecture in the field. Plant and Soil 341(1-2):75-87.
5. Kvilekval K, Fedorov D, Obara B, Singh A, & Manjunath BS (2010) Bisque: a platform for bioimage analysis and management. Bioinformatics 26(4):544-552.

http://ecotheory.biology.gatech.edu