The iPlant Collaborative, renamed Cyverse in 2017, is a virtual organization created by a cooperative agreement funded by the US National Science Foundation (NSF) to build cyberinfrastructure for the plant sciences (botany).[1] The NSF compared cyberinfrastructure to physical infrastructure, describing it as "... the distributed computer, information and communication technologies combined with the personnel and integrating components that provide a long-term platform to empower the modern scientific research endeavor".[2] In September 2013, it was announced that the NSF had renewed iPlant's funding for a second five-year term and expanded its scope to all non-human life science research.[3]

Cyverse (formerly iPlant Collaborative)
Type of site: Scientific support
Available in: English
URL: cyverse.org
Commercial: No
Launched: 2008

The project develops computing systems and software that combine computing resources, such as those of TeraGrid, with bioinformatics and computational biology software. Its goal is to make collaboration among researchers easier by improving data access and processing efficiency. Although centered primarily in the United States, it collaborates internationally.

History

Biology relies increasingly on computers.[4] Plant biology is changing with the rise of new technologies.[5] With the advent of bioinformatics, computational biology, DNA sequencing, geographic information systems and other technologies, computers can greatly assist researchers who study plant life and seek solutions to challenges in medicine, biofuels, biodiversity, and agriculture, including problems such as drought tolerance, plant breeding, and sustainable farming.[6] Many of these problems cross traditional disciplines, so facilitating collaboration among plant scientists of diverse backgrounds and specialties is necessary.[6][7][8]

In 2006, the NSF solicited proposals to create "a new type of organization – a cyberinfrastructure collaborative for plant science" through a program titled "Plant Science Cyberinfrastructure Collaborative" (PSCIC), with Christopher Greer as program director.[9] A proposal was accepted (adopting the convention of using the word "Collaborative" as a noun) and iPlant was officially created on February 1, 2008.[1][9] Funding was estimated at $10 million per year over five years.[10]

Richard Jorgensen led the team through the proposal stage and was the principal investigator (PI) from 2008 to 2009.[10] Gregory Andrews, Vicki Chandler, Sudha Ram and Lincoln Stein served as Co-Principal Investigators (Co-PIs) from 2008 to 2009. In late 2009, Stephen Goff was named PI and Daniel Stanzione was added as a Co-PI.[1][11][12] As of May 2014, Co-PI Stanzione had been replaced by four new Co-PIs: Doreen Ware at Cold Spring Harbor Laboratory, Nirav Merchant and Eric Lyons at the University of Arizona, and Matthew Vaughn at the Texas Advanced Computing Center.[13]

The iPlant project supports what has been called e-Science, which is a use of information systems technology that is being adopted by the research community in efforts such as the National Center for Ecological Analysis and Synthesis (NCEAS), ELIXIR,[14] and the Bamboo Technology Project that started in September 2010.[15][16] iPlant is "designed to create the foundation to support the computational needs of the research community and facilitate progress toward solutions of major problems in plant biology."[6][17]

The project works as a collaboration. It seeks input from the wider plant science community on what to build.[18] Based on that input, it has enabled easier use of large data sets,[19] created a community-driven research environment for sharing existing data collections within and between research areas,[20] and shared data with provenance tracking.[21][22] One model studied for collaboration was Wikipedia.[23][24]

Several more recent National Science Foundation awards mentioned iPlant explicitly in their descriptions, as either a design pattern to follow or a collaborator with whom the recipient will work.[25]

Institutions

The primary institution for the iPlant project is the University of Arizona, where the project is housed within the BIO5 Institute in Tucson.[26] Since its inception in 2008, personnel have worked at other institutions including Cold Spring Harbor Laboratory, the University of North Carolina, Wilmington, and the University of Texas at Austin in the Texas Advanced Computing Center.[27] Purdue University and Arizona State University were part of the original project group.[10]

Other collaborating institutions that received support from iPlant for their work on a Grand Challenge in phylogenetics starting in March 2009 included Yale University, University of Florida, and the University of Pennsylvania.[27] A trait evolution group was led at the University of Tennessee.[28] A visualization workshop employing iPlant was run by Virginia Tech in 2011.[29]

The NSF requires that funding subcontracts stay within the United States, but international collaboration started in 2009 with the Technical University Munich[27] and continued in 2010 with the University of Toronto.[29][30] East Main Evaluation & Consulting provides external oversight, advice, and assistance.[31]

Services

The iPlant project makes its cyberinfrastructure available in several different ways and offers services to make it accessible to its primary audience. The design was meant to grow in response to the needs of the research community it serves.[6]

The Discovery Environment

The Discovery Environment integrates community-recommended software tools into a system that can handle terabytes of data and uses high-performance supercomputers to run analyses much more quickly. Its interface is designed to hide this complexity from the end user. The goal was to make the cyberinfrastructure available to non-technical end users who are less comfortable using a command-line interface.[6][32]

iPlant Foundational APIs

A set of application programming interfaces (APIs) gives developers access to iPlant services, including authentication, data management, and high-performance supercomputing resources, from custom, locally produced software.[6][33]

Atmosphere

Atmosphere is a cloud computing platform that provides easy access to pre-configured, frequently used analysis routines, relevant algorithms, and data sets, and accommodates computationally and data-intensive bioinformatics tasks.[6] It uses the Eucalyptus virtualization platform.[34][35]

iPlant Semantic Web

The iPlant Semantic Web effort uses an iPlant-created architecture, protocol, and platform called the Simple Semantic Web Architecture and Protocol (SSWAP) for semantic web linking using an ontology focused on plant science.[6][36][37] SSWAP is based on the notion of RESTful web services, with an ontology based on the Web Ontology Language (OWL).[38][39]

Taxonomic Name Resolution Service

Screen shot of the DNA Subway tool

The Taxonomic Name Resolution Service (TNRS) is a free utility for correcting and standardizing plant names. This is needed because plant names that are misspelled, out of date (because a newer synonym is preferred), or incomplete make it hard to use computers to process large lists.[6][40][41]
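
The kind of correction described above can be illustrated with a minimal Python sketch. It is not the TNRS implementation, which the sources cited here do not describe in detail; the reference names, the `resolve_name` function, and the use of the standard library's `difflib` for fuzzy matching are all assumptions made for illustration.

```python
import difflib

# Hypothetical reference data for illustration only; the real TNRS
# draws on much larger taxonomic sources.
ACCEPTED = {"Zea mays", "Arabidopsis thaliana", "Solanum lycopersicum"}
SYNONYMS = {"Lycopersicon esculentum": "Solanum lycopersicum"}


def resolve_name(raw, cutoff=0.8):
    """Return (accepted_name, note); accepted_name is None when nothing matches."""
    parts = raw.split()
    if not parts:
        return None, "no match"
    # Standardize case: capitalized genus, lowercase remaining words.
    cleaned = " ".join([parts[0].capitalize()] + [p.lower() for p in parts[1:]])

    known = sorted(ACCEPTED | set(SYNONYMS))
    if cleaned in known:
        matched, note = cleaned, "exact match"
    else:
        # Fuzzy matching via difflib stands in for whatever the real service uses.
        close = difflib.get_close_matches(cleaned, known, n=1, cutoff=cutoff)
        if not close:
            return None, "no match"
        matched, note = close[0], "fuzzy match"

    # Replace an out-of-date synonym with its accepted name.
    if matched in SYNONYMS:
        return SYNONYMS[matched], note + ", synonym replaced"
    return matched, note


for name in ["zea mais", "Lycopersicon esculentum", "Quercus alba"]:
    print(name, "->", resolve_name(name))
```

In this sketch a misspelled name is matched to the closest known name, an out-of-date synonym is mapped to its accepted name, and a name absent from the small reference list is reported as unmatched rather than guessed.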

My-Plant

My-Plant.org is a social networking community for plant biologists, educators and others to come together to share information and research, collaborate, and track the latest developments in plant science.[6][42] The My-Plant network uses the term "clades" for its user groups, which are organized in a manner similar to the phylogeny of plants themselves.[42] It was implemented using Drupal as its content management system.[42]

DNA Subway

The DNA Subway website uses a graphical user interface (GUI) to generate DNA sequence annotations, explore plant genomes for members of gene and transposon families, and conduct phylogenetic analyses. It makes high-level DNA analysis available to faculty and students by simplifying annotation and comparative genomics workflows.[6][43] It was developed for iPlant by the Dolan DNA Learning Center.[44][45]

References

  1. ^ a b c "PSCIC Full Proposal: The iPlant Collaborative: A Cyberinfrastructure-Centered Community for a New Plant Biology". Award Abstract #0735191. National Science Foundation. August 22, 2011. Retrieved September 21, 2011.
  2. ^ Revolutionizing Science and Engineering Through Cyberinfrastructure: Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure. National Science Foundation. January 15, 2003. Retrieved September 18, 2011.
  3. ^ Renewal announcement on iPlant news website. http://news.iplantcollaborative.org/?p=212. Retrieved May 27, 2014.
  4. ^ Stein, Lincoln (September 2008). "Towards a cyberinfrastructure for the biological sciences: progress, visions and challenges". Nature Reviews Genetics. 9 (9): 678–688. doi:10.1038/nrg2414. PMID 18714290. S2CID 339653.
  5. ^ Dilworth, Machi F (2009). "Perspective: Plant biology—A quiet pioneer". Plant Biotechnology. 26 (2): 183–187. doi:10.5511/plantbiotechnology.26.183.
  6. ^ a b c d e f g h i j k Goff, Stephen A.; et al. (2011). "The iPlant Collaborative: Cyberinfrastructure for Plant Biology". Frontiers in Plant Science. 2: 34. doi:10.3389/fpls.2011.00034. ISSN 1664-462X. PMC 3355756. PMID 22645531.
  7. ^ Walsh, Lorraine; Khan, Peter E. (2010), Collaborative working in higher education: the social academy, New York: Routledge, p. 37, ISBN 978-0-415-99167-4
  8. ^ Peter Arzberger (March 24, 2010). "Biological Sciences and Cyberinfrastructure for the 21st Century" (PDF). Coalition for Academic Scientific Computation presentation. Retrieved September 29, 2011.
  9. ^ a b "Plant Science Cyberinfrastructure Collaborative (PSCIC)". Program Solicitation 06-594. National Science Foundation. November 30, 2006. Retrieved September 21, 2011.
  10. ^ a b c "National Science Foundation Awards $50 Million for Collaborative Plant Biology Project to Tackle Greater Science Questions". News release. National Science Foundation. January 30, 2008. Retrieved September 21, 2011.
  11. ^ "Stephen Goff, Ph.D." Staff biography page from iPlant web site. Archived from the original on November 28, 2011. Retrieved September 21, 2011.
  12. ^ "Dan Stanzione, Ph.D." Staff biography page from Texas Advanced Computing Center web site. Retrieved September 21, 2011.
  13. ^ "Leadership". Leadership page from iPlant website. Retrieved May 27, 2014.
  14. ^ "ELIXIR: Data for Life". Official web site. Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, United Kingdom: EMBL-European Bioinformatics Institute. Retrieved September 28, 2011.
  15. ^ "About Bamboo". Project Bamboo website. Archived from the original on October 9, 2011. Retrieved September 27, 2011.
  16. ^ Brown, Susan; Thatcher, Sherry (2011). "Factors Influencing Adoption And Non-Adoption Of Cyberinfrastructure By The Research Community". Pacific Asia Conference on Information Systems Proceedings. ISBN 978-1-86435-644-1.
  17. ^ Steve Goff (September 18, 2008). "The iPlant Collaborative: A Cyberinfrastructure-Centered Community of Plant and Computing Scientists" (PDF). New Phytologist Symposium presentation. Archived from the original (PDF) on April 22, 2012. Retrieved September 29, 2011.
  18. ^ Heidi Ledford (June 24, 2009). "Cyberinfrastructure: Feed me data". Nature. 459 (7250): 1047–1049. doi:10.1038/4591047a. PMID 19553968. The iPlant programme was designed to give plant scientists a new information infrastructure. But first they had to decide what they wanted...
  19. ^ Jordan, Chris; Stanzione, Dan; Ware, Doreen; Lu, Jerry; Noutsos, Christos (2010). "Comprehensive data infrastructure for plant bioinformatics" (PDF). 2010 IEEE International Conference on Cluster Computing Workshops and Posters (CLUSTER WORKSHOPS). pp. 1–5. doi:10.1109/CLUSTERWKSP.2010.5613093. ISBN 978-1-4244-8395-2. S2CID 2278398.
  20. ^ Moore, Reagan W.; et al. (May 19, 2009). "Policy-Based Distributed Data Management Systems". 4th International Conference on Open Repositories. Georgia Institute of Technology. Retrieved September 28, 2011.
  21. ^ Ram, Sudha; Jun Liu (2010). "Provenance Management in BioSciences". Advances in Conceptual Modeling – Applications and Challenges. Vol. 6413. Springer Berlin / Heidelberg. pp. 54–64. doi:10.1007/978-3-642-16385-2_8. ISBN 978-3-642-16384-5.
  22. ^ Brown, Susan A.; Thatcher, Sherry; Dang, Yan (2010). "Managing Knowledge in a Changing Scientific Landscape: The Impact of Cyberinfrastructure". 2010 43rd Hawaii International Conference on System Sciences. pp. 1–10. doi:10.1109/HICSS.2010.263. ISBN 978-1-4244-5509-6. S2CID 1132763.
  23. ^ Thomas Veneklasen (March 11, 2010). "Who does what on Wikipedia?". e! Science News. Retrieved September 29, 2011.
  24. ^ Jun Liu; Sudha Ram (December 2009). "Who Does What: Collaboration Patterns in the Wikipedia and Their Impact on Data Quality". 19th Workshop on Information Technologies and Systems: 175–180. SSRN 1565682.
  25. ^ "Award Search—Awardee Information". NSF website search. National Science Foundation. Retrieved September 28, 2011. See abstracts for awards #0849861, #0923975, #0940841, #0953184, #1027542, #1031416, #1032105, #1126481, #1126998.
  26. ^ "UA-Led Research Team Awarded $50 Million to Solve Plant Biology's Grand Challenges". News release. University of Arizona. January 30, 2008. Retrieved September 23, 2011.
  27. ^ a b c "iPlant Moving Forward on Grand Challenge Collaborations". iPlant Leaflet. Vol. 09–2. April 9, 2009. Retrieved September 25, 2011.
  28. ^ "Trait Evolution". iPlant web site. Retrieved September 25, 2011.
  29. ^ a b "Integration and Visualization Workshop". iPlant Leaflet. Vol. 11–2. June 8, 2011. Retrieved September 25, 2011.
  30. ^ "Computing for Life: Scientists depend on advanced computing to better understand evolution, drug discovery and genetics". University of Texas at Austin. May 31, 2010. Archived from the original on October 2, 2011. Retrieved September 25, 2011.
  31. ^ "Current Projects". EMEC website. Retrieved February 21, 2013.
  32. ^ Matthew Helmke (August 19, 2011). "The iPlant Discovery Environment". Discovery Environment manual 0.4. Retrieved September 28, 2011.
  33. ^ Dooley, Rion (July 19, 2011). "An API To Feed the World". TeraGrid 2011: Extreme Digital Discovery. Salt Lake City, Utah. Archived from the original (PDF) on September 19, 2012. Retrieved September 28, 2011.
  34. ^ Seung-jin, Kim; et al. (November 1, 2010). "Facilitating access to customized computational infrastructure for plant sciences: Atmosphere cloudfront" (PDF). 2nd IEEE International Conference on Cloud Computing Technology and Science. Indianapolis, Indiana. Retrieved September 28, 2011.
  35. ^ Juan Antonio Raygoza Garay; Sonya Lowry; John Wregglesworth (May 3, 2011). "Enabling Plant Sciences Research with the iPlant Discovery Environment and Condor" (PDF). Condor Week presentation. University of Wisconsin Computer Science Department. Retrieved September 28, 2011.
  36. ^ Gessler, D. D.; et al. (September 23, 2009). "SSWAP: A Simple Semantic Web Architecture and Protocol for semantic web services". BMC Bioinformatics. 10 (309): 309. doi:10.1186/1471-2105-10-309. PMC 2761904. PMID 19775460.
  37. ^ Nelson, R. T.; et al. (June 4, 2010). "Applications and methods utilizing the Simple Semantic Web Architecture and Protocol (SSWAP) for bioinformatics resource discovery and disparate data and service integration". BioData Mining. 3 (1): 3. doi:10.1186/1756-0381-3-3. PMC 2894815. PMID 20525377.
  38. ^ Pautasso, Cesare; Wilde, Erik; Alarcon, Rosa (December 4, 2013). REST: Advanced Research Topics and Practical Applications. Springer Science & Business Media. pp. 76–77. ISBN 9781461492993. Retrieved June 27, 2016.
  39. ^ Barros, Alistair; Oberle, Daniel (March 2, 2012). Handbook of Service Description: USDL and Its Methods. Springer Science & Business Media. p. 169. ISBN 9781461418641. Retrieved June 27, 2016.
  40. ^ Narro, Martha L.; et al. (August 2011). "The TNRS: a taxonomic name resolution service for plants". Plant Biology. Minneapolis. Archived from the original on May 16, 2011. Retrieved September 14, 2011.
  41. ^ John Whitfield (June 13, 2011). "Species spellchecker fixes plant glitches: Online tool should weed out misspellings and duplications". Nature. 474 (7351): 263. doi:10.1038/474263a. PMID 21677719.
  42. ^ a b c Hanlon, Matthew R.; Mock, Stephen; Nuthulapati, Praveen; Gonzales, Michael B.; Soltis, Pamela; Soltis, Douglas; Majure, Lucas C.; Payton, Adam; Mishler, Brent; Tremblay, Susan; Madsen, Thomas; Olmstead, Richard; McCourt, Richard; Wojciechowski, Martin; Merchant, Nirav (2010). "My-Plant.org: A phylogenetically structured social network". 2010 Gateway Computing Environments Workshop (GCE). pp. 1–8. doi:10.1109/GCE.2010.5676118. ISBN 978-1-4244-9751-5. S2CID 12621375.
  43. ^ Schaeffer, Mary; et al. (2011). "MaizeGDB: curation and outreach go hand-in-hand". Database: The Journal of Biological Databases and Curation. 2011. Oxford University Press: bar022. doi:10.1093/database/bar022. PMC 3104940. PMID 21624896.
  44. ^ "DNA Subway: Fast Track to Gene Annotation and Genome Analysis". Dolan DNA Learning Center website. Retrieved September 29, 2011.
  45. ^ Uwe Hilgert (July 9, 2011). "DNA Subway Places Students On Fast Track To Plant Genome Analysis and DNA BarCoding". Botany 2011: Healing the Planet (workshop). Retrieved September 29, 2011.