Target
Mission statement: Development of Big Data information systems
Location: University of Groningen, Netherlands
Established: January 2009
Funding: European Fund for Regional Development & partners
Website: www.rug.nl/target

Target is an ongoing public-private project in the area of large-scale data management and information systems. It is run by a consortium of ten academic and IT industry partners coordinated by the University of Groningen, the Netherlands. Target conducts research and development focused on the design of intelligent information systems that can efficiently process data and extract information from extremely large and structurally diverse datasets. Target has set up an expertise center in Groningen for:

  • data management of (inter)national science projects in the area of astronomy, life sciences, artificial intelligence, medical diagnosis and more
  • initiation of market-driven R&D activities in collaboration with IT businesses that can lead to competitive Big Data solutions and applications.

The Target infrastructure is part of one of the largest academic computer centers in the Netherlands. Hosted by the Donald Smits Center for Information Technology at the University of Groningen, the Target facilities consist of more than 10 petabytes of GPFS-based storage, a high-performance computing cluster and a grid cluster that is part of the Big Grid and the European Grid Infrastructure.

History

The last few decades have seen a steep increase in the efficiency, diversity and availability of data collection and data storage technologies. As a result, many local, national and international projects in the natural, physical and social sciences have become very data-intensive, with significant intellectual and financial resources dedicated to effective data manipulation, data mining and data analysis. The OmegaCEN research group at the Kapteyn Astronomical Institute (University of Groningen) has for many years worked in the field of data handling for massive datasets of wide-field astronomical image data. The group developed a scalable distributed environment called Astro-WISE,[1] which is based on an approach to data management that differs considerably from the classical technique of predefined processing pipelines.[2] This approach links data processing, analysis and storage into an intelligent information system that enables a continuous and controlled integration of improved data management techniques.[3] Such information systems are, in principle, highly scalable and provide a virtual research environment designed to enhance collaboration.

The success of Astro-WISE, together with ongoing efforts in the northern part of the Netherlands to turn new academic knowledge quickly into innovative, high-impact applications, led to the establishment of Target. The project was launched in 2009 after it received 32 million euros in funding for a period of five years from the European Fund for Regional Development, the Dutch Ministry of Economic Affairs (Pieken in de Delta), and the provinces of Groningen and Drenthe. Supported by the Northern Netherlands Provinces Alliance (SNN) and the Groningen municipality, Target operates under the auspices of Sensor Universe.

Target Partners

 
The Target data center is hosted by the Donald Smits Center for Information Technology, located at the University of Groningen, the Netherlands.

Target is a consortium of ten partners drawn from academic and research institutions, emerging local businesses and well-established global IT companies.

  • OmegaCEN Research Group – A part of the Kapteyn Astronomical Institute at the University of Groningen, the OmegaCEN group is headed by Prof. Edwin A. Valentijn, who is also the founder and coordinator of Target. The group conducts research in the field of wide-field astronomical imaging and astronomical information technology.
  • Donald Smits Center for Information Technology (CIT) – CIT is one of the Dutch academic ICT centers for High Performance Computing. CIT hosts and manages the integrated ICT infrastructure of Target.
  • University Medical Center Groningen (UMCG) – The UMCG is one of the largest hospitals in the Netherlands. IT researchers from the UMCG work with Target on the development of a flexible data management system for LifeLines, a large initiative in the north of the Netherlands that aims to collect data from 165,000 people over the course of 30 years in order to identify and analyze aging patterns.[4]
  • Artificial Intelligence Institute (ALICE) – ALICE is located at the University of Groningen and conducts research in artificial intelligence and cognitive engineering. Target collaborates with the institute on research into (i) handwritten text recognition, (ii) scalability of large data files, and (iii) analysis of unstructured information. The Monk system used for handwritten text recognition is hosted by the Target data center.[5]
  • ASTRON – Target and ASTRON work together to develop the LOFAR long-term archive.[6]
  • IBM – The multinational company provides most of the hardware infrastructure for the Target center, and its experts are involved in developing a robust and reliable architectural design that can meet the often disparate requirements of all Target users.
  • Oracle – Oracle participates in joint research with Target on very large distributed databases and scalable access to tables with billions of entries.
  • Nspyre – Nspyre is one of the IT business partners in the Target project. It specializes in innovative software solutions for high-tech automation of large-scale industrial services.
  • Elkoog/Heeii – Elkoog/Heeii is another IT business partner in the Target project. It is a fast-growing company based in Groningen that offers innovative internet search services and recommendation engines for web browsers.
  • Target Holding – Target Holding stimulates and guides knowledge transfer from the Target expertise center to interested commercial parties.

Areas of R&D

Target conducts multidisciplinary R&D in the following areas:

  • scalable distributed data storage
  • information system workflows
  • massive data production and quality control
  • scalable distributed database systems
  • data visualization.

Projects

Target participates in a number of data-intensive scientific projects in areas that include astronomy, handwritten text recognition, medical research on healthy aging, and the development of diagnostic tools for Parkinson’s disease.

LOFAR Long-term Archive

 
Target has developed and maintains the LOFAR Long-term Archive. The telescope will generate Petabytes of data that will be stored at the Target data center in Groningen and several other data centers in Europe. Image Credit: ASTRON

Much of the data from the LOFAR telescope will be stored in, and accessed from, the LOFAR Long-term Archive, designed by ASTRON and Target.[6] The data will be hosted at the Target data center and several other European centers.

Monk

 
The Monk system for handwritten text recognition is hosted on the Target facilities. Image credit: Photographed by Peter Maas, manuscript written by Marie Eugène François Thomas Dubois

Monk is a system developed by Prof. Schomaker and his group at the Artificial Intelligence Institute (ALICE) at the University of Groningen. It uses sophisticated algorithms for handwritten text recognition in a variety of existing archives.[5][7][8] Currently, a number of books from the Dutch National Archives as well as several international historical collections have been ingested into Monk and used to further improve the processing algorithms. Monk is also a scientific user of the Target infrastructure.

LifeLines

LifeLines is a long-term medical research project run by the University Medical Center Groningen (UMCG). An array of genotype and phenotype data will be gathered from 165,000 people once every five years for a total period of thirty years. The accumulated data will be used by researchers and medical specialists to gain insight into the processes related to aging and to understand why age-related health degradation varies so widely.[4] Target provides LifeLines with the infrastructure for data storage, access and processing.

GLIMPS

Run by Dr. K. Leenders, a professor of neurology at the UMCG, GLIMPS is a research project that aims to find faster and more reliable diagnostic tools for Parkinson’s disease.[9] GLIMPS explores the possibilities of using complex image-based algorithms and PET scans for early detection of Parkinson’s.[10] To test the effectiveness of such algorithms, GLIMPS is building a large database of PET scans delivered by numerous hospitals in the Netherlands. This database will be used to improve and refine the software algorithms and to compare their output with existing clinical diagnoses. Target is responsible for building and maintaining the GLIMPS database as well as ensuring the smooth running of the image-based algorithms on its computing facilities.

Others

Additionally, Target is fully or jointly involved in the data management for other astronomical projects such as the KiDS/VIKING astronomical surveys, ESA’s Euclid mission, ESO’s MUSE instrument (mounted on the Very Large Telescope), MICADO (to be mounted on the E-ELT) and Gaia. Target Holding also manages a number of commercial projects that use the expertise or ICT infrastructure of Target. These projects usually result from collaborations with emerging or well-established public and private businesses in the north of the Netherlands.

References

  1. Begeman, K.; Belikov, A. N.; Boxhoorn, D. R.; Valentijn, E. A. (2012). "The Astro-WISE data centric information system". Experimental Astronomy. 35 (1–2): 1. arXiv:1208.0447. doi:10.1007/s10686-012-9311-4. S2CID 254464778.
  2. Verdoes Kleijn, Gijs A.; Belikov, Andrey N.; McFarland, John P. (2013). "The data zoo in Astro-WISE". Experimental Astronomy. 35 (1–2): 187. arXiv:1208.6299. doi:10.1007/s10686-012-9314-1. S2CID 254466607.
  3. Mwebaze, Johnson (2012). Extreme Data Lineage in Ad-hoc Astronomical Data Processing. University of Groningen: PhD Dissertation. ISBN 9789036757591.
  4. Stolk, R. P.; Rosmalen, J. G.; Postma, D. S.; De Boer, R. A.; Navis, G.; Slaets, J. P.; Ormel, J.; Wolffenbuttel, B. H. (2008). "Universal risk factors for multifactorial diseases: LifeLines: a three-generation population-based study". European Journal of Epidemiology. 23 (1): 67–74. doi:10.1007/s10654-007-9204-4. PMID 18075776. S2CID 2430182.
  5. Van Der Zant, Tijn; Schomaker, Lambert; Zinger, Svitlana; Van Schie, Henny (2009). "Where are the Search Engines for Handwritten Documents?". Interdisciplinary Science Reviews. 34 (2–3): 224–235. doi:10.1179/174327909X441126. S2CID 57037481.
  6. Belikov, A (2011). "Target for LOFAR Long Term Archive: Architecture and Implementation". Proc. of ADASS XXI, ASP Conf. Series. arXiv:1111.6443.
  7. Van Der Zant, Tijn; Schomaker, Lambert; Valentijn, Edwin (January 28, 2008). Yanikoglu, Berrin A.; Berkner, Kathrin (eds.). "Large-scale parallel document-image processing". Proceedings of Document Recognition and Retrieval XV, IS&T/SPIE International Symposium on Electronic Imaging. Document Recognition and Retrieval XV. 6815: 68150N-68150N. doi:10.1117/12.765482. S2CID 40083465.
  8. Schomaker, L.R.B. (January 28, 2008). "Word mining in a sparsely-labeled handwritten collection". Proceedings of Document Recognition and Retrieval XV, IS&T/SPIE International Symposium on Electronic Imaging: 6815–6823.
  9. Teune, Laura Klaaske (2013). Glucose metabolic patterns in neurodegenerative brain diseases. PhD Dissertation.
  10. Teune, Laura (2013). FDG-PET Imaging in Neurodegenerative Brain Diseases, chapter 22 of the book "Functional Brain Mapping and the Endeavor to Understand the Working Brain". InTech. ISBN 978-953-51-1160-3.