Unit for Processing and Data Archiving

Rack with the UPADT80 storage system (100 TB), computing nodes and database nodes


The archiving and processing of the images collected at the OAJ will be carried out in the Unit for Processing and Data Archiving (UPAD). This data center will provide the hardware infrastructure needed to store, process and analyze the images, as well as to keep backups of the data. It will also provide efficient access to the scientific database and sky images for the astronomical community and the general public.

The UPAD hardware has three main systems:

  • OAJ/CPD
  • UPAD main storage and processing
  • EDAM (External Data Access Machine)

OAJ/CPD

The system has two dedicated IO servers that control the movement of data from the camera servers to their mirrors (two further IO servers) at the UPAD. During the daytime, those two servers secure two backup copies of the raw data. The OAJ also has a storage system with a net capacity of 90 TB that acts as a buffer for two months of data collection.
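As a rough illustration of what the buffer size implies, the short calculation below derives the average daily data volume that a 90 TB buffer can absorb over two months. The 60-day length of "two months" is an assumption made here for the arithmetic, and the result is only the average the buffer could hold, not a measured data rate.

```python
# Net capacity of the OAJ buffer storage, in TB (from the text).
BUFFER_NET_TB = 90
# Assumption: "two months" is taken as 60 days.
BUFFER_DAYS = 60

# Average volume per day that fills the buffer in exactly BUFFER_DAYS.
tb_per_day = BUFFER_NET_TB / BUFFER_DAYS

print(f"Average volume the buffer can absorb: {tb_per_day:.1f} TB/day")
```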

UPAD storage and processing

At the UPAD data center, besides the two IO servers dedicated to downloading the OAJ data and the service nodes, two queue control servers running Sun Grid Engine as the batch queuing system control the main logistics of data handling and the scheduling of pipeline processes. A scheme of the nodes and storage systems deployed at the OAJ and UPAD is shown in Figure 1.


Figure 1. Nodes and storage systems deployed at OAJ and UPAD.

UPAD main storage and processing capabilities

The hardware for storing and processing the J-PLUS data collected with JAST80 is already deployed. It consists of 2 processing nodes (in addition to the queue control servers), 2 database servers and a centralized disk storage system with a net capacity of 90 TB.

The main UPAD storage system deployment was carried out during December 2014. The disk storage system consists of a NetApp cluster with 8 nodes providing a net storage capacity > 1000 TB with dual parity protection. The Spectra Logic T950 robotic tape library, with 2 frames, has 1600 LTO6 slots (~4 PB). Both storage tiers are integrated by an HSM solution. The core network and disk storage system provide more than 5000 MB of aggregate bandwidth. The global storage systems and core network solution has been designed and will be integrated and deployed by BULL ESPAÑA.
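The ~4 PB figure for the tape library can be checked with a short calculation. The sketch below assumes every slot holds one LTO-6 cartridge at its native (uncompressed) capacity of 2.5 TB, and it uses 1000 TB only as the stated lower bound for the disk tier. The variable names are illustrative.

```python
# Native (uncompressed) capacity of one LTO-6 cartridge, in TB.
LTO6_NATIVE_TB = 2.5
# Number of tape slots in the library (from the text).
TAPE_SLOTS = 1600
# Lower bound of the net disk capacity, in TB (the text states "> 1000 TB").
DISK_NET_TB_MIN = 1000

# Assumption: every slot is filled with one cartridge.
tape_native_tb = TAPE_SLOTS * LTO6_NATIVE_TB
tape_native_pb = tape_native_tb / 1000

# Lower bound for both storage tiers together.
combined_lower_bound_pb = (tape_native_tb + DISK_NET_TB_MIN) / 1000

print(f"Tape tier, native capacity: {tape_native_pb:.1f} PB")
print(f"Disk + tape, lower bound:   {combined_lower_bound_pb:.1f} PB")
```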

Regarding the processing systems, the pipelines can be configured to store all intermediate products on a RAM drive in order to minimize the overhead of IO operations. IO to the centralized storage is also reduced by storing some frequently accessed data, such as calibration frames, locally on the computing nodes.
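The two IO-saving ideas above can be sketched in a few lines of Python. This is only an illustration and not the UPAD pipeline code: the function names `pick_scratch_root` and `fetch_cached` are hypothetical, and `/dev/shm` is assumed to be the RAM-backed file system, as on most Linux systems.

```python
import os
import shutil
import tempfile
from pathlib import Path


def pick_scratch_root(ram_candidates=("/dev/shm",)):
    """Return the first writable RAM-backed directory, else the system temp dir."""
    for candidate in ram_candidates:
        if os.path.isdir(candidate) and os.access(candidate, os.W_OK):
            return Path(candidate)
    return Path(tempfile.gettempdir())


def fetch_cached(source, cache_dir):
    """Copy `source` into `cache_dir` once; later calls reuse the local copy."""
    source = Path(source)
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    local = cache_dir / source.name
    if not local.exists():
        # Only the first request touches the (slow) central storage.
        shutil.copy2(source, local)
    return local


# Intermediate products go to a temporary directory on the scratch root
# and are deleted automatically when the block exits.
with tempfile.TemporaryDirectory(dir=pick_scratch_root()) as workdir:
    intermediate = Path(workdir) / "intermediate.tmp"
    intermediate.write_bytes(b"\0" * 1024)
    print(f"Intermediate product written under {intermediate.parent}")
```

Note that this cache never checks whether the central copy has changed. A real pipeline would need some invalidation rule, for example when calibration frames are regenerated.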

An important part of the computing resources at UPAD was deployed during 2015. This first processing infrastructure consists of 17 Fujitsu servers. Each server has 2 CPUs with 12 cores, 192 GB of RAM, and 4.0 TB of scratch storage.
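To give a sense of the aggregate capacity, the sketch below adds up the per-server figures. It assumes that "2 CPUs with 12 cores" means 12 cores per CPU; if the 12 cores were instead the total per server, the core count would be halved. The `NodeSpec` class and the `cluster_totals` function are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    cpus: int
    cores_per_cpu: int  # assumption: 12 cores per CPU, not per server
    ram_gb: int
    scratch_tb: float


def cluster_totals(node, count):
    """Aggregate resources of `count` identical nodes."""
    cores = count * node.cpus * node.cores_per_cpu
    ram_gb = count * node.ram_gb
    return {
        "cores": cores,
        "ram_gb": ram_gb,
        "scratch_tb": count * node.scratch_tb,
        "ram_gb_per_core": ram_gb / cores,
    }


# Per-server figures and server count taken from the text.
totals = cluster_totals(NodeSpec(cpus=2, cores_per_cpu=12, ram_gb=192, scratch_tb=4.0), 17)
for name, value in totals.items():
    print(f"{name}: {value}")
```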

The UPAD storage, network, and processing infrastructure is funded by the Subprograma de Proyectos de Infraestructura Científico-Tecnológica of the Spanish Ministry of Economy and Competitiveness (MINECO) (FCDD10-4E-867), cofunded by the European Fund for Regional Development (FEDER) and Fondo de Inversiones de Teruel (FITE).

European Regional Development Fund (FEDER)

A quick view of the three stages and amount of images to be processed


Software pipelines have been designed to handle the enormous data flow produced by the panoramic camera and maximize the scientific output. The Data Management Software will automatically process the data collected during the night to check if its quality fulfills the scientific and technical requirements, update the survey's databases and feed the Scheduler to compute the telescope targets of the following nights.
