How NASA is building a petabyte-scale geospatial archive in the cloud

Video description

NASA’s Earth Observing System Data and Information System (EOSDIS) is working toward a cloud-based, highly flexible system to meet its growing and evolving data demands. Cumulus, a free and open source framework, supports this vision with configurable workflows to ingest, process, archive, manage, and distribute NASA’s Earth imagery. The Cumulus infrastructure is designed for scalability and reliability and relies heavily on the AWS serverless platform, which lets Cumulus scale in real time and stay performant under the largest expected workloads.

Cumulus is poised to make a huge impact on how NASA manages and disseminates its Earth science imagery. In one notable case, the NASA-ISRO Synthetic Aperture Radar (NISAR) mission, Cumulus will be used to collect more data in a year than exists in NASA’s current archive. The NISAR mission will collect 45 PB a year and process that data at a rate of 1 GB per second. NASA missions demonstrate the need for Cumulus, but its use already extends beyond NASA’s Distributed Active Archive Centers (DAACs). It’s being used to monitor agriculture in Tanzania, apply machine learning models to estimate hurricane intensity, and generate air quality predictions using near-real-time forecast data.

Aimee Barciauskas (Development Seed) outlines the motivation for Cumulus, the achievements and hurdles of the past two years, and its varied applications. You’ll learn about the availability of the open source software and how NASA intends to make its Earth observation geospatial data freely available to the public in the cloud.

Product information

  • Title: How NASA is building a petabyte-scale geospatial archive in the cloud
  • Author(s): Aimee Barciauskas
  • Release date: December 2019
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 0636920361817