Overview

The first application we needed was one for collecting the data generated by members of our consortium: the Data Lake Uploader. The application is hosted at http://upload.kpmp.org and is available only to a small subset of the consortium. The code for this application stack is at http://github.com/KPMP, in the repositories listed under Repositories below.

Repositories

List of the repositories required to run this application.

Repository | Description
orion-data | This is the service layer for the Data Lake Uploader (and is also the service layer for the Data Manager Dashboard).
orion-web | This is the GUI layer for the Data Lake Uploader.
heavens-docker | This repository holds all of the docker containers and docker-compose scripts for running our applications.
stateManagerService | This service is used to manage the state of packages as they travel through the Data Lake to the Knowledge Environment.
eridanus-data | This service is used to notify our staff of events within the system.
zipWorker | This service is used to zip our package files for download in the Data Lake Uploader.
norma | Tool designed to read in a Google Spreadsheet to generate our dynamic form DTDs.

Data Stores

MongoDB
dataLake
Collection Name | Description
dynamicForms | Collection to hold the DTDs for the dynamic forms used on the upload form. You can see the latest DTDs in the dtds folder inside the norma project.
packageTypeIcons | Static collection used to group package types together into larger categories. An export of this data exists in orion-data/data/packageTypeIcons.json.
packages | Collection that contains the metadata for packages uploaded to the Data Lake. Each entry holds an object reference (via the submitter attribute) to an entry in the users collection. A minimal example JSON structure exists in orion-data/data/packages-template.json. The application is designed to allow for any number of additional attributes.
releases | Collection used to populate our Help page. An example JSON structure exists in orion-data/data/releases-template.json.
state | Collection of all states a package has passed through. An example JSON structure exists in orion-data/data/state-template.json.
users | Collection of users who have uploaded packages to the Data Lake. An example JSON structure exists in orion-data/data/users-template.json. The "shibId" attribute is the Shibboleth Id we capture when the user authenticates.
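
The relationship between the packages and users collections can be pictured with a small in-memory sketch. This is only an illustration, not the orion-data model: the class names, the "id" field, and the use of a plain string id for the reference are assumptions. Only the "submitter" and "shibId" attribute names come from the descriptions above; the real structures are in the template files under orion-data/data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory picture of the packages -> users reference.
final class DataLakeModelSketch {

    // Stands in for an entry in the users collection.
    static final class User {
        final String id;      // assumed identifier, not taken from the source
        final String shibId;  // Shibboleth Id captured at authentication

        User(String id, String shibId) {
            this.id = id;
            this.shibId = shibId;
        }
    }

    // Stands in for an entry in the packages collection.
    static final class PackageDoc {
        final String submitter; // reference to a users entry (modelled here as its id)
        // Open-ended metadata: any number of additional attributes is allowed.
        final Map<String, Object> attributes = new LinkedHashMap<>();

        PackageDoc(String submitter) {
            this.submitter = submitter;
        }
    }

    // Resolves the submitter reference against a map of users keyed by id.
    // Returns null when no matching user exists.
    static User findSubmitter(PackageDoc pkg, Map<String, User> usersById) {
        return usersById.get(pkg.submitter);
    }
}
```

Keeping the extra metadata in an open map mirrors the statement that a package may carry any number of additional attributes, while the submitter reference stays a fixed, known field.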
File System
The application can be configured to use any location on disk for the file system. When a user uploads a file, the system creates a file named "package_<unique_id>" inside that directory, where <unique_id> is generated with Java's UUID functionality.
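
The naming scheme can be sketched in a few lines of Java. This is a minimal illustration rather than the code in orion-data: it assumes the id comes from java.util.UUID.randomUUID() (the text above only says "Java's UUID functionality"), and the class name, method names, and default base directory are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

// Hypothetical helper showing how a "package_<unique_id>" name could be built.
final class PackageNaming {

    static final String PREFIX = "package_";

    // Builds a name of the form package_<uuid>, using a random (version 4) UUID.
    // Assumption: the real application may generate its UUID differently.
    static String newPackageName() {
        return PREFIX + UUID.randomUUID();
    }

    // Places the package name under the configured base directory.
    // Only computes the path; nothing is created on disk.
    static Path resolveInBase(Path baseDir, String packageName) {
        return baseDir.resolve(packageName);
    }

    public static void main(String[] args) {
        // The base directory is configurable; this default is a placeholder.
        Path base = Paths.get(args.length > 0 ? args[0] : "/data/dataLake");
        System.out.println(resolveInBase(base, newPackageName()));
    }
}
```

Running the class with a directory argument prints one candidate path under that directory; each run yields a different name because the UUID is random.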