Developer documentation specific to the object-store-api.

  • General developer documentation for DINA API/module can be found in DINA Developer Guide.

  • Documentation about module implementation can be found in dina-base-api.

1. Object Storage

1.1. MinIO

The Object Store module uses MinIO as a storage abstraction layer.

1.2. Object Metadata

The Object Store workflow consists of the upload of an object (a file) and the submission of the associated metadata. Some data available at the upload stage should be preserved and transferred to the associated metadata once the latter is submitted. For example: the original filename, the hash of the file, the received media type, the detected media type, etc.

1.3. Expected Sequence

  • Multipart upload of a file in a specific bucket

  • The backend returns a uuid representing the fileIdentifier

  • Post of the metadata with fileIdentifier set to the uuid returned by the previous step

  • The file is available for download using the bucket and the fileIdentifier
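The sequence above can be sketched from the client side as follows. This is only an illustration under stated assumptions: the helper names and the BASE_URL value are hypothetical, and no HTTP request is sent; the functions only build the URLs and the JSON:API body used in the steps.

```python
import json

# Assumption: base URL of a locally running object-store-api instance.
BASE_URL = "http://localhost:8081/api/v1"


def upload_url(bucket: str) -> str:
    """Step 1: URL that receives the multipart upload for a bucket."""
    return f"{BASE_URL}/file/{bucket}"


def metadata_body(bucket: str, file_identifier: str, dc_type: str, file_extension: str) -> str:
    """Step 3: JSON:API body for the metadata POST, reusing the uuid returned in step 2."""
    return json.dumps({
        "data": {
            "type": "metadata",
            "attributes": {
                "bucket": bucket,
                "dcType": dc_type,
                "fileExtension": file_extension,
                "fileIdentifier": file_identifier,
            },
        }
    })


def download_url(bucket: str, file_identifier: str) -> str:
    """Step 4: URL from which the stored file can be downloaded."""
    return f"{BASE_URL}/file/{bucket}/{file_identifier}"
```

The same fileIdentifier flows through steps 2 to 4, which is why it is the only value the client has to keep from the upload response.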

2. File Upload/Download

2.1. Upload a file

Send a POST request to the /api/v1/file/{bucket} endpoint.

Example request using curl:

curl -i http://localhost:8081/api/v1/file/mybucket -F "file=@./my-image.png;type=image/png"

Example response:

{
	"fileIdentifier": "0050559c-beae-48d6-a2b0-91f5a666fad1",
	"originalFilename": "example-png.png",
	"sha1Hex": "c0c1d898ed827d6db02a03df941225184277d9e5",
	"receivedMediaType": "image/png",
	"detectedMediaType": "image/png",
	"detectedFileExtension": ".png",
	"evaluatedMediaType": "image/png",
	"evaluatedFileExtension": ".png",
	"sizeInBytes": 976,
	"thumbnailIdentifier": "b9f6f58b-1b41-400f-b178-f91d11221b7c"
}

If a file with the same sha1Hex already exists, a warning is returned in the meta section of the response.

Next, send a POST request to the /api/v1/metadata endpoint to create a Metadata record for the stored object, using the fileIdentifier from the upload response.

curl -X POST http://localhost:8081/api/v1/metadata \
-H "Content-Type: application/vnd.api+json" \
-H "Accept: application/vnd.api+json" \
--data-binary @- << EOF
{
  "data": {
    "type": "metadata",
    "attributes": {
      "bucket": "mybucket",
      "dcType": "Image",
      "fileExtension": ".png",
      "fileIdentifier": "0050559c-beae-48d6-a2b0-91f5a666fad1"
    }
  }
}
EOF

2.2. Upload a Derivative

Uploading a Derivative is similar to uploading a regular file.

  1. Upload the derivative file by sending a POST (multipart) request to: /api/v1/file/{bucket}/derivative

curl -i http://localhost:8081/api/v1/file/mybucket/derivative -F "file=@./my-image.png;type=image/png"

Example response:

{
  "id": 2,
  "fileIdentifier": "5c158d6e-09eb-4272-88b6-3349b638100d",
  "dcType": "IMAGE",
  "createdBy": "dev",
  "originalFilename": "my-image.png",
  "sha1Hex": "3a37e3546074b6a67afef2fc1b402bf9233a1eb7",
  "receivedMediaType": "image/png",
  "detectedMediaType": "image/png",
  "detectedFileExtension": ".png",
  "evaluatedMediaType": "image/png",
  "evaluatedFileExtension": ".png",
  "sizeInBytes": 6091,
  "bucket": "mybucket",
  "isDerivative": true
}

  2. Create the derivative resource for that file by sending a POST request to: /api/v1/derivative/

A derivative must derive from an existing file, and a metadata record must already exist for that original file.

The fileIdentifier in the following example is the fileIdentifier returned in step 1. It identifies the uploaded derivative file.

The acDerivedFrom object in the following example represents the original resource (of type metadata) this derivative will derive from.

Example Request: POST /api/v1/derivative/

curl --request POST \
  --url http://localhost:8081/api/v1/derivative/ \
  --header 'Content-Type: application/vnd.api+json' \
  --data '{
	"data": {
		"type": "derivative",
		"attributes": {
			"dcType": "IMAGE",
			"fileIdentifier": "5c158d6e-09eb-4272-88b6-3349b638100d"
		},
		"relationships": {
			"acDerivedFrom": {
				"data": {
					"id": "e39ba089-4757-433a-a787-2287b3defb46",
					"type": "metadata"
				}
			}
		}
	}
}'
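For clients that build this request programmatically, the body can be assembled as in the sketch below. The function name is hypothetical; the structure simply mirrors the example request above, and nothing is sent over the network.

```python
def derivative_body(file_identifier: str, derived_from_id: str, dc_type: str = "IMAGE") -> dict:
    """Build the JSON:API body for POST /api/v1/derivative/.

    file_identifier: fileIdentifier returned by the derivative file upload (step 1).
    derived_from_id: id of the existing metadata resource the derivative derives from.
    """
    return {
        "data": {
            "type": "derivative",
            "attributes": {
                "dcType": dc_type,
                "fileIdentifier": file_identifier,
            },
            "relationships": {
                "acDerivedFrom": {
                    "data": {"id": derived_from_id, "type": "metadata"}
                }
            },
        }
    }
```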

2.3. Uploading a thumbnail for a specific resource

You can upload a derivative to be a thumbnail for an Object.

After uploading a derivative file, two things are required when creating the derivative resource for that file:

  1. You must set "derivativeType": "THUMBNAIL_IMAGE" as shown in the following example.

  2. You must provide a valid acDerivedFrom object.

Example Request: POST /api/v1/derivative/

curl --request POST \
  --url http://localhost:8081/api/v1/derivative/ \
  --header 'Content-Type: application/vnd.api+json' \
  --data '{
	"data": {
		"type": "derivative",
		"attributes": {
			"fileIdentifier": "df85bc1b-7365-4621-ab00-2cdd48808252",
			"dcType": "Image",
			"derivativeType": "THUMBNAIL_IMAGE"
		},
		"relationships": {
			"acDerivedFrom": {
				"data": {
					"id": "c8b71e52-ccf1-4409-8d3d-deb23e0a9906",
					"type": "metadata"
				}
			}
		}
	}
}'

Note: this does not resize the image. If you want an image to be resized and used as a thumbnail, submit the image as a regular derivative with an acDerivedFrom, and a thumbnail will be generated for this image and associated with the given acDerivedFrom.
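A client can check the two requirements for a thumbnail request before sending it. The sketch below is a hypothetical client-side check, not the validation performed by the API: it assumes that a "valid" acDerivedFrom means a non-empty id with type metadata, whereas the server also has to confirm that the referenced metadata exists.

```python
def thumbnail_request_problems(body: dict) -> list:
    """Return a list of problems found in a thumbnail derivative request body."""
    problems = []
    data = body.get("data", {})

    # Requirement 1: derivativeType must be THUMBNAIL_IMAGE.
    if data.get("attributes", {}).get("derivativeType") != "THUMBNAIL_IMAGE":
        problems.append('attributes.derivativeType must be "THUMBNAIL_IMAGE"')

    # Requirement 2: acDerivedFrom must reference a metadata resource.
    derived_from = data.get("relationships", {}).get("acDerivedFrom", {}).get("data") or {}
    if not derived_from.get("id") or derived_from.get("type") != "metadata":
        problems.append("relationships.acDerivedFrom must reference a metadata resource")

    return problems
```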

2.4. File Download

To download a stored object, send a GET request to the /api/v1/file/{bucket}/{fileId} endpoint.

Example request:

curl http://localhost:8081/api/v1/file/mybucket/0050559c-beae-48d6-a2b0-91f5a666fad1 > my-downloaded-image.png

2.5. Derivative File Download

To download a stored derivative, send a GET request to the /api/v1/file/{bucket}/derivative/{fileId} endpoint.

Example Request: GET /api/v1/file/{bucket}/derivative/{fileId}

curl --request GET \
  --url http://localhost:8081/api/v1/file/dev-group/derivative/cbb9484a-67f1-4112-accd-829bdfa0ad9e

2.6. Get File Information

It is possible to check for the presence of a file directly on the file system. The user must be SUPER_USER on the target group (bucket), and the filename (the uuid including its extension) is required.

GET /api/v1/file-info/{bucket}/{filename}

GET /api/v1/file-info/{bucket}/derivative/{filename}
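As a small illustration of how the two paths are formed, the hypothetical helper below builds the file-info path and rejects a filename without an extension, since the extension is required. The check on the dot is an assumption about what "with extension" means on the client side.

```python
def file_info_path(bucket: str, filename: str, derivative: bool = False) -> str:
    """Build the file-info path for an object or a derivative."""
    if "." not in filename:
        raise ValueError("filename must include its extension, e.g. <uuid>.png")
    prefix = f"/api/v1/file-info/{bucket}"
    if derivative:
        return f"{prefix}/derivative/{filename}"
    return f"{prefix}/{filename}"
```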

3. API

3.1. Additional Endpoints

3.1.1. resource-name-identifier

Get the identifier (UUID) based on the name, the type and the group.

GET /resource-name-identifier?filter[type][EQ]=metadata&filter[name][EQ]=name1&filter[group][EQ]=aafc

Available for types:

  • metadata
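The filter parameters contain square brackets, which many HTTP clients percent-encode. The sketch below (the function name is hypothetical) builds the query string with Python's standard library; urlencode encodes the brackets as %5B and %5D, which decodes back to the same parameter names.

```python
from urllib.parse import parse_qs, urlencode


def resource_name_identifier_query(resource_type: str, name: str, group: str) -> str:
    """Build the path and query string for the resource-name-identifier endpoint."""
    params = {
        "filter[type][EQ]": resource_type,
        "filter[name][EQ]": name,
        "filter[group][EQ]": group,
    }
    return "/resource-name-identifier?" + urlencode(params)


query = resource_name_identifier_query("metadata", "name1", "aafc")
# Decoding the query string recovers the original filter names and values.
decoded = parse_qs(query.split("?", 1)[1])
```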

3.2. Administrative Endpoints

Unless explicitly mentioned, administrative endpoints require the DINA_ADMIN role.

3.2.1. Regenerate Thumbnails

Send a POST request to derivative-generation with a body like:

{
  "data": {
    "type": "derivative-generation",
    "attributes": {
      "metadataUuid": "uuid of the original object",
      "derivativeType": "THUMBNAIL_IMAGE",
      "derivedFromType": "LARGE_IMAGE"
    }
  }
}

This instructs the API to generate a thumbnail from the large image derivative. Use it when the original object cannot be used to create the thumbnail (for example, raw image files). Otherwise, omit derivedFromType to use the original object as the source.
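The optional nature of derivedFromType can be made explicit when building the body. This is a sketch with a hypothetical function name; it only assembles the request body shown above.

```python
from typing import Optional


def derivative_generation_body(metadata_uuid: str, derived_from_type: Optional[str] = None) -> dict:
    """Build the body of a derivative-generation request for a thumbnail.

    derived_from_type is left out entirely when the original object is the source.
    """
    attributes = {
        "metadataUuid": metadata_uuid,
        "derivativeType": "THUMBNAIL_IMAGE",
    }
    if derived_from_type is not None:
        attributes["derivedFromType"] = derived_from_type
    return {"data": {"type": "derivative-generation", "attributes": attributes}}
```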

3.2.2. Index Refresh

Trigger a message to (re)index a resource.

POST /index-refresh

{
  "data": {
    "type": "index-refresh",
    "id": "c9e66a08-8b59-4183-8346-e2298af32bfe",
    "attributes": {
      "docType": "metadata"
    }
  }
}

4. Configuration

4.1. Orphan Removal

The orphan removal cron job is responsible for identifying and removing orphan objects. Orphan objects are object uploads that do not have any matching file identifiers associated with derivatives or metadata objects.

orphan-removal.expiration.objectMaxAge="20d"
orphan-removal.cron.expression="0 0 * * * *"

Property: orphan-removal.expiration.objectMaxAge
Description: The maximum age of objects to be considered for removal, based on the created-on date of the object upload. Objects older than this duration will be checked for orphan status. The duration should be specified using the ISO-8601 duration format, e.g., "20d" represents 20 days.

Property: orphan-removal.cron.expression
Description: The cron expression that schedules the orphan removal process. Each time it fires, the process checks for orphan objects that have expired. The expression follows the cron format, e.g., "0 0 * * * *" represents running this check every hour of every day.

If the orphan-removal.cron.expression property is not specified (the default), the orphan removal job does not run.
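The selection rule described in this section (older than the maximum age and not referenced by any metadata or derivative) can be illustrated with the sketch below. It is not the module's implementation: the function name and the in-memory data structures are assumptions chosen to keep the example self-contained.

```python
from datetime import datetime, timedelta, timezone


def find_expired_orphans(uploads: dict, referenced_ids: set, max_age: timedelta, now: datetime) -> list:
    """Return the file identifiers of uploads that are expired and unreferenced.

    uploads: maps a file identifier to the created-on datetime of its object upload.
    referenced_ids: file identifiers used by a metadata or a derivative.
    """
    cutoff = now - max_age
    return sorted(
        file_id
        for file_id, created_on in uploads.items()
        if created_on < cutoff and file_id not in referenced_ids
    )


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
uploads = {
    "old-orphan": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "old-linked": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "recent-orphan": datetime(2024, 1, 30, tzinfo=timezone.utc),
}
expired = find_expired_orphans(uploads, {"old-linked"}, timedelta(days=20), now)
```

Only "old-orphan" is selected: "old-linked" is still referenced, and "recent-orphan" has not yet reached the maximum age.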

5. Media Type Detection

5.1. Library

The Object Store module uses Apache Tika to detect the media type automatically when possible.

5.2. Expected behavior

The Object Store module will behave differently depending on the information included in the multipart upload. As a general rule, the module will use the media type and the file extension (extracted from the filename) that is provided with the file upload (if available).

5.2.1. Content-Type provided

If the Content-Type of the file is specified in the multipart upload, the Object Store still detects the media type, but the detected value is only stored in the metadata and returned to the user for information.

5.2.2. No Content-Type provided

If the Content-Type of the file is NOT specified in the multipart upload, the Object Store will detect the media type. If a filename is provided, the file extension is preserved; otherwise, the default extension of the detected media type is used.
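The general rule (prefer what was provided with the upload, fall back to what was detected) can be sketched as below. This is a simplified illustration, not the module's code: the function name is hypothetical, and the detected values are passed in as plain arguments instead of being produced by Apache Tika.

```python
from pathlib import PurePath
from typing import Optional, Tuple


def evaluate_media_type(
    received_media_type: Optional[str],
    original_filename: Optional[str],
    detected_media_type: str,
    detected_extension: str,
) -> Tuple[str, str]:
    """Return the (media type, file extension) pair to use for the stored object."""
    # The provided Content-Type wins over the detected media type.
    media_type = received_media_type or detected_media_type
    # The extension of the provided filename wins over the detected default.
    provided_extension = PurePath(original_filename).suffix if original_filename else ""
    extension = provided_extension or detected_extension
    return media_type, extension
```

For example, an upload named scan.tif without a Content-Type keeps the .tif extension even if the default extension of the detected media type is .tiff.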

6. Object Export

The object-store-api can export objects in a compressed (zip) file.

6.1. Request an export

Send a POST request to object-export with a body like:

{
  "data": {
    "type": "object-export",
    "attributes": {
      "name": "my export",
      "fileIdentifiers": ["fileUUID"]
    }
  }
}

The response will include the export UUID.

To include files in folder(s) it is possible to add the following attributes:

"exportLayout": { "subfolder/" : ["fileUUID"] }

It is also possible to use an alternative name (alias) for files:

"filenameAliases": { "fileUUID" : "myFileAlias" }

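The optional attributes can be combined with the base request as in the sketch below. The function name is hypothetical; the attribute names are the ones shown above, and the optional ones are only included when provided.

```python
from typing import Dict, List, Optional


def object_export_body(
    name: str,
    file_identifiers: List[str],
    export_layout: Optional[Dict[str, List[str]]] = None,
    filename_aliases: Optional[Dict[str, str]] = None,
) -> dict:
    """Build the body of an object-export request."""
    attributes = {"name": name, "fileIdentifiers": list(file_identifiers)}
    if export_layout:
        # Maps a folder (e.g. "subfolder/") to the file UUIDs placed in it.
        attributes["exportLayout"] = export_layout
    if filename_aliases:
        # Maps a file UUID to the alternative name used in the archive.
        attributes["filenameAliases"] = filename_aliases
    return {"data": {"type": "object-export", "attributes": attributes}}
```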
6.2. Export processing

  • Assemble the export and compress the content in an archive

  • Emit a message on the messaging system including a TOA (Temporary Object Access) key

  • export-api downloads the archive

  • The archive is deleted from the object-store and is then available in the export-api

7. Temporary Object Access

The object-store can generate temporary files that can be downloaded by another service using a randomly generated key. The key is communicated using an event (message).

7.1. Sequence of operations

  • Create a temporary file (e.g. image export)

  • Register the file to get a key

  • Trigger an event on the message queue to indicate a file is available

  • The consumer pulls the file using the key (/toa/{key} endpoint)

  • The key is removed from usable keys

  • The temporary file is deleted from the object-store temporary storage
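The single-use key behaviour in this sequence can be illustrated with the in-memory sketch below. The class and method names are hypothetical, and the real service also publishes the event and deletes the file, which is left out here.

```python
import secrets
from typing import Dict, Optional


class TemporaryObjectRegistry:
    """Minimal single-use key registry for temporary files."""

    def __init__(self) -> None:
        self._paths: Dict[str, str] = {}

    def register(self, path: str) -> str:
        """Register a temporary file and return a randomly generated key."""
        key = secrets.token_urlsafe(32)
        self._paths[key] = path
        return key

    def consume(self, key: str) -> Optional[str]:
        """Return the file path for a key and remove the key from the usable keys."""
        return self._paths.pop(key, None)


registry = TemporaryObjectRegistry()
key = registry.register("/tmp/export.zip")
first = registry.consume(key)   # the consumer pulls the file once
second = registry.consume(key)  # the key is no longer usable
```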