Python Restricted Endpoint Examples

The HCA API provides several ways for users of the Human Cell Atlas (HCA) to access and download data sets from the HCA. This page covers how to access the HCA using Python API bindings.

The API calls listed here are restricted to those with upload or ingest permissions. Data will be submitted through a single Ingestion Service API. Submitted data will go through basic quality assurance before it is deposited into the Data Storage System (DSS) component.

In the document that follows, privileged user refers to a user with proper credentials and permission to upload/ingest data into the DSS.

NOTE: The HCA CLI utility is compatible with Python 3.5+.
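
Every example on this page repeats the same three lines of client setup. A minimal sketch of a helper that wraps them might look like the following. The names make_dss_client and DEV_SWAGGER_URL are hypothetical and introduced only for illustration; the hca import is deferred into the function body so that the definition itself can be loaded before the package is needed.

```python
# Swagger URL copied from the examples on this page (dev deployment).
DEV_SWAGGER_URL = "https://dss.dev.data.humancellatlas.org/v1/swagger.json"


def make_dss_client(swagger_url=DEV_SWAGGER_URL):
    """Return a DSSClient pointed at swagger_url (hypothetical helper)."""
    # Imported lazily so this function can be defined without hca loaded.
    from hca import HCAConfig
    from hca.dss import DSSClient

    hca_config = HCAConfig()
    hca_config["DSSClient"].swagger_url = swagger_url
    return DSSClient(config=hca_config)
```

With such a helper, each example would start with dss = make_dss_client() instead of configuring the client inline.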

delete_bundle

Deletes an existing bundle given a UUID, version, and replica.

Inputs:

  • uuid - the UUID of the bundle to delete.
  • version - the version of the bundle to delete.
  • replica - which replica to use (corresponds to cloud providers; choices: aws or gcp)
  • reason - a short explanation of why the bundle is being deleted (the example below passes reason='test').

Example call to delete_bundle():

from hca import HCAConfig
from hca.dss import DSSClient

hca_config = HCAConfig()
hca_config["DSSClient"].swagger_url = "https://dss.dev.data.humancellatlas.org/v1/swagger.json"
dss = DSSClient(config=hca_config)

print(dss.delete_bundle(reason='test', uuid='98f6c379-cb78-4a61-9310-f8cc0341c0ea', version='2019-08-02T202456.025543Z', replica='aws'))

put_bundle

Creates a bundle. A bundle can contain multiple files of arbitrary type.

Inputs:

  • uuid - a unique, user-created UUID.
  • creator_uid - the user ID (uid) of the bundle creator. This accepts integer values.
  • version - a unique, user-created version number. Use the create_version() API function to generate a DSS_VERSION.
  • replica - which replica to use (corresponds to cloud providers; choices: aws or gcp)
  • files - a valid list of file objects, separated by commas (e.g., [{<first_file>}, {<second_file>}, ...  ]). Each file object must include the following details:
    • Valid UUID of the file
    • Valid version number of the file
    • Name of the file
    • Boolean value - is this file indexed
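
The version strings used on this page (for example 2019-08-02T202456.025543Z) look like UTC timestamps with the colons removed. The sketch below builds a string in that shape and assembles one file object with the four fields listed above. make_version and make_file_entry are hypothetical helpers, and the timestamp layout is inferred from the examples on this page rather than taken from a specification.

```python
import uuid
from datetime import datetime, timezone


def make_version(moment=None):
    """Format a datetime (assumed to be UTC) like '2019-08-02T202456.025543Z'."""
    moment = moment or datetime.now(timezone.utc)
    return moment.strftime("%Y-%m-%dT%H%M%S.%fZ")


def make_file_entry(file_uuid, version, name, indexed):
    """Build one entry for the files list passed to put_bundle()."""
    uuid.UUID(file_uuid)  # raises ValueError if the UUID is malformed
    if not isinstance(indexed, bool):
        raise TypeError("indexed must be a boolean")
    return {"uuid": file_uuid, "version": version, "name": name, "indexed": indexed}


entry = make_file_entry(
    "2196a626-38da-4489-8b2f-645d342f6aab",
    "2019-07-10T001103.121000Z",
    "process_1.json1",
    False,
)
print(entry)
print(make_version(datetime(2019, 8, 2, 20, 24, 56, 25543, tzinfo=timezone.utc)))
```

The returned dictionary has the same keys as the entry in the files list of the put_bundle() example that follows.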

Example call to put_bundle():

from hca import HCAConfig
from hca.dss import DSSClient

hca_config = HCAConfig()
hca_config["DSSClient"].swagger_url = "https://dss.dev.data.humancellatlas.org/v1/swagger.json"
dss = DSSClient(config=hca_config)

dss.put_bundle(
    creator_uid=0,
    uuid="98f6c379-cb78-4a61-9310-f8cc0341c0ea",
    version="2019-08-02T202456.025543Z",
    replica="aws",
    files=[
        {
            "uuid": "2196a626-38da-4489-8b2f-645d342f6aab",
            "version": "2019-07-10T001103.121000Z",
            "name": "process_1.json1",
            "indexed": False,
        }
    ],
)

patch_bundle

Modifies an existing bundle. The user can pass optional lists of files to add to or remove from the bundle.

add_files/remove_files follow this format:

[
  {
    "path": "string",
    "type": "string",
    "uuid": "string",
    "version": "string"
  }
]

Example call to patch_bundle():

from hca import HCAConfig
from hca.dss import DSSClient

hca_config = HCAConfig()
hca_config["DSSClient"].swagger_url = "https://dss.dev.data.humancellatlas.org/v1/swagger.json"
dss = DSSClient(config=hca_config)

print(dss.patch_bundle(uuid='98f6c379-cb78-4a61-9310-f8cc0341c0ea', version='2019-08-02T202456.025543Z', replica='aws'))

put_file

Creates a new version of a file, given an existing UUID, version, creator uid, and source URL.

Example call to put_file():

from hca import HCAConfig
from hca.dss import DSSClient

hca_config = HCAConfig()

hca_config["DSSClient"].swagger_url = "https://dss.dev.data.humancellatlas.org/v1/swagger.json"
dss = DSSClient(config=hca_config)

print(
    dss.put_file(
        uuid="ead6434d-efb5-4554-98bc-027e160547c5",
        version="2019-07-30T174916.268875Z",
        creator_uid=0,
        source_url="s3://jeffwu-test/ead6434d-efb5-4554-98bc-027e160547c5/get_bundle.json",
    )
)
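
The source_url in the example above points at an object in an S3 bucket. A small pre-flight check, sketched here with the standard library only, can catch a malformed URL before the request is sent. split_source_url is a hypothetical helper, and accepting the gs scheme alongside s3 is an assumption based on the aws and gcp replicas mentioned on this page.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = ("s3", "gs")  # "gs" is an assumption, not confirmed by this page


def split_source_url(source_url):
    """Return (scheme, bucket, key) or raise ValueError for a malformed URL."""
    parsed = urlparse(source_url)
    key = parsed.path.lstrip("/")
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError("unsupported scheme: %r" % parsed.scheme)
    if not parsed.netloc or not key:
        raise ValueError("source_url must name both a bucket and a key")
    return parsed.scheme, parsed.netloc, key


print(split_source_url(
    "s3://jeffwu-test/ead6434d-efb5-4554-98bc-027e160547c5/get_bundle.json"
))
```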

put_collection, delete_collection, patch_collection, get_collection(s)

  • get_collection() - Given a collection UUID, get the collection.
  • get_collections() - Get a list of collections for a given user.
  • delete_collection() - Given a collection UUID and replica, delete the collection from the replica.
  • put_collection() - Create a collection.
  • patch_collection() - Add or remove a given list of files from an existing collection.

To add or remove files with the API endpoints above, specify each file in the following format:

[
  {
    "path": "string",
    "type": "string",
    "uuid": "string",
    "version": "string"
  }
]
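
An entry in this format can be assembled from the same values used to address a bundle. The sketch below mirrors the path layout that appears in the put_collection() example on this page. bundle_entry and DSS_BASE_URL are hypothetical names, and the base URL is the dev deployment used throughout these examples.

```python
DSS_BASE_URL = "https://dss.dev.data.humancellatlas.org/v1"


def bundle_entry(bundle_uuid, version, replica="aws", base_url=DSS_BASE_URL):
    """Build one contents entry that points at a bundle."""
    path = "{}/bundles/{}?version={}&replica={}".format(
        base_url, bundle_uuid, version, replica
    )
    return {"type": "bundle", "uuid": bundle_uuid, "version": version, "path": path}


entry = bundle_entry(
    "ff818282-9735-45fa-a094-e9f2d3d0a954", "2019-08-06T170839.843085Z"
)
print(entry["path"])
```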

Example API calls:

from hca import HCAConfig
from hca.dss import DSSClient
import uuid

hca_config = HCAConfig()
hca_config["DSSClient"].swagger_url = "https://dss.dev.data.humancellatlas.org/v1/swagger.json"
dss = DSSClient(config=hca_config)

# Creates a new collection
collection = dss.put_collection(
    uuid=str(uuid.uuid4()),
    version="2018-09-17T161441.564206Z",  # arbitrary
    description="foo",
    details={},
    replica="aws",
    name="bar",
    contents=[
        {
            "type": "bundle",
            "uuid": "ff818282-9735-45fa-a094-e9f2d3d0a954",  # overwrite if necessary
            "version": "2019-08-06T170839.843085Z",  # arbitrary
            "path": "https://dss.dev.data.humancellatlas.org/v1/bundles/ff818282-9735-45fa-a094-e9f2d3d0a954?version=2019-08-06T170839.843085Z&replica=aws",
        }
    ],
)

# Store the identifiers under new names so the uuid module is not shadowed
collection_uuid, collection_version = collection["uuid"], collection["version"]

# Gets a list of collections
print(dss.get_collections(replica="aws"))

# Can add/remove files from a collection
print(dss.patch_collection(replica="aws", uuid=collection_uuid, version=collection_version))

# Gets a collection based on replica and uuid
print(dss.get_collection(replica="aws", uuid=collection_uuid))

# Deletes a collection based on replica and uuid
print(dss.delete_collection(replica="aws", uuid=collection_uuid))

upload

Uploads a directory of files from the local filesystem and creates a bundle containing the uploaded files.

Example call to upload():

from hca import HCAConfig
from hca.dss import DSSClient
import boto3

# Handle to the staging bucket, used for cleanup after the upload
s3 = boto3.resource('s3')
bucket = s3.Bucket('upload-test-unittest')

hca_config = HCAConfig()
hca_config["DSSClient"].swagger_url = "https://dss.dev.data.humancellatlas.org/v1/swagger.json"
dss = DSSClient(config=hca_config)

print(dss.upload(src_dir="data/", replica="aws", staging_bucket="upload-test-unittest"))

# Remove every object from the staging bucket once the upload has finished
bucket.objects.all().delete()

print("Upload successful")
