Ingestion SDK

Ingesting data from your local environment

You can upload metadata from your local environment directly using the upload_to_lake utility.

from dioptra.lake.utils import upload_to_lake
def upload_to_lake(
    records: List[object]
) -> object
# Returns a dict with an `id` property.
Parameter      Description
records        A list of JSON objects, each describing a datapoint. The objects should follow the format described in Supported fields.
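As a minimal sketch of how this call might be used, the helper below checks that every record is a JSON-serializable object before uploading, then reads the `id` property from the returned dict. The helper names (`check_json_serializable`, `upload_records`, `main`) and the record field `example_field` are hypothetical and used only for illustration; real records should follow the Supported fields format. The upload function is passed in as an argument so the logic can be tried without the SDK installed.

```python
import json
from typing import Any, Callable, Dict, List


def check_json_serializable(records: List[Dict[str, Any]]) -> None:
    """Raise ValueError if any record cannot be encoded as a JSON object."""
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            raise ValueError(f"record {index} is not a JSON object (dict)")
        try:
            json.dumps(record)
        except (TypeError, ValueError) as exc:
            raise ValueError(f"record {index} is not JSON serializable: {exc}") from exc


def upload_records(
    records: List[Dict[str, Any]],
    upload_fn: Callable[[List[Dict[str, Any]]], Dict[str, Any]],
) -> Any:
    """Validate the records, upload them, and return the `id` of the upload."""
    check_json_serializable(records)
    response = upload_fn(records)  # documented to return a dict with an `id` property
    return response["id"]


def main() -> None:
    # Imported lazily so the helpers above can be used without the SDK installed.
    from dioptra.lake.utils import upload_to_lake

    # Hypothetical placeholder record; use the fields listed in Supported fields.
    records = [{"example_field": "example_value"}]
    print(upload_records(records, upload_to_lake))
```

Calling `main()` performs a real upload, so it assumes the Dioptra SDK is installed and configured on the machine.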

Ingesting data located in an object store

If the data is in an object store, you can ingest it directly. The call creates a signed URL for the object, so the object store credentials must be configured on the machine making the call. It then submits an upload request to Dioptra with the signed URL.

from dioptra.lake.utils import upload_to_lake_from_bucket
def upload_to_lake_from_bucket(
    bucket_name: str,
    object_name: str
) -> object
# Returns a dict with an `id` property.
Parameter      Description
bucket_name    The name of the bucket where the NDJSON file is located.
object_name    The path to the newline-delimited JSON (NDJSON) file. The objects in the file should follow the format described in Supported fields.
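The sketch below illustrates the newline-delimited JSON layout that the object is expected to have (one JSON object per line) and how the bucket call might then be invoked. The helper names `to_ndjson` and `ingest_from_bucket` are hypothetical, as is the record field used in the test data. Copying the file into the bucket is left out, since it depends on your object store and its client library.

```python
import json
from typing import Any, Dict, Iterable


def to_ndjson(records: Iterable[Dict[str, Any]]) -> str:
    """Serialize records as newline-delimited JSON, one object per line."""
    # json.dumps escapes newlines inside strings, so each record stays on one line.
    return "".join(json.dumps(record) + "\n" for record in records)


def ingest_from_bucket(bucket_name: str, object_name: str) -> Any:
    """Submit an upload for an NDJSON object already stored in the bucket."""
    # Imported lazily so to_ndjson can be used without the SDK installed.
    from dioptra.lake.utils import upload_to_lake_from_bucket

    response = upload_to_lake_from_bucket(bucket_name, object_name)
    return response["id"]  # documented to return a dict with an `id` property
```

Write the output of `to_ndjson` to a file, copy that file to your bucket, and then pass the bucket name and object path to `ingest_from_bucket`.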

Wait for your upload to finish

Each upload_* method returns a dict whose `id` property is the upload id. To wait for the upload to finish, pass that id to the wait_for_upload method.

def wait_for_upload(upload_id: str) -> object
# Returns a dict with details of an upload
Parameter      Description
upload_id      The id of the upload created by a call to upload_to_lake*
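Putting the two steps together, a minimal sketch of uploading and then waiting might look like the following. The helper names `upload_and_wait` and `main` are hypothetical. The import path for `wait_for_upload` is an assumption (this page does not show it), as is the placeholder record field; the structure of the returned details dict is not specified here, so the sketch returns it unchanged.

```python
from typing import Any, Callable, Dict, List


def upload_and_wait(
    records: List[Dict[str, Any]],
    upload_fn: Callable[[List[Dict[str, Any]]], Dict[str, Any]],
    wait_fn: Callable[[str], Dict[str, Any]],
) -> Dict[str, Any]:
    """Start an upload, wait for it, and return the upload details."""
    upload = upload_fn(records)   # dict with an `id` property
    return wait_fn(upload["id"])  # dict with details of the upload


def main() -> None:
    # ASSUMPTION: wait_for_upload is importable from the same module as
    # upload_to_lake; adjust the import if your SDK version differs.
    from dioptra.lake.utils import upload_to_lake, wait_for_upload

    # Hypothetical placeholder record; use the fields listed in Supported fields.
    records = [{"example_field": "example_value"}]
    print(upload_and_wait(records, upload_to_lake, wait_for_upload))
```

Passing the upload and wait functions as arguments keeps the helper usable with either upload_to_lake variant wrapped in a small function of your own.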
