Supported fields

Datapoints

Datapoints are the core element of katiML. Each datapoint describes a unique piece of data like an image, a video or a document. Below are the supported fields per data type.

"type": "IMAGE",
"metadata": {
    "width": "INT", 
    "height": "INT",
    "uri": "STRING", // URI of the image (supported: AWS S3, GCP, Azure)
    "normalized_roi": { // crop of the image the model focuses on
        "top": "INTEGER",
        "left": "INTEGER",
        "height": "INTEGER",
        "width": "INTEGER",
        "STRING": "STRING" // any key value pair you'd like
    },
    "brightness": "FLOAT",
    "sharpness": "FLOAT",
    "contrast": "FLOAT"
}

"type": "TEXT",
"text": "STRING",
"metadata": {
    "STRING": "STRING" // any key value pair you'd like
}

"type": "VIDEO",
"metadata": { 
    "width": "INT",
    "height": "INT",
    "uri": "STRING", // URI of the video (supported: AWS S3, GCP, Azure)
    "frame": "INT",
    "frame_rate": "FLOAT"
}

"type": "TEXT",
"metadata": {
    "uri": "STRING", // URI of the document (supported: AWS S3, GCP, Azure)
    "paragraph": { // a section of the document the model focuses on
        "token_start": "INTEGER",
        "token_end": "INTEGER"
    }
}

On top of this, generic fields such as tags, a timestamp, or GPS coordinates can be passed:

{
    "gps_coordinates": {
        "lat": "FLOAT",
        "long": "FLOAT"
    },
    "timestamp": "TIMESTAMP", // default: ingestion time
    "tags": {
        "STRING": "STRING" //  key-value pairs for arbitrary metadata
    }
}
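
As a rough illustration of how these fields fit together, the sketch below assembles an image datapoint as a plain Python dict. The helper name `make_image_datapoint`, the bucket URI, and the tag values are hypothetical. Two things are assumptions rather than documented behavior: that generic fields sit at the top level next to `type` and `metadata`, and that the timestamp is passed as an ISO 8601 string.

```python
import json
from datetime import datetime, timezone


def make_image_datapoint(uri, width, height, tags=None, timestamp=None):
    """Assemble an IMAGE datapoint dict using the field names listed above.

    Hypothetical helper: generic fields are assumed to live at the top level.
    """
    datapoint = {
        "type": "IMAGE",
        "metadata": {"uri": uri, "width": int(width), "height": int(height)},
    }
    if tags:
        # The schema lists tags as STRING -> STRING pairs, so coerce both sides.
        datapoint["tags"] = {str(k): str(v) for k, v in tags.items()}
    if timestamp is not None:
        # Assumption: an ISO 8601 string is an acceptable TIMESTAMP value.
        datapoint["timestamp"] = timestamp.isoformat()
    return datapoint


record = make_image_datapoint(
    "s3://example-bucket/images/0001.jpg",  # placeholder URI
    width=1280,
    height=720,
    tags={"split": "train", "camera": 3},
    timestamp=datetime(2023, 1, 1, tzinfo=timezone.utc),
)
print(json.dumps(record, indent=2))
```

When no timestamp is supplied the key is left out, so the default noted above (ingestion time) would apply.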

Ground Truths

In katiML, ground truths are attached to datapoints. Their format depends on task type.

"groundtruths": [{
    "task_type": "CLASSIFICATION",
    "class_name": "STRING"
}]

"groundtruths": [{
    "task_type": "COMPLETION",
    "completions": [{
        "text": "STRING"
    }, {...}]
}]

"groundtruths": [{
    "task_type": "INSTANCE_SEGMENTATION",
    "bboxes": [{
        "coco_polygon": ["INTEGER"], // a flat list of x, y coordinates of a polygon in COCO format
        "class_name": "STRING" // optional class name
    }]
}]
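
The flat `coco_polygon` list interleaves coordinates as `[x1, y1, x2, y2, ...]`. A minimal sketch of converting a list of `(x, y)` points into that layout is shown here; the helper names `to_coco_flat` and `instance_groundtruth` are hypothetical and not part of any katiML SDK.

```python
def to_coco_flat(points):
    """Flatten [(x, y), ...] into the COCO-style list [x1, y1, x2, y2, ...]."""
    flat = []
    for x, y in points:
        flat.extend([int(x), int(y)])
    return flat


def instance_groundtruth(polygons):
    """Build an INSTANCE_SEGMENTATION ground truth.

    polygons: list of (class_name, [(x, y), ...]) tuples.
    """
    return {
        "task_type": "INSTANCE_SEGMENTATION",
        "bboxes": [
            {"coco_polygon": to_coco_flat(points), "class_name": name}
            for name, points in polygons
        ],
    }


triangle = [(10, 20), (30, 20), (30, 40)]
groundtruth = instance_groundtruth([("car", triangle)])
print(groundtruth["bboxes"][0]["coco_polygon"])
```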

"groundtruths": [{
    "task_type": "SEGMENTATION",
    "class_names": ["STRING"], // array of class names; each index matches the value used in the class mask
    "segmentation_class_mask": [["INTEGER"]] // 2D array (format HxW); each element is the index of the corresponding class
}]
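
Because every value in `segmentation_class_mask` is an index into `class_names`, it can help to check the two against each other before sending a ground truth. The following is a sketch of such a client-side check; `segmentation_groundtruth` is a hypothetical helper, and nothing here is claimed to mirror validation performed by katiML itself.

```python
def segmentation_groundtruth(class_names, mask):
    """Build a SEGMENTATION ground truth after basic consistency checks.

    mask is a list of rows (H x W); each cell is an index into class_names.
    """
    width = len(mask[0]) if mask else 0
    for row in mask:
        if len(row) != width:
            raise ValueError("all mask rows must have the same width")
        for index in row:
            if not 0 <= index < len(class_names):
                raise ValueError(f"mask index {index} has no matching class name")
    return {
        "task_type": "SEGMENTATION",
        "class_names": list(class_names),
        "segmentation_class_mask": [list(row) for row in mask],
    }


groundtruth = segmentation_groundtruth(
    ["background", "road", "car"],
    [
        [0, 0, 1],
        [0, 2, 1],
    ],
)
print(len(groundtruth["segmentation_class_mask"]), "rows")
```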

"groundtruths": [{
    "task_type": "LANE_DETECTION",
    "lanes": [{
        "coco_polyline": ["INTEGER"], // a flat list of x, y coordinates of a polyline in COCO format
        "classifications": [{
            "name": "STRING", // 'color', 'type', 'ego_position', etc...
            "value": "STRING" // 'yellow', 'continuous', 'left', etc...
        }, {...}]
    }]
}]

Predictions

In katiML, predictions are attached to datapoints. Their format depends on task type.

"predictions": [{
    "task_type": "CLASSIFICATION",
    "model_name": "STRING", // name of the model
    "logits": ["FLOAT"], // 1D or 2D array of logits, indexed to match class_names; a 2D array is interpreted as MC dropout
    "class_names": ["STRING"], // array of class names
    "embeddings": {
        "STRING": ["FLOAT"] // key-value pair: the key is the layer name and the value is a 1D, 2D or 3D array of embedding values (3D format: CxHxW)
    },
    "metrics": {
        "STRING": "FLOAT",
        ...
    }
}]
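
To make the 1D versus 2D `logits` distinction concrete, here is a sketch that builds a classification prediction from several MC dropout passes and averages the per-pass softmax locally. The `(passes, classes)` layout of the 2D array is an assumption, the model name is a placeholder, and the averaging only illustrates how such logits are commonly summarized; it is not a description of what katiML computes.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a 1D list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def mean_probabilities(logits):
    """Accept 1D logits or 2D MC dropout logits (assumed passes x classes)."""
    passes = logits if isinstance(logits[0], (list, tuple)) else [logits]
    per_pass = [softmax(single_pass) for single_pass in passes]
    # Average each class column across the passes.
    return [sum(column) / len(per_pass) for column in zip(*per_pass)]


class_names = ["cat", "dog", "bird"]
mc_logits = [
    [2.0, 0.5, 0.1],  # forward pass 1
    [1.5, 0.7, 0.2],  # forward pass 2
]

prediction = {
    "task_type": "CLASSIFICATION",
    "model_name": "example-model",  # placeholder name
    "logits": mc_logits,
    "class_names": class_names,
}

probabilities = mean_probabilities(prediction["logits"])
top_class = class_names[probabilities.index(max(probabilities))]
print(top_class)
```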

"predictions": [{
    "task_type": "COMPLETION",
    "completions": [{
        "text": "STRING",
        "confidence": "FLOAT",
        "metrics": {
            "STRING": "FLOAT",
            ...
        }
    }, {...}]
}]

"predictions": [{
    "task_type": "INSTANCE_SEGMENTATION",
    "model_name": "STRING", // name of the model
    "bboxes": [{
        "confidence": "FLOAT", // optional
        "confidences": ["FLOAT"], // optional; argmax will populate confidence
        "class_name": "STRING", // optional
        "class_names": ["STRING"], // optional; argmax will populate class_name
        "coco_polygon": ["INTEGER"], // a flat list of x, y coordinates of a polygon in COCO format
        "top": "INTEGER", // if not provided, will be computed from the mask
        "left": "INTEGER", // if not provided, will be computed from the mask
        "height": "INTEGER", // if not provided, will be computed from the mask
        "width": "INTEGER" // if not provided, will be computed from the mask
    }],
    "embeddings": {
        "STRING": ["FLOAT"] // key value pair: key is the layer name and value is a 1d, 2d or 3d array with the embeddings values. 3d format CxHxW
    },
    "metrics": {
        "STRING": "FLOAT",
        ...
    }
}]
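
The comments above say that an argmax over `confidences` and `class_names` populates `confidence` and `class_name`, and that the box fields are derived from the mask when omitted. The sketch below shows a client-side equivalent of both steps for anyone who wants to inspect the values before upload. The function names are hypothetical, and the box convention (`height` and `width` as max minus min, computed from the polygon vertices) is an assumption, not documented katiML behavior.

```python
def resolve_top_class(bbox):
    """Return a copy of bbox with confidence/class_name filled in via argmax."""
    resolved = dict(bbox)
    confidences = bbox.get("confidences")
    class_names = bbox.get("class_names")
    if confidences and class_names:
        if len(confidences) != len(class_names):
            raise ValueError("confidences and class_names must have equal length")
        best = max(range(len(confidences)), key=confidences.__getitem__)
        resolved["confidence"] = confidences[best]
        resolved["class_name"] = class_names[best]
    return resolved


def polygon_bounds(coco_polygon):
    """Bounding box of a flat [x1, y1, x2, y2, ...] polygon.

    Assumption: height and width are plain max - min differences.
    """
    xs = coco_polygon[0::2]
    ys = coco_polygon[1::2]
    return {
        "top": min(ys),
        "left": min(xs),
        "height": max(ys) - min(ys),
        "width": max(xs) - min(xs),
    }


bbox = {
    "confidences": [0.1, 0.7, 0.2],
    "class_names": ["car", "truck", "bus"],
    "coco_polygon": [10, 20, 30, 20, 30, 40],
}
resolved = resolve_top_class(bbox)
resolved.update(polygon_bounds(bbox["coco_polygon"]))
print(resolved["class_name"], resolved["confidence"])
```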

"predictions": [{
    "task_type": "SEGMENTATION",
    "model_name": "STRING", // name of the model
    "logits": ["FLOAT"], // 3D or 4D array of logits, indexed to match class_names (format: CxHxW); a 4D array is interpreted as MC dropout
    "segmentation_class_mask": [["INTEGER"]], // 2d array with each element having the index of the corresponding class. format HxW
    "embeddings": {
        "STRING": ["FLOAT"] // key value pair: key is the layer name and value is a 1d, 2d or 3d array with the embeddings values. 3d format CxHxW
    },
    "metrics": {
        "STRING": "FLOAT",
        ...
    }
}]

"predictions": [{
    "task_type": "LANE_DETECTION",
    "model_name": "STRING", // name of the model
    "lanes": [{
        "coco_polyline": ["INTEGER"], // a flat list of x, y coordinates of a polyline in COCO format
        "confidence": "FLOAT",
        "classifications": [{
            "name": "STRING", // 'color', 'type', 'ego_position', etc...
            "value": "STRING", // 'yellow', 'continuous', 'left', etc...
            "confidence": "FLOAT",
            "confidences": ["FLOAT"], // if given, will populate "value" and "confidence"
            "values": ["STRING"] // if given, will populate "value" and "confidence"
        }, {...}]
    }],
    "embeddings": {
        "STRING": ["FLOAT"] // key value pair: key is the layer name and value is a 1d, 2d or 3d array with the embeddings values. 3d format CxHxW
    },
    "grad_embeddings": {
        "STRING": ["FLOAT"] // key value pair: key is the layer name and value is a 1d, 2d or 3d array with the embeddings values. 3d format CxHxW
    },
    "metrics": {
        "STRING": "FLOAT",
        ...
    }
}]
