Activity API

The Activity API lets you view details about a specific harvesting job. Use it to find extractor errors and the items that had a specific error message.

List Extractor Errors

This API shows the number of errors each extractor had.

GET /api/data/v3/activity/{request_id}/extractors
  • {request_id} - (string) Harvest Job request ID

You will receive the following response:

{
	"extractors": [{
			"name": "captions",
			"display_name": "Embedded Captions",
			"num_errors": 0
		},
		{
			"name": "document_pages",
			"display_name": "Documents",
			"num_errors": 0
		},
		{
			"name": "drm",
			"display_name": "DRM",
			"num_errors": 2
		},
		{
			"name": "exiv2",
			"display_name": "EXIV2",
			"num_errors": 0
		}
	]
}
  • extractors[].name - (string) The actual name of the extractor that ran in this harvest. This value is used as {extractor} in all other Activity endpoint API requests.
  • extractors[].display_name - (string) The display name of the extractor that ran in this harvest
  • extractors[].num_errors - (integer) Number of errors the extractor caught
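
As a minimal sketch of how a client might use this response, the Python below reduces the extractors array to the entries with a nonzero num_errors. The function name extractors_with_errors is hypothetical, and the sample dictionary simply mirrors the response shown above. Fetching the JSON (base URL, authentication) is left out because this section does not describe it.

```python
from typing import Any


def extractors_with_errors(response: dict[str, Any]) -> list[tuple[str, int]]:
    """Return (name, num_errors) pairs for extractors that reported errors."""
    failing = []
    for extractor in response.get("extractors", []):
        count = extractor.get("num_errors", 0)
        if count > 0:
            failing.append((extractor["name"], count))
    return failing


# Sample data copied from the response documented above.
sample = {
    "extractors": [
        {"name": "captions", "display_name": "Embedded Captions", "num_errors": 0},
        {"name": "document_pages", "display_name": "Documents", "num_errors": 0},
        {"name": "drm", "display_name": "DRM", "num_errors": 2},
        {"name": "exiv2", "display_name": "EXIV2", "num_errors": 0},
    ]
}

print(extractors_with_errors(sample))  # [('drm', 2)]
```

Each returned name can then be used as {extractor} in the other Activity endpoints.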

List Errors for an Extractor

This API lists all errors that the given extractor had in this harvesting job. Each error has an error_hash that you can use to find the affected items with the next API endpoint.

GET /api/data/v3/activity/{request_id}/extractors/{extractor}?page_token={token}&limit={limit}&offset={offset}
  • {request_id} - (string) Harvest Job request ID
  • {extractor} - (string) Extractor name
Optional Queries
  • limit - (integer) Maximum results to show
  • offset - (integer) Number of results to skip (used for pagination)
  • page-token - (string) Token for the page of results to show, taken from a next_page_token or previous_page_token value.

You will receive the following response:

{
	"extractor": "credits",
	"extractor_name": "Credits",
	"errors": [{
		"error": "credits: error message ouchhh",
		"error_hash": "8f1e07deac51451cc1ed2778312af796",
		"num_files": 1
	}],
	"next_page_token": "",
	"previous_page_token": ""
}
  • extractor - (string) The actual name of the extractor being searched
  • extractor_name - (string) The display name of the extractor being searched
  • errors[].error - (string) The error message that occurred
  • errors[].error_hash - (string) The MD5 hash of the error message
  • errors[].num_files - (integer) Number of files that had this error message
  • num_files - (integer) Total number of files that had errors with this extractor
  • next_page_token - (string) Token for the next page of results, for use with the page-token query
  • previous_page_token - (string) Token for the previous page of results, for use with the page-token query

Use the previous and next page tokens to retrieve the previous or next page of results. If a token is an empty string, there are no more results in that direction. Pass the previous_page_token or next_page_token value as the page-token URL query string value to fetch that page.
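
The token-following loop described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the platform's client: fetch_page is a hypothetical callable that performs the HTTP request for one page and returns the decoded JSON, and the sketch assumes the first page is requested with an empty token. Because the query parameter is written as page_token in the URL template and page-token in the parameter list, the choice of name is left to that callable.

```python
from typing import Any, Callable, Iterator

# Hypothetical callable: takes a page token ("" for the first page) and
# returns the decoded JSON response for that page.
FetchPage = Callable[[str], dict[str, Any]]


def iter_errors(fetch_page: FetchPage) -> Iterator[dict[str, Any]]:
    """Yield every entry of errors[] across all pages via next_page_token."""
    token = ""
    while True:
        page = fetch_page(token)
        yield from page.get("errors", [])
        token = page.get("next_page_token", "")
        if not token:  # an empty token means there are no more results
            break


# Stand-in for real HTTP calls: two made-up pages keyed by page token.
fake_pages = {
    "": {
        "errors": [{"error": "a", "error_hash": "h1", "num_files": 1}],
        "next_page_token": "page-2",
        "previous_page_token": "",
    },
    "page-2": {
        "errors": [{"error": "b", "error_hash": "h2", "num_files": 3}],
        "next_page_token": "",
        "previous_page_token": "page-1",
    },
}

all_errors = list(iter_errors(lambda token: fake_pages[token]))
print([e["error_hash"] for e in all_errors])  # ['h1', 'h2']
```

Injecting the fetch function keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned pages.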

List Items That Have a Specific Error

This API lists all items in the harvesting job that had this specific error message. The error_md5 is an MD5 hash of the error message string.

GET /api/data/v3/activity/{request_id}/extractors/{extractor}/errors/{error_md5}/files?page_token={token}&limit={limit}&offset={offset}
  • {request_id} - (string) Harvest Job request ID
  • {extractor} - (string) Extractor name
  • {error_md5} - (string) Error message hashed into MD5.
Optional Queries
  • limit - (integer) Maximum results to show
  • offset - (integer) Number of results to skip (used for pagination)
  • page-token - (string) Token for the page of results to show, taken from a next_page_token or previous_page_token value.
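
In most cases the error_hash returned by the previous endpoint can be passed directly as {error_md5}. If you need to compute the hash yourself, the Python sketch below shows one way, together with building the request path. The helper names error_md5 and files_path are hypothetical, and hashing the UTF-8 bytes of the exact message into a lowercase hex digest is an assumption; this section does not state the encoding.

```python
import hashlib
from urllib.parse import quote, urlencode


def error_md5(message: str) -> str:
    """Hex MD5 digest of an error message (UTF-8 encoding is an assumption)."""
    return hashlib.md5(message.encode("utf-8")).hexdigest()


def files_path(request_id: str, extractor: str, md5_hex: str, **query: object) -> str:
    """Build the request path for the files endpoint, with optional query values."""
    path = (
        f"/api/data/v3/activity/{quote(request_id, safe='')}"
        f"/extractors/{quote(extractor, safe='')}"
        f"/errors/{quote(md5_hex, safe='')}/files"
    )
    return f"{path}?{urlencode(query)}" if query else path


digest = error_md5("credits: error message ouchhh")
print(len(digest))  # 32 (hexadecimal characters)
print(files_path("5dfd2a32e285d27ed8f17d46f29869d2", "credits", digest, limit=10))
```

The host, scheme, and any authentication headers still have to be added by the caller.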

You will receive the following response:

{
	"files": [{
			"item_id": "8535dc5d1de2e83fba40bc0fe66a9ab8"
		},
		{
			"item_id": "08376d120bcd1cb8d14329e1e8e9cb71"
		},
		{
			"item_id": "69f6a57e28e7358f554388f1be6b6be3"
		}
	],
	"num_files": 3,
	"next_page_token": "",
	"previous_page_token": ""
}
  • files[].item_id - (string) The Item ID that had this extractor error
  • num_files - (integer) Total number of files that had errors with the extractor
  • next_page_token - (string) Token for the next page of results, for use with the page-token query
  • previous_page_token - (string) Token for the previous page of results, for use with the page-token query

Use the previous and next page tokens to retrieve the previous or next page of results. If a token is an empty string, there are no more results in that direction. Pass the previous_page_token or next_page_token value as the page-token URL query string value to fetch that page.

List Multiple Extractor Errors

This API returns extractor errors for all of the request IDs you send as a JSON array of strings.

POST /api/data/v3/activity/bulk/extractors

{"jobs": ["{request_id}", "{request_id}"]}
  • jobs - (array of strings) The array of request IDs, sent inside a JSON object.
  • jobs[] - (string) The request ID for a specific job
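
A minimal Python sketch of building this request body is shown below. The helper name bulk_extractors_body is hypothetical, and rejecting empty or non-string IDs is a client-side choice made for this example rather than documented server behavior.

```python
import json


def bulk_extractors_body(request_ids: list[str]) -> str:
    """Serialize request IDs into the {"jobs": [...]} body for the bulk endpoint."""
    if not all(isinstance(rid, str) and rid for rid in request_ids):
        raise ValueError("every request ID must be a non-empty string")
    return json.dumps({"jobs": list(request_ids)})


body = bulk_extractors_body([
    "5dfd2a32e285d27ed8f17d46f29869d2",
    "5dfd1289dc77d22895667329961ceece",
])
print(body)
```

The resulting string is what would be sent as the POST body.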

You will receive the following response in the same order as your array of request IDs:

[
	{
		"request_id": "5dfd2a32e285d27ed8f17d46f29869d2",
		"extractors": [
			{
				"name": "black_scenes",
				"display_name": "Black Frames",
				"num_errors": 0
			},
			{
				"name": "drm",
				"display_name": "DRM",
				"num_errors": 0
			},
			{
				"name": "exiv2",
				"display_name": "EXIV2",
				"num_errors": 0
			},
			{
				"name": "hashes",
				"display_name": "Hashes",
				"num_errors": 0
			}
		]
	},
	{
		"request_id": "5dfd1289dc77d22895667329961ceece",
		"extractors": [
			{
				"name": "drm",
				"display_name": "DRM",
				"num_errors": 0
			},
			{
				"name": "exiv2",
				"display_name": "EXIV2",
				"num_errors": 0
			},
			{
				"name": "hashes",
				"display_name": "Hashes",
				"num_errors": 0
			}
		]
	}
]
  • request_id - (string) The request ID for this object
  • extractors[].name - (string) The actual name of the extractor that ran in this harvest. This value is used as {extractor} in all other Activity endpoint API requests.
  • extractors[].display_name - (string) The display name of the extractor that ran in this harvest
  • extractors[].num_errors - (integer) Number of errors the extractor caught
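
To illustrate working with the bulk response, the Python sketch below totals num_errors per request_id. The function name total_errors_by_request is hypothetical, and the sample data uses made-up request IDs and error counts in the documented shape.

```python
from typing import Any


def total_errors_by_request(results: list[dict[str, Any]]) -> dict[str, int]:
    """Map each request_id to the sum of num_errors over its extractors."""
    return {
        job["request_id"]: sum(
            e.get("num_errors", 0) for e in job.get("extractors", [])
        )
        for job in results
    }


# Abbreviated sample in the documented shape; IDs and counts are made up.
sample_results = [
    {"request_id": "job-a", "extractors": [
        {"name": "drm", "display_name": "DRM", "num_errors": 2},
        {"name": "exiv2", "display_name": "EXIV2", "num_errors": 1},
    ]},
    {"request_id": "job-b", "extractors": [
        {"name": "hashes", "display_name": "Hashes", "num_errors": 0},
    ]},
]

print(total_errors_by_request(sample_results))  # {'job-a': 3, 'job-b': 0}
```

Jobs with a nonzero total are the ones worth inspecting further with the per-extractor endpoints.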

This documentation is generated from the latest version of GrayMeta Platform. For documentation relevant to your own deployed version, please use the documentation inside the application.