Search API

Overview

To search the metadata in the GrayMeta Platform, make the following request:

POST /api/data/search
{
    "query": {query},
    "limit": {limit},
    "page": {page},
    "types": {types},
    "fields": {fields},
    "sort_fields": {sort_fields},
    "only": {only},
    "filters": {filters},
    "aggregations": {aggregations},
    "hit_counts": {
        "advertising": 1,
        "audio_classification": 1,
        "caption": 2,
        "description": 3,
        "labels": 5,
        "locations": 8,
        "logos": 13,
        "keyword": 21,
        "nsfw": 21,
        "ocr": 13,
        "people": 8,
        "sound": 5,
        "speech_to_text": 3,
        "sport": 2,
        "tag": 1,
        "text_content": 1
    }
}
  • query (string) - The text to search for - See the Full text search documentation

  • limit (int) - The number of items to return per page

  • page (int) - Zero-based index of the page to return (0 is the first page)

  • types (array of strings) Optional - List of types of media to include. Default: all types.

  • fields (array of strings) Optional - List of fields on which to perform the full-text search. Default: all fields.

  • sort_fields.field (string) Optional - Field used for sorting

  • sort_fields.asc (bool) Optional - Set to true to sort ascending.

  • only (array of strings) Optional - List of fields that are returned with the results. Default: all fields.

  • filters (object) Optional - Object describing additional filters to apply - See the Filters documentation

  • aggregations (object) Optional - Object describing aggregations to request - See the Aggregations documentation

  • hit_counts (object) - List of hit counts by item data type (any type which has 0 hits will be omitted)
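
As a rough illustration of the request described above, the following Python sketch assembles a search body and wraps it in a POST request using only the standard library. The base URL, the helper names `build_search_body` and `build_search_request`, and the `Content-Type` header are assumptions made for the example; authentication is not covered on this page and is left out. `sort_fields`, `filters` and `aggregations` are omitted because their exact shapes are documented elsewhere.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the address of your own deployment.
BASE_URL = "https://graymeta.example.com"


def build_search_body(query, limit=10, page=0, types=None, fields=None, only=None):
    """Assemble the JSON body for POST /api/data/search.

    Optional parameters are left out when not given, so the documented
    defaults (all types, all fields) apply.
    """
    if page < 0:
        raise ValueError("page is zero-based and cannot be negative")
    body = {"query": query, "limit": limit, "page": page}
    for key, value in (("types", types), ("fields", fields), ("only", only)):
        if value is not None:
            body[key] = list(value)
    return body


def build_search_request(body, base_url=BASE_URL):
    """Wrap the body in a POST request object without sending it."""
    return urllib.request.Request(
        url=base_url + "/api/data/search",
        data=json.dumps(body).encode("utf-8"),
        # The Content-Type header is an assumption, not taken from this page.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


body = build_search_body("interview", limit=25, page=0, only=["name", "mime_type"])
request = build_search_request(body)
print(request.get_method(), request.full_url)
print(json.dumps(body, indent=2))
```

Passing the request object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without a live deployment.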

Search response

A typical search query response looks like this:

{
    "query": "{query}",
    "limit": {limit},
    "page": {page},
    "total_hits": {total_hits},
    "results": [
        {
            "result": {
                "_id": "b44698697fc7de5725d1e1fe22d38e32",
                "last_modified": "2016-07-20T21:21:09Z",
                "location_id": "AVbdTJJcuT9aoTJxqbUD",
                "location_kind": "azure",
                "location_name": "Demo Content",
                "mime_type": "audio/x-wav",
                "name": "Annie and Brie/Raw Audio/AB3-1C_2561.wav",
                "stow_container_id": "annie-and-brie",
                "stow_container_name": "annie-and-brie",
                "stow_url": "azure://democontent.blob.core.windows.net/annie-and-brie/Annie%20and%20Brie/Raw%20Audio/AB3-1C_2561.wav"
            },
            "highlight": [
                {"field": "title", "fragments": ["example <em>query</em>"]},
                {"field": "otherfield", "fragments": ["another text <em>highlight here</em>"]}
            ],
            "score": 0.93453264
        }
    ],
    "filters": {},
    "aggregations": {},
    "hit_counts": {
        "advertising": 1,
        "audio_classification": 1,
        "caption": 2,
        "description": 3,
        "labels": 5,
        "locations": 8,
        "logos": 13,
        "keyword": 21,
        "nsfw": 21,
        "ocr": 13,
        "people": 8,
        "sound": 5,
        "speech_to_text": 3,
        "sport": 2,
        "tag": 1,
        "text_content": 1
    }
}

query, limit and page repeat the input values that yielded the results.

  • total_hits (int) Approximate number of total hits for the given search
  • results - (array) Array of result objects (see Result fields section below)
  • highlight - (object) An object containing HTML indicating why an item was matched (see Highlight fields section below)
  • score - (float) A decimal value between 0 and 1 indicating how relevant this item is to the search query (0 being not relevant, 1 being most relevant; low numbers are common)
  • filters - (object) Object describing the filters that were applied in the search request
  • aggregations - (object) The aggregation results
  • hit_counts (object) - List of hit counts by item data type (any type which has 0 hits will be omitted)
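
To show how these fields fit together, here is a minimal Python sketch that walks a response and collects the name, score and highlight fragments of each result. The `summarize_results` helper is hypothetical, the sample is a trimmed copy of the response above with placeholder values filled in for `query`, `limit`, `page` and `total_hits`, and the highlight entries are read in the `field`/`fragments` form that the sample response uses.

```python
import json

# Trimmed copy of the sample response, with placeholder values filled in.
SAMPLE_RESPONSE = """
{
    "query": "query",
    "limit": 10,
    "page": 0,
    "total_hits": 1,
    "results": [
        {
            "result": {
                "_id": "b44698697fc7de5725d1e1fe22d38e32",
                "mime_type": "audio/x-wav",
                "name": "Annie and Brie/Raw Audio/AB3-1C_2561.wav"
            },
            "highlight": [
                {"field": "title", "fragments": ["example <em>query</em>"]}
            ],
            "score": 0.93453264
        }
    ],
    "hit_counts": {"caption": 2, "labels": 5}
}
"""


def summarize_results(response):
    """Return one (name, score, fragments) tuple per result entry."""
    rows = []
    for entry in response.get("results", []):
        item = entry.get("result", {})
        fragments = [
            fragment
            for highlight in entry.get("highlight", [])
            for fragment in highlight.get("fragments", [])
        ]
        rows.append((item.get("name"), entry.get("score"), fragments))
    return rows


response = json.loads(SAMPLE_RESPONSE)
for name, score, fragments in summarize_results(response):
    print(f"{score:.2f}  {name}  {fragments}")

# Types with zero hits are omitted, so fall back to 0 when looking one up.
print(response.get("hit_counts", {}).get("ocr", 0))
```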

Highlight fields

  • title - (string) If the match occurred within the name, the title will explain where; if empty, assume result.name
  • fulltext - (string) An HTML preview of why a search result is relevant to the query

Some examples of setting time ranges as a filter. The first uses a relative start time with no end; the second uses absolute timestamps:

```json
"filters": {
    "ranges": [
        {
            "field": "last_harvested",
            "from": "now-1h",
            "to": ""
        }
    ]
}

"filters": {
    "ranges": [
        {
            "field": "last_harvested",
            "from": "2017-12-31T19:30:00.000Z",
            "to": "2017-12-31T19:45:00.000Z"
        }
    ]
}
```
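
Range filters like the ones above can also be built programmatically. The Python sketch below is one possible way to do it: `range_filter` and `to_timestamp` are hypothetical helper names, and the millisecond UTC timestamp format is copied from the example rather than from a formal specification. Relative expressions such as `now-1h` are passed through as plain strings.

```python
from datetime import datetime, timezone


def to_timestamp(moment):
    """Format a timezone-aware datetime as UTC with millisecond precision."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def range_filter(field, start="", end=""):
    """Build a filters object with one range; strings pass through unchanged."""
    def convert(value):
        return to_timestamp(value) if isinstance(value, datetime) else value

    return {"ranges": [{"field": field, "from": convert(start), "to": convert(end)}]}


# Relative start time, open end.
last_hour = range_filter("last_harvested", start="now-1h")

# Fixed 15-minute window given as aware datetimes.
window = range_filter(
    "last_harvested",
    start=datetime(2017, 12, 31, 19, 30, tzinfo=timezone.utc),
    end=datetime(2017, 12, 31, 19, 45, tzinfo=timezone.utc),
)

print(last_hour)
print(window)
```

The resulting dictionaries are meant to be used as the value of `filters` in the search request body.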

## Result fields

The result objects contain an overview of Item Object metadata.

* `_id` - (string) Unique Item ID
* `last_modified` - (timestamp) When the item was last modified
* `location_id` - (string) The ID of the Location where this Item was found
* `location_kind` - (string) The Location Kind (see the [Location Kinds API](Location_Kinds_API.md) for more information) of the Location where this Item was found
* `mime_type` - (string) MIME type for the item
* `name` - (string) Name of the item (usually filename)
* `stow_container_id` - (string) Stow Container ID of where this Item was found
* `stow_container_name` - (string) Name of the Stow Container where this Item was found
* `stow_url` - (string) The Stow URL of this Item.
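
For clients that prefer typed access to these fields, a small wrapper along the following lines may help. This is only a sketch: the `ResultItem` class is hypothetical, it assumes every field listed above is present (which may not hold when the request restricts fields with `only`), and it ignores any extra keys such as `location_name`.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ResultItem:
    """Typed view of one `result` object (hypothetical helper class)."""
    id: str
    last_modified: datetime
    location_id: str
    location_kind: str
    mime_type: str
    name: str
    stow_container_id: str
    stow_container_name: str
    stow_url: str

    @classmethod
    def from_dict(cls, data):
        # Rewrite the trailing "Z" so fromisoformat() also accepts the
        # timestamp on Python versions older than 3.11.
        modified = datetime.fromisoformat(data["last_modified"].replace("Z", "+00:00"))
        return cls(
            id=data["_id"],
            last_modified=modified,
            location_id=data["location_id"],
            location_kind=data["location_kind"],
            mime_type=data["mime_type"],
            name=data["name"],
            stow_container_id=data["stow_container_id"],
            stow_container_name=data["stow_container_name"],
            stow_url=data["stow_url"],
        )


# Values taken from the sample search response.
sample = {
    "_id": "b44698697fc7de5725d1e1fe22d38e32",
    "last_modified": "2016-07-20T21:21:09Z",
    "location_id": "AVbdTJJcuT9aoTJxqbUD",
    "location_kind": "azure",
    "location_name": "Demo Content",
    "mime_type": "audio/x-wav",
    "name": "Annie and Brie/Raw Audio/AB3-1C_2561.wav",
    "stow_container_id": "annie-and-brie",
    "stow_container_name": "annie-and-brie",
    "stow_url": "azure://democontent.blob.core.windows.net/annie-and-brie/Annie%20and%20Brie/Raw%20Audio/AB3-1C_2561.wav",
}

item = ResultItem.from_dict(sample)
print(item.name, item.last_modified.isoformat())
```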

## Search Analytics

To get an overview of your platform data, call this endpoint. It returns high-level groupings of your files, for example by type, extension and location.

GET /api/v3/search/analytics


### Analytics Response

```json
{
    "explicit_content": [
        {
            "key": "false",
            "count": 58
        },
        {
            "key": "true",
            "count": 4
        }
    ],
    "files_by_category": [
        {
            "key": "cat2",
            "count": 5
        },
        {
            "key": "cat3",
            "count": 3
        },
        ...
    ],
    "files_by_extension": [
        {
            "key": "jpg",
            "count": 200
        },
        {
            "key": "mp3",
            "count": 17
        },
        {
            "key": "mp4",
            "count": 16
        },
        ...
    ],
    "files_by_location": [
        {
            "key": "theberg",
            "count": 144
        },
        {
            "key": "loadnstore",
            "count": 65
        },
        ...
    ],
    "files_by_type": [
        {
            "key": "image",
            "count": 191
        },
        {
            "key": "audio",
            "count": 24
        },
        {
            "key": "video",
            "count": 17
        },
        {
            "key": "document",
            "count": 11
        },
        {
            "key": "archive",
            "count": 1
        },
        ...
    ]
}
```

A successful request returns Status OK (200). Any unexpected error returns Status Internal Server Error (500).
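
The groupings in the analytics response are lists of `key`/`count` buckets, which are often easier to work with as dictionaries. The Python sketch below shows one way to convert them and to react to the two documented status codes; `group_counts` and `handle_analytics` are hypothetical helpers, and the sample is an abbreviated copy of the response above.

```python
import json

# Abbreviated sample in the shape of the analytics response above.
SAMPLE_ANALYTICS = """
{
    "explicit_content": [
        {"key": "false", "count": 58},
        {"key": "true", "count": 4}
    ],
    "files_by_type": [
        {"key": "image", "count": 191},
        {"key": "audio", "count": 24},
        {"key": "video", "count": 17}
    ]
}
"""


def handle_analytics(status, payload):
    """Decode the body on 200; raise on 500 or any other status."""
    if status == 200:
        return json.loads(payload)
    if status == 500:
        raise RuntimeError("analytics request failed with an unexpected server error")
    raise RuntimeError(f"unexpected status {status}")


def group_counts(analytics, group):
    """Turn one grouping's list of key/count buckets into a dict."""
    return {bucket["key"]: bucket["count"] for bucket in analytics.get(group, [])}


analytics = handle_analytics(200, SAMPLE_ANALYTICS)
by_type = group_counts(analytics, "files_by_type")
explicit = group_counts(analytics, "explicit_content")

print(by_type)
print("most common type:", max(by_type, key=by_type.get))
print("explicit items:", explicit.get("true", 0))
```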

This documentation is generated from the latest version of GrayMeta Platform. For documentation relevant to your own deployed version, please use the documentation inside the application.