AI Tasks with Controlled Vocabularies

Automate metadata management using multimodal AI. Apply business-specific tags and custom metadata fields automatically using natural language instructions and predefined taxonomies.


What are AI tasks?

AI tasks turn image understanding into reliable metadata. You ask a question in plain English, and the AI selects answers only from your approved vocabulary—so your tags and custom fields stay consistent, schema-safe, and ready for DAM workflows at scale.

Key benefits:

  • Natural language instructions - No regex or custom code; describe what to look for in plain English.
  • Taxonomy alignment - The AI selects only from the values you provide, so tags and fields stay consistent with your internal taxonomy.
  • Scale - Process thousands of assets automatically during upload or via bulk updates.
  • Consistent logic - The same instruction is applied to every asset, which reduces reviewer-to-reviewer variation and improves search and filtering.

AI tasks consume extension units. Learn about pricing.

Use cases

AI tasks solve real-world metadata management challenges across industries. You can apply them during upload or later using the update API, giving you flexibility to automate metadata management when it fits your workflow. Here's how different organizations use controlled vocabularies:

| Industry | Automation focus | Strategic value |
| --- | --- | --- |
| E-commerce | Product categorization, attribute extraction (color, season, material), and style tagging. | Faceted Search: Eliminates manual errors, ensuring customers find products via filters instantly. |
| Automotive | Body style classification, visual condition grading, and PII detection (e.g., license plates). | Marketplace Trust: Standardizes condition reports and automates privacy compliance for listings. |
| Travel & Hospitality | Room/space classification, amenity detection (pool, balcony, workspace), and view categorization. | Conversion Rate: Improves SEO and guest discovery by accurately surfacing property highlights. |
| Media & Publishing | Editorial categorization, rights/clearance flagging, and quality assessment. | Workflow Velocity: Accelerates newsroom cycles by instantly routing "front-page-ready" assets. |

E-commerce and retail

Challenge: Manually categorizing 50,000 product images by type, color, season, and style takes weeks and results in inconsistent tagging.

Solution:

{
  "name": "ai-tasks",
  "tasks": [
    {
      "type": "select_tags",
      "instruction": "What product categories are visible in this image?",
      "vocabulary": ["apparel", "footwear", "accessories", "bags", "jewelry"],
      "max_selections": 2
    },
    {
      "type": "select_metadata",
      "instruction": "What is the dominant color?",
      "field": "primary_color",
      "vocabulary": ["black", "white", "red", "blue", "green", "beige", "brown", "multi-color"]
    },
    {
      "type": "select_metadata",
      "instruction": "What season is this product suitable for?",
      "field": "season",
      "vocabulary": ["spring", "summer", "fall", "winter", "all-season"],
      "max_selections": 2
    }
  ]
}

Result: Instant categorization with consistent taxonomy application across your entire catalog.

Automotive e-commerce

Challenge: Automotive dealerships need to accurately classify vehicle body styles, assess visual condition, and flag privacy concerns like visible license plates across thousands of inventory photos.

Solution:

{
  "name": "ai-tasks",
  "tasks": [
    {
      "type": "select_metadata",
      "instruction": "What is the body style of the vehicle?",
      "field": "body_style",
      "vocabulary": ["sedan", "suv", "coupe", "truck", "convertible", "van", "hatchback"],
      "max_selections": 1
    },
    {
      "type": "select_metadata",
      "instruction": "Based on the exterior, what is the apparent condition of the vehicle?",
      "field": "visual_condition",
      "vocabulary": ["excellent", "good", "fair", "damaged"],
      "max_selections": 1
    },
    {
      "type": "yes_no",
      "instruction": "Is the vehicle's license plate clearly visible and legible?",
      "on_yes": {
        "add_tags": ["plate-visible", "needs-blurring"],
        "set_metadata": [
          { "field": "privacy_review", "value": "pending" }
        ]
      },
      "on_no": {
        "set_metadata": [
          { "field": "privacy_review", "value": "cleared" }
        ]
      }
    }
  ]
}

Result: Consistent vehicle classifications, standardized condition grading for faceted search, and automatic flagging of images requiring license plate blurring.

Travel and hospitality

Challenge: Hotels, resorts, and vacation rental platforms manage thousands of property images that need consistent categorization for booking sites and guest discovery.

Solution:

{
  "name": "ai-tasks",
  "tasks": [
    {
      "type": "select_tags",
      "instruction": "What type of space or venue is shown in this image?",
      "vocabulary": ["hotel-room", "suite", "lobby", "restaurant", "bar", "pool", "spa", "gym", "beach", "conference-room", "outdoor-area"],
      "max_selections": 2
    },
    {
      "type": "select_tags",
      "instruction": "What amenities or features are visible?",
      "vocabulary": ["tv", "balcony", "ocean-view", "mountain-view", "kitchen", "workspace", "bathtub", "fireplace", "dining-area"],
      "max_selections": 3
    },
    {
      "type": "select_metadata",
      "instruction": "What type of property is this?",
      "field": "property_type",
      "vocabulary": ["hotel", "resort", "villa", "apartment", "hostel", "bed-and-breakfast", "vacation-rental"]
    },
    {
      "type": "yes_no",
      "instruction": "Does this image show outdoor or nature-focused features (beach, mountains, gardens, pools)?",
      "on_yes": {
        "add_tags": ["outdoor", "nature", "scenic"],
        "set_metadata": [
          { "field": "setting_type", "value": "outdoor" }
        ]
      },
      "on_no": {
        "add_tags": ["indoor", "interior"],
        "set_metadata": [
          { "field": "setting_type", "value": "indoor" }
        ]
      }
    }
  ]
}

Result: Organized travel imagery with searchable amenities, venue types, and property features for booking platforms and guest browsing.

Media and publishing

Challenge: News agencies process thousands of photos daily and need instant categorization for breaking news.

Solution:

{
  "name": "ai-tasks",
  "tasks": [
    {
      "type": "select_tags",
      "instruction": "What news categories or themes does this image relate to?",
      "vocabulary": ["politics", "sports", "business", "technology", "entertainment", "health", "environment", "lifestyle"],
      "min_selections": 1,
      "max_selections": 3
    },
    {
      "type": "yes_no",
      "instruction": "Does this image contain identifiable people?",
      "on_yes": {
        "add_tags": ["people", "portraits"],
        "set_metadata": [
          { "field": "requires_model_release", "value": true }
        ]
      }
    },
    {
      "type": "yes_no",
      "instruction": "Is this image high-quality enough for front-page or featured use?",
      "on_yes": {
        "add_tags": ["featured-quality", "homepage-ready"],
        "set_metadata": [
          { "field": "editorial_priority", "value": "high" }
        ]
      }
    }
  ]
}

Result: Rapid content categorization, rights management flagging, and quality assessment.

How AI tasks work

AI tasks analyze images and apply metadata based on your business rules. Each task defines what aspect of the image to evaluate (instruction), what values are valid (vocabulary), and what actions to take with the results. You can configure multiple tasks in a single configuration, with each task handling a different aspect of categorization.

Each task has three components:

Instruction - A clear, natural language question or instruction that tells the AI what to analyze.

Vocabulary - A predefined list of values the AI can select from. This is your controlled vocabulary or business taxonomy; the AI can only choose from the values you define (1-100 items per vocabulary).

Actions - What happens with the AI's analysis: tags are added, metadata fields are set, or conditional logic executes.

Task types

AI tasks support three task types, each designed for different metadata management needs:

| Task type | What it does | When to use |
| --- | --- | --- |
| select_tags | Selects and applies tags from your vocabulary | Categorization, product attributes, building searchable taxonomies, or applying multiple labels |
| select_metadata | Sets custom metadata field values from your vocabulary | Structured data like color, season, type, status, or single/multi-select dropdown fields |
| yes_no | Asks yes/no questions and executes conditional actions | Quality checks, compliance verification, binary classifications, or conditional workflows |

Select tags

Analyzes the image and adds relevant tags from your controlled vocabulary. The AI compares what it sees in the image against your instruction, selects matching tags from your vocabulary while respecting min/max selection constraints, and adds the selected tags to the file. For example, an image of a living room might receive tags: ["sofa", "chair", "table", "lamp"].

Configuration:

{
  "type": "select_tags",
  "instruction": "What types of furniture are visible in this image?",
  "vocabulary": ["sofa", "chair", "table", "desk", "bed", "shelving", "cabinet", "lamp"],
  "min_selections": 1,
  "max_selections": 4
}

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Must be "select_tags" |
| instruction | string | Yes | Question or instruction (1-1000 characters) |
| vocabulary | array | Yes | Possible tag values (1-100 items, max 500 chars combined, no % character) |
| min_selections | number | No | Minimum tags to select (≥ 0). Default: no minimum |
| max_selections | number | No | Maximum tags to select (≥ 1). Default: no maximum |
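
The constraints in this table can be checked locally before a configuration is sent. The following is a minimal Python sketch, not part of any ImageKit SDK: validate_select_tags is a hypothetical helper, and it assumes that "500 chars combined" means the sum of the lengths of the vocabulary strings and that min_selections should not exceed max_selections (neither is stated explicitly above).

```python
def validate_select_tags(task: dict) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems: list[str] = []

    if task.get("type") != "select_tags":
        problems.append('type must be "select_tags"')

    instruction = task.get("instruction")
    if not isinstance(instruction, str) or not (1 <= len(instruction) <= 1000):
        problems.append("instruction must be a string of 1-1000 characters")

    vocabulary = task.get("vocabulary")
    if (not isinstance(vocabulary, list)
            or not all(isinstance(v, str) for v in vocabulary)
            or not (1 <= len(vocabulary) <= 100)):
        problems.append("vocabulary must be a list of 1-100 strings")
    else:
        # Assumption: "combined" is the total length of all values.
        if sum(len(v) for v in vocabulary) > 500:
            problems.append("vocabulary exceeds 500 characters combined")
        if any("%" in v for v in vocabulary):
            problems.append("vocabulary values must not contain '%'")

    min_sel = task.get("min_selections")
    max_sel = task.get("max_selections")
    if min_sel is not None and min_sel < 0:
        problems.append("min_selections must be >= 0")
    if max_sel is not None and max_sel < 1:
        problems.append("max_selections must be >= 1")
    # Assumption: a minimum above the maximum can never be satisfied.
    if min_sel is not None and max_sel is not None and min_sel > max_sel:
        problems.append("min_selections must not exceed max_selections")
    return problems
```

Running this over each select_tags task before upload surfaces problems such as a stray % in a tag value without spending extension units on a request that may be rejected.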

Select metadata

Analyzes the image and sets a custom metadata field value from your vocabulary. The AI evaluates the image against your instruction, selects the best matching value(s) from your vocabulary, validates them against the field's type constraints, and sets the custom metadata field. For example, a metadata field named lighting might be set to "golden-hour".

Configuration:

{
  "type": "select_metadata",
  "instruction": "What is the dominant lighting condition in this image?",
  "field": "lighting",
  "vocabulary": ["natural-daylight", "golden-hour", "overcast", "indoor-artificial", "low-light", "night"],
  "min_selections": 1,
  "max_selections": 1
}

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Must be "select_metadata" |
| instruction | string | Yes | Question or instruction (1-1000 characters) |
| field | string | Yes | Custom metadata field name (must already exist in your media library) |
| vocabulary | array | Yes | Possible values matching field type (1-100 items) |
| min_selections | number | No | Minimum values to select (≥ 0). Default: no minimum |
| max_selections | number | No | Maximum values to select (≥ 1). Default: no maximum |

Important:

  1. The custom metadata field must exist before using it in AI tasks. Create fields using the Custom Metadata Fields API or in your dashboard under Settings → Media Library → Custom Metadata Fields.
  2. Your vocabulary must match the field's schema definition. If you later change the field schema, AI tasks may fail to set values. Check the asset history to see why values were or weren't set.
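
One way to catch a vocabulary/schema mismatch early is to compare the task against a local copy of the field definition. The sketch below is illustrative only: check_against_field and the shape of the field dictionary (name, multi, options) are hypothetical stand-ins, not the response format of the Custom Metadata Fields API, and the single-value rule is an assumption about how a single-select field would behave.

```python
def check_against_field(task: dict, field: dict) -> list[str]:
    """Compare a select_metadata task with a locally held field definition."""
    problems = []

    if task.get("field") != field["name"]:
        problems.append(f"task targets {task.get('field')!r}, not {field['name']!r}")

    # Every vocabulary value should be one the field is able to store.
    allowed = set(field["options"])
    unknown = [v for v in task.get("vocabulary", []) if v not in allowed]
    if unknown:
        problems.append(f"values not in field options: {unknown}")

    # Assumption: a single-value field cannot hold more than one selection.
    if not field["multi"] and task.get("max_selections", 1) > 1:
        problems.append("max_selections > 1 on a single-value field")
    return problems


# Hypothetical local description of the "lighting" field.
lighting_field = {
    "name": "lighting",
    "multi": False,
    "options": ["natural-daylight", "golden-hour", "overcast", "night"],
}
```

If the field schema changes later, re-running a check like this against the updated definition shows which vocabulary values no longer fit.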

Yes/No

Asks a yes/no question about the image and executes different actions based on the answer. The AI evaluates the image and returns one of three responses: Yes, No, or Unknown (when the AI cannot confidently determine the answer). Each response can trigger different actions—tags added/removed and metadata set/unset.

For example, a high-quality image might receive tags ["print-ready", "high-quality", "approved"] and metadata updates for quality status. If the AI cannot confidently assess quality, the on_unknown actions execute instead (e.g., tagging for manual review).

Configuration:

{
  "type": "yes_no",
  "instruction": "Does this image meet quality standards for print publication (sharp focus, good lighting, high resolution)?",
  "on_yes": {
    "add_tags": ["print-ready", "high-quality", "approved"],
    "set_metadata": [
      { "field": "quality_status", "value": "approved" },
      { "field": "print_approved", "value": true }
    ]
  },
  "on_no": {
    "add_tags": ["web-only", "needs-improvement"],
    "remove_tags": ["print-ready", "approved"],
    "set_metadata": [
      { "field": "quality_status", "value": "rejected" },
      { "field": "print_approved", "value": false }
    ]
  },
  "on_unknown": {
    "add_tags": ["needs-review"],
    "set_metadata": [
      { "field": "quality_status", "value": "pending" }
    ]
  }
}

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Must be "yes_no" |
| instruction | string | Yes | Yes/no question (1-1000 characters) |
| on_yes | object | No* | Actions to execute if AI determines answer is "yes" |
| on_no | object | No* | Actions to execute if AI determines answer is "no" |
| on_unknown | object | No | Actions to execute if AI cannot confidently determine yes or no |

* At least one of on_yes or on_no is required.

Action objects:

Each action object can include:

{
  "add_tags": ["tag1", "tag2"],
  "remove_tags": ["tag3", "tag4"],
  "set_metadata": [
    { "field": "field_name", "value": "some_value" }
  ],
  "unset_metadata": [
    { "field": "field_to_remove" }
  ]
}

| Property | Type | Description |
| --- | --- | --- |
| add_tags | array | Tags to add to the file |
| remove_tags | array | Tags to remove from the file |
| set_metadata | array | Array of objects with field (string) and value (any) to set metadata fields |
| unset_metadata | array | Array of objects with field (string) to remove metadata fields |
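
To make the branch selection and the four action properties concrete, here is a small local model in Python. It is a sketch for reasoning about a configuration, not the service's implementation: apply_yes_no is a hypothetical function, the answer is supplied by the caller instead of an AI, and the order in which additions and removals are applied is an assumption.

```python
BRANCH_FOR_ANSWER = {"yes": "on_yes", "no": "on_no", "unknown": "on_unknown"}


def apply_yes_no(task: dict, answer: str, tags: set, metadata: dict):
    """Return the (tags, metadata) a file would have after one yes_no task.

    `answer` is "yes", "no", or "unknown". A branch that is not defined in
    the task leaves the file unchanged.
    """
    actions = task.get(BRANCH_FOR_ANSWER[answer], {})

    new_tags = set(tags)
    new_tags.update(actions.get("add_tags", []))
    # Assumption: removals are applied after additions.
    new_tags.difference_update(actions.get("remove_tags", []))

    new_metadata = dict(metadata)
    for item in actions.get("set_metadata", []):
        new_metadata[item["field"]] = item["value"]
    for item in actions.get("unset_metadata", []):
        new_metadata.pop(item["field"], None)

    return new_tags, new_metadata
```

Stepping a few sample files through a model like this helps confirm that the on_yes, on_no, and on_unknown branches leave tags and fields in the state you expect, for example that a "no" answer really clears a previously applied print-ready tag.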

Complete example: Fashion e-commerce

This comprehensive example combines all three task types for a fashion retailer:

{
  "name": "ai-tasks",
  "tasks": [
    {
      "type": "select_tags",
      "instruction": "What types of clothing or accessories are visible in this product image?",
      "vocabulary": [
        "dress", "shirt", "blouse", "t-shirt", "sweater", "jacket", 
        "coat", "pants", "jeans", "skirt", "shorts", "shoes", 
        "boots", "sneakers", "bag", "belt", "hat", "scarf", "jewelry"
      ],
      "min_selections": 1,
      "max_selections": 5
    },
    {
      "type": "select_metadata",
      "instruction": "What is the primary color of the main product?",
      "field": "primary_color",
      "vocabulary": [
        "black", "white", "gray", "beige", "brown", 
        "red", "pink", "orange", "yellow", "green", 
        "blue", "navy", "purple", "multi-color", "metallic"
      ],
      "min_selections": 1,
      "max_selections": 1
    },
    {
      "type": "select_metadata",
      "instruction": "What season or weather is this product suitable for?",
      "field": "season",
      "vocabulary": ["spring", "summer", "fall", "winter", "all-season"],
      "min_selections": 1,
      "max_selections": 2
    },
    {
      "type": "yes_no",
      "instruction": "Is this a formal or dressy item (suitable for office, weddings, formal events)?",
      "on_yes": {
        "add_tags": ["formal", "dressy", "occasion-wear"],
        "set_metadata": [
          { "field": "style_category", "value": "formal" },
          { "field": "dress_code", "value": "business-formal" }
        ]
      },
      "on_no": {
        "add_tags": ["casual", "everyday"],
        "set_metadata": [
          { "field": "style_category", "value": "casual" },
          { "field": "dress_code", "value": "casual" }
        ]
      }
    },
    {
      "type": "yes_no",
      "instruction": "Does this product appear to be luxury or high-end (designer labels, premium materials, high-end styling)?",
      "on_yes": {
        "add_tags": ["luxury", "premium", "designer"],
        "remove_tags": ["budget", "value"],
        "set_metadata": [
          { "field": "price_tier", "value": "premium" },
          { "field": "target_market", "value": "luxury" }
        ]
      },
      "on_no": {
        "add_tags": ["accessible", "value"],
        "remove_tags": ["luxury", "premium"],
        "set_metadata": [
          { "field": "price_tier", "value": "standard" },
          { "field": "target_market", "value": "mass-market" }
        ]
      }
    }
  ]
}

This configuration gives you complete product categorization in one upload. Each product image automatically gets tagged with product types, assigned a primary color, categorized by season, classified by style (formal vs. casual), and marked as luxury or standard—all without manual intervention. Upload 10,000 products, and every single one gets consistent, searchable metadata that your team and customers can immediately filter and browse.

Applying AI tasks

You can apply AI tasks through the dashboard UI or programmatically via API, either at upload time or on existing files. You can also automate AI task application using path policies based on an asset's destination in the Media Library.

Using the UI

To use AI tasks through the dashboard, first create a saved extension with your AI tasks configuration.

At upload, open settings and select the saved extension from the Extensions list.

For existing files, select files in Media Library → Right-click → Apply Saved extensions → Choose your AI tasks extension → Apply

Programmatically using API

Apply AI tasks when uploading new files. Include AI tasks in the extensions parameter:

curl -X POST 'https://upload.imagekit.io/api/v2/files/upload' \
  -u private_key: \
  -F 'file=@image.jpg' \
  -F 'fileName=image.jpg' \
  -F 'extensions=[{"name":"ai-tasks","tasks":[...]}]'

Both the upload and update APIs also accept the saved extension ID if you want to avoid specifying the task configuration JSON with each request.
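
The extensions value in the upload request is a JSON array serialized into a single form field, which is easy to get wrong when written by hand. The following Python sketch builds that value with the standard json module. build_upload_fields is a hypothetical helper; it only prepares the text fields and leaves sending the multipart request, with the file and your private key, to whichever HTTP client you use.

```python
import json


def build_upload_fields(file_name: str, tasks: list) -> dict:
    """Build the text form fields for an upload that applies AI tasks."""
    extensions = [{"name": "ai-tasks", "tasks": tasks}]
    return {
        "fileName": file_name,
        # The API expects the extensions array as a JSON string.
        "extensions": json.dumps(extensions),
    }


fields = build_upload_fields(
    "image.jpg",
    [
        {
            "type": "select_tags",
            "instruction": "What product categories are visible in this image?",
            "vocabulary": ["apparel", "footwear", "accessories"],
            "max_selections": 2,
        }
    ],
)
```

Generating the field from Python data structures avoids the quoting problems that come with embedding JSON in a shell command.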

Restrictions and limits

  1. Tasks per configuration: 1-10 tasks.
  2. Vocabulary size: 1-100 items per task.
  3. Vocabulary character length (select_tags only): Max 500 characters combined.
  4. Instruction length: 1-1000 characters per task.
  5. Custom metadata fields (select_metadata): The field must exist before use, and the vocabulary type must match the field type.
  6. Yes/no tasks: Must have at least one of on_yes or on_no defined.
  7. Tag values: Cannot contain % character.
  8. Processing time: Typically 1-5 seconds per image.
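
Several of these limits apply to the configuration as a whole, so they can be checked in one pass before a request is made. The sketch below is a minimal, hypothetical pre-flight check in Python (validate_config is not an ImageKit function) covering the task count, instruction length, vocabulary size, the yes/no branch requirement, and the presence of a field name. It does not verify that the field exists in your media library.

```python
KNOWN_TYPES = {"select_tags", "select_metadata", "yes_no"}


def validate_config(config: dict) -> list[str]:
    """Check a whole AI tasks configuration against the documented limits."""
    tasks = config.get("tasks")
    if not isinstance(tasks, list) or not (1 <= len(tasks) <= 10):
        return ["configuration must contain 1-10 tasks"]

    problems: list[str] = []
    for index, task in enumerate(tasks):
        kind = task.get("type")
        if kind not in KNOWN_TYPES:
            problems.append(f"task {index}: unknown type {kind!r}")
            continue
        if not (1 <= len(task.get("instruction", "")) <= 1000):
            problems.append(f"task {index}: instruction must be 1-1000 characters")
        if kind == "yes_no":
            # At least one of the two main branches must be present.
            if "on_yes" not in task and "on_no" not in task:
                problems.append(f"task {index}: needs on_yes or on_no")
        else:
            if not (1 <= len(task.get("vocabulary", [])) <= 100):
                problems.append(f"task {index}: vocabulary must have 1-100 items")
            if kind == "select_metadata" and not task.get("field"):
                problems.append(f"task {index}: select_metadata needs a field")
    return problems
```

An empty list means the configuration passed these local checks; the API remains the final authority on what it accepts.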

Best practices

  1. Start with 1-2 tasks on a small batch, validate results, refine configuration, then scale to production.
  2. Keep instructions under 200 characters. Be specific and direct.
    ✅ "What types of furniture are visible?"
    ❌ "Describe this image" (too broad)
  3. For yes/no tasks, phrase instructions as yes/no questions.
    ✅ "Does this image contain people?"
    ❌ "Check if people are present" (not a question)
  4. Use distinct vocabulary terms without overlap.
    ✅ ["modern", "traditional", "rustic"]
    ❌ ["modern", "very modern", "somewhat modern"] (ambiguous)
  5. Test with sample images before large-scale deployment.
  6. Review AI selections regularly and refine instructions based on results.
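
Some of these practices can be partly automated as a lint step that runs over each task before deployment. The Python sketch below is a rough heuristic under stated assumptions, not a definitive check: lint_task is a hypothetical helper, it treats two vocabulary terms as overlapping only when one term appears as a whole word inside another, and it treats a trailing question mark as the sign of a yes/no question.

```python
import re


def lint_task(task: dict) -> list[str]:
    """Return style warnings for one task, based on the practices above."""
    warnings = []
    instruction = task.get("instruction", "").strip()

    if len(instruction) >= 200:
        warnings.append("instruction is 200 characters or longer")
    if task.get("type") == "yes_no" and not instruction.endswith("?"):
        warnings.append("yes_no instruction is not phrased as a question")

    # Heuristic: flag a term that is a whole word of another term,
    # such as "modern" inside "very modern".
    terms = task.get("vocabulary", [])
    for shorter in terms:
        for longer in terms:
            words = re.split(r"[\s-]+", longer.lower())
            if shorter != longer and shorter.lower() in words:
                warnings.append(f"{shorter!r} overlaps with {longer!r}")
    return warnings
```

A check like this cannot judge whether terms are semantically distinct, so it complements the manual review of AI selections and does not replace it.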

Next steps

Need help? Contact support or join our community.