AI_FILL

The AI_FILL function intelligently completes incomplete or missing data by analyzing patterns in provided example data and applying those patterns to fill gaps. Rather than simple interpolation or default value assignment, this function leverages large language models (LLMs) to understand contextual patterns and generate realistic, contextually appropriate values for missing cells. By default, it uses the Mistral AI platform, though any OpenAI-compatible API endpoint can be configured.

The function employs few-shot learning—a technique where the LLM is provided with complete example data to learn patterns before being asked to fill missing values. The model analyzes relationships between columns, typical value ranges, and contextual associations, then applies this understanding to complete the target data. This approach is particularly effective for tabular data where values have semantic meaning and dependencies on other columns.

For example, given employee data with several complete records, AI_FILL can infer appropriate departments or locations for new employees based on their job titles. With product data, it can deduce prices from categories and similar items. The function uses JSON mode (response_format: json_object) to ensure structured, parseable responses from the Mistral Chat Completions API.

The temperature parameter (default 0, range 0.0–2.0) controls the creativity/consistency trade-off. Setting temperature=0 produces deterministic, reproducible results ideal for business data, while higher values introduce variation. According to Mistral’s documentation, temperatures between 0.0 and 0.7 are recommended for most use cases. The max_tokens parameter (default 1500, range 5–5000) controls response size and API costs.

This example function is provided as-is without any representation of accuracy.

Excel Usage

=AI_FILL(example_range, fill_range, api_key, temperature, model, max_tokens, api_url)
  • example_range (list[list], required): 2D list of complete example data to learn patterns from
  • fill_range (list[list], required): 2D list of target data with missing values (null) to fill
  • api_key (str, required): API key for authentication.
  • temperature (float, optional, default: 0): Controls randomness in AI response (0.0 = deterministic, 2.0 = highly random)
  • model (str, optional, default: “codestral-2508”): Model ID to use. Default is “codestral-2508”.
  • max_tokens (int, optional, default: 1500): Maximum tokens in the AI response (5 to 5000)
  • api_url (str, optional, default: “https://api.mistral.ai/v1/chat/completions”): OpenAI-compatible API endpoint URL. Default is “https://api.mistral.ai/v1/chat/completions”.

Returns (list[list]): 2D list with filled values, or error message string.

Example 1: Demo case 1

Inputs:

example_range fill_range temperature model max_tokens
Product ID Category Price Weight (kg) Product ID Category Price Weight (kg) 0 codestral-2508 1500
PRD-001 Laptop 1299 1.8 PRD-004 Laptop
PRD-002 Laptop 999 2.1 PRD-005 Tablet
PRD-003 Tablet 499 0.7 PRD-006 799 1.2

Excel formula:

=AI_FILL({"Product ID","Category","Price","Weight (kg)";"PRD-001","Laptop",1299,1.8;"PRD-002","Laptop",999,2.1;"PRD-003","Tablet",499,0.7}, {"Product ID","Category","Price","Weight (kg)";"PRD-004","Laptop","","";"PRD-005","Tablet","","";"PRD-006","",799,1.2}, 0, "codestral-2508", 1500)

Expected output:

Product ID Category Price Weight (kg)
PRD-004 Laptop 1299 1.8
PRD-005 Tablet 499 0.7
PRD-006 Tablet 799 1.2
Example 2: Demo case 2

Inputs:

example_range fill_range temperature model max_tokens
Employee ID Job Title Department Location Employee ID Job Title Department Location 0 codestral-2508 1500
EMP-001 Sales Manager Sales New York EMP-005 Sales Director
EMP-002 Marketing Specialist Marketing Chicago EMP-006 UX Designer
EMP-003 Sales Representative Sales Los Angeles EMP-007 Marketing Director
EMP-004 Software Developer Engineering San Francisco EMP-008 Senior Developer

Excel formula:

=AI_FILL({"Employee ID","Job Title","Department","Location";"EMP-001","Sales Manager","Sales","New York";"EMP-002","Marketing Specialist","Marketing","Chicago";"EMP-003","Sales Representative","Sales","Los Angeles";"EMP-004","Software Developer","Engineering","San Francisco"}, {"Employee ID","Job Title","Department","Location";"EMP-005","Sales Director","","";"EMP-006","UX Designer","","";"EMP-007","Marketing Director","","";"EMP-008","Senior Developer","",""}, 0, "codestral-2508", 1500)

Expected output:

Employee ID Job Title Department Location
EMP-005 Sales Director Sales New York
EMP-006 UX Designer Engineering San Francisco
EMP-007 Marketing Director Marketing Chicago
EMP-008 Senior Developer Engineering San Francisco
Example 3: Demo case 3

Inputs:

example_range fill_range temperature model max_tokens
Metric Q1 2024 Q2 2024 Q3 2024 Q4 2024 Metric Q1 2024 Q2 2024 Q3 2024 Q4 2024 0 codestral-2508 1500
Revenue 250000 280000 310000 350000 Profit
Expenses 180000 195000 215000 235000 Headcount 32 35

Excel formula:

=AI_FILL({"Metric","Q1 2024","Q2 2024","Q3 2024","Q4 2024";"Revenue",250000,280000,310000,350000;"Expenses",180000,195000,215000,235000}, {"Metric","Q1 2024","Q2 2024","Q3 2024","Q4 2024";"Profit","","","","";"Headcount",32,35,"",""}, 0, "codestral-2508", 1500)

Expected output:

Metric Q1 2024 Q2 2024 Q3 2024 Q4 2024
Profit 70000 85000 95000 115000
Headcount 32 35 38 41
Example 4: Demo case 4

Inputs:

example_range fill_range
Product ID Category Price Weight (kg) Product ID Category Price Weight (kg)
PRD-001 Laptop 1299 1.8 PRD-007 Laptop
PRD-002 Laptop 999 2.1
PRD-003 Tablet 499 0.7

Excel formula:

=AI_FILL({"Product ID","Category","Price","Weight (kg)";"PRD-001","Laptop",1299,1.8;"PRD-002","Laptop",999,2.1;"PRD-003","Tablet",499,0.7}, {"Product ID","Category","Price","Weight (kg)";"PRD-007","Laptop","",""})

Expected output:

Product ID Category Price Weight (kg)
PRD-007 Laptop 1299 1.8

Python Code

Show Code
import requests
import json

def ai_fill(example_range, fill_range, api_key, temperature=0, model='codestral-2508', max_tokens=1500, api_url='https://api.mistral.ai/v1/chat/completions'):
    """
    Fills missing or incomplete data in a target range by learning patterns from an example range using a large language model.

    This example function is provided as-is without any representation of accuracy.

    Args:
        example_range (list[list]): 2D list of complete example data to learn patterns from
        fill_range (list[list]): 2D list of target data with missing values (null) to fill
        api_key (str): API key for authentication.
        temperature (float, optional): Controls randomness in AI response (0.0 = deterministic, 2.0 = highly random) Default is 0.
        model (str, optional): Model ID to use. Default is "codestral-2508". Default is 'codestral-2508'.
        max_tokens (int, optional): Maximum tokens in the AI response (5 to 5000) Default is 1500.
        api_url (str, optional): OpenAI-compatible API endpoint URL. Default is "https://api.mistral.ai/v1/chat/completions". Default is 'https://api.mistral.ai/v1/chat/completions'.

    Returns:
        list[list]: 2D list with filled values, or error message string.
    """
    # Validate temperature
    if not isinstance(temperature, (float, int)) or not (0 <= float(temperature) <= 2):
        return "Error: temperature must be a float between 0 and 2 (inclusive)"
    # Validate max_tokens
    if not isinstance(max_tokens, int) or not (5 <= max_tokens <= 5000):
        return "Error: max_tokens must be an integer between 5 and 5000 (inclusive)"
    # Validate inputs
    if not isinstance(example_range, list) or not example_range:
        return "Error: Example range is empty or invalid"
    if not isinstance(fill_range, list) or not fill_range:
        return "Error: Fill range is empty or invalid"
    if not api_key:
        return "You must include an API key to use this function. Sign up for a free API key at https://aistudio.google.com/, https://console.mistral.ai/, or other providers and add your own api_key.  You may use any OpenAI compatible API, just update the api_url parameter."
    # Convert example_range and fill_range to JSON strings for the prompt
    example_json = json.dumps(example_range)
    fill_json = json.dumps(fill_range)
    # Construct a specific prompt for filling data
    fill_prompt = (
        "Fill in the missing values in the target data based on patterns in the example data.\n"
        f"Example data (complete): {example_json}\n\n"
        f"Target data (with missing values): {fill_json}\n\n"
        "Study the patterns in the example data and complete the target data by filling in missing values. Preserve all existing values in the target data."
    )
    fill_prompt += ("\n\nReturn ONLY a JSON object with a key 'items' whose value is a JSON array of arrays (2D array) with the completed target data. "
                    "Do not include any explanatory text, just the JSON object. "
                    "For example: {\"items\": [[\"row1col1\", \"row1col2\"], [\"row2col1\", \"row2col2\"]]}")
    # Prepare the API request payload
    payload = {
        "messages": [{"role": "user", "content": fill_prompt}],
        "temperature": temperature,
        "model": model,
        "max_tokens": max_tokens,
        "response_format": {"type": "json_object"}
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "Accept": "application/json"
    }
    try:
        response = requests.post(api_url, headers=headers, json=payload)
        if response.status_code == 429:
            return "Error: You have hit the rate limit for the API. Please try again later"
        response.raise_for_status()
        response_data = response.json()
        content = response_data["choices"][0]["message"]["content"]
        try:
            filled_data = json.loads(content)
            if isinstance(filled_data, dict) and "items" in filled_data:
                filled_data = filled_data["items"]
            elif isinstance(filled_data, dict):
                if "data" in filled_data:
                    filled_data = filled_data["data"]
                elif "filled_data" in filled_data:
                    filled_data = filled_data["filled_data"]
                elif "result" in filled_data:
                    filled_data = filled_data["result"]
            if isinstance(filled_data, list) and all(isinstance(row, list) for row in filled_data):
                if (len(filled_data) == len(fill_range) and all(len(row) == len(fill_range[i]) for i, row in enumerate(filled_data))):
                    return filled_data
                else:
                    return "Error: AI response dimensions don't match the fill range"
            else:
                return "Error: Unable to parse response. Expected a 2D array"
        except (json.JSONDecodeError, ValueError):
            return "Error: Unable to fill data. The AI response wasn't in the expected format"
    except requests.exceptions.RequestException as e:
        return f"Error: API request failed. {str(e)}"

Online Calculator

2D list of complete example data to learn patterns from
2D list of target data with missing values (null) to fill
API key for authentication.
Controls randomness in AI response (0.0 = deterministic, 2.0 = highly random)
Model ID to use. Default is "codestral-2508".
Maximum tokens in the AI response (5 to 5000)
OpenAI-compatible API endpoint URL. Default is "https://api.mistral.ai/v1/chat/completions".