When a job is running, you may need to fetch results before it’s fully finished.

This guide explains how to handle polling for results and when to rely on webhooks. You’ll learn when to poll, how often, and how to avoid common pitfalls like infinite loops.

TL;DR

| Polling | Waiting for Webhooks |
| --- | --- |
| ⚡ Get results ASAP | 🕰️ Get results once the job finishes |
| ⏱️ Request every 30 seconds | 🚫 No polling; wait for webhook |
| 🚫 Stop when webhook is received | 📡 Webhook sends a real-time update |
| 🕰️ Used when users expect fast feedback | 📦 Used when users can wait for full results |
| 🔄 Needs a stop condition | ✅ No need for stop conditions |
| ❌ Can cause API overload | ⚡ Uses system resources efficiently |

What is Polling?

Polling means sending repeated Get a Run’s Results requests to the server to check for job results until the job is completed or a webhook notification is received. 📡

GET https://api.captaindata.co/v3/jobs/:job_uid/results

Polling is useful when you want to access job results before the job is fully finished.
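
As a minimal sketch, the request above could be built in Python with the standard library. The header names and the api_key and project_uid parameters are assumptions made for illustration, so confirm the authentication scheme in the API Reference before relying on them.

```python
import json
import urllib.request

API_BASE = "https://api.captaindata.co/v3"


def build_results_request(job_uid, api_key, project_uid):
    """Build (but do not send) the GET request for a job's results."""
    url = f"{API_BASE}/jobs/{job_uid}/results"
    headers = {
        # Assumed authentication headers; check the API Reference.
        "Authorization": f"x-api-key {api_key}",
        "x-project-id": project_uid,
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_results(job_uid, api_key, project_uid):
    """Send the request and return the decoded JSON body."""
    request = build_results_request(job_uid, api_key, project_uid)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping the request builder separate from the network call makes it easy to inspect the URL and headers without contacting the API.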

When Should You Poll?

Polling is useful in these scenarios:

  • ⚡ User needs immediate feedback: Users want to see results as soon as possible (before the job finishes).
  • 📡 If webhooks are failing: Use polling as a fallback method when webhooks are unavailable.
  • 🛠️ Local development: Polling avoids the need for externally accessible endpoints when testing workflows.
  • 🏗 Unreliable webhook delivery: Firewalls or connectivity issues may prevent webhook notifications from being received.

When Should You Wait for Webhooks?

Webhooks are better in these cases:

  • 🕰️ Users can wait for full results: If the workflow doesn’t require immediate results, webhooks reduce API overhead.
  • 🔔 Real-time updates: The system automatically pushes updates, making it ideal for automated workflows.
  • 🚀 Lower risk of hitting API rate limits: Unlike polling, webhooks do not generate repeated API calls, so you avoid throttling.
  • 📦 You want to reduce API calls: Because a webhook notifies you when the job's status changes, you avoid unnecessary calls to Get a Run’s Results, saving API usage and system resources.

How it works: A webhook is triggered when the job status changes (e.g., created, success, failed). When this happens, you get a real-time notification.
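
A webhook consumer usually boils down to a small decision on the reported status. The sketch below assumes the notification is a JSON object with a "status" field; that payload shape and the function name next_action are hypothetical, and the statuses are the ones mentioned in this guide.

```python
def next_action(payload):
    """Decide what to do with a webhook notification.

    Returns "fetch_results", "report_failure" or "ignore".
    The "status" key is an assumed field of the payload.
    """
    status = payload.get("status")
    if status in ("success", "finished"):
        return "fetch_results"  # Job is done: make a single results request
    if status == "failed":
        return "report_failure"  # Job is done but produced an error
    return "ignore"  # e.g. "created": nothing to fetch yet
```

For example, next_action({"status": "created"}) returns "ignore", so no results request is made until the job has actually completed.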

To track the job’s progress, call Get a Run: the row_count field in the response is the number of outputs produced so far. You can then calculate the progress as % progress = (#output / #input) * 100, where #input is the number of inputs you submitted.
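
The progress formula can be written as a small helper. This is only a sketch: input_count is assumed to be a number you tracked yourself when scheduling the job, and the cap at 100 is a design choice for the case where an input yields more than one output.

```python
def progress_percent(row_count, input_count):
    """Return job progress as a percentage between 0 and 100."""
    if input_count <= 0:
        return 0.0  # Avoid dividing by zero when no inputs are known
    return min(100.0, row_count / input_count * 100)
```

For instance, 25 outputs for 100 inputs gives progress_percent(25, 100) == 25.0.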

How to Poll Efficiently

If you decide to poll, follow these best practices:

  1. Start polling as soon as the job is created.
  2. ⏱️ Frequency: Request every 30 seconds if the job involves large inputs.

Follow these steps:

  1. Start the Job: Launch a workflow using Captain Data’s API.

    POST https://api.captaindata.co/v3/workflows/{workflow_uid}/schedule
    
  2. Retrieve the Job ID: Extract the job_uid from the API response.

    {
      "message": "Bot successfully scheduled.",
      "job_uid": "xxxxxx-9bc6-49be-9e21-d8037d1e393b"
    }
    
  3. Monitor Job Status: Periodically check the job’s status.

  4. Fetch Results: Once the status is marked as finished, retrieve the data.

    GET https://api.captaindata.co/v3/jobs/{job_uid}/results
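
Steps 2 and 4 can be linked with a few lines of Python. The sketch below parses a schedule response shaped like the sample above and derives the results URL from it; the function name is hypothetical and no request is sent.

```python
import json

API_BASE = "https://api.captaindata.co/v3"


def results_url_from_schedule_response(body):
    """Extract job_uid from the schedule response and build the results URL."""
    data = json.loads(body)
    job_uid = data["job_uid"]  # Raises KeyError if the job was not scheduled
    return f"{API_BASE}/jobs/{job_uid}/results"
```

Failing loudly on a missing job_uid is deliberate here: without it there is nothing to poll.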
    

Here’s a basic polling loop in Python to check a job’s status and retrieve results:

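import time

# getJobStatus() and getJobResults() stand for your own wrappers around
# Get a Run and Get a Run's Results.
def poll_job_results(job_uid, interval=30, timeout=3600):
    results = []
    deadline = time.monotonic() + timeout  # Stop condition: give up after `timeout` seconds

    while time.monotonic() < deadline:
        status = getJobStatus(job_uid)

        if status not in ("pending", "running", "finished"):
            raise Exception("Job encountered an error state.")

        job_results = getJobResults(job_uid).results
        results.extend(job_results[len(results):])  # Add only the results not seen yet

        if status == "finished":
            return results  # Includes results produced since the last poll
        time.sleep(interval)  # Wait before polling again

    raise TimeoutError("Job did not finish before the timeout.")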

Ensure your polling logic has a clear stop condition (like a maximum timeout) to avoid infinite loops.

To track the progress of a Run, use the following:

  1. Use Get a Run to check the Run’s status (get_run() in the code below).
  2. Check the Run’s status and act on it:
import time

def poll_run(uid, interval=30, max_attempts=120):
    for _ in range(max_attempts):  # Bounded loop: never poll forever
        status = get_run(uid).status  # Fetch updated status

        if status == "finished":
            return get_run_results(uid).results  # Return results once finished
        elif status in ("shutdown", "failed"):
            return "ERROR"  # Handle shutdown or failure
        elif status in ("running", "pending"):
            time.sleep(interval)  # Wait before polling again
        else:
            return f"Unexpected status: {status}"  # Handle any unknown status

    return "TIMEOUT"  # The run did not finish within max_attempts polls
  3. Use Get a Run’s Results (get_run_results() in the code above) to retrieve the results. Partial results are available while the status is “running”; the full set is available once it is “finished”.
  4. 🚫 Stop polling as soon as the Run’s status is finished or failed, since further polling is unnecessary.

Depending on your implementation, you may also want to handle the warning status.

Polling Scenarios in Action

Scenario 1: Polling for Early Results

Use Case: You want to access job results before the job finishes.

  1. Start polling https://api.captaindata.co/v3/jobs/:job_uid/results as soon as the job starts.
  2. Frequency: Send requests every 30 seconds for large jobs.

Scenario 2: Waiting for Full Results

Use Case: The user is okay waiting until all results are ready.

  1. Don’t poll.
  2. Rely on the webhook to notify you when the job is completed.
  3. When the webhook triggers, make a single GET JOB RESULTS request to get the full job results.

If Polling Both Results & Job Status

  • 📡 Limit polling frequency to once every 5 to 10 times to avoid excess API calls.
  • 📦 Once the job is complete, stop polling immediately.
  • Do not retry polling if you receive an “All Inputs Failed” error. This error indicates that the inputs are invalid (e.g., LinkedIn profile doesn’t exist).

Why not retry? “All Inputs Failed” means the system has determined that no valid inputs exist. Retrying the job will produce the same result, so polling won’t help.
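
One way to encode this rule is a guard that the polling loop consults before its next attempt. The sketch assumes the error reaches your code as a message string containing "All Inputs Failed"; the exact error format and the name should_keep_polling are hypothetical.

```python
# Error messages after which polling or retrying cannot change the outcome.
NON_RETRYABLE_ERRORS = ("all inputs failed",)


def should_keep_polling(error_message=None):
    """Return False when the error means retrying will not help."""
    if not error_message:
        return True  # No error reported: keep polling as usual
    lowered = error_message.lower()
    return not any(marker in lowered for marker in NON_RETRYABLE_ERRORS)
```

Matching on a lowercased message keeps the check tolerant of differences in capitalization.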

Common Pitfalls to Avoid

1️⃣ Avoid Infinite Polling Loops

  • 🔄 Always set a stop condition to exit the polling loop if no webhook is received.
  • 🕰️ Use a timeout or maximum retry count to ensure the loop ends.
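
Both stop conditions can live in one reusable helper. The following is a minimal sketch, not an official client: poll_until and its parameters are hypothetical names, and the sleep and clock arguments exist only so the helper can be exercised without real waiting.

```python
import time


def poll_until(check, interval=30.0, timeout=3600.0, max_attempts=200,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns a value other than None.

    Raises TimeoutError once the time budget or the attempt budget is spent.
    """
    deadline = clock() + timeout
    for _ in range(max_attempts):
        result = check()
        if result is not None:
            return result
        if clock() + interval > deadline:
            break  # The next attempt would start after the deadline
        sleep(interval)
    raise TimeoutError("Polling stopped: timeout or maximum attempts reached")
```

In practice, check would return the results once the job's status is finished and None otherwise, so the loop ends on completion, on timeout, or after max_attempts, whichever comes first.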

2️⃣ Don’t Poll Without a Reason

  • 📡 If a webhook is available, use it instead of polling.
  • 💡 Polling should be a fallback, not your default strategy.

3️⃣ Don’t Retry on ‘All Inputs Failed’

  • ⚠️ If you receive “All Inputs Failed” (e.g., due to invalid profiles), it’s better to stop polling.
  • 🚫 Retrying won’t help because the system has already identified the issue (like a missing profile).

For more details on how to implement GET JOB RESULTS or Webhook Handling, check out our API Reference.

Polling as an Alternative to Webhooks

When integrating Captain Data into your workflows, retrieving job results is a crucial step. While webhooks are typically the preferred method for automation, certain situations make polling a more practical and reliable alternative—especially when working in local environments or dealing with specific technical constraints.

Challenges with Webhooks

Webhooks are designed to send data automatically when a job’s status changes, but they come with specific requirements and potential challenges:

  • Network Accessibility: Webhooks require an externally accessible endpoint, which can be difficult to configure in local development.
  • Error Handling: Errors such as 404 Not Found or 502 Bad Gateway can arise from incorrect configurations or unstable connections.
  • Setup Complexity: Running webhooks locally often requires additional tools, such as ngrok, to expose local endpoints to the internet.

When Polling is a Good Alternative

Polling provides a flexible alternative by allowing you to actively check for job updates rather than waiting for webhooks to push data. This approach can be particularly useful in the following scenarios:

  • Local Development: Polling avoids the need to configure external endpoints, making it easier to test workflows.
  • Unreliable Webhook Delivery: If webhook delivery is inconsistent due to firewall rules or connectivity issues, polling provides a controlled way to retrieve data.
  • Tighter Process Control: You can define when and how often to check job statuses, reducing dependency on external event triggers.

Polling in Local Development

Polling can be particularly useful in local development because:

  • No External Endpoint Required: You don’t need to expose a local server to the internet.
  • Straightforward Implementation: Everything runs in a single script without additional dependencies.
  • Better Debugging Control: You can inspect and control API calls directly.

That said, webhooks remain the recommended approach for production environments, as they provide real-time updates and reduce unnecessary API calls. If you’re working locally but still want to test webhooks, you can use tools like ngrok to tunnel your local server.
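
For local testing, a receiver can be as small as the sketch below, built only on Python's standard library. It assumes notifications arrive as JSON in a POST body; the names parse_notification, WebhookHandler, and serve, as well as port 8000, are illustrative choices rather than anything prescribed by Captain Data.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received_events = []  # Notifications collected for inspection while debugging


def parse_notification(raw_body):
    """Decode a JSON webhook body; return {} if it is not a JSON object."""
    try:
        data = json.loads(raw_body or b"{}")
    except ValueError:  # Covers invalid JSON and invalid UTF-8
        return {}
    return data if isinstance(data, dict) else {}


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        received_events.append(parse_notification(self.rfile.read(length)))
        self.send_response(200)  # Acknowledge receipt
        self.end_headers()


def serve(port=8000):
    """Run the receiver locally until interrupted."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

After calling serve(), running `ngrok http 8000` in another terminal exposes the receiver through a public URL that can be registered as the webhook endpoint.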

Conclusion

Both polling and webhooks have their place in workflow automation. Polling is a practical alternative in local development and scenarios where webhooks pose connectivity challenges, while webhooks are ideal for handling high-volume asynchronous workflows in production.

For teams integrating Captain Data at scale, webhooks ensure efficient, event-driven updates. However, if you’re troubleshooting, working in a local environment, or need more control over job execution, polling can be a reliable method to retrieve results on demand.

If you have questions, contact us at support@captaindata.co.