No-Code API Orchestration: Merging Multiple Backend Responses Into One API Call

Your clients shouldn't need to make five separate API calls to render one screen. Here's how to build a fan-out, merge, and respond pattern at the gateway layer — configured, not coded.

  • workflows
  • api-management
  • developer-experience
  • performance
Zerq team

The client makes one request. The screen needs data from five different services. You have three options: make the client call all five services itself, write a new aggregation microservice, or handle it at the gateway layer without writing code.

The third option is the one most teams skip — either because they don't know the gateway supports it, or because their current gateway is a dumb reverse proxy that cannot.

Here's what gateway-level orchestration looks like and when it is the right choice.

The problem: frontend latency from serial backend calls

A typical dashboard screen needs user profile data, account balance, recent transactions, notification count, and a feature flag value. Without orchestration, the client either:

Calls each service serially. Each request waits for the previous one to complete. The total latency is the sum of all five service response times: 50ms + 80ms + 120ms + 30ms + 20ms = 300ms minimum, before any rendering starts.

Calls services in parallel from the client. Better for latency, but the client now manages five in-flight requests, handles five different error shapes, and has to correlate five responses before it can render. The logic for handling partial failures (what if the notification service is down?) lives in every client implementation — web, mobile, third-party.

Relies on a bespoke aggregation service. Teams often build a "BFF" (Backend for Frontend) — a dedicated service that calls the backends and aggregates the responses for a specific client type. The BFF solves the latency problem but creates a new deployment unit, a new service to maintain, and a new place where logic drifts out of sync with the backends.

Gateway-level orchestration avoids the drawbacks of all three approaches without adding a new service.
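The latency arithmetic above is easy to verify. A minimal sketch, using Python's asyncio purely as a simulation (the service names and latencies are the illustrative figures from the example, not real endpoints):

```python
import asyncio
import time

# Simulated per-service response times from the example above (seconds).
LATENCIES = {"profile": 0.05, "balance": 0.08, "transactions": 0.12,
             "notifications": 0.03, "flags": 0.02}

async def call(service: str) -> str:
    await asyncio.sleep(LATENCIES[service])  # stand-in for an HTTP call
    return service

async def serial() -> float:
    start = time.monotonic()
    for svc in LATENCIES:
        await call(svc)  # each call waits for the previous one to finish
    return time.monotonic() - start

async def parallel() -> float:
    start = time.monotonic()
    await asyncio.gather(*(call(s) for s in LATENCIES))  # all in flight at once
    return time.monotonic() - start

print(f"serial:   {asyncio.run(serial()):.2f}s")    # ≈ 0.30s — the sum
print(f"parallel: {asyncio.run(parallel()):.2f}s")  # ≈ 0.12s — the max
```

Serial waits add up to the sum of all response times; parallel fan-out waits only as long as the slowest service.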

What the workflow looks like

A fan-out, merge, and respond workflow in the gateway has four logical phases:

1. Fan out: parallel calls to multiple backends. The workflow receives the inbound request and dispatches parallel calls to all required backends simultaneously. Call latency becomes the maximum of all service response times, not the sum. The five services above respond in parallel: the slowest is 120ms, so the total upstream wait is 120ms — not 300ms.

2. Handle partial failures per service. Each backend call in the workflow has its own timeout and fallback. If the notification service times out, the workflow substitutes a default value ("notifications": 0) and continues. The aggregated response still arrives — with a best-effort value for the failed service and a metadata field indicating which services were degraded.

3. Merge responses into a single shape. The workflow collects all backend responses and maps them into the single response shape the client expects. Fields are extracted from each backend response and assembled into the combined output. Field names, nesting, and data types are normalised at this step — the client receives a consistent shape regardless of how each backend names its fields.

4. Respond with the merged payload. The client makes one request, receives one response, handles one error shape. The complexity of the multi-backend call is invisible from the client's perspective.
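The four phases can be sketched in Python to show what the gateway runtime does under the hood. Everything here is illustrative — the backend functions, field names, and the short timeouts are invented for the example, not a real gateway API:

```python
import asyncio

# Hypothetical backends; in a real gateway these would be HTTP calls.
async def get_profile():
    return {"displayName": "Ada", "avatar_url": "/a.png"}

async def get_balance():
    return {"amount_cents": 123450, "currency": "GBP"}

async def get_notifications():
    await asyncio.sleep(0.3)  # simulate a hung notification service
    return {"unread": 7}

async def call_with_fallback(coro, fallback, timeout):
    """Phase 2: per-service timeout and fallback, so one slow backend
    never blocks the aggregated response."""
    try:
        return await asyncio.wait_for(coro, timeout), "ok"
    except asyncio.TimeoutError:
        return fallback, "timeout"

async def dashboard():
    # Phase 1: fan out — total upstream wait is the slowest call (or its timeout).
    # Timeouts are kept artificially short so the example runs quickly.
    results = await asyncio.gather(
        call_with_fallback(get_profile(), {}, timeout=2.0),
        call_with_fallback(get_balance(), {}, timeout=2.0),
        call_with_fallback(get_notifications(), {"unread": 0}, timeout=0.1),
    )
    (profile, p_st), (balance, b_st), (notifs, n_st) = results
    # Phase 3: merge into the single shape the client expects,
    # normalising field names and units across backends.
    return {
        "data": {
            "name": profile.get("displayName"),
            "balance": balance.get("amount_cents", 0) / 100,
            "notifications": notifs.get("unread", 0),
        },
        "_meta": {
            "degraded": [s for s, st in [("profile", p_st),
                                         ("balance", b_st),
                                         ("notifications", n_st)]
                         if st != "ok"],
        },
    }

resp = asyncio.run(dashboard())
```

Phase 4 is simply returning `resp` to the client: one request in, one merged response out, with the degraded notification service visible in `_meta` rather than failing the whole call.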

The configuration pattern

In a gateway that supports visual workflows, this pattern looks like:

[Incoming Request: GET /dashboard]
        |
[Parallel Fan-Out]
    |         |       |       |        |
[Profile] [Balance] [Txns] [Notifs] [Flags]
    |         |       |       |        |
[Each: timeout=2s, fallback=default value]
        |
[Merge: map fields to output shape]
        |
[Response: 200 with merged JSON]

The configuration specifies:

  • Which backends to call (upstream URLs)
  • What to extract from each response (field mappings)
  • What to return on timeout or error for each service (fallback values)
  • What the output shape looks like (field names, structure)

No code. No deployment. The workflow runs inside the gateway runtime that is already serving your traffic.
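As a rough sense of what such a configuration captures, here is the same pattern written out as a declarative structure. The schema and field names are invented for illustration — they are not Zerq's actual configuration format:

```python
# A hypothetical declarative workflow config: which backends to call,
# what to extract, what to fall back to, and the output shape.
WORKFLOW = {
    "route": "GET /dashboard",
    "fan_out": {
        "profile":       {"url": "http://profile.internal/v1/me",
                          "timeout_ms": 2000, "fallback": {}},
        "notifications": {"url": "http://notifs.internal/v1/count",
                          "timeout_ms": 2000, "fallback": {"count": 0}},
    },
    # Output field -> (source service, source field). The merge step
    # normalises backend field names into one client-facing shape.
    "merge": {
        "name":          ("profile", "displayName"),
        "notifications": ("notifications", "count"),
    },
}
```

The point is that everything the workflow needs is data: the gateway runtime interprets this structure at request time, so changing the aggregation is an edit-and-publish, not a build-and-deploy.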

When to use orchestration vs a BFF

Orchestration at the gateway layer is the right choice when:

The aggregation logic is stable. If the shape of the merged response changes rarely (the fields are well-defined and the backends are mature), a gateway workflow is easier to maintain than a BFF. The workflow is configuration, not code — changes are a workflow publish, not a deployment.

The client types share the same shape. If your web app, mobile app, and third-party partners all need the same merged response, one gateway workflow serves all of them. With BFFs, you typically need a separate one per client type.

You don't own the backends. If you're aggregating responses from third-party APIs, internal services owned by other teams, or legacy systems you can't modify, the gateway is a natural aggregation point. You don't need the backend teams to coordinate.

A dedicated BFF is the right choice when:

The aggregation involves complex business logic. If deciding which fields to include requires domain knowledge, conditional branching on business rules, or stateful computation, that logic belongs in a service — not in gateway configuration.

Different client types need fundamentally different shapes. If the web app and mobile app need structurally different responses (not just different field subsets), separate BFFs with their own data models may be cleaner.

In practice, many teams use both: a gateway workflow for the common case and lightweight BFFs for client-specific edge cases that genuinely need custom logic.

Caching the aggregated response

Because the gateway sees the full merged response, it can cache it. A dashboard response that aggregates five services can be cached at the gateway for 10 seconds. For a high-traffic application, this can eliminate the vast majority of fan-out calls during a burst — all five backends see a fraction of the traffic.

Partial caching is also possible: cache the slow, expensive call (transaction history) for 30 seconds, and always call the fast services fresh (notifications, feature flags). The workflow handles the cache logic alongside the fan-out logic, in the same place.
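Per-source caching reduces to a TTL check before each backend call. A minimal sketch of the behaviour (not a production cache — no eviction, no locking, invented call names):

```python
import asyncio
import time

class TTLCache:
    """Per-source TTL cache: serve a stored value while it is fresh,
    otherwise call the backend and store the result."""
    def __init__(self):
        self._store = {}  # key -> (expires_at, value)

    async def get_or_call(self, key, coro_fn, ttl_s):
        now = time.monotonic()
        hit = self._store.get(key)
        if hit and hit[0] > now:
            return hit[1]  # fresh enough: skip the backend entirely
        value = await coro_fn()  # stale or missing: call the backend
        self._store[key] = (now + ttl_s, value)
        return value

calls = {"transactions": 0, "flags": 0}  # count real backend hits

async def get_transactions():
    calls["transactions"] += 1
    return [{"id": 1}]

async def get_flags():
    calls["flags"] += 1
    return {"beta": True}

async def handle_request(cache):
    # Cache the slow, expensive call; ttl_s=0 means "always call fresh".
    txns = await cache.get_or_call("transactions", get_transactions, ttl_s=30)
    flags = await cache.get_or_call("flags", get_flags, ttl_s=0)
    return {"transactions": txns, "flags": flags}

async def main():
    cache = TTLCache()
    for _ in range(3):  # three client requests in a burst
        await handle_request(cache)
    return calls

counts = asyncio.run(main())
print(counts)  # → {'transactions': 1, 'flags': 3}
```

Across three requests, the expensive transaction call hits its backend once while the cheap flag call stays fresh every time — exactly the asymmetry partial caching is for.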

Error visibility for aggregated responses

When a backend in the fan-out fails or times out, you need to know. A degraded aggregated response — where the notification count defaulted to zero because the service was down — should be observable.

The workflow can include a metadata field in the response:

{
  "data": { ... },
  "_meta": {
    "degraded": ["notifications"],
    "sources": {
      "profile": "ok",
      "balance": "ok",
      "transactions": "ok",
      "notifications": "timeout",
      "flags": "ok"
    }
  }
}

This tells the client exactly which services were degraded, so it can display appropriate UI (a "notifications unavailable" state rather than silently showing zero). And it gives your monitoring the per-service failure signal, not just "the aggregated endpoint returned 200 with degraded data."
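On the client side, consuming that convention is a small check before rendering. A sketch, reusing the hypothetical `_meta.degraded` field from the example above:

```python
def render_notifications(response: dict) -> str:
    """Decide what the notifications widget shows for a merged response.

    If the notification source was degraded, show an explicit
    'unavailable' state instead of silently rendering the fallback zero."""
    if "notifications" in response.get("_meta", {}).get("degraded", []):
        return "notifications unavailable"
    return f"{response['data']['notifications']} unread"

# Degraded response, as in the JSON example above.
resp = {"data": {"notifications": 0},
        "_meta": {"degraded": ["notifications"],
                  "sources": {"notifications": "timeout"}}}
print(render_notifications(resp))  # → notifications unavailable
```

The same check works for any source in the fan-out, so the client handles one error convention instead of five different backend error shapes.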


Zerq's workflow designer supports parallel fan-out, per-service fallback, field-level response merging, and per-source caching — all configured visually without writing aggregation code. See the workflow capabilities or request a demo to walk through your specific multi-backend aggregation use case.