Resolving Nested Promise Arrays

March 24, 2019

Oftentimes, considerable asynchronous complexity arises when making requests to multiple resources in JavaScript in order to merge disparate data into a single cohesive data object.

Instances where this may occur include aggregating various APIs, retrieving data from multiple databases, fetching or processing data with service workers, and working with a service-oriented architecture; the list continues.

What follows is an example situation with a set of solutions that build on one another, which may offer some insight into solving this type of problem.

A Situation

As a hypothetical, say a developer is looking to retrieve all comments made by users who are considered active on a particular article post, with active defined by any sort of engagement metric.

First, an asynchronous request is made to retrieve the active users of the specific post. Utilizing await, one can pause execution of the surrounding async function until the Promise settles.

  const activeUsers = await post.getActiveUsers();

The response data structure contains an array of User objects with their last few comments. Instead of returning the full Comment object for each, most likely to reduce the response payload size, only the corresponding commentIds are provided by the resource.

  [
    {
      id: '8190834',
      commentIds: [ '0002434', '0002437', '0002440' ]
    },
    {
      id: '8190835',
      commentIds: [ '0002436', '0002437', '0002441' ]
    }
  ]

To retrieve the comments tied to the users, another call may be made to fetch each individual comment based on the commentId and then assign it back to the corresponding User object.

  const activeUsers = await post.getActiveUsers();
  const activeUsersComments = activeUsers.map(user => {
    user.comments = user.commentIds.map(async commentId => {
      return await comments.getCommentById(commentId)
    });

    return user;
  });

But now the resulting data structure contains a nested array of Promises for each set of comments (hereafter referred to as Array<Promise>). Even though the map callback uses the await operator, the comments are not resolved: an async function always returns a Promise, and map stores that Promise without waiting for it.

  [
    {
      id: '8190834',
      commentIds: [ '0002434', '0002437', '0002440' ],
      comments: [ 
        Promise { <pending> },
        Promise { <pending> },
        Promise { <pending> } 
      ]
    },
    {
      id: '8190835',
      commentIds: [ '0002436', '0002437', '0002441' ],
      comments: [ 
        Promise { <pending> },
        Promise { <pending> },
        Promise { <pending> } 
      ]
    }
  ]
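
The behavior can be reproduced in isolation. The following is a minimal sketch, assuming a hypothetical fetchComment function standing in for comments.getCommentById; it shows that map collects the Promise returned by each async callback rather than the awaited value.

```javascript
// Hypothetical stand-in for `comments.getCommentById`.
function fetchComment(commentId) {
  return Promise.resolve({ id: commentId });
}

// `map` calls the async callback and immediately stores whatever it
// returns, which is always a Promise, regardless of the inner `await`.
function mapWithAsyncCallback(commentIds) {
  return commentIds.map(async commentId => {
    return await fetchComment(commentId);
  });
}

const pending = mapWithAsyncCallback(['0002434', '0002437']);
console.log(pending.every(item => item instanceof Promise)); // true
```

The await only pauses the callback it appears in; the outer map call has already moved on to the next item.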

A Solution

One solution would be to pass the array of pending comments to Promise.all, which returns a single Promise that fulfills with an array of the resolved values. In effect, the Array<Promise> is merged into a single Promise before being assigned back to the User object.

Moving it to a separate function, resolveComments, may also improve clarity.

  function resolveComments(user) {
    return Promise.all(
      user.commentIds.map(commentId => {
        return comments.getCommentById(commentId)
      })
    );
  }

  const activeUsers = await post.getActiveUsers();
  const activeUsersFullComments = activeUsers.map(async user => {
    user.comments = await resolveComments(user)
    return user;
  });

The activeUsersFullComments variable would contain a single Array<Promise>, with each item eventually resolving to a User object containing the nested array of resolved Comment objects.

  [ 
    Promise { <pending> }, 
    Promise { <pending> } 
  ]

Following this approach, a next step could be to, once again, combine the Array<Promise> into a single Promise by passing it to the Promise.all method and calling it a day.

  const resolvedActiveUsersFullComments = await Promise.all(activeUsersFullComments);

All of the comments are resolved for each User and may be utilized as desired within the application.

  [
    { 
      id: '8190834',
      commentIds: [ '0002434', '0002437', '0002440' ],
      comments: [ 
        { 
          id: '0002434',
          content: 'Awesome post!' 
        },
        { 
          id: '0002437',
          content: 'I actually just taught my dog JavaScript.' 
        },
        { 
          id: '0002440',
          content: 'Please stop boring the people.' 
        } 
      ] 
    },
    { 
      id: '8190835',
      commentIds: [ '0002436', '0002437', '0002441' ],
      comments: [
        {
          id: '0002436',
          content: 'Where can I find more information?'
        },
        {
          id: '0002437',
          content: 'I actually just taught my dog JavaScript.'
        },
        {
          id: '0002441',
          content: 'Not quite sure where I am.'
        }
      ] 
    }
  ]

A Better Solution

But say one wanted to make it a bit more abstract and perform the same logic on any type of User, not just the active ones.

They could have a resolveUsers function that calls resolveComments for each user, with each Array<Promise> wrapped in a Promise.all.

  function resolveUsers(users) {
    return Promise.all(
      users.map(user => {
        return resolveComments(user)
      })
    );
  }

  async function resolveComments(user) {
    user.comments = await Promise.all(
      user.commentIds.map(commentId => {
        return comments.getCommentById(commentId)
      })
    );

    return user;
  }

  const activeUsers = await post.getActiveUsers();
  const resolvedUsers = await resolveUsers(activeUsers);

Notice how resolveComments requires an await operator before the Promise.all in the user.comments assignment and resolveUsers does not. This is simply because resolveUsers returns the Promise directly, whereas resolveComments needs the resolved value in order to assign it.
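
The difference between assigning and returning can be seen in a small sketch. The fetchAll, withoutAwait, and withAwait functions below are hypothetical names used only for illustration.

```javascript
// Hypothetical stand-in for a batch of comment requests.
function fetchAll(commentIds) {
  return Promise.all(commentIds.map(id => Promise.resolve({ id })));
}

// Without `await`, the Promise itself is assigned to `user.comments`.
async function withoutAwait(user) {
  user.comments = fetchAll(user.commentIds);
  return user;
}

// With `await`, the resolved array of comments is assigned.
async function withAwait(user) {
  user.comments = await fetchAll(user.commentIds);
  return user;
}

withoutAwait({ commentIds: ['0002434'] })
  .then(user => console.log(user.comments instanceof Promise)); // true

withAwait({ commentIds: ['0002434'] })
  .then(user => console.log(Array.isArray(user.comments))); // true
```

A returned Promise is unwrapped by whoever awaits the function, but a Promise stored on a property stays a Promise until something explicitly awaits it.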

A Better, Little More Performant Solution

What if there were comments that more than one user contributed to or possibly a comment thread containing references to multiple users (i.e. multiple users having the same commentId value in their User object)?

It wouldn’t make sense to make a request for that resource more than once, would it?

Caching

A way to handle this would be through the use of caching, by storing the comment request in memory for subsequent use when resolving another User object (see also: memoization). As long as the comment is immutable, or updates are unlikely to occur while all users are being iterated over and resolved, the following approach should suffice.

By creating a cache object, each comment request can be stored on first use; later lookups check the cache and, on a hit, reuse the stored entry rather than making an additional request to an external resource.

  function resolveUsers(users) {
    const resolveComments = createCommentsResolver();
  
    return Promise.all(
      users.map(user => {
        return resolveComments(user)
      })
    );
  }

  /**
   * Creates a `resolveComments` function which retrieves the full
   *  comments for the specified user, reusing the cached request for
   *  any comment that has already been requested.
   *
   * @returns {Function} `resolveComments`
   */
  function createCommentsResolver() {
    let cache = {};

    return async function resolveComments(user) {
      // Retrieves all comments based on `commentId`.
      user.comments = await Promise.all(
        user.commentIds.map(commentId => {
          // Caches the pending request if this comment hasn't been requested yet.
          if (!cache[commentId]) {
            cache[commentId] = comments.getCommentById(commentId)
          }
    
          return cache[commentId]
        })
      );
  
      return user;
    }
  }

  const activeUsers = await post.getActiveUsers();
  const resolvedUsers = await resolveUsers(activeUsers);

The closure created by createCommentsResolver keeps the cache in its lexical scope, so its state persists across each iteration of the users array.
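
To check that the cache actually saves requests, the resolver can be run against a stub that counts calls. This is a sketch under assumptions: the comments object and its requestCount property below are hypothetical stand-ins for the real resource, and demo is an illustrative name.

```javascript
// Hypothetical stub standing in for the real `comments` resource;
// it counts how many requests are actually made.
const comments = {
  requestCount: 0,
  getCommentById(commentId) {
    this.requestCount += 1;
    return Promise.resolve({ id: commentId });
  }
};

function createCommentsResolver() {
  const cache = {};

  return async function resolveComments(user) {
    user.comments = await Promise.all(
      user.commentIds.map(commentId => {
        // Store the Promise itself, so concurrent lookups of the same
        // id share one in-flight request.
        if (!cache[commentId]) {
          cache[commentId] = comments.getCommentById(commentId);
        }
        return cache[commentId];
      })
    );
    return user;
  };
}

async function demo() {
  const resolveComments = createCommentsResolver();
  const users = [
    { id: '8190834', commentIds: ['0002434', '0002437', '0002440'] },
    { id: '8190835', commentIds: ['0002436', '0002437', '0002441'] }
  ];
  await Promise.all(users.map(user => resolveComments(user)));
  return comments.requestCount;
}

demo().then(count => console.log(count)); // 5 requests for 6 references
```

Because the cache holds the pending Promise rather than the resolved comment, the shared commentId '0002437' triggers only one request even though both users are resolved concurrently. One consequence worth keeping in mind is that a rejected request would stay cached as well.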

Situational Caveats

An issue with the example case is that the number of requests or queries made grows in proportion to the number of comments included. That is, every new comment created by a user adds another request.

  5 users * 1 comment = 5 requests
  
  5 users * 2 comments = 10 requests
  
  5 users * 10 comments = 50 requests

n+1 Problem

This may lead to an n+1 problem, “when a request to load one item turns into n+1 requests since the item has n associated items.” It can potentially be mitigated by batching the queries for the nested property ahead of time, rather than issuing one request per item.
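
Batching might look something like the following sketch. It assumes the resource exposes a batch lookup, which the example resource above does not show: commentsApi and getCommentsByIds are hypothetical, as is the resolveUsersBatched name.

```javascript
// Hypothetical batch endpoint; `getCommentsByIds` is an assumption and
// is stubbed here to count how many requests are made.
const commentsApi = {
  requestCount: 0,
  getCommentsByIds(commentIds) {
    this.requestCount += 1;
    return Promise.resolve(commentIds.map(id => ({ id })));
  }
};

async function resolveUsersBatched(users) {
  // Collect every unique commentId across all users up front.
  const uniqueIds = [...new Set(users.flatMap(user => user.commentIds))];

  // One request for the whole set instead of one per comment.
  const fetched = await commentsApi.getCommentsByIds(uniqueIds);
  const byId = new Map(fetched.map(comment => [comment.id, comment]));

  // Return copies so the input users are left unmodified.
  return users.map(user => ({
    ...user,
    comments: user.commentIds.map(id => byId.get(id))
  }));
}
```

With this shape, the request count no longer depends on the number of users or comments, although a real batch endpoint would likely impose its own limit on how many ids can be sent at once.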

Rate Limiting

Another scenario where it may perform poorly is when working with a rate-limited resource, such as a third-party API. On each iteration, there may be a significant delay between resolving each subset of data, culminating in one long wait before the data is accessible.
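
One way to stay under a limit is to cap how many requests are in flight at once. The sketch below processes items in fixed-size chunks; mapInChunks is a hypothetical helper, and the chunk size would need to be chosen to suit the actual limits of the resource.

```javascript
// Resolves `items` through `worker` in chunks of `chunkSize`, so at most
// `chunkSize` requests are in flight at any time.
async function mapInChunks(items, chunkSize, worker) {
  const results = [];

  for (let start = 0; start < items.length; start += chunkSize) {
    const chunk = items.slice(start, start + chunkSize);
    // Wait for the whole chunk before starting the next one.
    const resolved = await Promise.all(chunk.map(item => worker(item)));
    results.push(...resolved);
  }

  return results;
}

// Hypothetical stand-in for `comments.getCommentById`.
const getCommentById = commentId => Promise.resolve({ id: commentId });

mapInChunks(['0002434', '0002437', '0002440'], 2, getCommentById)
  .then(resolved => console.log(resolved.length)); // 3
```

This trades total latency for a bounded request rate: each chunk waits on its slowest request, so a pool-style limiter may be worth considering if throughput matters.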

Fault Tolerance

One thing to note is the lack of fault tolerance in the examples above. As discussed in another article of mine, Asynchronous Fishing: The Multi-Promise Resolution, when using Promise.all for merging, the returned Promise rejects as soon as any one of the requests fails, so none of the results in the Array<Promise> are delivered.
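
A possible way to tolerate individual failures is Promise.allSettled, which was standardized in ES2020 and so requires an environment that supports it. The following is a sketch, not the approach from the article mentioned above; getCommentById is a hypothetical stub in which the id 'missing' simulates a failed request, and resolveCommentsTolerant is an illustrative name.

```javascript
// Hypothetical stand-in for `comments.getCommentById`; the id 'missing'
// simulates a failed request.
function getCommentById(commentId) {
  return commentId === 'missing'
    ? Promise.reject(new Error(`Comment ${commentId} not found`))
    : Promise.resolve({ id: commentId });
}

async function resolveCommentsTolerant(user) {
  // `allSettled` never rejects; it reports the outcome of every request.
  const settled = await Promise.allSettled(
    user.commentIds.map(commentId => getCommentById(commentId))
  );

  // Keep the fulfilled comments and drop the rejected ones.
  const resolved = settled
    .filter(result => result.status === 'fulfilled')
    .map(result => result.value);

  return { ...user, comments: resolved };
}
```

Silently dropping failed comments is only one policy; depending on the application, it may be preferable to log the rejection reasons or retry the failed ids.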

Other Solutions

Plenty of libraries exist in the wild to help mitigate some of these issues. For instance, RxJS and its Observables may be a better approach for merging the two example resources.

GraphQL and nested resolvers are intended for doing just that: resolving nested data. They are certainly a better option if the resource is a GraphQL server.

In addition to these, Facebook’s DataLoader is designed specifically for handling most of the aforementioned caveats by being “used as part of your application’s data fetching layer to provide a simplified and consistent API over various remote data sources such as databases or web services via batching and caching.”

Final Note

No solution is perfect. Iterating and optimizing as needed is, as always, the most pragmatic approach, and the above is just one way this can be solved, albeit with the noted caveats.

Let me know your thoughts, questions, considerations; always interested in hearing feedback!

Photo credit: Mr.TinDC on Visualhunt / CC BY-ND