Streaming Large Files in the Browser with JavaScript

How many times have you started up a web application, only to watch it slowly grind to a halt until eventually crashing completely with the dreaded “Out of Memory” error? Browser-based applications are good at a lot of things, but efficiently handling memory isn’t typically one of them.

Here at SGSI we often need to quickly process large amounts of data from files provided by our users. This work is usually done in a back-end service, which can speed up the process with the help of things like threading and distributed processing. However, occasionally we need to do similar work in the user’s browser, which has much more restricted resources: processing is single-threaded and access to memory is limited.

How do we get around these limitations while maintaining reasonable speed and memory use? We stream the file content in chunks using JavaScript’s ReadableStream API. Available in all major browsers (except for Internet Explorer), the ReadableStream API allows you to read a stream of file data without having to load the entire file into memory or repeatedly open and close the file.
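To make the chunked reading concrete, here is a minimal sketch of the basic read loop. The helper name `countBytes` is hypothetical and used only for illustration; it assumes the input is a Blob or File, both of which expose a `stream()` method.

```javascript
// Hypothetical helper: counts the bytes in a Blob or File by streaming it.
// Only one chunk is held in memory at a time.
async function countBytes(blob) {
  const reader = blob.stream().getReader();
  let total = 0;

  while (true) {
    // Each read() resolves with { done, value }; value is a Uint8Array chunk.
    const { done, value } = await reader.read();
    if (done) {
      return total;
    }
    total += value.byteLength;
  }
}
```

Because each chunk is discarded once it has been counted, memory use stays roughly constant no matter how large the file is. The chunk size itself is chosen by the browser, so the code should not assume that a chunk ends on any particular boundary.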

Below is a simplified version of a class we use to read a file line-by-line using a ReadableStream.

class LineReader {
  // Line separator to split on. Use "\n" for files with Unix line endings.
  newLine = "\r\n";

  constructor(file) {
    this.file = file;
  }

  /**
   * forEachLine is an async generator function which allows
   * us to iterate over each line without repeatedly opening
   * and closing the file.
   */
  async *forEachLine() {
    // Retrieve the ReadableStream object from the file.
    const stream = this.file.stream();
    const reader = stream.getReader();
    const decoder = new TextDecoder("utf-8");
    let currentChunk = "";

    while (true) {
      // Read the next chunk of data.
      const result = await reader.read();

      // Break out of the infinite loop when we reach the end of the file.
      if (result.done) {
        break;
      }

      // Decode the chunk as UTF-8. The stream option makes the decoder hold
      // on to a multi-byte character that is split across two chunks.
      currentChunk += decoder.decode(result.value, { stream: true });

      // Yield every complete line in the buffer. Whatever follows the last
      // newline stays in the buffer until the next chunk arrives.
      let newLineIndex = this.findNewLineIndex(currentChunk);
      while (newLineIndex !== -1) {
        const line = currentChunk.slice(0, newLineIndex);
        currentChunk = currentChunk.slice(newLineIndex + this.newLine.length);
        yield line;
        newLineIndex = this.findNewLineIndex(currentChunk);
      }
    }

    // Flush any bytes the decoder is still holding, then yield the final
    // line if the file does not end with a newline.
    currentChunk += decoder.decode();
    if (currentChunk.length > 0) {
      yield currentChunk;
    }
  }

  findNewLineIndex = (value) => {
    return value.indexOf(this.newLine);
  };
}

(view example in CodeSandbox)
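As a rough illustration of how the generator might be consumed, the sketch below counts the lines that contain a given string. The helper name `countMatchingLines` is an assumption made for this example; it accepts any object with a `forEachLine()` async generator, such as an instance of the LineReader class above.

```javascript
// Hypothetical helper: counts the lines that contain `needle`.
// `lineReader` is any object exposing a forEachLine() async generator.
async function countMatchingLines(lineReader, needle) {
  let matches = 0;

  // for await...of pulls one line at a time from the async generator,
  // so only the current line and the reader's buffer are kept in memory.
  for await (const line of lineReader.forEachLine()) {
    if (line.includes(needle)) {
      matches += 1;
    }
  }

  return matches;
}
```

In a page with a file input, this could be called as `countMatchingLines(new LineReader(input.files[0]), "ERROR")`, where `input` stands for a hypothetical `<input type="file">` element.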