
FileDownload waits indefinitely on unconsumed stream #3276

@barjin

Because of how Crawlee request handlers are designed, a user-supplied request handler can return before the response stream has been consumed. To account for this, the crawler waits until the stream is fully read before it considers the request processed (link).

If the handler never consumes the stream at all, that wait never ends and the crawler hangs indefinitely. A minimal reproduction:

```ts
import { FileDownload } from '@crawlee/http';

const crawler = new FileDownload({
    streamHandler: ({ request }) => {
        console.log(`Downloading: ${request.url}`);
    },
});

crawler.run(['https://crawlee.dev/img/crawlee-light.svg']);
```
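To illustrate the mechanism, here is a minimal sketch using only Node.js built-ins, not Crawlee's actual implementation. Both `waitForEnd` and `drain` are hypothetical helpers introduced for this example: the first shows how a wait on the stream's end could be bounded by a timeout, the second shows how a handler that does not need the body could discard it so the stream still ends. Only the readable side of the stream is assumed to matter here.

```ts
import { Readable } from 'node:stream';
import { finished } from 'node:stream/promises';

/**
 * Hypothetical helper: waits for the readable side of `stream` to end,
 * but gives up after `timeoutMs`. Resolves to `true` if the stream ended
 * and `false` if the timeout fired first. The stream itself is not destroyed.
 */
async function waitForEnd(stream: Readable, timeoutMs: number): Promise<boolean> {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    try {
        // Aborting the signal only cancels the wait, not the stream.
        await finished(stream, { signal: controller.signal, writable: false });
        return true;
    } catch (err) {
        // An abort triggered by our own timer means "still not consumed".
        if (controller.signal.aborted) return false;
        throw err;
    } finally {
        clearTimeout(timer);
    }
}

/**
 * Hypothetical helper: discards whatever is left in `stream` and resolves
 * once it has ended. Useful when the body is not needed.
 */
async function drain(stream: Readable): Promise<void> {
    // resume() switches the stream to flowing mode; with no 'data'
    // listener attached, the chunks are simply dropped.
    stream.resume();
    await finished(stream, { writable: false });
}
```

As a workaround on the user side, a `streamHandler` that does not read the body could call something like `await drain(stream)` before returning, assuming the handler context exposes the response as a Node.js readable stream named `stream`. On the crawler side, a bounded wait in the spirit of `waitForEnd` would turn the indefinite hang into a detectable condition; whether that is the right fix for Crawlee is a design question this sketch does not settle.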

Labels: t-tooling (issues with this label are in the ownership of the tooling team)
