File management for batch job submission #6577

@davidpanderson

Some apps may have large (~1GB) input files that are used by multiple jobs in a batch, and by multiple batches.
There are two aspects in managing such files:

  1. We don't want to download such a file to a client repeatedly. Once it's on a client, it should remain there until we're sure that no future jobs will use it.
  2. We need to eventually delete it from both client and server.

Currently we can make the file sticky. That does 1) but not 2).
We have tables job_file and batch_file_assoc but aren't really using them.

Proposal:

  • A file in a user sandbox can be 'pinned'. If a file is pinned it can't be changed or deleted while it's associated with a non-retired batch.
  • If a job submission uses a pinned file, it creates a job_file record and a batch association. On the client, the file is sticky.
  • Scheduler: the list of sticky files is in the request message. If any is no longer pinned, tell the client to delete it.
  • Job submitters get automated emails reminding them to unpin files. Maybe auto-unpin after some period.
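
To make the proposed rules concrete, here is a rough in-memory sketch in Python. It is not BOINC code: `Sandbox`, `SandboxFile`, `Batch`, and all method names are hypothetical, the `batch_ids` set merely stands in for rows in batch_file_assoc, and the choice to ignore sticky files the sandbox doesn't know about is an assumption the proposal doesn't spell out.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    id: int
    retired: bool = False


@dataclass
class SandboxFile:
    name: str
    pinned: bool = False
    # Hypothetical stand-in for batch_file_assoc rows.
    batch_ids: set = field(default_factory=set)


class Sandbox:
    """Toy model of one user's sandbox; all names here are illustrative."""

    def __init__(self):
        self.files = {}    # file name -> SandboxFile
        self.batches = {}  # batch id -> Batch

    def can_modify(self, name):
        """A pinned file can't be changed or deleted while it is
        associated with at least one non-retired batch."""
        f = self.files[name]
        in_use = any(not self.batches[b].retired for b in f.batch_ids)
        return not (f.pinned and in_use)

    def submit_job(self, batch_id, name):
        """Record the batch association if the file is pinned.
        Returns True if the file should be sticky on the client."""
        self.batches.setdefault(batch_id, Batch(batch_id))
        f = self.files[name]
        if f.pinned:
            f.batch_ids.add(batch_id)
        return f.pinned

    def files_to_delete(self, reported_sticky):
        """Scheduler side: of the sticky files listed in the request,
        return those that are known here and no longer pinned.
        Assumption: unknown names are left alone."""
        return [
            n for n in reported_sticky
            if n in self.files and not self.files[n].pinned
        ]
```

For example, after `submit_job(1, "big.dat")` on a pinned file, `can_modify("big.dat")` stays false until batch 1 is retired, and `files_to_delete(["big.dat"])` only returns the file once it has been unpinned.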
