Significant slowdown on large files #10

@xelatihy

The current version of happly is quite slow on large files compared to a home-brewed solution I cooked up. The profiler suggests that the problem is the allocation of many small vectors for list properties. On the Lucy model from the Stanford repo, happly takes about 16 seconds, of which 7 are spent just on vector allocations. My home-brewed solution takes half that time.

I propose the following changes:

  1. change the storage of list props from vector<vector<T>> data to three vectors: std::vector<size_t> start; std::vector<uint8_t> count; std::vector<T> data;. Here data holds the concatenated elements of all lists, start holds the index in data where each list begins, and count holds the size of each list
  2. add an overload getListProperty(vector<array<T, N>>& data, vector<uint8_t>& count) that reads the data into preallocated fixed-size lists, and keep the existing versions for backward compatibility
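
To make the two points above concrete, here is a minimal sketch of what the flattened storage and a fixed-size reader could look like. The names FlatListProperty, appendList and getFixedListProperty are hypothetical and are not part of happly; the handling of lists longer than N (throwing) is also just an assumption for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical flattened storage for one list property (not happly's actual type).
template <typename T>
struct FlatListProperty {
  std::vector<std::size_t> start;   // index into `data` where each list begins
  std::vector<std::uint8_t> count;  // number of elements in each list
  std::vector<T> data;              // elements of all lists, concatenated

  // Append one list of n elements; no per-list vector is allocated.
  void appendList(const T* values, std::uint8_t n) {
    start.push_back(data.size());
    count.push_back(n);
    data.insert(data.end(), values, values + n);
  }

  std::size_t numLists() const { return start.size(); }
};

// Hypothetical reader in the spirit of the proposed
// getListProperty(vector<array<T, N>>&, vector<uint8_t>&) overload.
// Assumption: lists longer than N are an error; shorter lists leave the
// unused tail of the array value-initialized.
template <typename T, std::size_t N>
void getFixedListProperty(const FlatListProperty<T>& prop,
                          std::vector<std::array<T, N>>& out,
                          std::vector<std::uint8_t>& outCount) {
  out.assign(prop.numLists(), std::array<T, N>{});
  outCount = prop.count;
  for (std::size_t i = 0; i < prop.numLists(); ++i) {
    const std::size_t n = prop.count[i];
    if (n > N) throw std::runtime_error("list longer than N");
    for (std::size_t j = 0; j < n; ++j) {
      out[i][j] = prop.data[prop.start[i] + j];
    }
  }
}
```

With this layout, reading a mesh grows three vectors instead of allocating one small vector per face. Note that a uint8_t count caps each list at 255 elements, and that start could also be recomputed from count as a running sum if memory matters more than lookup speed.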

If this sounds good, I could take a crack at it, but only if the approach feels right to you.

    Labels

    enhancement: New feature or request
