Skip to content

pinpointcoder/external-sorting

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

44 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Stargazers Downloads Issues Vulnerabilities MIT License Contributor Covenant


External Sorting

For my work I needed an external sorting algorithm to sort big arrays (for example: sort 32 array of ~300MB with 2GB of RAM at the same time), but I haven't found any resource which talks about this kind of solution for NodeJS, so I've created it, I've decided to share this part of my project with the community and I hope that the community will help me to improve my solution.

Quick examples

asc sort of strings separate with \n

import fs from 'fs';
import esort from 'external-sorting';

esort({
  input: fs.createReadStream('input_file'),
  output: fs.createWriteStream('output_file'),
  tempDir: __dirname,
  maxHeap: 100
})
  .asc()
  .then(() => {
    console.log('done');
  })
  .catch(console.error);

desc sort of numbers separate with \n

import fs from 'fs';
import esort from 'external-sorting';

await esort({
  input: fs.createReadStream('input_file'),
  output: fs.createWriteStream('output_file'),
  tempDir: __dirname,
  deserializer: parseFloat,
  serializer: (v: number) => v.toString(10),
  maxHeap: 100
})
  .desc()
  .then(() => {
    console.log('done');
  })
  .catch(console.error);

asc sort of objects by property a.b.c separate with \r\n

import fs from 'fs';
import esort from 'external-sorting';

await esort({
  input: fs.createReadStream('input_file'),
  output: fs.createWriteStream('output_file'),
  tempDir: __dirname,
  deserializer: JSON.parse,
  serializer: JSON.stringify,
  delimiter: '\r\n',
  maxHeap: 100
})
  .asc((obj) => obj.a.b.c)
  .then(() => {
    console.log('done');
  })
  .catch(console.error);

asc sort of objects by properties a, b.c and d separate with \n

import fs from 'fs';
import esort from 'external-sorting';

await esort({
  input: fs.createReadStream('input_file'),
  output: fs.createWriteStream('output_file'),
  tempDir: __dirname,
  deserializer: JSON.parse,
  serializer: JSON.stringify,
  maxHeap: 100
})
  .asc([
    (obj) => obj.a,
    (obj) => obj.b.c,
    (obj) => obj.d
  ])
  .then(() => {
    console.log('done');
  })
  .catch(console.error);

Benchmark

Mean [s] Min [s] Max [s] Relative
sort 500,000 string 2.637 ± 0.048 2.570 2.714 1.00
sort 500,000 number 3.691 ± 0.259 3.442 4.234 1.40 ± 0.10
sort 500,000 object 5.039 ± 0.262 4.741 5.407 1.91 ± 0.11
sort 1,000,000 string 5.887 ± 0.637 5.105 6.820 2.23 ± 0.24
sort 1,000,000 number 6.978 ± 0.499 6.531 7.966 2.65 ± 0.20
sort 1,000,000 object 9.665 ± 0.111 9.522 9.791 3.66 ± 0.08
Model Name: MacBook Pro
Model ID: MacBookPro14.3
Processor Name: Quad-Core Intel Core i7
Processor speed: 2.8 GHz
Number of processors: 1
Total number of cores: 4
Level 2 cache (per core): 256 KB
Level 3 cache: 6 MB
Hyper-Threading Technology: Enabled
Memory: 16 GB
SSD: Apple SM0512L

TODO

  • support .by of fast-sort

Credits

Thanks to @snovakovic for the fast-sort package, you can find it on NPM or GitHub

License

This project is licensed under the MIT License - see the LICENSE file for details

About

An external sorting algorithm implementation for NodeJS 💾

Resources

License

Code of conduct

Stars

Watchers

Forks

Packages

No packages published

Languages

  • TypeScript 91.0%
  • JavaScript 8.3%
  • Shell 0.7%