This project contains a Java implementation of the Secondary Sort algorithm described in the book Data Algorithms by M. Parsian. The algorithm is implemented in both MapReduce and Spark.
The input to both programs is a text file in which each line has the format year,month,day,temperature:
1948,01,01,16.4
1948,03,04,45.9
1949,02,12,46.7
1950,04,11,78.6
...
...
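The idea behind secondary sort on this input can be sketched in plain Java, without Hadoop or Spark. This is only an illustration under stated assumptions, not the code in this project: the class `SecondarySortSketch`, the nested `TemperatureKey` type, and the method names are hypothetical, and the last two input records in `main` are made up for the example. Each line becomes a composite key (year-month, temperature), the records are sorted on that composite key, and they are then grouped by year-month alone, so every group comes out already ordered by temperature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SecondarySortSketch {

    // Hypothetical composite key: the natural key (year-month) plus the value to sort on.
    public static final class TemperatureKey {
        final String yearMonth;
        final double temperature;

        TemperatureKey(String yearMonth, double temperature) {
            this.yearMonth = yearMonth;
            this.temperature = temperature;
        }
    }

    // Parses one input line such as "1948,01,01,16.4"; the day field is not needed.
    public static TemperatureKey parse(String line) {
        String[] fields = line.trim().split(",");
        return new TemperatureKey(fields[0] + "-" + fields[1], Double.parseDouble(fields[3]));
    }

    // Sorts on the composite key, then groups on the natural key only.
    public static Map<String, List<Double>> secondarySort(List<String> lines) {
        List<TemperatureKey> keys = new ArrayList<>();
        for (String line : lines) {
            keys.add(parse(line));
        }
        // Order by year-month first, then by temperature within the same year-month.
        keys.sort(Comparator.comparing((TemperatureKey k) -> k.yearMonth)
                .thenComparingDouble(k -> k.temperature));
        // LinkedHashMap keeps the sorted order of the year-month groups.
        Map<String, List<Double>> grouped = new LinkedHashMap<>();
        for (TemperatureKey k : keys) {
            grouped.computeIfAbsent(k.yearMonth, ym -> new ArrayList<>()).add(k.temperature);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // The last two records are invented for illustration.
        List<String> lines = Arrays.asList(
                "1948,01,01,16.4", "1948,03,04,45.9", "1948,01,02,-0.9", "1948,01,03,8.4");
        secondarySort(lines).forEach((ym, temps) -> System.out.println(ym + " " + temps));
    }
}
```

In a distributed setting the framework performs the sort on the composite key during the shuffle, so the values never have to be collected and sorted in memory per group; the sketch above only mirrors that ordering logic on a single machine.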
Expected output from the MapReduce program, one line per year-month with that month's temperatures in ascending order:
1948-01 -0.9,2.1,6.6,8.4,8.8,10.5,11.4,15.4,16.4,18.6,18.6,18.9,19.2,20.5,21.9,23.6,24.6,24.7,25.2,27.2,28.1,29.9,33.2,33.4,33.5,33.7,37.8,39.2,39.2,41.5,42.4
1948-02 -0.7,3.7,4.4,7.6,9.5,9.9,10.9,13.1,14.0,14.3,15.9,16.5,16.7,18.2,18.6,22.6,22.7,26.1,28.1,29.0,29.6,31.5,35.2,35.6,42.6,43.2,44.6,47.1,47.5
1948-03 -5.1,-2.5,5.0,8.6,10.9,15.4,17.8,18.9,19.4,20.4,21.1,22.3,24.9,28.3,29.9,31.9,32.1,32.2,35.3,36.8,40.2,41.5,44.8,45.1,45.6,45.7,45.7,47.1,50.6,52.6,55.1
1948-04 33.7,36.2,38.3,42.7,43.2,43.8,45.6,46.4,47.3,47.5,49.2,50.1,50.9,52.7,52.7,53.3,54.1,55.1,55.1,55.3,55.9,57.7,57.7,61.4,62.3,63.6,64.5,66.2,66.6,68.6
...
...
Expected output from the Spark program, with the same grouping and ordering:
(1948-01,[-0.9, 2.1, 6.6, 8.4, 8.8, 10.5, 11.4, 15.4, 16.4, 18.6, 18.6, 18.9, 19.2, 20.5, 21.9, 23.6, 24.6, 24.7, 25.2, 27.2, 28.1, 29.9, 33.2, 33.4, 33.5, 33.7, 37.8, 39.2, 39.2, 41.5, 42.4])
(1948-02,[-0.7, 3.7, 4.4, 7.6, 9.5, 9.9, 10.9, 13.1, 14.0, 14.3, 15.9, 16.5, 16.7, 18.2, 18.6, 22.6, 22.7, 26.1, 28.1, 29.0, 29.6, 31.5, 35.2, 35.6, 42.6, 43.2, 44.6, 47.1, 47.5])
...
...
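The two programs produce the same groups and differ only in how each group is printed. The following plain Java sketch reproduces both line shapes from one sorted group. The class `OutputFormatSketch` and its method names are hypothetical, and the single space after the key in the MapReduce form is an assumption read off the sample output (the real separator may be a tab).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class OutputFormatSketch {

    // MapReduce-style line: key, a separator, then comma-separated values without spaces.
    public static String toMapReduceLine(String yearMonth, List<Double> sortedTemps) {
        String joined = sortedTemps.stream()
                .map(t -> Double.toString(t))
                .collect(Collectors.joining(","));
        return yearMonth + " " + joined;
    }

    // Spark-style line: a (key,list) pair, using the list's default string form.
    public static String toSparkLine(String yearMonth, List<Double> sortedTemps) {
        return "(" + yearMonth + "," + sortedTemps + ")";
    }

    public static void main(String[] args) {
        // First three values of the 1948-01 group from the sample output.
        List<Double> temps = Arrays.asList(-0.9, 2.1, 6.6);
        System.out.println(toMapReduceLine("1948-01", temps)); // 1948-01 -0.9,2.1,6.6
        System.out.println(toSparkLine("1948-01", temps));     // (1948-01,[-0.9, 2.1, 6.6])
    }
}
```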