Thanks Aravind.

If you refer to section 3.1.2 of the quoted reference, it clearly states that spilling does happen, although it is mentioned in the context of the reduce phase and not the map phase.

Earlier, the map phase was based on the hash shuffle writer, so spilling was not required: each map task wrote a separate file for each reduce task.
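To make the hash shuffle idea concrete, here is a minimal toy sketch in plain Python (not Spark code). The function name `hash_shuffle_write`, the integer keys, and the `key % num_reducers` partitioner are all assumptions made for illustration. It only shows why no sort buffer is needed: every record goes straight to the "file" of its reduce task.

```python
from collections import defaultdict


def hash_shuffle_write(records, num_reducers):
    """Toy model of one map task: one output 'file' (a list) per reduce task.

    Assumes integer keys and a simple modulo partitioner (illustrative only).
    """
    files = defaultdict(list)  # reduce partition id -> records in that file
    for key, value in records:
        partition_id = key % num_reducers
        # The record is appended directly to its reducer's file,
        # so nothing has to be held in a sort buffer.
        files[partition_id].append((key, value))
    return dict(files)


if __name__ == "__main__":
    out = hash_shuffle_write([(0, "a"), (1, "b"), (2, "c"), (3, "d")], num_reducers=2)
    for partition_id, contents in sorted(out.items()):
        print(partition_id, contents)
```

The cost of this design is the number of files: one per reduce task for every map task.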

However, with the sort shuffle writer, each map task produces only one consolidated shuffle data file, sorted by reduce partition. Spilling is therefore required whenever the sort shuffle buffer fills up before the map task has finished.
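The sort-based behaviour can be sketched the same way, again as a toy model in plain Python rather than Spark's actual implementation. The name `sort_shuffle_write`, the record-count `buffer_limit`, the integer keys, and the modulo partitioner are assumptions for illustration; real Spark decides when to spill from memory usage, not from a record count. The sketch buffers records, spills a sorted run each time the buffer fills, and finally merges all runs into one output ordered by reduce partition, with an offset index marking where each partition starts.

```python
import heapq


def sort_shuffle_write(records, num_reducers, buffer_limit):
    """Toy model of one map task producing a single partition-sorted output.

    Returns (merged_records, partition_offsets, spill_count).
    """
    def by_partition(rec):
        return rec[0]

    runs = []         # sorted runs; each stands in for a spill file
    spill_count = 0
    buffer = []
    for key, value in records:
        buffer.append((key % num_reducers, key, value))
        if len(buffer) >= buffer_limit:
            # Buffer is full mid-task: sort by partition id and spill.
            runs.append(sorted(buffer, key=by_partition))
            spill_count += 1
            buffer = []
    if buffer:
        # Whatever is left in memory is merged without counting as a spill.
        runs.append(sorted(buffer, key=by_partition))

    # Merge all runs into one consolidated sequence ordered by partition id.
    merged = list(heapq.merge(*runs, key=by_partition))

    # Offsets: partition p occupies merged[offsets[p]:offsets[p + 1]].
    counts = [0] * num_reducers
    for partition_id, _, _ in merged:
        counts[partition_id] += 1
    offsets = [0]
    for count in counts:
        offsets.append(offsets[-1] + count)
    return merged, offsets, spill_count


if __name__ == "__main__":
    data = [(k, str(k)) for k in range(6)]
    merged, offsets, spills = sort_shuffle_write(data, num_reducers=2, buffer_limit=4)
    print(merged, offsets, spills)
```

With a large enough `buffer_limit` nothing spills and the single in-memory run is written out directly, which matches the point above: spilling only matters when the buffer fills before the map task is done.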

Hope this helps.

Big Data Architect, Apache Spark Specialist, https://www.linkedin.com/in/ajaywlan/
