Take a look at this: http://aadrake.com/command-line-tools-can-be-235x-faster-tha...
IMHO people abuse Spark, and I would be truly impressed if anybody could write a Spark program that is faster than just a regular Scala program for processing this.