Sorting is one of the most frequent operations in computer science, and many sorting algorithms have been proposed to date for CPUs and GPUs. GPUs generally have limited memory, so large data sets cannot be accommodated in GPU global memory, which gives rise to the need for external sorting techniques. GPU-based external sorting algorithms are memory intensive, and their performance also depends on load balancing. This paper proposes an improved GPU sorting algorithm with two components: a Splitter, which divides a large data set into small chunks, and a Sorter, which is responsible for sorting the chunk of data available in global memory. The algorithm is implemented on the CUDA platform with an optimized memory contention technique. It showed a significant improvement in performance when sorting large data sets. © Research India Publications.
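
The Splitter and Sorter roles described above can be illustrated with a minimal host-side sketch. It is not the paper's CUDA implementation: the names `split_into_chunks`, `sort_chunk`, and `external_sort` are hypothetical, chunks are cut by position, and the final k-way merge is an assumption, since the abstract does not state how sorted chunks are recombined. On a GPU, `chunk_size` would be bounded by the available global memory and `sort_chunk` would be a device-side sort.

```python
import heapq
from typing import Iterator, List, Sequence


def split_into_chunks(data: Sequence[int], chunk_size: int) -> Iterator[List[int]]:
    """Splitter stand-in: yield consecutive chunks of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        yield list(data[start:start + chunk_size])


def sort_chunk(chunk: List[int]) -> List[int]:
    """Sorter stand-in: sorts one chunk that fits in (simulated) global memory."""
    return sorted(chunk)


def external_sort(data: Sequence[int], chunk_size: int) -> List[int]:
    """Sort each chunk independently, then merge the sorted runs.

    The merge step is an assumption added for completeness; it is not
    described in the abstract.
    """
    runs = [sort_chunk(chunk) for chunk in split_into_chunks(data, chunk_size)]
    return list(heapq.merge(*runs))
```

In this sketch only one chunk needs to be resident in the Sorter at a time, which is the property that lets the input exceed the memory available to the sorting device.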