# How to sort a list of words? The bucket sorting method

In the last post I said that I had just started my MSc in health informatics. The first semester has five modules, one of which is about object-oriented programming.

The object-oriented programming module starts with an introduction to programming, mathematics in programming, flow charts and pseudocode.

When I was going through past papers, one question caught my eye. At a glance it looks like an easy question, but the more you look at it, the more difficult it gets.

## Write a pseudo code to input five names and arrange them in alphabetical order

Looks simple enough, right? But what about names that start with the same letter? The differentiating letter can be the second, third or nth letter.

Even though the question asks for five names, a good algorithm should be able to tackle hundreds or thousands of names.

## Approaches

Some of my colleagues suggested a brute-force method:

• First check the first letter of a word (n)
• Then check the first letter of the next word (n+1)
• If the first letter of the (n+1)th word is less than that of the nth word, swap them
• Now check n against n+2
• Continue until the first letter of a word is greater than the first letter of the nth word
• Now repeat the same process for the words n+1, n+2 and so on

Even though this approach can be fine for a small number of words, the number of comparisons grows much faster than the number of words. In the worst case every word is compared with every other word, so doubling the list roughly quadruples the work.
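To make that growth concrete, here is a minimal sketch of a brute-force sort that also counts its comparisons. It is only one reading of the steps above: the function name `bruteForceSort` is hypothetical, and it compares whole words with `<` instead of only the first letter, which is an assumption made so that the result is fully sorted.

```javascript
// Hypothetical helper: sorts a copy of `words` by comparing every pair,
// and counts how many comparisons were needed.
function bruteForceSort(words) {
  const sorted = [...words]
  let comparisons = 0
  for (let n = 0; n < sorted.length - 1; n++) {
    for (let m = n + 1; m < sorted.length; m++) {
      comparisons++
      // Assumption: compare whole words, not just the first letter
      if (sorted[m] < sorted[n]) {
        const temp = sorted[n]
        sorted[n] = sorted[m]
        sorted[m] = temp
      }
    }
  }
  return { sorted, comparisons }
}

console.log(bruteForceSort(["sahan", "nethmi", "janith", "ruky", "rukmal"]))
```

Written this way, a list of n words always needs n(n-1)/2 comparisons: 10 for five names, 45 for ten names and 499,500 for a thousand.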

## Trying the bucket sort approach

I had a hunch that there had to be a better way. Plus, I had to make a working proof of concept for a different method.

After trying different approaches, I came across the Wikipedia article about bucket sort: https://en.wikipedia.org/wiki/Bucket_sort

This was my approach:

• Generate 26 empty buckets, one for each letter of the alphabet
• Loop through each word and put it in the bucket that matches its first letter
• Now loop through the 26 buckets; if a bucket contains only one word, add that word to the output
• If a bucket contains more than one word, repeat the same process on that bucket, this time using the second letter of each word, and so on
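A single bucketing pass from the steps above can be sketched as follows. The helper name `bucketByLetter` and the sample names are only for illustration, and the sketch assumes every word is lowercase a-z and long enough to have a letter at the given position.

```javascript
// Hypothetical helper: one bucketing pass on the letter at `position`.
// Assumes every word is lowercase a-z and longer than `position`.
function bucketByLetter(words, position) {
  const buckets = Array.from({ length: 26 }, () => [])
  for (const word of words) {
    // 'a' has char code 97, so 'a' lands in bucket 0 and 'z' in bucket 25
    buckets[word.charCodeAt(position) - 97].push(word)
  }
  return buckets
}

const firstPass = bucketByLetter(["ruky", "nethmi", "janith", "rukmal"], 0)
console.log(firstPass[9])  // bucket for "j"
console.log(firstPass[13]) // bucket for "n"
console.log(firstPass[17]) // bucket for "r": holds two words, so it needs another pass
```

After this first pass the "j" and "n" buckets are already settled, while the "r" bucket still holds "ruky" and "rukmal" and has to be split again on a later letter.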

## The pseudocode

```
BEGIN
Get names as array
Output sort(names, 0)

Begin function sort(names, indent)
    Set buckets = 26 empty arrays
    Set output = empty array

    For every name in names
        Set char = (char code of the letter at position indent) - 97
        Append name to buckets[char]
    End for

    For every bucket in buckets
        If bucket has only one value
            Append that value to output
        Else if bucket has more than one value
            Append every value of sort(bucket, indent + 1) to output
        End if
    End for
    Return output
End function
END
```

## Working code

I’ve made a simple example of this algorithm using JavaScript. The module was taught in Java, but since Java arrays are fixed in length, I found it difficult to implement this code in Java.

```javascript
const names = ["ruky", "nethmi", "janith", "rukmal", "sahan", "rukshan"]

console.log(createBuckets(names, 0))

// Sorts lowercase a-z words by bucketing them on the letter at position `indent`
function createBuckets(names, indent) {
  const buckets = []
  const output = []

  // One empty bucket per letter of the alphabet (indexes 0 to 25)
  for (let n = 0; n < 26; n++) {
    buckets[n] = []
  }

  // 'a' has char code 97, so 'a' maps to bucket 0 and 'z' to bucket 25
  for (let i = 0; i < names.length; i++) {
    const char = names[i].charCodeAt(indent) - 97
    buckets[char].push(names[i])
  }

  for (let i = 0; i < buckets.length; i++) {
    if (buckets[i].length > 1) {
      // Several words share this letter: sort them by the next letter
      const tempOutput = createBuckets(buckets[i], indent + 1)
      for (let o = 0; o < tempOutput.length; o++) {
        output.push(tempOutput[o])
      }
    } else if (buckets[i].length === 1) {
      output.push(buckets[i][0])
    }
  }
  return output
}
```

You can see the code in the following bin: https://jsbin.com/jilotavocu/edit?js,console

## Limitations

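This implementation has a notable limitation: it throws an error when the list contains duplicates, because the recursion can never separate two identical names and eventually runs past the end of the word.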

I haven’t added a failsafe to check for duplicates, but it’s fairly simple to implement. I think the pseudocode is more than enough to get the majority of the marks, and it’s also better than brute forcing, especially when arranging a large number of words.
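One possible failsafe, as a minimal sketch rather than a finished solution: set aside any name that has no letter left at the current position before bucketing the rest. The function name `sortNames` is hypothetical, and the sketch still assumes names made up only of lowercase a-z.

```javascript
// Hypothetical variant of the bucket approach that tolerates duplicates
// and names that are a prefix of another name (e.g. "ruk" and "ruky").
// Assumes names contain only lowercase a-z.
function sortNames(names, position = 0) {
  if (names.length <= 1) return names

  // Names with no letter at `position` are fully consumed. They all share
  // the same prefix, so they are identical and sort before longer names.
  const finished = []
  const buckets = Array.from({ length: 26 }, () => [])

  for (const name of names) {
    if (position >= name.length) {
      finished.push(name)
    } else {
      buckets[name.charCodeAt(position) - 97].push(name)
    }
  }

  const output = [...finished]
  for (const bucket of buckets) {
    output.push(...sortNames(bucket, position + 1))
  }
  return output
}

console.log(sortNames(["ruky", "sahan", "ruk", "ruky", "janith"]))
// → ["janith", "ruk", "ruky", "ruky", "sahan"]
```

Because finished names are collected instead of bucketed, the recursion always stops once it reaches the end of the longest name, so duplicates are kept in the output rather than causing an error.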

Do you have a better implementation? I’d love to know.