Short Encoding of Words

Given a list of words, we may encode it by writing a reference string S and a list of indexes A.

For example, if the list of words is ["time", "me", "bell"], we can write it as S = "time#bell#" and indexes = [0, 2, 5].

Then, for each index, we recover the word by reading the reference string from that index until we reach a "#" character.
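
The decoding step described above can be sketched as follows. This is only an illustration, assuming a valid encoding in which every index is followed by a "#" somewhere later in S; the class name ReferenceStringDecoder and the method name decode are hypothetical and not part of the original problem.

```java
import java.util.ArrayList;
import java.util.List;

class ReferenceStringDecoder {
    // Recovers one word per index by reading from that index
    // up to (but not including) the next '#'.
    // Assumes every index has a '#' somewhere after it in s.
    static List<String> decode(String s, int[] indexes) {
        List<String> words = new ArrayList<>();
        for (int start : indexes) {
            int end = s.indexOf('#', start);
            words.add(s.substring(start, end));
        }
        return words;
    }
}
```

Calling decode("time#bell#", new int[]{0, 2, 5}) yields the words "time", "me", and "bell": "me" is read from inside "time#", which is why it needs no characters of its own.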

What is the length of the shortest reference string S possible that encodes the given words?

Example:

Input: words = ["time", "me", "bell"]
Output: 10
Explanation: S = "time#bell#" and indexes = [0, 2, 5].

Note:

  1. 1 <= words.length <= 2000.

  2. 1 <= words[i].length <= 7.

  3. Each word has only lowercase letters.

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class Solution {
    public int minimumLengthEncoding(String[] words) {
        // Start with every distinct word; the set also drops duplicates.
        Set<String> set = new HashSet<>(Arrays.asList(words));
        for (String word : words) {
            // Remove every proper suffix of this word (i starts at 1,
            // so the word itself is never removed here).
            for (int i = 1; i < word.length(); i++) {
                set.remove(word.substring(i));
            }
        }
        // The set now contains only words that are not a proper
        // suffix of any other word.
        int length = 0;
        for (String word : set) {
            // Each remaining word contributes its letters plus one '#'.
            length += word.length() + 1;
        }
        return length;
    }
}
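
As an alternative to the set-based solution above, the same answer can be obtained by sorting. This is a sketch of a different technique, not the method used in the original solution: reverse every word so that suffixes become prefixes, sort the reversed words, and skip any word that is a prefix of its successor in sorted order. The class name ReversedSortEncoding is hypothetical.

```java
import java.util.Arrays;

class ReversedSortEncoding {
    static int minimumLengthEncoding(String[] words) {
        // Reverse each word so "is a suffix of" becomes "is a prefix of".
        String[] reversed = new String[words.length];
        for (int i = 0; i < words.length; i++) {
            reversed[i] = new StringBuilder(words[i]).reverse().toString();
        }
        // After sorting, all strings sharing a prefix sit in one contiguous
        // run that begins with the prefix itself, so checking the immediate
        // successor is enough. Duplicates are skipped the same way.
        Arrays.sort(reversed);
        int length = 0;
        for (int i = 0; i < reversed.length; i++) {
            boolean coveredByNext = i + 1 < reversed.length
                    && reversed[i + 1].startsWith(reversed[i]);
            if (!coveredByNext) {
                // Word must appear in S: its letters plus one '#'.
                length += reversed[i].length() + 1;
            }
        }
        return length;
    }
}
```

For ["time", "me", "bell"], the reversed and sorted words are "em", "emit", "lleb"; "em" is a prefix of "emit" and is skipped, so the total is 5 + 5 = 10, matching the example.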
