K most frequent strings

You are given an array of non-distinct strings. Find the k most frequently occurring strings.

Solution

  • Use a hash table to count the frequencies of each of the strings
  • Now, scan through the hash table, process it through a min heap of size k
  • After processing the hash table, k most frequent elements remain in the heap
  • The complexity of this approach O(n) to count, O(n log k) to find k largest frequencies
  • This can be improved by applying the pivot and partition technique reducing the time complexity to O(n + m)

Code

def k_most_frequent(arr, k)
    return nil if arr.nil?
    return nil if k.nil || k <= 0

    frequency_hash = count_frequencies(arr)

    find_k_largest(frequency_hash, k)
end

def count_frequencies(arr)
    hash = {}
    arr.each do |str|
        if hash[str]
            hash[str] += 1
        else
            hash[str] = 1
        end
    end

    hash
end

def find_k_largest(hash, k)
    min_heap = MinHeap.new(comparator_function)
    hash.each do |str, count|
        if min_heap.size < k
            min_heap.insert({str: count})
        else
            min_heap.extract_min
            min_heap.insert(str: count)
        end
    end

    min_heap.to_array
end

def comparator_function(element1, element2)
    element1[str] < element2[str]
end

results matching ""

    No results matching ""