Most visited page

Given a log file that contains the a number of entries, each line lists the page id which indicates it was visited.

Solution

  • Process each line and keep track of the count per page. Use a bst to store the actual count so that it preserves the sorted order and is easy to find the top k pages once all pages have been processed
  • Each node of the BST contains count, use a hash table to store the link between page and its count
  • Update the count each time a line is read,
  • Once all the pages have been read, to compute the k largest on the BST, first find the max (O(log n)) then use the predecessor function k-1 times to find the k most visited pages

results matching ""

    No results matching ""