Fork me on GitHub

Group Anagrams

Description

https://leetcode.com/problems/group-anagrams/description/

Naive Solution —— implement your own hash

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
class Solution:
def groupAnagrams(self, strs):
"""
:type strs: List[str]
:rtype: List[List[str]]
"""
ret = []
word_to_list = {}

for word in strs:
##generate the key
has = 1
dic = {} ##The identity of strs
for s in word:
if dic.get(s) is None:
dic[s] = 0
dic[s] += 1
bias = ord(s) - ord('a') + 1
has *= bias
fake_matched = word_to_list.get(has)

if fake_matched == None:
word_to_list[has] = [{}]
word_to_list[has][0]["word_list"] = [word]
word_to_list[has][0]["id"] = dic

else:
##Found the proper item in the hash slot
found = False
for item in fake_matched:
if item["id"] == dic:
item["word_list"].append(word)
found = True
break
##Find nothing matched, means Hash conflict
if found == False:
fake_matched.append({
"id": dic,
"word_list": [word]
})

for li in word_to_list.values():
for item in li:
ret.append(item["word_list"])

return ret

But the performance is so pool.

Sorting

Sort the word by char, then use the sorted string as key in hash table

Trade Off

Given the length of word is n

The time consuming on each item is O(n x logn) on sorting method, the one on the other method is (5*n)

If the length of the word is short, the constant item is quite large compared to the length of the word, which is the case in this question. so the your-own-hash way performance poorly.