Finding Similarities in Data
At the time of writing there were 73,445 mails to the Erlang mailing list.[42] This represents a vast store of knowledge that can be reused in many ways. We can use it to answer questions about Erlang or for inspiration.
Suppose you’re writing a program and need help. This is where Sherlock comes in. Sherlock analyzes your program and then suggests the most similar posting from all previous mails in the Erlang list that might be able to help you.
So, here’s the master plan. Step 1 is to download the entire Erlang mailing list and store it locally. Step 2 is to organize and parse all the mails. Step 3 is to compute some property of the mails that can be used for similarity searches. Step 4 is to query the mails ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access