Complex Network Analysis in Python by Dmitry Zinoviev

Eliminate Duplicates

Many Wikipedia pages exist under two or more names. For example, there are pages about Complex Network and Complex Networks. The latter redirects to the former, but NetworkX does not know about the redirection.

Accurately merging all duplicate nodes requires natural language processing (NLP) tools that are outside the scope of this book. It may suffice to join only those nodes whose names differ by the presence or absence of the letter s at the end or of a hyphen in the middle.
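The heuristic above can be sketched in plain Python. This is a minimal illustration, not necessarily the book's own code: the function name find_duplicate_candidates and the sample page titles are hypothetical, and the sketch assumes that a "hyphen" variant means the same name with every hyphen either replaced by a space or dropped.

```python
def find_duplicate_candidates(names):
    """Return (kept, duplicate) pairs of names that differ only by a
    trailing "s" or by hyphens (hypothetical helper, for illustration)."""
    name_set = set(names)
    pairs = []
    for name in sorted(name_set):
        # Plural form: "Complex Networks" vs. "Complex Network"
        if name.endswith("s") and name[:-1] in name_set:
            pairs.append((name[:-1], name))
        # Hyphenated form: "Small-world network" vs. "Small world network"
        if "-" in name:
            for variant in (name.replace("-", " "), name.replace("-", "")):
                if variant != name and variant in name_set:
                    pairs.append((variant, name))
    return pairs


# Made-up page titles, used only to exercise the function
titles = [
    "Complex Network",
    "Complex Networks",
    "Small-world network",
    "Small world network",
    "Graph",
]
candidates = find_duplicate_candidates(titles)
```

Each pair names the node to keep first and the likely duplicate second. In NetworkX, such a pair could then be merged with a function like nx.contracted_nodes, though whether a given pair really is a duplicate is still a judgment call.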

Start by removing self-loops (pages referring to themselves). The loops don’t change the network properties but affect the correctness of duplicate node elimination.
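One way to drop self-loops in NetworkX is sketched below. It assumes NetworkX 2.x or later, where selfloop_edges is a module-level function; the helper name remove_self_loops and the tiny sample graph are hypothetical and serve only as a demonstration.

```python
import networkx as nx


def remove_self_loops(G):
    """Remove, in place, every edge that starts and ends at the same node.

    Returns the number of removed edges (hypothetical helper).
    """
    # Materialize the list first so the graph is not modified while iterating
    loops = list(nx.selfloop_edges(G))
    G.remove_edges_from(loops)
    return len(loops)


# A made-up directed graph of page links with one self-loop
G = nx.DiGraph()
G.add_edges_from([
    ("Complex Network", "Graph Theory"),
    ("Graph Theory", "Graph Theory"),  # a page referring to itself
    ("Complex Networks", "Complex Network"),
])
removed = remove_self_loops(G)
```

After the call, the graph keeps its two ordinary edges and no node points to itself.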

Now, you need a list of at least some duplicate nodes. You can build it by looking at each ...
