Many Wikipedia pages exist under two or more names. For example, there are pages about Complex Network and Complex Networks. The latter redirects to the former, but NetworkX does not know about the redirection.
Accurately merging all duplicate nodes requires natural language processing (NLP) tools that are outside the scope of this book. For this network, it may suffice to join only those nodes whose names differ by the presence or absence of the letter s at the end or of a hyphen in the middle.
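The heuristic just described can be tried on plain page titles before touching the graph. The following is a minimal sketch, not a definitive implementation: the function name `find_duplicate_pairs` and the sample titles are hypothetical, and treating "hyphen absent" as "hyphen replaced by a space" is an assumption.

```python
def find_duplicate_pairs(names):
    """Return (kept, duplicate) title pairs that differ only by a
    trailing "s" or by a hyphen versus a space (the latter is an assumption)."""
    name_set = set(names)
    pairs = []
    for name in sorted(name_set):
        # Plural variant: "Complex Network" vs. "Complex Networks".
        plural = name + "s"
        if plural in name_set:
            pairs.append((name, plural))
        # Hyphenated variant: "Small-world network" vs. "Small world network".
        if "-" in name:
            unhyphenated = name.replace("-", " ")
            if unhyphenated in name_set:
                pairs.append((unhyphenated, name))
    return pairs


# Hypothetical page titles, used only for illustration.
titles = [
    "Complex Network",
    "Complex Networks",
    "Small-world network",
    "Small world network",
    "Graph theory",
]
pairs = find_duplicate_pairs(titles)
```

Each pair names the title to keep first and its likely duplicate second. A rule this simple can also pair unrelated pages that happen to differ by a final s, so the resulting list is worth a manual look before any nodes are merged.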
Start by removing self-loops (pages that refer to themselves). Self-loops don’t change the network properties, but they do affect the correctness of duplicate node elimination.
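One way to drop the self-loops with NetworkX is sketched below. The small directed graph `G` and its page titles are made up for illustration; the actual network is assumed to have been built earlier.

```python
import networkx as nx

# A tiny stand-in graph (hypothetical data): one page links to itself.
G = nx.DiGraph()
G.add_edges_from([
    ("Complex Network", "Complex Network"),  # self-loop
    ("Complex Network", "Graph theory"),
])

# Materialize the self-loop edges into a list first, so the graph
# is not modified while NetworkX is still iterating over its edges.
G.remove_edges_from(list(nx.selfloop_edges(G)))
```

Only edges are removed here, so every page stays in the graph as a node even if its only link was to itself.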
Now, you need a list of at least some duplicate nodes. You can build it by looking at each ...