The resulting graph G has 2,975 nodes and 3,162 edges. It is very sparse: a simple undirected graph with 2,975 nodes could have more than 4.4 million edges.
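As a rough check of that sparsity, the density and mean degree can be worked out from the two counts alone. This is a minimal sketch that assumes G is a simple undirected graph; the variable names are illustrative and not taken from the original code.

```python
# Node and edge counts reported in the text.
n_nodes, n_edges = 2975, 3162

# Maximum number of edges in a simple undirected graph with n_nodes nodes.
max_edges = n_nodes * (n_nodes - 1) // 2

# Density is the fraction of possible edges that are present
# (the same quantity nx.density(G) reports for an undirected graph).
density = n_edges / max_edges

# Each edge contributes to the degree of two nodes.
mean_degree = 2 * n_edges / n_nodes

print(f"max edges: {max_edges}")
print(f"density: {density:.6f}")
print(f"mean degree: {mean_degree:.2f}")
```

A mean degree barely above 2 is consistent with a network that breaks into many small pieces.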
It also has many small connected components with two to four nodes, as shown in the figure. (You can call nx.connected_components(G) to count the components and measure their sizes on your own.)
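One way to tally the components by size might look like the sketch below. The small graph built here is a hypothetical stand-in for G so that the snippet runs on its own; with the real G, only the last four statements are needed.

```python
from collections import Counter

import networkx as nx

# Hypothetical toy graph standing in for G (not the data from the text).
G = nx.Graph()
G.add_edges_from([
    (1, 2), (2, 3), (3, 4), (4, 5),  # one 5-node component
    (6, 7),                          # one 2-node component
    (8, 9), (9, 10),                 # one 3-node component
])

# nx.connected_components yields one set of node labels per component.
sizes = [len(component) for component in nx.connected_components(G)]

# Map each component size to the number of components of that size.
size_counts = Counter(sizes)

for size, count in sorted(size_counts.items()):
    print(f"{count} component(s) with {size} nodes")
```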
To keep the case simple, we consider only the largest component (the giant connected component, or GCC). We sort all components of G by size, select the last one (the largest!), join the respective label lists into one with chain.from_iterable, and extract the subgraph induced by these nodes. We store the resulting subgraph in the variable called ...
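The steps just described could be sketched as follows. This is an illustration under stated assumptions rather than the original listing: the toy graph is a hypothetical stand-in for G, and the name gcc is a placeholder chosen here because the variable name is elided in the text.

```python
from itertools import chain

import networkx as nx

# Hypothetical toy graph standing in for G (not the data from the text).
G = nx.Graph()
G.add_edges_from([
    (1, 2), (2, 3), (3, 4), (4, 5),  # largest component, 5 nodes
    (6, 7),
    (8, 9), (9, 10),
])

# Sort the components (sets of node labels) from smallest to largest.
components = sorted(nx.connected_components(G), key=len)

# Slice off the last one; the result is a list holding a single set.
largest = components[-1:]

# Flatten the selected sets into one list of node labels.
gcc_nodes = list(chain.from_iterable(largest))

# Extract the induced subgraph; copy() makes it independent of G.
# "gcc" is a placeholder name, not necessarily the one used in the text.
gcc = G.subgraph(gcc_nodes).copy()

print(gcc.number_of_nodes(), gcc.number_of_edges())
```

Slicing with [-1:] keeps the selection as a list, so the same code would also join several components if a wider slice such as [-2:] were used. When only the single largest component is wanted, max(nx.connected_components(G), key=len) returns its node set directly and makes the flattening step unnecessary.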