Is there a maximum number of nodes the software can deal with? We have some big datasets (400,000 nodes) that we are thinking about analysing.
Great question! The maximum number of nodes that Polinode will currently let you upload is 50,000, and the maximum number of edges is 250,000. That said, performance at that level will be determined by a few factors, including (1) your hardware (in particular your CPU) and (2) the number and nature of attributes.
With attributes, it’s best to limit the number used for larger networks. There is also a bit of a “trick” that you may want to use if you have attributes with a large number of distinct text values. If you prepend a “#” to the attribute name when uploading it, then it won’t be parsed when the network is opened. This can save a lot of time in large networks. The attribute will still be available in the network but you won’t be able to color by it, filter by it, etc. (which you typically don’t need to do with an attribute that has hundreds or thousands of values).
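As a rough illustration of the “#” trick, here is a minimal Python sketch that renames wide text columns in a node file before upload. It assumes the nodes are in a CSV with a header row. The column names (`Name`, `Department`, `Notes`, `Age`), the `max_distinct` cut-off and the function name are all hypothetical and only there for the example, so adjust them to match your own file.

```python
import csv
import io


def _is_number(value):
    """Return True if the string parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def prefix_wide_text_columns(csv_text, max_distinct=100, keep=("Name",)):
    """Prefix '#' to headers of text columns with more than max_distinct values.

    Columns listed in `keep` (assumed here to hold the node identifier) and
    purely numeric columns are left untouched.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    new_header = []
    for i, name in enumerate(header):
        # Distinct non-empty values found in this column.
        values = {row[i] for row in body if i < len(row) and row[i] != ""}
        is_text = any(not _is_number(v) for v in values)
        if (name not in keep and not name.startswith("#")
                and is_text and len(values) > max_distinct):
            new_header.append("#" + name)
        else:
            new_header.append(name)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(new_header)
    writer.writerows(body)
    return out.getvalue()


# Tiny made-up example: only the free-text "Notes" column exceeds the cut-off.
sample = (
    "Name,Department,Notes,Age\n"
    "A,Sales,n1,30\n"
    "B,Sales,n2,31\n"
    "C,Ops,n3,32\n"
    "D,Ops,n4,33\n"
    "E,Ops,n5,34\n"
)
result = prefix_wide_text_columns(sample, max_distinct=3)
```

Only the header row changes, so the attribute values themselves are uploaded as they were.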
The other thing to bear in mind here is that it is usually possible and advisable to pre-process large networks. For example, you could pre-process the 400k nodes to reduce their number significantly. This is done a lot with email data. Typically you choose a communication threshold for edges (i.e. exclude edges where there have been fewer than X contacts over Y months). After that, you can optionally calculate the total number of connections for each node and exclude nodes with fewer than, say, Z total connections. It’s context dependent but that kind of pre-processing is typically very helpful. Finally, you can also examine sub-networks separately if the network is still very large.
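The two-step pre-processing described above could be sketched in Python roughly as follows. This is only an illustration under a few assumptions: edges are given as `(source, target, contacts)` tuples already counted over your chosen period, “total connections” is read as the number of remaining edges touching a node, and `prune_network`, `min_contacts` and `min_connections` are hypothetical names standing in for X and Z.

```python
from collections import Counter


def prune_network(edges, min_contacts, min_connections):
    """Apply an edge threshold, then a node threshold, in a single pass.

    edges: iterable of (source, target, contacts) tuples.
    Returns (kept_nodes, kept_edges).
    """
    # Step 1: exclude edges with fewer than min_contacts contacts.
    kept = [(s, t, w) for s, t, w in edges if w >= min_contacts]

    # Step 2: count the remaining connections for each node.
    degree = Counter()
    for s, t, _ in kept:
        degree[s] += 1
        degree[t] += 1

    # Step 3: exclude nodes with fewer than min_connections connections,
    # along with any edge that touches an excluded node.
    nodes = {n for n, d in degree.items() if d >= min_connections}
    kept = [(s, t, w) for s, t, w in kept if s in nodes and t in nodes]
    return nodes, kept


# Made-up example data: (sender, recipient, number of emails).
edges = [
    ("a", "b", 10),
    ("a", "c", 8),
    ("b", "c", 7),
    ("c", "d", 6),
    ("d", "e", 1),
]
nodes, kept_edges = prune_network(edges, min_contacts=5, min_connections=2)
```

Note that this does one pass only. Removing nodes can push some of their neighbours below the threshold, so you may want to repeat the node step if a strict minimum matters for your analysis.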