Analyzing the connectivity of a computer network undergoing a cyber-attack in Python

The following problem appeared as a project in the coursera course Algorithmic Thinking (by RICE university), a part of Fundamentals of Computing specialization. The following description of the problem is taken directly from the project description.

Graph exploration (that is, “visiting” the nodes and edges of a graph) is a powerful and necessary tool to elucidate properties of graphs and quantify statistics on them. For example, by exploring a graph, we can compute its degree distribution, pairwise distances among nodes, its connected components, and centrality measures of its nodes and edges. Both DFS and BFS can be used to compute the connected components of a graph.

Now, the task is to analyze the connectivity of a computer network as it undergoes a cyber-attack. In particular, we shall simulate an attack on this network in which an increasing number of servers are disabled. In computational terms, we will model the network by an undirected graph and repeatedly delete nodes from this graph. We will then measure the resilience of the graph in terms of the size of the largest remaining connected component as a function of the number of nodes deleted.

Example graphs

Let’s compute the resilience of following 3 types of undirected graphs:

  1. An example computer network – an undirected graph (with 1347 nodes and 3112 edges). The next figure shows a small part of the undirected network graph.
  2. ER random graphs.
  3. UPA graphs – the undirected version of the DPA graphs. Note that since the degree of the newly added node is no longer zero, its chances of being selected in subsequent trials increases.

Part of a computer network graph

network_graph.png

To begin analysis, let’s examine the resilience of the computer network under an attack in which servers are chosen at random. Then let’s compare the resilience of the network to the resilience of ER and UPA graphs of similar size.

To begin, let’s determine the probability such that the ER graph computed using this edge probability has approximately the same number of edges as the computer network. Likewise, let’s compute an integer such that the number of edges in the UPA graph is close to the number of edges in the computer network. Remember that all three graphs being analyzed should have the same number of nodes and approximately the same number of edges.

Then for each of the three graphs (computer network, ER, UPA), let’s compute a random attack order and use this attack order to compute the resilience of the graph.

The following figure shows the algorithms to be used to find the largest component size in a graph.

im29.png

The following figures show how BFS finds a component starting from a source node and visiting all the nodes reachable from the source.

network_graph_0_20network_graph_0_48network_graph_0_64network_graph_0_80network_graph_0_100network_graph_0_150network_graph_0_200

Once the resilience for all three graphs are computed, let’s plot the results as three curves combined in a single plot. The horizontal axis for the plot is going to be the the number of nodes removed (ranging from zero to the number of nodes in the graph) while the vertical axis should be the size of the largest connected component in the graphs resulting from the node removal.

im_25_26_1.png

Consider removing a significant fraction of the nodes in each graph in random order . We will say that a graph is resilient under this type of attack if the size of its largest connected component is roughly (within ~25%) equal to the number of nodes remaining, at all times during the attack.

As can be seen from the above figure, both the ER and UPA graphs are resilient, but the network graph is not under this attack.

20% of nodes ~ 270 nodes. If these first 270 nodes are removed from the total 1347 nodes,

For network graph the initial size of the largest component = 1239, final size after removal of these nodes = 889 which is ~28% reduction in size > 25%

For ER graph the initial size of the largest component = 1337, final size after removal of these nodes = 1047 which is ~ 21.69% reduction in size < 25%

For UPA graph the initial size of the largest component = 1347, final size after removal of these nodes = 1077 which is ~ 20% reduction in size < 25%

In the next three problems, let’s consider attack orders in which the nodes being removed are chosen based on the structure of the graph. A simple rule for these targeted attacks is to always remove a node of maximum (highest) degree from the graph. The function targeted_order takes an undirected graph ugraph and iteratively does the following:

  1. Computes a node of the maximum degree in ugraph . If multiple nodes have the maximum degree, it chooses any of them (arbitrarily).
  2. Removes that node (and its incident edges) from ugraph.

Observe that targeted_order continuously updates ugraph and always computes a node of maximum degree with respect to this updated graph. The output of targeted_order is a sequence of nodes that can be used as input to compute_resilience.

As we examine the algorithm targeted_order, it can be seen that is not as efficient as possible. In particular, much work is being repeated during the location of nodes with the maximum degree. In this question, we will consider an alternative method (which we will refer to as fast_targeted_order) for computing the same targeted attack order. Here is a pseudo-code description of the method:

im_25_26_3.png

im30.png

To continue our analysis of the computer network, we will examine its resilience under an attack in which servers are chosen based on their connectivity. We will again compare the resilience of the network to the resilience of ER and UPA graphs of similar size.

Using targeted_order (or fast_targeted_order), the task is to compute a targeted attack order for each of the three graphs (computer network, ER, UPA) from previous question. Then, for each of these three graphs, let’s compute the resilience of the graph using compute_resilience . Finally, let’s plot the computed resiliences as three curves (line plots) in a single plot.

im26.png

Now, let’s consider removing a significant fraction of the nodes in each graph using targeted_order. Let’s examine the shape of the three curves from the previous plot in Question 4. Which of the three graphs are resilient under targeted attacks as the first 20% of their nodes are removed?

ER graph is resilient, but the UPA and the network graphs are not resilient under this targeted attack.

20% of nodes ~ 270 nodes. If these first 270 nodes are removed from the total 1347 nodes,

For network graph the initial size of the largest component = 1239, final size after removal of these nodes = 3 which is ~99.76% reduction in size >> 25%

For ER graph the initial size of the largest component = 1333, final size after removal of these nodes = 990 which is ~ 25.73% reduction in size ~ 25%

For UPA graph the initial size of the largest component = 1347, final size after removal of these nodes = 19 which is ~ 98.59% reduction in size >> 25%

Advertisements

One thought on “Analyzing the connectivity of a computer network undergoing a cyber-attack in Python

  1. Pingback: Sandipan Dey: Analyzing the connectivity of a computer network undergoing a cyber-attack in Python | Adrian Tudor Web Designer and Programmer

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s