The following problem appeared as a project in the * coursera course Algorithmic Thinking (by RICE university)*, a part of Fundamentals

*of Computing*specialization. The following description of the problem is taken directly from the project description.

*Graph exploration* (that is, “visiting” the nodes and edges of a graph) is a powerful and necessary tool to elucidate properties of graphs and quantify statistics on them. For example, by exploring a graph, we can compute its *degree distribution*, pairwise distances among nodes, its *connected components*, and *centrality measures* of its nodes and edges. Both ** DFS **and

**can be used to compute the**

*BFS**connected components*of a graph.

Now, the task is to analyze the *connectivity of a computer network* as it undergoes a *cyber-attack*. In particular, we shall simulate an attack on this network in which an increasing number of servers are disabled. In computational terms, we will model the network by an undirected graph and repeatedly delete nodes from this graph. We will then measure the *resilience* of the graph in terms of the size of the *largest remaining connected component* as a function of the *number of nodes deleted*.

## Example graphs

Let’s compute the *resilience* of following 3 types of undirected graphs:

- An example computer network – an undirected graph (with 1347 nodes and 3112 edges). The next figure shows a small part of the undirected network graph.
*ER*random graphs.*UPA*graphs – the undirected version of the DPA graphs. Note that since the degree of the newly added node is no longer zero, its chances of being selected in subsequent trials increases.

## Part of a computer network graph

To begin analysis, let’s examine the *resilience* of the computer network under an *attack* in which servers are chosen at random. Then let’s compare the *resilience* of the network to the resilience of ** ER **and

**graphs of similar size.**

*UPA*To begin, let’s determine the probability such that the *ER* graph computed using this edge probability has approximately the same number of edges as the computer network. Likewise, let’s compute an integer such that the number of edges in the *UPA* graph is close to the number of edges in the computer network. Remember that all three graphs being analyzed should have the same number of nodes and approximately the same number of edges.

Then for each of the three graphs (computer network, ER, UPA), let’s compute a *random attack order* and use this attack order to compute the resilience of the graph.

The following figure shows the algorithms to be used to find *the largest component size in a graph*.

The following figures show how ** BFS **finds a

**component**starting from a

**source**node and visiting all the nodes

**reachable**from the source.

Once the **resilience** for all three graphs are computed, let’s plot the results as three curves combined in a single plot. The horizontal axis for the plot is going to be the the number of nodes removed (ranging from zero to the number of nodes in the graph) while the vertical axis should be the size of the **largest connected component** in the graphs resulting from the node removal.

Consider removing a significant fraction of the nodes in each graph in random order . We will say that a graph is **resilient** under this type of **attack** if the size of its **largest connected component** is roughly (within ~25%) equal to the number of nodes remaining, at all times during the attack.

As can be seen from the above figure, both the **ER** and **UPA** graphs are **resilient**, but the network graph is not under this attack.

20% of nodes ~ 270 nodes. If these first 270 nodes are removed from the total 1347 nodes,

For network graph the initial size of the largest component = 1239, final size after removal of these nodes = 889 which is ~28% reduction in size > 25%

For ER graph the initial size of the largest component = 1337, final size after removal of these nodes = 1047 which is ~ 21.69% reduction in size < 25%

For UPA graph the initial size of the largest component = 1347, final size after removal of these nodes = 1077 which is ~ 20% reduction in size < 25%

In the next three problems, let’s consider attack orders in which the nodes being removed are chosen based on the structure of the graph. A simple rule for these targeted attacks is to always remove a node of maximum (highest) degree from the graph. The function **targeted_order** takes an undirected graph ugraph and iteratively does the following:

- Computes a node of the maximum degree in ugraph . If multiple nodes have the maximum degree, it chooses any of them (arbitrarily).
- Removes that node (and its incident edges) from ugraph.

Observe that targeted_order continuously updates ugraph and always computes a node of maximum degree with respect to this updated graph. The output of targeted_order is a sequence of nodes that can be used as input to compute_resilience.

As we examine the algorithm **targeted_order**, it can be seen that is not as efficient as possible. In particular, much work is being repeated during the location of nodes with the maximum degree. In this question, we will consider an alternative method (which we will refer to as **fast_targeted_order**) for computing the same targeted attack order. Here is a pseudo-code description of the method:

To continue our analysis of the computer network, we will examine its **resilience** under an attack in which servers are chosen based on their **connectivity**. We will again compare the resilience of the network to the resilience of **ER** and **UPA** graphs of similar size.

Using targeted_order (or fast_targeted_order), the task is to compute a targeted attack order for each of the three graphs (computer network, **ER**, **UPA**) from previous question. Then, for each of these three graphs, let’s compute the resilience of the graph using compute_resilience . Finally, let’s plot the computed **resiliences** as three curves (line plots) in a single plot.

Now, let’s consider removing a significant fraction of the nodes in each graph using **targeted_order**. Let’s examine the shape of the three curves from the previous plot in Question 4. Which of the three graphs are resilient under targeted attacks as the first 20% of their nodes are removed?

**ER** graph is **resilient**, but the **UPA** and the **network** graphs are **not resilient** under this **targeted** **attack**.

20% of nodes ~ 270 nodes. If these first 270 nodes are removed from the total 1347 nodes,

For **network** graph the initial size of the largest component = 1239, final size after removal of these nodes = 3 which is ~99.76% reduction in size >> 25%

For **ER** graph the initial size of the largest component = 1333, final size after removal of these nodes = 990 which is ~ 25.73% reduction in size ~ 25%

For **UPA** graph the initial size of the largest component = 1347, final size after removal of these nodes = 19 which is ~ 98.59% reduction in size >> 25%

Pingback: Sandipan Dey: Analyzing the connectivity of a computer network undergoing a cyber-attack in Python | Adrian Tudor Web Designer and Programmer