October 2019 - Shafaet's Blog

Suppose you are given a linked list and have to determine whether it contains a cycle. This type of question is quite common in interviews. Today we will solve this problem using Floyd’s cycle-finding algorithm.

A linked list with cycle
In the picture above, a cycle with length $7$ is present in the linked list.

Detect Cycle In A Linked List

The easiest way to detect a cycle is to use a dictionary or hashmap. We traverse the list from the very first node and save each node in the dictionary. If we reach a node that is already present in the dictionary, then the linked list is cyclic. Both the time complexity and the memory complexity of this algorithm are $O(n)$.
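The dictionary idea can be sketched in C++ roughly as follows. This is a minimal sketch, assuming a `ListNode` struct like the one used later in this post; the function name `hasCycleWithSet` is made up for illustration, and `std::unordered_set` plays the role of the dictionary.

```cpp
#include <unordered_set>

struct ListNode {
    int val;
    ListNode* next;
    explicit ListNode(int x) : val(x), next(nullptr) {}
};

// Returns true if following the next pointers from head ever revisits a node.
// Stores every visited node, so it needs O(n) extra memory.
bool hasCycleWithSet(ListNode* head) {
    std::unordered_set<const ListNode*> seen;
    for (const ListNode* cur = head; cur != nullptr; cur = cur->next) {
        // insert().second is false when the pointer was already in the set
        if (!seen.insert(cur).second) return true;
    }
    return false;  // reached the end of the list, so there is no cycle
}
```

The set stores node addresses rather than values, so two different nodes holding the same value are not mistaken for a cycle.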

But we can reduce the memory complexity to $O(1)$ by using Floyd’s algorithm instead. It is also known as the ‘Hare and turtle’ algorithm.

Hare And Turtle Algorithm

Let’s assume we have two pointers; one is the turtle pointer, which we will denote by $’T’$, and the other pointer is the hare pointer, which we will denote by $’H’$. In the beginning, both of the pointers will be at the root node of the linked list.

H and T pointer in a linked list cycle

Every second, the $H$ pointer moves two steps while the $T$ pointer moves only one step. After one second, the positions of the two pointers will look like this:

H and T pointer after 1 second

After one more second, $T$ will be at node $12$ and $H$ at node $90$. If we keep simulating these steps, we will find that the two pointers meet at node $28$.

H and T pointers meeting point

If the two pointers meet at the same node, there is a cycle in the linked list. If there were no cycle, the $H$ pointer would simply reach the end of the list.
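The detection step on its own can be sketched like this. It is only an illustration, assuming the same `ListNode` shape as the full solution further down; the name `hasCycle` is hypothetical.

```cpp
struct ListNode {
    int val;
    ListNode* next;
    explicit ListNode(int x) : val(x), next(nullptr) {}
};

// Detection only: returns true if the hare catches up with the turtle.
bool hasCycle(ListNode* head) {
    ListNode* turtle = head;
    ListNode* hare = head;
    while (hare != nullptr && hare->next != nullptr) {
        turtle = turtle->next;     // one step
        hare = hare->next->next;   // two steps
        if (turtle == hare) return true;
    }
    return false;  // the hare reached the end of the list
}
```

Only two pointers are kept no matter how long the list is, which is where the $O(1)$ memory comes from.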

Now the question is, how can we detect the first node of the cycle?

First node of the linked list

How To Detect The First Node of The Cycle

Let’s assume,

$m$ = distance between the root node and the first node of the cycle,

$k$ = distance between the first node of the cycle and the meeting point of the two pointers,

$l$ = length of the cycle.

Now, if $T$ meets $H$ after going around the cycle $C_{T}$ times, then the distance traveled by $T$ is:

$D_{T} = m + (C_{T} * l) + k$

Similarly, if $H$ has gone around the cycle $C_{H}$ times when they meet, then the distance traveled by $H$ is:

$D_{H} = m + (C_{H} * l) + k$

Since $H$ moves twice as fast as $T$, we have $D_{H} = 2 * D_{T}$, so:

$2 * (m + (C_{T} * l) + k) = m + (C_{H} * l) + k$

After moving $m + k$ and $2 * C_{T} * l$ to opposite sides, this can be written as:

$m + k = (C_{H} - 2 * C_{T}) * l$

Here, $l$ is the length of the cycle, so $m+k$ is a multiple of the length of the cycle. It means that if we travel $m+k$ steps starting from the first node of the cycle, we will arrive back at the first node. This is the most important part of the algorithm to understand.

Now, if you move $m$ steps ahead from the meeting point, you will arrive at the first node of the cycle. The reason is that the meeting point is already $k$ steps past the first node of the cycle, so $m$ more steps make a total of $m+k$ steps, which is a multiple of $l$. But the values of both $m$ and $k$ are unknown. Then how can you move $m$ steps?
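
To make this concrete, here is a small worked example with assumed values $m=3$ and $l=7$ (chosen for illustration, not read from the picture). After $t$ seconds $T$ has moved $t$ steps and $H$ has moved $2t$ steps, so once both are inside the cycle they stand on the same node exactly when the difference $2t-t=t$ is a multiple of $l$. The first such $t$ with $t \geq m$ is $t=7$, which gives $C_{T}=0$ and $C_{H}=1$:

$D_{T} = 7 = m + (0 * l) + k$, so $k = 4$

$D_{H} = 14 = m + (1 * l) + k = 3 + 7 + 4$

$m + k = 3 + 4 = 7 = (1 - 2 * 0) * l$

The meeting point is $k=4$ steps into the cycle. Moving $m=3$ more steps from there makes $4+3=7$ steps, exactly one full lap, so we land on the first node of the cycle.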

The Value of $m$

There is an interesting and quite easy way to move exactly $m$ steps without knowing the value of $m$. Remember that the distance between the root node and the first node of the cycle is also $m$.

Let’s assume the $H$ pointer is gone, another turtle pointer $’T_{2}’$ is now at the root node, and $T$ is still at the meeting point. Now, if you move both $T$ and $T_{2}$ one step at a time, the node where these two pointers meet is the first node of the cycle!

This is Floyd’s cycle detection algorithm. The time complexity of this algorithm is still $O(n)$ (Why?), but the memory complexity is reduced to $O(1)$.

C++ Code for Cycle Detection Algorithm

Let’s look at a C++ implementation:

/**
 * Definition for singly-linked list.
 * struct ListNode {
 *     int val;
 *     ListNode *next;
 *     ListNode(int x) : val(x), next(NULL) {}
 * };
 */
ListNode* Solution::detectCycle(ListNode* A) {
    //an empty list or a single node without a next pointer has no cycle
    if(!A || !A->next)return NULL;
    
    ListNode *tortoise=A;
    ListNode *hare=A;
    
    //phase 1: check if there is a cycle
    while(hare){
        if(hare->next and hare->next->next)
            hare=hare->next->next; //hare moves two steps
        else
            return NULL; //hare reached the end, no cycle
        tortoise=tortoise->next; //tortoise moves one step
        if(hare==tortoise)break; //cycle exists
    }
    
    //phase 2: walk from the root and from the meeting point,
    //one step at a time, they meet at the first node of the cycle
    ListNode* tortoise2 = A;
    while(tortoise2 != tortoise){
        tortoise2 = tortoise2->next;
        tortoise = tortoise->next;
    }
    return tortoise;
}

Here is a Go implementation if you are interested.

Other Uses of Floyd’s Cycle Finding Algorithm

Besides detecting cycles in a linked list, this algorithm can also be used in other cases. For example, it can find cycles in the sequence produced by repeatedly applying a mathematical function, such as the output of a pseudo-random number generator.
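As a rough sketch of that idea, the same two-pointer walk can be applied to the sequence $x_0, f(x_0), f(f(x_0)), \dots$ instead of to list nodes. The function name `findCycle` and the sample generator mentioned below are assumptions made for illustration. In addition to the start of the cycle, the sketch walks once around the cycle to measure its length, a step that is not part of the linked-list code above.

```cpp
#include <functional>
#include <utility>

// Returns {tail length, cycle length} of the sequence x0, f(x0), f(f(x0)), ...
// Assumes f maps a finite set of values into itself, so the sequence must repeat.
std::pair<long long, long long> findCycle(
        const std::function<long long(long long)>& f, long long x0) {
    // phase 1: the hare moves two steps per turtle step until they meet
    long long turtle = f(x0);
    long long hare = f(f(x0));
    while (turtle != hare) {
        turtle = f(turtle);
        hare = f(f(hare));
    }

    // phase 2: restart the turtle at x0, move both one step at a time;
    // they meet at the first value of the cycle
    long long tailLength = 0;
    turtle = x0;
    while (turtle != hare) {
        turtle = f(turtle);
        hare = f(hare);
        tailLength++;
    }

    // phase 3: walk once around the cycle to measure its length
    long long cycleLength = 1;
    hare = f(turtle);
    while (turtle != hare) {
        hare = f(hare);
        cycleLength++;
    }
    return {tailLength, cycleLength};
}
```

For example, with the generator $f(x) = (7x + 5) \bmod 12$ started at $4$, the sequence is $4, 9, 8, 1, 0, 5, 4, \dots$, so the tail length is $0$ and the cycle length is $6$.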

That’s it, now you know how the cycle-finding algorithm works. If you want to test whether you have understood this algorithm, you can try to solve this problem: UVa 350, Pseudo-Random Numbers.

If you want to read the same article in Bengali, click here.

Using the square-root decomposition technique, we can reduce the time complexity of several types of problems from $O(n)$ to $O(\sqrt{n})$. I will start with an example. Let’s say we have an integer array and we have to perform several operations on it. The operations can be of two types. In the first type, we have to find the sum of all the integers from index $l$ to index $r$. In the second type, we have to update the value at index $i$. Some of you may already know how to solve this using a segment tree or binary search tree; today we will solve it with a new technique.

What is the naive way to solve this? Every time we are asked to find the sum, we loop from $l$ to $r$, and for an update operation, we simply overwrite the value at index $i$. In this case, the complexity of finding the sum is $O(n)$, while an update takes $O(1)$.
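For reference, the naive version might look like the sketch below. The names `naiveSum` and `naiveUpdate` are hypothetical, and `std::vector` is used here only to keep the example self-contained.

```cpp
#include <vector>

// Naive baseline: every sum query walks the whole range [l, r].
long long naiveSum(const std::vector<int>& input, int l, int r) {
    long long sum = 0;
    for (int i = l; i <= r; i++) {
        sum += input[i];
    }
    return sum;
}

// Naive update: overwrite the value at index i.
void naiveUpdate(std::vector<int>& input, int i, int val) {
    input[i] = val;
}
```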

Now let’s try to optimize it. We will divide the array into multiple segments. Every segment will contain the sum of all integers of that segment. See the picture below:

The size of the array is $13$. When we create the segments, we want the size of every segment to be almost the same. Also, the number of segments should be nearly the same as the size of each segment. That’s where the square root helps us. The square root of $13$ is approximately $3.61$, or $3$ if we drop the fractional part. If we divide the array into segments of size $3$, we get $\lfloor 13/3 \rfloor = 4$ full segments of size $3$, plus one last segment that holds the single remaining element.

If the size of the array is not a multiple of the segment size, we will get one extra segment which is smaller than the others, but that’s not a problem. We will see soon how dividing in this way will help us immensely.
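The arithmetic can be checked with a small helper. This is a sketch with made-up names (`Layout`, `segmentLayout`); the two correction loops are an extra precaution against floating-point rounding in `sqrt` and are not part of the original explanation.

```cpp
#include <algorithm>
#include <cmath>

struct Layout {
    int segment_size;   // floor(sqrt(n)), at least 1
    int segment_count;  // number of segments, counting a shorter last one
};

Layout segmentLayout(int n) {
    int size = std::max(1, (int)std::sqrt((double)n));
    // nudge size in case sqrt was off by one due to rounding
    while ((long long)(size + 1) * (size + 1) <= n) size++;
    while (size > 1 && (long long)size * size > n) size--;
    int count = (n + size - 1) / size;  // ceiling of n / size
    return {size, count};
}
```

For $n = 13$ this gives a segment size of $3$ and $5$ segments: four full segments and one shorter segment at the end.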

Now when we are asked to find the sum in the range $[l, r]$, we will check which segments fall into the range, and we will get the sum for those segments.

At first, we need to pre-process the segments. See the code below:

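#include <cmath>

int segment[10000];  //segment[s] holds the sum of segment number s
int preprocess(int input[], int n) { 
    int current_segment = -1; 
    int segment_size = sqrt(n);  //fractional part is dropped
    if (segment_size < 1) segment_size = 1;  //avoid modulo by zero for n = 0
   
    for (int i=0; i<n; i++) { 
        if (i % segment_size == 0) { 
            current_segment++;  //new segment
            segment[current_segment] = 0;  //reset in case preprocess runs again
        } 
        segment[current_segment] += input[i]; 
    } 
    
    return segment_size;
}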

We traverse the input array, and every time the condition i % segment_size == 0 becomes true, we move on to the next segment.

For $[l, r] = [1, 10]$, we need to find sum for the segments shown in the picture below:

The problem is that some of the segments are only partially inside the range. Only the first and the last segment can have this issue. For those, we need to loop through each index individually. For the other segments, we can use the precomputed sum directly. We have about $\sqrt{n}$ segments, and the maximum size of each segment is also about $\sqrt{n}$. So even though we may need to loop through the first and the last segment, the total complexity remains $O(\sqrt{n})$. Now you know why we divided the array this way.

Now one small question. For a particular index $i$, how do we know which segment $i$ is in? Very easy, just compute i / segment_size using integer division.

Let’s see the code for sum operation:

int query(int input[], int segment_size, int l, int r) { 
    int sum = 0; 
    
    //loop the first segment
    //until we reach r or a starting index
    while (l < r && l % segment_size != 0) { 
        sum += input[l]; 
        l++; 
    } 

    //Loop until we reach
    //segment that contains r
    while (l + segment_size <= r) {  
        sum += segment[l / segment_size]; 
        l += segment_size; 
    } 
    
    //loop until r
    while (l<=r) { 
        sum += input[l]; 
        l++; 
    } 

    return sum; 
}

We divided the code into 3 parts, as discussed above.

Now let’s come to the update operation. It’s very easy: just find the segment in which index $i$ is located and update the sum for that segment.

void update(int input[], int segment_size, int i, int val) { 
    int segment_no = i / segment_size; 
    
    segment[segment_no] -= input[i];
    segment[segment_no] += val;
    input[i] = val;
}

The complexity for update is $O(1)$.
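Putting the three operations together, one possible way to package them is sketched below. This is not the code above but a repackaging under a few assumptions: the struct name `SqrtDecomposition` is made up, `std::vector` replaces the global array, sums are kept in `long long` to reduce the risk of overflow, and `query` assumes $0 \leq l \leq r < n$.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct SqrtDecomposition {
    std::vector<long long> data;     // copy of the input values
    std::vector<long long> segment;  // segment[s] = sum of segment number s
    int segment_size;

    explicit SqrtDecomposition(const std::vector<int>& input)
        : data(input.begin(), input.end()),
          segment_size(std::max(1, (int)std::sqrt((double)input.size()))) {
        int n = (int)data.size();
        segment.assign((n + segment_size - 1) / segment_size, 0);
        for (int i = 0; i < n; i++) {
            segment[i / segment_size] += data[i];
        }
    }

    // Sum of data[l..r], both ends inclusive.
    long long query(int l, int r) const {
        long long sum = 0;
        // partial first segment, one index at a time
        while (l <= r && l % segment_size != 0) sum += data[l++];
        // segments that lie completely inside the range
        while (l + segment_size - 1 <= r) {
            sum += segment[l / segment_size];
            l += segment_size;
        }
        // partial last segment, one index at a time
        while (l <= r) sum += data[l++];
        return sum;
    }

    // Set data[i] to val and fix the sum of its segment in O(1).
    void update(int i, int val) {
        segment[i / segment_size] += val - data[i];
        data[i] = val;
    }
};
```

With the values $1$ to $13$ stored at indices $0$ to $12$, `query(1, 10)` adds indices $1$ and $2$ one by one, takes the stored sums of the segments covering indices $3$ to $8$, and then adds indices $9$ and $10$, giving $65$.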

What’s next?

You can even find the Lowest Common Ancestor in a tree using a similar technique. If you want to dig deeper, check out this PDF.
