ACM ICPC 2019 Dhaka Regional Problem H (Droplets)

Finally, the contest is over. And it’s time for the editorial of my problem. This year, I have submitted two questions for the regionals. Between these two problems, one got selected for the Dhaka regional. And that problem appeared as number H (Droplets) in the contest. Hence this problem itself was quite simple, it was supposed to be the 3rd most manageable problem of the contest. But, somehow, it turned out to be the 2nd easiest one.

[Few days before the contest, we realized that a similar problem appeared in a national-level contest in RUET. But ICPC committee decided to keep the problem because this is a much much easier version, the RUET version was one of the hardest problem in that contest.]

The Problem

The problem is about filling up a matrix with water. There is a matrix given and you have to imagine it as a 2-d box. Some parts of the matrix are enclosed. Now, you need to find out the amount of water the matrix can hold. Please read the full statement here carefully before proceeding.

These are the pictures I used in the statement to describe the problem. What the problem asked you to do is quite clear from these pictures.

The Solution

Let’s consider the smallest unit of the matrix first to solve this problem. Here, the smallest unit is the little squares. Now, for each square in the matrix, how to identify whether there is water in the square or not?
For that to happen, the water must have a way to reach the square from the top. If you understand this part, you already know this is a graph problem. Therefore, treat every square as a node of the graph. And, a node is connected to its neighbor if no lines are blocking them.

As you have the graph now, scan the top squares from left to right. Every time you encounter a square on the first row whose top is not blocked, you found a way for the water to enter into the matrix. Afterward, run a depth-first-search from that node and mark all the nodes you can reach. Keep continuing this process, and in the end, the number of nodes you reached is the answer.

The time complexity of this solution is O(n * m) for each test case.

Here is the judge solution:

That’s it, I hope you enjoyed the problem!

Last but not least

Not sure whether you have noticed it or not, I have used climate-change as the theme of this problem. Recently 11000 scientists from all over the world declared a climate change emergency, which is quite alarming for all of us. Certainly, it is high time we need to be conscious about the rapidly changing climate. I am personally anxious about it, and you should be too. Let’s do whatever we can to reduce resource wastage, plastic usage, and whatever we need to save our mother planet.

Thank you.

Floyd’s Cycle Finding Algorithm

Suppose you have a linked list, now you have to identify whether there is any cycle in that linked list or not. This type of question is quite common for the interview. Today we will try to solve this problem using Floyd’s cycle finding algorithm.

A linked list with cycle
In the picture above, a cycle with length $7$ is present in the linked list.

Detect Cycle In A Linked List

The easiest way to detect a cycle is to use a dictionary or hashmap. For that, we need to traverse the list from the very first node and save each node in the dictionary. After reaching to any node, if that node is found to be already present in the dictionary, then the linked list is cyclic. The complexity and memory complexity of this algorithm is $O(n)$.

But we can reduce the memory complexity of this algorithm to $O(1)$ using Floyd’s algorithm. It is also known as the ‘Hare and turtle’ algorithm.

Hare And Turtle Algorithm

Let’s assume we have two pointers; one is the turtle pointer, which we will denote by $’T’$, and the other pointer is the hare pointer, which we will denote by $’H’$. In the beginning, both of the pointers will be at the root node of the linked list.

H and T pointer in a linked list cycle

In each second, the $H$ pointer will move two steps, but the $T$ pointer will move only one step. Then after one second, the position of the two pointers will be like this :

H and T pointer after 1 second

After one more additional second, $T$ will be at node $12$ and $H$ at the node $90$. After simulating these steps for some time, we will find that the two pointers will meet at the node $28$.

H and T pointers meeting point

If the two pointers meet at the same node, it indicates that there is a cycle present in the linked list. If there were no cycles in that list, then the $H$ pointer would reach the endpoint of the list by approaching ahead.

Now the question is how we can detect the first node of the cycle?

First node of the linked list

How To Detect The First Node of The Cycle

Let’s assume,

$m$= distance between the root node and the first node of the cycle,

$k$= distance between the first node of the cycle and the meeting point of the two pointers.

$l$= length of the cycle.

Now, if $T$ meets $H$ after circling the cycle for $C_{T}$ times, then the distance traveled by $T$ will be:

$D_{T}=m+(C_{T}∗l)+k$

And, if $H$ meets $T$ after circling the cycle for $C_{H}$ times, then the distance traveled by $H$ will be:

$D_{H}=m+(C_{H}∗l)+k$

As the $H$’s speed is two times of $T$’s speed, so we can say that:

$2∗(m+(C_{T}∗l)+k)=m+(C_{H}∗l)+k$

After some rearranging, it can be written as:

$m+k=(C_{H}–2∗C_{T})∗l$

Here, $l$ is the length of the cycle, so $m+k$ is a multiple of the length of the cycle. It means that if we travel $m+k$ length from the first node of the cycle, then we will return to the first node again. This is the most important part of the algorithm to understand.

Now, if you move $m$ steps ahead from the meeting point, then you will return to the first node of the cycle again. The cause is, the distance from the first node of the cycle to the meeting point is $k$. But, the value of both $m$ and $k$ is unknown. Then, how can you move $m$ steps?

The Value of $m$

An interesting and quite easy way is there to determine the value of $m$. I hope you still remember that the distance between the root node and the first node of the cycle is also $m$.

Let’s assume; no $H$ pointer is there anymore, but another turtle pointer $’T_{2}’$ is now present at the root node, and $T$ is still in that meeting point. Now, if you move both the $T$ and $T_{2}$ pointer one step at a time, then the point where these two pointers will meet is the first node of the cycle!

This is the cycle detection algorithm of Floyd. The time complexity of this algorithm is still $O(n)$ (Why?), but the memory complexity has reduced to $O(1)$.

C++ Code for Cycle Detection Algorithm

Let’s see a $C++$ code here:

Here is a golang implmentation if you are interested.

Other Uses of Floyd’s Cycle Finding Algorithm

Besides detecting cycles in a linked list, this algorithm can also be used in some other cases. For example, it can be used to identify cycles in any mathematical functions or pseudo-random number generator.

That’s it, now you know how cycle finding algorithm works. If you want to test yourself whether you are clear about this algorithm or not, you can try to solve this: uva 350 pseudo-random numbers.

If you want to read the same article in Bengali, click here.

Problem Solving: Consecutive Letters (MIST Inter University Contest 2019)

This problem appeared in 2019 inter-university contest hosted by MIST university, Dhaka, Bangladesh. Initially, our assumption was it will be medium difficulty problem but seems like the contestants found it easier. Around 53 team out of 120 was able to solve it. Let’s see the problem statement:

The time limit for this problem is $3$ seconds.

If you find the 1st query hard to understand, let me try to put it another way, you have to find the size of the largest substring which includes index $i$ and all the characters in the substring are same as $S[i]$.

Let’s see the first sample input:

I give you the query “1 0”. Now you have to find the size of the largest substring which includes index $i = 0$ and contains only the character S[0] = ‘A’. In this case the substring is S[0:1] = AA” and has size $2$ .

Now the next query is “2 1”, so you need to replace $S[1]$ with ‘#’ character, so the string becomes:

After then we again query “1 0”. Now the substring is S[0:0] = A” and has size 1.

Naive Solution

The first solution comes to mind is brute force. For every query of type 1, just loop from index $i$ towards left and right and see how many $S[i]$ you can find.

The time complexity for this code is O(Q*N) where N is the size of the string. As Q and N are very big, they will fail. As a problem setter, I had to be careful to make sure this code doesn’t pass. So I made some test cases where the loops for searching left and right will be very long. Those test cases have very few queries of type 2 and almost all the characters in the string are same.

Expected solution

There are several ways to solve the problem. I have solved it using a set and binary search. There is also a nice solution using a disjoint set. I will describe the solution using binary search first.

In brute force solution, you looped towards the left to find the starting point of the segment. Now I will show you how to find the $left$ point faster. Let’s forget about type 2 query for a moment.

For any given string, we can divide it by segments containing the same characters. See the following string for example:

I have highlighted the starting point of each segment.

After you take the original string as input, you can easily find these points and save them, right? Let’s say we have saved them in a sorted array which looks like $[0, 3, 6, 7]$.

Now let’s say I give you a query “1 4”.

Can you see that the $left$ point is the largest number in the set which is smaller than $i = 4$? In this case, it’s $3$. And how to find the $right$ point? It’s just the next number if the array, in this case, it’s $6$. So the size of the segment will be $right – left  = 6 – 3 = 2$

As you have a sorted array already, you don’t need to search one by one, use the power of binary search. You can either write your own binary search or use lower_bound function available in your language and modify it.

Now how about query 2? Let’s say we put a “#” in $i = 1$. Now you can imagine this is creating a new segment from index $i = 1$.

Notice that not only it’s creating a segment from index $1$, it’s creating another segment from index $1 + 1 = 2$. So we insert $1$ and $2$ in our array $[0, 1, 2, 3, 6, 7]$.

Note that we never do type 1 query on a segment consisting “#”. There is no harm if you consider two consecutive “#” as two different segments. We can assume for type $1$ query, index $i$ will always create a new segment, index $i + 1$ may or may not create a new segment but we don’t need to handle it specially if you use a set.

If we use normal array to maintain the list of segment starts it will be very difficult to insert new segments as we need to keep them sorted. That’s why we will use a set. All the programming languages have some built-in set where you can insert a number in sorted way in O(LogN) complexity and also ensures uniqueness of the number.

Let’s see the code. To make our life easier, we will imagine that there is a segment starting at the very end of the string. That will save us pain of handling some special cases.

The complexity for both query of type 1 and type 2 is logN which makes the total time complexity O(Q*LogN) which is good enough to pass under 1 seconds if you use fast I/O.

Alternate solution using disjoint set

There is a nice offline solution using disjoint sets. It “offline” because you have to read all the queries first before answering them. Let’s see another example XXBBFFF, but now we will see the string as a graph:

All the characters converterd into nodes and there is edge between them if the characters are same. Now let’s say we have 3 queries:

Now we read all the type 2 queries first and we will erase the links with those index. In this case they are index $3$ and $5$.

Now treat the connected nodes as a connected components. You can use disjoint sets to maintain the member and the size of the components. Now we have these sets $\{0, 1\}, \{2\}, \{4\}, \{5\}, \{7\}$.

Now we answer the queries from last to first! The last query is “1 5”. What is the size of the component which contains the node 6? The answer is 1.

The next query is “2 6”, now we have to unblock the 5-th node and add back the edges.

Using union find algorithm, we will update the sets, now we have $\{0, 1\}, \{2\}, \{4\}, \{5, 6, 7\}$. Now we query “1 5” again, this time the answer is 3.

In this way we will climb from the end, keep unblocking the nodes and use union find to update the sets. So the type 1 query will be just finding the size of the set. In the end we have to print the queries in original order.

Union find operations have logN complexity, so in the end we have O(QlogN) time complexity. You can see a sample code here.

There is a third solution using segment tree. Code for that is significantly more complex compared to the solution described above. Also it takes much more extra memory and runs slower. During the contest we have let those solutions pass too. I am not going to describe that solution here.

That’s it for now, good wishes for your next contest.