Bits of Learning

Learning sometimes happens in big jumps, but mostly in little tiny steps. I share my baby steps of learning here, mostly on topics around programming, programming languages, software engineering, and computing in general. But occasionally, even on other disciplines of engineering or even science. I mostly learn through examples and doing. And this place is a logbook of my experiences in learning something. You may find several things interesting here: little cute snippets of (hopefully useful) code, a bit of backing theory, and a lot of gyan on how learning can be so much fun.

Tuesday, April 16, 2024

Uncompetitive Programming (part 2)

This is a continuation of part 1 of this article. It presents an example of solving a problem that appears on a competitive programming site. The method of solving is utterly useless if you look at it through the CP lens. It took me an order of magnitude more time and effort to come up with this solution. I don't know if my solution would pass in the online evaluation, particularly in terms of time performance. I will try to procure a typical CP-style solution which would do well in CP evaluation and paste it here. My solution doesn't claim to be that.

And yet, I claim that this solution is superior in many important ways to a typical CP solution.

  • It was built systematically.
  • Options were given their due attention before one was chosen and pursued, while the others were set aside.
  • I spent quite a lot of time writing down my thoughts clearly in a way that is understandable to a reader.
  • The code follows closely the final design.
  • The code is readable to anyone who knows CS and programming. No special training for competitive programming is needed. There is some focus on performance, but not at the expense of readability or intuitiveness of the solution. 
I claim that this style, which is very different from that of competitive programming, is the de facto style of practical programming. It is not something that any competitive programmer would be able to do given enough time. It involves a style of thinking, communication and programming that is very difficult, if not impossible, to learn in a CP setting. It requires a shift in mindset.

I hope that you will read the rest of this article and will be able to see the differences w.r.t. CP that I am trying to bring out here. I don't want to disparage CP. To do well in CP, one requires certain skills which are also quite valuable in the real world. All I want to say is that CP is not even the most important take on programming, let alone the only one. Takeaways:

  • Programming is like an art. It's as much about experience as about output.
  • Ability to do thorough thinking is more important than thinking fast.
  • Ability to communicate through written words and pictures is more important than hacking. 
  • Readability is a far more important quality of code than squeezing every cycle out of your processor.

Case Study

Problem

The problem can be found here. We solve a slight variant (or subproblem) of this problem. 

Our variant: We consider a sequence of slots. Some of these slots are empty while some are filled. The goal is to bring all elements to occupy contiguous slots. For this, we are allowed to move any element from one slot to another (empty) slot with unit cost. Our algorithm has to compute the minimum cost to achieve this for a given initial state.

Our solution can be easily extended to solve the original problem.

Code

The code can be found in this GitHub repository.

Analysis and Design (Informal)

Our objective is to place all identical symbols adjacent to each other. The goal is to do so in the minimum number of moves. The cost of each move, irrespective of the distance of the move, is 1.



The algorithm should compute the cost of moving, not necessarily do the moves as such.

The algorithm tries to avoid the following:

  • Making repeated moves.
  • Causing a gap to be created.
  • Moving a ball to a position from which an identical ball has been moved out.

Strategy 1

Identify sets of contiguous positions occupied by balls (symbols) of a particular colour (description).

For example:



Use an interval as a pivot and compute the cost of filling the gaps on either side of the pivot.

For example, using I1 as the pivot in the above example:



However, the above is not the optimal way of achieving a contiguous interval.

The above transformation results in a contiguous interval in just one move.

Each pivot interval may have up to 2 adjoining gaps. To fill any gap, we follow the policy of using the extreme balls (symbols).

Hence, after identifying a pivot, we have to identify the adjoining gaps. For filling each gap, we have to explore using the extreme balls, of which there can be up to two, one in each direction.

Heuristic: If there exists a gap whose length is greater than or equal to the length of an extreme interval, use that interval to fill the gap.

Why: Removal of an extreme interval causes a gap to disappear.



It looks like the above method has a serious shortcoming. For each pivot, there are two adjoining gaps. For each of these gaps, the candidate balls to use are also up to two (from the extreme intervals). This is a fertile pattern for exponential growth in complexity. Each step leads to up to 4 options for the subsequent step, so N such steps can proliferate into 4^N options in the worst case.

Dynamic Programming to the Rescue

I think dynamic programming can help here.

Suppose a trace of creating a contiguous interval has the following steps:

The total cost of 

Recursively:

This is good because the cost of a trace can be recursively defined in terms of the cost of its tail.

If there is another trace 

If we wish to find the minimum cost of getting to a contiguous state from a state s, it's just the minimum, over the possible next states of s, of the unit cost of the step from s plus the minimum cost of reaching a contiguous state from that next state.

For example, in the following figure:




Now, the implication of this observation is that once the minimum cost of getting from a state to a contiguous state is computed, it can be memoised and reused anytime we reach that state through any other path. This will collapse the tree of states into a DAG, potentially containing the state-space explosion caused by the 4^N growth discussed earlier.

There are still an exponential number of states. Suppose there are N slots and M balls. Then the number of possible states is N choose M, which unfortunately can be exponential in N, and this may kill our algorithm.

The way to restrict the above explosion may be to avoid exploring non-viable options and to make only smart moves. For example,



is a useless transition. Our initial idea of moving balls from the extremes can be a good option.

Local Summary

  • We are gravitating towards a recursive design.
  • We memoise the minimum cost per state.
  • We may want to remember the states visited in the current trace being explored to prevent looping.
  • It's better to design the computation of the next steps in a way that inherently prevents us from visiting an already visited state.
  • We should prevent exploring useless states, preferably by ensuring that each step leads to an improvement in some 'health' indicator (a greedy approach).
  • However, a greedy approach mustn't lead us to miss out on optimal solutions.

The skeleton of the algorithm now looks like this:

ALGORITHM



Note that the above is a recursive algorithm.

M(s) = nil will happen when s is being visited for the first time.

The rate of convergence of the algorithm will depend on the nextstate function. The smaller the average number of next states, the faster the convergence will be.
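One possible reading of the above skeleton in Python (a hypothetical sketch: the names min_cost, next_states and is_contiguous are mine; the dictionary memo plays the role of M, with absence from the table standing for M(s) = nil):

```python
def min_cost(s, next_states, is_contiguous, memo=None):
    """Memoised recursion over states. Each step has unit cost,
    per the problem statement. Assumes next_states never leads
    back to an already visited state on the current trace,
    otherwise the recursion would not terminate."""
    if memo is None:
        memo = {}                     # M(s) = nil for every unvisited s
    key = tuple(s)
    if key not in memo:               # s is being visited for the first time
        if is_contiguous(s):
            memo[key] = 0
        else:
            memo[key] = 1 + min(min_cost(t, next_states, is_contiguous, memo)
                                for t in next_states(s))
    return memo[key]
```

The next_states and is_contiguous functions are left as parameters here, since we have not yet decided how to compute next states.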

Computing next states

Let's use a simple method to begin with.

As long as a state has gaps, we keep filling those gaps by moving balls at the ends (extremes) into the gaps. To limit useless possibilities, we can fill each gap starting from its left end.



For a state with g gaps, the number of next states could be 2g (each gap can be filled from either extreme).

Each step, by definition, has unit cost.

Note that we have (at least for now) given up our earlier representations of intervals.

I would like to note down a couple of observations here:

  1. By always choosing a ball at an extreme end to fill intermediate gaps, we are sure to always shrink the spread, never extend it at either end. Hence, the length of the slots (strings) is not important.
  2. We have taken the above strategy on the gut feeling that it will not miss out on the best moves leading to a minimum cost. This should ideally be proved.

State representation



Again, following our gut feeling, let's choose a binary string representation. It appears that it will lead to a simpler implementation of the functions we wish to implement. We may or may not be right here.

Move

We begin by defining the move function.

move: state × int × int --> state

For example:

move([0, 1,1,1,0,1], 5, 4) = [0, 1, 1, 1, 1, 0]
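A direct transcription in Python (representation choices are mine: slots are 0-indexed, a state is a list of 0/1; the assertion guards the precondition that src is occupied and dst is empty):

```python
def move(state, src, dst):
    """Move the ball at occupied slot src to empty slot dst (unit cost)."""
    assert state[src] == 1 and state[dst] == 0
    new = list(state)
    new[src], new[dst] = 0, 1
    return new
```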

Gaps:

FIGURE


gaps([0,1,1,0,0,1,0,0,0,1,1,0]) = {[3,4], [6,7,8]}

Note that we exclude the unoccupied slots at the two ends.

So,

gaps: state --> set of (list of int)
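A Python sketch of gaps (my own transcription; it returns a list of runs rather than a set, since lists are not hashable in Python):

```python
def gaps(state):
    """Maximal runs of empty slots strictly between the two extreme
    balls; unoccupied slots at the two ends are excluded."""
    occupied = [i for i, x in enumerate(state) if x == 1]
    lo, hi = occupied[0], occupied[-1]
    runs, run = [], []
    for i in range(lo + 1, hi):
        if state[i] == 0:
            run.append(i)        # extend the current run of empty slots
        elif run:
            runs.append(run)     # a ball closes the current run
            run = []
    if run:
        runs.append(run)
    return runs
```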

A function that gets us the extreme positions of the state would be helpful.

Ends

ends: state --> (int, int)

For example:

  • ends([0, 1, 1, 1, 0, 1]) = (1, 5)
  • ends([0,1,1,0,0,1,0,0,0,1,1,0]) = (1, 10)
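In Python (again a sketch, with 0-indexed slots):

```python
def ends(state):
    """Positions of the leftmost and rightmost balls."""
    occupied = [i for i, x in enumerate(state) if x == 1]
    return occupied[0], occupied[-1]
```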

Now, for the next-moves function:

Next Moves

nextmoves: state --> set of (int, int)

nextmoves computes the cross product of the ends and the left ends of the gaps.

For example:



Using nextmoves, we can compute the next states.

Next states

next_states(s) = {move(s, a, b) for all (a,b) in next_moves(s)}

This function can be plugged into the min_cost function above to obtain the complete algorithm.
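Putting the functions described above together, here is a self-contained sketch of the complete algorithm (names and representation choices are mine; it follows the stated policy of moving an extreme ball to the left end of a gap, at unit cost per move):

```python
def ends(state):
    """Positions of the leftmost and rightmost balls (0-indexed)."""
    occupied = [i for i, x in enumerate(state) if x == 1]
    return occupied[0], occupied[-1]

def gaps(state):
    """Maximal runs of empty slots strictly between the extreme balls."""
    lo, hi = ends(state)
    runs, run = [], []
    for i in range(lo + 1, hi):
        if state[i] == 0:
            run.append(i)
        elif run:
            runs.append(run)
            run = []
    if run:
        runs.append(run)
    return runs

def move(state, src, dst):
    """Move the ball at slot src to empty slot dst (unit cost)."""
    new = list(state)
    new[src], new[dst] = 0, 1
    return new

def next_moves(state):
    """Cross product of the two extreme positions and the left end
    of each gap, as (src, dst) pairs."""
    lo, hi = ends(state)
    return {(src, g[0]) for src in (lo, hi) for g in gaps(state)}

def next_states(state):
    return [move(state, a, b) for (a, b) in next_moves(state)]

def min_cost(state, memo=None):
    """Minimum number of unit-cost moves to make the balls contiguous.
    Terminates because every move strictly shrinks the spread."""
    if memo is None:
        memo = {}
    key = tuple(state)
    if key not in memo:
        if not gaps(state):      # already contiguous
            memo[key] = 0
        else:
            memo[key] = 1 + min(min_cost(t, memo) for t in next_states(state))
    return memo[key]
```

For example, min_cost([0, 1, 1, 1, 0, 1]) is 1: moving either extreme ball into the single gap at position 4 yields a contiguous interval.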


Conclusion

To conclude, I would (again) like to bring your notice back to the point that this post comprises rough notes; they have been written more as an aid to my own thinking than as a representation of the final output of my thoughts. Consequently, some of the ideas which we broached in the early part of the article disappear as we proceed. For example, the idea of representing intervals (because it was unnecessarily complicated), or the first iterative algorithm (because our main approach itself changed). Similarly, there was a thought (almost a passing one) about making the algorithm greedy. It turns out that the algorithm we settled on is greedy, but we never again put a conscious thought in that direction after mentioning it once. A more finished article would be shorter, including only the ideas which survived till the end and found their way into the final implementation. However, that would lose track of all the ideas we explored and rejected, and the chronological order in which they appeared and sometimes faded away. Note that even the ones we dropped on the way have their value: they are the stepping stones to where we finally arrived.
So, there! An uncompetitive style of programming! In this, we look at the problem from various angles. We contemplate various options. Sometimes, our contemplation may seem idle -- as if we are more interested in the beauty of the process of thinking than in the result. I will share a secret here: the above is actually not fully untrue! We do care about the result, but we don't want to miss out on the fun part.

We are concerned about efficiency. We are also concerned about the time it takes to implement the program. We are concerned about the brevity of our code. But above all these, we are concerned about the readability of our code. We write code which is understandable by a lay programmer and follows our design very closely. Programs are not just instructions to machines; they are meant to be read by humans too!

Uncompetitive Programming (part 1)

I recently had an interesting discussion with one of my young relatives -- a second year engineering student in computer science -- who, like many of his age, is fascinated by the idea of competitive programming. Competitive programming (CP) involves solving algorithmic problems/puzzles by writing programs. These problems typically involve some CS theory, particularly in data structures and algorithms. There are many online sites which provide a practice ground for CP aspirants. The pinnacle of this, to the best of my knowledge, is ICPC. One of our alums went on to the World Finals. This contributed in a big way to his getting recruited by Google, USA with a record-breaking salary. This achievement set off a CP culture in our institute which has been going strong since. Some of our brighter students gravitate towards the CP culture, with dreams of becoming the next Google hire.

CP emphasises CS fundamentals. To rise in the CP community, one needs to be good with algorithms, basic mathematical thinking, and coding in at least one programming language. As CP happens in a time-bound environment, one needs to develop quick-thinking skills. Undoubtedly, CP aspirants are closely followed by top hirers. They have their fundamentals right, they are smart and fast, and they function well under extreme pressure -- exactly the qualities top companies are looking for in their new employees, right?

Yet, I believe that CP is not for everyone. And by that, I don't mean that people who can't handle the high pressure and expectation of CP shouldn't do it. That, of course, goes without saying. What I mean is that, with all its good things, CP ends up testing programmers for a limited set of skills, and misses out on many others which may be equally or more important.

I would like to draw a parallel with academic assessments. Take IIT JEE. About 30 years ago, when we were school students preparing for college entrances, IIT JEE used to be viewed as an ultimate test of academic brilliance. While its focus was restricted to science and mathematics, it was perceived as a reasonably reliable measure of natural intelligence. To crack JEE, you needed to be someone. Conversely, if you had cracked JEE, you were someone.

Things have changed since then. IIT JEE, like all assessments, has devolved into a pattern now. Because of the coveted status of IITs, there are millions who dream of cracking JEE. And there are people who have made billion-dollar businesses out of this aspiration. They have studied the JEE pattern and have developed methods of preparing students so that they do well in JEE, not because they are very intelligent, but because hard practice has drilled the JEE pattern into their minds. Acing JEE is no longer a surefire indicator that you are built differently. It's definitely about hard work and a lot of practice. But, to a large extent, it may be an indicator of something negative -- probably that you are more competitive than academically brilliant; and these are two entirely different things. Such high competition may now be routinely excluding brilliant students, with the potential to become groundbreaking engineers, from entering IITs, because they are not able to handle the psychological pressure. It's a tragedy of unprecedented scale!

The CP culture has a not-so-hard-to-see similarity with IIT JEE. With growing demand, CP has been attracting students in whom the competitive part dominates the programmer part. By practice-solving hundreds of similar problems, many of these aspirants develop a high degree of pattern-matching skill and sharpen a bag of tricks to code something up quickly. They are often like overfitted ML models which do really well in CP-like environments, and only in that kind of environment.

A terrible side-effect of this phenomenon is that programming is now increasingly being viewed as a quick-and-dirty problem-solving activity. Many of our bright students don't even consider the possibility that programming is also a creative exercise that can't always be done in a high-pressure, time-bound manner, but through open-ended exploration. Time pressure and open-ended exploration directly conflict with each other; they are two extremes of a spectrum. People who have trained hard to get close to one end of this spectrum are very unlikely to develop well at the other. And we need people at all points on this spectrum.

In the real world, CP-style deadline-driven algorithm design and coding is not always realistic. Usually, for a task that matters, the solution is not likely to be very cut-and-dried. In such cases, merely performant code that gives correct results is far from enough as an output. You have to be able to explain clearly to others how you came up with the solution and how the whole damn thing works. Happily, you would typically have much more time than in CP to come up with such a solution.

A CP-style programmer has certain skills tuned to perform well in the high-pressure setting of CP. There's a tendency to pattern-match (against familiar problems) and to hack things to make them work 'somehow'. This may not be a good style in the many settings where thinking slowly but thoroughly and clearly is more important.

While designing a solution, there may be many false starts. Writing things down in the language of plain English, pictures and mathematics helps us think ahead and correct our course early. Starting to code right away may give us a mistaken sense of early progress but may lead to difficult-to-correct mistakes down the line, particularly when the problem is challenging.

Part 2 of this article presents an example of solving a problem that appears in a competitive programming site. 

I hope that you will be able to see the differences w.r.t. CP that I am trying to bring out here.

To conclude this part of the article:
  • Programming is like an art. It's as much about experience and aesthetics as about output.
  • Ability to do thorough thinking is more important than thinking fast.
  • Ability to communicate through written words and pictures is more important than hacking. 
  • Readability is (more often than not) a far more important quality of code than squeezing every cycle out of your processor.
This is a logical point where it's OK to stop reading. But in case you plan to continue please proceed to part 2 in which I discuss with an example how an uncompetitive programmer would approach a CP problem! 😀