I am trying to demonstrate the hardness of the following problem. This problem is from Google Hashcode, Qualification Round, 2020.

Hier is a brief description of the problem. Given a list or libraries and a list of books. Each book $ b $ has several points $ P_b $. We want to scan a subset of the books in a way that maximizes the total points of the books we scan $ n $ days. We have to choose books from libraries. Each library $ l $ has a list of books $ B_l $, the amount of books we can scan per day in this library $ m_l $ and the number of days needed to register the library (do paperwork) $ d_l $. We can scan books in parallel from many libraries at once, but registration must be done sequentially and we cannot register two libraries at the same time. In addition, the same book can be fond of many libraries and we have to decide which books should be chosen from which libraries and in what order. We can scan as many books per day as we want (respecting the restrictions provided by libraries).

Here is an example: we have two libraries and three books. Suppose we have 3 days in total. The books have 40, 40, 40 and 30 points respectively. The first library has the first 3 books. The second library has the first and fourth books. Suppose everyone needs one day to register and can provide one book per day. We can choose the first library at the beginning. On the first day we registered this library.

On the second day we can scan the second book while registering the second library. On the third day we can scan the third book in the first library and the first book in the second library. This adds up to the first three books and a total of 120 points. Note that if we scanned the first book in the first library or started with the second library, we would have scanned the 30-point book instead of a 40-point book.

This is what I have tried so far. On the one hand, with the correct order of the libraries in hand, we can solve the problem in polynomial time, since we can calculate how many books each library can achieve in total (total number of days minus sums of registration time prefix of the library for Libraries that we have chosen so far, all multiplied by the amount of books that this library can provide per day). With these values, we can turn the problem into a bipartite weighted problem of correspondence b, where each library is connected to the books it has with weights equal to the points of each book. Each library can accommodate the total number of books it can provide. This problem can be reduced to a bipartite coincidence of maximum weight and resolved in polynomial time. However, finding a correct order of libraries seems to me to be a difficult programming problem.

On the other hand, assuming $ B_l $ are separated by pairs over all libraries $ l $, which means that each library has its "own" set of books, and assuming that each library can provide exactly one book per day, and each book has exactly one point, we turn the problem into a programming problem to determine an order of libraries where each library has registration time and number of points, so that it can give one point per day after registration and registrations are made sequentially. This problem seems similar to the problem of the backpack, since we can solve it using dynamic programming during the number of days in which $ d_ {i, j, k} $ It is the greatest amount of points we can achieve from $ i $th to the $ j $th library in $ k $ days in which all libraries in the range are registered in the $ k $th days and we are free to register a new library. The problem seems similar to the knap-sack problem. However, a psudo-polynomial time algorithm for this case is already polynomial since $ k $ It is limited by a polynomial in the number of books. However, when we remove the restrictions, this condition is no longer met and a weak NP-hard problem remains NP-hard. The dynamic programming solution also seems difficult to extend to the general case.

*Do you think this problem is NP-hard? I appreciate all the advice, thoughts and ideas about hardness and / or polynomial solutions.* 🙂

**Disclaimer** The question is just out of curiosity. The competition is over and a $ O ( sqrt n m) $ The matching algorithm is already too slow for the limits suggested in a typical computer, so I only ask from a theoretical point of view.