Consider the following problem. They give us a set of patterns (chains) $ Pi = { pi_i } $, a text $ s $and a window length $ k $. We want a list of all shifts $ 0 le i le | s | -k $ such that each pattern in $ Pi $ is contained in the substring $ s (i: i + k) $.

Can this be resolved in linear or near linear time? Of course, it can be solved in quadratic time $ O (| s | | Pi | + sum | pi_i |) $ using KMP or Aho-Corasick plus postprocessing.

The motivation for this problem is to find matches for a topic (represented by the set of patterns) in a text. In that context, it makes sense to demand that the coincidences do not overlap, so I am also interested in that case, but it might be easier to start with the relaxed version.

I would also be interested in the generalizations of the problem that allow approximate matches of some kind, for example, that only require a threshold in the size of the subset of matching patterns, that allow matches within a given editing distance, or something that uses Markov models hidden or general Probabilistic graphic models. However, I would be surprised if such generalization can be resolved in a sub-quadratic time.