microservices – Should a large number of system tests be part of the build?

The tests and the build

The concept of having extensive tests as part of the build is not flawed. This is actually what the build is for. What is flawed is to have tests “that fail frequently but intermittently.” A test, whether it’s a unit test which executes a one-line method that does some elementary work and asserts that the result equals a given value, or a system test which relies on dozens of different components, every one of which can fail, has value only when green indicates success and red indicates failure. If the test fails randomly, this randomness makes it not just useless but harmful, because your team will mistrust the build.

— Hey, I think we shouldn’t push this hotfix to production, because our build is red.
— Oh, come on, it’s probably some flaky test, as usual. Just push it to production manually.

And then you spend the next four hours trying to undo the catastrophic consequences of what could have been avoided by just looking at the build.

If you remove the tests from the build, then why have those tests in the first place? Imagine you run them by hand once per day (and you run them several times, since they are flaky). One of the tests appears to be consistently red. What now? How would you find which one of today’s fifty commits broke the test? And how do you expect the developer who actually broke something to remember exactly what they were working on yesterday?

Flakiness in tests

Flakiness can come from several sources:

  • Individual components in a system fail. For instance, under heavy load one system may make another fail, even though both are third-party systems (which you can’t change) and you configured them correctly.

    If this is the reason for a failure, it may indicate that your product doesn’t cope well with failures coming from outside. The solution would be to make it more robust. There are plenty of different cases, and plenty of different solutions, such as failover, retry policies, etc.

  • A system fails because of its interactions with the outside world. Imagine that the system tests run on an infrastructure which is also used by three other products. It may happen that when another team is running stress tests, the network becomes so slow that your tests simply fail because parts of your product time out on the most basic things, such as waiting for a response from the database.

    In this case, the solution is to add more isolation, such as moving to a dedicated infrastructure or setting up quotas to guarantee that every project has enough computing, network and memory resources, no matter how other teams are using the infrastructure.

  • A test fails because of the complexity of the system itself, or because the test platform is unreliable. I’ve seen this on several web projects, with tests running through an emulated browser. The complexity of the product itself meant that occasionally, an element wouldn’t be shown on a page as fast as needed, and even more worrisome, sometimes a test would simply misbehave for no apparent reason.

    If this is what you have, you might move to a better testing platform, as well as try to simplify the product itself as much as possible.
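As a rough illustration of the retry policies mentioned for the first case, here is a minimal Python sketch of a retry with exponential backoff. The function name `call_with_retry` and its parameters are hypothetical, chosen only for this example. A real product would normally catch only the errors it knows to be transient, not every exception.

```python
import time


def call_with_retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(); if it raises, wait and try again.

    The wait doubles after each failed attempt (exponential backoff).
    `sleep` is injectable so the policy can be tested without real delays.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as error:  # assumption: every error is transient
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    # All attempts failed: surface the last error instead of hiding it.
    raise last_error
```

The point of such a policy is that an outside failure is absorbed by the product itself, so the test stays green when the product works and goes red only when it does not.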

data structures – Algorithm for Estimating Number of Unique Monthly Visitors

Is there a way to estimate the number of unique monthly visitors to a site based on a limited sample of one week of data? I have information about when a given user visited the site. This isn’t as simple as just multiplying the number of unique visitors the first week by 4, due to the hotel problem. If 10 people visit your site the first week and the same people are the only visitors to your site the second, third, and fourth week, the total number of monthly unique visitors to your site is only 10.

I know you can use HyperLogLog (HLL) to estimate the number of unique visitors to a site in O(1) space. I’m wondering if there’s a similar approach to estimate how many unique visitors there will be after a month, preferably one that also works in O(1) space.
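For context, here is a minimal HyperLogLog sketch in Python. It only illustrates the fixed-space counting mentioned above and does not solve the extrapolation problem. The class name, the choice of SHA-1 as the hash, and the default of 1024 registers are assumptions made for this example.

```python
import hashlib
import math


class HyperLogLog:
    def __init__(self, p=10):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers (fixed size)
        self.registers = [0] * self.m
        # Bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")          # 64-bit hash value
        index = h >> (64 - self.p)                     # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)          # remaining 64 - p bits
        # Position of the leftmost 1 bit in the remaining bits (1-based).
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def merge(self, other):
        # The union of two sketches is the element-wise maximum.
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def estimate(self):
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw
```

Because `merge` takes the element-wise maximum, merging the sketches of four weeks that contain the same visitors leaves the estimate unchanged, which is the reason multiplying a weekly count by 4 overestimates.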

What is the request limit number in the blockchain.info API?

I’m planning to use https://www.blockchain.com/api, specifically the single block endpoint at https://www.blockchain.com/api/blockchain_api, for testing purposes in my script.

I noticed this line on the API site: ‘Request Limits: To bypass the request limiter, please request an API key’. Registering for an API key requires me to have a website or app, but I don’t have either.

I’m planning to use their API to run a script for a few days to test.

So I’m wondering: does anyone know what the current request limit for their API is? I see no mention of the number of allowed requests on the website or in the documentation.

algorithms – Fast query on the number of elements in a quarter plane

A 2D Fenwick tree can solve this in $O(\log^2 N)$ time for both update and query, using $O(N^2)$ space, when your coordinates are integers and $N$ is the maximum coordinate.

Fenwick trees are nice for competitive programming because they have a really compact implementation, for example in C++:

#include <vector>

typedef unsigned long long uw;
uw lsb(uw x) { return x & -x; }      // lowest set bit of x
uw  up(uw x) { return x + lsb(x); }  // next index to visit on update
uw  dn(uw x) { return x - lsb(x); }  // next index to visit on query

template<class T> struct Fenwick2D {
  uw n; std::vector<std::vector<T>> b;
  Fenwick2D(uw n) : n(n), b(n+1, std::vector<T>(n+1)) { }

  // Add v to grid cell (x, y_); coordinates are 1-based.
  void increase(uw x, uw y_, T v) {
    for (; x <= n; x = up(x))
      for (uw y = y_; y <= n; y = up(y))
        b[x][y] += v;
  }

  // Sum of all grid cells (i, j) with i <= x and j <= y_.
  T prefix_sum(uw x, uw y_) {
    T s = 0;
    for (; x >= 1; x = dn(x))
      for (uw y = y_; y >= 1; y = dn(y))
        s += b[x][y];
    return s;
  }
};

And this shows your example from the question:

#include <iostream>

int main() {
    Fenwick2D<int> fw(4);
    fw.increase(1, 1, 1);
    fw.increase(2, 2, 1);
    fw.increase(1, 2, 1);
    fw.increase(3, 2, 1);
    std::cout << fw.prefix_sum(1, 1) << "\n";  // 1
    std::cout << fw.prefix_sum(2, 2) << "\n";  // 3
    std::cout << fw.prefix_sum(3, 3) << "\n";  // 4
    return 0;
}

This data structure is more general than your problem as well. In general it can increase any grid element $g_{x,y}$ by value $v$ (which decreases for negative $v$), and ask the value of the lower quadrant prefix sum $\sum_{i \leq x}\sum_{j \leq y} g_{i,j}$ for any $x, y$. Note that this means you can get area sums as well using 4 queries by the inclusion-exclusion principle.
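The four-query area sum can be sketched as follows. This is a Python port of the same structure, written in Python only for brevity. The method name `rect_sum` is hypothetical and not part of the C++ code above.

```python
class Fenwick2D:
    def __init__(self, n):
        self.n = n
        # 1-based indexing, so row 0 and column 0 stay unused.
        self.b = [[0] * (n + 1) for _ in range(n + 1)]

    def increase(self, x, y, v):
        i = x
        while i <= self.n:
            j = y
            while j <= self.n:
                self.b[i][j] += v
                j += j & -j
            i += i & -i

    def prefix_sum(self, x, y):
        # Sum over all cells (i, j) with 1 <= i <= x and 1 <= j <= y.
        s = 0
        i = x
        while i >= 1:
            j = y
            while j >= 1:
                s += self.b[i][j]
                j -= j & -j
            i -= i & -i
        return s

    def rect_sum(self, x1, y1, x2, y2):
        # Inclusion-exclusion: take the big quadrant, subtract the two
        # strips outside the rectangle, add back the corner subtracted twice.
        return (self.prefix_sum(x2, y2)
                - self.prefix_sum(x1 - 1, y2)
                - self.prefix_sum(x2, y1 - 1)
                + self.prefix_sum(x1 - 1, y1 - 1))
```

A query whose lower corner is at coordinate 1 needs no special case, because `prefix_sum` returns 0 as soon as either coordinate is 0.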

sharepoint online – Getting version number into the spreadsheet

I have a spreadsheet located in a SharePoint library. Major and minor versions are enabled. I am trying to get the names and values of some columns (metadata/properties) into the spreadsheet and have been largely successful in doing so by using ThisWorkbook.ContentTypeProperties in a VBA module.

However, I am unable to get the version number of the document. Does anyone know how this can be achieved?

Thanks

algebraic number theory – Solutions of a Linear System from a Subfield

Is there any work related to finding the solutions of a linear (homogeneous) system that lie in a subfield? For example, $A\mathbf{x} = \mathbf{0}$ where $A$ is a real matrix but we are looking for rational (or equivalently, integer) vectors $\mathbf{x}$ (other than $\mathbf{0}$). I don’t need an explicit method or algorithm, just any existence result or condition (e.g., a sufficient or necessary condition for $\mathbf{0}$ to be the unique rational solution would also be useful).

Note that this is different from a similar-sounding case where $A$ is a rational matrix, in which case the existence of a nonzero real vector $\mathbf{x}$ implies the existence of a nonzero rational vector $\mathbf{x}$. Also note that this is different from linear homogeneous Diophantine systems.
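
To make the distinction concrete, here is a small worked example (the matrix is chosen purely for illustration):

$$A = \begin{pmatrix} 1 & \sqrt{2} \end{pmatrix}, \qquad A\mathbf{x} = x_1 + \sqrt{2}\,x_2 = 0.$$

Over the reals the solution space is one-dimensional, spanned by $(\sqrt{2}, -1)^T$. A rational solution with $x_2 \neq 0$ would force $\sqrt{2} = -x_1/x_2 \in \mathbb{Q}$, so $\mathbf{x} = \mathbf{0}$ is the only rational solution. For a single row, a nonzero rational solution exists exactly when the entries of that row are linearly dependent over $\mathbb{Q}$.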

What are ways to obtain a toll-free number that is a secure line?

Through what services or other means can one obtain a toll-free number in the USA on a secure line that cannot be eavesdropped on and that meets federal patient confidentiality laws and HIPAA guidelines?

co.combinatorics – Chromatic number of regular graphs using spectra

There exist inequalities relating the maximum and minimum eigenvalues of the adjacency matrix of a graph to its chromatic number, namely Wilf’s and Hoffman’s inequalities. But for regular graphs, the upper bound given by Wilf’s inequality is quite trivial: it is the same as the greedy coloring bound.

Is there a better bound for the chromatic number of regular graphs using the spectra of, say, the adjacency or Laplacian matrices? Thanks in advance.
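
For reference, writing $\lambda_{\max}$ and $\lambda_{\min}$ for the largest and smallest adjacency eigenvalues of a graph $G$ with at least one edge, the two bounds are usually stated as

$$1 - \frac{\lambda_{\max}}{\lambda_{\min}} \;\le\; \chi(G) \;\le\; 1 + \lambda_{\max}.$$

For a $d$-regular graph $\lambda_{\max} = d$, so the upper bound reads $\chi(G) \le d + 1$, which is exactly what greedy coloring gives.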

performance – Finding the prime factors of a number in Python 2

I’ve created a function that takes a number and, if it’s prime, tells you so, or if it’s composite, gives you the prime factors of the number (and if it’s 1, tells you that it’s neither).

Theoretically it should work for an arbitrarily large number, but at 8 digits it starts to slow down significantly, particularly if the prime factors are large. I’m fairly new at Python, so I’d welcome any feedback, especially on how to make it faster.

I’m aware that there are things I could have done more efficiently from the start (some of which I’ve become aware of from looking at other Python questions in this same vein on this site). While I would find advice like ‘this bit’s ill-conceived, rip it out and write something else entirely’ helpful, I’d prefer best-practice suggestions and ways to make it faster without totally changing the premises (as it were).

I haven’t annotated it because (as far as I’m aware), it’s fairly basic; any old hack could write this, but obviously I can annotate if you’d like.

Thanks!

Here’s the code (in Python 2):

import math
def prime_factors(y):
 n = y
 def is_prime(x):
    count = 0
    if x > 1:
      for i in range(2, x):
        if x % i != 0: 
          count += 1
        else:
          return False
          break
    else:
        return True
    if count != 0:
        return True 
    if x == 2:
      return True
 def make_p_lst(x):
   z = []
   for i in range(2, x):
     if is_prime(i) == True:
       z.append(i)
   return z
 
 c = 0
 c = int(math.sqrt(y) + 1)
 prime_lst = []
 prime_lst = make_p_lst(c)
 p = is_prime(y)
 if p == True and y != 1:
   print '%s is prime.' % (y)
   return 'Thus, its\' only factors are 1 and itself.'
 elif y != 1:
   print '%s is composite, here are its\' prime factors: ' % (y)
   factors_lst = []
   while is_prime(y) != True:
      for i in prime_lst:
        if y % i == 0:
          y = y/i
          factors_lst.append(i)
   factors_lst.append(y)
   factors_lst.sort()
   if factors_lst[0] == 1:
     factors_lst.remove(1)
   n = factors_lst
   return n
 else:
   return '1 is neither prime nor composite.'
print prime_factors(871)
