data structures – Intuition behind the Reversal Algorithm [ Rotation of Array ]

I know that the program below uses the reversal algorithm to rotate an array's elements k positions to the right. But how does the reversal algorithm work? It seems almost like magic. Can somebody give me the intuition, or a simple mathematical explanation, so I can understand it?

void rotate(vector<int>& nums, int k) {
    k %= nums.size();                         // rotating by a multiple of the length is a no-op
    reverse(nums.begin(), nums.end());        // 1. reverse the whole array
    reverse(nums.begin(), nums.begin() + k);  // 2. reverse the first k elements
    reverse(nums.begin() + k, nums.end());    // 3. reverse the remaining n - k elements
}
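
To see it happening concretely, here is a minimal, self-contained trace of the same three steps (the sample array and k = 3 are my own choice):

#include <algorithm>
#include <iostream>
#include <vector>

// Trace for nums = {1,2,3,4,5,6,7}, k = 3:
//   after reversing everything:   7 6 5 4 3 2 1
//   after reversing the first k:  5 6 7 4 3 2 1
//   after reversing the rest:     5 6 7 1 2 3 4   (= right rotation by 3)
int main() {
    std::vector<int> nums{1, 2, 3, 4, 5, 6, 7};
    int k = 3 % static_cast<int>(nums.size());
    std::reverse(nums.begin(), nums.end());
    std::reverse(nums.begin(), nums.begin() + k);
    std::reverse(nums.begin() + k, nums.end());
    for (int x : nums) std::cout << x << ' ';  // prints: 5 6 7 1 2 3 4
    std::cout << '\n';
}

What I can see from the trace: the full reversal already moves the last k elements to the front (just in reversed internal order), and the two partial reversals then repair the order inside each block.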

What is the intuition behind Strassen’s Algorithm?

I came across Strassen’s algorithm for matrix multiplication, which has time complexity $O(n^{2.81})$, significantly better than the naive $O(n^3)$. Of course, there have been several other improvements in matrix multiplication since Strassen, but my question is specific to this algorithm.

If you look at the algorithm, you’ll notice that 7 matrices $M_1$ to $M_7$ have been defined as intermediate computation steps, and the final matrix product can be expressed in terms of these. I understand how to verify this claim and arrive at the expression for the desired time complexity, but I’m unable to grasp the intuition behind this algorithm, i.e. why the matrices $M_1$ through $M_7$ are defined the way they are.
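
For reference, here are the seven products and the final combination, as the algorithm is usually presented for a $2\times2$ block partition (standard textbook form):

$$\begin{aligned}
M_1&=(A_{11}+A_{22})(B_{11}+B_{22}), & M_2&=(A_{21}+A_{22})B_{11},\\
M_3&=A_{11}(B_{12}-B_{22}), & M_4&=A_{22}(B_{21}-B_{11}),\\
M_5&=(A_{11}+A_{12})B_{22}, & M_6&=(A_{21}-A_{11})(B_{11}+B_{12}),\\
M_7&=(A_{12}-A_{22})(B_{21}+B_{22}),
\end{aligned}$$

$$\begin{aligned}
C_{11}&=M_1+M_4-M_5+M_7, & C_{12}&=M_3+M_5,\\
C_{21}&=M_2+M_4, & C_{22}&=M_1-M_2+M_3+M_6.
\end{aligned}$$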

Thank you!

dg.differential geometry – intuition behind shape optimization using Hadamard’s method

I’m trying to understand the intuition behind shape optimization using Hadamard’s method. Please consider the following simple example:

Let $\lambda$ denote the Lebesgue measure on $\mathcal B(\mathbb R)$, $d\in\mathbb N$, $U\subseteq\mathbb R^d$ be open, $$\mathcal F(\Omega):=\lambda^{\otimes d}(\Omega)\;\;\;\text{for }\Omega\in\mathcal B(U)$$ and $$\mathcal G(\Omega):=\sigma_{\partial\Omega}(\partial\Omega)\;\;\;\text{for }\Omega\in\mathcal A,$$ where $\sigma_{\partial\Omega}$ denotes the surface measure on $\mathcal B(\partial\Omega)$ for $\Omega\in\mathcal A$ and $$\mathcal A:=\{\Omega\subseteq\mathbb R^d:\Omega\text{ is bounded and open},\overline\Omega\subseteq U\text{ and }\partial\Omega\text{ is of class }C^1\}.$$

Consider the simple problems of minimizing $\mathcal F$ and $\mathcal G$ over their respective domains.

If I got it right, the problem is that there is no canonical notion of a “derivative” of $\mathcal F$ or $\mathcal G$. So the idea is to take a look at what happens to those functionals when their argument is subject to a small perturbation. The question is what we can infer from that, but let me be precise:

Let $\tau>0$ and $T_t$ be a $C^1$-diffeomorphism from $U$ onto $U$ for $t\in(0,\tau)$ with $T_0=\operatorname{id}_U$. Assume $$(0,\tau)\ni t\mapsto T_t(x)\tag1$$ is continuously differentiable for all $x\in U$. Moreover, assume that $$(0,\tau)\times U\ni(t,x)\mapsto{\rm D}T_t(x)\tag2$$ is continuous in the first argument uniformly with respect to the second, so that we may assume that $$\det{\rm D}T_t(x)>0\;\;\;\text{for all }(t,x)\in(0,\tau)\times U\tag3$$ by taking $\tau$ sufficiently small. (I guess, intuitively, this ensures that the transformations are orientation-preserving; maybe someone can comment on this.) Let $$v_t(x):=\frac\partial{\partial t}T_t(T_t^{-1}(x))\;\;\;\text{for }(t,x)\in(0,\tau)\times U.$$

Now let $\Omega^{(1)}\in\mathcal B(U)$, $\Omega^{(2)}\in\mathcal A$ and $$\Omega^{(i)}_t:=T_t\left(\Omega^{(i)}\right)\;\;\;\text{for }t\in(0,\tau).$$ We can show that $$\frac{\mathcal F\left(\Omega^{(1)}_t\right)-\mathcal F\left(\Omega^{(1)}_0\right)}t\xrightarrow{t\to0+}\int_{\Omega^{(1)}}\nabla\cdot v_0\:{\rm d}\lambda^{\otimes d}\tag4$$ and if $\Omega^{(1)}\in\mathcal A$, then the right-hand side is equal to $$\int_{\Omega^{(1)}}\nabla\cdot v_0\:{\rm d}\lambda^{\otimes d}=\int\langle v_0,\nu_{\partial\Omega^{(1)}}\rangle\:{\rm d}\sigma_{\partial\Omega^{(1)}}\tag5,$$ where $\nu_{\partial\Omega^{(1)}}$ denotes the unit outer normal field on $\partial\Omega^{(1)}$. Moreover, $$\frac{\mathcal G\left(\Omega^{(2)}_t\right)-\mathcal G\left(\Omega^{(2)}_0\right)}t\xrightarrow{t\to0+}\int\nabla_{\partial\Omega^{(2)}}\cdot v_0\:{\rm d}\sigma_{\partial\Omega^{(2)}}\tag6,$$ where $\nabla_{\partial\Omega^{(2)}}\cdot v_0$ denotes the tangential divergence of $v_0$.

The question is: What can we infer from that? How do $(4)$, $(5)$ and $(6)$ help us to find a (local) minimum of $\mathcal F$ or $\mathcal G$?

Here is my guess: I think these results are less helpful for deriving a theoretical (local) minimum, but they help in coming up with a numerical “gradient descent”-like algorithm. It’s trivial to observe that if $\mathcal H$ is any functional on a system $\mathcal B$ of sets contained in $2^U$ and $\Omega_t:=T_t(\Omega_0)\in\mathcal B$ for $t\in(0,\tau)$, then $$\frac{\mathcal H(\Omega_t)-\mathcal H(\Omega_0)}t\xrightarrow{t\to0+}a\tag7$$ for some $a<0$ tells us that $$\mathcal H(\Omega_t)<\mathcal H(\Omega_0)\;\;\;\text{for all }t\in(0,\delta)\tag8$$ for some $\delta>0$.

Now, we know that the family $v$ of “velocity” fields and the family $(T_t)_{t\in(0,\,\tau)}$ of transformations are in one-to-one correspondence. As the results suggest (and please correct me if this is a bad conclusion), the derivatives above depend only on $v_0$, and hence we could choose a family of transformations corresponding to a “time-independent” velocity field.

The results above tell us that we should choose $v_0$ such that the right-hand side of $(4)$, $(5)$ or $(6)$, respectively, is negative.

But is there a “steepest descent” direction? I don’t know whether we can show this, but intuitively, it seems like $v_0=-\nu_{\partial\Omega}$ would yield the “steepest descent” for $\mathcal F$. This not only yields a negative right-hand side of $(5)$, but it is even somehow plausible that we shrink the volume of the shape by squeezing it at every point in the opposite direction of the normals. Can this be made rigorous? It would obviously end in the empty set, which is clearly a global minimum of $\mathcal F$.
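
The closest I can get to making this rigorous myself (with an $L^2(\partial\Omega)$ normalization, which is my own choice) is Cauchy–Schwarz: $$\int\langle v_0,\nu_{\partial\Omega}\rangle\:{\rm d}\sigma_{\partial\Omega}\ge-\left(\int|v_0|^2\:{\rm d}\sigma_{\partial\Omega}\right)^{1/2}\left(\int|\nu_{\partial\Omega}|^2\:{\rm d}\sigma_{\partial\Omega}\right)^{1/2},$$ with equality if and only if $v_0$ is a negative multiple of $\nu_{\partial\Omega}$ ($\sigma_{\partial\Omega}$-a.e.). So among all $v_0$ with a fixed $L^2(\partial\Omega)$-norm, the right-hand side of $(5)$ is most negative exactly when $v_0\propto-\nu_{\partial\Omega}$.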

What is the intuition behind the outer product of two eigenvectors?

I know that the outer products of every pair of eigenvectors form a basis for (2-D) matrices. For example, when we write a matrix in terms of its eigenvectors, we have:

$X = \sum_{i,j} \lambda_{i,j}u_iu_j^T$

where $\lambda_{i,j}$ is equal to zero when $i\neq j$ and is the eigenvalue otherwise. But what is the intuition behind this basis? Why, in the eigendecomposition, are the coefficients of the cross eigenvectors zero?
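
To be concrete, if $X$ is symmetric so that its eigenvectors $u_1,\dots,u_n$ can be chosen orthonormal, the coefficients can be read off directly: $$\lambda_{i,j}=u_i^TXu_j=u_i^T(\lambda_ju_j)=\lambda_j\,u_i^Tu_j=\begin{cases}\lambda_j&\text{if }i=j,\\0&\text{if }i\neq j,\end{cases}$$ so the cross coefficients vanish because distinct eigenvectors are orthogonal. I can verify this computation; it is the geometric picture behind it that I am asking about.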

homotopy theory – Intuition for categorical fibrations?

I think I have a pretty good intuitive understanding of most types of fibrations of quasicategories:

  • a (trivial) Kan fibration is a bundle of (contractible) spaces with equivalent fibers,
  • a left/right fibration is a bundle of spaces with covariant/contravariant functors between fibers,
  • a (co)Cartesian fibration is the same as left/right but now the fibers are $\infty$-categories,
  • an inner fibration is a bundle of $\infty$-categories with correspondences between fibers.

One major exception is the class of categorical fibrations. I know they are the fibrations in the Joyal model structure on sSet, but that description isn’t very illuminating to me. I feel this is problematic since categorical fibrations are central to the theory of $\infty$-operads, which I am trying to learn at the moment.

What would be the best way to describe categorical fibrations in a similarly intuitive way?

linear algebra – Geometric Intuition of the Dot Product

First of all, sorry for my poor English and thanks for your time.

I’m having problems understanding the intuition behind the dot product.

I know how to calculate the dot product using both the algebraic and the geometric definitions, and I understand why they are the same thanks to the Law of Cosines:

Algebraically: $u \cdot v = u_xv_x + u_yv_y$

Geometrically: $u \cdot v = |u|\,|v| \cos \theta$
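
To check numerically that the two definitions agree, here is a tiny self-contained program (the vectors $u=(1,1)$, $v=(1,0)$ and their $45°$ angle are my own choice of example):

#include <cmath>
#include <iostream>

int main() {
    // u = (1, 1) and v = (1, 0); the angle between them is 45 degrees.
    double ux = 1, uy = 1, vx = 1, vy = 0;
    double algebraic = ux * vx + uy * vy;              // u_x v_x + u_y v_y = 1
    double pi = std::acos(-1.0);
    double geometric = std::hypot(ux, uy) * std::hypot(vx, vy)
                     * std::cos(pi / 4);               // |u||v|cos(45 deg) = sqrt(2) * 1 * (sqrt(2)/2) = 1
    std::cout << algebraic << " vs " << geometric << '\n';  // both print 1
}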

But when I read definitions like “The dot product tells you what amount of one vector goes in the direction of another”, I get confused.

I barely understand the physics intuition of an object pulled by some force vector along some displacement vector in a different direction, with the result of the dot product being the amount of work.

But I don’t quite understand the geometrical intuition.

[figure: the projection of vector $A$ onto vector $B$, with angle $\theta$ between them]

The result of the dot product is the length of the projected vector ($|A| \cos \theta$) multiplied by the length of the vector $B$ ($|B|$).

When you calculate the dot product with at least one unit vector, the result makes sense because it is the length of the projected vector (it has been multiplied by the length of the unit vector, which is 1), something that you can see and identify in space.

But when you calculate the dot product of two non-normalized vectors, the resulting scalar is something much bigger than either vector’s length, and I don’t understand what it represents.

Can you help me understand the dot product intuition in a geometric way?

algorithms – Intuition behind the entire concept of Fibonacci Heap operations

The following excerpts are from the section on Fibonacci heaps in the text Introduction to Algorithms by Cormen et al.


The potential function for a Fibonacci heap $H$ is defined as follows:

$$\Phi(H)=t(H)+2m(H)$$

where $t(H)$ is the number of trees in the root list of the heap $H$ and $m(H)$ is the number of marked nodes in the heap.

Before diving into the Fibonacci heap operations, the authors try to convince us of the essence of Fibonacci heaps as follows:

The key idea in the mergeable-heap operations on Fibonacci heaps is to delay work as long as possible. There is a performance trade-off among implementations of the various operations. ($\color{green}{\text{I do not get why}}$) If the number of trees in a Fibonacci heap is small, then during an $\text{Extract-Min}$ operation we can quickly determine which of the remaining nodes becomes the new minimum node ($\color{blue}{\text{why?}}$). However, as we saw with binomial heaps, we pay a price for ensuring that the number of trees is small: it can take up to $\Omega(\lg n)$ time to insert a node into a binomial heap or to unite two binomial heaps. As we shall see, we do not attempt to consolidate trees in a Fibonacci heap when we insert a new node or unite two heaps. We save the consolidation for the $\text{Extract-Min}$ operation, which is when we really need to find the new minimum node.


Now the problem I am facing with the text is that the authors dive into proving the amortized costs mathematically using the potential method, without giving a vivid intuition of how or when the “credits” are stored as potential in the heap data structure and when that potential is actually used up. Moreover, in most places what is used is “asymptotic” analysis instead of actual mathematical calculation, so it is not quite possible to tell whether the constant in $O(1)$ for the amortized cost ($\widehat{c_i}$) is greater or less than the constant in $O(1)$ for the actual cost ($c_i$) of an operation.


$$\begin{array}{|c|c|c|c|c|c|c|}
\hline
\text{Sl no.}&\text{Operation}&\widehat{c_i}&c_i&\text{Method of cal. of $\widehat{c_i}$}&\text{Cal. steps}&\text{Intuition}\\
\hline
1&\text{Make-Fib-Heap}&O(1)&O(1)&\text{Asymptotic}&\Delta\Phi=0\text{ ; }\widehat{c_i}=c_i=O(1)&\text{None}\\
\hline
2&\text{Fib-Heap-Insert}&O(1)&O(1)&\text{Asymptotic}&\Delta\Phi=1\text{ ; }\widehat{c_i}=c_i=O(1)+1=O(1)&\text{None}\\
\hline
3&\text{Fib-Heap-Min}&O(1)&O(1)&\text{Asymptotic}&\Delta\Phi=0\text{ ; }\widehat{c_i}=c_i=O(1)&\text{None}\\
\hline
4&\text{Fib-Heap-Union}&O(1)&O(1)&\text{Asymptotic}&\Delta\Phi=0\text{ ; }\widehat{c_i}=c_i=O(1)&\text{None}\\
\hline
5&\text{Fib-Extract-Min}&O(D(n))&O(D(n)+t(n))&\text{Asymptotic}&\Delta\Phi=D(n)-t(n)+1&\dagger\\
\hline
6&\text{Fib-Heap-Decrease-Key}&O(1)&O(c)&\text{Asymptotic}&\Delta\Phi=4-c&\ddagger\\
\hline
\end{array}$$


$\dagger$ – The cost of performing each link is paid for by the reduction in potential due to the link’s reducing the number of roots by one.

$\ddagger$ – This explains why the potential function was defined to include a term that is twice the number of marked nodes: when a marked node $y$ is cut by a cascading cut, its mark bit is cleared, so the potential is reduced by $2$. One unit of potential pays for the cut and the clearing of the mark bit, and the other unit compensates for the unit increase in potential due to node $y$ becoming a root.
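
For instance, expanding row 6 of the table with the definition $\widehat{c_i}=c_i+\Phi(H_i)-\Phi(H_{i-1})$ (where $c$ is the number of cascading cuts) gives $$\widehat{c_i}=O(c)+(4-c),$$ and since we may scale the unit of potential to dominate the constant hidden in $O(c)$, the $-c$ term cancels the growth of the actual cost, leaving $\widehat{c_i}=O(1)$. The same cancellation happens in row 5: $\Delta\Phi=D(n)-t(n)+1$ absorbs the $t(n)$ part of the actual cost $O(D(n)+t(n))$, leaving $\widehat{c_i}=O(D(n))$.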


Note: This is quite a difficult question in the sense that it involves describing the problem I am facing in understanding the intuition behind the concept of Fibonacci heap operations, which in fact relates to an entire chapter of the CLRS text. If it demands too much in a single question, then please tell me and I shall split it accordingly into parts. I have made my utmost attempt to make the question clear. If at places the meaning is not clear, then please tell me and I shall rectify it. The entire corresponding portion of the text can be found here. (Even the authors say that it is a difficult data structure, having only theoretical importance.)

Intuition for weighted average. Why $\dfrac{w_1}{w_1 + w_2}x_1 + \dfrac{w_2}{w_1 + w_2}x_2 = \dfrac{\sum w_ix_i}{\sum w_i}$?

I know $\dfrac{w_1}{w_1 + w_2}x_1 + \dfrac{w_2}{w_1 + w_2}x_2 = \dfrac{\sum w_ix_i}{\sum w_i}$, because $\sum w_i$ is the common denominator. I’m not asking about this algebra.
It’s intuitive that $\dfrac{w_i}{w_1 + w_2}$ weighs $x_i$.

I don’t understand why $\dfrac{\sum w_ix_i}{\sum w_i}$ is intuitively a weighted average. You are summing $w_ix_i$ and $w_i$ separately. Thus

  1. you’ve lost information, because the weight for $x_i$ doesn’t appear.

  2. you’re muddling each $w_ix_i$. If you $\sum w_ix_i$ and $\sum w_i$ separately, you end up with just totals. You’ve muddled the weight of each $x_i$. It’s not intuitive how these totals inform us about the weights.

Can this be explained with a picture?
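
A tiny numeric example (numbers of my own choosing): with $w_1=1$, $w_2=3$, $x_1=10$, $x_2=20$, $$\frac{\sum w_ix_i}{\sum w_i}=\frac{1\cdot10+3\cdot20}{1+3}=\frac{70}{4}=17.5=\frac14\cdot10+\frac34\cdot20,$$ so dividing the total $\sum w_ix_i$ by the total $\sum w_i$ gives the same result as weighting each $x_i$ by its fraction $\dfrac{w_i}{\sum w_i}$ up front.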


performance – What’s the intuition behind calling macOS Big Sur macOS v11?

Maybe a unified architecture makes sense here, as in recent MacBooks Apple has been using T2 chips for the secure enclave, high-speed storage controllers for the SSDs, etc., which, if I am not mistaken, are based on ARM. Combining all these functional features under a single unified umbrella certainly helps with the power-versus-performance improvements, as everything can be written using those low-level instruction sets on top of Metal for greater visual fidelity.


dg.differential geometry – How do I join the dots between the formal definition of the exterior derivative and the intuition?

I am self-studying differential forms, and I’d like to understand the proof of Stokes’ theorem.

I have read through the document here by Dan Piponi and really liked his explanation of the exterior derivative as finding the boundary in the picture (section 4, page 6), but I have a hard time connecting this with the formal definition of the exterior derivative.

Reading through the comments on his answer here, I am not sure whether the concept of “finding the boundary in the picture” applies to just $x\,dy$ or to arbitrary differential forms.
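
For the specific form, the formal definition is easy to evaluate: $$d(x\,dy)=dx\wedge dy,$$ the standard area element, and for a general 1-form on $\mathbb R^2$ it gives $$d(f\,dx+g\,dy)=\left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right)dx\wedge dy,$$ so the definition itself applies to any differential form, not just $x\,dy$; it is the boundary picture that I am unsure how to transfer.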