## pr.probability – Change variables in Gaussian integral over subspace \$S\$

I have been thinking about a problem and I have an intuition about it, but I don't seem to know how to address it properly in mathematical terms, so I'm sharing it here hoping to get help. Suppose I have two $$n\times n$$ real matrices $$C$$ and $$M$$ and consider the Gaussian integral:
$$I = N\int e^{-\frac{1}{2}\langle x, C^{-1} x\rangle} e^{\langle x, M x\rangle}\,dx,$$
where $$N$$ is a normalizing constant and I'm writing
$$\langle x, A x \rangle = \sum_{i,j}x_{i}A_{ij}x_{j}$$
for the inner product of $$x$$ and $$Ax$$ on $$\mathbb{R}^{n}$$. Here $$C$$ is the covariance of the Gaussian measure; moreover, suppose $$M$$ is not invertible and has $$1 \le k < n$$ linearly independent eigenvectors associated to the eigenvalue $$\lambda = 0$$. All other eigenvectors of $$M$$ are also linearly independent, but associated to distinct nonzero eigenvalues.

This is my problem: I'd like to know how the formula for the Gaussian integral $$I$$ changes if I integrate only over the subspace $$S$$ spanned by the eigenvectors $$v_{1},\ldots,v_{k}$$ associated to $$\lambda = 0$$. Intuitively, this integral shouldn't have the $$e^{\langle x, Mx \rangle}$$ factor, because $$M\equiv 0$$ on this subspace. In addition, since $$S$$ is a $$k$$-dimensional subspace, I'd expect this integral to become some sort of Gaussian integral over a new variable $$y$$ which now has $$k$$ entries.

I would like to know if my intuition is correct and whether it is possible to write this new integral over $$S$$ explicitly, which I was not able to do by myself. Thanks!
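For what it's worth, here is a sketch of how the restricted integral might be written, assuming the eigenvectors $$v_1,\ldots,v_k$$ are chosen orthonormal and $$P$$ denotes the $$n\times k$$ matrix with these columns (both my own conventions, not from the question):

```latex
% Parametrize S by y in R^k via x = P y, where P = (v_1 | ... | v_k).
% Since M v_i = 0 for each i, the term <Py, M Py> vanishes identically on S,
% and only the Gaussian factor survives:
\[
  I_S \;=\; N_S \int_{\mathbb{R}^k}
    e^{-\frac{1}{2}\,\langle y,\; P^{\mathsf T} C^{-1} P\, y\rangle}\, dy,
  \qquad
  P = \begin{pmatrix} v_1 & \cdots & v_k \end{pmatrix},
\]
% i.e. a k-dimensional Gaussian integral whose inverse covariance is the
% restriction P^T C^{-1} P of C^{-1} to S, with a new normalizing constant N_S.
```

This matches the intuition in the question: the $$e^{\langle x, Mx\rangle}$$ factor disappears on $$S$$, and what remains is a $$k$$-dimensional Gaussian integral in $$y$$.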

## pr.probability – Contiguity of uniform random regular graphs and uniform random regular graphs which have a perfect matching

Let us consider $$\mathcal{G}_{n,d}$$, the uniform probability space of $$d$$-regular graphs
on the $$n$$ vertices $$\{1, \ldots, n\}$$ (where $$dn$$ is even). We say that an event $$H_{n}$$ occurs a.a.s. (asymptotically almost surely) if $$\mathbf{P}_{\mathcal{G}}(H_{n}) \longrightarrow 1$$ as $$n \longrightarrow \infty$$.

Also, suppose $$(\mathcal{G}_{n})_{n \geq 1}$$ and $$(\hat{\mathcal{G}}_{n})_{n \geq 1}$$ are two sequences of probability spaces such that $$\mathcal{G}_{n}$$ and $$\hat{\mathcal{G}}_{n}$$ differ only in their probabilities. We say that these sequences are contiguous if a sequence of events $$A_{n}$$ is a.a.s. true in $$\mathcal{G}_{n}$$ if and only if it is a.a.s. true in $$\hat{\mathcal{G}}_{n}$$, in which case we write
$$\mathcal{G}_{n} \approx \hat{\mathcal{G}}_{n}.$$

Theorem (Bollobás). For any fixed $$d \geq 3$$, $$G_{n} \in \mathcal{G}_{n,d}$$ a.a.s. has a perfect matching.

Using $$\mathcal{G}^p_{n,d}$$ to denote the uniform probability space of $$d$$-regular graphs on the $$n$$ vertices $$\{1, \ldots, n\}$$ which have a perfect matching, is it correct to conclude from the above theorem that $$\mathcal{G}_{n,d} \approx \mathcal{G}^p_{n,d}$$?

## pr.probability – Induction arising in proof of Berry–Esseen theorem

I've been studying the following paper by Bolthausen, which proves the Berry–Esseen theorem using Stein's method:

Let $$\gamma$$ be the absolute third moment of a random variable $$X$$, and let $$X_{i}$$ be i.i.d. with the same law as $$X$$. Let $$S_{n}=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_{i}$$, and suppose $$E(X)=0$$, $$E(X^{2})=1$$.

The goal is to find some universal constant $$C$$ such that $$|P(S_{n} \leq x) - P(Y\leq x)| \leq C\frac{\gamma}{\sqrt{n}}$$ for all $$x$$, where $$Y$$ is a standard normal random variable.

Let $$\delta(n,\gamma) = \sup_{x}|P(S_{n} \leq x) - P(Y\leq x)|$$. We would like to bound $$\sup_{n}\frac{\sqrt{n}}{\gamma}\delta(n, \gamma)$$.

In the proof, the following bound is derived:

$$\delta(n,\gamma) \leq c\frac{\gamma}{\sqrt{n}}+\frac{1}{2}\delta(n-1,\gamma),$$ where $$c$$ is a universal constant. Noting that $$\delta(1,\gamma) \leq 1$$, the author claims that the result follows. However, when I try to use induction to get the result, the constant $$C$$ increases without bound as $$n$$ grows. If anyone has studied this paper before, I would love to hear from you.
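One way to see that the constant stays bounded (a sketch of my own, not necessarily Bolthausen's exact argument) is to unroll the recursion instead of inducting with a fixed $$C$$:

```latex
% Iterating \delta(n,\gamma) \le c\gamma/\sqrt{n} + \frac12 \delta(n-1,\gamma)
% down to n = 1 gives
\[
  \delta(n,\gamma) \;\le\;
    c\gamma \sum_{j=0}^{n-2} \frac{2^{-j}}{\sqrt{n-j}}
    \;+\; 2^{-(n-1)}\,\delta(1,\gamma).
\]
% For j \le n/2 one has 1/\sqrt{n-j} \le \sqrt{2}/\sqrt{n}, while the terms with
% j > n/2 sum to at most 2^{-n/2} = O(1/\sqrt{n}). Using \gamma \ge E|X|^3 \ge
% (E X^2)^{3/2} = 1 to absorb the exponentially small remainder, this yields
\[
  \delta(n,\gamma) \;\le\; C'\,\frac{\gamma}{\sqrt{n}}
\]
% for a universal constant C'.
```

The point is that the geometric factor $$2^{-j}$$ makes the sum concentrate on the last few steps, where $$\sqrt{n-j} \asymp \sqrt{n}$$, so no constant blow-up occurs.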

## pr.probability – Ratios and proportion


## pr.probability – To prove a relation involving a probability distribution

I'm reading a book and have encountered a relation which seems to me impossible to prove, and I would like to be sure whether this is the case. The author gives a probability function as
$$p_n = \frac{e^{-c_1 n - c_2/n}}{Z},$$
where $$c_1$$ and $$c_2$$ are constants, $$Z$$ is a normalization factor, and $$n \geq 3$$. Then, defining $$\alpha = \sum_{n = 3}^{\infty} p_n (n - 6)^2$$, the author claims one can show that

$$\alpha + p_6 = 1, \qquad 0.66 < p_6 < 1,$$
$$\alpha p_6^2 = \frac{1}{2\pi}, \qquad 0.34 < p_6 < 0.66.$$

How is such a thing possible in the first place, when the claimed relations make no reference to $$c_1$$ and $$c_2$$?
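A quick way to probe the claim is to evaluate $$\alpha$$ and $$p_6$$ numerically for sample values of $$c_1$$ and $$c_2$$ (the function name and truncation cutoff below are my own, not the book's):

```python
import math

def alpha_and_p6(c1, c2, n_max=100_000):
    """Compute alpha = sum_{n>=3} p_n (n-6)^2 and p_6 for
    p_n proportional to exp(-c1*n - c2/n), truncating the sum at n_max."""
    weights = [math.exp(-c1 * n - c2 / n) for n in range(3, n_max + 1)]
    Z = sum(weights)
    p = [w / Z for w in weights]          # p[0] corresponds to n = 3
    alpha = sum(p[n - 3] * (n - 6) ** 2 for n in range(3, n_max + 1))
    return alpha, p[6 - 3]

# The combinations alpha + p_6 and alpha * p_6**2 vary with (c1, c2),
# which is exactly what makes parameter-free identities look suspicious.
for c1, c2 in [(0.5, 1.0), (1.0, 0.5)]:
    a, p6 = alpha_and_p6(c1, c2)
    print(c1, c2, a + p6, a * p6 ** 2)
```

If the book's relations held for all $$c_1, c_2$$, these printed values would have to be constant across parameter choices; they are not, so presumably the relations are meant only for specific parameter values or in some limiting regime.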

## pr.probability – inverse of moment-generating function in terms of moments

Let $$\{h_i\}$$ be a decreasing sequence of $$n$$ positive reals. Define the distribution $$p(X=h_i)\propto h_i$$ and let $$g(s)=E_X(e^{sX})$$ be the moment generating function. For instance, for $$h=\{1,\frac{1}{4},\frac{1}{9}\}$$, the unnormalized distribution over $$X$$ places weight $$h_i$$ on the value $$h_i$$.

For a given $$\epsilon$$, I need to find the smallest $$s$$ such that
$$g(-2s) \le \epsilon.$$

Is it possible to approximate or bound $$s$$ by relying only on the moments $$E(X^i)$$ for $$i=1,2,\ldots,k$$ and small $$k$$?

Background

$$g(-2s)/n$$ gives a remarkably good fit to the average-case loss decrease after $$s$$ steps when minimizing a quadratic with eigenvalues $$h_1,h_2,\ldots$$ by gradient descent with learning rate 1. Is this a known fact?

Solving the equation above in terms of moments would give a practical way to estimate how many more steps are needed to achieve an $$\epsilon$$ reduction in loss. In practice, $$n\approx 10^9$$, $$s\approx 10^9$$, $$h_1=1$$, $$\epsilon \approx 10^{-3}$$, and the $$h_i$$ probably decay faster than $$\frac{1}{i}$$.
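The connection can be sanity-checked numerically. After $$s$$ gradient-descent steps with learning rate 1 on $$f(x)=\tfrac12\sum_i h_i x_i^2$$, coordinate $$i$$ contracts by $$(1-h_i)^s$$, so for an isotropic random start the expected loss ratio is $$\sum_i h_i (1-h_i)^{2s} / \sum_i h_i$$, and the approximation $$(1-h)^{2s}\approx e^{-2sh}$$ turns this into $$E[e^{-2sX}] = g(-2s)$$ under $$p(X=h_i)\propto h_i$$. A small sketch (normalizing by $$\sum_i h_i$$ rather than $$n$$, so this checks the approximation rather than the exact constant in the question):

```python
import math

def gd_loss_ratio(h, s):
    """Expected loss after s GD steps (learning rate 1) on f(x) = 0.5*sum(h_i*x_i^2),
    relative to the expected initial loss, for an isotropic random start."""
    return sum(hi * (1 - hi) ** (2 * s) for hi in h) / sum(h)

def mgf_value(h, s):
    """g(-2s) = E[exp(-2sX)] for the distribution p(X = h_i) proportional to h_i."""
    return sum(hi * math.exp(-2 * s * hi) for hi in h) / sum(h)

h = [1.0 / i**2 for i in range(1, 101)]   # example spectrum with h_1 = 1
print(gd_loss_ratio(h, 10), mgf_value(h, 10))
```

For this example spectrum the two quantities agree to a few decimal places, which is consistent with the "remarkably good fit" observed in the question; the gap is of order $$s h_i^2$$ per component, so it is smallest when the $$h_i$$ decay quickly.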

## pr.probability – Summation of All subsets of Laplacian Sampled Variables

Say I have a set of random variables $$L = (L_1, \ldots, L_N)$$, where $$L_i \sim \mathrm{Laplace}(0, b)$$ independently. Now I have two questions: what will be the distribution of

1. the sum of all of $$L$$: $$\sum_{i=1}^{N} L_i$$;
2. the alternating sum over subsets: $$\sum_{k=1}^{N} (-1)^{k+1}\sum_{1 \leq i_1 < \cdots < i_k \leq N} \;\sum_{j \in \{i_1, \ldots, i_k\}} L_j$$?

I am specifically interested in what these summations look like as $$N$$ approaches infinity, or just for any really large $$N$$.
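For question 1, a quick Monte Carlo sketch (parameter choices are my own) illustrates the CLT behavior: a sum of $$N$$ i.i.d. $$\mathrm{Laplace}(0,b)$$ variables has mean $$0$$ and variance $$2b^2N$$, and is approximately normal for large $$N$$:

```python
import random
import statistics

def laplace(b, rng):
    # Laplace(0, b) sampled as the difference of two independent
    # exponentials with mean b.
    return rng.expovariate(1 / b) - rng.expovariate(1 / b)

def sum_of_laplace(N, b, rng):
    return sum(laplace(b, rng) for _ in range(N))

rng = random.Random(0)
b, N, trials = 1.0, 400, 2000
sums = [sum_of_laplace(N, b, rng) for _ in range(trials)]
# Theory: Var = 2 * b**2 * N = 800; the sample variance should be close.
print(statistics.mean(sums), statistics.pvariance(sums))
```

The sample mean hovers near 0 and the sample variance near $$2b^2N$$, as the CLT predicts; the second (alternating, subset-indexed) sum is a different story, since its terms are heavily dependent.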

## pr.probability – Upper bound of Wasserstein distance given by subvariables of codim 1

Recently I have been considering upper bounds on the Wasserstein distance. Say we have random vectors $$X,Y$$ of dimension $$n$$, and let $$\tilde{X}_i$$ (resp. $$\tilde{Y}_i$$) be the $$(n-1)$$-dimensional random vector obtained from $$X$$ (resp. $$Y$$) by discarding the $$i$$-th component. For example, if $$n=3$$ and $$X=(X_1,X_2,X_3)$$, then $$\tilde{X}_2=(X_1,X_3)$$.

My question is: can we formulate an inequality of the form $$W_p(X,Y) \leq \sum\limits_{i=1}^{n} a_i W_p(\tilde{X}_i,\tilde{Y}_i)$$? I know that a similar inequality can be formulated using $$1$$-dimensional marginals, hence I believe such an inequality should hold for these $$(n-1)$$-dimensional subvariables.

## pr.probability – Conditions ensuring that conditional law of a process belongs to a given exponential family

Let $$(X_t,Y_t)_{t\geq 0}$$ be a pair of $$\mathbb{R}^n$$- (resp. $$\mathbb{R}^m$$-)valued stochastic processes on a filtered probability space $$(\Omega,\mathcal{F},(\mathcal{F}_t)_{t\geq 0},\mathbb{P})$$, defined through the SDEs:
$$\begin{aligned} dX_t & = \mu(t,X_t,Y_t)\,dt + \sigma(t,X_t,Y_t)\,dW_t^1,\\ dY_t & = m(t,X_t,Y_t)\,dt + s(t,X_t,Y_t)\,dW_t^2, \end{aligned}$$
where $$\mu,\sigma,m,s$$ are all suitably uniformly Lipschitz functions and $$(W^i_t)_{t\geq 0}$$ are independent Brownian motions of dimensions compatible with the above dynamics.

In Chapter 11 of Liptser and Shiryaev's book, conditions on $$\mu,\sigma,m,s$$ and the initial laws of $$X_0$$ and $$Y_0$$ are given ensuring that the conditional probabilities
$$\mathbb{P}\left(X_t \in \cdot\,\middle|\,\sigma(\{Y_s\}_{0\leq s\leq t})\right)$$
are all $$\mathbb{P}$$-a.s. Gaussian measures on $$\mathbb{R}^n$$.

Fix an exponential family of probability measures $$\mathcal{F}$$ on $$\mathbb{R}^n$$. Are there known conditions in the literature ensuring that
$$\mathbb{P}\left(X_t \in \cdot\,\middle|\,\sigma(\{Y_s\}_{0\leq s\leq t})\right)$$
belongs to $$\mathcal{F}$$ for every $$t \geq 0$$, $$\mathbb{P}$$-a.s.?

## pr.probability – A monotonicity formula for the stochastic integral with respect to Brownian motion

Suppose $$f, g: \mathbb{R} \to \mathbb{R}$$ are continuous, nonnegative functions with $$f \leq g$$.

Fix some $$T > 0$$, and denote by $$X^f$$ the stochastic integral $$\int_{(0, T)} f(s)\, dW_s$$, where $$W_s$$ is a standard Brownian motion. Similarly, write $$X^g$$ for the corresponding integral of $$g$$.

Write $$F^f$$ for the cumulative distribution function of $$|X^f|$$, that is $$F^f (x) := P(|X^f| < x)$$. Similarly write $$F^g$$ for the corresponding cumulative distribution function of $$|X^g|$$.

Question: Is it true that $$F^f \leq F^g$$?
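One reduction worth noting (a sketch, using only the Itô isometry): since $$f$$ and $$g$$ are deterministic, $$X^f$$ and $$X^g$$ are centered Gaussians, so the question is really about comparing two Gaussian variances.

```latex
% By the Ito isometry, X^f ~ N(0, \sigma_f^2) with
\[
  \sigma_f^2 = \int_0^T f(s)^2\, ds
  \;\le\;
  \int_0^T g(s)^2\, ds = \sigma_g^2,
\]
% since 0 \le f \le g. Hence, writing \Phi for the standard normal CDF
% (and assuming \sigma_f, \sigma_g > 0),
\[
  F^f(x) = 2\,\Phi\!\left(\frac{x}{\sigma_f}\right) - 1,
  \qquad
  F^g(x) = 2\,\Phi\!\left(\frac{x}{\sigma_g}\right) - 1,
\]
% so the question reduces to how 2\Phi(x/\sigma) - 1 behaves as \sigma increases.
```

In particular, the whole question boils down to the monotonicity of $$\sigma \mapsto 2\Phi(x/\sigma) - 1$$ for fixed $$x > 0$$.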