Page 7 of my computer vision textbook, *Multiple view geometry in computer vision*, says the following:

When applying projective geometry to the imaging process, it is customary to model the world as a 3D projective space, equal to $\mathbb{R}^3$ along with points at infinity. Similarly, the model for the image is the 2D projective plane $\mathbb{P}^2$. Central projection is simply a map from $\mathbb{P}^3$ to $\mathbb{P}^2$. If we consider points in $\mathbb{P}^3$ written in terms of homogeneous coordinates $(\mathrm{X}, \mathrm{Y}, \mathrm{Z}, \mathrm{T})^T$ and let the center of projection be the origin $(0, 0, 0, 1)^T$, then we see that the set of all points $(\mathrm{X}, \mathrm{Y}, \mathrm{Z}, \mathrm{T})^T$ for fixed $\mathrm{X}$, $\mathrm{Y}$ and $\mathrm{Z}$, but varying $\mathrm{T}$, forms a single ray passing through the center of projection, and hence they all map to the same point. Thus, the final coordinate of $(\mathrm{X}, \mathrm{Y}, \mathrm{Z}, \mathrm{T})$ is irrelevant to where the point is imaged. In fact, the image point is the point in $\mathbb{P}^2$ with homogeneous coordinates $(\mathrm{X}, \mathrm{Y}, \mathrm{Z})^T$. Thus, the mapping may be represented by a mapping of 3D homogeneous coordinates, represented by a $3 \times 4$ matrix $\mathrm{P}$ with the block structure $\mathrm{P} = [I_{3 \times 3} \mid \mathbf{0}_3]$, where $I_{3 \times 3}$ is the $3 \times 3$ identity matrix and $\mathbf{0}_3$ a zero 3-vector. Making allowance for a different center of projection and a different projective coordinate frame in the image, it turns out that the most general imaging projection is represented by an arbitrary $3 \times 4$ matrix of rank $3$, acting on the homogeneous coordinates of the point in $\mathbb{P}^3$ and mapping it to the imaged point in $\mathbb{P}^2$. This matrix $\mathrm{P}$ is known as the camera matrix.
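
To check my understanding of the claim that the last coordinate does not affect the image point, here is a minimal sketch in plain Python. It is my own illustration, not from the textbook: the helper names `project` and `same_projective_point` and the sample coordinates are made up for this example, and the camera is assumed to be the canonical one, $[I_{3 \times 3} \mid \mathbf{0}_3]$.

```python
def project(P, X):
    """Apply a 3x4 camera matrix P (a list of rows) to a homogeneous 4-vector X."""
    return [sum(p * x for p, x in zip(row, X)) for row in P]


def same_projective_point(a, b):
    """True if a and b are proportional, i.e. every 2x2 minor of [a; b] vanishes."""
    n = len(a)
    return all(a[i] * b[j] == a[j] * b[i]
               for i in range(n) for j in range(i + 1, n))


# Canonical camera [I | 0]; its projection center is (0, 0, 0, 1)^T.
P = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0]]

# Same X, Y, Z with different T: points on one ray through the center.
images = [project(P, [2, 3, 4, T]) for T in (1, 5, -7)]

# Rescaling the whole homogeneous 4-vector only rescales the image vector.
scaled_image = project(P, [4, 6, 8, 2])
```

Every entry of `images` comes out as `[2, 3, 4]`, and `scaled_image` is `[4, 6, 8]`, which is the same point of $\mathbb{P}^2$ because homogeneous coordinates are only defined up to scale.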

In summary, the action of a projective camera on a point in space may be expressed in terms of a linear mapping of homogeneous coordinates as

$$\begin{bmatrix} x \\ y \\ w \end{bmatrix} = \mathrm{P}_{3 \times 4} \begin{bmatrix} \mathrm{X} \\ \mathrm{Y} \\ \mathrm{Z} \\ \mathrm{T} \end{bmatrix}$$

Furthermore, if all the points lie on a plane (we may choose this as the plane $\mathrm{Z} = 0$) then the linear mapping reduces to

$$\begin{bmatrix} x \\ y \\ w \end{bmatrix} = \mathrm{H}_{3 \times 3} \begin{bmatrix} \mathrm{X} \\ \mathrm{Y} \\ \mathrm{T} \end{bmatrix}$$

which is a projective transformation.
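
My tentative reading of this reduction, sketched numerically in plain Python: when $\mathrm{Z} = 0$, the third column of $\mathrm{P}$ is always multiplied by zero, so dropping that column should leave a $3 \times 3$ matrix that gives the same image point. This is an assumption on my part rather than something the passage states explicitly, and the matrix entries and the helper `matvec` below are hypothetical values chosen only for illustration.

```python
def matvec(M, v):
    """Multiply a matrix M (a list of rows) by a vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# A hypothetical rank-3 camera matrix (entries chosen only for illustration).
P = [[2, 0, 1, 3],
     [0, 2, 1, 4],
     [0, 0, 1, 5]]

# Assumed construction: keep columns 1, 2 and 4 of P, dropping the Z column.
H = [[row[0], row[1], row[3]] for row in P]

# A point on the plane Z = 0, written as (X, Y, 0, T)^T.
X, Y, T = 7, -2, 1
full = matvec(P, [X, Y, 0, T])    # project with the 3x4 camera matrix
planar = matvec(H, [X, Y, T])     # project with the reduced 3x3 matrix
```

Both `full` and `planar` evaluate to `[17, 0, 5]` for this point, which is consistent with the reduced mapping agreeing with the full camera on the plane $\mathrm{Z} = 0$.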

The mentioned section of the textbook is freely available here.

These are my questions:

- Where it says

Thus, the final coordinate of $(\mathrm{X}, \mathrm{Y}, \mathrm{Z}, \mathrm{T})$ is irrelevant to where the point is imaged.

shouldn't this be the vector $(\mathrm{X}, \mathrm{Y}, \mathrm{Z}, \mathrm{T})^T$?

- What is $\mathrm{H}_{3 \times 3}$ supposed to be?

I would really appreciate it if people would take the time to clarify these points.