If sensor resolution numbers increase significantly from 12.1 MP to 50.6 MP, why is the actual difference in horizontal width much less pronounced?

I don’t understand the bolded phrase below from Camera Resolution Explained. Please explain like I’m 10; I have no background in photography or physics.

How is “the actual difference in horizontal width” “much less pronounced”? I also don’t understand what the first collage below (showing 12.1 MP to 50.6 MP) is trying to prove.

In order to yield twice larger prints at the same PPI, you would need to multiply sensor resolution by 4. For example, if you own a D700 and you are wondering what kind of sensor resolution you would need to print 2x larger, you multiply 12.1 MP (sensor resolution) x 4, which translates to a 48.4 MP sensor. So if you were to move up to say the latest Canon 5DS DSLR that has a 50.6 MP sensor, you would get prints a bit larger than 2x in comparison. To understand these differences in resolution, it is best to take a look at the below comparison of different popular sensor resolutions of modern digital cameras from 12.1 MP to 50.6 MP:

Image Resolution Comparison

As you can see, despite the fact that sensor resolution numbers increase significantly when going from something like 12.1 MP to 50.6 MP, the actual difference in horizontal width is much less pronounced. But if you were to look at the total area differences, then the differences are indeed significant – you could take 4 prints from the D700, stack them together and still be short when compared to a 50.6 MP image, as shown below:

12.1 MP vs 50.6 MP Resolution

Keep all this in mind when comparing cameras and thinking about differences in resolution.
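
For reference, the arithmetic in the quoted passage can be sketched in a few lines of Python. This is only an illustration, not something from the article: the helper names `width_px` and `linear_scale` are made up here, and the 3:2 aspect ratio is an assumption (real cameras report slightly different pixel dimensions than this idealized calculation). The point it shows is that megapixels measure area (width times height), so width only grows with the square root of the megapixel count.

```python
import math

def width_px(megapixels, aspect=3 / 2):
    """Approximate image width in pixels for a given megapixel count.

    Assumes width / height == aspect (3:2 here, an assumption),
    so width * (width / aspect) == megapixels * 1,000,000.
    """
    return math.sqrt(megapixels * 1_000_000 * aspect)

def linear_scale(mp_from, mp_to):
    """How many times wider (and taller) the image gets at the same PPI."""
    return math.sqrt(mp_to / mp_from)

# Area (pixel count) ratio versus width ratio for the two sensors quoted above.
area_ratio = 50.6 / 12.1
width_ratio = linear_scale(12.1, 50.6)

print(f"pixel count grows by a factor of {area_ratio:.2f}")
print(f"width grows by a factor of {width_ratio:.2f}")
print(f"approx. widths: {width_px(12.1):.0f} px vs {width_px(50.6):.0f} px")
```

Under these assumptions the pixel count grows by a bit more than 4x while the width grows by only a bit more than 2x, which matches the article's "multiply by 4 to print 2x larger" rule: `linear_scale(12.1, 48.4)` comes out at 2.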