The average distance between pages mostly increased over the first eight or nine years, then decreased through to the present. Layout distance declined the most: 44% between 2012 and 2019.
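To make "average distance between pages" concrete, here is a minimal sketch of how such a number could be computed. It assumes each page has already been reduced to a numeric feature vector and that distance is Euclidean; the actual features and metric may differ. The function names and the toy vectors are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist


def average_pairwise_distance(features):
    """Mean Euclidean distance over all unordered pairs of feature vectors."""
    pairs = list(combinations(features, 2))
    if not pairs:
        raise ValueError("need at least two pages to compare")
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def percent_change(old, new):
    """Relative change from old to new, in percent (negative means a decline)."""
    return 100.0 * (new - old) / old


# Hypothetical toy layout vectors for two snapshots in time (not real data).
pages_earlier = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
pages_later = [(0.0, 0.0), (1.5, 2.0), (3.0, 4.0)]

earlier = average_pairwise_distance(pages_earlier)
later = average_pairwise_distance(pages_later)
change = percent_change(earlier, later)
```

A negative `change` means pages in the later snapshot sit closer together in feature space, which is what a decline in average distance describes.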
Because we used interpretable features, though, we can figure out why. Sites shifted from a variety of colored backgrounds to mostly off-white with colors and images on top. For layout, pages used to have a variety of dense column patterns, but now they use lots of blank space.
We also looked at the source code. While code overlap between pages has gone down, software library overlap has gone up. This pattern occurs first in the websites from technology and consumer services companies, and it correlates with another pattern: mobile support.
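One way to quantify library overlap between two pages is the Jaccard index of the sets of libraries each page loads. This is an assumption made for illustration, and the measure actually used may be different. The library names below are placeholders.

```python
def jaccard(libs_a, libs_b):
    """Jaccard index: shared libraries divided by all libraries used by either page."""
    a, b = set(libs_a), set(libs_b)
    if not a and not b:
        return 0.0  # two pages with no libraries share nothing
    return len(a & b) / len(a | b)


# Hypothetical pages and library names, for illustration only.
page_one = {"lib_alpha", "lib_beta"}
page_two = {"lib_alpha", "lib_gamma"}
overlap = jaccard(page_one, page_two)
```

Here the two pages share one library out of three distinct ones, so `overlap` is one third.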
But this could just be a coincidence. So we did a regression analysis to measure which libraries similar-looking sites tend to have in common. We found that UI libraries often used for responsive web design tend to predict more similar layouts, while other libraries predict less similar layouts.
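The idea behind such a regression can be sketched as follows. This is a deliberately simplified version, not the model described above: it fits one ordinary least squares line per library, regressing the layout distance of a pair of sites on whether both sites use that library. A real analysis would likely fit all libraries jointly and add controls. The data and library names are made up.

```python
def ols_slope(x, y):
    """Slope of the least-squares line y = a + b*x for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor is constant; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def library_effects(pairs, libraries):
    """For each library, estimate how sharing it relates to layout distance.

    pairs: list of (libs_of_site_a, libs_of_site_b, layout_distance).
    A negative slope means sites sharing the library tend to look more alike.
    """
    distances = [d for _, _, d in pairs]
    effects = {}
    for lib in libraries:
        shared = [1.0 if lib in a and lib in b else 0.0 for a, b, _ in pairs]
        effects[lib] = ols_slope(shared, distances)
    return effects


# Hypothetical site pairs (not real data): two library sets plus a layout distance.
pairs = [
    ({"ui_lib"}, {"ui_lib"}, 0.2),
    ({"ui_lib", "other_lib"}, {"ui_lib"}, 0.3),
    ({"other_lib"}, {"other_lib"}, 0.9),
    ({"other_lib"}, {"ui_lib"}, 0.6),
]
effects = library_effects(pairs, ["ui_lib", "other_lib"])
```

With a binary predictor, the slope is simply the difference between the mean distance of pairs that share the library and the mean distance of pairs that do not. In this toy data, sharing `ui_lib` goes with smaller distances and sharing `other_lib` with larger ones.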