1) How can half the traffic on a subset ever represent more than half of the traffic on the superset? The 12% number assumes that these top million sites carry effectively all global traffic; in practice their share can only be smaller than that, so the true figure can only be lower.
2) These top million sites are exactly the sort that are likely to serve static content from a different hostname.
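
The bound in point 1 can be sketched with a few lines of arithmetic. This is only an illustration: it reads the 12% as a share measured within the top million sites, and the function name and the subset shares below are made-up assumptions, not measured values.

```python
def share_of_superset(share_within_subset: float, subset_share: float) -> float:
    """Scale a share measured inside a subset to a share of the whole.

    share_within_subset: fraction measured on the subset (e.g. 0.12).
    subset_share: fraction of all traffic the subset carries (hypothetical).
    """
    if not 0.0 <= share_within_subset <= 1.0:
        raise ValueError("share_within_subset must be between 0 and 1")
    if not 0.0 <= subset_share <= 1.0:
        raise ValueError("subset_share must be between 0 and 1")
    return share_within_subset * subset_share


if __name__ == "__main__":
    # Hypothetical subset shares; 1.0 is the "top million = all traffic" case.
    for subset_share in (1.0, 0.9, 0.5):
        scaled = share_of_superset(0.12, subset_share)
        print(f"subset carries {subset_share:.0%} of traffic -> {scaled:.1%} of the whole")
```

Because the subset can never carry more than 100% of the traffic, the scaled value never exceeds the share measured on the subset.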