Title: Paper notes: Breaking Bad: Quantifying the Addiction of Web Elements to JavaScript
Date: 2023-09-26 17:15

[PDF](https://arxiv.org/pdf/2301.10597.pdf), [local mirror]({static}/files/papers/breaking_bad.pdf)

More or less every conversation involving the [Tor Browser](https://www.torproject.org/download/)
will at some point contain the following line: "No, JavaScript isn't disabled
by default because too many sites would break. You can always crank the
security slider all the way up if you want, though."

We all agree that JavaScript enables all sorts of despicable behaviours, making
the web a nightmare-material privacy/security cesspit and completely
inscrutable to a lot of users, so research that quantifies how to make it a
better place for everyone is always more than welcome.

The main idea of the paper is to load pages from the [Hispar
set](https://hispar.cs.duke.edu/) twice via [Puppeteer](https://pptr.dev),
with and without `javascript.enabled` set, and to perform
human-assisted smart diffing to detect user-perceived/perceivable
breakages.
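The gist of the setup can be sketched with Puppeteer's `page.setJavaScriptEnabled()`. This is only a rough illustration of the idea, not the paper's actual harness: the URL is a placeholder, and the line-overlap "diff" below is a naive stand-in for their far smarter, human-assisted diffing.

```javascript
// Sketch: render the same page with and without JavaScript, then compare
// the visible text. Assumes puppeteer is installed; the require is kept
// inside the function so the helpers below parse without it.
async function fetchVisibleText(url, jsEnabled) {
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.setJavaScriptEnabled(jsEnabled); // the knob the paper toggles
    await page.goto(url, { waitUntil: 'networkidle2' });
    // page.evaluate still runs with page scripts disabled, so we can
    // extract whatever text actually rendered.
    return await page.evaluate(() => document.body.innerText);
  } finally {
    await browser.close();
  }
}

// Naive breakage proxy: what fraction of the no-JS page's text lines
// also appear in the JS-enabled rendering?
function sharedLineRatio(withJs, withoutJs) {
  const jsLines = new Set(
    withJs.split('\n').map((l) => l.trim()).filter(Boolean),
  );
  const lines = withoutJs.split('\n').map((l) => l.trim()).filter(Boolean);
  if (lines.length === 0) return 0;
  return lines.filter((l) => jsLines.has(l)).length / lines.length;
}
```

A ratio close to 1 would suggest the main content survives without JavaScript; in practice the paper's pipeline is much more careful about what counts as user-perceivable breakage.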

The paper is full of fancy graphs and analysis, but the [tldr](https://en.wikipedia.org/wiki/TL;DR) is:

> We discover that 43 % of web pages are not strictly dependent on JavaScript
> and that more than 67 % of pages are likely to be usable as long as the visitor
> only requires the content from the main section of the page, for which the user
> most likely reached the page, while reducing the number of tracking requests by
> 85 % on average.

An interesting takeaway is that the use of JavaScript frameworks is the main
source of breakage, since <s>a lot</s> all of them result in completely
unusable websites when JavaScript is disabled. Moreover, anecdotal data seems
to suggest that the bigger a company is, the more its website is going to
break when JavaScript is disabled.

And like every decent paper, it comes with the [related code and data published](https://gitlab.inria.fr/Spirals/breaking-bad).
