A framework for improving the accessibility of research papers on


What is the paper's main point?

AI: The paper's main point is that scientific research should be more accessible to people with disabilities. It identifies the barriers that researchers with disabilities face and proposes ways to reduce them, chiefly offering accessible HTML alongside the existing PDF and TeX formats.

Extract facts

AI: Facts:
- arXiv's job is to put out proper HTML alongside existing sources.
- 89% of respondents heavily depend on access to research for their professional work.
- Users of assistive technology report they only have access to 38% of the research they need without assistance.
- PDF formatting is the biggest barrier to accessing papers, according to survey respondents.
- 67% of respondents still prefer PDF even if well-formatted HTML is available.
- 73% of respondents would find it useful to build a customizable arXiv feed.
- 58% of respondents would find it useful to have a quick way to get to a paper's conclusions and references.
- arXiv is a platform that hosts research content and has played an important role in open access for over 30 years by removing financial, institutional, and geographic barriers to research.
- More than a quarter of the world’s population has a vision impairment, 20% of people in the United States have dyslexia, and 26% of people in the United States self-report living with at least one disability.
- The vast majority of research papers have low levels of accessibility, creating significant barriers for a large number of people.
- HTML is a next step towards more open science, offering better accessibility through well-formatted design.
- Participants were diverse in terms of their career stage and included PhD students, professors, and researchers working in industry.
- Participants come from multiple fields of research including physics, math, statistics, computer science, legal, and regulatory.
- 7 participants disclosed blindness, 1 participant disclosed dyslexia, 1 participant disclosed ADHD, and 2 participants disclosed a movement disability.
- PDFs have low native accessibility and are challenging for people with reading disabilities such as blindness, low vision, and dyslexia.
- Research from the Allen Institute for AI shows that only 2.4% of PDFs demonstrate full compliance with accessibility criteria.
- PDFs are far behind standard on mobile devices.
- ACM conference has found low accessibility rates of PDF and has chosen HTML5 format as a goal to ultimately make accessibility easier and more standardized.
- Well-formatted HTML will support and empower the many different ways that scientists consume research data.
- None of the existing tools can provide a smooth experience for authors and readers, and arXiv aims to close the gap on all the rest.
- A survey showed that most respondents use arXiv daily and are primarily from Physics, Math, and Computer Science fields.
- Only 25% of the respondents are direct users of assistive technology.
- The most commonly reported assistive technology used is a screen reader.

Extract 10 top keywords

AI:
1. PDF accessibility
2. Reading disabilities
3. Blindness
4. Low vision
5. Dyslexia
6. Screen readers
7. Semantic markup
8. Tagging
9. ASL
10. Education


Shamsi Brinn, Christopher Cameron, David Fielding, Charles Frankston, Alison Fromme, Peter Huang, Mark Nazzaro, Stephanie Orphan, Steinn Sigurdsson, Ryan Tay, Miranda Yang, Qianyu Zhou


The research content hosted by arXiv is not fully accessible to everyone due to disabilities and other barriers. This matters because a significant proportion of people have reading and visual disabilities, it is important to our community that arXiv is as open as possible, and if science is to advance, we need wide and diverse participation. In addition, we have mandates to become accessible, and accessible content benefits everyone. In this paper, we will describe the accessibility problems with research, review current mitigations (and explain why they aren't sufficient), and share the results of our user research with scientists and accessibility experts. Finally, we will present arXiv's proposed next step towards more open science: offering HTML alongside existing PDF and TeX formats. An accessible HTML version of this paper is also available at


1 comment


Dear Shamsi, 

Thank you for sharing your work. I am working on building similar accessibility solutions - including but not limited to this platform - and have a few questions.

  1. Your paper provides insightful perspectives on the accessibility barriers related to the PDF format, which is surprising given its widespread use in the STEM community. Could you please elaborate on the specific user categories who face challenges with this format?

  2. With HTML potentially replacing PDF as the main format for arXiv preprints, accessibility would certainly improve. However, could the traditional browsing experience, often marked by distractions and an unfocused attention span, possibly hinder in-depth research engagement in this format?

  3. An additional concern that wasn't addressed directly in your paper is the overwhelming volume of scientific data available today. Given the rapidly increasing number of submissions to arXiv (which will undoubtedly become worse due to LLMs), how do you foresee your proposed accessibility solutions interacting with this issue of data overflow?
