OCR Limits
Inaccessible PDFs are a barrier for many people.
PDF files are used everywhere. They store reports, forms, policies, brochures and learning materials. They look simple on the surface, but many PDFs hide serious accessibility problems that stop people from using them properly, especially people who rely on assistive technology.
Understanding these issues matters because an inaccessible PDF can block someone from reading important information, completing a task or accessing a service.
What Makes a PDF Inaccessible?
The most common accessibility problem with PDFs is that many are not real text documents. They are often images of text instead.
This usually happens when a paper document is scanned and saved as a PDF. The result looks like a normal document, but to a computer it is just one flat image.
This creates a major barrier for people who rely on assistive technology, including screen readers and braille devices. A screen reader cannot understand the words inside an image unless extra steps are taken. In some cases, the document may appear completely blank to the user.
This is not a rare problem. Recent research into scholarly PDFs found that most fail accessibility standards. Common failures include missing tags, poor reading order and missing alternative text, all of which make the files difficult to use with assistive technology (Kumar and Wang, 2024).
Even when the content is visible on screen, assistive technology cannot access it in a meaningful way because there is no real text structure behind it.
How Assistive Technology Tries to Read These Files
Many people assume assistive technology can simply read any document. The reality is more complex.
Assistive technology cannot “see” images or scanned pages. Instead, it relies on real digital text. When a PDF is an image, other tools are used to try to fill the gap.
OCR, which stands for Optical Character Recognition, is one of the most common tools used.
OCR tries to detect letters and words inside an image and convert them into readable text. Some modern systems also use artificial intelligence to describe what is on the page or guess missing structure.
This can make a document seem accessible at first. Text might become searchable or selectable, and a screen reader might start reading something out loud.
But this is only an interpretation of the original image, not the true document structure.
Why OCR Is Not Reliable
OCR is not perfect because it is based on pattern recognition.
If a scan is blurry, tilted or low quality, OCR may misread the content. Letters can be confused, numbers can be wrong and words can be broken or mixed up.
A simple error like “public” becoming “pub1ic” can change meaning. In important documents like contracts, medical forms or government information, this can create real risk.
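Errors like “pub1ic” can sometimes be caught with a simple automated pass before a person reviews the text. The sketch below is only an illustration in Python: the function name `flag_suspicious_tokens` and the sample sentence are made up for this example, and the rule it applies (a word that mixes letters and digits) is a rough heuristic, not a feature of any particular OCR tool. It will also flag legitimate codes such as “A4”, so a person still needs to check the results.

```python
import re

# Letters and digits mixed inside one token often signal an OCR slip
# (for example "pub1ic"), though legitimate codes such as "A4" also match.
MIXED_TOKEN = re.compile(r"\b(?=\w*[A-Za-z])(?=\w*\d)\w+\b")


def flag_suspicious_tokens(text: str) -> list[str]:
    """Return tokens that contain both letters and digits, in order of appearance."""
    return MIXED_TOKEN.findall(text)


# Hypothetical OCR output used only for illustration.
sample = "The pub1ic notice applies to all 2024 applicants."
print(flag_suspicious_tokens(sample))  # prints ['pub1ic']
```

A check like this only narrows down where to look. It cannot catch errors that produce another valid word or number.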
OCR also struggles with:
- Handwriting
- Complex layouts with columns or tables
- Decorative fonts
- Poor quality scans
- Mathematical symbols or special characters
Even when OCR works, it can still scramble reading order, which makes content confusing for screen reader users.
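To see how reading order can go wrong on a page with columns, here is a small Python sketch. It assumes a simplified model in which each recognised line has a position on the page. The `Line` type, the coordinates and the `column_split` value are all hypothetical, and real OCR engines use more involved layout analysis than this.

```python
from typing import NamedTuple


class Line(NamedTuple):
    x: float  # left edge of the line on the page
    y: float  # distance from the top of the page
    text: str


# Hypothetical OCR output for a two-column page.
lines = [
    Line(50, 100, "Left column, first line."),
    Line(320, 100, "Right column, first line."),
    Line(50, 120, "Left column, second line."),
    Line(320, 120, "Right column, second line."),
]


def naive_order(items):
    """Read strictly top to bottom, then left to right (ignores columns)."""
    return [l.text for l in sorted(items, key=lambda l: (l.y, l.x))]


def column_order(items, column_split: float):
    """Read the whole left column first, then the whole right column."""
    return [l.text for l in sorted(items, key=lambda l: (l.x >= column_split, l.y))]


print(naive_order(lines))        # left and right lines are interleaved
print(column_order(lines, 200))  # each column is read in full
```

The first ordering jumps between columns on every line, which is what a screen reader user would hear. The second only works because the column boundary is known in advance, which is exactly the information a flat scan does not contain.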
Other Common PDF Accessibility Problems
While image-based PDFs are the biggest issue, other accessibility problems also appear often. These include:
- Missing headings that make navigation difficult
- Poor reading order that jumps around the page
- Links that are unclear or not labelled properly
- Images without descriptions
- Low colour contrast that makes text hard to read
These issues are common across many digital documents, not just PDFs. However, when combined with image-based content, they make PDF accessibility even harder.
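Of the problems listed above, colour contrast is one that can be measured directly. WCAG defines a contrast ratio between two colours, and level AA asks for at least 4.5:1 for normal-size text. The Python sketch below follows the WCAG 2.x formula for relative luminance. The function names are chosen for this example, and colours are assumed to be sRGB values from 0 to 255.

```python
def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to its linear value (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb) -> float:
    """Relative luminance of an (r, g, b) colour, from 0 (black) to 1 (white)."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg) -> float:
    """WCAG contrast ratio between two colours, from 1:1 up to 21:1."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa_normal_text(fg, bg) -> bool:
    """True if the pair meets the 4.5:1 minimum for normal-size text."""
    return contrast_ratio(fg, bg) >= 4.5


black, white = (0, 0, 0), (255, 255, 255)
print(round(contrast_ratio(black, white), 2))         # prints 21.0
print(passes_aa_normal_text((119, 119, 119), white))  # mid grey on white
```

A check like this covers text colours that are known. It cannot measure contrast inside a scanned image without first sampling the pixels.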
The Risk of Relying on OCR Alone
A common mistake is assuming that running OCR makes a PDF accessible.
It does not.
OCR only tries to convert images into text. It does not fix structure, navigation, headings or descriptions. It also cannot guarantee accuracy.
This creates risks such as:
- Incorrect information being read out
- Missing or skipped content
- Confusing reading order
- False confidence that the document is accessible
A document may look fine on screen but still fail users who depend on assistive technology.
Is There a Solution to Inaccessible PDFs?
Yes, but the solution is not a quick fix.
The best approach is to build accessibility in from the start instead of repairing documents later.
This means:
- Creating documents in Word or other tools with proper structure
- Using real headings instead of visual formatting only
- Adding alt text to images
- Ensuring clear reading order
- Using accessible fonts and good contrast
- Exporting properly tagged PDFs
A strong solution also involves building digital accessibility into standard operating procedures. When accessibility is part of everyday workflow, it becomes consistent rather than optional.
This includes using shared templates for reports, forms and presentations. Templates ensure that structure, spacing and formatting are already set up correctly. This improves efficiency and productivity because staff do not need to rebuild basic layouts each time.
Clear procedures also play a key role. When organisations have a simple guide that explains how all documents should be created, including accessibility requirements, it removes guesswork. Staff know exactly how to set out content, which reduces errors and improves quality across the board.
Organisations can also strengthen this approach by investing in structured digital accessibility training. This helps staff understand both the practical steps and the reasoning behind accessible design, making it easier to apply consistently in daily work.
To properly support this, organisations should also use a PDF editor that includes an accessibility menu. Tools such as Adobe Acrobat Pro or other advanced PDF editors provide built-in accessibility checkers. These tools allow users to test documents for issues like missing tags, reading order problems and alternative text gaps. Running an accessibility check before publishing a PDF helps catch problems early and helps documents meet standards such as WCAG and PDF/UA. Automated checks cannot judge everything, so a manual review is still needed.
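As a rough idea of what a checker looks for first, the Python sketch below scans the raw bytes of a file for a few markers that tagged PDFs normally declare: a structure tree (`/StructTreeRoot`), mark information (`/MarkInfo`) and a document language (`/Lang`). This is a crude first-pass heuristic written for this article, not how Acrobat or any other checker works. The function name and the sample bytes are hypothetical. Modern PDFs often compress the part of the file that holds these markers, so a missing marker does not prove the file is untagged, and a present marker does not prove the file is accessible.

```python
def first_pass_report(pdf_bytes: bytes) -> dict[str, bool]:
    """Crude byte-level scan for markers that tagged PDFs usually contain.

    Assumption: the markers are stored uncompressed. Files that keep their
    catalog inside a compressed object stream will report False here even
    if they are properly tagged.
    """
    return {
        "is_pdf": pdf_bytes.startswith(b"%PDF-"),
        "mentions_structure_tree": b"/StructTreeRoot" in pdf_bytes,
        "mentions_mark_info": b"/MarkInfo" in pdf_bytes,
        "mentions_language": b"/Lang" in pdf_bytes,
    }


# Hypothetical fragments standing in for real files.
tagged_like = b"%PDF-1.7\n<< /Type /Catalog /MarkInfo << /Marked true >> /StructTreeRoot 5 0 R /Lang (en-AU) >>"
scan_like = b"%PDF-1.4\n<< /Type /Catalog /Pages 2 0 R >>"

print(first_pass_report(tagged_like))
print(first_pass_report(scan_like))
```

To try it on a real file, read the file in binary mode and pass the bytes to the function. Treat the result as a prompt to run a full accessibility check, not as a replacement for one.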
When scanning is unavoidable, OCR can still help, but it must always be checked and corrected manually.
Final Thoughts
PDF accessibility is not just a technical issue. It is a usability issue that affects real people every day.
Image-based PDFs are one of the biggest barriers because they remove real text from the document. OCR can help bridge the gap, but it is not reliable enough to depend on alone.
The most effective solution is simple. Build accessibility into documents from the beginning, support it with clear templates and procedures, and use proper PDF tools to test accessibility before sharing. When accessibility becomes part of everyday practice, everyone benefits from clearer, more usable information.
Talk to us about training for your team on creating accessible documents, including accessible PDFs. We can help you build the skills, templates and processes needed to make accessibility part of everyday practice.