Web Page to PDF via Word?

My PDF Problem

First, let me admit I have a PDF problem. When I see a random Web page with interesting content, I click the "Print to PDF" button so that it can be captured for eternity (just in case the Web page disappears someday).

This isn’t really an accessibility issue unless I start sharing the PDF, specifically with students enrolled in a course I’m teaching. Saving the page makes sense from an archival point of view, but that untagged PDF is now a major accessibility headache, especially when it’s combined with other third-party PDF items like journal articles or reports, which may or (much more likely) may NOT be accessible.

What can an instructor do to mitigate this problem? Here are my recommendations.

Link to the Web Site (If You Can)

They’re More Accessible

Most websites, at least the text part, are more accessible than a PDF printout. Most websites have headings; a lot of PDF files don’t even have that. Major news, academic and even gossip sites have really taken the time to include image alt text, table headers and so forth.

Note that the Print to PDF function destroys any accessibility features that a web team may have implemented…leaving you with an untagged PDF.

They’re More Stable

Websites have also become more stable in recent years. For major web sites like the BBC, content published in the past remains discoverable. Examples include the obituary of Margaret Thatcher (2013) or even the death of the Queen Mother (2002).

But I Need a PDF Backup…

Why a PDF Backup?

There are still times when you need a backup. For instance,

  • Sites move or disappear – although the Internet Archive’s Wayback Machine can retrieve some old URLs.
  • Content gets revised – Wikipedia and other sites may change content depending on user input. If you want to capture content at a specific time, a PDF may help.
  • There’s a subscription/paywall – maybe an instructor has subscribed to a news source, but should students subscribe just to read one article for a specific course? Opinions differ….
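
Before making a backup, it can be worth checking whether the Wayback Machine already has a copy. Here is a minimal Python sketch, assuming the Internet Archive's public availability endpoint and its JSON reply shape. The function names and the sample reply are made up for illustration, and the sample is hard-coded so the sketch runs without a network connection.

```python
import json
from urllib.parse import urlencode

# Public availability endpoint of the Internet Archive's Wayback Machine.
WAYBACK_API = "https://archive.org/wayback/available"


def build_lookup_url(page_url, timestamp=None):
    """Return the availability-API URL for page_url.

    timestamp is an optional YYYYMMDD string; the API reports the
    snapshot closest to that date.
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the closest snapshot URL from the JSON reply, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Hypothetical reply (illustration only), so nothing is fetched here.
sample_reply = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20130408000000/http://example.com/",
            "timestamp": "20130408000000",
            "status": "200",
        }
    }
})

print(build_lookup_url("example.com", "20130408"))
print(closest_snapshot(sample_reply))
```

To query for real, you would fetch the built URL (for example with urllib.request.urlopen) and pass the response text to closest_snapshot. A result of None just means no snapshot was reported, which is one more reason to keep your own backup.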

Making Accessible PDFs from Web Sites

Method 1: Print to PDF, then Tag in Adobe Acrobat

Maybe you like the speed of "Print to PDF" and only want to worry about accessibility later. I understand that, but "later" for accessibility is coming sooner than you think. A recent DOJ announcement requires course content at Penn State to be accessible by April 2026…even if there are no students requiring accommodations.

Why? Because a student requiring accommodations may experience significant delays waiting for content to become accessible at the last minute. This forces students with

Should you tag your files now? Maybe, but it can be a hassle. PDF files can come from multiple sources ranging from Photoshop to Python. Anything can be in there….
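
If you have a pile of PDFs from different sources, a quick first pass is to see which ones look tagged at all. The Python sketch below is only a rough heuristic of my own, not an accessibility check: it searches the raw file bytes for the /StructTreeRoot and /MarkInfo entries that tagged PDFs normally carry. It assumes those entries are stored uncompressed. Some PDFs keep them inside compressed object streams, so a "no" here is not conclusive, and a "yes" says nothing about whether the tags are any good.

```python
import re

# Heuristic markers of a tagged PDF: the document catalog normally points
# to a structure tree and sets /Marked true inside /MarkInfo.
STRUCT_TREE = re.compile(rb"/StructTreeRoot\b")
MARKED_TRUE = re.compile(rb"/MarkInfo\s*<<[^>]*/Marked\s+true")


def tagging_hints(pdf_bytes):
    """Report which tagging markers appear in the raw PDF bytes."""
    return {
        "has_struct_tree": bool(STRUCT_TREE.search(pdf_bytes)),
        "marked_true": bool(MARKED_TRUE.search(pdf_bytes)),
    }


# Tiny hand-written fragments (not complete PDFs) to show both outcomes.
tagged_sample = (
    b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog "
    b"/MarkInfo << /Marked true >> /StructTreeRoot 5 0 R >>\nendobj\n"
)
untagged_sample = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"

print(tagging_hints(tagged_sample))
print(tagging_hints(untagged_sample))
```

For a real file you would pass in its bytes, for example pathlib.Path("printout.pdf").read_bytes(), where the file name is a placeholder. Anything that comes back with both values False is a good candidate for the full check in Adobe Acrobat.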

Method 2: Save Site Content in Word (As a Backup)

Alternatively, you can

  1. Copy and paste the page content into a Word file.

If you decide to use it as a PDF, you can

  1. Make the Word file accessible by adding headings, image ALT text, table headers and so forth. This is much easier and quicker to do in Word than in a PDF.
    Note: If the web site has image alt text, it will be carried over into Word with the image.
  2. Use the Save as PDF function in Word (vs. Print to PDF) to export a fully tagged PDF. All the effort you put into making an accessible Word file will be transferred over into the PDF. This saves you lots of time and grief in the long run.

If you are worried about speed, you can just do the initial copy and paste from the web site into Word. This lets you capture the data you need at the time you see the article.

Sample Files

Here are some files created from a blog entry on tennis ball colors. I wrote the blog entry, so I gave myself permission to create PDFs here. Despite the entry being over five years old, many links have remained active.

If you are familiar with Adobe Acrobat, feel free to review them, and then to tag the Print to PDF version!

Copyright?

But wouldn’t moving content into a Word file violate copyright? Perhaps (I’m not a lawyer), so you may want to be cautious. Pointing people directly to the Web site instead of using a PDF avoids some of these issues (unless the Web site itself is inaccessible).

But consider what happens now when a PDF is made accessible via tagging within a tool like Adobe Acrobat:

  • Making an "accessible" PDF alters the content, even if it only affects screen reader users. Only the Word method would let you keep the developer’s original image ALT text.
  • You can edit text in Word, but actually you can also edit text in Adobe Acrobat.
  • Some repairs, like fixing improperly encoded fonts (the kind where Greek β is read as a "b" in a screen reader), can be done much more quickly in Microsoft Word than in Adobe Acrobat. Older PDFs used lots of bad fonts.
  • Both processes cause distortions in the original look and feel of the file. But the Word process could let you remove any unwanted ads or menus.

Some caveats

Again, I’m not a lawyer, but some things to consider would be:

  • Use a link to a web page as the primary source.
  • If the PDF backup is needed, restrict the distribution to those enrolled in a course or other appropriate location.
    Note: Most especially, no PDF should be in a for-fee course packet unless it’s properly licensed.
  • Include a note with a link to the original, the creation date and any edits you made.
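
If you make backups often, the note can be generated the same way every time. Below is a small Python sketch of one possible layout. The function name, the field labels, the URL and the date are all placeholders of mine, so adjust them to whatever your course or department prefers.

```python
from datetime import date


def provenance_note(source_url, captured_on, edits):
    """Build a short plain-text note to paste at the top of a backup file."""
    lines = [
        f"Original: {source_url}",
        f"Backup created: {captured_on.isoformat()}",
    ]
    if edits:
        lines.append("Edits made:")
        lines.extend(f"  - {edit}" for edit in edits)
    else:
        lines.append("Edits made: none")
    return "\n".join(lines)


note = provenance_note(
    "https://example.com/article",  # placeholder URL
    date(2025, 1, 15),              # placeholder capture date
    ["Removed ads and site menus", "Added heading styles"],
)
print(note)
```

Paste the resulting text at the top of the Word file before you use Save as PDF, so the note travels with the backup.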