>>So, we're going to talk about making accessible PDFs. I'm with the University of Washington, where I'm a technology accessibility specialist, so my job is to look at the technologies we use in education, mostly Web sites and documents online. And we focus on how to make those accessible: kind of push the technology into the future and work with people to ensure that they use those technologies in ways that are effective and that communicate with all users, including those with disabilities. You can reach me at my email, tft@uw.edu, and I'm happy to serve as a resource. I've received federal funding to be that resource, so feel free to drop me a line if you need anything. I'm also on Twitter if anybody else is; I'm @terrillthompson. Prior to coming here to Purdue I did a real quick informal survey. I looked at 20 PDFs from Purdue, which I found on Google by searching for PDFs @purdue.edu, and not truly randomly, but I sort of sifted through the results looking at different sub-domains, trying to get a broad cross section of the PDFs that are out here. And of course I do this at the University of Washington too, and I look at a lot of PDFs all the time from a variety of different domains. Your results actually are better than most, but 17 of the 20 don't have tags, which are the essential foundation for accessibility, and we'll talk about what that means in a moment. So only three of the 20 PDFs that I looked at had taken that very first step. And those three had taken the first step, but other steps had been overlooked, so they still had accessibility problems. And this is a common sort of thing. Look at PDFs anywhere on the Web and very, very few of them are accessible. 
And what we mean by that is mostly we're focusing on students with visual impairments who are not accessing a document visually. They're accessing it using a screen reader, listening to the content of the computer, or maybe using a refreshable Braille device, and PDFs have historically created a huge barrier for them. They can't access the document at all, or they can't access it in a way that is really useful, unless it's created in a certain way. So what I'm going to focus on today is how we go about creating PDFs that work for those individuals, and they'll work for everybody and have a variety of features that allow them to be used by everybody. Also, there's an interesting finding that I put here that's going to come into play a little bit later, and that is that six of those 20 PDFs were created with Adobe PDFMaker, which is a plug-in that works in Microsoft Office and allows you to export to PDF. By default, that tool creates tagged PDF, so I would have expected, you know, six out of six PDFs that were created with that tool to be tagged, but only one was, which suggests that the other five had that accessibility functionality disabled before exporting. And that actually is very easy to do, so I mention it because it is something we want to be especially aware of: when we are saving our Word file as a PDF there are a couple of things we need to be cautious of to make sure we don't accidentally disable the accessibility after we've added it in. So, what I conclude from that is that PDF accessibility is a major problem, that very few PDFs are accessible, not just at Purdue but everywhere. It's not difficult to create a tagged PDF. It actually is pretty simple, and we're going to walk through that and look at a couple of different ways to do it today. And because it is simple and there really isn't any reason not to do it, we just assume that people aren't aware of it. 
And so that's what this workshop is for. I'm here for four days and we're doing a lot of workshops to sort of spread the word that there are good ways to create PDFs and there are bad ways to create PDFs, ways that result in inaccessible PDFs that certain users aren't going to be able to access. So, let's focus on the not so difficult process of creating an accessible tagged PDF. And before we look at PDFs specifically let's think about documents generally, electronic documents that can be produced in a variety of different formats, maybe HTML, maybe Word, maybe PDF. HTML is really sort of the ultimate accessible format; it has built into it all the accessibility features that we need. And it's well supported by assistive technologies and by browsers. It's a wide open standard, so for many years people have built in support for it. And so we're going to look at an HTML document to kind of set the bar. That is what we're shooting for; this is what a fully accessible document looks like. And then we're going to look at how that compares in Word and PDF, because in Word and PDF all the same issues apply and it's possible to create documents that are almost equally accessible in those formats. So we just have to figure out, you know, what are the steps to do that? So what are we looking at? Well, the most common thing that people think of when they think of Web accessibility, and it's true of document accessibility too, is that when you have an image or, you know, non-text content, if somebody can't see that then how do they access it? So if somebody is using a screen reader or Braille device to read a document and there's an image, then there needs to be some alternate text that sits behind the scenes and gets read to that user, and that describes the image or provides the content of the image to that user. 
So that's the old Web accessibility issue. It's been around for a long time and people who are Web developers have sort of gotten familiar with the process of adding an image to a Web page and then adding alternate text to that image so that it's accessible. You can actually do the same thing in almost every document authoring tool, and we just need to be aware of how to do that. Also, it's very important to have information, structure and relationships all communicated explicitly to the user, so that if you have a document where you have headings and, you know, certain relationships that are suggested by visual cues, then that all needs to be communicated to people who can't see it too, and that all happens through the mark up behind the scenes. So, let's just look at an example of that. I have a Web page here that is an Introduction to Physics Course Syllabus. This is a fictitious syllabus that we put together for a fictitious university called Accessible University. And as we look at this page, what just sort of jumps out at you in terms of the content visually and the structure visually? How would you describe this page, what are its parts and how do they relate to one another? >>The table with the content inside the cells. >>Okay, so it does have a table, rows and columns of data. Actually that's two tables, there's another one down at the bottom. It's a bit more complex in that it has nested rows and nested columns. What else? >>You obviously have several sub-sections underneath your overall intro [inaudible]. >>Okay. >>Main title and different sub-sections separated by bold text for those who can see it. >>Right. So, you might call those headings. We have our main heading that is the title of the document and then we have sub-headings that mark the top of each of these different sections. And then, you know, within any section, underneath each heading we have content. 
But like you say, we know this because it's very bold, in the case of the top level one it's centered, and there's a little bit of separation between those headings and the text that follows. So we take all those visual cues in and we think, okay, there's a heading with sub-headings. And, you know, we understand the relationships between these table headers and the columns that are underneath them because of their physical proximity to one another. So, somehow we need to communicate all of this to somebody who can't see those visual cues. We also have an image, the Accessible University logo up at the top. That's just an image, there's no actual text there, so we need to make sure that it has some text behind the scenes, again, that gets passed on to somebody who can't see. So let me launch JAWS. This is the most popular screen reader in the United States and I'll just give you a sampling of how JAWS interacts with a document like this. [ pause and JAWS reading ] Okay, so we're at the top of the file and the first thing it identified was the Accessible University graphic. So it tells us that it's a graphic, but it also was able to read the alternate text. If we didn't have any alternate text there it would not be able to say it's the Accessible University graphic. It would probably read the file name and maybe the directory path that leads up to that file name, and that ends up being just a lot of noise and really is not useful information. Mainly, as a screen reader user opens up a document they want to kind of get a sense for how that document is laid out. You know, what are the parts, what are the sections, what's the organization of the document? And they can do that using the headings. JAWS has a function key, just the letter H, so I can jump from heading to heading to heading within the document and get a sense of how the document is organized. 
So I'll hit H [JAWS reading] so now it tells me what the heading text is and also that it is a heading level one, so I know it's a top level heading. [JAWS reading] And so I know that is one level down, textbook, [ JAWS reading ] And so it's wrapped around, so it's been all the way through the document, and I now understand its outline and I can go back in to find the content that specifically meets my needs. So let's say I have an interest in course objectives. I know where that is now so I can quickly get to it [JAWS reading] and then I'll hit my down arrow to go into that section and start reading the content that's in that section. The first thing it tells me is that it's a list of three items. That's because this actually is marked up as a list as opposed to just being, you know, paragraphs of text with some kind of decorative bullet in front of them. It's important to use the mark up and the features that we have available to us to communicate lists as lists, tables as tables, headings as headings. All of that needs to be explicitly identified so screen reader users have access to it. If I jump down to the end of the document there's a sentence down there that is in French. This is an English speaking screen reader but it turns out it actually is multi-language: it speaks French, Spanish, Portuguese and I think some other languages. If we jump down to that [JAWS reading] it does a pretty good job with the French. And the only reason it does that is because this is marked up as a French paragraph; within the HTML code we have said the language of this paragraph is French. Otherwise it would read it using the English pronunciation engine and that's not going to be at all decipherable. It's going to be something very, very strange. So that's another kind of thing that we need: we need to have that mark up behind the scenes that tells it that this is French and everything else is English. 
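As an editorial illustration of the heading outline and language mark up just described, here is a minimal Python sketch that pulls the same two kinds of information out of an HTML page: the headings with their levels, and any elements that declare a language. It is not how JAWS works internally. The sample markup, the French sentence, and the names `StructureOutline` and `outline` are all made up for this example; only the standard library `html.parser` module is used.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class StructureOutline(HTMLParser):
    """Collects headings and language declarations from an HTML document."""

    def __init__(self):
        super().__init__()
        self.headings = []    # (level, text) in document order
        self.languages = []   # (tag, lang value) in document order
        self._level = None    # level of the heading currently open, if any
        self._text = []       # text pieces collected for the open heading

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("lang"):
            self.languages.append((tag, attrs["lang"]))
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside a heading element.
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def outline(html):
    """Return (headings, languages) for the given HTML string."""
    parser = StructureOutline()
    parser.feed(html)
    parser.close()
    return parser.headings, parser.languages


# Hypothetical stand-in for the syllabus page; not the real file.
SAMPLE = """
<html lang="en"><body>
<h1>Introduction to Physics</h1>
<h2>Course Objectives</h2>
<ul><li>First objective</li><li>Second objective</li></ul>
<h2>Textbook</h2>
<p lang="fr">Ce programme est disponible en français.</p>
</body></html>
"""

if __name__ == "__main__":
    headings, languages = outline(SAMPLE)
    for level, text in headings:
        print("  " * (level - 1) + f"Heading level {level}: {text}")
    for tag, lang in languages:
        print(f"<{tag}> declares language {lang}")
```

If the headings were only big bold paragraphs, or the French paragraph had no `lang` attribute, both lists would come back without them, which is roughly the situation a screen reader is in when the mark up is missing.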
From where I am now I'm going to go up into that table and just give you a sense for what navigating a table can be like [JAWS reading] so I'm entering at the bottom cell, and I go across the bottom row there. But with tables we've got several headers, and we've got a row, in this case, of table data and multiple headers that apply to each data cell. So like this cell here, exams and one both apply to this data cell. And over here projects, two applies to this data cell. If we didn't have the specific mark up within this table that established all those relationships then a screen reader user would just hear homework, exams, projects, one, two, final, one, two, final, 15%, 15%, 15%, 15% and they're supposed to try and make sense of all that with no explicit relationships communicated. So it really is going to be an impossible task, and this is a very small table. Imagine an equally complex but larger table. It really requires having that accessible mark up in place, otherwise there's no access. So this one is marked up to be accessible and I can navigate it now backwards [ JAWS reading ] So with every table data cell that I enter it knows which headers go with that cell and it verbalizes those. So I never get lost navigating through a table like that no matter how big or how complex it is. So that's just kind of a quick overview of some Web accessibility basics. The two which I will stress again, because I think they apply to most documents, are images and headings. For images, make sure you've got alternate text. For headings, make sure you have an explicit heading structure identifying what the top level heading is and what the secondary headings are, and maybe you have a third, maybe a fourth level of headings if you have a really long, deep document. And those all need to be identified as such so that a person can get the outline and understand the structure of the document. 
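One way HTML can express the table relationships described above is with `id` attributes on header cells and a `headers` attribute on each data cell. The Python sketch below is an editorial illustration, not the mark up of the actual syllabus page: the table layout and the names `HeaderResolver` and `announce` are assumptions. It resolves each data cell to the header text it points at, which is roughly the information a screen reader speaks on entering a cell.

```python
from html.parser import HTMLParser


class HeaderResolver(HTMLParser):
    """Resolves the headers attribute of each td to the text of the th cells it names."""

    def __init__(self):
        super().__init__()
        self.header_text = {}   # th id -> header text
        self.cells = []         # (list of header ids, cell text) per td
        self._current = None    # (tag, attrs) of the open th/td, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._current = (tag, dict(attrs))
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._current is None or tag != self._current[0]:
            return
        kind, attrs = self._current
        text = "".join(self._text).strip()
        if kind == "th" and attrs.get("id"):
            self.header_text[attrs["id"]] = text
        elif kind == "td":
            ids = (attrs.get("headers") or "").split()
            self.cells.append((ids, text))
        self._current = None


def announce(html):
    """Return one string per data cell: its header texts followed by its content."""
    resolver = HeaderResolver()
    resolver.feed(html)
    resolver.close()
    return [
        ", ".join([resolver.header_text.get(i, i) for i in ids] + [text])
        for ids, text in resolver.cells
    ]


# Hypothetical two-level header layout, loosely modeled on the grading table.
SAMPLE = """
<table>
<tr><th id="hw" rowspan="2">Homework</th>
    <th id="ex" colspan="2">Exams</th>
    <th id="pr" colspan="2">Projects</th></tr>
<tr><th id="ex1" headers="ex">1</th><th id="ex2" headers="ex">2</th>
    <th id="pr1" headers="pr">1</th><th id="pr2" headers="pr">2</th></tr>
<tr><td headers="hw">15%</td>
    <td headers="ex ex1">15%</td><td headers="ex ex2">15%</td>
    <td headers="pr pr1">15%</td><td headers="pr pr2">15%</td></tr>
</table>
"""

if __name__ == "__main__":
    for line in announce(SAMPLE):
        print(line)
```

Without the `headers` attributes every data cell would resolve to just "15%", which is the flat stream of values described above.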
Just by doing those two things, headings and alternate text, you increase the accessibility of a document exponentially. Not just HTML but, as you're going to see, also Word files and PDF documents. So, [JAWS reading] close JAWS [JAWS reading]. So let's turn our attention to PDF. What we just saw is possible in HTML, and we want to be able to do the same kind of thing with PDF. PDF is created by many, many tools. If we were to do a survey here I bet there would be at least five or six different tools that you all are using to create PDF, probably many more than that. And within those tools you have a lot of different techniques, lots of different approaches to creating PDF. Most tools that can create documents these days can print to PDF. The result of printing to PDF is not an accessible document. So, we need to find a better way to create PDF from our authoring tools. There are in general three types of PDF. One is an image: you've scanned a document and then you save it as a PDF without doing anything further to it. That's just an image and there's not going to be anything there at all that is readable to a screen reader user. The second type is where you actually have fonts that are part of the document, so that might be an image that was scanned in and then converted to text, or it might be just a document that was created in a word processing program or a desktop publishing program or some kind of authoring tool that resulted in an actual document with actual text. That type, which makes up most of the PDFs that are out there on the Web today, has no structure. It has text, so the text itself is accessible and you can read it from top to bottom, maybe not in the right order, but you can read all that text. But there's no indication of how the document is organized, you know, there are no headings, there are no lists, no tables, it's just text. 
So it's not going to be very accessible unless you just have a document that is very linear with no structure. So the third type is a type that was introduced for accessibility. It was introduced in the early 2000s in response to Section 508 of the Rehabilitation Act, which required that the federal government ensure that its information technology was accessible, and as a result of that the federal government wasn't going to be able to use PDFs. And so Adobe jumped in and said, well, let's fix the situation, and they came out with a new tagged PDF format that has an underlying tag structure that's actually very similar to HTML and that allows documents to have the kind of structure that we need, so that headings and lists and so forth can all be explicitly identified. So it's been around for about a decade but we still don't have a whole lot of people adopting it. Part of that is because the authoring tools that support it have not been there, and that still is true today. But it's a requirement at some level: to create a PDF from scratch that is accessible we need to use an authoring tool that supports it. I'm going to flash ahead a couple of slides. The one that we're going to be using is not on this list, because it's the one we're focusing on primarily, but that is Microsoft Office. They have supported it for a long time. These tools also are able to produce tagged PDF: InDesign, OpenOffice, LibreOffice, WordPerfect, Lotus Symphony. There was some discussion on the W3C WAI interest group list just a few months ago where people were trying to come up with a comprehensive list of tools that support tagged PDF. And there are a few other products that were mentioned that were maybes, but people didn't have easy access to them and couldn't confirm it. So this is kind of the, you know, the for sure list; these tools all support tagged PDF. 
They don't all do it equally well, and there are some issues we're going to see with InDesign, some extra things that need to be looked at in order to produce an accessible document. So, if we start with an authoring tool that supports tagged PDF, then as we're authoring documents there are a couple of things we need to be attentive to, and they're just those things we've talked about. If you've got images, anytime you put an image in, add alternate text to it. And 99% of authoring tools out there support that. So anytime you add an image you can right click on it, go to format picture and add alternate text. It only takes an additional five seconds or so to do that with most images. We also need to add headings and sub-headings. Make sure that you're using whatever your tool provides so that if something is a heading it is explicitly marked up as a heading, not just big bold centered text; it is a heading one or a heading two or whatever the case may be. And if there's other structure, if there are lists, you use the built in features of your tool to create lists. If there are tables, use whatever your tool provides to create tables. And then export that to a tagged PDF and you will have a pretty accessible document. And once you've learned how to do this, always do it, because it really doesn't take that much time to just do it right and use the features that are available. Always make your documents accessible even if you don't know, you know, that somebody is going to need this document in an accessible format. It's just a good habit to get into, and you never know, maybe at some point this document is going to be handed out to somebody who's using a screen reader and needs access to it. And at that point it's going to be a lot harder to go in and retrofit it and make it accessible than if you'd just done it accessibly from the get go. So I've got a couple of screen shots of what it looks like in Word to add alternate text to an image. 
Basically you just right click on the image and go into format picture and then type in your text. And to add headings in Word you just select the heading text and click on a heading style on the ribbon, H1, H2, heading one, heading two, heading three, it goes up to six. And that's pretty much all there is to most documents until you get into some more complex stuff. So it's very easy: the simpler the document the simpler the process, and as you get into more challenging stuff, scientific notation, then the complexity escalates. But for most documents it's a pretty simple process. So let's actually walk through the process. On your desktops you now have a folder called PDF workshop, and that may or may not have another folder inside it called PDF workshop, but keep going in until you find the folder called syllabus. And within that you'll find a syllabus Word document. Go ahead and open that up in Word and we will look at how to make an accessible document within Word. [ pause ] How many of you are using Word as part of the process for creating PDFs? Okay, most hands go up. What version, 2010? >>[Inaudible]. >>Two thousand seven, okay, the process is a little bit different and we'll talk about the differences. Anybody in 2003? Nope, good. >>[Inaudible]. >>Yes, how many Mac users? Okay. So we'll talk about all those differences, and differences make complexity, which is good. But it's just a few more things to remember. So the two things we always want to do in almost every document: if you've got images make sure you've got alternate text, and then make sure you've got heading structure. So for images, right click on the image. Now, we've already met our first difference between versions. In every version of Office since the beginning of time up to 2010, at this point you'd go into format picture. The one exception is Office 2007, where they changed things and you would then go into size and position to change the alternate text. 
And that doesn't make any sense in that context but that's where they put it, and then in 2010 they realized their mistake and put it back where it had always been, which is format picture. So what we have on these machines is 2010, so you're going to go to format picture and then go into alt text and then enter your description into the description field. >>What's the difference between title and description field? >>That's a good question. And this is another difference between versions. Prior to 2010 there was only one option, just called alternate text, and that's where you'd type in your description. Now in 2010 they've introduced a new field so you've got two choices. They describe the difference here but don't believe that. It's not true. It's not implemented as they have described it. They actually have kind of a good idea here and a good vision for how you could theoretically use title and description, but it's not supported that way, and if you use title at all, there's a pretty good chance it's going to get lost at some point. If you ever open this up in an older version of Office or if you export to HTML or other formats, the title's going to get lost. So just use description. >>So title's almost ignored [inaudible]. >>Right. >>What fun. >>So, then it's a question of what makes good alternate text? Generally you want to be short and sweet. If you've got text content in the image you should certainly provide access to that. So in this case we have Accessible University. We should say Accessible University. We might also say logo if we feel that that's an important part of what needs to be communicated, but it's kind of a judgment call. Short and sweet is generally the rule. Yes? >>When the text reader reads that image, its alt text [inaudible], is it going to notify the user that it is an image that's being read, that the alt text is being read? >>Yeah, well it will say graphic. 
>>So putting Accessible University graphic would be redundant? >>Exactly, yeah. Picture of, image of, that kind of thing you generally want to avoid, although we're going to see an example in a little bit where we might, you know, occasionally do that. Know the rules first but then know when to break them as well. There's also kind of the back drop there, which has symbols that might have some significance. There are five different symbols that represent different disabilities, so one might want to, you know, describe the logo. If you think about the Purdue logo and what it looks like, you know, maybe there's some significance to that, or if it were the Purdue Seal, something like that that has some significant detail. You probably don't want to describe that on the physics syllabus because it isn't relevant to the content. So you kind of have to view the big picture and think, what is this image contributing to the current document? And how much detail do I need to provide in order for it to contribute to the current document? If somebody opens this document looking for information about this course they're probably not going to be studying the details of the logo. But if that same image were to appear on a, you know, history of the Accessible University logo page, then it would be more relevant in that context to provide more detail in your description. But the longer your description is, the more information the user has to sift through and the slower the process is going to be if they're reading the document, so again, short and sweet. Okay, headings. So this first one would be a heading level one, and when you select text in Word you can tell whether it's a heading or not just by looking up at the ribbon. We've got heading one, heading two, heading three buttons but none of those are selected right now. They don't have a little box around them, so you know this is not marked up as a heading, but it should be. 
So we'll go up here, and because I want my heading level one to look like this, the first thing I'm going to do is right click and update heading one to match the selection. That's kind of a subtle change, but there are maybe some margins that changed a little bit, and so now this heading is a heading level one and my heading level one style has changed so that it looks like that. Do the same thing with heading level two. We'll go up here and update heading two to match selection. That automatically converts this to a heading level two, and then I can go down and select all my other heading level twos and just click that button and that turns them all into heading level twos. >>Can you select them all at once [inaudible]? >>I don't know how easy it is to select non adjacent things in Word but I guess you can by doing, you select one and then control... >>Hit control. >>Control, select the other ones and then select. So yeah, if you had a long document that would certainly be a time saver. And on the whole, having these all marked up with the same heading style, you may already know this, but it is a huge time saver for you, because if you decide at some point, I want all of those heading level twos to be green, then all you have to do is change it here within this box. [ pause ] And that affects every heading in the entire document. You don't have to go to each one, one by one. So, it's great for the author but then it's also great for accessibility. Let's see, what else do we need to do? So that might be it except that we have tables. So we want to make sure that that first row is a table header row, so assistive technologies know that that row contains the headers. So select that row, then right click, go to table properties, go to the row tab, and the check box says repeat as header row at the top of each page. I'm going to check that and click okay. 
So if you think about that, what it's suggesting isn't really an accessibility thing. It's that if you have a longer table that's going to span multiple pages you're going to need to have that top row appearing at the top of every page or else it's just going to be really hard for everybody to read. So that functionality of identifying that as the table header row benefits anybody who's reading a multipage table, but that same feature also is used by assistive technology to understand the table structure and to communicate that to users. You do the same thing with the second table except this time you'll be selecting two rows instead of one. [ pause ] We also have another image down here on the lower right. With any image we have a choice to make. Do we need to add alternate text or not? And that hinges on whether it is informative or decorative. Does this image have any informative value for the content of this page? >>No. >>Nope. Anybody think it does? >>Hard to say. What is the sentence there? >>Well, the sentence may or may not be applicable. This sentence says, I don't know, does anybody speak French? >>[Inaudible] the image? >>No, they're separate. >>Okay. >>[Inaudible] this program is available in French [inaudible]. >>Okay. Pretty good French or is it grammatically incorrect? >>Yep, it's been several years since I've [inaudible]. >>It's [inaudible] it may or may not be good French. So, yeah, this is like a molecule or it's a symbol that is often used to represent physics and, you know, it may suggest that this course is, you know, officially sanctioned by some international physics governing body or something. But it may not and, you know, it's kind of up to the author or, you know, whoever's working on this to sort of make that judgment call. It is positioned at the very bottom of the document, and that would suggest that it's of lesser importance. So, leaving it without alt text is probably okay. 
And if you're faced with thousands of PDFs that you have to make accessible then adding alternate text to that one extra image, you know, is just going to slow you down and you've probably got better things to do. Yes. >>On images though where we decide we don't need alt text, is it best to just leave it completely blank or is it best to put something in there that identifies it as intentionally left blank? >>That's a good question and the answer differs for HTML and for PDF. In HTML alt text is actually required by the HTML specification, and so on Web pages you need to add alternate text for every image, and if it's a decorative image that would be a blank alternate text, it would just be alt equals quote quote. That's what screen readers use to know that this image is something they can ignore. Otherwise they don't know that, so they'll try to read something, you know, the path and the file name, and that just is awful. So, ultimately what we're working toward here is PDF, but even in Word, I don't think there's any screen reader that will read an untagged image. It's just going to ignore it, and the same thing is true when we export a PDF. [ pause ] Language. As we noticed in the HTML version of this, the French text was read with a French accent, and in order for that to be possible we need to select that paragraph, go to review, select the language button, choose set proofing language and then identify which language it is. >>[Inaudible]. >>You've got lots of variations on French to choose from. [ pause ] The only thing is, after you've done that, that particular piece of information does not get passed to the PDF file when you export. So depending on what your work flow is, you might choose to not mess with that, since it's just going to get lost. 
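Going back to the HTML point made in answer to that question: the distinction is between an image with no `alt` attribute at all and an image with an intentionally empty one (`alt=""`). The Python sketch below is an editorial illustration that sorts the images on a page into those two groups plus the ones that carry a description. The file names, the sample markup, and the names `AltAudit` and `audit` are all hypothetical.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Classifies each img as 'described', 'decorative' (alt="") or 'missing'."""

    def __init__(self):
        super().__init__()
        self.results = []   # (src, status) in document order

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        src = attrs.get("src") or "(no src)"
        if "alt" not in attrs:
            status = "missing"       # screen reader may fall back to the file name
        elif not (attrs["alt"] or "").strip():
            status = "decorative"    # empty alt: the image can be skipped
        else:
            status = "described"
        self.results.append((src, status))


def audit(html):
    """Return a list of (src, status) pairs for every img in the HTML string."""
    parser = AltAudit()
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical images; the file names are invented for this example.
SAMPLE = """
<img src="au_logo.gif" alt="Accessible University">
<img src="physics_symbol.gif" alt="">
<img src="chart.png">
"""

if __name__ == "__main__":
    for src, status in audit(SAMPLE):
        print(f"{src}: {status}")
```

Only the "missing" group needs attention; an empty `alt` is a deliberate choice for decorative images, while a missing one leaves the screen reader guessing.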
But if this is a Word file where later you might go back in and edit some things and then create a PDF, when you do it the next time maybe they will have fixed that bug and language will get passed on. So I think it's probably a good idea to just go ahead and do it, and it may come in handy in the future. Or maybe somebody will read this Word file directly, in which case having that mark up will make a difference. Yes? >>Back to tables, is that all you had to do to those tables? >>That's all we can do in Word. Because of the complexity of this table we're going to have to do a lot more, but Word doesn't support it so we're going to have to do that in the next step. >>So as far as screen readers are concerned, this document is not Web accessible because of the complex table [inaudible]. >>Exactly, yeah. >>[Inaudible] about that at this point. >>Right. It's almost there but that complex table is a challenge. >>A little too much. >>Yeah. >>Okay. >>And that's why, even though I'll be talking about PDF and, you know, I'll talk a lot about how you make PDF accessible, my position is that you should try to avoid PDF if you can in favor of HTML. Whenever possible use HTML to deliver content because it is capable of doing all this stuff. PDF is also capable of having all the stuff we need for this table to be accessible, but it's been there in HTML for a long time. And then when we get into math and scientific notation that's going to be even more applicable, because you've got math and [inaudible] which integrates into Web pages, into HTML pages, but doesn't integrate into PDFs. But we've got an example of that actually in a little bit. We can talk some more about that. So once we have marked up our document and we've made it as accessible as we possibly can here in Word, then it's time to export to PDF. If you have 2007 then this isn't built into Word, you have to have a plug in. 
And Adobe PDF Maker plug in is free if you have installed Adobe Acrobat after Office then probably already you have an Adobe menu in Office, in Word and Excel and PowerPoint and that's, that is the PDF maker for them. So you would go to the Adobe menu and export PDF from there and it will do the right thing, it will export to a tag accessible PDF. If you don't have that then it's a free download from Adobe so readily available. And Microsoft actually makes a plug in that does the same thing but it's called something different. So save as PDF or some variation on that. So there are a couple of different sources of plug ins that will work in Office 2007. If you have 2010, it is built in, file, save as, select, PDF is one of your choices. And those of you who raised your hand and said you're using a Mac, this is not true for you. It actually can save as PDF but it will not create a tagged PDF. It only does that on Windows. So... >>But I have Windows and [inaudible]. >>You have Window in, on you Mac? >>Okay. >>Yes [inaudible]. >>Yeah, excellent, you're set. >>[Inaudible]. >>This is what, this is my work flow from my own personal PDFs that I create. I create from Word, do everything to make them accessible and I'm working in the Mac just because it's the environment that I've kind of grown most comfortable with now. I've got most of my stuff in there and then save my document to a folder that I can access in VMWare Fusion, so I've got Windows running on my Mac and then I just switch over to Windows, open up Word and do the final step saving as PDF from Word in Windows because Mac isn't going to be up to the challenge. So kind of silly because in so many other ways the products are, you know, well I guess they're not really the same but there's a lot of similarity between them but that, those, something that they, well they dropped the ball. 
So here in Word for Windows when we go to save as PDF if we click on options there is a key check box that is really important here, document structure tags for accessibility, it is checked by default. But this is the thing where the six documents that I pointed out that were created with PDFMaker, where it should be checked by default and five of them weren't tagged, so that means that was unchecked at some point. Now probably people didn't go in and say, I don't want my document to be accessible and uncheck that. I think it was something they were totally unaware of and I suspect it was this little button right here that's pretty close to the save button so it'd be logical to click it. Optimize for standard is selected by default but if you want to minimize the size you click minimum size, and if you do that now and go back into options you'll probably find that it has unchecked the accessibility check box. It didn't do that for me because I had already unchecked and rechecked it and said no, I really mean business. But that's something to watch for, it takes a lot of stuff out in order to minimize size, and one of the things it cuts out is the accessibility tags and so you have to put them back in. You can still have a relatively small sized file, it will be a little bit bigger by adding accessibility tags back in but it will still be smaller than the standard. So if minimal size is really important then go ahead and click minimum size but be sure you go back in and turn accessibility back on. And if minimization is not that important to you, don't worry about that, keep it on standard, maybe check to be sure that your accessibility check box is checked just to be sure, otherwise all the work you've done to make this accessible is going to be lost. [ pause ] So, go ahead and save it as syllabus. That will automatically open it in Adobe Reader after you're done. 
We actually don't want to work with it in Reader, we want to work with it in Acrobat Pro which is where all the accessibility editing features are. So we want to kind of explore what we can do in Acrobat Pro. So go ahead and exit out of that Adobe Reader when it pops up and if you just go down to your start button and type in Acrobat and Adobe Acrobat X Pro will pop up. There are a couple of them, couple of, it will appear a couple times, one with a version number after it, don't click one with a version number, click the one that just says Adobe Acrobat X Pro and that's it. And after you've opened Acrobat Pro then go to file, open and navigate to wherever you saved the syllabus file. [ background talking ] Okay [ background talking ] So, let's look at our handout, front page the PDF Accessibility Repair Workflow. What we've done so far is created an accessible document from Word. So if we're using Word or creating a document from scratch, that's basically what you do. But most of our documents are probably coming from other sources and so we need to know how, what do we do with this? Here we've gotten it, it's not accessible, what do we do? So, most things you just follow this workflow, ask the questions, respond to the questions and you'll end up with a document is more accessible then the one you started with. This document, the syllabus we just created and we know that it is pretty accessible but we're going to use it anyway to walk through these steps and then we'll look at some other documents after we're done with this one. So question number one, does the document have text? How do you know whether this is a scanned document or whether it actually has some text in it? >>[Inaudible] text. >>Okay, just take your mouse, highlight some text. If you can do that, it's got text. You could also do control A, that selects everything, then you can select all the text. If it's just a scanned image neither one of those techniques will work. 
If it's just a scanned image then what you do next is click on tools, go to recognize text and then select in this file and it will look at that image, perform optical character recognition or OCR and convert that image of text into actual text. And it does a pretty good job if it's a good scan, if you've got the original, it's very good, crisp, easy to read and the scan is good then it will be okay. But, you know, sometimes not so good, it just sort of depends on the quality and it will evaluate its confidence level as it does the scan and for words where it feels like I'm not 100% sure that I've got this right, it will ask you to check those and then you can go through and edit and, you know, make changes if it got some things wrong. So we don't need to do that in this case because it is text. Step two, is the document tagged? You can find that out by going into file properties, the shortcut is control D, the shortcut on a Mac is Apple D. And when we're in document properties the left most tab is description and all the way down at the bottom of that tab is a field that says tagged PDF. Yes in this case. In most cases it says no. It also tells you what application was used to create this PDF and in here it says Microsoft Word 2010 and we knew that but that can be very helpful information. If you've got a PDF that just is really flaky, there's some weird things going on with it, sometimes it helps to know that that was created in InDesign or whatever the case may be because there might be certain things that need to happen to make that document accessible. While we're here in document properties, let's jump all the way over to the advanced tab and we are jumping on the list then down to number seven, is the language of the document defined? The language of this document is English and at the bottom of the advanced tab there's a choice, you can select the language from there if it isn't already selected. So that's, sorry? >>If it's not selected it would be blank? 
>>Yeah, but it should still work. Does it work? >>Well, yeah, I'm just saying, I mean ours is set in English. You're saying if it's not chosen by default what would it look like? >>Yeah, it would just be blank. >>Okay. >>Okay, step number three, does the document need to touched up? Just because it has tags doesn't necessarily mean it's accessible. So, click on tools, among my tools I have an accessibility tool, you probably don't have that yet. Go to view, tools and accessibility and that will then add that over to your tool collection. There's a lot of stuff in the accessibility tool kit that you can use. The one we're interested in at the moment is actually well, if this had not been a tagged document you would have selected add tags to the document because that's the foundation, we have to have tags in order to make it accessible and so we would have selected that and it would then automatically go through and create some sort of tags, tag structure that we can work with. Since that's not necessarily necessary on this document what we do want to use now is touch up reading order, let's go ahead and select that. And it then shows us the order in which content blocks are going to be read in this document. See anything that is wrong with this order? >>[Inaudible] the image blocks. >>Okay, it's a banner image that appears at the top of the page that should probably be read first. So to change that, click on the show order panel button that's down near the bottom of the touch up reading order dialogue that popped up, it's a show order panel. That results in a list appearing over to the left. Each item on the list is numbered in accordance with how it's numbered in the document itself. Image is number six, just grab that, we move it up to the top, just a click and drag process. And then everything is sorted properly after that. [ background talking ] Another thing we can do here from the touch up reading order box is add alternate text if we needed to. 
We don't in this case but we will in a future example. >>[Inaudible] ignore the graphic at the bottom part of the page? >>Yeah, we didn't add alt text to it, right? >>Right, so it's, I see. >>So it's not part, it's not even numbered. >>Right. >>It's just background. >>[Inaudible]. >>I don't know that it necessarily will always do that without alt text. I think in this case it's small and has no alt text and that has an influence. So close touch up reading order, close the order panel and we move on to number four, are headings marked up as headings at appropriate levels? For this we actually are going to get into the actual tag structure, we want to study the tags to see, you know, whether the tags that were delivered from Word were delivered appropriately. And for this I have a luggage tag icon over in my left margin. Again, you guys probably don't have that yet. If you go to view, show/hide navigation panes, tags, then that will give you that luggage tag. [ background talking ] Once you have the tags panel open you can just click on the plus signs and drill down into it and see the tag structure. Those of you that are familiar with HTML will recognize this as being very HTML like. Okay, you've got paragraph tags, H1, H2s, list is an L so that's a little different than HTML, the table has table rows and table headers inside it, table data cells, so it really is patterned after HTML. So what we're going to do is specifically check the headings. Make sure that headings have good heading mark up. So if we just grab like this first heading, Introduction to Physics, then the little icon underneath the word tags over on the tag panel is a menu and there's an option to find tag from selection. So when you have something selected in your document you can select that option, and then it will open up that specific tag. And then you can see what the tag is that contains that content. 
And it is an H1, sometimes it will actually spell it out, heading one, but that's exactly what we're looking for. If we do the same thing with Course Objectives, find tag from selection, it is Course Objectives, yes it is an H2 which is what we were hoping. So we said that the language did not get passed. So we need to go to that paragraph, probably you don't need to find tag from selection in order to find that French paragraph because it's the last paragraph on the page, you can probably just look for it. But you can find it with find tag from selection too. There it is. So to make that French, you right click on the tag, go to properties for that tag and one of the options is language. So you drop that down and you select French. And there's only one flavor of French according to Adobe unlike Microsoft who feels that there are 12. [ background talking ] Okay, sometimes when you pull up the dialogue box for changing properties of a tag everything's grayed out and it doesn't let you change anything, that sometimes happens when something is selected. And so after you have found what you're looking for then unselect, just click off of your selected text and then right click, go to properties, and then everything should be available to you at that point. [ background talking ] The fifth item on the list, does other mark up need to be fixed? You just kind of poke through the mark up, check a few things, look to make sure that the content is tagged in a way that makes sense. So if you have a bizarre tag like sometimes things show up as parts or as no space or other things that just don't seem to make any sense given the content then you can change that to whatever would seem to be appropriate. Mostly what you're concerned about though are headings and then, you know, lists should be identified as lists and the items in those lists should be LI, List Item. 
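The heading check from step four can also be written down as a small rule. What follows is only a minimal Python sketch, not anything Acrobat provides: it assumes the tag names have already been copied out of the tags panel, top to bottom, into a list, and it assumes the common convention that a document starts at H1 and never skips a level. The function name `find_heading_problems` is made up for illustration.

```python
import re

def find_heading_problems(tags):
    """Return a list of readable problems with heading levels.

    `tags` is a hypothetical list of tag names in reading order,
    for example ["H1", "P", "H2", "L", "H2"].
    """
    problems = []
    previous_level = 0  # 0 means no heading has been seen yet
    for position, tag in enumerate(tags):
        match = re.fullmatch(r"H([1-6])", tag)
        if not match:
            continue  # not a heading tag (P, L, Table, Figure, ...)
        level = int(match.group(1))
        if previous_level == 0 and level != 1:
            problems.append(
                f"tag {position}: first heading is {tag}, expected H1"
            )
        elif previous_level != 0 and level > previous_level + 1:
            problems.append(
                f"tag {position}: {tag} follows H{previous_level}, skipping a level"
            )
        previous_level = level
    return problems

print(find_heading_problems(["H1", "P", "H2", "L", "H2"]))
print(find_heading_problems(["H1", "P", "H3"]))
```

An empty list means the heading levels look consistent; it says nothing about whether the right text was chosen as a heading, which still takes a person reading the document.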
If you want to see all the possible tags you can go to the properties for any tag and just click down or just type, click and take a look at all the different things that you can identify a chunk of content being. These are all the possible tags. [ pause ] We didn't talk about data tables. How many of you are really hungry for knowledge about making that complex table accessible? How many of you, just raise your hands are sure? Sure you're sure? >>[Inaudible]. >>Okay. That actually is, we'll take a step back, that is something you can accomplish in touch up reading order. Touch up reading order is bit of misnomer because it does a lot more then just reading order stuff. One of the things it does is table accessibility. Let's start with the simpler first table. You right click on the table and select table editor. And your table then may not look like mine. Does it have THs and TDs? If it doesn't then right click on the table anywhere and select table editor options and there's a option down at the bottom that says show cell type, TH or TD. You should check that if you want to see which ones are THs and which ones are TDs which is part of the accessibility package. So those table headers need to be THs and so that's why we want to be able to see that. [ pause ] How many of you do Web design or are familiar with HTML? Okay, most the hands are going up. So the THs, that's all, that's not new to you, TH versus TD from [inaudible] format. What about scope, hear about the scope attribute? One accessibility principle with tables even in HTML is that a table header if you get to cell properties, it should have a scope attribute. In HTML the scope equals row or scope equals COL for column. And that without any doubt tells the screen reader that this is a column header. 
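For those who know HTML, the scope idea can be shown in a few lines. This is a minimal Python sketch that prints one header row; the helper name `header_row` and the column labels are invented for illustration and are not the labels in the syllabus table. It only handles the two scope values mentioned here, row and col.

```python
from html import escape

def header_row(labels, scope="col"):
    """Build one HTML table row whose header cells carry a scope attribute."""
    if scope not in ("col", "row"):
        raise ValueError("scope must be 'col' or 'row'")
    cells = "".join(
        f'<th scope="{scope}">{escape(label)}</th>' for label in labels
    )
    return f"<tr>{cells}</tr>"

# Hypothetical column labels, used only to show the markup.
print(header_row(["Week", "Topic", "Reading"]))
```

In Acrobat the same information is set per cell through table cell properties, as in the walkthrough; the sketch only shows what the equivalent HTML attribute looks like.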
So like right here, where in the far left heading, it could be a heading that applies to the row or it could be a heading that applies to the column so we want to remove any chance of that being misinterpreted. So you say scope equals column for each one of these. Go to table cell properties, scope equals column and on the last one, scope equals column and you're set. This table is now fully accessible. The one on the bottom is our next challenge, it's more complex so the solution is more complex. Go to the table editor again by right clicking. What do you see as you look at the THs and TDs on that second table? Do you see any sort of problem visually? >>Top row's messed up. >>Yep, top row's messed up. So that export from Word didn't do things quite right. Not a huge problem. If we can sort of study that it's just that this TH is spanning five columns and this one is spanning one. So what we need to do is expand the number of columns that the second one spans. It should be three column wide header. So we're go in, just right click on that TH and everything you see here, all the options in this dialogue box are straight out of HTML. It's got row span, column span, in addition to having scope so we want that column to span three columns for that table cell to span three columns and that then fixes the problem. It pushes the other table header over here to be appropriate, so. We also have an issue with the homework cell. It's got the two table headers there that should just be one. We can delete one of those. [ pause ] I think. [ pause ] Actually I'm not sure how to do that. Maybe if we say that this spans two rows, yeah okay, so click, right click on the first table header in the upper left and say that that spans two rows and that makes the second one go away. So now we have our table mark up is correct but it's not fully accessible yet. 
The thing we have to do, this is true for HTML and it's true for PDF, is that when you have nested columns and nested rows these relationships get too complex for screen readers to understand automatically. So even just to say scope equals column, you know, this is scope equals column, this is scope equals column but you've got multiple headers that are scope equals column and when it gets to this cell it really doesn't know which headers belong to that cell or which ones are associated with that cell. So there's a two step process that needs to take place. First, every header has to have a unique ID. So you have to go through your table and assign IDs to all of those headers. And so why don't we just do that real quick like. The first one we'll just call it homework [ pause ] the second one, the one that says exams, we'll just say exams, sorry. And when you get into the cells that just say one, two, final and one, two, final, since you have to have a unique ID for every header you're not going to be able to call those one, two and final because that would be, yeah, it's going to be repeated. So you might call those exam one, exam two and exam final [ background talking ] So after you've added a unique ID to each of your table headers, step two is to go into each table data cell and select the headers that accompany that data cell. So for the first one, table cell properties, you just click the plus sign, drop down the list of IDs that correspond with all the headers. The first one is homework, the second one has two, you select exams and you select exam one. And so you would do that throughout for each of the table cells, just go through and select the IDs that correspond. In HTML what that would be doing is using the headers attribute on each data cell, with the IDs separated by spaces as the value of the headers attribute. So that can be a lot of work. This is where HTML comes in handy. 
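The two step id and headers process can be sketched as generated HTML. This is a minimal Python sketch under stated assumptions: the function name `complex_table`, the percentages, and the exact id spellings are invented for illustration, and it covers only the Homework column and one Exams group rather than the whole syllabus table.

```python
from html import escape

def complex_table(exam_names, rows):
    """Build an HTML table with a Homework column and a group of Exams columns.

    exam_names: labels under the Exams header, e.g. ["1", "2", "Final"]
    rows: list of (homework, [one value per exam]) tuples; the data is made up.
    """
    # Step one: every header gets a unique ID (IDs cannot contain spaces).
    exam_ids = ["exam" + name.lower().replace(" ", "") for name in exam_names]
    lines = [
        "<table>",
        "<tr>",
        '<th id="homework" rowspan="2">Homework</th>',
        f'<th id="exams" colspan="{len(exam_names)}">Exams</th>',
        "</tr>",
        "<tr>",
    ]
    for exam_id, name in zip(exam_ids, exam_names):
        lines.append(f'<th id="{exam_id}">{escape(name)}</th>')
    lines.append("</tr>")
    # Step two: every data cell lists the IDs of all headers that apply
    # to it, separated by spaces, in its headers attribute.
    for homework, scores in rows:
        lines.append("<tr>")
        lines.append(f'<td headers="homework">{escape(str(homework))}</td>')
        for exam_id, score in zip(exam_ids, scores):
            lines.append(
                f'<td headers="exams {exam_id}">{escape(str(score))}</td>'
            )
        lines.append("</tr>")
    lines.append("</table>")
    return "\n".join(lines)

print(complex_table(["1", "2", "Final"], [("10%", ["20%", "20%", "50%"])]))
```

Because the IDs and the headers values come out of one loop, adding another exam column changes both together, which is the part that has to be done by hand, cell by cell, in the Acrobat table editor.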
If you've got a database with all your data on the backend then all you have to do is create a template that is going to create the IDs automatically, create the headers attributes automatically so you don't have this manual process. To do it in PDF, I don't know of any way to automate that or to script that, so you kind of have to use this tool to go in and manually add those attributes. >>[Inaudible]. >>I would say in HTML, yeah, but in PDF, I'm not aware of anything. >>[Inaudible] this is a software problem. Instead of recoding tens of thousands of pages of stuff, somebody should build a product that does all that. >>Yeah. >>Makes it readable, change the reader if need be. >>Well then there's also the question, why can't assistive technology figure this out? The relationship between this, this and this seems pretty intuitive to me just based on col span and row span attributes. So I think if maybe you had smarter assistive technologies, you know, that would be a solution too because it is a lot of work and that's a small table. Imagine, you know, how much bigger [inaudible]. >>[Inaudible]. >>Yeah. So those of you that didn't enjoy the data table exercise, then you know which people to blame. But it is something you might encounter in a real world PDF, so it's good to kind of, you know, experience that. But again most of the stuff you're going to be working with is just, you know, relatively simple. Let's, we've got about a half an hour left, let's try and squeeze in a couple of other examples just to get some variety. Close this file and open, in your PDF workshop folder there's a subfolder called PDF docs. Let's open example number one, Conifer Dieback. [ pause ] And then refer to your workflow and just sort of step through the process, see what you can find related to this document. If you get to a step where things just sort of fall apart, then shout, let us know what is really wrong with this document, if anything. 
[ background talking ] Anybody find the point at which this really breaks down? >>At the beginning. >>The beginning, it has text. >>It does but all of the readers are pointed, I mean there's not alternate text for all of the images and it starts with number one which is on the bottom left hand corner at each... >>Okay, so the reading order is messed up and there's no alternate text for images. So that's not huge, there's a lot of problems here but they're fixable. So if there are images without alt text, you can add alt text right? Right click on any figure that says figure, no alternate text exists, right click on it, edit alternate text, type in appropriate alt text for that. That picture in the upper left is kind of hard to see with the extra stuff over top of it but it is Purdue University. [ pause ] What about figure number four? Is it informative or decorative? >>Distracting. >>Distracting. >>[Inaudible]. >>If you feel that it's informative you need to describe it in which case you would right click and go to edit alternate text. If you feel that it is merely decoration then you click the background button and that image is then just background now, screen readers skip it and you don't have to mess with it anymore. >>Do you do a spell check for alternate text? >>No, no they don't, not here. >>Because I just [inaudible]. >>Click the background button. Click it for a selection, so select the object you want to make part of the background and then click the background button. Yeah, if you misspell alternate text then when you listen to a misspelling that's when you certainly realize there's a misspelling. >>What [inaudible] Purdue so it wouldn't say [inaudible]. >>How do you spell it? >>Peru. >>Oh, I think you're not the only one. I'm sure, you should ask a screen reader user sometime. They probably run into Peru all the time. What about figure 13? The photograph, decorative or informative? >>Informative. 
>>[Inaudible] it's informative if someone can see the picture. >>Yeah? >>If you can't see the picture then I don't know how describing a picture of a dying tree is going to be helpful to a person who doesn't see what a dying tree looks like. >>Is there something informative about that dying tree or is it just sort of supplementing the content and maybe everything we know about Conifer Dieback is in the text and the photo doesn't contribute to that knowledge, or does it? >>So then you just take that out then for? >>We could, if we choose to get rid of it, we just click background and the photo is now in the background and it's no longer anything. But, and I was actually prepared to do that earlier until I noticed that beneath it it has a caption. So then we have to deal with the caption too. And the caption actually gives us a clue to the fact that this isn't just a backyard photo of a dying tree. It is telling us that it's showing us Spruce Dieback caused by site stress. So we know from this photo that site stress causes Spruce Dieback which might be significant information and I'm inclined then to edit the alternate text of that photo to match the caption. Spruce Dieback caused by site stress. And I said earlier that you don't want to say, you know, picture of or image of generally because the screen reader will do that for us. It's going to identify this as a graphic. But maybe we do want to say that it's a photo because just saying it's a graphic doesn't really tell us what it is. It might be a photo, it might be a sketch. [ pause ] So a lot of times there's no right and wrong for alternate text but maybe sort of evaluate what's relevant here. And there is no undo. >>Yeah, I tried to [inaudible]. >>It says it on your handout. >>Yeah. >>The notes, there's no undo. So then it's a good idea to save often. Now by adding that caption as your alt text though what problem have we introduced? It's redundancy. 
You don't want to say the same thing twice so what we can get rid of is the caption now. >>Even if I've already saved [inaudible]? >>Click on the caption text, make it part of the background. >>Okay. >>And now it's not going to be read. [ background talking ] Also, if you look down at the bottom of page one and the top of page two you have a footer. The footer has a little logo with a 800 number and something else go knowledge or I forget what it says. The header has the title again Conifer Dieback and Purdue Extension again. Do we have those informative in that context? Or does that just create an unnecessary interruption. If somebody's reading the page, they get to the bottom of page one they want to continue at the top of page two to read. They don't want to have to mess with the advertisement, the 1-800 number and a reminder of what they're reading because they probably already know that and they're reminded it came from Purdue Extension because they probably know that too. So I would take all that, put it in the background. Just minimize the distraction. And headers and footers, that generally is going to be true. Not footnotes because they're going to have extra information but just a footer, just a header that repeats the same thing on every page, it is really unnecessary information that can just be made part of the background for every single page. >>You just right click on that, tag as background, is that doing the same thing or something different? >>Then yes, it's the same thing. If you left click on it and then click the background button or if you right click it, tag as background, yeah it's the same thing. >>Got some changes [inaudible]. >>Right, yeah it shouldn't change its appearance. Everything that we're working with here is manipulating the tag structure, that's only going to affect how it appears to screen reader users, it's not going to affect the page visually. 
And I say in theory because every once in a while for some unexplained reason you delete a tag or you'll move a tag to a new position in the document and something on the page disappears. That's actually happened on this document in an earlier lesson, earlier workshop, somebody made Purdue University disappear in the upper left. The only thing I can figure out as to why that happens, go to control D again, look under description, look and see what tool was used to create it. I've only seen that happen with InDesign documents. And I think, you know, it's got something to do with how layers are all, you know, stacked and how that, I don't know. It's kind of a mystery and any document that I've seen that addresses this problem both from Adobe and from authors who've written about PDF accessibility, they say, you know, this is a strange thing that happens sometimes and save often and be prepared to go back to an earlier version when it does happen because you can't get that back once it disappears, at least not easily. Okay, so let's open up another document. Okay, close this one. Let's try example three, the math example. [ background talking ] Does this have text? >>Does this have text? >>Does it have text? Okay, so we don't need to do OCR on it. Is it tagged? [ pause ] No, okay so this is the first one we've looked at where it says tagged PDF no. So we can't make it accessible until we tag it. So we go to tools, accessibility, add tags to document and it may take a little while because it's a 43 page document [ pause ] but it does its best to interpret the structure of this and add tags to it. [ background talking ] Okay, when it's finished it gives you a report so you can look through that if you're really interested in what it's telling you there. It will clue you into some things that it might have had some trouble with. What I usually like to do, immediately after I have automatically tagged something is just check out the tags and see how well it did. 
And, bless you, what I'm seeing there is pretty much what I see visually with this document, it's a lot of paragraphs. That is pretty straightforward, it seems to have done a pretty good job with the tag structure. >>Well wouldn't we want the first one to be a heading? >>We would. So we'd want to select that particular paragraph, go to find tag from selection and then click off of the paragraph so that this will actually work, right click on that P tag, go to properties, select heading one. [ background talking ] It says select the text you want to change, that's just to find it and we don't really need to do that, it's the first paragraph, open up the first paragraph to see that that is in fact the Apology for the Proof, and then this used to be a paragraph tag, you right click on it, one of the steps was click off of this because sometimes having it highlighted messes up the ability to change it, go into properties and select heading number one from the list. >>And what would you call that secondary one, would you call that a paragraph? >>The author? >>No [inaudible]. >>Yeah, I would scan the options to see if there's another tag or even if there's an abstract tag, I would use an abstract tag but it is a paragraph. >>Okay. >>Now we're going to run into another problem with this document though. It seems easy so far, let's go to page 12 and we've got some content that might make things a little more challenging than it seems. If you select that and go to find tag from selection you'll find that it too is a paragraph or actually a series of paragraphs but it doesn't know what to do with those infinity symbols, it puts them in separate paragraphs as plus zero or plus O, one above and one below the formula and then everything else is really just a jumbled mess. This is where you really can't mark this up in a way that's going to be meaningful. The screen reader's not going to be able to read that, it's just going to be gobbledygook. 
What we can do is we can take the paragraph if it's all contained in a single paragraph, we can change that, unselect things, change that paragraph to a figure or to a formula, they actually do have a formula tag but the formula tag doesn't actually provide any sort of mark up that's going to help with rendering those or making it readable, it's just going to say, you know, this is a formula. We can add alternate text to that formula though and maybe for a simple formula maybe we can verbalize it in a way that an instructor might verbalize it. You have instructors writing this formula up on the board, they might have a certain mathematically correct way of referring to this formula and reading it so that it is intelligible. For a formula like the one we're looking at and certainly we're going to get much more complex formulas then that, reading it an alternate text really isn't a good accessibility solution. Alternate text isn't something that a person can study. It passes, they read it, they move beyond the image and it's gone, they can go back and read the image again but you can't really dissect it and understand its parts. So this is where MathML comes in. You've got a mark up language specifically designed by mathematicians to render math and to add mark up that explicitly identifies the structure of a mathematical equation so that it can be displayed properly on Web pages and so that it can be read properly by assistive technologies and you can embed MathML in Web pages but you can't embed MathML in PDFs. And so if there is some compelling reason to use a PDF for presentation of this document as a whole then the solution for those formulas I would say would be to refer elsewhere for the actual formula. 
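For a sense of what that structural mark up looks like, here is a minimal Python sketch that builds MathML with the standard library. The formula on page 12 is not reproduced in this transcript, so the sketch uses a made-up simple fraction, x = a / b, and the function name `fraction_equation` is hypothetical. It is meant only to show that every part of the equation gets its own element.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

def fraction_equation(result, numerator, denominator):
    """Return MathML markup for `result = numerator / denominator`."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    row = ET.SubElement(math, "mrow")
    ET.SubElement(row, "mi").text = result       # identifier on the left
    ET.SubElement(row, "mo").text = "="          # operator
    frac = ET.SubElement(row, "mfrac")           # fraction: numerator, then denominator
    ET.SubElement(frac, "mi").text = numerator
    ET.SubElement(frac, "mi").text = denominator
    return ET.tostring(math, encoding="unicode")

# Hypothetical example formula, not the one from the workshop document.
print(fraction_equation("x", "a", "b"))
```

Because the numerator and denominator are separate elements, assistive technology can move through the parts one at a time, which is exactly what a single string of alternate text cannot offer.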
So the alternate text could say, figure 12a or something like that, figure 12, and then there could be a supplemental document that accompanies this where somebody can access formula 12 in MathML, or maybe a physical handout with Nemeth Braille, which is, you know, Braille mathematical notation, that gets distributed to people that need it, something like that. So the formulas get extracted and distributed in an accessible alternative document, and the alternate text then would refer to those formulas or would somehow, you know, allow the user to keep track of where those formulas go within the text. That make sense? So that's as good a solution as we can get at this point within PDF, since PDF doesn't support MathML. Design Science is a company that is working hard on math related solutions, specifically with an interest in math accessibility. They've been working with Adobe for years trying to come up with some math solutions within PDF, and so far we're not there yet. So I wish I had better information on that front. [ pause ] Okay, it is about quitting time. What questions do you have? I'm happy to stick around afterwards too if you have something you're not able to, yes? >>I just have one quick question. I noticed [inaudible] are those pretty accurate? >>They're pretty good in terms of accuracy. Although on my Word file, I put that last, check for lingering errors, because I don't particularly like the report that it gives. It's just my personal preference, but I think it's kind of confusing and it tends to overwhelm rather than educate. And so I prefer doing all the things that you already know about first, and then when you run a full check it will just say you have no more problems, or you have one problem, as opposed to, you know, what you're going to get otherwise. So I think that's my recommended way to use that tool. 
Do use it, but wait until the end to use it, and that way you can maybe uncover some things that you haven't thought of otherwise. Yes? >>In your next session will you be covering the LiveCycle Designer for interactive [inaudible]? >>In the forms sessions, which are tomorrow, I'm doing a couple of forms workshops that do cover both Acrobat Pro forms and LiveCycle Designer forms. The LiveCycle piece is mostly just to kind of show what some of the accessibility issues are with LiveCycle, just to kind of contrast the two. So not in a whole lot of detail, if you're looking for, you know, really specific LiveCycle stuff. That won't be there, but there is some basic accessibility stuff related to LiveCycle. >>So when you use Acrobat Pro can you make those forms interactive too? >>Interactive in the sense that somebody can complete them within the PDF itself. >>Okay, without going to LiveCycle? >>Yeah. >>Okay. See I was [inaudible] LiveCycle. >>So yeah, come tomorrow and we'll clear up some of those myths. >>[Inaudible]. >>And anybody, if you're interested in forms specifically, we'll plunge into those tomorrow, and I don't know what the process is for registering, or can you register at this point or just show up? >>I think there was still space if you got your emails. I'm not sure if there was notification in the emails that would take you to links to sign up for the others or not. But if you're interested, can you give me your name and address [inaudible]. >>I do. One of my designers [inaudible] signed up and he said it was [inaudible]. [ background talking ] There's also an online evaluation that I've been asked to promote, so if you could while you're [inaudible] yet, go to that address [inaudible] pdf eval [inaudible] 2011 and go ahead. >>I was going to say if you have [inaudible] this morning just hit refresh a time or two and it should take you to the [inaudible] eventually. But if not I have paper copy...