That’s it.

Well, time to sign off. I’m not sure what the future holds. I am on sabbatical 2017-2018, so if SC200 runs next Fall, it won’t be me running it. Whether that hiatus becomes permanent depends on what’s happening when I get back from sabbatical and what the College wants to do with SC200. There are opportunities for new types of Gen Ed course now. I could do something different. I can imagine interesting things to be done now that trans-domain courses are possible; I quite fancy joint teaching a course with someone in history or philosophy or literature or economics. I could also focus SC200 a bit more (perhaps entirely on to medicine and health care?) or keep the same general themes and just find different material to try to stay fresh. All things to ponder on sabbatical. I can’t help thinking that the themes, objectives and subtext of SC200 will become even more important in the coming months and years. Science is going to remain the best system for knowledge generation and problem solving that humanity has, but it is also a hugely civilizing process. Looking forward from this tumultuous year, it looks like we will need that side of things more and more.

But for now, a big thanks to people who made the course happen this year. Thirteen others contributed. Thanks to:

  • The class TAs, Brian, Eric and Sarah. Huge efforts, well above and beyond. The students and I really appreciate your efforts to support the students’ learning and blogging.
  • The guest speakers who volunteered their time and their very different perspectives, Eberly College of Science Dean Doug Cavener, Mike Mann (Meteorology) and Jason Wright (Astronomy). Thanks for making us all think, me and the students alike.
  • Monica, who kept me sane while dealing with endless emails, handouts ready in the nick of time, and pieces of paper for attendance grades and extra credit.
  • The graders, five hard working grad students who got things done under immense time pressure and, most impressively and despite the best efforts of sites@psu and Campus Press, found most of the students’ work. Thanks too for taking the time to give the students such detailed feedback.
  • Chris Stubbs, TLT tech guru who got the site up and running and then put up with my frustrations at what he no longer had control over.

Jason Wright in class, Nov 8, election day.

And to the Class of 2016: Thanks for the challenges. Keep thinking. Have a nice life.

Self reflection

OK Andrew, what worked and what didn’t work this year?


  • Classroom discipline MUCH better. There were still some complaints about distracting whispering, but nothing like the last few years and none of the shameful s*** that’s happened before. None of it got me worked up, unlike previous years. I think the difference this year was that I talked more at the start of semester about it, pitched it to the students as a deal (the deal works like this, you do x and not y and I will do a and not b). I also took a very light touch approach in class when it happened, politely asking people not to. That seemed to go better than trying to shame them into silence (and way better than getting pissed). I think walking up and down the aisles really helps too.
  • Plagiarism. Just one case this year. Broadly speaking, the countermeasures worked. Unlike the College Academic Integrity Committee, which continues to underperform in this important and delicate arena.
  • I did a lot more work with the students on soft skills. That felt good. I hope it set at least some of them up for a better College experience. No way to know of course, so like a good physician, I’ll roll with confirmation bias and assume I did good.
  • Again, no problems with the laptop ban, and progress on the phone issue.
  • Explaining to students more about why we are doing things was good.
  • Attendance algorithm worked much the same as last year. I think stick with that.

Room for improvement:

  • More active learning. Things like this (interestingly, that video is pitched as being about clickers, but in fact they are pretty irrelevant to that sort of teaching practice).
  • More debates. For example, is it worth testing homeopathic vaccines?
  • More challenge questions to stoke curiosity. Mini research projects?
  • Build a capstone class around John Oliver’s great show?
  • The students continue to find their blog grades disappointing, by and large, but at the same time by and large don’t produce excellent work. Need to think more about why that is. Show more examples of best practice? Or am I fighting the Facebook/Twitter drivel? Perhaps students don’t read much good writing these days?
  • Talk more about cognitive failings of humans. Use the analogy with optical illusions more. Our eyes/brains can deceive us. Similarly we have cognitive defects too. Monty Hall good for this. Bring this out even more e.g. with the Linda problem. That scientists can use science to both study and (somewhat) overcome these problems, even though everyone has them.
  • Talk more about note-taking earlier on. Do some examples — it worked really well this year when I reviewed (all across the blackboard) what Mike Mann and Doug Cavener had said, partly because I (hopefully) reinforced and clarified their messages and partly because the students could see what I mean about reviewing after class what had gone on.
  • The blog software. Groan.
  • Is there some way to motivate students to solve the procrastination problem? Do a class on procrastination science? That might be interesting to think about, especially since what I read does not make too much sense. It’s like a huge human blinder.

The big unknowns remain big unknowns. (1) Am I pushing the students hard enough or too hard? (2) What impact are any of my efforts having? I can imagine ways to investigate #2, though not easily. I still can’t even imagine how to investigate #1. That’s just not my problem. It’s a College universal. If I didn’t find my science so interesting, I might turn my scholarship activities to ruminating on that. I’ve made no progress on it since I first started pondering it in 2011. I haven’t seen or heard of anyone else even wondering about it, even though it goes to the heart of this huge, expensive industry we call Higher Education. Most industries reflect vigorously on what they are doing (or they are forced to by outsiders); why don’t we?

Extra credit

My use of extra credit has grown over the years, despite my concerns about grade inflation. I use it to

This year I really went crazy with it, in the end offering nine different routes to extra credit. I capped it at 10% so that extra credit does not dominate the grade, but within that constraint, a student can go for it as they wish. Much to my amazement, almost no students make anything close to full use of it. It is much easier to get 10% through extra credit than it is to get an extra 10% by doing better on the tests or blog.

I offered extra credit for:-

  1. Particularly lucid, stimulating, artistic or lateral blog posts (max 5%/post). This is to encourage/reward outstanding work. I thought few students did enough to deserve this, but it’s very good to have the option to reward those who go above and beyond.
  2. Suggesting exam questions. You have to really know your stuff to write exam questions. Just 5 students offered any up, even though I’d give them 2.5% for every question I used.
  3. Finding a mistake in a class test or an exam that causes me to regrade (max 5%/mistake). A couple of students suggested mistakes, but they were typos and so did not need regrading. Nonetheless, I think this extra credit is good because it emphasizes the possibility that Professors can be wrong, it gets hypervigilance going, and students who argue with me learn, even if they are wrong (especially if they are wrong?).
  4. Partaking properly in the first blog period (1%). This is an anti-procrastination (get-off-your-butt) carrot which I was trying for the first time. It did not work at all: only about 100 students participated properly, fewer than last year when no extra credit was available.
  5. Blogging ahead of deadline (2%/deadline). This was a time-management carrot. It too did not work.
  6. Surrendering phones in class (1%/time). This did work.
  7. Writing an extra blog (2%). I asked the students to write about something they learned in class and how it had or might change their life. Just nine students took advantage of this. Maybe that means the course had no impact on the 310 other students. But I thought all of the nine were really interesting, especially this and this and this and this.
  8. Opt in to names in the hat (1%). A little under a third of the class did this, which says something about students, but nonetheless, I liked this solution to the problem of cold-calling students in large classrooms.
  9. A bribe to get the SRTE return rate up (1%). I wasn’t going to use this bribe this year, but with just a few days to go, only 30% of the students had given feedback through the Student Rating of Teaching Effectiveness system. Since that 30% was for sure not going to be a random sample [as is clear from what appears on Rate My Professor], I offered the 1% extra credit to everyone in the class if the class return rate got above 80%. It hit 82.5%… I’ve agonized before about this shameless bribe, but I think we have to do it if we are going to take anything meaningful from the SRTEs.

On average, the class got 4.7% extra credit. That’s pretty amazing, given that 4% would happen more or less automatically (3 x 1% for the phone-ins + 1% for the SRTE bribe). Just 13 students got the maximum extra credit and only 34 got 8% or more. I am sure I had more students just ask for more grade.
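For anyone who wants the arithmetic spelled out, here is a minimal Python sketch of how capped extra credit adds up. The helper `total_extra_credit` and the "keen student" awards are made up for illustration; only the 10% cap, the three 1% phone hand-ins and the 1% SRTE bribe come from the numbers above.

```python
# Hypothetical sketch: sum a student's extra credit and apply the 10% cap.
EXTRA_CREDIT_CAP = 10.0  # percent, the cap described in this post


def total_extra_credit(awards):
    """Return the capped extra credit (in percent) for a list of awards."""
    return min(sum(awards), EXTRA_CREDIT_CAP)


# The more-or-less automatic baseline: three phone hand-ins plus the SRTE bribe.
baseline = total_extra_credit([1.0, 1.0, 1.0, 1.0])

# An invented keen student whose awards add up to 13.5% before the cap.
keen = total_extra_credit([5.0, 2.5, 2.0, 1.0, 1.0, 1.0, 1.0])

print(baseline)  # 4.0
print(keen)      # 10.0
```

Seen this way, a class average of 4.7% means the typical student added well under one percentage point beyond what arrived automatically.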

Bottom line? There is an administrative cost to all this extra credit, and I was able to keep on top of it only because I have Monica supporting the course. Without that, I am not sure I would keep anything except #1-3 and #9. But otherwise, I think worth persisting for the bullet-pointed reasons I give above. For professorial peace of mind, buffers against student complaints and begging are not to be underestimated. More positively, carrots are at least in principle a good way to nudge student behavior, even if there is not much sign they actually worked on my students. Perhaps the time management/anti-procrastination carrots need to be bigger (#4, #5). Just how much do I need to bribe students to do what’s good for them?

Software de-grades

My biggest headaches this semester came from the software platform we used for the class blog. ‘Up’-grades happened this year which, incredibly, degraded functionality. As currently configured, sites@psu is not fit for large-class teaching. The software now creates unnecessary work for instructors and frustrations for students, all while simultaneously creating novel ways for students to cheat, with no way to catch them.

  1. Most irritating was the degraded ability to find students’ work. We used to have an alphabetically-arranged Contributions page visible to the world. It enabled students (and us) to easily see with hot links what work they had done within a fixed time window and to find their class mates’ work. That made it easy for them and, most important, it made it very easy for us to grade. The 2016 ‘improvements’ hid all that. Now, no one on the outside can find anything and the students themselves have to log in to the system and run a report on themselves. And the graders? We ran endless reports. Click click click. Tick. Tick. Tick.
  2. During the grading of the third blog period, someone changed the method for running reports on students. You are in the middle of grading hundreds of blogs and someone replaces one lousy search algorithm for another lousy search algorithm — all for no obvious gain?
  3. The search widget does not search by author. Wtf?
  4. We had to rely on students (!) to go into their profiles and make their names correct. As administrators, we couldn’t do that. Students appeared by default with their user I.D. (afr3). They could then call themselves Drew, Andy, Andrew or leave the afr3. We had to ask them to call themselves what our class lists call them. Otherwise we have to be like detectives to figure it out. My favorite: Alexander called herself Xander. When you are searching a drop-down list of 300 students arranged alphabetically by first name…
  5. Yup, that’s right. For much of semester, you could not run reports on a student’s surname or user id. There was just a drop down menu in alphabetical order of first (!) name. At one point we had a list of students arranged by first name followed by the remaining students arranged by ID number. I did so much scrolling up and down that list.
  6. The default time zone for the blog? Central Russia (no kidding). We figured that out after the first deadline cut off a lot of students’ last minute work.
  7. Some moron set up a clone site. This might have been in response to my complaints about losing the contributions page. I like that they tried. I did not like that they failed. But worse, they made it so the students could post to the clone site. You can see it here (check out the URL!). Once we figured out that there was a live mirror site, I disabled student access to it. But too late. You can still see on the clone site the students who posted to it. That’s the work we did not grade until student complaints unearthed it.
  8. The ability of the grading team to get into the site and find students completely stopped for many hours during a grading period (10/22/16). We have a team of five graders trying to get it all done in less than a week and we lost the better part of a day — without explanation or apology.
  9. My instructor blog vanished completely for six hours (10/17/16). Again no explanation or apology.
  10.  Despite my endless exhortations, many students post at the last minute. Some of this last-minute work took more than 12 hours to become findable because the blog under pressure does not post straight away. We know this because some work appeared after the graders had graded a student… Oh, the complaints (from students and graders).
  11. There is no way to tell if the site is about to exceed its storage limit. Right now, my dashboard tells me that with something in the order of 2,000 posts this year, similar numbers for the classes of 2012, 2013, 2014 and 2015, as well as this Reflections blog, I have 0.00% of 2.93GB used.
  12. There is no log. That’s the thing that would tell you who had done what on the site when. That’s what you need to check whether students are cheating or misleading you. And that matters because:
  13. Unbelievably, the site lets the student determine the publication date of a post. They can do work after a deadline and make it look like they did it before the deadline. I discovered that early in semester and I could not believe it. If you make it possible for students to cheat, some will. Maybe it is good there is no log. I can not tell how often we were taken for a ride.
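The time zone problem in item 6 is easy to reproduce. Below is a small Python sketch of what happens when a deadline typed as "23:59" is read in the wrong zone. Everything specific in it is an assumption for illustration: the date, the fixed UTC offsets (UTC-4 for US Eastern daylight time, UTC+5 as a stand-in for "Central Russia") and the helper `is_on_time` are mine, not anything the blog software exposes.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets, for illustration only.
EASTERN = timezone(timedelta(hours=-4))  # US Eastern daylight time
SITE_TZ = timezone(timedelta(hours=5))   # stand-in for the site's default zone


def is_on_time(posted_at, deadline):
    """Compare two timezone-aware datetimes as absolute instants."""
    return posted_at <= deadline


# The deadline the instructor means: 23:59 Eastern on a made-up date.
intended = datetime(2016, 9, 16, 23, 59, tzinfo=EASTERN)
# The same wall-clock time as the site would read it in its default zone.
misread = datetime(2016, 9, 16, 23, 59, tzinfo=SITE_TZ)

# A student posts at 21:30 Eastern, comfortably before the intended deadline.
post = datetime(2016, 9, 16, 21, 30, tzinfo=EASTERN)

print(is_on_time(post, intended))  # True
print(is_on_time(post, misread))   # False
print(intended - misread)          # 9:00:00
```

Under these assumed offsets the site closes nine hours early, which is exactly the window in which last-minute posters do their work.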

Juggling 300+ students is hard work, especially on top of a busy research and administrative life. Time is everything. Brain space is everything. I struggle to put into words my feelings about the hours and energy I wasted dealing with software-induced student complaints and concerns. I dare say the College is also unimpressed with the cost of the extra hours the graders had to spend tracking down students’ work. Writing this post has taken even more time I will never get back. I hope it leads to constructive action on someone’s part. Whose, I have no idea. These days, you never get a person to deal with.

In 2010, tech guru Chris and others persuaded me that we could make blogging work for a large class. And indeed, Chris made it work, year after year. For the first five years, the blog software never got in the way of teaching and was never a pointless time-suck. Those were the good old days. In those good old days, Chris had control and was able to build the site himself to aid my pedagogy and grading efficiency. No more: sites@psu got outsourced to folk who don’t believe in local control. Last year, the change of platform was all mildly irritating. This year, I’d have given anything for the old functionality.

Indeed, if this year’s performance had happened in year 1, I would have given up blogging and returned to conventional term papers. And I will unless we get back the functionality we once had. I continue to think blogging is an exceptionally good teaching tool. But this year, the hassle didn’t justify the pedagogical gains. Not even close.

The only good thing I can say about this year was the speed with which the folks at Texas-based Campus Press (to whom things have been outsourced) got back to me. I learned that if you put URGENT or EMERGENCY in the email, you got a rapid response. As to the responses themselves? Well, here’s one: “The reports did change and unfortunately at this point I don’t have any way to change them back to the previous version.”

Of course a real software ‘up’grade would involve a gain of function. Two new things I would like: (1) A text editor for comments. If you are an administrator editing an existing comment, a text editor appears. But not if you are a student. They have to use html, if you can believe it. That has caused so many utterly pointless headaches for students and TAs over the years. Fixing that would not be an innovation. It would just be making existing tools accessible to actual users. (2) Automatic plagiarism software. This would be an innovation. It would be great to have something that we can turn on after deadlines, and which compares the material on the blog with the rest of the internet (and not least SC200 blogs from previous years). The process doesn’t need to be instant (it could chug away for a week). If that’s too computationally intense, something simpler could still be very useful. For example, just taking well-formed sentences from every blog post (or even a random sample) and doing a google search for that text string would be good. If that’s too much to ask, then how about something that checks the current semester’s posts against SC200 blogs from previous years? Plagiarism is a big issue for teaching via a blog. Be great to have a blog that worked with the instructor to make things better.
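To show how modest the simplest version of that request is, here is a rough Python sketch of the last idea: checking a new post for whole sentences that already appear in an archive of earlier posts. It is an illustration only, not a real plagiarism detector. The function names, the six-word minimum and the sample texts are all my own assumptions, and it only catches verbatim copying.

```python
import re


def sentences(text):
    """Split text into lowercased sentences, keeping those of six or more words."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    cleaned = (re.sub(r"\s+", " ", p).strip().lower() for p in parts)
    return {s for s in cleaned if len(s.split()) >= 6}


def shared_sentences(new_post, old_posts):
    """Return sentences in new_post that appear verbatim in any old post."""
    old = set()
    for post in old_posts:
        old |= sentences(post)
    return sentences(new_post) & old


# Made-up example texts standing in for previous and current blog posts.
archive = [
    "Vaccines do not cause autism, according to many large studies. I was surprised."
]
fresh = "My take is simple. Vaccines do not cause autism, according to many large studies."

print(shared_sentences(fresh, archive))
```

Short sentences are skipped on purpose, since stock phrases would otherwise flag half the class. Paraphrased copying would sail straight through, which is why the full internet-wide comparison would still be the real prize.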

Mind you, I’d just settle for one that didn’t just make things harder.

Phones con’t.

There is evidence that phones are toxic for learning (e.g. 1, 2, 3). My students agree (2015, 2016). So what to do? I tried several things this semester, all for extra credit (1% each time).

(1) Collecting phones. That worked, but it’s a scene and a half. It could be improved on by collecting the phones when the students are in their seats. That would cut down on the time it takes to collect them. Returning 300+ phones would still be, well, a scene.

(2) Honesty. This was Julia’s idea: get the students to swop phones and sign a paper form to certify that they had their neighbor’s phone for the entire class. This worked very well the first time I tried it. The second time, when the students knew how the system worked, we had at least two cases that were most easily interpreted as outright cheating. But for that, I would have tried it a third time. Several students were mighty pissed that a couple of cheaters meant the whole class missed an opportunity for extra credit. Me too.

(3) Flipd. This is an app that students download. The download is free, but there is a one-off charge of $3 to actually use it for classroom credit. The instructor sets up the class times, the app notifies the students when to flip their phone off, and the system lets the instructor see who has not used their phone during class. I like it because the phone still works, so people who need to be contactable (those with offspring in childcare for instance) don’t get excluded. But I trialed this app with the TAs; the four of us did not find it reliable enough to roll out to 350 students. It’s simple enough to use, but if you push the wrong button at the wrong time, the wheels fall off. I could just imagine endless emails of the ‘but I was there’ sort. Many professors are making Flipd work and I am sure as the software comes on, it will be great. The $3 is a bit of a downer. Maybe extra credit for the price of a cup of coffee is something students would go for. Instructors can negotiate a class rate – I got it down to $2/student – but then the instructor has to pay ($2*350 students=$$$).

I discussed the various issues with Cristian Villamarin, the guy based in Canada who wrote the app and runs the company. He sent me a flier and a presentation on the system (I enjoyed that one of his slides came direct from the PSU discussions I blogged about). He’s been pretty interactive since we talked, putting me in touch with another PSU professor who has been using it successfully. Flipd is probably going to be the solution when the reliability kinks are sorted (Cristian says they are). I also like Cristian’s slogan summing up the aim of all this: Life is Like a Camera: focus on what’s important and you’ll capture it perfectly.

(4) Pocket Points.  This app is 100% free and students gain points they can use for discounts on food around town, so they are motivated to use it. But the problem is that it gives a list of ALL the PSU students using it on campus at any time (which can be many hundreds): you can’t get a list of just your own students. Moreover, it shows the list in real time, not who was there for the full class period. So while it is a great way for students to impose discipline on themselves, it is not going to be a way to use extra credit to incentivize self-discipline without a major overhaul.

I did not try a solution Bill Goffe pointed me to. Yondr is a hardware solution which even got a mention in the NY Times. Yondr told Bill they have a subscription model — $1.50/pouch/month and 4 undocking stations. Doug Paris at Yondr is the contact. This could be an interesting way to go, but the hardware aspect means one more thing for students to forget/moan about.

In all of this, there is a dilemma for me: phones can be good for active learning. I use PollEverywhere to poll students and to run a comment wall so they can text questions if they are too nervous to put their hand up. I think that is a good option in large classes (not everyone likes to speak in front of 300+) and I hate clickers (and so do students). So how to balance those upsides of the ubiquitous phones with their toxic downsides?

Here’s a possible answer. After each of our three phone hand-ins/swops, I noticed fewer phones out during subsequent classes. Could it be that encouraging students to disengage from their phones just a few times in semester is enough to show them how much better off they are when they focus on the classroom……? Could just a few sessions be enough to show them that it is possible to leave texting and social media for a whole hour without the world ending? If so, Julia’s honesty system on a few well-chosen occasions might be enough. Is that too much to hope for?

I feel like there ought to be education specialists or teaching learning and technology specialists trying to sort all this out. Surely none of this is rocket science.

Study Smarter Not Harder

I’ve learned that most students have learned little about how to learn. This leads to the annual tragedy of students asking for a higher grade because they worked hard. That argument doesn’t cut it in the real world. It doesn’t even work in my world (please give me a grant or publish my paper — I worked ever so hard).

No one cares how hard you work. They care about what you did. Outputs count — inputs don’t.

As far as learning goes, the best bet is to learn how to learn efficiently. Early in semester, I talked about this in class a fair bit, and posted various things to the Angel site to encourage students to learn better. Angel is about to vanish, so for posterity, here they are.

  1. Study Smarter Not Harder handout from John and Jackie’s class. Their 2 hour class is voluntary. You’d think their small classroom would be packed. It’s not. I guess students are too busy studying inefficiently.*
  2. John Water’s MOST excellent guide on how to study for exams. Acting on the information in this handout could transform the lives (or at least the transcripts) of so many students.
  3. Study Tips
  4. Top 5 Ways to Accelerate Learning
  5. Make It Stick. An awesome book. Should be compulsory reading and the focus of all Freshmen Seminars.
  6. Good generic wisdom from SC200 2015.

The Class of 2016 offered up remarkably similar advice:

  • Take more and better notes
  • Ask the professor and TAs more questions
  • Pay more attention in class
    • Stay off the phone
    • Sit closer to the front
    • Sit away from friends
  • Don’t skip class
  • Go to exam review sessions
  • Review notes regularly after class

*My son did an earlier incarnation of John and Jackie’s class. He said it was the most useful two hours he’d spent at PSU.

What to make of this?


The class tests and final exam are all identical in format. In an ideal world, we should see grades improve steadily through the semester. We did last year. That would look like a steady increase left to right for the A’s and a steady decrease left to right for the lower grades. We don’t really see that. Well, certainly not for the A’s. Maybe the B’s and C’s are doing sort of the right thing. A simpler interpretation is that not much happened over the four class tests and then there was a huge jump in performance for the final exam.

When I first noticed how much better the class did in the final exam than in the class tests, I was pleased (they had learned something! there’ll be fewer complaints! etc etc). Then it started to gnaw at me. The final exam is open, on-line for five whole days (120 hours) and, like the class tests, the students get a second go at the exam, having learned what their score was first time around (but not which questions they got wrong). Could there be widespread cheating? Now this is not something nice to think about, far less discover (just think of the time-suck it would be to run large chunks of the class through academic integrity proceedings). But I decided I should anyway take a look.

The set-up is vulnerable to a class that gets really organized and uses their first exam attempts to try to work out the correct answers. That would take some serious amount of class-wide coordination to pull off. But let’s imagine what that would look like if it happened. Most obviously, test performance should improve over the five days the test is open. There is no sign of that. In fact, if anything overall performance gets worse through time (I believe that’s because procrastinators do worse on average).


Each point is a test score; students get two goes so there are about 600 scores here. I ask 28 questions and grade out of 25; that’s the plotted score. More than 100% is therefore possible (note the two times that happened, it was me, once to test the test and once to test what Angel does when you get something wrong).

This picture doesn’t rule out some types of cheating (e.g. the highly illegal business of getting someone else to do the test for you), but I think it does rule out most plausible scenarios of large-scale class-wide fraud.  So I guess the simplest explanation for the performance jump in the final exam is some combination of (a) me setting an easier exam, and (b) students having more time and motivation to do well.
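For the record, the check I am describing amounts to asking whether scores trend upward over the hours the exam is open. A minimal Python sketch of that idea follows, using an ordinary least-squares slope. The `slope` helper is generic, and the six attempts are invented numbers for illustration, not the class data.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Invented attempts: (hours since the exam opened, score in percent).
attempts = [(2, 92), (10, 88), (30, 90), (60, 86), (100, 84), (118, 80)]
hours = [h for h, _ in attempts]
scores = [s for _, s in attempts]

trend = slope(hours, scores)

# A clearly positive trend would fit answers leaking through the class;
# a flat or negative one, as in these made-up numbers, would not.
print(trend < 0)  # True
```

A slope alone is a blunt instrument. It would miss a small ring of students sharing answers, which is the same caveat that applies to eyeballing the plot.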

The way to be 100% sure about this would be to do the exam proctored in the exam center. What a performance for a class on this scale — plus, some students would miss it and need a re-take test. That would be all so tedious.

2016: the bottom line

I calculated the final grades almost a week ago and then let them sit on Angel until now. This is to give the students a chance to complain. That generates a bit of e-traffic but very effectively crowd sources the search for errors in my grade book. With 300+ eagle eyes on it, I am now confident there weren’t any. So the grades are officially posted today. They look like this:


The class average is 87.6% (B+), or 89.6% (B+) for those who passed. We started with 358 students; we ended with 317. Among the finishers, 50% got some type of A, 66% got a B+ or better and 80% got a B or better. With extra credit, 11 students got >100%. Altogether rather similar to last year.

I say it every year, so I guess I’ll say it again: what to make of this grade distribution? Is it about right or too high or too low? We had a Biology faculty meeting a while back, that I sadly missed (not often I say that), where the proportion of A’s was being discussed. In biology classes for 2013 with more than 20 students, the numbers looked like this:

% A and A- | 100-200 level Bio courses | 400-level Bio courses
Mean | 24% | 42%
Median | 27% | 36%
Range | 13-39% | 13-99%

Everybody except the person awarding 100% A’s thought 100% was too generous. The minutes from the meeting helpfully say: “Faculty Senate policy allows faculty to grade according to their best judgement. Although programs can provide guidelines, ultimately grades are at the discretion of the individual faculty member. Several faculty shared their experience of figuring out their grading criteria with little to no guidance. It was widely agreed that some departmental guidelines for grading would be helpful.” No such guidance has been forthcoming because I don’t think any such guidance is possible. It’s a fundamentally challenging problem. The problem is even more difficult for Gen Ed courses where there are no professional discipline-specific views on relevant standards (and how can there be?).

Is 24% about right? My grade distribution with its 50% of A’s is clearly out of line with the 100-200 level Bio courses. Does that matter? People get excessively steamed up about grade inflation, but if we worry about that from data on the proportion of A’s, it implies that the only thing that matters is relative success. And if that’s important, our job is not what I think it is, but instead it is to identify and anoint the top x% of students. Which is CRAZY.

Actually, thinking about this too hard might drive me crazy. Previous ruminations are here and here. I am making no mental progress on this problem at all. Worse, I don’t see anyone else even engaged with it. In the shower this morning, I had a thought: isn’t the search for an ideal grade distribution fundamentally silly? What I should care about is the impact I am making to the way students think about the world. The grades might say something about that. But probably not much. So, Andrew, think about what’s important, not what is easily measured. Ruminate on that.

The line in the sand

It’s that time of year where I get inundated by email with students asking for a better grade. These requests fall into two categories.

  1. They’d just like some more. My 2015 response to that is here.
  2.  They’d like to be rounded up. My 2015 explanation of my rounding algorithm is here.

To both those 2015 posts, I note that this year there was up to 10% extra credit available. Students who want a higher grade might think about why they did not make full use of that.

Moreover, all students got at least 1% extra credit (the bribe to get the class SRTE rate return rate above 80%, which it duly was), and many students got more as carrots for time management, phone hand-ins, names in the hat….  That means all students just below a grade boundary got as close as they did because of extra credit — not academic performance. If I took away the non-academic extra credit, they would not be close. Grades are earned not requested.


Final Exam

I am always pleasantly surprised how well students do on the final exam. This time, the average was 88% (B+) (89% (B+) when the fails are excluded). That’s a full 15% better than Class Test 4. There were 7 fails and 5 no-shows. No students got everything right, but on my ask-28-questions-grade-out-of-25 algorithm, 64 students got 100%, and six students got 26/28. There were 119 A’s, 75 A-, 38 B+, 25 B, 16 B-, 18 C+, 4 C, and 9 D’s.
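
For anyone puzzled by the ask-28-questions-grade-out-of-25 algorithm, here is a minimal sketch in Python. It assumes the score is just the number of correct answers out of 25, capped at 100%. The function name and the cap are illustrative assumptions, not the actual grade book code.

```python
def score_out_of_25(correct, asked=28, graded_out_of=25):
    """Percentage score when `asked` questions are set but fewer are needed.

    Assumption (not confirmed by the grade book): any number of correct
    answers at or above `graded_out_of` counts as 100%.
    """
    if not 0 <= correct <= asked:
        raise ValueError("correct must be between 0 and the number asked")
    return 100 * min(correct, graded_out_of) / graded_out_of


print(score_out_of_25(26))  # 100.0
print(score_out_of_25(20))  # 80.0
```

Under that assumption, a student can drop up to three questions and still score 100%.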

Once I have dealt with the final grades, and the e-correspondence they generate (“please sir, can I have some more grade”), I might come back and muse some on why the final exam performance is so much better than in the class test which was just a few days earlier (a 15% jump in average performance, from a C to a B+, in just five days?).

The circumcision decision

In my final class, I drew the students’ attention to this really excellent example of how to rationally think about evidence and reach a conclusion. I hope all SC200 graduates are now capable of assembling relevant data and thinking about it the way this author does.

I like his discussion because it leads him to a conclusion opposite to that reached by the Federal Government (CDC) and by the American Academy of Pediatrics, a professional body that should know a thing or two. I have no strong opinion on the author’s conclusion, but I sure do like his reasoning process.

I raised this particular topic because the majority of students will face this uniquely delicate decision in their not-too-distant future. As with so many decisions, they can take the easy route and accept the first bit of advice they hear from an authority figure like their mom, minister, medic or some moronic website — or they can think about it critically themselves. I really hope I empowered them to do that.

Overall blog grades

This terrific cartoon appeared as a response to a questionnaire I gave the students on the lessons they learned from SC200 on how best they could improve their learning and their grade

There are three blog periods during the semester; at the end of each, students get a grade and personalized feedback on their work. I take the best grade from the three periods. This algorithm encourages improvement, mostly lifts games and sometimes delivers brutal lessons in time management.
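
The best-of-three rule can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name is hypothetical, and a period with no participation is represented as `None` and simply ignored.

```python
def overall_blog_grade(period_scores):
    """Return the best score across the blog periods.

    `period_scores` holds one percentage per period; None marks a period
    in which the student did not participate (an assumed convention).
    """
    attempted = [s for s in period_scores if s is not None]
    return max(attempted) if attempted else 0.0


# A student who skipped period 2 and improved in period 3.
print(overall_blog_grade([55.0, None, 82.5]))  # 82.5
```

Taking the maximum means a bad early period never hurts, which is what makes room for improvement.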

The final blog grades were: A, 9; A-, 29; B+, 27; B, 42; B-, 41; C+, 66; C, 45; and D, 35. Incredibly, 20 students failed to do enough work to pass. A further five students did nothing at all. Ever.

So about 10% of the class achieved some kind of A. That seems about right to me. I wonder why that feels about right. I said the same thing when 33% of the class got some type of A on the class test final grade. We professors are left to set the bar where we want (unless it’s a subject with a long history like math, where there seems to be agreement [how?] or where some professional body stipulates authoritative standards [derived from….?]). This means the height of the bar becomes a great source of tension, and one which is completely ignored by university authorities because it’s a really tough problem. I set the bar where I feel good about it. I think we want to stretch the students without discouraging them. I have untenured colleagues who low-ball it so students are attracted to their classes so they can keep their job. I wish I could wrap my head around an incentive structure that results in faculty job security as the primary determinant of student performance. Sadly for my disgruntled students, I have tenure and so am free to determine my expectations of students. Mine come from a very different source and a firm belief that because this stuff matters, it’s better if things are challenging. At least one student agrees:


Blog Period 3 results

Well, we made it. The blog is done.

Blog Period 3 results: We had 167 no-shows, over half the class. A further 34 didn’t do enough to pass. Among those who did, the average was 77.3% (C+). Under my best-of-three-blog-periods grade algorithm, it is always a little hard to know what to make of the grades for the third and final period. The students who participate include those trying to improve their grades from previous blog periods. They are typically going for gold. And then there are a ton of students who are participating for the first time, so they have not had any feedback on previous work and worse, being procrastinators, many of them leave it till the end of the period before they start doing anything. We had some shamefully bad first attempts just hours before the deadline. Sigh.

Still, there were some posts I really enjoyed. Students unhappy with their grade (or indeed other things), might try forcing a smile. Much to my surprise, birth control provides fertile (groan) grounds for a discussion of confirmation bias, and reading real paper might be better than reading screens if you want to learn stuff. And if you are germ phobic, it’s better to touch the toilet than anything else in a public bathroom. A post on zombies contained material I had not known of — and used in class. There is also a great post about a hugely important topic: that your health might be massively affected by your social status. I think that will become one of the biggest issues in employment law one day. Workplace health and safety was once ignored; now it’s literally top of the agenda. But almost no one gets hurt by the stuff that involves. Instead, people get hurt if they are not fairly promoted. On a more cheerful note, I enjoyed learning about grocery store hunger and there was a nice discussion of what seems like obvious nonsense to me: a device you pour your wine through which allegedly stops hangovers. Sign me up for a randomized control trial of that. Though not if I have to pay $80 for the damn thing. As Valerie pointed out, you can buy several bottles of wine for that.

You can buy ten times more wine for $800. That’s the scale of the bill I was sent for my last annual medical check up. There was nothing wrong with me before I went and nothing after. But there was a lot wrong with the doctor who ordered a whole lot of tests my insurance company did not think I needed and even more wrong with the ever exasperating Mt Nittany Health who simply can not do billing properly. So I much enjoyed a post summarizing the evidence that annual check-ups are a waste of time when you’re well (the idea that healthy people need them is a myth put about by physicians). Michael even explains how to get a check-up for free. I wonder how many tests I would set if I got money each time a student took a test. And how high blog scores would be if I got paid each time I awarded an A. But I don’t. So:

The overall grade distribution was: A, 2; A-, 10; B+, 8; B, 11; B-, 21; C+, 24; C, 15; D, 27; Fail, 34.

Why should we teach science to nonscience students?

“What do we hope to accomplish?” asks Stuart Firestein in his interesting 2016 book Failure. I like his answer:

“Science has for the past four or so centuries provided more and better explanations about nature than anything in previous recorded history. Mostly it developed a strategy for finding stuff out and knowing whether to believe it or not.”      

As he says later in the book: “Science is the best method I know for being wary without being paranoid.”

Where we currently stand

With the final blog period, the final exam and some extra credit to be added, we have six students on more than 100% for their final grade, and about 85 with some kind of A. Conversely, we have about 40 on a fail. The final blog grade should save many of those.

Class Test Score Overall

Bummed out by the Class Test 4 scores, I decided to have a quick look at the overall score for the class tests (I take the best 2 of 4). This usually cheers me up. And indeed it did. The distribution is: A, 21; A-, 81; B+, 36; B, 68; B-, 35; C+, 24; C, 29; D, 12; Fails, 9.

So about a third of the class are on some type of A. That seems about right to me.

I wonder why that feels about right. There are no guidelines on this whatsoever. We professors set the bar as high as we want. How high to set the bar is the hardest problem in Higher Education and everyone avoids it like the plague. I suppose the reason it cheers me up to have a third of my students on an A is that no individual test turned up that many A’s. My take-the-top-two-test-grades-of-four algorithm allows improvement and fluctuating performance. So I get to challenge the students and many get well rewarded. No trade-off.

Another observation: four students got an overall class test score of 100%. None of those got 100% in all four tests. I think that is good. Even those attaining the very highest scores still have something to reach for. I feel better about that too.
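
A minimal Python sketch of the take-the-top-two-test-grades-of-four algorithm shows how that can happen. It assumes the overall score is the plain average of the two best tests and that a missed test counts as zero. Both are assumptions for illustration, and the function name is hypothetical.

```python
def overall_test_score(test_scores, keep=2):
    """Average of the best `keep` test percentages.

    Assumptions: a missed test (None) counts as 0, and the kept tests are
    weighted equally.
    """
    ranked = sorted((s if s is not None else 0.0 for s in test_scores),
                    reverse=True)
    return sum(ranked[:keep]) / keep


# 100% overall without 100% on every test.
print(overall_test_score([100.0, 92.0, 100.0, 88.0]))  # 100.0
# One missed test does no damage if two others are strong.
print(overall_test_score([70.0, 80.0, None, 90.0]))    # 85.0
```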

Class test 4: sigh (again).

The results from the 3rd Class Test were the most disappointing I’ve ever seen. Yesterday was Class Test 4 and the results are only a shade better. Among those who took the test, exactly the same class average as Test 3, 74% (C). But that average comes from way more A’s and A-’s (44, up from 26 last time), fewer B’s, C’s and D’s (good) but more dreadful fails. Oddly, we also had way more no-shows (53, up from 19 last time). So the best we can say is that the decline in test performance across the semester has been arrested. The average grade among those who passed was 80% (B-).

Two students got 100% on my ask-28-questions-grade-out-of-25 algorithm, but the highest score was 25/28. The rest: A, 14; A-, 30; B+, 39; B, 33; B-, 24; C+, 20; C, 16; D, 43; Fail, 53.

When I look at the performance on the individual questions, there were what I would term “performance problems” with the questions involving stuff I’d gone over in class in the most recent weeks. Perhaps that’s a consequence of the low attendance:

  • Just 40% of students in the class thought the strongest reason to think a Zombie apocalypse couldn’t happen is that it hasn’t so far; maybe that’s because only 49% of students were in class when I made that case.
  • Just 52% of students were in class when I discussed the famous ‘experiment’ where Barry Marshall drank H. pylori laden broth, only to get sick, not ulcers, thereby providing (anecdotal) evidence against his hypothesis that H. pylori causes ulcers. Only 45% of the class knew that.
  • Just 60% of the class was there when I showed the video that makes clear that to a reasonable approximation, the Drake equation shows that the main determinant of whether we will detect life in the Milky Way is how long civilizations transmit detectable signals; 53% of the class had taken that on board.

Most heart breaking:

Which of the following is currently an open question in science?
(a) the nature of dark energy
(b) the safety of childhood vaccines
(c) the cause of climate change
(d) the chemical composition of celestial bodies
(e) all of the above.

A staggering 55% of the class chose (e), meaning that I failed to get across the strong scientific consensus on (b) and (c), and many were not paying attention to Jason Wright’s strong guest performance when he showed we learned how to do (d) in 1859. I weep.

Just one question seems important for me to go over in class today. That’s one which makes clear I have not made clear enough that statistical significance per se tells you little about how big an effect is. Statistical significance does not necessarily mean biological significance.

What’s interesting?

I polled the class on which of the topics they would like me to cover in the remaining classes. This gave me a popularity ranking on the options I offered, from most to least popular (n=141 respondents):

  1. Are there aliens? (91%)
  2. Is a Zombie apocalypse coming? (87%)
  3. Are animals gay? (84%)
  4. Why haven’t we cured cancer? (72%)
  5. Why do we sleep? (66%)

This order is the complete reverse of what is interesting to me and (I believe) the vast majority of scientists. Go figure.

Sex bias?

This is the actual hat I use....

According to a comment in this very interesting video about how to get large classes engaged in active learning, males are more likely to put their hand up and to be called on in class. Best practice is therefore to do random cold calls. Students hate that, and so I compromise: for extra credit, they can have their names in a hat from which I randomly select people.

The class is 44% male. 40% of those who opted to have their names in a hat were male. So no bias there. The real bias is probably in the personality types selected. That sort of bias might be much more insidious.

Late drop

Last Friday was the last chance for students to drop the course and not have it show in their GPAs.  We lost just 12 students. Incredibly, some of those had full attendance and grades that could easily have been lifted to A’s with a bit of effort.

Go big or go home is a common saying on campus. Kudos to the 90% who did not go home. Each year it staggers me that there are students who waste their tuition dollars, time, energy and (more importantly to me) a place in my class, all to protect their precious GPA. The GPA is the single most damaging number in higher education metrics. Truly brave universities would get rid of it. I’m told we can’t because some of our Colleges actually select students based on their GPA. I wonder why. It just encourages students to play it safe and avoid or drop challenging courses. No way to encourage them to go big.

For those wondering what just happened to their grades

Angel, the course management system, calculates an overall grade in real time. This is good, but it means that as new components of the final grade come in, some grades adjust downwards. This generates a lot of email traffic. This is my generic explanation of what just happened.

I just released the attendance grade for the first time. Students have to be at nine pop quizzes to get this (worth 10% of final grade). We have now done nine, so regular attenders just got their 10%; those who have yet to be at nine just got a zero. They will get their 10% when they hit nine quizzes (and there will be at least three more before end of semester). But for now, that means those who have missed any pop quizzes had their score go down by 10%. The scores of the others altered slightly as Angel re-weighted the various components of the grade to include the attendance score.
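
The all-or-nothing attendance rule can be written as a one-line check. This Python sketch only illustrates the rule as described here. The function name is hypothetical, and it does not attempt to model how Angel re-weights the other components.

```python
def attendance_points(quizzes_attended, required=9, worth=10.0):
    """All-or-nothing attendance grade, as a percentage of the final grade.

    Students get the full `worth` once they have been at `required` pop
    quizzes, and zero until then.
    """
    return worth if quizzes_attended >= required else 0.0


print(attendance_points(9))  # 10.0
print(attendance_points(8))  # 0.0
```

A student currently on eight quizzes flips from 0 to the full 10% as soon as they attend one more.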

Culture wars

To Dr Harlene Hayne, Vice-Chancellor***, Otago University, New Zealand

Dear Harlene,

As an Otago graduate (Zoology, Class of 1984), I’ve always enjoyed your articles in our Alumni Magazine. Congratulations, btw, on five years in the job. I hope the next five are as good for you and the university as the last five.

Here at Penn State, I am a research professor most of the time. But for 15 challenging weeks a year, I teach 365 non-science majors about science. I’ve been doing it since 2010, and each year I am amazed by just how hard it is. I have a high bar (and struggle with how high to set it) and I do everything I can to get the students over that bar, except lower it. I also expect (demand) that the students seize control of their own learning. But many of my students just hate it (they want A’s on a plate) and most of my colleagues don’t much care for my efforts or standards.

And so it’s a struggle. I’ve often wondered why I bother. No one would complain if I aimed low. But now, thanks to your recent article, I know where my teaching aspirations come from. You went on a US tour to get feedback from the US students who do Study Abroad at Otago and, in your words, everyone

… reported that the academic standard at Otago was much higher than that of their home institution. I was constantly told that the American students – many of whom came to us from highly selective, and extremely expensive private universities – had to work twice as hard at Otago as they did at home.

They also told me that Otago required students to think for themselves and to take responsibility for their own learning; that Otago fostered a sense of independence that was initially a bit daunting to many of them.

So that’s it! My aspirations are Otago’s fault. Ironic that you, an American in a NZ university have the perspective to explain to me, a NZer in an American university, what’s going on.

Well, here’s to the ‘smart, ambitious and warm-hearted, edgy’ Otago people who shaped me. To name just three still on your books: Alison, Ewan and Alan. You’ve made me realize their reach is long and their contribution to my professional discomfort great. I am sure my own students will one day thank them. I do.

Dr Andrew Read FRS
Evan Pugh Professor of Biology and Entomology
Eberly Professor of Biotechnology
Director, Center for Infectious Disease Dynamics.

This from Otago's recruitment literature...

A major reason I had a fantastic time at Otago University (1981-1984).

***In US speak, the Vice-Chancellor is the University President.

Extra credit calculations cont.

We just released a slug of grades, meaning that students who did not do well start thinking about extra credit options (good) and how they are calculated (ok, but better to spend time reviewing class material, blogging or actually earning extra credit). Anyhow, the calculations, and further to my last post about this: we cap extra credit at 10% (ie students can have added to their actual class grade a maximum of 10% extra). That means no matter how much extra credit they actually earn, we never add more than 10%.

That 10% is added using the extra credit option in Angel, which shows everything out of 100%. So 6% extra credit shows in Angel as 60/100 extra credit, and the maximum allowable is 100/100 = 10% added.
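
For those who prefer to see the arithmetic spelled out, here is a minimal Python sketch of the cap and of the out-of-100 figure Angel displays. The function names are hypothetical, and this illustrates the rule described above rather than anything Angel itself exposes.

```python
EXTRA_CREDIT_CAP = 10.0  # maximum percentage points added to the class grade


def extra_credit_added(earned_percent):
    """Extra credit actually added to the final grade, capped at 10%."""
    return min(max(earned_percent, 0.0), EXTRA_CREDIT_CAP)


def angel_display(earned_percent):
    """The same amount as Angel shows it, on a 0-100 scale."""
    return extra_credit_added(earned_percent) * 100 / EXTRA_CREDIT_CAP


print(angel_display(6.0))        # 60.0, i.e. 6% added
print(extra_credit_added(14.0))  # 10.0, anything above the cap is ignored
print(angel_display(14.0))       # 100.0
```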

Class Test 3: sigh

I don’t recall ever being quite so disappointed by a set of test results. I didn’t think the test was especially hard, and certainly no harder than Class Test 2. But the grades are down. The average score among those who took the test was 74% (C), down from 77% (C+) in the last class test and 79% (C+) in the first test. That is a trend so going in the wrong direction.

The specifics: A, 11; A-, 15; B+, 33; B, 41; B-, 47; C+, 38; C, 24; D, 59; Fail, 46; No shows 19. One student got 26/28 and 7 students got 100% on my ask-28-questions-grade-out-of-25 algorithm (down from 9 last time and 20 the time before). The number of A’s was down (from 15 and 42), as was the number of B+’s (down from 46 and 40). The only growth area is the C+’s (this time up from 27). So my strongly bimodal distribution of earlier tests now has less of a valley in the middle. Hardly an achievement.

So what’s going on? Possible explanations:

  1. The test was impossible. But a dozen students did outstandingly well, so the test was do-able.
  2. There were some badly worded questions. I won’t know until I talk to the students, but looking at them, that’s not obvious to me. There were a few questions where I had slung in some class slogans (e.g. correlation does not equal causation) that were perfectly correct answers to questions I wasn’t asking. Those always seem to cause problems (ie test understanding). I think people recognize the slogan and opt for it without thinking about the question. That’s sobering. There was also a question about whether the study in the media report was a randomized control trial (it was), but to realize that, you had to read the report and think about it. Most of the students having incorrectly decided it was an observational study then got the next two questions wrong (since RCTs but not observational studies allow you to rule out reverse causation and third variables). But other than that, I can’t see any real question-related issues.
  3. Nobody got the guest speakers. Five of the questions revolved around the recent presentations of our two guest speakers, Mike Mann and Doug Cavener. I reviewed both those presentations with the class following their visits. Attendance at all three of those sessions was poor (just 75% of the class present when I went back over what they’d said), and that can’t have helped. But looking at four of those five questions, the majority of the students got them right: 88%, 64%, 63% and 59% (which, if non-attendance was fatal, translates to 100%, 85%, 84% and 78%). Only 30% of the class got the fifth question right. This concerned Dean Cavener’s work to put a giraffe gene into mice. He is not doing that to test the power of new gene editing technology or to test evolution (why would he do that? — both of those have already been thoroughly demonstrated). No, he’s doing it to see if a genetic difference correlated with the height difference between giraffes and their nearest relatives is actually a cause of height difference. I really thought I went over that well in class. Clearly we need to do some work on correlation/causation, and maybe revisit gene editing and evolution, but that’s just one question — not enough to sink the class average.
  4. Study groups are working together and group-think is misleading them. This is a possibility. The students are supposed to do the tests alone. And they pledge to that effect. It’s an integrity violation to work with others. It can also be dangerous if the blind are leading the blind….
  5. Too few people have been to review sessions. That’s certainly true. Is this the problem? How to fix that? I can’t do review sessions in class. No one complained that they couldn’t make reviews last time or asked for other sessions to be put on. Maybe after these grades, some more students will make the effort to review things.
  6. People are really not understanding things. This is a serious possibility. The questions are all variants of what I used in previous years. People are not getting all the same things wrong (exceptions above); they are getting diverse things wrong and more than in previous years. I am not sure why. This year I have been going much more slowly through material, and I thought if anything explaining things better. I wonder if my focus this year on soft skills has been taking up too much class time. It might be better to spend more time probing the concepts from different directions (that’s what builds real understanding) rather than talking of phones, study skills and time management.
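
The attendance adjustment in explanation 3 can be reproduced, to within a percentage point, with a short Python sketch. It assumes that only students who were present (75% of the class) could have answered correctly, so the observed percentage is divided by the attendance fraction and capped at 100%. The function name is hypothetical.

```python
def correct_among_attendees(percent_correct, percent_attending):
    """Percentage correct among attendees, if absentees all got it wrong.

    Assumption: non-attendance is "fatal", so every correct answer came
    from someone who was in class. The result is capped at 100%.
    """
    if not 0 < percent_attending <= 100:
        raise ValueError("percent_attending must be in (0, 100]")
    return min(100.0, percent_correct * 100 / percent_attending)


# Observed class-wide percentages for four of the guest speaker questions.
for observed in (88, 64, 63, 59):
    print(observed, round(correct_among_attendees(observed, 75), 1))
```

The first figure hits the cap, which is a hint that at least some absentees got that question right anyway.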

So what to do? I won’t really have a good understanding of the misunderstandings until I do the first review session and see what students are missing. Hopefully we can do that first review session tomorrow so I can get a real sense of what’s happening asap.

I want to do everything I can to help students over this bar. Except lower it.

Mid-course SELF-evaluation

I do a mid-course evaluation to find out what I could do better on the remainder of the course. For the first time, I asked the students what they could do to help themselves do better. The most common answers (in bold the number of students who said that):

Take better notes 62
Be less lazy; not procrastinate 60
Focus – pay more attention 60
Review material/class notes more after class 54
See/resort to TA’s/meet with them 53
Ask more questions 44
Turn off electronics/phone 39
Study more 38
Blog more frequently/regularly/earlier 36
Time management 21
Review pop quizzes & tests 12
Attend study sessions/review 10
Attend more office hours 9
Do more extra credit work 8
Sit up front 7

All remarkably similar to the wisdom from last year (learning, grade).

As for what I could do better to help the students learn better? Most requests actually revolved around how they could get a better grade. And the most common requests? More on what makes a good blog, more on test preparation, and could I please require less blogging and make the tests easier. I take the first two points, but I really think those things are a work in progress. As for less blogging? 65% of the students think the work load of the course is about right (which makes me think it’s too light). As for making the tests easier: how hard to stretch people is one of the toughest (and most ignored) questions in Higher Education. They might already be too easy: nobody thinks the course is insufficiently challenging, with 57% thinking it’s about right and 43% saying it’s too challenging. You gotta stretch students out of their comfort zones. With almost 60% feeling pretty comfortable, I may not have gone far enough.

Things they think I am doing well? 

Picking interesting topics that keep students engaged 73
Answers students’ questions thoroughly 24
Makes students think critically 11
Detailed explanations 10
Extra credits/many chances to improve grade/succeed 10

Most disappointing is the quarter of the class for whom the course is failing to meet their expectations or who are dissatisfied. The most disappointing thing about that is that no one has come to talk to me about anything they are dissatisfied with. I always wish we could link the feedback to the grade book. Are there engaged students who aren’t happy? That would worry me. Or are the unhappy 25% those who are failing because they haven’t done much?  And by even raising that possibility, have I become the doctor who blames his patient’s death on the severity of the disease rather than the quality of the doctoring……..???

Thanks so much to Monica for her fantastic efforts to compile all the data.

Blog Period 2 results: upward

Somehow we lost 17 students from the course since Blog Period 1… sigh. None of them ever came to talk to me about their options. Maybe they dropped out for reasons other than the course.

Of the 336 students still registered, 91 did nothing on the blog at all. A further 59 did so little, they failed. On top of that, there were 38 D’s. The good news is that, believe it or not, that’s fewer fails and D’s than for the first blog period.

Better, the number of students scoring very well has gone up ten-fold. There were 6 A’s, 13 A-, 19 B+, 20 B, 22 B-, 40 C+, and 28 C. The average score among the 186 who passed was 78% (C+).

It’s always a little hard to know what to make of that distribution. Some of those who did well last time will not have participated this time (I take the best score from the three periods) and some who did not do anything last time have just tried for the first time (and so have yet to benefit from personalized feedback). Nonetheless, I do take heart that some students had really stretched and things were in general much improved (way more A’s and B’s). The writing that earned those scores cheered up the graders. They found it all quite disheartening in the first blog period (a common reaction for grading newbies who never realize quite how little effort the majority of undergraduates put in).

So, some examples of good practice: I enjoyed learning whether music is helpful (lots of original thinking in that one) and that chicken soup can be (mechanism unclear). Whether brothers make you gay is an important topic with lots of ramifications, some of which I hope to discuss in class in the not too distant future. Also important and well written, the possibility that video games make you sexist, and that weight loss apps don’t work. I also thought this take on sleep was great. Sleep is another topic I hope to get to in class: we do it for a third of our lives and no one knows why. Do dumber people swear more? As an occasional potty mouth myself, I was reassured to find the answer is probably no. There was also important, well written stuff on running to aid mental health, the effect of skipping class on school performance (not good), SAD, the dangers of second hand smoke (where the beliefs are far stronger than the data), and whether going to the doctor is actually good for you (it’s almost always good for the doctor’s bank account, so you can trust their answer to that question). As always, for students interested in improving their blog grade, I strongly recommend taking a look at those examples (not random entries on the blog which are by definition average), scrutinizing the grading rubric, paying serious attention to the feedback from the graders, and looking at the tons and tons of resources to be found here.

Ok….so we now have the list of the 23 students who have yet to do any blogging whatsoever. We’ll write to those folks over the next few weeks begging them to engage. You can’t pass this course without blogging (it’s 40% of the course and the pass grade is 60%). In previous years, some students have failed and told me after they thought blogging was optional… Gotta love it.

A failed time-management carrot

To encourage time management and to discourage procrastination and last minute panic, I give 2% extra credit for anyone who can get their blog posts done five days ahead of time. That’s extra credit for no extra work whatsoever. You’d think everyone would want a piece of that, especially since 2% extra credit is equivalent to raising test scores from a B to an A or blogging scores from a B- to a B+.

Nope. For blog period 2, just 20 students got sufficiently organized to get something for nothing.

Climate change change

Today I reviewed what Mike Mann had said in class about the evidence that climate change is real and mostly due to humans. Then I re-did the poll I took before Mike came to class. Last week, 39% of the students (n= 152 respondents) either didn’t know or disagreed with the statement The earth is warming mostly due to human activity. Today, it was 16% (n=132 respondents). So I guess that’s progress.

But that prompted some scary stuff on the comment wall:

There are bigger issues going on in the world that need our immediate attention. War, violence, the economy.
Climate change isn’t happening because humans are not superior to god’s will.
Climate scientists are going to hell because they are saying humans are superior to god’s will.
Everything that happens with global warming is due to nature and god, not humans. God is superior to us and determines what happens to earth.
I know global warming is fake because my uncle is a republican congressman and says so.

It’s always a little tricky to tell the comedy from the reality.

More constructively, I had a really good idea for a teachable moment. Following Mike Mann’s forceful presentation last class, I polled the class today:

132 respondents

And that gave me the opening to say that among the 100-500 climate or climate-related scientists on campus (a guess, based on estimating the post-docs and grad students as well as faculty), there was not one….NOT ONE… who would disagree with Mike’s summary of the evidence that climate change is happening and largely man made. That’s the extent of consensus on this. If I wanted to find a climate denier, I’d have to hunt among the non-scientists on campus, and even then probably only among the politically motivated. I pointed out that when TV news does balance, it’s not balance in any real sense. Just a scientist and some talking head. It looks 50:50 but that’s an incorrect impression.

I also used it as a moment to say that if any of those 100-500 Penn State scientists could convincingly show that climate change was not happening or not man made (or a Chinese invention), that scientist would be famous. That’s why we can be pretty confident when they all agree. Everyone is trying to take down everyone else’s science. When it stays standing, we have to start to think it might be right.

Well I thought it was a good teachable moment….