This is about two courses at Simon
Fraser University that appear very similar: Stat 201 and Stat 203.
These courses are
very similar in that they are 200-level service courses (meaning they
are for non-majors). They are introductory courses that cover the
fundamentals of descriptive statistics, sampling, probability,
hypothesis testing, and t-tests. Both courses are equivalent as
prerequisites for the 300-level service courses or for fulfilling
graduation requirements.
Both classes were
offered as a combination of 2 hours/week of lecture one day, and 1
hour/week of lecture another day, with drop-in workshop support for
assignments and studying.
One could be
forgiven for treating them as different sections of the same course,
which is exactly what I did.
However, one class
is titled “Stat 203: Statistics for Social Sciences”, uses SPSS,
and is a service course for the sociology and anthropology
departments. The other is titled “Stat 201: Statistics for Life
Sciences”, uses R, and is a service course for the biology and
environmental sciences. This schism, not in content but in audience,
is what makes these courses different in ways I didn't expect.
The 201 class had
much higher classroom engagement, higher attendance, and even a
better reaction to my awful jokes. More measurably, the 201 class
also averaged 0.75 grade points higher than their 203
counterparts; the 201s received a B+ on average, and the 203s
averaged between a C+ and a B-. Unsurprisingly in this
context, the 201 students rated me much higher (4.5/5 vs 3.5/5) in
their teacher evaluations.
The themes in the
written answers were essentially the same, although my weaknesses
were mentioned more by the 203 students. The first word cloud here is
for Stat 201 and the second for Stat 203. I've removed a few common
but uninformative words like “Jack”, “Davis”, “course”,
and the usual grammar stop words.
Word cloud for evaluations from Stat 201: Stats for Life Sciences
Word cloud for evaluations from Stat 203: Stats for Social Sciences
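For anyone curious how the counts behind word clouds like these can be produced, here is a minimal sketch in Python. It is not the code I used for the clouds above; the stop-word list is a short assumed one (a few grammar words plus the uninformative words named earlier), and the function name `word_counts` is made up for illustration.

```python
import re
from collections import Counter

# Assumed stop list: a handful of grammar words plus the uninformative
# words named in the post. A real analysis would use a fuller list.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "was", "is", "in", "i",
              "jack", "davis", "course"}

def word_counts(comments):
    """Count lowercase words across all comments, skipping stop words."""
    counts = Counter()
    for comment in comments:
        # Keep runs of letters and apostrophes; drop punctuation and digits.
        words = re.findall(r"[a-z']+", comment.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts
```

Calling `word_counts(list_of_comments).most_common(20)` on each class's written answers would give the most frequent words to feed into a word cloud tool.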
There are a few lessons to be learned
about the mistake of treating two different courses like these as if
they were the same, but they're hard to articulate, so forgive me if I
stumble.
First, teach (or
present, or write) for the audience you have, as opposed to
generically. There's a quote that floats around in B.Ed. programs, “I
taught, but the students didn't learn” (see Alfie Kohn's article,
http://www.alfiekohn.org/article/teach-learn/
), which is held up as a poor attitude for an educator: the focus
should be on the result, not the process. In other words, material
MUST be suited to the audience to be effective. For me, it would be
best to draw from some new sources or sacrifice some depth for more
fundamental examples before I deliver Stat 203 again.
Another possibility
is to hold practice sessions for exams, or offer more hints for
assignment questions. Since I delivered this course, my exam practice
material has gotten much more extensive. For example, the Midterm 1
practice material is now more than 15 pages long, and includes a
partial key.
Another key moral:
Teaching is a service first, and a means of research and personal
growth after that. In Mastery:
The Keys to Success and Long-Term Fulfillment, by George
Leonard, there's a story about the author's time as a trainer of
fighter pilots. In the story, the author spends extra time further
developing two already-talented pilots at the expense of the other,
less apt, pilots under his charge. From a value-added perspective,
the trainer had only done half his job, because the novice pilots
could have benefited far more per hour from the trainer's attention
than the ace pilots. It's possible that I committed the same fault
without being aware of it, giving the Stat 201 students more
attention than they needed and leaving the Stat 203
students behind as a result.
Also, as an artifact
of the timing of the exams in each class, the 203 students got
harder exams than the 201 students, which is the opposite of what
should have happened. I wrote my exams in the order that they would
be administered, and it happened to be that the Stat 201 midterms and
final all came before the Stat 203 equivalents. I wanted to make the
exams different but equivalent, and the easiest way to do this was to
create a question for one exam, and then change the numbers and/or
scenario for the question for the second exam, and add a twist. Most
of the time, adding a twist meant increasing the complexity of the
exam. I justified this with the assumption that the later 203s would
have additional information about the exam from the 201s that had
taken a very similar exam a few days prior; this assumption was
wrong.
Another resource I
should be using is live student feedback. I've been using a learning
management system called TopHat, and it's taken me a while to make
good use of it. TopHat allows students to answer questions live in
lecture (or after the lecture, if the prof doesn't want to deal with
excuses for absences) through their mobile devices. I've rarely used
it for student opinion polls, but doing so would be a good way to
adapt material effectively, or at a minimum give students a chance to
anonymously voice concerns.
I don't want to
dismiss the 203s as simply weaker in statistics; that shuts the door
to finer optimizations. Instead, it would be better to think of there
being some barrier I haven't broken through yet, and to try to
identify that.
On the flip side,
what I'm doing with the 201 students seems to work well on the
surface, but it's not optimal either. I'm wasting an opportunity to
challenge them or push them to work towards greater learning. We'll
see, though: it's possible that the 201 course being in the
mid-afternoon played a role, as well as its location on a
secondary campus. My coordinator hypothesizes that more dedicated
students selected that course because the extra commute would have
deterred others.
For all the gloom
that this reflection may present, I would call this semester and the
teaching of these two classes a success. It was a substantial
improvement both in outcomes and in workload over the semesters before,
and over the Stat 203 class that I delivered in Summer 2012.
One particularly
bright spot was Crowdmark, a grading platform we started using for
assignments and exams. The assignments had some technical growing
pains, but for exams, Crowdmark was fantastic. Each exam is given a
QR code at the top of each page, which allows that page to be
separated from the rest of the exam digitally. The questions could be
distributed out to markers just by having them log in and grade, and
the platform is equipped with hotkeys to allow them to put the…
same… comment… on… hundreds… of… incorrect… answers… by
writing the comment once and using a couple of keystrokes on each
question. The students then receive a PDF of their exam with the
marker's annotations.
Also, it keeps a
record of how each student did on each question, rather than looking
at the exam scores in aggregate. This means I can look back and see
which questions have appropriate difficulty and good
discriminatory power. I can apply item response theory to the
results. I can even use the data for my future research on improving
the exam experience.
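As a rough illustration of what per-question records make possible, here is a minimal Python sketch of a classical item analysis. This is the simpler classical-test-theory version, not item response theory itself, and it assumes the marks have already been exported as one list of per-question scores per student; the function names and data layout are my own, not anything Crowdmark provides. Difficulty is taken as the average fraction of marks earned on a question (higher means easier), and discrimination as the correlation between a question's score and the score on the rest of the exam.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation; returns NaN if either list has no variation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5

def item_analysis(scores, max_marks):
    """scores: one list of per-question marks per student (assumed layout).
    max_marks: the full marks available for each question."""
    results = []
    for q, full in enumerate(max_marks):
        item = [s[q] for s in scores]
        # Rest-of-exam score excludes the question itself, so a question
        # is not correlated with its own contribution to the total.
        rest = [sum(s) - s[q] for s in scores]
        results.append({
            "difficulty": mean(item) / full,        # fraction of marks earned
            "discrimination": pearson(item, rest),  # item-rest correlation
        })
    return results
```

A question with difficulty near 0 or 1, or with discrimination near zero or negative, would be a candidate for rewriting before the next offering.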