Learning About Language Assessment: Dilemmas, Decisions, and Directions – K.M. Bailey


As I read my way through Kathleen M. Bailey’s book about assessment (1998), I was reminded of a nightmare I had during my student teaching experience in a multilingual third grade public school classroom:

I was back in Berkeley.  An anti-abortion group calling itself NWEA had called for a protest at the house of a teenage girl who had been raped.  This was an organization known for terrorizing not only abortion providers, but people who get abortions as well.  They used strategies that had been developed and perfected by the KKK at the height of the Jim Crow era.  There was therefore a call by others for people to come out and support the family.  I arrived early but didn’t see the house as I approached.  It looked to be a vacant lot between two office buildings, but when I came to the corner of the office building and looked into the lot, I could see the small dilapidated house tucked back and hidden away, blocked from view by the high rises on either side.

Just as I approached, so did a parent of a former student, someone I’m “friends” with on Facebook [someone I actually never friended on Facebook, though I got to know them well from having two of their children in preschool and/or afterschool continuously over the course of about three years].  The parent was asking me about my Facebook post requesting a pen.  I explained to her that I had posted that several hours earlier in the day and that I had already found a pen that I could use.  It was a friendly exchange, but I noticed that she went to the area where the NWEA members were gathering, so I left her and walked up to the house.

I took my shoes off on the front porch and went inside.  The father was very protective.  He was an African American preacher who spoke with a southern accent.  He eyed me warily as I entered.  I told the family that I was there to support them and asked them what I should do.  As I spoke I saw the expression on the teenager’s face.  It was one I had seen before.  Years ago a co-worker of mine had come in to work with a black eye and bruised all over from having been attacked by a serial rapist, and she wore that very same look for several weeks.  It was a look of fear and dread, but also a look of determination.  A look that said she didn’t know who she could trust and that if it turned out it wasn’t you, then you would live to regret it.  A look that amplified when the chanting began outside before any member of the family could answer me.

I went back out and put my shoes on as I saw two groups of people marching in rectangles across from each other.  One group was chanting, “NWEA, don’t say ‘Nay.’”  As I was contemplating the meaning of that chant, my 6am alarm rang and I woke up.

I woke from the dream with my heart pounding.  I immediately wrote down the dream and considered what my subconscious was mulling over.  Somehow I had transformed a standardized test (the NWEA) into a hate group.  The expression on the raped girl’s face reflected how I expected the students to be feeling about taking all of these tests so early in the year.  They had already done the BAS test, assessing their guided reading level; the BOY test, where they had to figure out a character trait about the narrator in a story about a girl named Jess; and now they would be doing the NWEA literacy test, and all this within the first month of the school year.  Three different tests examining literacy alone.  Just as the girl in the dream was being inundated with so much at once, first by the horrible trauma and then by becoming the focus of so much attention, so too are our students being inundated.

Bailey describes a challenging dynamic in which “we are often faced with new classes in which we must use assessment devices to gather information, while at the same time we wish to establish a positive environment” (1998, p. 8).  This is a challenge that teachers must work out in their own way; too often, however, they are not given the opportunity to do so.  Of the three major literacy tests that had been given in this classroom in that first month, only the BAS test had a direct impact on my instructional strategies.  Standardized tests in public schools tend to be used more for funding than for instructional purposes.

This is likely what made the public attention on the raped girl in my dream so salient.  I’m sure the chant that I was hearing at the end of the dream was some bastardization of a chant that I had heard coming from the strike rallies that filled the downtown streets with teachers just a few weeks prior.  The issue of standardized tests that had been such a point of contention in the negotiations and which had raised questions about the legality of the strike is something that affects the children in ways they are likely not even aware of.  Why do school funding and teachers’ performance evaluations have to be based around an arbitrary measurement standardized among a group that may not be well-represented in any given school?

As Bailey points out, “Teachers often see a mismatch… between the skills they address in their classrooms and the material that is covered in exams, especially standardized multiple-choice tests” (1998, p. 148).  She eschews multiple-choice tests in general, though she nonetheless devotes an entire chapter to explaining how to develop one, noting that “my goal in discussing the multiple-choice format is not to promote its use, but rather to caution you about the dilemmas presented by this approach” (1998, p. 130).

This chapter was particularly eye-opening for me, as she exposed, and to some degree even seemed to be condoning, some rather insidious test-development procedures.  Bailey reveals that “in terms of item analysis, the assumption is made that the student’s total test score on a specific exam is the best estimate of how good a student he is in terms of whatever knowledge and/or skill that particular exam is measuring” (1998, p. 135).  She then goes on to cite research describing how an item that is answered correctly by low-scoring students but missed by the high-scoring students must be reevaluated because there would have to be something inherently wrong with the question to lead to such a result.
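
To make that item-analysis logic concrete, here is a minimal Python sketch of one common way to check how an item "discriminates": rank students by total score, then compare the share of correct answers on a single item among the top scorers with the share among the bottom scorers.  The function name, the 0/1 answer grid, the made-up students, and the choice of top and bottom thirds are all assumptions made for illustration; this is not necessarily the exact procedure Bailey lays out.

```python
def discrimination_index(responses, item):
    """Upper-lower discrimination for one item (illustrative sketch only).

    responses: one list of 0/1 item scores per student (hypothetical layout).
    item: position of the item being examined.
    """
    # Rank students by total test score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    # Assumption: compare the top third with the bottom third.
    group_size = max(1, len(ranked) // 3)
    high = ranked[:group_size]
    low = ranked[-group_size:]
    # Share of each group that answered this item correctly.
    p_high = sum(student[item] for student in high) / group_size
    p_low = sum(student[item] for student in low) / group_size
    return p_high - p_low


# Six invented students, five items; 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]

print(discrimination_index(responses, 0))  # 1.0: only the high scorers got it
print(discrimination_index(responses, 4))  # -1.0: only the low scorers got it
```

A negative value, as for the last item here, is what flags a question for reevaluation: the students who did best overall missed it while the students who did worst got it right.  Note that the ranking itself rests on the total score, which is exactly the assumption quoted above.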

The problem with this perspective is that it assumes that all learners are alike.  By determining that such items must be inherently faulty, Bailey is denying that the so-called ‘low scorers’ may process the information differently and actually hold some capabilities that exceed those of the so-called ‘high scorers.’  The fact that she cites multiple other sources to corroborate this test-refinement technique reveals that there is actually a systematic push to make low-performing students perform even lower.  ‘Low scorers’ end up scoring low on standardized multiple-choice tests because these tests are particularly tailored to make them do so.

This revelation drives home, more than anything else in my mind, the point that, as Bailey puts it, “Important decisions should not rest on simple test scores” (1998, p. 204).  Testing is not the be-all and end-all of education.  In fact, Bailey tells us that “testing often (though not always) lags behind teaching, and is in some regards inherently conservative” (1998, p. 141) and that “While our pedagogic emphases have swung from a strongly product-oriented to a largely process-oriented approach… our evaluation procedures have lagged behind our pedagogy” (1998, p. 186).

In preschool education, most teachers I encounter rely on observational assessments as their primary mode of determining their students’ capabilities.  Perhaps this is something that needs to be done to a greater degree in the primary grades as well.  As Bailey says, “there is much to be gained by watching learners learn when they are not being taught” (1998, p. 55).  Perhaps it is time for teachers to take a step back and find out what our students know by just letting them be themselves.

Next month I will probably be examining Echevarría, Vogt, & Short’s Making Content Comprehensible for English Learners: The SIOP Model (2013).  If there’s a book you would like to see me discuss in this blog, please comment on the Recommended Reading page.


Bailey, K.M. (1998). Learning about language assessment: Dilemmas, decisions, and directions. Cambridge, MA: Heinle & Heinle Publishers.

Echevarría, J., Vogt, M.E., & Short, D.J. (2013). Making content comprehensible for English learners: The SIOP® model. Upper Saddle River, NJ: Pearson.