Wash Park Prophet: Testing With A Purpose

I am not opposed to high stakes, externally administered and designed tests for kids in the public schools. The problem with Colorado's CSAP exams is not that they are high stakes tests, but that they are no stakes tests that are used for purposes that student tests aren't well suited to serve. Colorado should have externally administered and designed tests in its public schools, but those tests need to be used for valid educational purposes.

What Isn't Wrong With Externally Administered Tests

Many people are opposed to externally administered and designed tests, whether they are the SATs, the ACTs, the modern successors to the British eleven plus, O-Level and A-Level exams, New Zealand's bursary exams, medical school board exams, bar exams, Japanese and South Korean college entrance exams, the International Baccalaureate exams, the CSAPs, the Iowa Tests, or the Advanced Placement exams, often out of concern for racial or ethnic bias. This concern is unsound. An individual teacher's grades over the course of a semester or two are inherently suffused with the teacher's own personal biases arising out of months of regular contact with the student, and are almost never consciously designed to reduce cultural bias. While probably no test can ever be completely bias free, an externally administered and designed test is almost always consciously designed to minimize cultural bias and is free of a particular teacher's prejudices. Such exams are also consistent between institutions, while teacher based evaluations are highly influenced by a school's culture of grade inflation, and even individual teachers within a particular school often have materially different standards. An external test is a fairer way to judge a student's mastery of various subjects than a student's transcript containing teacher awarded grades standing alone.

I also am not terribly concerned about the worry that teachers will "teach to the test". If the test is designed to test the right outcomes, then teaching to the test is not only tolerable, it is desirable. While we don't want to discourage creative teaching methods, we only want to encourage teaching methods that provide students with mastery of the materials. When creative teaching methods produce good results on tests that measure what we actually want the students to learn, this is well and good, but when creative teaching methods don't help students master what they need to learn, the teaching methods should be discouraged. This requires external test designers to put serious thought into designing their tests. But, it can be done. The IB system, the British external testing system, the pre-multiple choice era SATs (which were similar to the IB and British exams in the United States for most of the early 20th century) and the Advanced Placement exams are all examples of external testing programs respected for their ability to test what we want students to be learning in particular subject areas.

And, if we are going to devote considerable time, effort and money to establishing an externally administered and designed testing regime that tests what we want students to be learning in particular subject areas, it is all a horrible waste if the tests aren't used to guide educational decision making. There can be overkill of course. No test administered in a single sitting is a perfectly accurate gauge of a student's ability and mastery of subject matter. The flu, a family tragedy, a missed night's sleep, the happenstance of a student chancing upon a disproportionate share of the subparts of a subject in which that student is weakest, or a hundred other factors can produce an inaccurate gauge of a student's abilities. Almost no test is so accurate that it should be the sole basis for an important decision in a student's life, and any testing regime needs to have provisions that allow a student having an off day to redeem him or herself, while at the same time discouraging endless futile attempts to do better that are very unlikely to produce a different result.

What Do Tests Do Poorly?

The only purpose for the CSAPs used in Colorado right now is to evaluate particular schools, based on the aggregate test score results of the students who attend the school at a point in time, with a small nod to trend lines that are as often the product of changing demographics in the school's attendance area as they are the product of changing teaching quality.

This is a miserable way to evaluate schools. Overwhelming evidence shows that the primary determinant of a school's aggregate test score results is the characteristics of the students who are admitted to the school. There is probably no social scientific fact which is better established.

A school with poor kids who have personal issues outside of school, or have weak academic records before entering the school, almost inevitably, in the aggregate, will do very poorly on any point in time measure of their academic ability. A school with affluent kids from stable families with high socio-economic standing and strong prior academic performance will, almost inevitably, in the aggregate, do very well on any point in time measure of their academic ability.

This isn't to say that there aren't occasional exceptions. But, decades of intensive study of those outliers have failed to produce a readily reproducible formula for producing better than expected performance, and a close look at those outliers over a long time period often reveals that this exceptional performance is fleeting. Identifying outlier schools so that state officials can help other schools replicate their performance does not justify the time, effort and money that is devoted to the CSAPs.

Another equally useless purpose of externally administered and designed tests is to make graduation contingent upon passing them, as many states do. Higher education in the United States is a national market, which has already identified other means of determining who is ready for college. High stakes requirements to earn a high school diploma thus do not help colleges identify whom they should admit. Employers who are interested in a student's academic performance in high school can already easily request a transcript (or if they really want, test scores as well) to ascertain that, yet few employers do so for jobs that require only a high school education. The only practical effect of making a high school diploma contingent upon passing high stakes externally administered and designed exams is to impose the stigma of being a high school dropout on a large number of mediocre high school students who do their work, but aren't up to grade level. Making life harder for these non-college bound, high school completing students, who are already ill served by our educational system, serves no useful purpose.

What Do Tests Do Well?

So, why do I support high stakes, externally designed and administered tests? I support them because there are things that tests do well. Tests do not make students learn subject matter. Teachers do that. What tests do well is to sort students based upon what the test measures. Good tests, as a part of a number of other factors, are a good measure, in particular, of future academic performance. For example, while the LSAT does a rather poor job of predicting how well a lawyer will do once he is working in the profession, the LSAT does a rather good job of predicting how well a prospective law student will do in law school and how likely that student is to eventually be able to pass the bar exam.

College Admissions

We can know, to a considerable degree of accuracy, the likelihood that a particular college applicant will graduate from college, based upon test results and a number of other factors that can be determined from the face of a high school transcript (not just the GPA, but also the courses taken). This in turn can be, and should be, used to allocate scarce higher education resources. I am of the strong opinion that we do students no favors by admitting them to college when we know, based on the information in their application, that they are almost doomed to fail in the program they have set out to complete. This doesn't mean that these students shouldn't have opportunities to benefit from continuing education. But, allowing a student who has a 5% chance of earning a four year degree in a liberal arts major to enroll in a four year liberal arts program is a waste of the student's time and the institution's money. Less charitably, this kind of practice exploits ill qualified students in an effort to secure formula based funding to benefit more able students. A student applying to a program he or she isn't prepared to succeed in should receive counseling on alternatives that are more realistic, rather than being set up for failure in an inappropriate program.

For instance, while most of the affirmative action debate in law school admissions has focused on admission to elite institutions, the data that is out there shows that a very large share of students admitted to law schools with low admissions standards through an affirmative action program either drop out, or complete law school only to never be able to pass the bar exam. No one should have to pay for three years of law school without being able to pay for it with a career as an attorney. It is hard enough to pay for it when you do become one.

Students who have a very low chance of success in a program don't necessarily have to be absolutely barred from admission. Everyone knows that one person who was an outlier and succeeded despite the odds. But, that kind of gamble is a poor use of scarce higher education funds. We would be better off making more progress in reducing financial barriers for students more likely to succeed in a program, or in allocating funds to programs that a rejected student is more likely to succeed in, than devoting large sums of money to go through the motions only to see the expected academic failures actually happen. Externally administered and designed tests can make this high stakes determination more fair and more accurate for a student than either teacher based evaluations or subject matter free tests like the SAT that I took in high school.

Intervening To Prevent Dropping Out

Another thing that testing and some related measures can predict quite well is which students are on track to drop out of high school, or at least, perform dismally in high school. While there is a long standing suspicion of tracking in the American educational culture, one of the things that tests do well is sort kids in a way that accurately predicts future academic performance. We can determine, with a great deal of confidence, which eleven year olds are going to, at age sixteen, drop out or perform miserably in high school, to the point where it is clear that they are not learning anything.

A child who drops out is at very high risk of imposing immense burdens on society at large. An eighteen year old male high school dropout is stunningly more likely to end up as a convicted felon than, for example, an eighteen year old female enrolled in a community college. High school dropouts, in addition to being much more likely to have a life of crime, are far more likely to need help from government social programs, are far more likely to end up homeless, are far more likely to be unemployed, are far more likely to have out of wedlock children whom they are unable to provide for, and are far more likely to suffer health problems while uninsured. All of these outcomes impose a burden on society. And, high school dropouts are far less likely to generate any significant tax revenues to pay for those burdens.

In fact, we already do enough testing right now to predict who will drop out by late elementary school age, although we could do better if we tailored a test specifically for that purpose. But, we do very little with this information. Typically, drop out prevention and intervention programs in the existing system start in the months before or after a student drops out, despite the fact that this is an outcome that our educational system should have seen coming for many years in advance. Prior to that, intervention is typically half hearted and usually aimed at the very narrow goal of preventing truancy, which is itself, typically the symptom of far deeper problems outside the school environment, and of a school environment that is not serving that child's needs.

Letting predictable failures just happen is grossly irresponsible. It is one thing to have a sink or swim policy. It is another to have a sink or swim policy when you know in advance who has had swimming lessons and who is wearing life jackets.

We should be doing high stakes testing by late elementary school age to identify those students at high risk of ending up as high school dropouts or very poorly performing high school students for intensive, immediate intervention aimed at putting those children on a stable, self-sufficient track, which probably will not involve applying to enter a four year college program, over the next five years or so that the educational system has a mandatory ability to intervene. Accepting a palpable risk that a student will drop out and fail miserably in life, in exchange for preserving a mere glimmer of a possibility that the child might go to college and graduate may be idealistic, but exercising bad judgment by overstating the likelihood that the child will ever go to college in a way that sets that child up for failure doesn't do that child a useful service. Those high stakes tests need to be as accurate as possible to reduce the risk that we will inappropriately discourage someone from taking a college track, but the stakes involved make it all the more important that the tests be externally administered and designed, so that this decision is made fairly and accurately. School counselors are routinely accused, when they act more subjectively, of basing their recommendations on improper biases, which is one of the reasons that many are reluctant to express an honest opinion on a child's prospects at all.

Now, if all we are going to do is track students into the different streams of the same curriculum we have now, this kind of high stakes late elementary school testing is a fool's errand. But, if we put real money into the intensive intervention that the tests identify a need for, perhaps double state funding for students who need this kind of extra help, and a specifically tailored curriculum to meet these students' needs, we as a society can save ourselves an immense amount of social costs down the road.

Helping Those In the Middle Find Career Paths

There is also a need to sort out what course of academic preparation is best for students who are neither college bound, nor on the path to dismal academic failure.

We ignore this group of students now in educational policy making, and I'm not personally a great expert in what approach would be best with these students, in part because those in the middle get less attention in academic research than students at the extremes. Probably the best precedent would be the testing regime used by the U.S. military to identify optimal military occupational specialties for incoming recruits, who are overwhelmingly drawn from the middle 50% of high school graduates as measured by class rank. More able students tend to go to college; less able students aren't wanted by the military. Adapting these kinds of tests to students in the middle of the road academically could allow them to be on paths more productive than the status quo.

The current system gives students in the middle a watered down college preparatory curriculum, supplemented by only a handful of half hearted vocational electives, even if they aren't actually college bound and would likely fail to complete a four year college program if they tried. Typically, these students, when they graduate, end up looking for a totally unskilled McJob, unless family connections or personal connections totally unrelated to the educational system, or a tour of duty in the military, set them on a more successful course in life, or they make a half hearted attempt to try a local community college or open admissions four year college. Obviously, military aptitude tests would have to be adapted to the civilian world. There isn't a lot of demand for artillery specialists in the civilian world. But, students who aren't college bound deserve the same broad range of career preparatory choices that college bound students who pick a major, military recruits and students at the secondary level in most industrialized countries receive.

These tests might be given around sophomore or early junior year in high school, allowing students who aren't college bound (and by late freshman or late sophomore year, this should be quite clear for most students based on a long history of academic performance and college admission type testing) two or three years of free vocational preparation providing them with meaningful skills suitable for a viable career on the public school system's dime (perhaps followed by a year or two of further career preparation in community college), rather than taking a couple more years of exclusively watered down college preparatory liberal arts courses, and doing the same thing at their own expense in proprietary trade schools or community colleges after graduating from high school. There could be some civics and personal enrichment oriented course work, but it shouldn't be the main focus of the curriculum for these students, as this is not their focus. The perennial complaint of high school students not headed for college is that the curriculum isn't relevant to their lives and goals. These tests wouldn't necessarily have to be as "high stakes" as the other tests discussed above, but, instead, could serve primarily to provide some impartial guidance on what kind of career choices really make sense for that student, and as a basis to limit admissions to vocational options where demand outstrips supply at the moment or failure rates are high.

Conclusion

Testing only makes sense if it has a purpose that it is good at fulfilling. Tests in their current form are a poor way to grade schools. Tests are good, however, at making sensible recommendations about a future course of academic study for individual students. Tests are fundamentally sorting devices, rather than teaching devices, and used for this limited purpose, they can be valuable educational tools.

The benefits of high stakes testing in college admissions and in identifying potential high school dropouts for intensive intervention at an early age are clear. Testing could also be useful in helping students who are not college bound with a couple more years in the school system left, to identify and prepare for suitable careers.

Three tests, all in a high stakes, externally administered and designed format, would serve Colorado's children much better than the more burdensome and mostly useless CSAP system that we have today, which is used to provide a basis for school report cards. The first would be a test toward the end of 5th grade designed to identify students who need immediate intensive intervention, supported by extra resources, to prevent them from dropping out of high school five years later. The second would be a test at the end of the 9th or 10th grade to identify which children have a realistic chance of completing a traditional four year college degree and will benefit most from the traditional college preparatory curriculum in the couple of years that follow. The third would be a test at the beginning of the following year for students who are not college bound, to identify the most promising career paths for each student and help tailor that student's next two or three years of public education to that student's needs.

This system of tests would divert significant public high school education resources from traditional college preparatory subjects to more career oriented subjects. This system of tests would also probably result in fewer students participating in traditional four year college and university programs in the state, but a similar number of college graduates, while diverting significant resources now used to teach ill prepared college students who ultimately drop out of their programs to new, less traditional programs for those students, designed to bridge the gap for them between high school and either a career or further education.

In each case, the focus should be on matching students to further instruction which is appropriate for them, before that effort is undertaken, rather than penalizing or rewarding students after the fact. And, testing is only justified for these purposes if it is matched by ample funding to back up what the tests indicate is appropriate for individual students.

11 April 2006