In January 2011, the National Council on Teacher Quality (NCTQ) and U.S. News & World Report announced their intention to evaluate and rank teacher education programs in the United States. The goal of this study is to review the quality of teacher preparation programs across the U.S., based on NCTQ’s standards for teacher education. This followed a series of reports assessing state-level teacher preparation policies across the nation.
Because there are at least 1,400 university-based teacher preparation programs in the US, NCTQ relied on two methods to assess the practices of these programs and the content of the courses they offer. The first method was to ask preparation program representatives to respond to queries from NCTQ about their programs. The second method was to collect and analyze the syllabi from the courses taken by students in teacher preparation programs.
The purpose of this blog post is to examine the effort by NCTQ to evaluate, judge, and rank university-based teacher preparation programs. My comments are separated into six sections: (1) Inputs versus outputs; (2) Lack of research foundation; (3) Methodology employed; (4) Alternative programs ignored; (5) Superintendent critiques; and (6) Ultimate impact.
Approach Focuses on Inputs and Largely Ignores Outputs
One of the two most important criticisms of the NCTQ effort is the almost exclusive focus on inputs and the lack of any consideration of outputs. While inputs are certainly important to a preparation program, the outputs are what matter. By outputs, I mean such outcomes as teacher placement, teacher longevity in the profession, actual behaviors of teachers in the classroom, and the effect of teachers on various student outcomes.
NCTQ claims the barriers to assessing outcomes are simply too large to overcome without the investment of hundreds of millions of dollars. In fact, I would agree with NCTQ on this point—assessing the outcomes of teacher preparation programs would be quite costly and difficult. For example, analyzing outputs such as placement, retention, and impact on student test scores would require states to collect and make available detailed data in a number of areas such as: teacher characteristics and prior experiences; teacher production, placement, and retention; the link between test scores and students; the link between students and their teachers and teachers and their preparation programs; a wide variety of school characteristics; and, the characteristics of the principal. Most states do not collect such data and, even if they wanted to collect such data, lack the financial and human resources to accomplish such a huge undertaking. Further, even if states were able to collect such data and NCTQ was granted access to the data, appropriately analyzing such data is extremely difficult and may simply be impossible. Indeed, as Sass and his colleagues found in Florida, fairly comparing teacher preparation programs based on graduates’ impact on student test scores is not possible because of the need to control for unobserved characteristics of schools.
Given the extreme difficulty in assessing outcomes, I can understand why NCTQ did not try to assess outcomes. What is terribly troubling, however, is that NCTQ makes the quantum leap from inputs to quality preparation. Essentially, NCTQ claims it can assess the quality of a preparation program’s teachers based almost solely on a review of syllabi in some, but not all, courses taken by students in a program. There is simply not enough evidence or research to make such a leap (see below for further discussion of this) and NCTQ should have simply stated that they were evaluating and ranking preparation programs on inputs only and left it at that. This is simply another example of a think-tank not understanding issues surrounding research.
Lack of Research Foundation
The second major critique of the NCTQ effort is that there is little or no empirical research that substantiates the use of such standards in an effort to evaluate and judge the quality of a teacher preparation program. NCTQ admits that their standards are not grounded in a body of research. Note that the “study” lists the number of research reports underlying each standard. Yet, they do not list the papers. Are they peer-reviewed? What was the quality of the research? Was the paper a case study of one program or a large analysis of multiple programs? If NCTQ was really confident and transparent about the research foundation of their standards, they would have listed the papers and even provided links to them. Give me any education topic and I can list 20 papers on that topic. Many won’t be pertinent or of high quality, but it will sure look impressive to the non-researcher that I could say “there are 20 research reports that substantiate this standard.” You need both research QUANTITY and QUALITY in order to legitimately adopt a standard.
For example, NCTQ states: “[Our] standards were developed over five years of study and are the result of contributions made by leading thinkers and practitioners from not just all over the nation, but also all over the world. To the extent that we can, we rely on research to guide our standards. However, the field of teacher education is not well-studied.” Note that the word researcher is omitted from this description. NCTQ relied on thinkers and practitioners, but not researchers. While thinkers and practitioners can provide useful insight, researchers are critical to such standard setting. In fact, many beliefs based on common sense turn out to be incorrect after research examines an issue.
While NCTQ is correct in that there is not a large body of research examining the inputs of teacher preparation programs and any outcome measures, there is some research and NCTQ apparently did not read it. Indeed, Eduventures (http://www.units.muohio.edu/eap/deansmessge/documents/EduventuresNCTQMethodologyCritique.pdf) correctly points out that the NCTQ standards include many inputs for which there is no research base that links the inputs to any type of output while some inputs that are important indicators of teacher preparation program quality are completely ignored by NCTQ and do not appear in the standards.
Eduventures (2010) notes that important inputs that should be assessed in an effort to link programs to outputs would include: the quality of instruction provided in teacher preparation and content courses, student support services, mentoring and induction provided by the program, and the length of the clinical experience required of students. Note that many of these inputs are simply absent from the NCTQ standards. NCTQ will respond that they don’t have enough money to properly conduct the study. So why do the study at all then? Is it okay to do a bad study because the money is not available to conduct a proper study? What would NCTQ say if a teacher preparation program claimed that they could do better if they had more money? I seriously doubt NCTQ would be sympathetic. Why should the public and the media be sympathetic with NCTQ?
Examples of Incorrect Reading of the Research
There are numerous examples of NCTQ standards that are simply not supported by existing research that was clearly available to NCTQ.
For example, NCTQ states that middle and high school preparation programs should require students to obtain a content-area major or at least 30 hours of content courses. Yet, Monk (1994) found diminishing returns to teacher effectiveness past five courses at the high school level. Moreover, at the middle school level, Alexander and Fuller (2003) and Darling-Hammond (personal communication) found that middle school teachers trained as elementary teachers were more effective than middle school teachers trained as secondary subject area specialists.
With respect to student teaching, NCTQ requires that preparation programs ensure that cooperating teachers for student teachers be proven effective instructors as measured by student achievement. While this makes common sense, there is no research base to support this contention. Further, NCTQ does not say how districts should use student achievement to assess teacher effectiveness. There is certainly no consensus in this area and there will be great variation in the quality of efforts of districts to do this. Ultimately, the measure is meaningless because there is no method to ensure districts assess teachers in an appropriate and accurate manner.
Also with respect to student teaching, NCTQ states:
When evaluated in the context of teacher preparation programs that are in relative geographic proximity, the proportion of a program’s student teaching placements that are made in schools that can be classified as “high functioning and high needs” can signal a commitment to ensuring that all teacher candidates experience teaching in such learning environments. For purposes of classification, schools are designated as “high functioning and high needs” if:
- Average student performance in reading and mathematics both exceed the district average or the school has been designated by its state as having recently made significant improvements in average student performance in reading and mathematics.
- Forty percent or more of students are eligible to receive free or reduced-price meals.
Again, this standard has no research foundation. Moreover, it is highly problematic. If districts use percent proficient to determine student performance, then the determination could very well be incorrect. Even more problematic would be the use of the change in percent proficient to assess growth. This almost always provides an inaccurate judgment about progress (Koretz, 2008). This makes me strongly suspect NCTQ does not even understand basic assessment issues.
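To make the percent-proficient problem concrete, here is a minimal sketch in Python. The scores, the cut score of 70, and the two schools are all made up for illustration; nothing here comes from NCTQ or from any real district. It simply shows that the change in percent proficient and the change in average performance can tell very different stories.

```python
from statistics import mean

CUT_SCORE = 70  # hypothetical proficiency cut score


def percent_proficient(scores, cut_score=CUT_SCORE):
    """Return the percentage of scores at or above the cut score."""
    return 100 * sum(s >= cut_score for s in scores) / len(scores)


def summarize(before, after):
    """Return (change in mean score, change in percent proficient)."""
    mean_change = mean(after) - mean(before)
    proficient_change = percent_proficient(after) - percent_proficient(before)
    return mean_change, proficient_change


# Hypothetical scale scores for ten students in the baseline year.
baseline = [40, 45, 50, 55, 68, 69, 72, 75, 80, 90]

# School A: large gains among the lowest scorers, none of whom cross the cut.
school_a = [55, 60, 62, 65, 68, 69, 72, 75, 80, 90]

# School B: two students just below the cut move just above it; nothing else changes.
school_b = [40, 45, 50, 55, 70, 71, 72, 75, 80, 90]

for name, after in [("School A", school_a), ("School B", school_b)]:
    mean_change, proficient_change = summarize(baseline, after)
    print(f"{name}: mean {mean_change:+.1f} points, "
          f"percent proficient {proficient_change:+.1f} points")
```

In this invented example, School A raises its average far more than School B, yet only School B shows any movement in percent proficient. A classification rule built on the change in percent proficient would reward School B and miss School A entirely.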
Another aspect of student teaching that is excluded from the standards is the existence of a capstone project. NCTQ uses the standard of five observations that was found to be statistically significantly associated with teacher practice in a study by Boyd and his colleagues (2008). Yet, the very same study found the existence of a capstone project has the same impact. Why did NCTQ pick one finding and ignore the other?
Perhaps most importantly, the standards for reading/English and mathematics instruction are not grounded in the research that is clearly evident in the standards developed by the National Council of Teachers of English or the National Council of Teachers of Mathematics. Why NCTQ believes they know more about instruction in these fields than actual experts is beyond me.
Incorrect Use of Research
As is typical with think-tank reports, the authors at NCTQ do not understand how research should be used. There is actually some high-quality research on the link between teacher preparation practices and outcomes, but we certainly need much more to make definitive conclusions about best practice. But using research to identify potential best practices and using research to rank institutions are two totally different uses of research. When researchers like Donald Boyd and his colleagues conduct research, they are looking for patterns in the data to be able to say a certain characteristic of teacher preparation programs is associated with improved teacher practice or greater student achievement. Such research is extremely useful. However, such research does NOT conclude that EVERY teacher preparation program that employed a particular strategy was high-performing or that EVERY teacher preparation program that did not use a particular strategy was low-performing. The researchers conclude that teacher preparation programs tend to have better outcomes if they use a particular strategy, but that some programs that use the strategy are low-performing and some that don’t use the strategy are high-performing.
NCTQ, however, completely misuses the research by contending EVERY program MUST use a certain strategy. That is simply NOT what research says or what researchers would advocate in terms of how the data should be used. There is widespread consensus that research should not be used this way, which is why researchers are loath to rank programs. They know that rankings will be inaccurate and cause harm to good people who run effective programs and give undue recognition to ineffective programs.
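The difference between "programs using a strategy do better on average" and "every program using the strategy is better" can be shown with a small sketch. The outcome scores below are invented for illustration and do not come from Boyd and his colleagues or any other study; the variable names are hypothetical.

```python
from statistics import mean

# Hypothetical outcome scores for individual programs (higher is better).
uses_strategy = [62, 71, 48, 80, 66, 75]
no_strategy = [55, 68, 44, 73, 59, 50]

# Group-level finding: the average difference between the two groups.
average_gap = mean(uses_strategy) - mean(no_strategy)

# Program-level reality: compare every program that uses the strategy
# with every program that does not, and count the reversals.
pairs = [(u, n) for u in uses_strategy for n in no_strategy]
reversals = sum(n > u for u, n in pairs)
reversal_share = reversals / len(pairs)

print(f"Average advantage for programs using the strategy: {average_gap:.1f}")
print(f"Pairs where the program WITHOUT the strategy scores higher: "
      f"{reversals} of {len(pairs)} ({reversal_share:.0%})")
```

Even with a clear average advantage in this made-up data, a sizable share of head-to-head comparisons go the other way. Ranking individual programs on whether they use the strategy would misjudge every one of those cases.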
Critique of Methodology
Before critiquing the methodology, I provide a review of the methodology most likely employed in the study. Currently, I cannot find any documentation for the current study.
While there is no current description of the methodology NCTQ employed for the 2013 study, we can certainly guess the methodology employed based on their past efforts in this arena. There are three phases to the study.
The first phase includes data collection and consists of two methods. The first method includes reviewing institutional websites to gather information on entrance requirements and other similar information. If syllabi are posted on institutional websites, then the syllabi would be collected from the websites. The second data collection method includes asking schools of education to provide syllabi from teacher preparation courses assuming they are not available on any websites. In addition, a request is made for all course materials and a listing of all readings if these items are not included in the syllabi.
The second phase of the study includes a content analysis of the syllabi as well as ancillary materials and the required readings. NCTQ describes previous data collection efforts in this way:
NCTQ examines institutional admissions standards and a program’s own admission policy; general education course requirements and course descriptions; course requirements for secondary teachers in their subject area; professional course requirements and descriptions; syllabi and textbooks for selected coursework; student teaching policies and practices; graduation requirements; course schedules and teaching assignments; faculty listings; and a program’s record of interaction with area school districts.
According to NCTQ, the reviewers are experts in the field. In particular, NCTQ makes reference to content experts actually reading all of the required books in a course. NCTQ states:
NCTQ never looks at just a syllabus when rating a course. We also have experts read and analyze every text that is required for a course, as well as any “reading packets” put together by the instructor. In college reading courses alone, NCTQ has reviewed over 700 texts.
The third phase includes providing a preliminary report to representatives from the Colleges of Education to check the accuracy of the findings. Representatives are offered an opportunity to identify errors and provide additional information.
Critique of the Methodology
There are a number of critiques of using such data in the manner in which NCTQ proposes to use the data.
First, one must question the accuracy of syllabi as an indicator of the enacted curriculum in a class. NCTQ claims that syllabi are likely to “provide a more ambitious picture of the content of a course than the professor is usually able to achieve. Professors generally overestimate what they will be able to accomplish, not underestimate. For that reason, a methodology relying partially on syllabi could end up rating a course higher than it might actually deserve.” NCTQ provides absolutely no research to support this claim. While there may be research that supports such a belief, NCTQ does not provide any mention of the research and I could not find any research that addresses this issue.
In my personal experience, I am not terribly specific in writing my syllabi because I prefer to assess the knowledge and interests of my students before determining where the class should go in terms of content and discussion. Could this be the case with other professors? I think a fair answer to this question is that other professors may, in fact, do the same.
Second, NCTQ provides no evidence that all syllabi are even collected. The methodology provided online repeatedly mentions teacher preparation courses and collecting syllabi from colleges of education. But, in many states, much of the content is provided by other colleges within a university, particularly for secondary education. If NCTQ is not collecting all syllabi, then there is a high likelihood that their conclusions are erroneous. If NCTQ believes Colleges of Education can demand syllabi from professors in another college, then no one at NCTQ has ever worked at a university.
Third, NCTQ mentions that syllabi, course descriptions, readings, and books are reviewed by experts. Yet, they never mention who these experts might be or provide information about their training. What types of backgrounds do these individuals possess? What type of training was provided to all these individuals? Did the training include individuals reviewing the same information to ensure inter-rater reliability? Were the results cross-checked by a second or third reviewer, as is common in content analysis studies? We don’t know the answer to any of these questions because NCTQ does not provide such information. I guess we would have to rate their effort an “F” if we were inclined to rate and rank studies of this sort.
***Update1: In an email from Kate Walsh, she claims the new report actually addresses these issues. In fact, she claims the experts are listed and their backgrounds are described, they provided training, reviews were reviewed by other people, and there was training to increase inter-rater reliability. If so, great! They never did this before as far as I can tell. I have asked her to supply the methodology and descriptions of the review process. I don’t care to see the rankings because they won’t tell me much about anything.***
***Update2: Kate Walsh was NOT correct in her contention these issues were addressed. She is correct that the report (1) documents appropriate training for inter-rater reliability; (2) documents that multiple reviewers were used to review each syllabus; and (3) lists the “experts” who directed and worked on the project. However, the qualifications of the reviewers are not documented anywhere in the report. I suspect these are young grad student types or ex-TFAers who have inadequate training in qualitative methodology and have little expertise in teacher preparation. More stunning is the lack of qualifications of the directors of the projects. None of them have any background that would qualify them to do such work. They appear to have no training in research methodology and no degrees or coursework that would communicate they have in-depth knowledge of the field of teacher preparation.***
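For readers unfamiliar with inter-rater reliability, one common way to quantify it is Cohen's kappa, which compares how often two reviewers agree with how often they would agree by chance. The sketch below is a generic illustration of that statistic in Python. It is not a description of how NCTQ measured reliability, and the reviewer ratings are invented.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who each rated the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    n = len(ratings_a)

    # Observed agreement: share of items given the same rating by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: based on how often each rater used each category.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2

    if expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of eight syllabi against one standard.
reviewer_1 = ["meets", "meets", "fails", "meets", "fails", "fails", "meets", "fails"]
reviewer_2 = ["meets", "fails", "fails", "meets", "fails", "meets", "meets", "fails"]

print(f"Cohen's kappa: {cohens_kappa(reviewer_1, reviewer_2):.2f}")
```

In this made-up example the two reviewers agree on six of eight syllabi, but half of that agreement would be expected by chance alone, which is why raw agreement rates overstate reliability and why a study should report a chance-corrected figure.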
Moreover, think about how many experts would be needed to review all the courses taken by all students at all 1,400 or so programs that prepare teachers. I don’t know about you, but simply reading all the texts in my five courses in one semester took quite a long time—far more than the “40 hours” that NCTQ claims it takes to review an individual program.
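A back-of-the-envelope calculation gives a sense of the scale involved. The 1,400 programs and the 40 hours per program come from the figures cited in this post; the 2,000 working hours per reviewer per year is my own assumption (roughly 50 weeks at 40 hours), and the 120-hour alternative is purely hypothetical.

```python
PROGRAMS = 1400                  # approximate number of programs cited in this post
HOURS_PER_PROGRAM = 40           # review time NCTQ claims per program
HOURS_PER_REVIEWER_YEAR = 2000   # assumption: about 50 weeks x 40 hours


def reviewer_years(programs, hours_per_program,
                   hours_per_year=HOURS_PER_REVIEWER_YEAR):
    """Full-time reviewer-years needed to review every program once."""
    return programs * hours_per_program / hours_per_year


total_hours = PROGRAMS * HOURS_PER_PROGRAM
claimed = reviewer_years(PROGRAMS, HOURS_PER_PROGRAM)

# Hypothetical alternative: suppose a careful review really takes three times as long.
tripled = reviewer_years(PROGRAMS, 3 * HOURS_PER_PROGRAM)

print(f"Total review hours at 40 hours per program: {total_hours:,}")
print(f"Full-time reviewer-years at 40 hours per program: {claimed:.0f}")
print(f"Full-time reviewer-years at 120 hours per program: {tripled:.0f}")
```

Even taking the 40-hour figure at face value, the review amounts to dozens of full-time reviewer-years, and the total grows in direct proportion to however long a thorough reading of the texts actually takes.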
Alternative Routes Are Excluded
NCTQ also ignores the increasing relevance of alternative providers of teachers. In Texas, more teachers are produced by alternative certification programs than traditional undergraduate programs. In California, a substantial percentage of newly minted teachers are from alternative certification programs. The same is true for other states and metro areas around the country. Given the importance of these programs—particularly in large states such as Texas, California, New York, and Florida—one has to question why NCTQ left such programs out of the effort.
In an email conversation I had with Kate Walsh, President of NCTQ, she stated: “[Your study of the placement of teachers from alternative certification programs into schools by the percentage of poor and minority students enrolled in the school] very much jives with the data that Jennifer Pressley collected in Illinois, if those alt route paths are as awful as you and I both think or know they are. The poorest schools are getting these teachers, no question.”
So, the President of NCTQ believes that many of the alternative certification programs are “awful,” but these programs are excluded from the analysis. Why? Why exclude some of the worst programs in the country, ones that provide zero hours of pre-service training and allow individuals to enter with less than a 2.0 GPA? Why exclude programs whose graduates are far, far more likely to fail Texas certification exams (see figure 1 below)? Doesn’t this exclusion simply give a pass to some of the very worst programs in the country, some of which produce over 1,000 teachers per year?
This exclusion calls into question the intent of the NCTQ effort. Indeed, I think a legitimate question to ask is whether NCTQ simply wants to reduce reliance on university-based programs and simply privatize teacher preparation.
Figure 1: Greater Likelihood of Failing a Texas Content Certification Test by Individuals from Community College Alt Cert Programs, Private Alt Cert Programs, and Charter Schools Compared to Individuals from Traditional University-Based Programs
Further, as the percentage of teachers from alternative certification programs has increased, especially teachers from privately managed alternative certification programs, the rate of progress on NAEP scores has declined to the point where Texas is no longer considered a leader in this area. Texas had some of the greatest gains in NAEP on the mathematics test of any state in the country. While Texas continues to make gains, those gains are much, much smaller than in the 1990s and the gains have completely disappeared at the 4th grade level. Has NCTQ considered the distinct possibility that increasing the share of teachers from programs other than university-based programs will negatively impact achievement?
Superintendent Critiques of Teacher Preparation
Not surprisingly, NCTQ has reached out to superintendents to invite them to write letters to the media supporting NCTQ and criticizing teacher prep programs. This is a desperate effort to find some evidence about outcomes since the NCTQ report has no outcome data. Without a doubt, some superintendents will bemoan the low quality of teachers they employ from teacher prep programs. But this is misleading in a number of ways. First of all, why would a district hire poorly prepared teachers? There is no teacher shortage in most disciplines. This says more about the poor hiring processes of school districts than about teacher preparation, since there will be some relatively ineffective teachers from every preparation program, including the very best programs in the country. Second, effective teaching is influenced as much by a teacher’s principal, working conditions, class sizes, peers, mentoring, induction support, facilities, and materials and supplies as by her or his preparation.
A superintendent blaming teacher preparation programs for poorly prepared teachers in her/his district is simply casting blame on others and abdicating responsibility for creating an effective hiring process and providing teachers the proper conditions and support to be effective.
Ultimate Impact
What will be the ultimate impact of the NCTQ rankings? I think a few predictions are on solid footing.
First, if programs pay attention to the rankings (and I think uninformed politicians will force them to do so), programs will start to game the system. By game the system, I mean faculty will begin to write syllabi that conform to NCTQ standards. That doesn’t mean the actual instruction in courses will align with the standards, but simply that syllabi provided by programs will align with the NCTQ standards. This is in line with Campbell’s Law, which states:
The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.
Thus, we will see the appearance of greater conformity, but that does not mean greater conformity in the enacted curriculum—what actually happens in classrooms.
Second, there will be increased pressure to close down or restrict university-based teacher preparation programs and provide more opportunity for alternative preparation programs, including privately managed programs. Let’s be clear: this has already happened in Texas and, as a result, teacher quality has declined mightily (Vigdor & Fuller, 2012). Indeed, even Kate Walsh (President of NCTQ) and C. Emily Feistritzer (President of the National Center for Alternative Education) have both communicated to me personally that they believe the privately managed alternative programs in Texas are abysmal. This is important because privately managed alternative certification programs are the single largest producer of STEM teachers in Texas. Yet, these programs produce teachers who are far more likely to fail a subject-area certification exam (Vigdor & Fuller, 2012) and far more likely to leave the profession within three years (Fuller, 2009). Is that really what we want for teacher preparation in the US?
Ultimately, I think NCTQ and some other organizations simply want to de-regulate and privatize teacher preparation in the belief such actions will improve teacher quality (for a critique, see https://fullerlook.wordpress.com/2013/05/27/de-regulating-teacher-prep-great-act/). Evidence from Texas says this approach will have disastrous results on teacher quality and ultimately on student achievement.
Boyd, D., Grossman, P., Lankford, H., Loeb, S., & Wyckoff, J. (2008). Teacher Preparation and Student Achievement. NBER Working Paper No. 14314. Cambridge, MA: National Bureau of Economic Research.
Eduventures (2010). Review & Critique of NCTQ Study on Teacher Preparation Programs. Retrieved from: http://www.units.muohio.edu/eap/deansmessge/documents/EduventuresNCTQMethodologyCritique.pd
Fuller, E.J. (2009). Review of Teacher Preparation Programs in Texas. Presented to the State Board for Educator Certification governance meeting.
Fuller, E.J., & Alexander, C. (2004, April). Does teacher certification matter? Teacher certification and middle school mathematics achievement in Texas. Paper presented at the national meeting of the American Educational Research Association, San Diego.
Koretz, D. (2008). Measuring Up: What Educational Testing Really Tells Us. Cambridge, MA: Harvard University Press.
Monk, D. H. (1994). Subject Area Preparation of Secondary Mathematics and Science Teachers and Student Achievement. Economics of Education Review, 13, 125-145.
Vigdor, J. & Fuller, E.J. (2012). Examining Teacher Quality in Texas. Expert witness report in Texas school finance trial.