Comparing U.S. K-12 Students' Math and Science Performance Internationally: What are the facts, what do they mean for educational reform, and how do I talk effectively about the issues?

In the popular press and in public debate, one often hears that U.S. students are performing poorly in math and science in comparison to other countries. What is the basis for these claims? What are students' actual scores and rankings? How should we interpret and use these scores? A better understanding of the evidence is important for making effective policy decisions that affect computer science and other STEM fields.

What is the basis for the international comparisons?

The source for these comparisons is the Trends in International Mathematics and Science Study (TIMSS), an international test administered every four years to 4th and 8th graders.* The test was first given in 1995 in approximately 20 countries. The most recent test was administered in 2007 to 4th graders in 36 countries and to 8th graders in 48 countries. Each country's average score is then used to rank all participating countries.

*The Program for International Student Assessment (PISA) is a similar international test given to 15-year-olds. Although PISA is cited less frequently, it has limitations similar to those of TIMSS.


What are U.S. students' scores and rankings?

The frequently heard claim that U.S. students are doing poorly is misleading. In 2007, U.S. 4th graders ranked fifth in science and ninth in math; 8th graders ranked tenth in science and sixth in math (full results are available at http://nces.ed.gov/timss/results07.asp). While there is always room for improvement, U.S. students currently score in the top 12-25% of countries in most grade levels and subjects. Over the test's history, U.S. scores have been stable or improving, not declining.

What factors are important for interpreting and using TIMSS results?

It is always important to be careful when making claims based on a single test. In this case, several important limitations need to be considered before drawing conclusions or making policy recommendations based on TIMSS results.

The test compares apples and oranges. The U.S. population is larger and more diverse than the populations of most countries that outscore us (e.g., Singapore, Latvia, Hong Kong, Taiwan, Estonia, Japan, Hungary). We also educate more second-language learners and students from a wider range of socioeconomic backgrounds. All of these factors tend to lower "average" scores. When U.S. students are compared with students of similar family income levels, their scores are comparable to those of the top-scoring countries. Curricular variation among countries also makes it difficult to design a fair or accurate test: some test-takers will encounter topics not yet covered in their regular classes.

Test scores do not measure innovation. TIMSS scores and international comparisons are sometimes used to argue that U.S. innovation is declining. While improving innovation efforts is certainly important, using these test scores in a discussion about innovation is misleading: TIMSS scores do not measure or predict innovation. Many of the countries that lead in innovation do not typically score in the top five. Some of the highest-scoring countries explicitly "teach to the test"; educational research shows that although this practice can improve test scores, it can hinder creativity and innovation.


Average scores mask a high number of top-scoring U.S. students. Almost all reports of TIMSS results focus on average scores, but this masks important differences within the distribution of scores. For example, U.S. rankings often improve when considering the percentage of "advanced" scorers. In 2007, 15% of U.S. 4th graders and 10% of U.S. 8th graders scored at or above the "advanced" benchmark in science. Only two countries (Singapore, Taiwan) had a higher percentage of advanced scorers in 4th grade, and only six countries (Singapore, Taiwan, Japan, Korea, England, Hungary) had a higher percentage of advanced scorers in 8th grade.

Figure: Percentage of U.S. 8th grade students who reached each TIMSS international science benchmark, compared with the international median percentage.

NOTE: The numbers do not total 100 because each benchmark percentage is cumulative. In other words, each set of bars shows the total percentage of students who reached that benchmark or a higher benchmark.

Source: nces.ed.gov/timss/figure07_3.asp


How can I talk about TIMSS scores and educational reform more effectively?

Provide accurate information and reframe the conversation.

Complete the picture: It's about economics as well as education. Crisis rhetoric that positions the bulk of U.S. public education as failing is neither accurate nor helpful. It incorrectly blames education for what are also significant economic problems. For example, inaccurate reports of TIMSS scores are often used to fault U.S. education for not producing enough "qualified" American workers. Even if U.S. education produced an unlimited supply of "qualified" workers, economic issues (e.g., globalization, differences in national salary levels) would still complicate these students' chances of finding jobs. Failing to acknowledge these kinds of economic factors presents an incomplete picture, leading to misguided policies and ineffective solutions (e.g., policies that punish or withhold funding from low-scoring schools).

Be precise: It's about specific problems and solutions. Being specific about the nature of educational problems is crucial. For example, educators in schools where funding and access to technology are limited do face significant difficulties in educating students. In addition, some subjects, such as computer science, are increasingly important for students to learn but are typically absent from conversations about math and science reform. Advocating for specific policies and programs that address problems such as these is essential for truly improving students' performance and ability to innovate.

Author | Catherine Ashcraft