Observation Checklists versus Observation Data
By Dr. John L. Tenny, developer of the Data-Based Observation Method and the eCOVE Classroom Observation Software, January, 2010
I have looked at a couple dozen books on how to conduct classroom/teacher observations and have downloaded another 30 or so observation forms used by various districts across the country. I am struck by the general nature of most of them, and by the lack of specifics in the descriptors. The rating scales used run from observed/not observed to met/not met to a 5- to 7-point Likert scale. Some of the forms used by districts are designed to record anecdotal notes without a rating scale, with the evaluation of the performance sometimes placed on a separate summary page.
An example: This 'standard' came from a school district set of guidelines for conducting observations: "Establishes and maintains an orderly and supportive environment for students", and is, in one form or another, common in standards.
I just cannot see how a checklist or scale is helpful, let alone accurate, as a record of what happened in the class. If the observer checks 'observed', does that mean the class was at one point orderly and supportive? Alternatively, does it mean that when things started to get disorderly, the teacher responded and brought the class back in focus? On the other hand, since the phrase 'supportive environment' is also included, does that 'observed' indicate that the teacher made positive, encouraging statements? To all the students? Could the observer be satisfied with student work being posted on the walls, and a 'student of the week' bulletin board being present -- that is certainly supportive?
In addition, if the class was orderly some of the time and not others, and the observer checks 'not observed', would not the teacher respond with "Are you saying my class was never orderly? Or that I was never supportive? Or both?"
Observed/not observed does not work. It does not convey any helpful information and will lead to conflict between the observer and observee.
So what about a scale? Scales are typically designed to be either a 1 to 5 (poor to great) or a rubric of 'unsatisfactory, basic, emerging, competent, distinguished' type. Some of them will have descriptors for each of the levels, the worst being the range from 'did not observe/ observed some of the time/ observed most of the time/ observed all of the time'. These descriptors are worthless as they are nearly impossible to mark in a way that conveys what happened. For example, if the class were orderly for the first 3 minutes and then in chaos the rest of the time, the orderly standard would actually be met 'some of the time'. Actually, if the class were orderly up to 49% of the time, the same checkbox would apply; and if things were orderly 51% to 99% of the time, it would be 'most of the time'.
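To make the problem concrete, the frequency-style descriptors can be written out as a small Python sketch. The function name `frequency_label` and the exact cut points are assumptions for illustration only: the forms themselves do not define them, and treating exactly 50% as 'most of the time' is an arbitrary choice made here. The point is simply how many very different classes collapse into the same checkbox.

```python
def frequency_label(percent_orderly: float) -> str:
    """Map the percent of class time that was orderly to a checkbox label.

    The cut points are an assumed reading of the four-level descriptor
    discussed above; real forms rarely state them at all.
    """
    if not 0 <= percent_orderly <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent_orderly == 0:
        return "did not observe"
    if percent_orderly < 50:  # assumption: 50% itself counts as 'most'
        return "observed some of the time"
    if percent_orderly < 100:
        return "observed most of the time"
    return "observed all of the time"


if __name__ == "__main__":
    # Very different classes end up with identical labels.
    for percent in (6, 49, 51, 99):
        print(f"{percent:>3}% orderly -> {frequency_label(percent)}")
```

A class that was orderly 6% of the time and one that was orderly 49% of the time receive the same mark, as do classes at 51% and 99%. The percentage itself carries the information; the label discards it.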
There are other descriptors or indicators for each of the levels that seem more specific. For example, Charlotte Danielson in her Framework for Teaching has a standard for Management of Instructional Groups with a proficient level of competence described as "Tasks for group work are organized, and groups are managed so most students are engaged at all times." In the many districts that use some variation of Danielson's work as their standards, the observer would be asked to judge if the teacher was at this level during the current observation.
Set aside the fact that there are two behaviors in this indicator - organizing tasks and managing groups - which also confuses the issue, and just look at the act of making that determination of worth based on the second behavior. When you watch a typical classroom, deciding whether 'most' (is that 51% or is it really a higher target than that?) 'are engaged' (physically or mentally? engaged in low level or high level work?) 'at all times' (what would be the determination if the full class went off task for 3 minutes?) is so complex, and varies so widely across classrooms and observers, that the validity of the rating is suspect.
In discussions with administrators, what I find very often happens is that the observer adds additional criteria to the specific situation. 'Most students' sometimes turns into almost everyone in the class (a higher standard than stated); 'engaged' equates to looking and acting busy without regard for the level or quality of engagement; and 'at all times' is ignored in favor of an unspoken criterion of 'most of the time'. If the class is known to have kids with behavior problems or the number of students is high, the criteria are functionally lowered.
What is really happening is that the observer has internally defined a level that is satisfactory to him or her based on personal experiences, the makeup of the class, and the relationship with the teacher, and that definition is applied unevenly across classrooms. That inconsistency is confusing to everyone, and the results of classroom observations cannot be compiled across the building, let alone the district, as a basis for broad decisions. The system has a built-in subjectivity and personal interpretation of the standards that makes it difficult for any observer to be consistent and fair. Rating scales do not work, especially when very little effort is put into rater reliability and a clear statement of the objectives and indicators.
Data-based observations can make a significant difference. Some of the texts on classroom observations provide steps on how to turn a judgment on a rating scale into numbers and then process those numbers as if they were data -- but given all the issues with rating scales I believe this to be a false path.
Instead, I recommend using the Data-Based Observation Method, a 5-step process that includes the actual collection of observable behavior data. The steps are:
1. Identify the standards. Be sure that they are worded so that observable behaviors demonstrating those standards can be clearly identified. Good standard: Students will be engaged in learning activities. Bad standard: Teachers will act in an ethical and professional manner at all times (what does 'ethical' look like?).
2. Create indicators. Be sure that they describe the observable behavior identified in the standard. Good indicator: Students will listen attentively to the teacher, be productively engaged in individual work, or contribute to the work of a small group. Bad indicator: Students will follow teacher instructions as given (too vague and general).
3. Set criteria. As a profession, we have not engaged in setting criteria for ourselves in concrete terms, so this part is a new conversation.
What are the criteria for engaging students in learning? Should they be engaged 25% of the time? No, that is clearly too low. How about 50% of the time -- still sounds low. What about 95% of the time? Too high for real classes? The answer here is not to set the criteria arbitrarily, but to turn to (or conduct) research to establish criteria in which we can have confidence.
If the standard is an important one, and the indicators are valid, there will be a correlation between the behavior observed (such as student engagement) and the final desired outcome (student learning). We need to find those connections and use them as guides for improving teaching. We actually have research that identifies a significant number of them, but we are not applying that research at the classroom level.
4. Design data collection tools. I developed the eCOVE Classroom Observation Software as an easy and efficient way to collect the objective data, but you can use pencil and paper, a stopwatch, the wall clock in the classroom, etc. to collect the data once you have carefully identified what data is important to collect. Good tool: A counter tracking on/off task behavior and using the time sample data collection method to record the percent of time engaged and the percent of time not engaged for the entire class. By using the time sample data collection approach and repeated sweeps of the class to record the on/off task behavior of each student, a quite accurate data-based, objective picture of the class behavior is produced. This becomes a factual basis for making decisions. Useful tools include Class Learning Time, Level of Questions, Teacher Talk/Student Talk, and other tools reflecting research on best teaching practices.
5. Analyze and interpret the data. Did it meet the criteria? Is there a need or desire for a change? Given the context (number and diversity of the students, physical space, materials at hand, etc.), what is most likely to bring a positive change? When and where will the new approach be initiated? When will the next set of data be collected, analyzed, and interpreted?
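The arithmetic behind steps 4 and 5 can be sketched in a few lines of Python. This is a minimal illustration of pencil-and-paper time sampling, not the eCOVE implementation. The names `Sweep`, `percent_engaged`, and `meets_criterion`, the sweep counts, and the 85% criterion are all hypothetical placeholders. As step 3 argues, a real criterion should come from research rather than be set arbitrarily.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Sweep:
    """One pass across the room: how many students were on and off task."""
    on_task: int
    off_task: int


def percent_engaged(sweeps: Iterable[Sweep]) -> float:
    """Percent of all student observations that were recorded as on task."""
    sweeps = list(sweeps)
    total = sum(s.on_task + s.off_task for s in sweeps)
    if total == 0:
        raise ValueError("no observations recorded")
    on_task = sum(s.on_task for s in sweeps)
    return 100.0 * on_task / total


def meets_criterion(sweeps: Iterable[Sweep], criterion_percent: float) -> bool:
    """Compare the observed engagement with an agreed-upon criterion."""
    return percent_engaged(sweeps) >= criterion_percent


if __name__ == "__main__":
    # Hypothetical data: three sweeps of a 25-student class.
    observation: List[Sweep] = [Sweep(22, 3), Sweep(20, 5), Sweep(18, 7)]
    print(f"Engaged: {percent_engaged(observation):.1f}%")
    # 85 is a placeholder, not a research-based criterion.
    print("Meets criterion:", meets_criterion(observation, 85))
```

With these made-up counts, 60 of 75 observations are on task, so the class was engaged 80% of the time. That number, rather than a checkbox, is what the teacher and observer would then interpret together.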
For the greatest success, it is critical to operate with the belief that every vested interest be involved in this process. Administrators, teachers, parents, aides, students, counselors, etc. all have an important contribution to make where the purpose of the observation is the improvement of teaching and learning.
-----------
I am coming to realize what a big shift this is in the education field. We have tried to cite 'professional judgment' when the inconsistency in the process and the unreliability of the results support neither the process nor the conclusions. Serious collaborative discussions are needed to move to a more concrete basis for judging what we say we value, and how to use the specifics to guide the improvement of teaching.
Teacher Pay and Test Scores
By Dr. John L. Tenny, developer of the Data-Based Observation Method and the eCOVE Classroom Observation Software, January 2010
A while ago I read an article in Ed Week about the Houston and Denver districts' efforts in teacher pay for performance. Both programs are broad implementations of the pay-for-performance system and are struggling with enrollment and acceptance. The most interesting quotes were by Gayle Fallon, President of the Houston Federation of Teachers. Both quotes, "It's better than last year. Still, they are handing out money and getting nothing in return." and "What we hear from teachers consistently is that they have no clue what they did to get the money", point to the black-box nature of using student test scores as a primary determiner in awarding pay or other rewards. While student learning is the primary goal, the teacher's direct influence on student scores (as they indicate learning) is very difficult to determine. I can understand the teachers not 'having a clue' when the results of their efforts (the test scores) are calculated and revealed sometime in the future and those results also include the influence of a large number of other variables.
A medium-sized school district in Oregon has recently received a large grant from the Chalkboard Foundation to improve student learning. Part of their effort is a bonus pay system based on a teacher portfolio of evidence, which can include student scores as well as other strong evidence of exemplary teaching and professional conduct. They contacted me to discuss the use of data-based observation data on best practices as a part of that process.
There is credible research about teaching practices that result in increased student learning. eCOVE Software will track the implementation of those practices in an individual classroom. Now we have the opportunity to reward teachers who are implementing those researched best practices. The process is not difficult to manage: identify the behaviors that everyone is confident in as directly influencing student learning (Class Learning Time, Time on Task, Wait Time, Level of Questions {as answered by students, not just asked by the teacher}, etc.), train observers (teachers, aides, paid data gatherers, administrators) to competently use the data collection tools, and determine the appropriate data collection procedures (number of data points, length of individual data collection events, etc.).
The result of this, I predict, will be interesting and engaging. Not only will teachers know immediately that they are using the researched best practices in their classroom, but they will have a running record of that. It is that running record that is the greatest benefit - it can provide feedback in a timely and useful manner to the teacher who has a goal of becoming an exemplary teacher. They can immediately see if they are moving toward a higher level of proficiency instead of waiting for months to find out if they 'won'. As I have said before, teachers are deeply dedicated to effectively teaching their students in the best manner possible. Bringing the objective feedback to the classroom level in real time will build more effective teachers; then we'll know why those scores went up as well as have the 'clues' we need.
As a side note, I'm a bit concerned that the link between student performance and more money is a strong extrinsic motivator and will shift the focus of why one becomes or continues to be a teacher. I think the immediate, objective, and over-time feedback that eCOVE provides will not only reinforce the skills of teaching but will also reinforce the teacher's perception of their skills. Since we love doing what we do well, the data and teacher reflection become the intrinsic motivator. As teachers become or continue to be successful in their craft, and are clearly aware of their successes, they will keep doing what they love - helping kids.
Developing Self-Directed Professional Growth
By Dr. John L. Tenny, developer of the Data-Based Observation Method and the eCOVE Classroom Observation Software, November 2009
Quality staff development efforts are directed at improving teaching and learning in ways that will result in long-term change. The topics included in staff development come from national, state, district, and building standards; from benchmark test scores; from school board and superintendent directives; from research and scholarly journals; and from the teachers themselves. Programs include training on goal setting, action research, specific curriculum or behavior techniques, communication approaches, brain research, child psychology, and a nearly unlimited list of other topics; all of which have value and work to some degree.
However, there is a persistent level of frustration among staff developers around the resistance to new ideas, the difficulties in getting teacher buy-in and implementation, and the limited impact of staff development efforts. It is the premise of this paper that a more productive perspective is to focus on a more fundamental skill needed by professional educators — the skill of reflection.
Every educator has been involved in ‘reflection’ exercises, from college assignments to responding to the annual evaluation. While one would think that teachers are therefore skilled at reflecting on their teaching and their students’ behavior and learning, there has been a significant missing link. The focus of reflections has, to date, been nearly always on something either abstract, outside the control or influence of the teacher, or in response to a judgment or opinion of someone else.
For example, reflecting on the drop in 5th grade reading scores can only be done as an abstract exercise as the exact causes are not known. The number of variables affecting a change in scores is extensive, and the teacher does not have the data/information needed to ‘reflect’, let alone explain or develop an effective plan of action. Similarly, asking a teacher to reflect on observation reports with a list of met/not met or observed/not observed items, or worse yet, a low ranking on a Likert scale, will nearly always result in a defensive or deflective response, accompanied frequently by anger, resentment, and hostility. A judgment of one’s worth is always subject to suspicion of bias, and challenging that judgment is nearly always an adversarial exercise.
If reflecting, under current practices, is so difficult to accomplish in a meaningful way, how can staff development efforts increase that foundational skill? The answer lies in conducting objective, data-based observations on the behaviors of teacher and students. The data is provided to the teacher without judgment, praise, or criticism, along with a simple question: "Is this what you thought was happening in your classroom?" This results in the beginning of reflection on the activities within the classroom, the teacher's plans and goals, and other variables that would have an impact on the data. When a teacher is presented with the actual duration and/or frequency data rather than the observer's subjective evaluation, there is a shift in the dynamic from defense and deflection to an empowering professional engagement with the results of the observation.
Most often when a person is engaged in teaching and classroom management, it's not possible for them to see clearly the interaction between the lesson delivery, classroom materials, and student behaviors. When an observer gathers data on focused behaviors such as level of questions, teacher talk time, teacher response to misbehavior, etc., the teacher can become engaged in a non-defensive manner and move from pleasing the observer to objectively devising and testing research-based approaches to classroom activities.
Until recently, the process of gathering this type of data has been daunting, involving pencil, paper, and stopwatches, followed by time spent doing the calculations. An innovative program, eCOVE Software, has eliminated nearly all of the time-consuming mechanics and has enhanced both the data collection and reporting process. eCOVE includes 40 specific data collection tools, and runs on both Macintosh and Windows computers. An observer can easily gather data by operating the floating timer and counter tools, and produce reports on individual observations and/or behavior over time. Tools for tracking additional behaviors can be collaboratively developed using the tool creation templates.
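For readers curious about the calculations a timer tool takes over, here is a minimal Python sketch of turning stopwatch readings into duration data for teacher talk versus student talk. It is an illustration under stated assumptions, not eCOVE's actual code: the function name `talk_time_percentages`, the speaker labels, and the timings are all hypothetical.

```python
from typing import Dict, Iterable, Tuple


def talk_time_percentages(
    intervals: Iterable[Tuple[str, float]]
) -> Dict[str, float]:
    """Summarize timed intervals as percent of total talk time per speaker.

    Each interval is a (speaker, seconds) pair, as an observer might
    record by starting and stopping a timer for each speaker.
    """
    totals: Dict[str, float] = {}
    for speaker, seconds in intervals:
        if seconds < 0:
            raise ValueError("interval length cannot be negative")
        totals[speaker] = totals.get(speaker, 0.0) + seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no talk time recorded")
    return {
        speaker: 100.0 * seconds / grand_total
        for speaker, seconds in totals.items()
    }


if __name__ == "__main__":
    # Hypothetical stopwatch readings from a 10-minute segment.
    readings = [
        ("teacher", 300),
        ("student", 60),
        ("teacher", 180),
        ("student", 60),
    ]
    for speaker, percent in talk_time_percentages(readings).items():
        print(f"{speaker}: {percent:.0f}% of talk time")
```

With these made-up readings the teacher talked for 480 of 600 seconds, or 80% of the time. Presented without comment, a figure like that is the kind of data that invites the question "Is this what you thought was happening in your classroom?"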
When the data is shared in a non-judgmental manner, the teacher has a sound basis for reflection. Working through determining the meaning of the data and the possible need for change is an invigorating professional discussion. Following this process with tracking the implementation of a plan of action and the outcome in student behaviors through additional data collection further enriches the reflection process. The result is self-directed professional growth by the teacher and a collaborative relationship between the observer and teacher.