Designing Mathematics Assessments

Released items from state and national mathematics assessments can be helpful when designing a grade-level assessment instrument, which can then serve as one source of data in evaluating teacher professional development or student mathematics interventions.

The list below identifies sources of released assessment items that might be aligned with the content of your program. This list will be updated as additional sources are identified.

New York State Department of Education Released 2017 Grades 3-8 ELA and Mathematics State Test Questions

New York State Department of Education Released 2016 Grades 3-8 ELA and Mathematics State Test Questions

PARCC Released Items

NAEP Released Items / NAEP Questions Tool (a Facebook video walks through how to use the tool)

Once a pool of items is established, you’ll need to consider next steps to ensure the validity and reliability of the assessment.
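On the reliability side, one common first check after piloting the items is an internal-consistency estimate such as Cronbach’s alpha. Below is a minimal sketch, assuming pilot results are arranged as a students-by-items array of dichotomous (0/1) scores; the data and variable names are hypothetical.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Estimate internal consistency for a students-by-items score matrix."""
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)      # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical pilot data: 6 students x 4 dichotomously scored items.
pilot = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
])
print(f"Cronbach's alpha: {cronbach_alpha(pilot):.2f}")
```

Values around 0.7 or higher are often treated as acceptable for low-stakes instruments; validity, by contrast, rests on judgments about how well the items align with the program’s content.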

 

Defining a Program

What makes a program a program?

A recent webinar led by Boris Volkov, hosted by AEA’s Organizational Learning & Evaluation Capacity Building Topical Interest Group, prompted me to think about the important qualities of a program. His talk, titled From Purist to Pragmatist: Expanding Our Approaches to Building Evaluation Capacity, opened with shared definitions of program, evaluation, and program evaluation. He used the CDC (2011) definition that states:

A program is any set of organized activities supported by a set of resources to achieve a specific and intended result (p. 3)

Inherent in this rather broad definition are three important moving parts:

  1. Set of resources (inputs)
  2. Set of organized activities (activities described by outputs)
  3. Specific and intended result (outcomes and impacts)

Each of these provides parameters that define and focus a program and its evaluation. Yet, embedded in the definition are the qualifiers: organized, specific, and intended. Activities must have some underlying organization; perhaps they are organized developmentally, sequenced by difficulty, or arranged according to previous research. Results must be specific and intended. That is, there is a desired end that can be clearly articulated so as to justify the means.

A program exists when the qualifiers (organized, specific, and intended) are understood and there is a logical relationship between resources, activities, and outcomes. This logical arrangement is often described in a program’s logic model. Without understood qualifiers and an articulated logic, a program is not a program, but rather a loose set of activities.
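To make that logic concrete, it can help to write the three moving parts down in a structured form. Below is a minimal sketch of a logic model as a simple data structure; the program and every entry in it are invented for illustration.

```python
# A minimal, hypothetical logic model for an invented mathematics PD program.
logic_model = {
    "inputs":     ["facilitator time", "released assessment items", "funding"],
    "activities": ["monthly PD workshops", "classroom coaching visits"],
    "outputs":    ["8 workshops delivered", "40 teachers coached"],
    "outcomes":   ["improved instructional practice", "higher student achievement"],
}

for component, entries in logic_model.items():
    print(f"{component.upper()}:")
    for entry in entries:
        print(f"  - {entry}")
```

If any of the four lists is empty, or the links between columns cannot be explained, that is a sign you have a loose set of activities rather than a program.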

Evaluating Change

Programs are designed and implemented to create change, often across multiple levels of a program; but at the most basic level, change occurs in individuals. Quoting Albert Wenger, “change creates information.” Information about the changes that occurred during a program is essential to evaluating it.

According to Radhakrishna and Relado (2009), a program might influence change in a participant’s knowledge, attitudes, skills, and aspirations (KASA), as well as behavior. Another common set of individual outcomes, KAP, includes knowledge, attitudes, and practices (for example, see Chaplowe (2008)). At their intersection, changes in:

  • what participants understand (knowledge),
  • what participants base their decisions on (attitudes/beliefs), and
  • what participants eventually enact (practices/behaviors)

lead to useful information about the potential impact of a program.
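Because these outcomes are typically measured before and after a program, a simple first pass at turning change into information is a paired pre-post comparison. The sketch below uses invented pre and post scores on a hypothetical knowledge scale, with a paired t-test as one common way to check whether the mean gain is distinguishable from zero.

```python
import numpy as np
from scipy import stats

# Hypothetical matched pre/post knowledge scores for 8 participants.
pre = np.array([55, 60, 48, 72, 65, 58, 70, 62])
post = np.array([63, 66, 55, 75, 70, 64, 78, 61])

gains = post - pre
t_stat, p_value = stats.ttest_rel(post, pre)  # paired (dependent samples) t-test

print(f"Mean gain: {gains.mean():.1f} points")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```

The same pattern extends to attitude and practice scales, though self-reported practices usually warrant corroborating evidence such as observations.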

 

Scheduling Evaluation Interviews

Interviewing multiple individuals for a project is time-consuming, and scheduling those interviews can be just as demanding. Several online applications can match available interview times between interviewers and interviewees; however, I created an easy, quick, and no-cost system to schedule interview times using a Google Doc, as described below.

I created a Google Doc listing the days and times I was available for interviews. I shared it with all interviewees at once using a link that granted permission to view, but not edit, the online document.

As my time commitments changed, I quickly updated the document to reflect days and times that became available or unavailable. At some level, using a Google Doc in this manner was a quick way to create an instant website describing my availability. I used an email similar to the one below to share this process with those I planned to interview.

Hi Dana,

[Opening Intro]

I’d like to schedule an hour-long phone conversation to talk with you about [name of program].

This online document suggests several day/time possibilities. Please reply to this email with your preferred choices. If none will work, please suggest a couple of others. I’ll confirm in a return email.

I’m really looking forward to talking with you about [name of program].

Thanks,

Chris

 

 

Program Evaluation Tiers

The Harvard Family Research Project offers several useful evaluation resources. Their guide Afterschool Evaluation 101: How to Evaluate an Expanded Learning Program gives non-evaluators a helpful overview of program evaluation. The guide is organized around nine steps:

  1. Determining the evaluation’s purpose
  2. Developing a logic model
  3. Assessing your program’s capacity for evaluation
  4. Choosing the focus of your evaluation
  5. Selecting the evaluation design
  6. Collecting data
  7. Analyzing data
  8. Presenting evaluation results
  9. Using evaluation data

Evaluations vary as much as programs do (e.g., different activities, durations, and outcomes), underscoring the importance of choosing the focus of an evaluation wisely. Step 4 in the guide, “Choosing the focus of your evaluation,” describes a five-tier approach, summarized below (pages 13-16). The appropriate evaluation focus depends largely on a program’s maturity and developmental stage.

  • Tier 1: Conduct a needs assessment to address how the program can best meet needs
  • Tier 2: Document program services to understand how they are being implemented
  • Tier 3: Clarify the program to see if it is being implemented as intended
  • Tier 4: Make program modifications to improve the program
  • Tier 5: Assess program impact to demonstrate program effectiveness

As you can see, evaluation can and should accompany a program throughout its lifespan. These tiers are useful for designing an evaluation plan and determining appropriate methods of data collection and analysis.

Evaluating Positive Youth Development (PYD)

Out-of-school programming often seeks to develop outcomes in children beyond the purely academic. Working with program leaders to align program activities and goals with specific outcomes can be a challenge. In many cases, program activities are determined prior to determining program outcomes. In these instances, understanding the components of positive youth development can be useful for reflecting on the alignment between program activities and program outcomes.

First, the Oregon State University 4-H Youth Development Program developed the Positive Youth Development Inventory (PYDI), intended to assess PYD changes in students ages 12-18. The collection of 55 Likert-scale items measures six latent constructs (a scoring sketch follows the list):

  1. Confidence
  2. Competence
  3. Character
  4. Caring
  5. Connection
  6. Contribution

Second, The Colorado Trust developed the Toolkit for Evaluating Positive Youth Development, which contains survey-administration guidance along with pre-post and post-only instruments examining eight outcome domains:

  1. Academic success
  2. Arts and recreation
  3. Community involvement
  4. Cultural competency
  5. Life skills
  6. Positive life choices
  7. Positive core values
  8. Sense of self

Both resources above were initially designed for students in Grade 4 or higher. Please indicate in the comments below if you know of resources suitable for students in Kindergarten-Grade 3 (ages 5-8).

Areas of Interest

As indicated in the tagline of this blog, I’m interested in the design, implementation, and evaluation of educational programs. This post, to be refined over time, serves as a framework describing relevant topics that will be explored in the future.

About program design:

  1. Teacher preparation, induction, and development
  2. Mathematics professional development
  3. College readiness and persistence
  4. School-community intersection
  5. Community health

About program implementation:

  1. Fidelity of implementation

About program evaluation:

  1. Evaluation capacity building
  2. Empowerment evaluation
  3. Developmental evaluation
  4. Monitoring and evaluation
  5. Evaluation questions and rubrics
  6. Theory of change and logic models

What other topics intersect or are aligned with these topics that would be helpful to add to this list?

Evaluation Memo

Several clients have more than one evaluation project happening at the same time. Evaluation activities, including data collection, analysis, and reporting, differ for each project. As a means to consolidate all of these activities into a monthly snapshot, I developed the evaluation memo.

An evaluation memo is a monthly correspondence (1-3 pages) between me and a client that initiates a dialogue that:

  • Recaps evaluation activities that occurred that month
  • Poses questions the client needs to answer for the evaluation to move forward
  • Requests additional documents, data, or information
  • Shares upcoming activities and deliverables related to our program evaluation work

The monthly evaluation memo serves as a running record of recent and upcoming evaluation work, along with a current client to-do list, keeping the evaluation moving forward and on track.