NIH Public Access Author Manuscript. Accepted for publication in a peer-reviewed journal.
Rehabil Psychol. Author manuscript; available in PMC Nov 1, 2010.
Published in final edited form as:
PMCID: PMC2799120

A Behavioral Observation System for Quantifying Arm Activity in Daily Life after Stroke



Objective

Evaluate psychometric properties of the Functional Arm Activity Behavioral Observation System (FAABOS) for measuring hemiparetic arm use.

Participants and Measures

All participants had acquired their brain injury more than 1 year before study entry; most had mild-to-moderate upper-extremity hemiparesis. In Study 1, nine stroke survivors wore accelerometers and were videotaped for 15 minutes in the hospital or at home after they were asked to behave as usual. In Study 2, one traumatic brain injury survivor and eight stroke survivors wore accelerometers and were videotaped at home for 3 days with a motion-triggered camera. Observers independently rated 15-minute segments of the Study 1 and 2 videotapes in 2-s blocks with a 4-step arm-activity coding scheme.


Results

Inter-rater reliability was excellent; the mean Cohen’s κ in each study was ≥ .84. For data from both studies combined, validity was supported by a strong correlation between amount of hemiparetic arm functional activity, as determined by the observers, and the ratio of hemiparetic to other arm movement, as determined by accelerometry.


Conclusions

FAABOS reliably and validly quantifies amount of spontaneous hemiparetic arm activity outside the laboratory.

Keywords: arm, observation, rehabilitation, treatment outcome, stroke

Several bodies of evidence suggest that, under certain conditions, actual use of an upper extremity departs markedly from the severity of its impairment after neurological injury (e.g., see review by Uswatte & Taub, 2005). Animal studies directed by E. Taub have shown that monkeys that have had surgery to abolish all sensation from a forelimb typically never use this limb again, even though they recover considerable ability to control movements of the deafferented limb 8–24 weeks after surgery. The monkeys fail to use the forelimb at that point because their earlier attempts to use it, made soon after surgery when they did not yet have sufficient motor control to manipulate objects or support their weight, were unsuccessful. These failed attempts punish use of the deafferented forelimb and suppress its use permanently unless the “learned nonuse” is counterconditioned using some simple behavioral methods. Cross-sectional studies of hemiparetic arm function in stroke and traumatic brain injury (TBI) survivors with mild to moderate motor impairment show discrepancies between severity of paresis and actual use of the paretic arm in daily life that are consistent with the type of excess motor deficit present in the deafferented monkey model. Clinical trials of Constraint-Induced Movement therapy (CI therapy) show that, after CI therapy, stroke and TBI survivors exhibit much larger gains in real-world use of the hemiparetic arm than in motor ability.

This evidence, along with biopsychosocial models of disability such as the diagnostic system from the World Health Organization (2001), has led to interest in measuring impairment and real-world activity separately. The new measures of real-world activity include structured interviews that assess real-world arm use, such as the Motor Activity Log (Taub et al., 1993; Uswatte, Taub, Morris, Light, & Thompson, 2006; Uswatte, Taub, Morris, Vignolo, & McCulloch, 2005), and physical sensors that continuously monitor arm movement during daily life, such as accelerometers (Schasfoort, Bussman, Zandbergen, & Stam, 2003; Uswatte et al., 2005; Uswatte et al., 2000). Their validity has been demonstrated by examining how they are correlated with each other (e.g., Uswatte, Taub, Morris, Vignolo, & McCulloch, 2005) and with measures of impairment (e.g., van der Lee, Beckerman, Knol, de Vet, & Bouter, 2004). (The latter approach is flawed given the evidence cited above for dissociation between real-world use and severity of impairment under certain conditions; Uswatte & Taub, 2005.) A "gold standard" against which these real-world measures can be compared has not been available.

This paper presents inter-rater reliability and convergent validity for the Functional Arm Activity Behavioral Observation System (FAABOS), which quantifies amount of more-impaired arm function in daily life from random samples of video recordings from stroke and TBI survivors’ homes. Results are presented from two separate groups of participants (Study 1 and 2), who were videotaped in different contexts and rated by different sets of observers. This instrument, which directly and objectively assesses the construct of interest, might serve as such a gold standard, and as a measure of treatment outcome and functional status on its own.



Individuals with acquired brain injuries were recruited from the occupational therapy clinic or clinical research projects at a rehabilitation hospital in the Southeastern United States. In Study 1, eight participants had mild-to-moderate paresis of the arm more affected by acquired brain injury; one had very severe paresis of the more-affected arm. In Study 2, all nine participants had mild-to-moderate paresis of the more-affected arm. Severity of more-impaired arm motor deficit, i.e., paresis, was graded according to a scheme used in Constraint-Induced Movement therapy research (described in Bowman et al., 2006; Taub et al., 2006). Table 1 summarizes additional characteristics of the Study 1 and 2 participants.

Table 1
Demographic and Injury-related Characteristics of Participants in Study 1 (N = 9) and 2 (N = 9)



Functional activity was assessed by pairs of observers who coded how participants used their more-impaired arm in 2 s blocks from video of participants behaving spontaneously. Observers in both Study 1 and 2 were second-year physical therapy graduate students; they received approximately 12 hr of instruction and 24 hr of practice prior to coding study video. They worked in pairs because one person was needed to stop, start, and replay the video, while the other recorded ratings. Additionally, the consensus rating of two observers was thought to have less chance of being idiosyncratic than that of one observer. To ensure that their responses were independent, each pair of observers scored videotapes at separate times and recorded their scores in separate workbooks. Each pair of observers was also instructed not to share their responses with the other pair until coding of video from all participants was completed. Thereafter, both pairs of observers met, reviewed the 2 s blocks on which they disagreed, and came to consensus on the appropriate codes for these blocks.

FAABOS codes were: 0 (no activity or movement), 1 (nonfunctional activity), 2 (nontask-related functional activity), and 3 (task-related functional activity). Observer pairs followed the coding rules in the Appendix. Amount of functional arm activity in a video was quantified by averaging FAABOS codes assigned to each 2 s block. A coding interval of 2 s was selected based on previous pilot work showing that, out of a range of intervals, this one most frequently accommodated complete upper-extremity functional acts (Uswatte et al., 2000).
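The summary score described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' software: the function name `mean_faabos_code` and the use of `None` to mark blocks in which the rated arm was not visible (treated as missing data under the Appendix rules) are assumptions made for the example.

```python
from typing import Optional, Sequence

VALID_CODES = {0, 1, 2, 3}  # the four FAABOS categories named in the text


def mean_faabos_code(codes: Sequence[Optional[int]]) -> float:
    """Average the FAABOS codes assigned to the 2 s blocks of one video segment.

    Blocks marked None (a hypothetical convention for "rated arm not visible")
    are treated as missing data and excluded from the average.
    """
    scored = [c for c in codes if c is not None]
    if not scored:
        raise ValueError("segment contains no scored blocks")
    for c in scored:
        if c not in VALID_CODES:
            raise ValueError(f"invalid FAABOS code: {c!r}")
    return sum(scored) / len(scored)
```

A 15-minute segment coded in 2 s blocks yields up to 450 codes, so the summary value always lies between 0 and 3.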

Each FAABOS category was delineated by giving a “technical definition” followed by examples to help convey the “family resemblance” among activities in that category (Smith & Medin, 1981). For instance, the technical definition for category 3, “an arm movement or action that helps to accomplish a task,” was supplemented by several examples (see Appendix, Table 1). This approach was followed rather than relying only on a technical definition because “tasks”, like some common categories such as games or tools, cannot be defined by a set of necessary and sufficient features. Such categories are governed by the principle of family resemblance or successive overlap: member A shares several features in common with member B, member B shares a different but overlapping set of features with member C, and so on. Despite the lack of precise definitions, we are able to use these everyday categories based on family resemblance to successfully communicate with each other, i.e., we generally know a tool when we see one.


Duration of extremity movement was monitored with an accelerometry system (Uswatte et al., 2005; Uswatte et al., 2000) that has established reliability and validity (Uswatte et al., 2005; Uswatte et al., 2006; Uswatte et al., 2000). Accelerometers¹ were worn on each arm, the chest, and a leg. The duration of movement of each of these body parts over the data collection period was calculated by (a) programming a short recording epoch (i.e., 2 s), (b) setting the raw acceleration value recorded in each epoch to 1 if it was above a low threshold and to 0 otherwise, (c) multiplying the “threshold-filtered” values by 2, and (d) taking the sum of the resulting products, i.e., counting the number of 2 s epochs with above-threshold acceleration values and expressing the total in seconds. Thresholds for filtering the arm, chest, and leg recordings were 2, 2, and 10, respectively.
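Steps (b) through (d) of the threshold filter can be illustrated with the following minimal sketch. It assumes one raw value per 2 s epoch and a strict "greater than" comparison; the function name `movement_duration_seconds` is hypothetical, and the units of the raw values are whatever the recording device reports.

```python
from typing import Sequence

EPOCH_SECONDS = 2  # recording epoch length given in the text


def movement_duration_seconds(raw_values: Sequence[float], threshold: float) -> int:
    """Return duration of movement in seconds for one body part.

    raw_values: one raw acceleration value per 2 s epoch (assumed layout).
    threshold:  e.g., 2 for arm or chest recordings, 10 for leg recordings.
    """
    # Step (b): threshold filter, 1 if above threshold and 0 otherwise.
    filtered = [1 if value > threshold else 0 for value in raw_values]
    # Steps (c) and (d): multiply each filtered value by the epoch length and sum.
    return sum(EPOCH_SECONDS * flag for flag in filtered)
```

For example, five epochs with raw values 0, 3, 2, 5, and 1 and a threshold of 2 contain two above-threshold epochs, which corresponds to 4 s of movement.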


Study 1 participants were asked to behave as they would usually for approximately 15 minutes while they were videotaped and wore accelerometers in their homes or the rehabilitation hospital. Two pairs of observers coded more-impaired arm activity using the FAABOS scheme.

Study 2 participants were videotaped and wore accelerometers continuously during waking hours for 3 days in their homes only. A miniature, wide-angle video camera was installed in a room that participants frequented regularly and connected to a motion-triggered recorder.² Based on pilot work, the post-trigger recording period was set to 5 minutes. In addition, trigger sensitivity was adjusted to the room’s lighting and “tripwires” were placed according to each participant’s activity patterns so that virtually continuous video of a participant in the room, including periods when the participant was still, could be obtained. Two 15-minute segments were randomly selected from those periods when participants were on-screen, and were coded by a second set of observer pairs. The university institutional review board approved the procedures; all participants gave informed consent.

Data Analysis

The agreement between the independent FAABOS ratings from each observer pair was indexed by Cohen’s κ, which corrects for chance correspondence (Bakeman & Gottman, 1986). Agreement was calculated between the ratings assigned by each observer pair to each 2 s block of the 15-minute video segments collected from every participant. Kappa values above .75 are thought to reflect excellent inter-rater reliability (Cicchetti, 1994).
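For readers unfamiliar with the statistic, Cohen's κ compares the observed proportion of agreement with the proportion expected by chance from each rater's marginal code frequencies. The sketch below is a generic implementation of that standard formula, not the authors' analysis code; the function name `cohens_kappa` is an assumption, and the handling of blocks coded as missing is left out.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(ratings_a: Sequence[Hashable], ratings_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two sets of ratings of the same blocks."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("rating lists must be non-empty and of equal length")
    # Observed proportion of blocks on which the two ratings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal frequency of each code.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[code] * count_b[code] for code in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)
```

With ratings `[0, 0, 1, 1]` and `[0, 0, 1, 0]`, observed agreement is .75 and chance agreement is .50, giving κ = .50, which shows how κ can fall well below raw percent agreement.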

Convergent validity was evaluated by the Pearson correlation across participants between the average FAABOS rating for each 15-minute video segment and the corresponding ratio of more-impaired arm to less-impaired arm threshold-filtered accelerometer recordings. This ratio has been shown to be correlated with amount of more-impaired arm use, as measured by the Motor Activity Log, in stroke survivors with mild to moderate upper-extremity hemiparesis, r’s > .5, p’s < .05 (Uswatte et al., 2005; Uswatte et al., 2006). In the behavioral sciences, correlation coefficients > .5 are considered strong (Cohen, 1988). For the purpose of examining convergent validity, consensus FAABOS ratings were used, i.e., those determined after both pairs of observers met and came to agreement on 2 s blocks of video for which they gave different codes. In addition, data from both Study 1 and 2 were combined because of the small sample size in each study. For Study 2 data, FAABOS and accelerometry summary values were aggregated over both 15-minute video segments sampled from each participant. Accelerometry data were not available from two participants in Study 2 because an error was made in downloading their data.
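The two quantities entering this analysis can be sketched as follows. The function names `arm_ratio` and `pearson_r` are hypothetical, the inputs are assumed to be threshold-filtered movement durations and per-participant summary values, and the code is a generic illustration of the ratio and of the Pearson formula rather than the authors' analysis script.

```python
from math import sqrt
from typing import Sequence


def arm_ratio(more_impaired_seconds: float, less_impaired_seconds: float) -> float:
    """Ratio of more-impaired to less-impaired arm movement duration."""
    if less_impaired_seconds <= 0:
        raise ValueError("less-impaired arm duration must be positive")
    return more_impaired_seconds / less_impaired_seconds


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences of values."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)
```

In use, `x` would hold one mean consensus FAABOS rating per participant and `y` the matching accelerometry ratio. The degrees of freedom for the test of r are the number of participants minus 2.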

Results and Discussion

Inter-rater reliability between the pairs of observers was excellent. In Study 1, Cohen’s κ, on average, was .84 (SD = .09); in Study 2, it was .85 (SD = .11). Corresponding percent agreement values for Study 1 and 2 were 89 (SD = 7) and 92 (SD = 4), respectively. Table 2 lists Cohen’s κ and percent agreement values for each participant. The convergent validity of FAABOS was supported by a strong correlation between the FAABOS ratings and the ratio of more- to less-impaired arm accelerometer recordings for data from both studies combined, r (14) = .55, p < .05. Table 3 summarizes FAABOS and accelerometry values obtained and the locations and types of activity sampled.

Table 2
Agreement Between the Independent FAABOS Ratings of the Two Pairs of Observers in Study 1 (N = 9) and 2 (N = 9)

Table 3
Selected Features of the Coded Segments of Video from Study 1 (N = 9) and 2 (N = 9)

Additional support for this coding scheme was provided by a post-hoc analysis. FAABOS categories 0 and 1 were collapsed to form a super-ordinate category of nonfunctional activity, while categories 2 and 3 were collapsed to form a super-ordinate category of functional activity. Inter-rater reliability for the super-ordinate category coding scheme (Study 1, mean Cohen’s κ = .85, SD = .09; Study 2, mean Cohen’s κ = .92, SD = .05) was only modestly better than that for the original scheme (see above), suggesting that the distinction drawn, for example, between task-related functional activity and nontask-related functional activity was meaningful.
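The recoding used in this post-hoc analysis amounts to a simple mapping of the four codes onto two. A minimal sketch, assuming the collapsed categories are labeled 0 (nonfunctional) and 1 (functional), which is a labeling chosen here for illustration:

```python
from typing import List, Sequence


def collapse_code(code: int) -> int:
    """Map FAABOS codes 0 and 1 to 0 (nonfunctional), and 2 and 3 to 1 (functional)."""
    if code not in (0, 1, 2, 3):
        raise ValueError(f"invalid FAABOS code: {code!r}")
    return 1 if code >= 2 else 0


def collapse_codes(codes: Sequence[int]) -> List[int]:
    """Apply the two-category recoding to every block in a segment."""
    return [collapse_code(code) for code in codes]
```

Agreement on the collapsed codes can then be computed with the same κ statistic used for the original four-category scheme.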

A strength of Study 2 was that data were collected under conditions (i.e., continuous in-home recording over approximately 72 hr) that were more likely to induce behavior representative of everyday activity than those of Study 1, in which participants were asked to behave as usual immediately before videotaping. Furthermore, the replication of inter-rater reliability values from Study 1 in a separate sample coded by different observers (i.e., Study 2) increases confidence that FAABOS can be successfully adopted by other rehabilitation researchers interested in validating self-report and physical measures of arm activity against direct observations.

A major limitation of both Study 1 and 2 was their small sample size. Before FAABOS can be considered as a measure of functional status or treatment outcome, additional data are needed on (a) stability of FAABOS summary values from randomly sampled 15-minute segments recorded in more varied settings and (b) correlation of these summary values with indices of more-impaired arm use over extended periods, such as the Motor Activity Log. Studies of alternate coding schemes that quantify particular activities (e.g., manipulating a fork, folding laundry) or patterns of arm movement (e.g., types of grasp) would also be valuable. However, training observers to apply such complex schemes reliably might be time-consuming. Furthermore, such schemes would not yield a single number that captures the amount of functional activity, which would be important for assessing rehabilitation outcome.

Two trends in rehabilitation research support further development of instruments such as FAABOS. One, as noted in the Introduction, interest in measuring rehabilitation consumer activity in the community is growing because of theory and evidence that, under certain conditions, actual use of a body part in daily life can depart markedly from its motor capacity as measured by performance on laboratory tests. Two, controlled studies supporting the efficacy of rehabilitation for the hemiparetic (i.e., more-impaired) arm of individuals with stroke (e.g., reviewed in Taub & Uswatte, 2006), along with evidence of remarkable brain plasticity after neurological injury (e.g., reviewed in Uswatte & Taub, in press), have led to an increasing emphasis on improving more-impaired arm activity as opposed to teaching compensatory strategies. This priority naturally creates a demand for instruments that directly measure more-impaired arm function, as opposed to functional independence. (Functional independence can be strengthened by generating improvements in more-impaired arm function or more effective compensatory strategies, including changes in the environment.) FAABOS has the potential to address these needs by providing a rich measure of more-impaired arm activity in the home setting itself.

More generally, behavioral observation systems are uncommon in rehabilitation psychology research today, with the consequence that much research examines the relationship of self-report measures to each other. Self-report measures, of course, are subject to several types of bias, including availability bias, demand characteristics, and experimenter bias (e.g., see review by Uswatte & Schultheis, in press). Moreover, studying the relationship between two constructs that are both measured by self-report artificially inflates the association between them because of variance shared simply due to the common mode of measurement (Campbell & Fiske, 1959). Self-report measures also typically provide a static view of a construct. Behavioral observation systems, such as FAABOS, provide a fine-grained picture of changes in behavior across time and place, and thus permit researchers to examine what interpersonal and environmental stimuli control behaviors of interest (Merbitz, 1996). Knowledge of such temporal processes is critical to generating new models of recovery and adaptation after disabling injury, as well as new treatment approaches, and therefore important for advancing our field (Elliott, 2002).


This research was supported by Grant 0365163B from the American Heart Association Southeast Affiliate, Grants HD34273 and HD053750 from the National Institutes of Health, Grant 97-41 from the James S. McDonnell Foundation, and Grant H133G050222 from the National Institute on Disability and Rehabilitation Research.


Appendix: FAABOS Coding Scheme

Table 1
Categories of Arm Activity in FAABOS Coding Scheme

Category 3: Task-related
Definition: A movement or action that helps to accomplish a task.
Examples:
  • Reaching to grab object
  • Grabbing a can (failed attempts are included)
  • Releasing object
  • Wiping dishes, table top, ...
  • Transporting object from one place to another
  • Transferring object from one hand to other
  • Holding object while object is being used by person being rated or someone else
  • Using hand/arm to push-up from chair
  • Use hand to obtain support from rail

Category 2: Nontask-related, functional
Definition: A movement or action which has some function, but does not accomplish a task.
Examples:
  • Adjust glasses
  • Touch face
  • Scratch leg
  • Hand/arm gesture
  • Hand to mouth to cover yawn
  • Holding object, but object is not being used or transported
  • Incomplete movement (adjusting placement of arm in preparation for reaching for object)
  • Moving arm from one position to another
  • Support/hold body part

Category 1: Nonfunctional
Definition: A movement or action which does not have any function or has minimal function.
Examples:
  • Tic, tremor
  • Arm movement secondary to other body part movements (i.e., arm swing when walking).
  • Passive movement (i.e., rated arm is moved by unrated arm or therapist’s arm; any amount of assistance classifies the movement as passive; drops mostly due to the effect of gravity)

Category 0: No activity or movement
Definition: No movement or action of the arm.
Examples:
  • Arm at rest on chair when watching TV

Table 2

Other Rules Governing the FAABOS Coding Scheme

1. Behavior is coded in 2 s blocks.
2. If there is some movement or action in the 2 s block, even if it is for only a brief period, the block is coded as a 1, 2, or 3.
3. If there is a mix of movements or actions, the block is coded at the highest level of activity observed.
4. There is no set threshold with regard to amplitude to determine if there is movement/action or not. However, if it is not clear whether there is movement/action or not, err on the side of being conservative (i.e., no movement/action).
5. If the rated arm is not visible during a particular 2 s block, the block is treated as a missing data point.ᵃ

Note. No. = Number.

ᵃ The rated arm is considered not visible if (a) the rated arm is still and the palm or dorsum is not visible, (b) no part of the rated arm is visible and there is no object being moved or held in place, or (c) an object is being moved or held in place, but the unrated arm cannot be ruled out as being responsible for the action. A synopsis of the considerations underlying the criteria for visibility is available from the corresponding author.
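The way these rules combine into a single code per block can be sketched as below. The sketch takes the observer's judgments as input and does not replace them: the function name `code_block`, the `arm_visible` flag, and the use of `None` for a missing data point are assumptions for illustration, and the conservative judgment called for when movement is unclear is left to the observer.

```python
from typing import Optional, Sequence


def code_block(observed_levels: Sequence[int], arm_visible: bool = True) -> Optional[int]:
    """Return the FAABOS code for one 2 s block.

    observed_levels: category (1, 2, or 3) of every movement or action the
        observer judged to occur in the block; empty if none was seen.
    arm_visible: False if the rated arm met the "not visible" criteria.
    """
    if not arm_visible:
        return None  # block is treated as a missing data point
    if not observed_levels:
        return 0  # no movement or action of the arm
    for level in observed_levels:
        if level not in (1, 2, 3):
            raise ValueError(f"invalid activity level: {level!r}")
    # A mix of movements is coded at the highest level of activity observed.
    return max(observed_levels)
```

For instance, a block containing a tremor (1) followed by a reach for an object (3) is coded 3, however brief the reach.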


Publisher's Disclaimer: The following manuscript is the final accepted manuscript. It has not been subjected to the final copyediting, fact-checking, and proofreading required for formal publication. It is not the definitive, publisher-authenticated version. The American Psychological Association and its Council of Editors disclaim any responsibility or liabilities for errors or omissions of this manuscript version, any version derived from this manuscript by NIH, or other third parties. The published version is available at www.apa.org/journals/rep.

Parts of the results were presented at the Annual Convention of the American Psychological Association in Chicago, IL, August 2002.

¹ Model 71256 Activity Monitors; Manufacturing Technologies Incorporated, 709 Anchors Street, Fort Walton Beach, FL 32548

² SSC-950HDVMD VCR & SSC-113WC36 Bullet Camera; ADV Security Products, 26 Carlyle Plaza, Ste. 127, Belleville, IL 62221

Contributor Information

Gitendra Uswatte, Departments of Psychology and Physical Therapy, University of Alabama at Birmingham (UAB)

Laura Hobbs Qadri, Department of Psychology, UAB.


References

  • Bakeman R, Gottman JM. Observing interaction: an introduction to sequential analysis. New York: Cambridge University Press; 1986.
  • Bowman MH, Taub E, Uswatte G, Delgado A, Bryson C, Morris D, et al. A treatment for a chronic stroke patient with a plegic hand combining CI therapy with conventional rehabilitation procedures: case report. NeuroRehabilitation. 2006;21:167–176.
  • Campbell DT, Fiske DW. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin. 1959;56:81–105.
  • Cicchetti DV. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment. 1994;6:284–290.
  • Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988.
  • Elliott T. Presidential address: defining our common ground to reach new horizons. Rehabilitation Psychology. 2002;47:131–143.
  • Merbitz CT. Frequency measures of behavior for assistive technology and rehabilitation. Assistive Technology. 1996;8:121–130.
  • Schasfoort FC, Bussman JBJ, Zandbergen AMAJ, Stam HJ. Impact of upper limb complex regional pain syndrome type 1 on everyday life measured with a novel upper limb-activity monitor. Pain. 2003;101:79–88.
  • Smith EE, Medin DL. Categories and concepts. Cambridge, MA: Harvard University Press; 1981.
  • Taub E, Miller NE, Novack TA, Cook EW, III, Fleming WC, Nepomuceno CS, et al. Technique to improve chronic motor deficit after stroke. Archives of Physical Medicine and Rehabilitation. 1993;74:347–354.
  • Taub E, Uswatte G. Constraint-Induced Movement therapy: answers and questions after two decades of research. NeuroRehabilitation. 2006;21:93–95.
  • Taub E, Uswatte G, King DK, Morris D, Crago J, Chatterjee A. A placebo controlled trial of Constraint-Induced Movement therapy for upper extremity after stroke. Stroke. 2006;37:1045–1049.
  • Uswatte G, Foo WL, Olmstead H, Lopez K, Holand A, Simms ML. Ambulatory monitoring of arm movement using accelerometry: an objective measure of upper-extremity rehabilitation in persons with chronic stroke. Archives of Physical Medicine and Rehabilitation. 2005;86:1498–1501.
  • Uswatte G, Giuliani C, Winstein C, Zeringue A, Hobbs L, Wolf SL. Validity of accelerometry for monitoring real-world arm activity in patients with subacute stroke: evidence from the Extremity Constraint-Induced Therapy Evaluation trial. Archives of Physical Medicine and Rehabilitation. 2006;87:1340–1345.
  • Uswatte G, Miltner WHR, Foo B, Varma M, Moran S, Taub E. Objective measurement of functional upper extremity movement using accelerometer recordings transformed with a threshold filter. Stroke. 2000;31:662–667.
  • Uswatte G, Schultheis MT. Real- and virtual-world tools for measuring function during everyday activities. In: Mpofu E, Oakland T, editors. Assessment in rehabilitation and health. Boston: Allyn & Bacon; In press.
  • Uswatte G, Taub E. Implications of the learned nonuse formulation for measuring rehabilitation outcomes: lessons from Constraint-Induced Movement therapy. Rehabilitation Psychology. 2005;50:34–42.
  • Uswatte G, Taub E. You can teach an old dog new tricks: harnessing neuroplasticity after brain injury in older adults. In: Fry PS, Keyes CLM, editors. New Frontiers in Resilient Aging. Cambridge, England: Cambridge University Press; In press.
  • Uswatte G, Taub E, Morris D, Light K, Thompson P. The Motor Activity Log-28: assessing daily use of the hemiparetic arm after stroke. Neurology. 2006;67:1189–1194.
  • Uswatte G, Taub E, Morris D, Vignolo M, McCulloch K. Reliability and validity of the upper-extremity Motor Activity Log-14 for measuring real-world arm use. Stroke. 2005;36:2493–2496.
  • van der Lee J, Beckerman H, Knol DL, de Vet HCW, Bouter LM. Clinimetric properties of the Motor Activity Log for the assessment of arm use in hemiparetic patients. Stroke. 2004;35:1–5.
  • World Health Organization. International classification of functioning, disability, and health. Geneva: World Health Organization; 2001.

