MatOFF: A Tool For Analyzing Behaviorally-Complex Neurophysiological Experiments

Laboratory of Systems Neuroscience, National Institute of Mental Health

Correspondence should be directed to: Dr. Andrew R. Mitz, National Institutes of Health, Bldg 49/Rm B1C72 MSC 4401, Bethesda, Maryland 20892-4401, USA, phone: 301 402-5573, fax: 301 402-5441, email: arm/at/nih.gov

The publisher's final edited version of this article is available at J Neurosci Methods.

Abstract

The simple operant conditioning originally used in behavioral neurophysiology 30 years ago has given way to complex and sophisticated behavioral paradigms, so much so that early general purpose programs for analyzing neurophysiological data are ill-suited for complex experiments. The trend has been to develop custom software for each class of experiment, but custom software can have serious drawbacks. We describe here a general purpose software tool for behavioral and electrophysiological studies, called MatOFF, that is especially suited for processing neurophysiological data gathered during the execution of complex behaviors. Written in the MATLAB programming language, MatOFF solves the problem of handling complex analysis requirements in a unique and powerful way. While other neurophysiological programs are either a loose collection of tools or append MATLAB as a post-processing step, MatOFF is an integrated environment that supports MATLAB scripting within the event search engine, safely isolated in a programming sandbox. The results from scripting are stored separately from, but in parallel with, the raw data, and are thus available to all subsequent MatOFF analysis and display processing. An example from a recently published experiment shows how all the features of MatOFF work together to analyze complex experiments and mine neurophysiological data in efficient ways.
Keywords: neurophysiology, behavior, single-unit, data mining, primate, software

Introduction

Behavioral neurophysiology, the technique of recording signals from the central nervous system in alert animals, owes its success to technological advances that have been driven by conceptual ones. These experiments require collecting a mix of time-stamped behavioral events, analog signals reflecting behavior, and time stamps of single units detected through electrodes, while a trained animal performs one or more strategically-designed tasks. Computer programs to collect and analyze data from these experiments have undergone a co-evolution with the conceptual underpinnings that have motivated the introduction of more complex and dynamic tasks (Sclabassi and Harper, 1973).

Early analysis programs simply aligned trial-to-trial data to a specific behavioral event, often a sensory stimulus. Subsequent programs were well suited to experimental paradigms that focused on stationary processes recorded from animals executing highly stereotyped behaviors in experiments designed to compare a modest number of conditions, e.g., movement to one of several targets, in a blocked trial or mixed-trial design. For a given condition in these experiments, behavioral performance and single-unit activity in any one trial mimicked all other trials for that same condition. Analyzing these experiments required separating out all trials for a given condition (e.g., movement goal) from the trials of the other conditions.

Newer behavioral paradigms, developed to test more subtle complexities of mammalian brain function, have placed new demands on the analysis tools. Modest refinements to earlier software were sufficient to accommodate simpler non-stationary paradigms, e.g., associative learning, embedded sequence learning, reversal learning, motor skill acquisition, and pharmacological wash-in and wash-out experiments.
These non-stationary paradigms tracked the time course of processes or events that evolved over many trials. More radical software changes are required to study behavioral paradigms where the state of the subject, reward schedule, sequence of movements or other behavioral factors depend unpredictably on the animal’s behavioral choices over many task trials.

Some laboratories that employ highly non-stationary behavioral paradigms have abandoned general purpose analysis programs in favor of custom programs written for each new experiment. They typically use high-level general purpose programming languages that allow relatively efficient re-use of previous code, such as C, Basic, MATLAB® (The MathWorks, Inc.), or LabVIEW® (National Instruments). However, many components of the code must be re-written or extensively modified for each new analysis of data.

Rather than abandon general purpose neurophysiological analysis tools, we describe here one such tool, MatOFF, that supports a very wide range of analysis requirements. On the low end, non-programmers can rapidly generate simple peri-event rasters and histograms for a single neuron using its graphical user interface (GUI) and built-in functions. Using only a subset of the built-in functions, MatOFF is a valuable tool for analyzing a purely behavioral experiment or data from acute electrophysiology. On the high end, architectural features permit the more sophisticated user to sift through a population of units recorded under complex, non-stationary behavioral conditions by mixing batch, macro, and high-level language scripting from within the framework of a single tool. Results can either be plotted in traditional ways (rasters, histograms, X-Y displays, etc.) or passed to other (e.g., statistical, graphics) tools for further processing.
The examples of MatOFF presented here are from a recent demonstration of strategy selection in prefrontal cortex, a paradigm not readily amenable to analysis by the current generation of general purpose neurophysiology analysis programs. MatOFF was developed and is maintained by the Laboratory of Systems Neuroscience in the National Institute of Mental Health, Bethesda, Maryland (http://dally.nimh.nih.gov/matoff/matoff.html). Except for the (extensive) documentation, all components are written in MATLAB and the program is distributed freely with its source code.

The major advantages of general purpose neurophysiological analysis tools over custom programming are twofold. First, code reuse is maximized. Software components that manage data, file names, and cell identifiers, components that warehouse interim values for later processing, and components for delivering results (plots, statistics, output data files) do not have to be re-written or re-integrated with new software each time a new analysis program is written. Second, the likelihood of software bugs in the analysis software is reduced each time a new experiment is analyzed, rather than increased. More general purpose analysis tools are tested with each new experiment and carry the benefit of prior testing from past experiments. The reliability of code is inherently less assured when it is rewritten or changed with each new experiment.

Levels of data analysis and management

For the most complex experiments, selecting trials based on intra-trial event codes is not sufficient (see below). MatOFF has a hierarchy of tools that aid not only trial selection and data realignment, but also the selection of units for batch processing and population analyses. A data analysis tool for neurophysiological recording must allow the user to move smoothly from the single unit domain to the cell population domain, without obfuscating or abstracting the data beyond recognition at any point along the analysis axis.
Not all levels must be visible simultaneously, but problems arising at one level must be traceable to the others. MatOFF encourages data visibility through its layered design.

Methods

A concise event search engine

As behavioral tasks and their analyses become more complex, the ability of a data analysis tool to efficiently locate, accumulate, and display the experimental trials that define an experimental condition becomes increasingly important. MatOFF was designed to provide efficient within-trial event searching through an event sequence matching engine that operates from a simple, concise search command. Each possible event in a trial is assigned an arbitrary event number (event code) by the acquisition system or data conversion program. The search criteria are defined by a sequential list of bracketed event code groups, called the event code sequence. Event codes are logically ORed within brackets, and logically ANDed between brackets. For example, the event code sequence [5][14][8,10][51] is satisfied by the following two possible “embedded” event code sequences in the data: 5, 14, 8, 51 and 5, 14, 10, 51. Intervening events disqualify the sequence, unless those events are specifically excluded by an “ignore” list.

State-based execution engine (description-execution)

MatOFF operates in two steps: state description and execution. Most user interaction with MatOFF is to establish a state set for processing. Once this set is established, an execute command is given to process the data as described by the state set. MatOFF compares the new requirements to the previous state and makes changes as necessary. This approach has a number of major advantages over other, more immediate interactive approaches. First, most interactions with MatOFF are nearly instantaneous; long delays are relegated to the execution step. We find this division of time more consistent with the way users like to work.
Second, the user is not burdened with mastering the sequence of command processing. MatOFF decides what processing steps to execute during the execution step. Third, having a separate execution step simplified the implementation of a command interpreter that seamlessly mixes GUI and command line commands. Fourth, the state system, or part of it, can be saved, restored, and edited in human-readable form as a group of commands. The command interface is a fully mixed GUI/command line environment (Fig 1).
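The event code sequence matching described earlier in this section can be sketched in a few lines. The sketch below is in Python rather than MATLAB and is not MatOFF code; the function names `parse_sequence` and `find_sequence` are hypothetical, and the handling of ignored codes (skipped only between groups) is an assumption about behavior the text does not spell out.

```python
import re


def parse_sequence(text):
    """Turn '[5][14][8,10][51]' into a list of sets, one per bracketed group."""
    groups = []
    for body in re.findall(r"\[([^\]]*)\]", text):
        groups.append({int(code) for code in body.split(",") if code.strip()})
    return groups


def find_sequence(events, groups, ignore=()):
    """Return the index where the first embedded match starts, or None.

    Codes inside a group are alternatives (OR); groups must match in order
    (AND). Any intervening event disqualifies the match unless its code is
    on the ignore list (assumed behavior, for illustration only).
    """
    ignore = set(ignore)
    if not groups:
        return None
    for start, code in enumerate(events):
        if code not in groups[0]:
            continue
        pos = start + 1
        for group in groups[1:]:
            # Step over ignored codes that are not themselves a match.
            while (pos < len(events) and events[pos] in ignore
                   and events[pos] not in group):
                pos += 1
            if pos >= len(events) or events[pos] not in group:
                break  # intervening or missing event: no match from 'start'
            pos += 1
        else:
            return start  # every group matched in order
    return None


groups = parse_sequence("[5][14][8,10][51]")
trial_events = [1, 5, 14, 10, 51, 9]          # made-up event codes for one trial
match_start = find_sequence(trial_events, groups)
```

Representing each bracketed group as a set keeps the OR test a single membership check, and the ignore list is applied only while scanning between groups, so an ignored code never counts as a match on its own.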
In its simplest form, the description-execution model can be viewed as an interactive process:
This interactive process is typically used to explore the data and fine tune output formats using commands described below and in Supplementary Tables 1 and 2 online.

Scripting and Batch file Layers

Scripting and batch files are layered over the basic description-execution model. Once there is a meaningful description for each experimental condition, the descriptions can be saved as one long list of commands called a “protocol”. Scripting is accomplished in MatOFF by simply executing a protocol:
Batch operation adds one additional level of execution. It runs protocol files and assigns string variables that are used as parameters in the protocol files: Run batch file
Commands in the protocol use place holders that undergo simple macro expansion when they are encountered. For example, if string 1 is “102”, the command sequence [10][12][%1] will expand to sequence [10][12][102]. One more level of macro expansion can occur when a protocol file calls another protocol file.

The layering of execution works well for analysis of neurophysiological experiments when batch files are used to gather population data for one type of analysis. Each batch file is used for a different question about the population. Each line of the batch file lists a single unit by file and name, as well as any characteristics specific to that unit (e.g., which trials are valid, which stimulus was used, the cytoarchitectonic location of the unit). The same protocol file is used for each line of the batch file. That protocol file has a series of description-execution command sequences for all standard conditions (e.g., target directions), and generates either a series of plots or a text spreadsheet of results. Other approaches to parsing analysis problems with MatOFF have been used as well.

Command language

The present version of the software has 116 commands, most with multiple parameters or other variants. For descriptive purposes, the commands can be divided into several categories.
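The placeholder expansion and batch layering described in this section can be illustrated with a short sketch. It is written in Python, not MATLAB, and is not taken from MatOFF; the functions `expand` and `run_batch` are hypothetical, and the batch line format (whitespace-separated strings) is an assumption. Only the `sequence [10][12][%1]` command and its expansion come from the text.

```python
import re


def expand(command, strings):
    """Replace %1, %2, ... in a protocol command with the batch strings."""
    def substitute(match):
        index = int(match.group(1)) - 1   # %1 refers to the first string
        if not 0 <= index < len(strings):
            raise ValueError("no string assigned for %" + match.group(1))
        return strings[index]
    return re.sub(r"%(\d+)", substitute, command)


def run_batch(batch_lines, protocol):
    """Expand the same protocol once per batch line.

    Each batch line is assumed to hold whitespace-separated strings
    (hypothetical format). Returns one list of expanded commands per line.
    """
    expanded = []
    for line in batch_lines:
        strings = line.split()
        if not strings:
            continue  # skip blank lines
        expanded.append([expand(command, strings) for command in protocol])
    return expanded


protocol = ["sequence [10][12][%1]"]   # command taken from the example in the text
batch = ["102", "103"]                 # made-up batch lines, one string each
commands = run_batch(batch, protocol)
```

Keeping expansion as a pure text substitution means the same protocol can be reused unchanged for every unit listed in a batch file, which is the reuse pattern the section describes.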
Supplementary Table 1 online provides a list of the most often used MatOFF commands and their purposes. Supplementary Table 2 online lists the graphics display control commands.

MatOFF file optimizations, the MatOFF database

MatOFF requires that data be converted from its original format to an optimized internal MatOFF data format. (See Supplementary Note 2 online for detailed descriptions of the data files.) MatOFF has a built-in conversion tool for files created by the NIMH Cortex experimental control and data acquisition program. Cortex is free, widely used in behavioral neurophysiology, and has a simple file format. The Cortex file includes three data types: single unit (spike) data, analog data, and event data, stored in trial-by-trial records. The MatOFF conversion program not only converts these files, but also provides a range of remapping and filtering options. To simplify the task of converting other data file types (Fig 3
The design of MatOFF data files provides both flexibility and efficiency. Cortex and MatOFF are trial oriented. Like many data acquisition systems, Cortex can record a single unit across multiple data files, and in some cases a single unit may be stored discontinuously across a group of files. Also, like many acquisition systems, Cortex data records may be of varying lengths. For efficiency, MatOFF creates a separate file for each data type. It then indexes all the behavioral trials in a file that holds pointers to each of the other files. The index file has a fixed record length for rapid random access to any trial in the experiment. To assure data integrity, each trial of data in a MatOFF data file has a header that must match the index entry each time it is accessed. A “unit definition file” lists which of the behavioral trials belong to each single unit. This definition file manages discontinuous recording and recordings across multiple input files. When a file and single unit spike channel are requested by the user, MatOFF reads the entire index file entry for that spike channel into memory. (At present, MatOFF works on only one spike channel at a time.) Accessing any specific data type for any selected trial is rapid, because only the essential data type (i.e., analog, single unit, event) is accessed and records are accessed through in-memory absolute file pointers to structures within the data files.

Profiling studies of MatOFF show that the most demanding task is searching through complex event sequences for matching event patterns. The file organization of MatOFF optimizes the search process by isolating event data from other data types, and feeding the event search engine through rapid access of each trial, even when trials are not contiguous.

One additional performance issue was addressed early in the development of MatOFF. Opening and reading very large data files created serious operating system performance problems under some versions of Windows.
We did not want to have limits on the data file sizes, so MatOFF has its own internal file caching for the data files.

Facility for advanced trial analysis: the MATLAB sandbox

What sets MatOFF apart from other neurophysiological analysis tools is its integration with MATLAB. Other neurophysiological tools, like NeuroExplorer® (Nex Technologies, Littleton, MA), use MATLAB for a post-processing step after the data search is complete. MatOFF integrates user-written MATLAB scripts into the search engine using a sandbox to protect the original data and program execution. A sandbox is a program execution environment for running untrusted programs without risk to the main software or data. A script is a file of MATLAB commands that does not begin with the FUNCTION keyword. This sandbox provides flexibility without compromising software stability or data integrity, and has the unique power to evaluate complex experimental conditions from within the data analysis program.

In understanding the role of the MATLAB sandbox, it helps to understand the original motivation for integrating MATLAB scripts within the MatOFF search engine. A given trial of recorded data will have single unit spike times, analog samples, and behavioral events. In simpler experiments all practical information about a trial can be encoded as event codes; but, in complex experimental designs, the significance or categorization of a trial can depend upon its relationship to a complex series of events and experimental conditions occurring in many other trials. The MATLAB sandbox was originally conceived as a tool for examining the past and future histories surrounding a trial, and for using that information to establish new classifications for the trial. For that reason the MATLAB scripts are called history scripts.
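As a rough illustration of this arrangement, the sketch below shows one way a search engine could call a user script for each matching trial while protecting the raw data. It is written in Python, not MATLAB, and does not reproduce MatOFF's implementation; the function `run_history_script`, the trial layout, the `store` callback, and the reward code 96 are all hypothetical.

```python
import copy


def run_history_script(script, trials, matching_indices):
    """Call a user script once per matching trial, isolated from the raw data.

    The script receives a deep copy of all trials, the index of the matching
    trial, and a 'store' callback. Results are kept in a separate structure,
    and an exception in the script is recorded instead of stopping the engine.
    """
    history = {}   # {class_number: {trial_index: value}}
    errors = []    # (trial_index, error text) for scripts that failed

    def store(class_number, trial_index, value):
        history.setdefault(class_number, {})[trial_index] = value

    for index in matching_indices:
        snapshot = copy.deepcopy(trials)   # the script never sees the originals
        try:
            script(snapshot, index, store)
        except Exception as exc:           # keep the engine running
            errors.append((index, repr(exc)))
    return history, errors


REWARD = 96  # hypothetical event code for reward delivery


def previous_trial_rewarded(all_trials, index, store):
    """Example script: class 1 records whether the previous trial was rewarded."""
    if index > 0:
        store(1, index, int(REWARD in all_trials[index - 1]["events"]))
    all_trials[index]["events"].clear()    # tampering only touches the copy


trials = [{"events": [5, 96]}, {"events": [5]}, {"events": [5, 96]}]
history, errors = run_history_script(previous_trial_rewarded, trials, [0, 1, 2])
```

Copying every trial on each call is wasteful and is done here only to make the isolation obvious; a practical design would expose read-only accessor functions instead.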
Although they are used for many functions beyond evaluating historical records of events before or after the current trial, history scripts have been most essential in teasing out complex behavioral histories. Each time an event sequence matches the sequence criteria (determined by the event codes within a trial), execution branches to a script file of MATLAB commands (the “history script”). The script has access to all data for all trials of a single unit, as well as other key information about the current trial (trial number of the matching trial, sequence that matched, etc.) through a group of functions and predefined variables. The history script is written to evaluate this information and make decisions about the current trial. A second set of functions allows the script to permanently store results. The script can define any number of “classes” (numbered 1…n) and store a separate value (the “class value”) in that class for each trial (Fig 5).
All class values for all trials (called the “history data”) can be saved with the original data in the MatOFF data file structure. MatOFF stores data in a group of files with a common root name (e.g., data.analog, data.pulse, data.event), called a MatOFF database. The class values are stored using the same root name, but in files with “.history” and “.hindex” suffixes (Fig 3).
Once created by a history script, history data can be applied to further analysis in multiple ways (Fig 6).

Results

MatOFF as a productive tool

MatOFF comes from a heritage of productive tools. Since its introduction in May 2000, MatOFF has been used as the primary analysis tool for seven full-length peer-reviewed research papers from this laboratory and is used on a regular basis in other neurophysiological laboratories. Before that, PCOFF, a 16-bit DOS-based C language program with a text-based GUI and monochrome graphics, contributed to 17 full-length peer-reviewed research papers. PCOFF used the description-execution model and supported scripting (protocol files). PCOFF was introduced around 1990. Prior to that, KOFF, a PDP-11 program written in Pascal, supplied much of the same basic functionality. The last version of KOFF was completed in 1985 and had 60 commands, including protocol files without macro expansion. A complete count of papers published with data analyzed by KOFF has not been tabulated, but KOFF contributed to at least as many papers as its successors.

MatOFF’s evolution has been layered. The basic functions and features survived the evolutionary changes as new facilities were added. Preserving its primitive built-in functions makes MatOFF well suited to “simpler” experiments. For example, MatOFF is an excellent tool for evaluating the behavioral data of psychophysics experiments. On the flip side, MatOFF can be used for acute or in-vitro electrophysiological studies where the only “behaviors” are the experimental manipulations.

By providing a tool that can be used efficiently by programmers and non-programmers alike, MatOFF has been an attractive solution for laboratories that see a regular turn-over of PhD students and post-doctoral scientists who arrive with a wide range of prior computational experience. Those without programming expertise can begin working independently with MatOFF with about an hour of training.
Daily print-outs of behavioral and other data can be automated easily, so preliminary experimental results are available at the end of each recording day. As the user gets more sophisticated with MatOFF features, he or she can develop more complex analyses and displays. Typically, the analytical needs drive the learning process.

History scripting supports very sophisticated analyses from within MatOFF. The MatOFF database does not limit the complexity or size of an analysis; we have used a database with over 1,500 units without problem. Still, it is common to export results from MatOFF to other software tools, including statistical packages, spreadsheets, and databases, by judiciously parsing the analysis among the various tools. As with all programmable tools, the key to productivity is to determine how best to parse the problem.

The MATLAB sandbox as an enabling technology

With the introduction of history scripts, many tasks that were once difficult with MatOFF became easy to program. For example, the ability to measure the time between any two events under MatOFF was restricted to events in the search string. A 20-line MATLAB script generalized this ability to any events. Likewise, any class values can now be used to order rasters, vastly expanding MatOFF’s capabilities. A 17-line script is now used to export analog data to almost any data format. A ~250-line script now analyzes saccade start time, end time, start and end positions, and peak velocity in a series of history classes, making new classes that can be used for marking, sorting, or exporting data. Being able to add functionality with so little regard for data file structure, unit selection, trial search strategy, trial selection, or graphical display makes history scripting a rapid and reliable method for adding functionality as needed. Scripts can be experiment specific, or they can be general purpose.
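To make the inter-event timing example concrete, here is a minimal sketch of measuring the time between any two events in each trial and then using the result to order trials, as one might order a raster. It is Python rather than MATLAB and is not the script mentioned above; the function names, the `(time, code)` event layout, and the event codes 23 and 44 are hypothetical.

```python
def interval_between(events, first_code, second_code):
    """Time from the first 'first_code' to the next 'second_code', or None.

    'events' is a list of (time_ms, code) pairs in chronological order.
    """
    start = None
    for time_ms, code in events:
        if start is None:
            if code == first_code:
                start = time_ms
        elif code == second_code:
            return time_ms - start
    return None


def order_by_class_value(class_values):
    """Trial indices sorted by class value; trials without a value are left out."""
    valid = [index for index, value in class_values.items() if value is not None]
    return sorted(valid, key=class_values.get)


# Made-up trials: code 23 marks a stimulus, code 44 marks a response.
trials = [
    [(0, 23), (350, 44)],
    [(0, 23), (120, 44)],
    [(0, 23)],               # no response in this trial
]
class_values = {i: interval_between(t, 23, 44) for i, t in enumerate(trials)}
raster_order = order_by_class_value(class_values)
```

Storing the interval as a per-trial value, rather than computing it at plot time, mirrors the idea that a class value can later drive marking, sorting, or export without repeating the search.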
The next section describes the analysis of a demanding behavioral task, accomplished without the need to venture outside of the data analysis tool.

Use of history scripting for trial classification in a strategy task

As discussed earlier, it is sometimes impossible and often undesirable to numerically encode, during data acquisition, every possible behavioral event that may be required for later analysis. Furthermore, unforeseen analyses may require information beyond that encoded in each trial during the experiment. Some analyses require information about the previous trial, sometimes many previous trials. We use history scripts both for unforeseen analysis requirements and for dealing with complex intertrial events that might be unwieldy to encode.

In a recent experiment (Genovesio et al., 2005) we used history scripts to efficiently classify each trial into one or more critical categories. Despite our using a relatively rich coding scheme during the experiment, only a minority of event codes was essential; all other event codes could have been eliminated from the original data file. The essential codes included time markers (times of stimulus presentation, eye movements, etc.) and a few other events (type of stimulus, response executed, response requested, delivery of reward). In the end, we used history scripting to identify, categorize and simplify the analysis. We have since started using history scripting to look at old experiments and test hypotheses not originally anticipated.

Here we provide an example of a complex experiment analyzed using history scripting. Two rhesus monkeys were trained to perform a saccade task (strategy task) as previously reported (Genovesio et al., 2005). Briefly, in this task the monkeys first fixated a central spot; then, three potential eye-movement targets appeared: up, right and left from the center (Fig 7A).
In part of this task (Fig 7B

History based selection of the strategies

Identifying a trial associated with one of the above strategies cannot be done simply on the basis of the events within a trial. The information from the previous trial is sufficient while the monkey completes trials without any errors. For example, a repeat-stay trial is a type of trial in which a stimulus presented in the previous trial is presented again. Repeat-stay trials can be distinguished from change-shift trials because the stimulus changes in change-shift trials. Distinguishing repeat-stay trials from second-chance trials is more subtle. With repeat-stay trials the expected response matches the previous response; in second-chance trials the requested response never matches the previous response. Thus, unambiguously identifying a repeat-stay trial requires knowing whether the stimulus has changed, whether the requested response has changed, and whether the prior trial was executed without error. When the monkey stopped working for a few trials and then resumed, recorded event codes were insufficient for identifying the trial type. In these and many other cases history scripting was essential for classifying trials. With the flexibility of history scripting we were able to look backwards in time to examine any number of intervening trials.

Adding a history class is so easy that we implemented multiple classification schemes for the trials. Having multiple classifications allowed us to employ a “late binding” for analysis. On more than one occasion we created our standard, as well as “more conservative”, classifications in anticipation of reviewers’ objections to an analysis decision. Fig 7C

The experiment

History scripting proved essential to uncovering a property of prefrontal cortex activity that was unanticipated (Genovesio et al., 2006). At the start of each repeat-stay or change-shift trial the monkey must remember which target (goal) was selected in the previous trial.
Both that previous goal and the upcoming instruction stimulus are essential for choosing the correct next goal. Figure 7D

Creating history classes

To study the neuronal representation of the previous goals we needed to classify all trials for all single units according to the goal selected in the previous trial (Fig 8).
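A classification by previous goal can be sketched as follows. This is a Python illustration under stated assumptions, not the MATLAB history script used in the study: the function names and the per-trial `goal` field are hypothetical, and the choice to look back past trials with no selected goal (for example, when the animal stopped working) is an assumption made for the example.

```python
def previous_goal_class(trials):
    """Return one class value per trial: the goal chosen in the most recent
    earlier trial that had a goal, or None if there was no such trial."""
    values = []
    last_goal = None
    for trial in trials:
        values.append(last_goal)
        if trial["goal"] is not None:   # trials without a goal are skipped
            last_goal = trial["goal"]
    return values


def group_by_class(values):
    """Map each class value to the list of trial indices that carry it."""
    groups = {}
    for index, value in enumerate(values):
        if value is not None:
            groups.setdefault(value, []).append(index)
    return groups


# Made-up session: the second trial was aborted before a goal was selected.
trials = [{"goal": "left"}, {"goal": None}, {"goal": "up"}, {"goal": "right"}]
previous_goal = previous_goal_class(trials)
trials_by_previous_goal = group_by_class(previous_goal)
```

Once each trial carries its previous-goal value, selecting or sorting trials by that value is a lookup, which is the sense in which later analysis steps become simple after the basic classes exist.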
History scripting supports both bottom-up and top-down data analysis

History scripting lends itself to the natural flow of data analysis. Although an experiment may have a top-down design, analyzing data often requires a bottom-up approach. Even with explicit experimental questions in mind, it was easiest to construct trial types and subtypes before attacking the specific questions (Fig 8). Once the basic classes were established, further analysis was remarkably easy (Fig 9).
History scripting was also essential for studying the evolution (time course) of single-unit activity associated with the representation of future goals in second-chance trials. This activity starts towards the end of the previous trial, which was a not-rewarded, but strategically correct, change-shift trial. To evaluate this time course we created a new class and assigned the value of the current goal to this class in the previous trial. This classification scheme provided an easy dichotomy between not-rewarded change-shift trials that led to successful second-chance trials and those that did not. For convenience, the same script created a class that stored the future goal, although that information could be gleaned from other trial data. In the next cycle of analysis we used these new classes to sort the not-rewarded change-shift trials by the future goal.

Discussion

One unexpected benefit of MatOFF and its history scripting is the promotion of teamwork. A single MatOFF database can be shared (copied), and any copy of a database can be updated by sharing history scripts. Once a database is updated, any investigator can query and make use of all available classes. In practice, an initial series of history scripts is used to define classes that are essential to all subsequent analysis. This generates a common database. That database is distributed to the investigators, who then work independently. By agreement, each investigator is assigned a different range of class numbers for new classes. New classes developed by each investigator are distributed to other investigators as history scripts. Each investigator can pick and choose which of the distributed scripts might be useful. At any time along the way a “universal” database using all the history scripts can be created and re-distributed.

There are many approaches to building data analysis tools and some remarkably powerful tools are available both commercially and as freeware.
The best tools are always developed in the context of specific problems. MatOFF is the current incarnation of a tool that traces its origins to the father of primate behavioral neurophysiology, Edward V. Evarts (Cowan et al., 2000), and has co-evolved these past three decades with the explosive expansion of the field. From its inception, the progenitor to MatOFF aimed at concise, flexible and efficient classification of behavioral trials. MatOFF inherits these qualities and extends the power of trial-by-trial analysis by integrating the MATLAB language into its search engine, rather than using it as a post-processor. The examples used are among the most complex neurophysiological behaviors published in the field, and they highlight the benefits of MatOFF as a tool. Conceptually, the sandbox integration of a high level language into the search component of existing tools is an approach that all tool designers should consider, especially those who write new programs for each new experiment.

Standardization, especially for data exchange and reuse, is an important goal of experimental software development and management. Various standards and organizational schemes have been developed to make data accessible through a common interface (Kötter, 2001; Bradley et al., 2005, and many others). Some schemes employ a relational database (Bradley et al., 2005), others use XML (http://neurodatabase.org), and others rely on a Microsoft Windows dynamic link library (DLL). The MatOFF internal file structure has some of the features of a relational database, but deviates from this architecture to make event sequence searches both fast and flexible. A description of the file structure is provided with the MatOFF distribution and as Supplementary Note 2 online. For data sharing with MatOFF, we have adopted the Neuroshare standard for importing data (http://neuroshare.sourceforge.net), which uses a standardized DLL interface.
As of this writing, a Cortex Neuroshare DLL is being tested and the ability to read files via Neuroshare is in development. MatOFF has several facilities to generate text files. Many of our results are output as text files in a columnar format, then imported to a spreadsheet (Microsoft Excel) or relational database (Microsoft Access) for further analysis.

Despite all its power, MatOFF continues to be the tool of choice in our laboratory for the beginner or non-programmer. MatOFF is easiest to learn by demonstration, but two well-documented examples are available. One example is in the ezstart directory of the distribution and described in Supplementary Note 1 online. With a single command it will create a plot from a sample Cortex data file, or from the user’s own data file. The other is an example of history scripting. It is available on the MatOFF Web site as history-scripting-example.zip and in the MatOFF distribution under /example_files. Both examples show how to go from a Cortex file to a plot in relatively few commands. When learning by demonstration, the GUI is preferred. Although potentially daunting at first, the GUI is ideal for a quick look at data and for testing “what if” scenarios. We find new users can be trained to use the GUI quite rapidly. Intermediate users graduate to batch files and text output once they are familiar with GUI operation. With the advent of history scripting, MatOFF power users can now apply MATLAB programming skills without having to abandon their most productive tool.

Acknowledgments

The authors wish to thank Dr. Paul Cisek for his comments on an early version of the manuscript. This research was supported by the Intramural Program of the NIH, National Institute of Mental Health. The authors also wish to acknowledge Dr.
Karl Arrington for his pioneering work on the KOFF data analysis program.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.