Working with MATLAB and BIAC analysis tools

By Scott Huettel, Ph.D.

MATLAB (Mathworks, inc.) is a powerful data manipulation environment that allows flexible analyses of large and complex data sets. Many analysis tools created at the Duke-UNC Brain Imaging and Analysis Center (BIAC) use the MATLAB environment. This instruction set is intended to provide a brief overview of how to use MATLAB, including basic syntax, working with variables, and using scripts, functions, and programs. Additional help can be found by selecting MATLAB help under the Help menu. You can almost always access command-specific help for any Matlab command discussed by typing help command in the command window (e.g., type help overlay2).

Note that this description is intended to introduce basic MATLAB concepts and is appropriate for the versions of MATLAB used at BIAC. It does not necessarily use the same notational structure as the MATLAB help files themselves. This description will be primarily based upon MATLAB version 6.1, but most topics will be applicable to earlier versions.

Commands to be typed into MATLAB will be indicated in bold in the paragraphs, with examples set aside in red, often followed by a short descriptive comment in green. For example:

cd P:\User_Scripts\ %changes current directory to user scripts

Output from a command will be shown in blue, usually to the right. a = 123 * 4 a = 492

Some of the commands will use example data loaded into the Class.01 directory on Bristol. These commands will assume that the reader has mapped the biac drive on Bristol to the “T” drive letter in Windows. If you have a different mapping, then you will need to substitute your drive letter in all examples.

One thing that will help you if you make a mistake running a command or program is the key combination <ctrl>+c. Simultaneously pressing these two keys will interrupt whatever MATLAB is doing. This is especially useful if you forget a semicolon when loading in data (see Section 1.2.4) and MATLAB starts printing out millions of data points. You can also access previously typed MATLAB commands by pressing the up arrow; partially typing the first few letters of the command before pressing the up arrow will restrict the search.

Basic MATLAB

Strengths of MATLAB

The power of MATLAB for fMRI analyses results from two of its strengths. First, it is optimized for working with large matrices, such as the data sets collected in fMRI studies. In a single fMRI study, a researcher may collect about 1000 images of the brain, which are each composed of something like 64 rows (X) by 64 columns (Y) by 20 slices (Z) of voxels of the brain. This large data set can be represented as a single four-dimensional matrix with the size: 64 x 64 x 20 x 1000. Since many of our fMRI analyses apply similar actions to all voxels in the brain, MATLAB can do those analyses very efficiently. Some examples of such actions would be moving the entire brain up one voxel, as when doing motion correction, or creating an average image of the brain across all time points.

The second aspect of MATLAB that makes it extremely useful for fMRI analyses is that researchers can work directly with their data using the command window. The command window allows direct execution of single commands. If, after you loaded a data set from a brain into the command window, you found that it was upside down, you could flip the brain using a single MATLAB command. You can use simple commands to extract single trials from a complex data set, to calculate statistics on your data, or event to find out the value of a statistical test at different points in the brain. The command window is also very useful for testing out parts of programs as you write them, as well as for debugging problems that occur.

Key MATLAB concepts

Command Window

The command window is the basic working environment of MATLAB. This is where you will load data, run programs, execute scripts, and work with variables. It also can control changes in the working directory and other environment properties. In version 6 and later, the command window will appear in the right part of the main MATLAB window and will be accompanied by other windows. In earlier versions, the command window is all that you will see upon starting MATLAB.

Working Directory

“Working directory” refers to a location on the hard disk or network that MATLAB will load and save files. To find out your current working directory use the “print working directory” command pwd in the command window:

pwd %prints current directory.

Alternatively, on newer versions of MATLAB, there is a current directory toolbar at the top of the window that lists your working directory. This toolbar has a drop-down list of recent directories and a selection button that allows choice of any directory.

Note that MATLAB cannot write to some write-protected directories, including anything on the C:\ drives of BIAC computers. For that reason you may need to change your working directory to some other location before using some programs, such as SPM. To change the directory, use the cd command and list whatever directory you want to change to:

cd d:\ %change current directory

Path

MATLAB maintains a list of possible disk locations for programs, scripts, etc., in the variable “path”. To see this list, just use the path command:

path %prints out current path.

You will see a large list of file locations. On the BIAC computers, most of these will be in the “C:\Program Files\MATLAB” directory, but there may be others. If you type a program or function's name into the command window, MATLAB will search through all of these directories to find it. So, if you want to use programs from a given directory over and over, you should add it to your path.

addpath \\BROCA\Programs\User_Scripts\dias %adds directory to path

If you want to see in which directory a program is stored, use the which command. This command is useful when there may be multiple versions of a program on the hard disk.

which showsrs %prints out directory containing program “showsrs”

Variables

Matlab stores information in variables that can be accessed through the command window (or through programs, scripts, etc.). Variables can be created in many ways. A simple way to create a variable is to assign some value.

a = 123 % creates a variable “a” which stores the numeric value 123

Note that both variable names and function names are case sensitive. If you now type a in the command window, it will print the value of a, ans = 123. If you type A, it will give an error: “??? Undefined function or variable 'A'.”

There are many different types of variables. Most numeric variables are stored as class “double” which means that a total of eight bytes are used to represent the number. You can also store text, other types of numbers, or groupings of variables called cell arrays or structures.

Variables can be of different sizes. You can create a variable consisting of a 10 by 10 matrix of random numbers using a simple command.

test = rand(10,10) %creates & prints 10*10 array of # from 0 to 1

To see what variables are currently loaded, you can look in the workspace window (in 6.0 or later) or you can type the command whos. Either way, you can see what variables you have loaded, what size they are, how large they are, and in what format they are stored. If you have followed this section so far, you should have the variables “a” and “test” loaded into the workspace.

Note that whenever you execute a MATLAB command, you can choose whether to print the output in the command window. If you don't want to print the output, put a semicolon (;) at the end of the line. The below command will create a new set of random numbers, but will not print them out.

test2 = rand(10,10); %creates/(no print)10*10 array of # from 0 to 1

To remove a variable from the workspace, use the clear command. If you want to remove one or more variables, type clear variable_name variable_name … . If you want to remove all variables, just type clear.

Scripts, functions, and programs

In general terms, there are three primary ways of running a series of Matlab commands. The simplest form is a script, which refers to a number of commands that execute one after another. For example, if you always used a sequence of five commands to display a particular dataset, you might create a script called display_data that executes those commands. Under the definition that will be used in this document, scripts do not take any form of arguments from the user; they just execute the commands as written.

A more powerful way of working with MATLAB is to use functions. Functions are a set of commands that take arguments, such as the random number generator rand() described above. In that example, two arguments were provided that told the function how many rows and columns of random numbers to create. Many BIAC tools are functions that act upon data based upon arguments.

The final way to work with MATLAB is by using programs. Programs are self-contained applications that allow the user to work with data interactively. For example, you might write a program that lets the user click a button to select a data set to load and then click other buttons to move up and down in the brain. Many of the BIAC programs can also be used as functions, following instructions in their help files.

Loading data into Matlab

Although you can type data directly into MATLAB's command window, that approach would not be feasible with most fMRI data, which may consist of millions of data points. Therefore, BIAC programmers have created tools that allow data to be read in from a variety of common scanner formats.

Readmr

The most basic data loading command, readmr, is used to bring data consisting of a single image/volume into MATLAB. You will use readmr for loading things like a structural image, a statistical analysis, or a single image from a functional series. A full description of the arguments taken by readmr can be obtained by typing help readmr in the MATLAB command window. You can use readmr as either a function (you type in the image parameters) or as a program (you select the image parameters).

The image that we will use as an example is a full-brain 3D SPGR data set, with 256*256*86 voxels, saved in 'volume' format (2-Byte integers).

To load a single anatomical brain using readmr as a function: anat_image=readmr('\\Huxley\Data\Class.01\Examples\Anatomical_Images\3D_SPGR_Full_Brain\3D_SPGR_a.img', 256,256, 86,'volume');

Alternatively, you can use readmr as a program: anat_image = readmr; then select the correct file name using the “browse” button, type the correct values into the X, Y, and Z fields, and select file type “volume”.

Note that in either case, you should get in the habit of adding a semicolon to the end of your commands; if you do not have that ending semicolon, the computer will print out the intensity value for each of the more than 5 million voxels in this image!

Common Image Types used at BIAC

Volume

The volume data type is used for most unanalyzed functional images. The reconstruction programs currently active for most data types will create images in volume format that are labeled V*.img (e.g., V0001.img, V0002.img, etc.). Each voxel is saved as a two-byte signed integer.

Float

The float data type is chiefly used for analyzed data, such as the output of a program that creates statistical comparisons, averages, or similar. The BIAC analysis programs MRTest and TstatProfile give their output in float format. Because many programs create float images, the file extension will vary. Each voxel is saved as a 4-byte floating point value, so decimal values are available.

DICOM_Slice

The DICOM_slice data type should be used when you want to load a single image (slice) from a set of slices saved in the DICOM format. DICOM is a standard for image formatting that was created by joint efforts of the American College of Radiology and the National Electronics Manufacturers Association. It allows interchange of medical images across a wide variety of platforms. Images in the DICOM format will generally be saved with the extension *.dcm. Note that images in the DICOM format contain headers with information about the image properties, so that you do not need to specify anything about the image if you load DICOM images using readmr as a program.

As of summer 2002, the transfer daemon that moves anatomical images from the scanners to the BIAC servers saves the images in DICOM format. The file name has the following parts:

          Scanner#_YearMonthDay_Exam#_Series_Run_Image.dcm

Dicom_Volume

The DICOM_Volume data type should be used when you want to load an entire volume of DICOM data (e.g., all slices from a single image). It is otherwise identical in use to the DICOM_Slice data type.

Signa5

The Signa5 format was previously used on the BIAC scanners for anatomical images. It uses 2-byte integers for each voxel, with byte-order swapped (UNIX format) and with a image header of 7904 bytes. This data type is no longer used as part of regular BIAC procedures, but would be used if data were manually transferred from the SIGNA console through the ADW to the BIAC servers.

ScreenSave

The ScreenSave data type is a variant of the Signa5 format designed for screen capture images. It has image dimensions set to 512 * 512 *1, which is the default resolution for a screen capture. Screen captures are typically used for identifying slice locations on a sagittal image. Like Signa5, the ScreenSave data type is no longer in common use at BIAC.

Analyze7.5_SPM

This data type is a variant of the Volume data type, with image dimensions set to the default values used by SPM: 79 * 95 * 68. It is typically used for data that have been analyzed in SPM. Note that data that have been normalized in SPM before additional analyses may have these dimensions but be saved using the Float data type.

Readtsv

The readtsv command is used for loading a time series of volumes (tsv) into MATLAB. This is chiefly used for loading in functional data, where the same brain image is repeated for many time points. The output of the readtsv command is a 4-dimensional matrix. Although there are differences in syntax, the operation of readtsv is generally similar to readmr. Both can be used in functional or program forms. Full details can be found by typing help readtsv.

We will use a data set that consists of 326 time points, with each image consisting of 64*64*14 voxels. To read in these data using readtsv as a function:

tsv=readtsv(‘\\Huxley\Data\Class.01\Examples\Data_Sets\FUNC3\V*.img', {64,64,14,'volume'});

To read in these data to variable “tsv” using readtsv as a program, type: tsv = readtsv; then select the correct directory and file using the browse button, select the file type of volume, and make sure that the correct image size appears in the X,Y, and Z windows.

As before, it is critical to have the final semicolon or, else MATLAB will print out the more than 18 million datapoints in this time series.

Commands for working with matrices

Understanding subscripts and indices

To extract a single value from a large dataset, you need to tell MATLAB the coordinates of that value. For this example, load in the 3D-SPGR dataset from above:

clear; anat_image=readmr('\\Huxley\Data\Class.01\Examples\Anatomical_Images\3D_SPGR_Full_Brain\3D_SPGR_a.img', 256,256, 86,'volume');

Now, you have a dataset consisting of the single volume anat_image with 256*256*86 voxels. To check the value of a single voxels, you need to indicate which voxel you want to look at. One way to do that is to give the subscripts (i.e., Cartesian coordinates) of the voxel. To look at the voxel at X = 120, Y = 124, and Z = 40, just type:

anat_image(120,124,40) ans = 59

Note that here you do not want a trailing semicolon, because you want the answer to be printed. Try a few other coordinate values to see what happens.

You can use more than one number in a subscript. For example, if you wanted to list a row of 30 voxels, you could do the following: anat_image(30:60,124,40). This will print out a column of numbers that starts off very small, rises to a peak for about 4 voxels, and then gets smaller again. Can you guess where in the image these coordinates are located? You may get a better answer by looking at other voxels around these coordinates.

You can also look at a point in a dataset according to its index. While three subscripts are used to identify a particular voxel in the brain, that same voxel can be identified uniquely using a single index. You can think of the index for a given voxel as the number that would be assigned if you started in the top-left corner of the matrix and counted your way through the entire volume, first counting from left to right, then from top to bottom, then from front to back. [Note that although directional terms are used here, they refer to the matrix, and not necessarily to the brain itself. If the brain was flipped within a matrix, so that the top-left of the matrix referred to the bottom-right of the brain, you still would begin counting with the top-left of the matrix].

For an example, if you began counting from the top left of the anat_image matrix, you would reach the voxel X=120, Y=124, Z=40 as the 2,587,512th voxel. This can be calculated by the following formula, where x_size = 256, y_size=256, and z_size = 86.

  index = X + (Y-1)*y_size + (Z-1)*y_size*x_size

If you type anat_image(2587512) you will get ans = 59, as before.

Changing between subscripts and indices

There are two commands that can take you back and forth between indices and subscripts: ind2sub and sub2ind. As their names may suggest, each command converts values from one format to another. Both take the size of a variable as an argument. The size of a variable is just the dimensions of the underlying matrix.

size(anat_image) ans = 256 256 86

If you type the command incorrectly, you get the following. Why? size anat_image ans = 1 10

To convert the coordinates 120, 124, 40 to an index, you can use the following commands (they are equivalent):

index = sub2ind([256,256,86],120,124,40) index = 2587512 index = sub2ind(size(anat_image),120,124,40) index = 2587512

To convert the index 1,000,000 to a set of subscripts: [X,Y,Z] = ind2sub([256,256,86],1000000) X=64, Y=67, Z=16 [X,Y,Z] = ind2sub(size(anat_image),1000000) X=64, Y=67, Z=16

Flipdim

You can use MATLAB to easily transform large datasets. You can quickly flip or rotate an entire dataset, often with a single command. The flipdim command is used to flip a brain along a single axis. The function is of the form: output = flipdim(input,dimension_to_be_flipped)

To flip the brain from top to bottom, which is dimension #3 in the anat_image dataset, use the following: anat_flipped = flipdim(anat_image,3);

You can quickly check that it has been flipped by comparing the midpoints of the two datasets. Note that the columns are reversed from top to bottom.

anat_flipped(128,128,39:48) anat_image(128,128,39:48)

It is critical to emphasize that flipdim does not rotate the brain. For example, if an image is in radiologic convention, such that the left side of the image represents the right side of the brain, it will remain in radiologic convention after using flipdim to flip it from top to bottom.

Permute

The command permute rotates the brain by rearranging the dimensions of the matrix. This is useful for changing image orientation (e.g., from axial to coronal), and is often used in concert with flipdim. This command requires both the data, and the new order of dimensions. To switch the last two dimensions of anat_image:

anat_permute=permute(anat_image,[1 3 2]); The new variable anat_permute is the same brain as before, but now it is in coronal orientation (albeit upside down).

Creating new matrices

A very frequently used command in MATLAB programming is zeros, which is used to create blank matrices of a specified size. This is often used to create a variable before writing in some data. To create a 64*64*14 matrix of zeros (e.g., a blank brain): blank = zeros(64,64,14);

A related command is ones, which creates a matrix filled with unit values. ones_matrix = ones(64,64,14);

Removing singleton dimensions from a matrix

Some matrix operations will leave you with a singleton dimension. For example, consider a data set consisting of 10 subjects * 3 conditions * 19 datapoints, for which we can create a dummy matrix: data = rand(10,3,19);

If you want to take the mean of this data across subjects, you would use: mean_data = mean(data,1);

This gives you an output of size 1*3*19. size(mean_data) ans = 1 3 19

Thus, the result of your operation is a 3D dataset with only one value for the first dimension. The first dimension is called a “singleton dimension” and becomes an annoyance for future operations. To remove that dimension use the squeeze command: mean_data = squeeze(mean_data);

Now you have a 2D dataset that can be more easily worked with.

Replicating and tiling a matrix

Another MATLAB command that is very useful when programming or working with large datasets is repmat. Repmat allows you to take a smaller matrix and tile it to make a bigger matrix. The format of the command is: output = repmat(input, [multipliers]);

where multipliers is some array such as [2 2 3]. Using [2 2 3] would take the original data and tile it twice in each of the X and Y dimensions, and 3 times in the Z dimensions. If our original data were a brain, using [2 2 3] would result in a new dataset with four brains in each slice (2*2) and three sets of the same slices.

Here, we will use this command to convert a time series into a difference series. Load in the tsv from section 2.3 above. This example uses a lot of memory, so you may wish to only load in part of the data or use a smaller data set.

Clear; tsv=readtsv(‘\\Huxley\Data\Class.01\Examples\Data_Sets\FUNC3\V*.img', {64,64,14,'volume'});

Now, create a mean image across the entire run: mean_image=mean(tsv,4);

We can next use repmat to replicate this mean image across 326 time points: mean_tsv = repmat(mean_image,[1 1 1 326]);

Next, we can get the difference between the original data and the mean: difference_tsv = tsv – mean_tsv;

With just these simple commands, you have created a time series of brains that shows the change from the mean value over time. In the data set used here, which shows a time series of brains corrupted by noise, this new difference image allows easy identification of the types of artifacts present at each point in time.

Using the find command to search for a particular value (or set of values)

MATLAB has the ability to quickly search through a large set of data to find values of a given range. This has many uses, including creating statistical masks (e.g., return only the voxels with t-values > 3.6) and setting display thresholds (e.g., only show voxels with intensity > 30). The demonstration of this command will use the anatomical SPGR image from above.

clear; anat_image=readmr('\\Huxley\Data\Class.01\Examples\Anatomical_Images\3D_SPGR_Full_Brain\3D_SPGR_a.img', 256,256, 86,'volume');

Let's say that we want to find all voxels representing the skull (generally high intensity values on T1-weighted images). We can set a threshold that we want to look for, in this case, values greater than 100. To identify such voxels, use the find command:

new_voxels=find(anat_image > 100); This command identifies 202,840 voxels. We will use the find command in the following section to create intensity-based masks of the brain.

Working with data in showsrs

We will use the BIAC tool showsrs, which can be used for simple analyses or displaying of data, as an introduction to working with MRI data. Showsrs is a MATLAB program that allows visualization of three- or four-dimensional data. When we refer to “four-dimensional data”, we mean that a data set has three spatial dimensions and one time dimension. So, we can use showsrs to display a data set consisting of a single anatomical image (3D) or a entire run of functional images (4D). Note that there are more advanced programs (e.g., orthoshowsrs) that subsume much of the functionality of showsrs and are more typically used by BIAC personnel. We here use showsrs as an introduction to basic concepts, but we will present these other programs in future sections.

To find out the usage syntax for showsrs, simply type help showsrs. A more detailed guide to the buttons and options in showsrs can be found in the BIAC reference library here (insert hyperlink)

Loading a sample data set

We will begin by loading a sample data set consisting of 326 time points, each with 64*64*14 voxels. It will take a few seconds to load the data clear tsv=readtsv(‘\\Huxley\Data\Class.01\Examples\Data_Sets\FUNC3\V*.img', {64,64,14,'volume'});

Running showsrs

You can either run showsrs as a function or as a program that prompts you for the data. To run it as a function, just provide the variable name for the data set you want to use as an argument. Note that the variable must already be loaded into memory. showsrs(tsv)

To run showsrs as a program that calls the readtsv command (Section 2.3): showsrs(readtsv)

To continue with the examples below, you should just go ahead and run it as a function that loads variable “tsv”.

Basic use of showsrs

The basic showsrs window displays the raw data from the loaded variable. The display begins at the first slice of the first volume. The slider at the left of the screen changes slices, and the slider at the bottom of the screen changes volumes. Try changing each of these to see what happens.

The buttons at the bottom left of the screen work like controls on a CD player: |< go to first volume, < play backward, [] pause, > play forward, >| go to last volume. The loop checkbox indicates whether to loop around to the front or back when you reach an end of a series. The adjacent box (begins with 0.0) is the time between successive images (play rate); increase this number to slow down the play speed. The play controls will only be functional for 4D data.

The W+L button brings up a new figure that controls the contrast of the image. You can control contrast by setting the “window” (i.e., how large of an intensity difference is set between black and white) and “level” (i.e., what value corresponds to the lowest level, black). To change these values, either type in the values in the boxes or drag the yellow sliders. By adjusting the contrast values in the loaded image (if you have loaded the example from Section 4.1), you will be able to see image artifacts surrounding the brain. Once you are finished adjusting the contrast, you should close the small figure that popped up.

Note that in showsrs, as in most BIAC tools, the menu at the top of the window is generally not used in most circumstances.

Finding intensity values in showsrs

To look at the intensity values of a single point in the brain, just click with your mouse on that point. Click on one point now. If you have a 3D dataset loaded, you will see a single red dot in the middle of the figure that gives you the intensity value of the brain where you clicked. if you have a 4D dataset loaded, you will see the entire time series. Looking at the time series is extremely useful for evaluating whether a voxel is active or for looking for problems in the data.

If at this point do you not have the main showsrs window active (title bar is gray), just click on the titlebar to bring the window to the front. Now, you can move your previously selected point around with the keys W, A, S, and D. This is useful for quickly exploring a small region of the brain.

We can look at a 3D data set at the same time in a new showsrs window. Go back to the MATLAB command window and type the following:

anat_image=readmr('\\Huxley\Data\Class.01\Examples\Anatomical_Images\3D_SPGR_Full_Brain\3D_SPGR_a.img', 256,256, 86,'volume');

This will create a new variable with a 3D anatomical image. Now, you can quickly display this brain in a new showsrs window:

showsrs(readtsv) All of the above showsrs commands are still available, except that you cannot move through the series in time. Looking at different datasets in showsrs this way provides a good way of beginning to understand concepts like spatial resolution and contrast.

Brain Imaging & Analysis Center

Table of Contents