How to get started with the MATLAB distributed computing toolbox?

The MATLAB parallel computing toolbox, formerly known as the distributed computing toolbox (DCT), is a commercial toolbox provided by MathWorks. It allows you to execute distributed computations on multiple cores in a single computer or, if you have access to distributed computing engines, on a compute cluster. To figure out whether you have it, you can try

help distcomp

or using FieldTrip

ft_hastoolbox('distcomp')
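If you prefer to check from within a script, the following is a minimal sketch using only standard MATLAB functions. It assumes that 'Distrib_Computing_Toolbox' is the license feature name of the toolbox and that either parpool (newer releases) or matlabpool (older releases) is on the path when the toolbox is installed; the variable names are only for illustration.

```matlab
% Check whether a license for the toolbox is available.
% Assumption: 'Distrib_Computing_Toolbox' is the license feature name.
hasLicense = license('test', 'Distrib_Computing_Toolbox');

% Check whether one of the functions that start the workers is on the path.
hasPoolFunction = ~isempty(which('parpool')) || ~isempty(which('matlabpool'));

if hasLicense && hasPoolFunction
  disp('the parallel computing toolbox seems to be available');
else
  disp('the parallel computing toolbox was not found');
end
```

A license test only tells you that a license exists, not that one can be checked out at this moment, so treat the result as a hint.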

This page provides a short introduction to the most relevant functions that allow you to distribute your FieldTrip analysis over multiple computers or cores and run it in parallel.

The distributed computing toolbox requires that you begin by starting up the “workers”, either on your local computer or on the distributed computing engines on your cluster. This is done with e.g.

matlabpool local 4

which starts 4 workers with the “local” configuration. Subsequently you can use parfor instead of the normal for to iterate over a number of computations, as in

dataset = {
'Subject01.ds'
'Subject02.ds'
'Subject03.ds'
};

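data = cell(size(dataset));  % one output per dataset

parfor i=1:3
  cfg = [];                  % start with an empty configuration in each iteration
  cfg.dataset = dataset{i};
  data{i} = ft_preprocessing(cfg);
end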
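In more recent MATLAB releases the matlabpool command has been replaced by parpool. The following sketch shows the complete cycle of starting the workers, running a parfor loop and releasing the workers again. It assumes that a profile named 'local' exists and that your computer allows 4 workers; the squaring is a toy computation that stands in for the FieldTrip call, and the variable names are only for illustration.

```matlab
nworkers   = 4;                          % same number of workers as above
useParpool = ~isempty(which('parpool')); % true on newer MATLAB releases

% start the workers, unless a pool is already open
if useParpool
  if isempty(gcp('nocreate'))
    parpool('local', nworkers);          % assumption: profile is called 'local'
  end
else
  matlabpool('open', 'local', nworkers); % older MATLAB releases
end

% toy computation that stands in for ft_preprocessing
result = cell(1, 3);
parfor i = 1:3
  result{i} = i^2;
end

% release the workers when you are done
if useParpool
  delete(gcp('nocreate'));
else
  matlabpool('close');
end
```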

Alternatively you can use the dfeval function like this

cfg = cell(1,3);  % one configuration per dataset
for i=1:3
  cfg{i} = [];
  cfg{i}.dataset = dataset{i};
end

data = dfeval(@ft_preprocessing, cfg, 'Configuration', 'local');

The dfeval function works similarly to the standard MATLAB cellfun function, and thereby also to the FieldTrip qsubcellfun and peercellfun functions.
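To make the analogy with cellfun concrete, here is a small serial sketch. So that it runs without FieldTrip and without the data, it uses a hypothetical stand-in function called describe; with FieldTrip and the datasets available you would pass @ft_preprocessing instead.

```matlab
dataset = {'Subject01.ds', 'Subject02.ds', 'Subject03.ds'};

% one configuration structure per dataset
cfg = cell(1, 3);
for i = 1:3
  cfg{i} = struct('dataset', dataset{i});
end

% hypothetical stand-in for ft_preprocessing, only for illustration
describe = @(c) ['would process ' c.dataset];

% the function is applied to each cell, the results are returned as a cell-array
out = cellfun(describe, cfg, 'UniformOutput', false);
```

Each element of the output corresponds to one element of the input, which is the same calling pattern that dfeval, qsubcellfun and peercellfun follow for distributed execution.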

A third approach that is available in the distributed computing toolbox is to use the spmd construct. Given the same definition of the dataset as a cell-array with three strings as above, this would look like

matlabpool local 3
spmd (3)
  cfg = [];
  cfg.dataset = dataset{labindex};  % each worker picks its own dataset
  data = ft_preprocessing(cfg);     % on the client, data{k} is the result of worker k
end

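The labindex variable automatically takes the index of the worker that executes the block. Variables that are assigned inside the spmd block are returned to the client as a Composite object with one element per worker. Note that this only works if the number of workers in your matlabpool is greater than or equal to the number of jobs.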
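How results come back from an spmd block can be illustrated with a toy computation. The sketch below assumes that a pool with at least 3 workers is already open; the variable names squared and collected are only for illustration.

```matlab
% each of the 3 workers executes the same block with its own labindex
spmd (3)
  squared = labindex^2;   % toy computation instead of a FieldTrip call
end

% on the client, squared is a Composite object with one entry per worker,
% which can be indexed with curly brackets like a cell-array
collected = cell(1, 3);
for k = 1:3
  collected{k} = squared{k};
end
```

Copying the entries into a regular cell-array keeps the results available after the pool has been closed.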

Some closing remarks

Many of the FieldTrip functions allow you to specify the cfg.inputfile and cfg.outputfile options, which allow you to run large analyses in parallel without all the analysis results being returned to your primary MATLAB session. This is especially relevant if your primary computer is not able to hold the results of all computations in memory at the same time.
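As a sketch of how this could look in combination with parfor: each iteration writes its result to disk and a later step reads it back from disk, so that nothing is collected in the primary MATLAB session. The file names are hypothetical, and ft_timelockanalysis is only used here as an example of a subsequent analysis step; the sketch assumes that both functions accept these options in your FieldTrip version.

```matlab
dataset = {'Subject01.ds', 'Subject02.ds', 'Subject03.ds'};

% first step: read and preprocess, write the result to disk
parfor i = 1:3
  cfg = [];
  cfg.dataset    = dataset{i};
  cfg.outputfile = sprintf('Subject%02d_raw.mat', i);  % hypothetical file name
  ft_preprocessing(cfg);   % no output argument, the result only goes to disk
end

% second step: read the previous result from disk, write the new result to disk
parfor i = 1:3
  cfg = [];
  cfg.inputfile  = sprintf('Subject%02d_raw.mat', i);  % hypothetical file name
  cfg.outputfile = sprintf('Subject%02d_avg.mat', i);  % hypothetical file name
  ft_timelockanalysis(cfg);
end
```

The output files should be written to a location that all workers can access, for example a shared network disk when running on a cluster.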

Elsewhere on this FieldTrip wiki you can find more documentation, such as the distributed computing tutorial. Some of the FAQs on distributed computing with the FieldTrip qsub toolbox and the peer toolbox will also be informative in general.

faq/how_to_get_started_with_the_matlab_distributed_computing_toolbox.txt · Last modified: 2015/10/26 11:55 by 131.174.44.150

CC Attribution-Share Alike 3.0 Unported