Convert the EEG language dataset for sharing in BIDS

The following MATLAB script shows how the EEG data supporting “Identifying Object Categories from Event-Related EEG: Toward Decoding of Conceptual Representations” by Simanova et al. (2010) was converted to the Brain Imaging Data Structure (BIDS). The original data is available from the archive of the MPI for Psycholinguistics. More details on the experiment and data can be found here.

The BIDS background is explained on http://bids.neuroimaging.io; details on the specification can be found on https://bids-specification.readthedocs.io/.

This conversion makes use of the data2bids function to convert the data and to write the associated metadata to the accompanying TSV and JSON files. The converted data in the BIDS organization is available from our download server.
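Once the converted dataset has been downloaded, or generated with the script below, a quick check is to read one of the events files back into MATLAB. The following is a minimal sketch, not part of the original conversion. It assumes that the output sits in a directory called `bids` and that the files follow the standard BIDS naming scheme for subject `00` and task `language`; adjust these to your own situation.

```matlab
% Assumptions: output directory "bids", subject label "00", task "language",
% and standard BIDS file naming for the events file
bidsroot = 'bids';
sub      = '00';
task     = 'language';

filename = fullfile(bidsroot, ['sub-' sub], 'eeg', ...
  sprintf('sub-%s_task-%s_events.tsv', sub, task));

if exist(filename, 'file')
  % TSV files are tab-separated text, so specify this explicitly
  events = readtable(filename, 'FileType', 'text', 'Delimiter', '\t');
  disp(head(events));             % show the first few rows
  disp(unique(events.modality));  % list the modality codes that occur
else
  warning('could not find %s, please run the conversion first', filename);
end
```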

The conversion here only includes the “pilot” subjects numbered 0, 1, 2, 3, 4, which were used for the (non-blind) optimization of the analysis pipeline in the original paper. The optimized pipeline was subsequently used (blind) on all other subjects. This ensures that the optimization of the processing does not bias the results of the analysis.

% this script converts the original data to a BIDS representation

originaldir = '/Volumes/SamsungT3/data/eeg-language/pilot';
rawdir = 'bids';

originalid = {
  'subj0'
  'subj1'
  'subj2'
  'subj3'
  'subj4'
  };

for s=1:length(originalid)
  % the subject loop uses "s", since "i" is used for the loop over events further down
  % keep the original number as the identifier, but zero pad it to two digits (e.g., 00, 01, ...)
  number = sscanf(originalid{s}, 'subj%d');
  newid = sprintf('%02d', number);
  
  cfg = [];
  cfg.sub = newid;
  cfg.bidsroot = rawdir;
  cfg.datatype = 'eeg';
  cfg.dataset = fullfile(originaldir, [originalid{s} '.vhdr']);
  cfg.task = 'language';
  cfg.method = 'copy'; % it is already in the right format
  
  % this goes into dataset_description.json
  % see https://bids-specification.readthedocs.io/en/stable/03-modality-agnostic-files.html#dataset_descriptionjson
  cfg.dataset_description.Name                = 'Identifying Object Categories from Event-Related EEG: Toward Decoding of Conceptual Representations';
  cfg.dataset_description.BIDSVersion         = '1.6.0';
  cfg.dataset_description.Authors             = {'Irina Simanova', 'Marcel van Gerven', 'Robert Oostenveld', 'Peter Hagoort'};
  cfg.dataset_description.ReferencesAndLinks  = {'https://doi.org/10.1371/journal.pone.0014465', 'https://hdl.handle.net/1839/00-0000-0000-001B-860D-8'};
  
  hdr = ft_read_header(cfg.dataset);
  nchan = hdr.nChans;
  
  % this goes into eeg.json
  cfg.InstitutionName                 = 'Radboud University';
  cfg.InstitutionalDepartmentName     = 'Donders Institute for Brain, Cognition and Behaviour';
  cfg.InstitutionAddress              = 'Kapittelweg 29, 6525 EN, Nijmegen, The Netherlands';
  
  cfg.TaskDescription = 'The study investigated semantic processing of stimuli presented as pictures (black line drawings on white background), visually displayed text, or auditorily presented words. Stimuli consisted of concepts from three semantic categories: two relevant categories (animals, tools) and a task category that varied across subjects, either clothing or vegetables.';
  
  cfg.eeg.Manufacturer              = 'BrainProducts';
  cfg.eeg.ManufacturersModelName    = 'BrainAmp with ActiCap';
  cfg.eeg.CapManufacturer           = 'Easycap';
  cfg.eeg.CapManufacturersModelName = 'M10';
  cfg.eeg.PowerLineFrequency = 50;
  cfg.eeg.HardwareFilters.lowpass = 1000;
  cfg.eeg.HardwareFilters.highpass = 1/10;
  cfg.eeg.SoftwareFilters.lowpass = 200;
  cfg.eeg.SoftwareFilters.highpass = 1/10;
  cfg.eeg.EEGChannelCount = 62;
  cfg.eeg.EOGChannelCount = 2;
  
  % this goes in channels.tsv
  cfg.channels.name               = hdr.label;
  cfg.channels.type               = repmat({'EEG'}, nchan, 1);  % Type of channel
  cfg.channels.units              = repmat({'uV'}, nchan, 1);   % Physical unit of the data values recorded by this channel
  cfg.channels.sampling_frequency = repmat(hdr.Fs, nchan, 1); % Sampling rate of the channel in Hz.
  
  % add a human-interpretable event table
  event = ft_read_event(cfg.dataset);
  nevent = length(event);
  
  onset     = ([event.sample]' - 1)/hdr.Fs; % starting at t=0
  duration  = zeros(size(onset));
  sample    = [event.sample]'; % starting at sample 1
  type      = {event.type}';
  value     = {event.value}';
  
  % these columns give the technical description of the events
  % BIDS requires onset and duration as the first two columns; sample, type and value are additional
  % see https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/05-task-events.html
  required = table(onset, duration, sample, type, value);
  
  % The first digit codes task/no task: 1 for the non-target semantic categories
  % (animals, tools) and 2 for the target semantic category (clothing or vegetables,
  % depending on the subject). The subjects' task was to press the button in response
  % to target items; these targets were not analyzed in the main study.
  %
  % The second digit codes the items, 1 to 4 for animals (cow, bear, lion, ape) and 5
  % to 8 for tools (ax, scissors, comb, pen). There were also 4 target items.
  %
  % The third digit codes the stimulus modality: 1 for written words, 2 for pictures, 3
  % for spoken words.
  
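  % start with 'unknown' everywhere, so that codes not handled below do not leave empty cells
  task      = repmat({'unknown'}, nevent, 1);  % notarget, target, response, unknown
  category  = repmat({'unknown'}, nevent, 1);  % animals, tools, target, response, unknown
  item      = repmat({'unknown'}, nevent, 1);  % cow, bear, lion, ape, ax, scissors, comb, pen, target, response, unknown
  modality  = repmat({'unknown'}, nevent, 1);  % written, picture, spoken, response, unknown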
  
  for i=1:nevent
    if strcmp(event(i).type, 'Stimulus')
      digit1 = str2double(event(i).value(2));
      digit2 = str2double(event(i).value(3));
      digit3 = str2double(event(i).value(4));
      
      if isnan(digit1) || isnan(digit2) || isnan(digit3)
        task{i}     = 'unknown';
        category{i} = 'unknown';
        item{i}     = 'unknown';
        modality{i} = 'unknown';
        continue
      end
      
      switch digit1
        case 1
          task{i} = 'notarget';
        case 2
          task{i} = 'target';
      end
      
      if strcmp(task{i}, 'target')
        % the interpretation of digit2 is not given for targets
        category{i} = 'target'; % clothes or vegetables
        item{i}     = 'target'; % we don't know the actual items

      else
        % the following only applies to nontargets
        switch digit2
          case {1, 2, 3, 4}
            category{i} = 'animals';
          case {5, 6, 7, 8}
            category{i} = 'tools';
        end

        switch digit2
          case 1
            item{i} = 'cow';
          case 2
            item{i} = 'bear';
          case 3
            item{i} = 'lion';
          case 4
            item{i} = 'ape';
          case 5
            item{i} = 'ax';
          case 6
            item{i} = 'scissors';
          case 7
            item{i} = 'comb';
          case 8
            item{i} = 'pen';
        end
        
      end % target or non-target

      switch digit3
        case 1
          modality{i} = 'written';
        case 2
          modality{i} = 'picture';
        case 3
          modality{i} = 'spoken';
      end
      
    elseif strcmp(event(i).type, 'Response')
      task{i}     = 'response';
      category{i} = 'response';
      item{i}     = 'response';
      modality{i} = 'response';
      
    else
      task{i}     = 'unknown';
      category{i} = 'unknown';
      item{i}     = 'unknown';
      modality{i} = 'unknown';
      
    end % stimulus or response
  end % for
  
  % these are the interpretation of the events
  interpretation = table(task, category, item, modality);
  
  % this is for events.tsv, note that it is with an "s"
  cfg.events = cat(2, required, interpretation);
  
  % this is for participants.tsv, note that it is with an "s"
  cfg.participants.age = nan;
  cfg.participants.gender = nan;
  cfg.participants.handedness = nan;
  
  % convert the dataset to BIDS
  data2bids(cfg);
  
end % for each subject
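
The three-digit trigger scheme used in the script can also be illustrated on a single marker. The sketch below is only an illustration and not part of the conversion. It assumes that stimulus markers consist of the letter S followed by three digits, consistent with how the script indexes characters 2 to 4 of the event value; the marker `S152` is a made-up example.

```matlab
% Hypothetical example marker: 'S' followed by three digits (assumed format)
value = 'S152';

% convert the three digit characters into numbers
digits = value(2:4) - '0';

% lookup tables following the coding scheme described in the script
tasks      = {'notarget', 'target'};
items      = {'cow', 'bear', 'lion', 'ape', 'ax', 'scissors', 'comb', 'pen'};
modalities = {'written', 'picture', 'spoken'};

task     = tasks{digits(1)};
modality = modalities{digits(3)};

if strcmp(task, 'target')
  % the meaning of the second digit is not given for targets
  category = 'target';
  item     = 'target';
else
  item = items{digits(2)};
  if digits(2) <= 4
    category = 'animals';
  else
    category = 'tools';
  end
end

fprintf('%s -> %s, %s, %s, %s\n', value, task, category, item, modality);
% prints: S152 -> notarget, tools, ax, picture
```

Using lookup tables keeps the mapping in one place, but unlike the script it does not guard against digits outside the documented range, which would cause an indexing error here.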

Tags: cuttingeeg2021 bids eeg-language

Convert the EEG sedation dataset for sharing in BIDS