HMMTree: A computer program for hierarchical multinomial processing tree models

Help Contents

  • 1. User Guide
  • 2. Contact the authors

    1. How to use this software: a brief user guide

  • The HMMTree program computes parameter estimates, confidence intervals and goodness-of-fit statistics for hierarchical multinomial processing tree models.
  • The current release of the program can be found at http://www.psychologie.uni-freiburg.de/Members/stahl/HMMTree.
  • This guide gives a short introduction to computing HMM models with the program and describes the file formats that are used. The program is described in more detail in:
    Stahl, C. & Klauer, K.C. (2007). HMMTree: A computer program for hierarchical multinomial processing tree models. Behavior Research Methods, 39, 267-273.
  • For an introduction to latent class hierarchical multinomial models, see:
    Klauer, K.C. (2006). Hierarchical multinomial processing tree models: A latent-class approach. Psychometrika, 71, 1-31.

    Computing HMM models

    1.1 Load a model and a data set

      Model
    1. Select File | Load equations or press CTRL + E. A file open dialog will appear.
    2. Pick an EQN file to load and select 'Open' or press ENTER.
    3. The model will be loaded and displayed. Click on the 'Model' tab to verify that you selected the correct model.
      Data
    1. Select File | Load data or press CTRL + D. A file open dialog will appear.
    2. At the bottom, select the file type (MDT or DAT) you wish to open.
    3. Pick a data file to open and select 'Open' or press ENTER.
    4. The data will be opened and displayed in a grid on the 'Data' tab, where you can verify your selection.

    1.2 Compute model

    Once you have successfully loaded a model and data file, the 'Run' button will be enabled.

    1. Select the desired output options. By default, parameter estimates and confidence intervals are computed.
      • On the first run, parameter estimates need to be computed.
      • Confidence intervals are available when the Fisher information matrix is also computed. Optionally, the inverse of the Fisher information matrix can be printed as well.
      • Goodness-of-fit statistics for mean and variance/covariance structure are also available when the Fisher information is computed.
      • Also, the observed and expected means (classwise) and variances and covariances can be computed.
      • In case of more than one class, the posterior probability of a participant's class membership can be computed.
    2. Click the 'Run' button or press ALT + R or F9 to start the computation. When the computation is finished, the results are displayed in the 'Output' tab.
    3. Save the output to a text file with the File | Save output (or CTRL + S) or the File | Save output as... (or F12) commands.

    File formats

    Model equations have to be provided in the EQN format, which is also used by other multinomial model estimation software such as AppleTree, MBT, or GPT. Data can be provided in two different formats: the MDT format (used by AppleTree and MBT) and the DAT format, a tab-separated subjects x categories matrix.

    The EQN Format

    The EQN format used for this program is equivalent to the format used by AppleTree and MBT. An EQN file specifies the equations of a multinomial model.
    The first line gives the number of equations in the model. Lines 2 through n + 1, where n is the number of equations, contain the model's equations, one equation per line.
    An equation line contains three fields, separated by a space:

  • The first field contains the tree number of that equation.
  • The second field contains the category number.
  • The third field contains the equation itself.

    An example EQN file:
    16
    1 1 c1*r1
    1 4 c1*(1-r1)
    1 2 (1-c1)*u1*u1
    1 3 (1-c1)*u1*(1-u1)
    1 3 (1-c1)*(1-u1)*u1
    1 4 (1-c1)*(1-u1)*(1-u1)
    2 5 a1
    2 6 (1-a1)
    3 7 c2*r2
    3 10 c2*(1-r2)
    3 8 (1-c2)*u2*u2
    3 9 (1-c2)*u2*(1-u2)
    3 9 (1-c2)*(1-u2)*u2
    3 10 (1-c2)*(1-u2)*(1-u2)
    4 11 a2
    4 12 (1-a2)
    
    The equation format is defined as follows:
    	equation ::= term | equation*term
    	term ::= parameter | (1-parameter)
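
    As an illustration of the format described above, the following Python sketch reads an EQN file and evaluates the category probabilities for a given set of parameter values. It is not part of HMMTree: the names parse_eqn and category_probabilities are hypothetical, the parameter values (all 0.5) are arbitrary, and only the grammar given above is supported.

```python
import re
from collections import defaultdict

# The example EQN file from this guide.
EQN_TEXT = """\
16
1 1 c1*r1
1 4 c1*(1-r1)
1 2 (1-c1)*u1*u1
1 3 (1-c1)*u1*(1-u1)
1 3 (1-c1)*(1-u1)*u1
1 4 (1-c1)*(1-u1)*(1-u1)
2 5 a1
2 6 (1-a1)
3 7 c2*r2
3 10 c2*(1-r2)
3 8 (1-c2)*u2*u2
3 9 (1-c2)*u2*(1-u2)
3 9 (1-c2)*(1-u2)*u2
3 10 (1-c2)*(1-u2)*(1-u2)
4 11 a2
4 12 (1-a2)
"""

# term ::= parameter | (1-parameter)
TERM = re.compile(r"\(1-(\w+)\)|(\w+)")


def parse_eqn(text):
    """Return a list of (tree, category, terms); a term is (parameter, is_complement)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    n_equations = int(lines[0])
    body = lines[1:n_equations + 1]
    if len(body) != n_equations:
        raise ValueError("fewer equation lines than announced on line 1")
    rows = []
    for line in body:
        tree, category, equation = line.split(None, 2)
        terms = []
        # equation ::= term | equation*term
        for term in equation.replace(" ", "").split("*"):
            match = TERM.fullmatch(term)
            if match is None:
                raise ValueError(f"unsupported term: {term!r}")
            complement, plain = match.groups()
            terms.append((complement or plain, complement is not None))
        rows.append((int(tree), int(category), terms))
    return rows


def category_probabilities(rows, params):
    """Sum the branch probabilities of all equations that end in the same category."""
    probs = defaultdict(float)
    for _tree, category, terms in rows:
        branch = 1.0
        for name, is_complement in terms:
            branch *= (1.0 - params[name]) if is_complement else params[name]
        probs[category] += branch
    return dict(probs)


rows = parse_eqn(EQN_TEXT)
# Arbitrary illustrative parameter values (an assumption, not estimates).
params = {name: 0.5 for name in ("c1", "r1", "u1", "a1", "c2", "r2", "u2", "a2")}
probs = category_probabilities(rows, params)

# Within each tree, the category probabilities should sum to 1.
tree_totals = defaultdict(float)
for tree, category in {(t, c) for t, c, _ in rows}:
    tree_totals[tree] += probs[category]
print(dict(tree_totals))
```

    Checking that the category probabilities of each tree sum to 1 is a quick way to catch a missing or mistyped branch before loading the file into the program.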
    	

    The MDT Format

    The MDT data set consists of a data set description on the first line, followed by a line for each category, containing its number and its frequency, separated by a space. The end of a data set is marked by three equal signs. A file can contain multiple data sets.

    An example of an MDT file with aggregated data:

    Participants 1-3
    1 8
    2 1
    3 7
    4 13
    5 7
    ===
    	
    The MDT format is also used by AppleTree, GPT and MBT. These programs do not compute hierarchical multinomial models and thus do not require individuals' data sets. MDT files with aggregated data prepared for these programs therefore cannot be used to compute hierarchical multinomial models. To use the MDT format for hierarchical multinomial models, each individual's category frequencies have to be provided as a separate data set.

    An example of an MDT file containing individual subjects' category frequencies:
    Participant 1
    1 4
    2 0
    3 1
    4 5
    5 2
    ===
    Participant 2
    1 3
    2 0
    3 4
    4 2
    5 3
    ===
    Participant 3
    1 1
    2 1
    3 2
    4 6
    5 2
    ===
    	
    Category labels have no relevance for computation and may be omitted. If present, those category labels given with the first individual's data set are displayed.
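
    The relationship between the two MDT examples above can be sketched in Python: the individual data sets are read one by one, and summing them per category reproduces the aggregated frequencies. This is an illustrative sketch, not code from HMMTree; the names parse_mdt and aggregate are hypothetical, and the sketch assumes that every data line contains only a category number and a frequency.

```python
MDT_TEXT = """\
Participant 1
1 4
2 0
3 1
4 5
5 2
===
Participant 2
1 3
2 0
3 4
4 2
5 3
===
Participant 3
1 1
2 1
3 2
4 6
5 2
===
"""


def parse_mdt(text):
    """Return a list of (description, {category: frequency}), one entry per data set."""
    datasets = []
    description, frequencies = None, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("==="):
            # Three equal signs mark the end of a data set.
            if description is not None:
                datasets.append((description, frequencies))
            description, frequencies = None, {}
        elif description is None:
            # The first line of a data set is its description.
            description = line
        else:
            parts = line.split()
            frequencies[int(parts[0])] = int(parts[1])
    return datasets


def aggregate(datasets):
    """Sum the category frequencies over all data sets."""
    totals = {}
    for _description, frequencies in datasets:
        for category, frequency in frequencies.items():
            totals[category] = totals.get(category, 0) + frequency
    return totals


datasets = parse_mdt(MDT_TEXT)
totals = aggregate(datasets)
print(len(datasets), totals)
```

    Aggregation only works in this direction: the individual frequencies cannot be recovered from an aggregated file, which is why hierarchical models need the individual data sets.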

    The DAT Format (tab-separated matrix)

    The DAT format makes it easy to use data from other statistics software for hierarchical multinomial model estimation. It consists of a matrix of individuals' category frequencies, with one row per individual and one column per category.

    The first line of a DAT file is interpreted as a data set description. It can contain labels for each category or, alternatively, a description of the data set.

    Each additional line is interpreted as an individual's set of category frequencies. Category frequencies are read from left to right. The first value is assigned to category 1, the second value is assigned to category 2, etc. Values on a line have to be separated by (one or multiple) TAB(s) or space(s).

    A DAT file may contain individuals' numbers or codes as the first value of each line. If the second line of a DAT file contains one more value (delimited by tabs or spaces) than the model contains categories, the first value in each line is interpreted as a participant code or number.

    An example with category labels:

    A	B	C	D	E
    4	0	1	5	2
    3	0	4	2	3
    1	1	2	6	2 
    	
    An example with data set description:
    Subjects 1-3:
    4	0	1	5	2
    3	0	4	2	3
    1	1	2	6	2 
    	
    An example with category and participant labels:
    A	B	C	D	E
    S1	4	0	1	5	2
    S2	3	0	4	2	3
    S3	1	1	2	6	2 
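
    The reading rules described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's actual implementation: the name parse_dat is hypothetical, and the number of categories, which the program takes from the loaded model, is passed in as an argument.

```python
def parse_dat(text, n_categories):
    """Return (description, participant_codes, frequency_matrix) for a DAT file."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    description = lines[0].strip()  # first line: labels or data set description
    data_lines = lines[1:]
    # One extra value on the second line of the file means that each line
    # starts with a participant code or number.
    has_codes = len(data_lines[0].split()) == n_categories + 1
    codes, matrix = [], []
    for number, line in enumerate(data_lines, start=1):
        values = line.split()  # splits on runs of tabs and/or spaces
        if has_codes:
            code, values = values[0], values[1:]
        else:
            code = str(number)
        if len(values) != n_categories:
            raise ValueError(f"line {number + 1}: expected {n_categories} frequencies")
        codes.append(code)
        matrix.append([int(value) for value in values])
    return description, codes, matrix


# The example with category and participant labels.
DAT_WITH_CODES = "\n".join([
    "A\tB\tC\tD\tE",
    "S1\t4\t0\t1\t5\t2",
    "S2\t3\t0\t4\t2\t3",
    "S3\t1\t1\t2\t6\t2",
])
# The example with a data set description and no participant codes.
DAT_WITHOUT_CODES = "\n".join([
    "Subjects 1-3:",
    "4\t0\t1\t5\t2",
    "3\t0\t4\t2\t3",
    "1\t1\t2\t6\t2",
])

with_codes = parse_dat(DAT_WITH_CODES, n_categories=5)
without_codes = parse_dat(DAT_WITHOUT_CODES, n_categories=5)
print(with_codes[1], without_codes[1])
```

    Both examples yield the same frequency matrix; only the participant codes differ, because the second file has no code column and the sketch falls back to row numbers.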
    	
    The following SPSS command provides you with a DAT file of your data:
    SAVE TRANSLATE OUTFILE='filename.dat' /TYPE=TAB /MAP /REPLACE /FIELDNAMES.
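
    For data that are not held in SPSS, a file in the same layout can be produced with a few lines of Python. The sketch below uses only the standard csv module; the function name write_dat and the example labels and frequencies are assumptions made for illustration.

```python
import csv
import io


def write_dat(stream, labels, rows):
    """Write category labels and one row of frequencies per participant, tab-separated."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    writer.writerow(labels)
    writer.writerows(rows)


# Written to an in-memory buffer here; pass an open file object to write to disk.
buffer = io.StringIO()
write_dat(
    buffer,
    ["A", "B", "C", "D", "E"],
    [[4, 0, 1, 5, 2], [3, 0, 4, 2, 3], [1, 1, 2, 6, 2]],
)
dat_text = buffer.getvalue()
print(dat_text)
```

    Because spaces also act as separators in the DAT format, participant codes written this way should not contain spaces.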

    2. Contact the authors

    Christoph Stahl & Karl Christoph Klauer
    Institute for Psychology
    Albert-Ludwigs-Universitaet Freiburg
    D-79085 Freiburg i. Br.
    Germany

    E-Mail: stahl@psychologie.uni-freiburg.de