Chapter 2. I just want to analyze my data!

Table of Contents
Data analysis considerations
Putting it all together

If you have already obtained QMS2 and are reasonably familiar with Unix or Unix-like systems and SAS Software, then you might only need the information in the section called Putting it all together. That section gives a brief description of exactly what to type to go through a complete analysis using Genehunter 2 and QMS2. Each step is described in more detail in the subsequent chapters. If you are not comfortable writing your own SAS Software scripts and using a Unix or Unix-like command line, then you should read this entire document.

Warning

This document does not:

  • Give a complete tutorial on all of the options available with QMS2.

  • Explain how to format raw data for use with Genehunter 2.

  • Describe how to interpret the results of QMS2.

Because QMS2 was primarily written to perform the DF analysis, the default settings are selected to correctly handle data destined for a DF analysis. These defaults include expecting probands to be specified in the input file and double entering the data. They are probably not appropriate for the DF Augmented, HE, and New HE models.

The qms2.sas macro is configured to produce no output by default. Although this is a limited and probably ineffectual safety measure, the authors designed the macro to require active intervention by the user before any files or datasets are overwritten.

Warning

Running proc print; on the output statistics and influence statistics datasets can produce a huge amount of output. Influence statistics are produced for each sib pair at each position, so even moderately sized datasets can produce listings of tens or hundreds of megabytes. It is strongly recommended that some post-processing be performed on the influence statistics, such as printing only the 10 most influential cases.
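The kind of post-processing suggested above can be sketched in a few lines. The following is a minimal illustration in Python rather than SAS, and it rests on several assumptions: the influence statistics have already been exported to plain records, the field names (pair, position, influence) are hypothetical and are not the names QMS2 uses, and "most influential" is taken to mean the largest absolute value.

```python
import heapq


def top_influential(rows, key="influence", n=10):
    """Return the n rows with the largest absolute influence value.

    rows is any iterable of dict-like records. The field name given by
    key is a hypothetical placeholder, not a QMS2 variable name.
    """
    return heapq.nlargest(n, rows, key=lambda row: abs(row[key]))


# Made-up records standing in for one row per sib pair per map position.
rows = [
    {"pair": pair, "position": pos, "influence": ((pair * 7 + pos * 3) % 11) - 5}
    for pair in range(1, 6)
    for pos in range(0, 4)
]

# Print only a handful of rows instead of the full listing.
for row in top_influential(rows, n=3):
    print(row)
```

Because heapq.nlargest keeps only n rows at a time, the full sorted listing never has to be held in memory, which matters when there is one record per sib pair per position.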

Data analysis considerations

As it is written, QMS2 uses all four models:

  • DF Basic

  • DF Augmented

  • HE

  • New HE

every time it is run. Because these models have different properties, it is unlikely to be appropriate to use all of them on the same dataset.

Another important factor to take into consideration is the direction (sign) of the t statistic. Under the HE model the t statistics of interest will have negative values, while under the DF Augmented and the New HE models the t statistics of interest will be positive. The direction of the t statistic under the DF Basic model depends on the scaling of the data. If the mean value of the selected group is lower than the mean value of their co-sibs (i.e. selection on a low score), the t statistics of interest will be negative because the regression line has a negative slope. If the mean value of the selected group is higher than the mean value of their co-sibs (i.e. selection on a high score), the t statistics of interest will be positive because the regression line has a positive slope.
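The sign rules in the preceding paragraph can be written down as a small lookup, which may help when deciding whether to look at peaks or troughs for a given model. This is a sketch in Python and is not part of QMS2: the function name, its arguments, and the model-name strings are all hypothetical. The text does not say what happens when the two group means are equal, so raising an error in that case is an assumption of this sketch.

```python
def expected_t_sign(model, selected_mean=None, cosib_mean=None):
    """Return -1 or +1: the sign of the t statistics of interest.

    The model names are plain strings chosen for this sketch. For
    "DF Basic" the answer depends on the two group means, so both
    must be supplied.
    """
    if model == "HE":
        return -1
    if model in ("DF Augmented", "New HE"):
        return 1
    if model == "DF Basic":
        if selected_mean is None or cosib_mean is None:
            raise ValueError("DF Basic needs both group means")
        if selected_mean == cosib_mean:
            # Not covered by the text; treated as undefined here.
            raise ValueError("means are equal; direction is undefined")
        # Selection on a low score gives a negative slope, so negative t.
        return -1 if selected_mean < cosib_mean else 1
    raise ValueError(f"unknown model: {model!r}")


# Selection on a low score: selected group mean below the co-sib mean.
print(expected_t_sign("DF Basic", selected_mean=85.0, cosib_mean=97.0))
```

A caller could multiply each t statistic by this sign so that, for every model, larger values always point toward the results of interest.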

These things must be taken into account when considering whether to get excited about the peaks or the troughs of the graphs. Genehunter 2 and other programs may truncate positive t statistics for the HE model, but QMS2 does no filtering of the results.