Sparse File Format
Files such as PhyloD's predictorFile and targetFile are in the
sparse file format.
This format is a three-column, tab-delimited text file with the headers
var (variable),
cid (case identifier), and
val (value). For example, suppose that
AnHla is the name of an HLA.
A file describing this HLA starts:
var | cid | val |
AnHla | 1 | 1 |
AnHla | 2 | 0 |
AnHla | 3 | 0 |
AnHla | 4 | 1 |
AnHla | 5 | 1 |
... | ... | ... |
This says that the patient with cid 1 has HLA "AnHla". Patients 2 and 3 don't. Patients 4 and 5 do.
Missing data is supported. To say, for example, that you don't know whether patient 4 has HLA "AnHla", just leave the "AnHla 4" line out of the file.
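As a rough illustration of how such a file could be read, here is a minimal Python sketch. It is not part of PhyloD; the function names read_sparse and lookup are hypothetical, and the sample text simply repeats the example above with the "AnHla 4" line left out. A missing (var, cid) pair is reported as None, meaning "unknown".

```python
import csv
import io

def read_sparse(handle):
    """Parse sparse-format text into a nested dict: {var: {cid: val}}."""
    reader = csv.DictReader(handle, delimiter="\t")
    table = {}
    for row in reader:
        # cid is kept as a string because the format allows any string there.
        table.setdefault(row["var"], {})[row["cid"]] = int(row["val"])
    return table

def lookup(table, var, cid):
    """Return 0 or 1, or None when the (var, cid) line is absent (missing data)."""
    return table.get(var, {}).get(cid)

# The example file from above, with the line for patient 4 omitted.
sample = "var\tcid\tval\nAnHla\t1\t1\nAnHla\t2\t0\nAnHla\t3\t0\nAnHla\t5\t1\n"
table = read_sparse(io.StringIO(sample))

print(lookup(table, "AnHla", "1"))  # patient 1 has the HLA
print(lookup(table, "AnHla", "4"))  # patient 4 is unknown
```

Keeping absent lines as None, rather than defaulting them to 0, preserves the distinction between "does not have the HLA" and "not known".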
The values in the var and cid columns may be any string. The values in the
val column should be either 0 or 1.
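The column rules above can be checked with a few lines of Python before a file is handed to a tool. This is only a sketch under assumptions: validate_sparse_lines is a hypothetical helper, not a PhyloD function, and it assumes the header must match var, cid, val exactly and that any val other than 0 or 1 should be reported as a problem.

```python
def validate_sparse_lines(lines):
    """Return a list of human-readable problems; an empty list means none were found."""
    if not lines:
        return ["file is empty"]
    problems = []
    # Assumption: header names are matched exactly, including case.
    header = lines[0].rstrip("\n").split("\t")
    if header != ["var", "cid", "val"]:
        problems.append(f"line 1: unexpected header {header!r}")
    for number, line in enumerate(lines[1:], start=2):
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            problems.append(f"line {number}: expected 3 columns, found {len(fields)}")
            continue
        # var and cid may be any string, so only val is checked.
        if fields[2] not in ("0", "1"):
            problems.append(f"line {number}: val must be 0 or 1, found {fields[2]!r}")
    return problems

good = ["var\tcid\tval\n", "AnHla\t1\t1\n", "AnHla\t2\t0\n"]
bad = ["var\tcid\tval\n", "AnHla\t1\t2\n", "AnHla\t2\n"]
print(validate_sparse_lines(good))
print(validate_sparse_lines(bad))
```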