matthews_coef - Misha Isupov's Jiffy to calculate Matthews coefficient.
matthews_coef
[Keyworded input]
The Matthews Coefficient and solvent content are calculated from the unit cell and the molecular weight of the molecules in the unit cell. A description of the Matthews coefficient Vm and how it relates to solvent content is given below.
The program requires the information below which is input via keywords. No input files are required.
No output files are generated; below is a sample of the log output.
THE MATTHEWS COEF. IS : 1.74 SOL % IS : 28.96
or, if used with the AUTO keyword:
For estimated protein molecular weight 6000.

Nmol/asym   Matthews Coeff   %solvent
    1            6.6           81.2
    2            3.3           62.5
    3            2.2           43.7
    4            1.7           24.9
    5            1.3            6.1
Available keywords are:
AUTO, CELL, MOLWEIGHT, NMOL, NRES, SYMMETRY, XMLOUTPUT
CELL
You must give the unit cell parameters. The angles default to 90.0 if omitted.
SYMMETRY
Either the spacegroup number or name can be given. Alternatively, the symmetry operators can be input explicitly, each separated with a '*'. However, the program only requires the total number of operators.
NRES
The number of residues in one molecule, used to estimate its molecular weight in Daltons. It is assumed that on average each residue contains 5 carbon, 1.35 nitrogen, 1.5 oxygen, 8 hydrogen and 0.05 sulphur atoms and has a molecular weight of about 110.
MOLWEIGHT
The molecular weight of one molecule in Daltons. What matters is the total molecular weight of the molecules in the asymmetric unit, so this keyword is used in conjunction with NMOL. If this is not given, the program calculates a tentative molecular weight for the molecule, assuming the unit cell is 50% protein.
NMOL <number>
The <number> of molecules per asymmetric unit (default 1). This keyword is not compulsory but is used in conjunction with MOLWEIGHT.
AUTO
This keyword is not compulsory and can be used in conjunction with NMOL and MOLWEIGHT. It lists the Matthews coefficient and %solvent for an increasing number of molecules in the asymmetric unit, starting from NMOL (default 1) and continuing while the %solvent is greater than 0.0.
XMLOUTPUT
This keyword is of little use for the 'user'. When it is specified, matthews_coef writes a small XML file of the results. The name and location of the XML file can be specified on the command line with XMLFILE; otherwise the file will be called MATTHEWS_COEF.xml.
Example of input
CELL 73.58 38.73 23.19
SYMM 19
MOLW 6600.0
AUTO
XMLO
Example of output file
<?xml version="1.0"?>
<matthews_run>>
  <MATTHEWS_COEF ccp4_version="4.1" date=" 1/25/02" />
  <keyword > </keyword>
  <cell volume=" 66085.78" />
  <result nmol_in_asu=" 1" matth_coef=" 2.503249" percent_solvent=" 50.47841" />
  <result nmol_in_asu=" 2" matth_coef=" 1.251625" percent_solvent=" 0.9568155" />
</matthews_run>
Vm = cell volume (A**3) / (M*nasymu*nmols_asu) = V / (M*Z)

M         = molecular weight of protein in Daltons
V         = volume of unit cell
Z         = number of molecules in unit cell = nasymu*nmols_asu
nasymu    = number of asymmetric units
nmols_asu = number of molecules in asymmetric unit

Molecular weight = number of protein residues in molecule * 110 - very roughly!!!
                 = number of non-hydrogen protein atoms in molecule * 14 - roughly!!!!
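The definition above can be sketched in a few lines of Python. This is an illustrative sketch, not the program's source: the function names are hypothetical, the cell volume uses the standard general (triclinic) formula, and the count of 4 operators for spacegroup 19 (P212121) is supplied by hand.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in A**3 from cell lengths (A) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def matthews_coefficient(volume, mol_weight, nasymu, nmols_asu=1):
    """Vm = V / (M * Z), with Z = nasymu * nmols_asu."""
    return volume / (mol_weight * nasymu * nmols_asu)


# Values from the example input: CELL 73.58 38.73 23.19, SYMM 19, MOLW 6600.0
volume = cell_volume(73.58, 38.73, 23.19)
vm = matthews_coefficient(volume, 6600.0, nasymu=4, nmols_asu=1)
print(f"volume = {volume:.2f} A**3, Vm = {vm:.4f}")
```

For these inputs the sketch gives the same cell volume and Matthews coefficient as the sample XML output for one molecule per asymmetric unit.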
Use RWCONTENTS to read your PDB file if you have one; it will count number of atoms of each type.
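The two rough molecular weight estimates can be compared with the average residue composition quoted under NRES. The sketch below is only a consistency check: the atomic masses are rounded standard values (an assumption, not read from the program), and the helper names are hypothetical.

```python
# Rounded standard atomic masses in Daltons (assumed values, not from matthews_coef).
ATOMIC_MASS = {"C": 12.011, "N": 14.007, "O": 15.999, "H": 1.008, "S": 32.06}

# Average atoms per residue, as quoted in the NRES description.
AVERAGE_RESIDUE = {"C": 5.0, "N": 1.35, "O": 1.5, "H": 8.0, "S": 0.05}


def average_residue_mass():
    """Mass in Daltons of the average residue implied by the composition."""
    return sum(ATOMIC_MASS[el] * n for el, n in AVERAGE_RESIDUE.items())


def mw_from_residues(nres):
    """Very rough molecular weight: 110 Daltons per residue."""
    return 110.0 * nres


def mw_from_heavy_atoms(n_atoms):
    """Rough molecular weight: 14 Daltons per non-hydrogen atom."""
    return 14.0 * n_atoms


heavy_atoms_per_residue = sum(n for el, n in AVERAGE_RESIDUE.items() if el != "H")
print(average_residue_mass())                        # a little above 110
print(mw_from_heavy_atoms(heavy_atoms_per_residue))  # also close to 110
```

Both routes land within a few Daltons of 110 per residue, which is why either estimate is adequate for this calculation.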
Matthews found Vm somewhere between 1.66+ and 4.0+ corresponding to protein contents of 75% to 30% but proteins with higher solvent contents will give higher values of Vm. E.g. for a solvent content of 90%, the Vm would be 12+.
Using this you can calculate Vm assuming nmols_asu = 1/4, 1/2, 1, 2, 3 etc. You MAY be able to narrow down the number of possibilities for nmols_asu. If Vm falls outside the range above, then the assumed number of molecules per asymmetric unit is likely to be incorrect.
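The screening described above might be sketched as follows. The range limits are the 1.66 and 4.0 quoted from Matthews; the function name and the candidate list are hypothetical, and a value outside the range is only a warning sign, since high-solvent crystals give larger Vm.

```python
VM_MIN, VM_MAX = 1.66, 4.0  # typical range quoted in the text


def plausible_nmols(volume, mol_weight, nasymu, candidates=(0.25, 0.5, 1, 2, 3, 4)):
    """Return (nmols_asu, Vm) pairs whose Vm lies inside the typical range."""
    keep = []
    for nmol in candidates:
        vm = volume / (mol_weight * nasymu * nmol)
        if VM_MIN <= vm <= VM_MAX:
            keep.append((nmol, vm))
    return keep


# Example cell from this page: volume 66085.78 A**3, 6600 Daltons, 4 operators.
for nmol, vm in plausible_nmols(66085.78, 6600.0, 4):
    print(nmol, round(vm, 2))
```

For the example cell only one molecule per asymmetric unit falls inside the range.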
Turning this into fraction of protein in asymmetric unit:
Vp = (total mass of protein in unit cell) / (protein density * unit cell volume)

Vp = M*Z*u/(V*Dp) = 1/(N*Dp*Vm)

where
Vp = fraction of protein volume in asymmetric unit
Vm = Matthews number (A**3/Dalton)
Dp = density of protein = 1.35 (g/cc) (ref 1)
N  = Avogadro constant = 6.022*10**23 gmole**(-1)
u  = mass of hydrogen = 1.66*10**-24 g

(It is sufficient to approximate the mass of a hydrogen atom as (1/N) because the mass of 1 mole of hydrogen approximates to 1 g.)

Here V and Vm must be expressed in cc rather than A**3 (1 A**3 = 10**-24 cc); this factor cancels the 10**-24 in u.

==> From this it is easy to obtain the formula derived in Matthews, i.e.

Vp = 1.66*v / Vm = 1.23 / Vm

(1/Dp is Matthews' v = 0.74 cc/g)
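The constant 1.23 can be reproduced numerically from the quantities defined above. This is a sketch with hypothetical function names; the only ingredient not spelled out in the derivation is the unit conversion 1 A**3 = 10**-24 cc.

```python
U_GRAMS = 1.66e-24    # mass of one Dalton in grams, as given in the text
DP = 1.35             # protein density in g/cc, as given in the text
A3_TO_CC = 1.0e-24    # 1 cubic Angstrom expressed in cc


def protein_fraction(vm):
    """Vp for a Matthews coefficient vm given in A**3/Dalton."""
    return U_GRAMS / (DP * A3_TO_CC * vm)


def solvent_percent(vm):
    """Solvent content in percent, taken as the volume not occupied by protein."""
    return 100.0 * (1.0 - protein_fraction(vm))


print(round(protein_fraction(1.0), 2))  # the constant in Vp = 1.23 / Vm
print(solvent_percent(2.5))             # roughly half solvent
```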
Alternatively:
Vp = Np*AV/V

where
Np = number of protein atoms in unit cell (including hydrogens)
AV = average atomic volume in A**3 = 10 approximately
(There are about the same number of hydrogens as C, N, O etc.)

If Vp equals the fraction of protein volume in the asymmetric unit:

Density = Dp*Vp + Ds*(1-Vp)
        = 1.35*Vp + 1.0*(1-Vp)
        = 0.35*Vp + 1.0

where Ds = density of solvent = 1.0 for H2O

therefore Vp = (density - 1.0)/0.35
If you know the density you can work backwards and find the number of molecules in the asymmetric unit exactly.
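Working backwards from a measured density could look like the sketch below. It combines Vp = (density - 1.0)/0.35 with Vp = 1.23/Vm and Vm = V/(M*nasymu*nmols_asu). The function names are hypothetical, and the density of 1.175 g/cc is an invented illustration, not a measurement.

```python
def protein_fraction_from_density(density, dp=1.35, ds=1.0):
    """Vp from crystal density (g/cc), protein density dp and solvent density ds."""
    return (density - ds) / (dp - ds)


def nmols_from_density(density, volume, mol_weight, nasymu, k=1.23):
    """Molecules per asymmetric unit implied by the crystal density.

    Uses Vp = k / Vm and Vm = volume / (mol_weight * nasymu * nmols_asu).
    """
    vp = protein_fraction_from_density(density)
    return vp * volume / (k * mol_weight * nasymu)


# Hypothetical density for the example cell (66085.78 A**3, 6600 Daltons, 4 operators).
n = nmols_from_density(1.175, 66085.78, 6600.0, 4)
print(n)  # close to a whole number; round it to get nmols_asu
```

In practice the result will not be an exact integer, because the protein density and molecular weight are themselves approximate; the nearest whole number (or simple fraction) is the one to take.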
matthews_coef << eof
CELL 73.58 38.73 23.19
symm 19
molweight 6600.0
nmol 1
eof

matthews_coef << eof
CELL 73.58 38.73 23.19
SYMM 19
MOLW 6600.0
AUTO
eof
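The listing an AUTO run produces can be approximated with the sketch below. It is not the program's source: the stopping rule follows the AUTO description (continue while %solvent is greater than 0.0), the solvent content uses the rounded constant 1.23 from the derivation above, and the function name is hypothetical. The program's own sample output reports slightly different percentages, so it presumably uses a slightly different constant.

```python
def auto_table(cell_volume, mol_weight, n_operators, nmol_start=1):
    """List (nmol, Vm, %solvent) rows while the solvent content stays positive."""
    rows = []
    nmol = nmol_start
    while True:
        vm = cell_volume / (mol_weight * n_operators * nmol)
        solvent = 100.0 * (1.0 - 1.23 / vm)
        if solvent <= 0.0:
            break  # Vm shrinks as nmol grows, so the loop always ends here
        rows.append((nmol, vm, solvent))
        nmol += 1
    return rows


# Example cell from this page: volume 66085.78 A**3, MOLW 6600.0, SYMM 19 (4 operators).
print("Nmol/asym  Matthews Coeff  %solvent")
for nmol, vm, solvent in auto_table(66085.78, 6600.0, 4):
    print(f"{nmol:9d}  {vm:14.2f}  {solvent:8.1f}")
```

For this cell the sketch stops after two rows, matching the two result entries in the sample XML output.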
Originator: Misha Isupov
Additions by: Alun Ashton a.w.ashton@ccp4.ac.uk, Eleanor Dodson ccp4@ysbl.york.ac.uk