I tried to model a protein sequence (4 chains) using RosettaCM. The structure of the protein is already available in the PDB (2ERJ). The problem I am facing is that in the final modelled structure all 4 chains are very far apart, unlike in the PDB structure. What exactly is going wrong?
I am obtaining the alignment file from Clustal Omega. The target FASTA sequence is given as chain1/chain2/chain3/chain4, and the 2ERJ FASTA is given to Clustal Omega in the same format, so that the final alignment file has only 2 unique identifiers, like this:
2ERJ1|Chains    GMLSLELCDDDPPEIPHATFKAMAYKEGTMLNCECKRGFRRIKSGSLYMLCTGSSSHSSW 60
IL2Ra           -----ELCDDDPPEIPHATFKAMAYKEGTMLNCECKRGFRRIKSGSLYMLCTGNSSHSSW 55
                     ************************************************.******
2ERJ1|Chains    DNQCQCTSSATRSTTKQVTPQPEEQKERKTTEMQSPMQPVDQASLPGHCREPPPWENEAT 120
IL2Ra           DNQCQCTSSATRNTTKQVTPQPEEQKERKTTEMQSPMQPVDQASLPGHCREPPPWENEAT 115
                ************.***********************************************
2ERJ1|Chains    ERIYHFVVGQMVYYQCVQGYRALHRGPAESVCKMTHGKTRWTQPQLICTGEMETSQFPGE 180
IL2Ra           ERIYHFVVGQMVYYQCVQGYRALHRGPAESVCKMTHGKTRWTQPQLICTGEMETSQFPGE 175
                ************************************************************
2ERJ1|Chains    EKPQASPEGRPESETSCLVTTTDFQIQTEMAATMETSTGHHHHH---------------- 224
IL2Ra           EKPQASPEGRPESETSCLVTTTDFQIQTEMAATMETSIFTTEYQVAVAGCVFLLISVLLL 235
                *************************************    .::
and so on.
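For reference, the chain1/chain2/chain3/chain4 input described above can be put together roughly as in the sketch below. This is only a minimal illustration of that "/"-joined layout, not a statement of what RosettaCM or Clustal Omega require; the helper name `join_chains`, the record name `target`, and the short toy sequences are all made up for the example.

```python
def join_chains(name, chain_seqs, width=60):
    """Build one FASTA record with the chains separated by '/'.

    Hypothetical helper: it only reproduces the chain1/chain2/... layout
    described in the post and does not validate the sequences.
    """
    joined = "/".join(seq.strip().upper() for seq in chain_seqs)
    # Wrap the joined sequence at a fixed line width, as FASTA files usually do.
    lines = [joined[i:i + width] for i in range(0, len(joined), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Toy sequences for illustration only (not the real 2ERJ chains).
record = join_chains("target", ["ELCDDD", "APTSSS", "AVNGTS", "LNTTIL"])
print(record, end="")
```

The same function can be used for both the target and the template record so that the two entries keep the same chain order.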
I used the following commands:
~wholepath/tools/protein_tools/scripts/setup_RosettaCM.py --fasta IL2Ra.fasta --alignment alignmentfile.aln --alignment_format clustalw --templates 2ERJ.pdb
~wholepath/main/source/bin/rosetta_scripts.default.linuxgccrelease @flags -database /wholepath/main/database -nstruct 10
What am I doing wrong that leads to all the modelled chains ending up very far apart in the final structures?
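To put a number on "very far apart", the distance between chain centroids in a model can be compared with the same distances in 2ERJ. The sketch below is a minimal version of that check. It assumes standard fixed-column PDB ATOM records and that the model keeps a chain ID in column 22, which may not hold for every output; the function names and the toy coordinates are hypothetical.

```python
import itertools
import math
from collections import defaultdict


def chain_centroids(pdb_lines):
    """Average the CA coordinates of each chain (chain ID read from column 22)."""
    coords = defaultdict(list)
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            coords[line[21]].append(xyz)
    return {
        chain: tuple(sum(axis) / len(axis) for axis in zip(*points))
        for chain, points in coords.items()
    }


def centroid_distances(centroids):
    """Pairwise centroid distances in angstroms, keyed by (chain_a, chain_b)."""
    return {
        (a, b): math.dist(centroids[a], centroids[b])
        for a, b in itertools.combinations(sorted(centroids), 2)
    }


def _toy_atom(serial, chain, x, y, z):
    # Hypothetical CA record laid out in the fixed PDB columns, for the demo only.
    return "ATOM  %5d  CA  ALA %s%4d    %8.3f%8.3f%8.3f" % (serial, chain, serial, x, y, z)


toy_model = [
    _toy_atom(1, "A", 0.0, 0.0, 0.0),
    _toy_atom(2, "A", 2.0, 0.0, 0.0),
    _toy_atom(3, "B", 1.0, 0.0, 30.0),
]
distances = centroid_distances(chain_centroids(toy_model))
for pair, dist in distances.items():
    print(pair, round(dist, 2))
```

For a real comparison, the lines of the model file and of 2ERJ.pdb would be passed in instead of `toy_model`, and the two sets of distances compared chain pair by chain pair.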
Kindly help.