I am using the antibody.linuxgccrelease application to model camelid heavy-chain-only antibodies.
When using the command:
antibody.linuxgccrelease -exclude_homologs true -vhh_only -fasta my_fasta.fa | tee grafting.log
for example with my_fasta.fa:
>heavy
QVQLQESGPSLVKPSQTLSLTCTVSGLSLSDHNVGWIRQAPGKALEWLGVIYKEGDKDY
NPALKSRLSITKDNSKSQVSLSLSSVTTEDTATYYCATLGCYFVEGVGYDCTYGLQHTTF
HDAWGQGLLVTVSS
I often get models whose sequence differs from the input FASTA. Most often the last serine is missing (as with the my_fasta.fa example above), but sometimes there are additional amino acids at the beginning or missing amino acids at the end.
The weirdest result I got was when running with the following FASTA file:
>heavy
QVQLVQSGAEVVKPGASVKVSCKASGYAFSSSWMNWVRQAPGQGLEWIGRIYPGDGDTN
YAQKFQGKATLTADKSTSTAYMELSSLRSEDTAVYFCAREYDEAYWGQGTLVTVSS
I got models with the following sequence:
(the first two amino acids and the last amino acids are missing and an extra Q was added).
This problem causes errors when computing the RMSD. For example, in the CDR3 modeling stage, running antibody_H3.linuxgccrelease with -in:file:native fails because the sequences are different.
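A minimal sketch of how the mismatch can be checked before the RMSD step, assuming the model is a standard PDB file with one CA atom per residue. It is plain Python and not part of Rosetta; the function names are hypothetical, and it only reports where the sequences differ, not why the grafting step changed them.

```python
from difflib import SequenceMatcher

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def read_fasta_sequence(text):
    """Return the first sequence in FASTA text, joining wrapped lines."""
    parts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if parts:  # a second record starts here, stop
                break
            continue
        parts.append(line.replace(" ", ""))
    return "".join(parts)


def pdb_sequence(text, chain=None):
    """Return the one-letter sequence from the CA atoms of PDB text.

    Assumption: fixed-column PDB ATOM records. Unknown residue names
    are reported as 'X'. If chain is given, only that chain is read.
    """
    seq = []
    seen = set()
    for line in text.splitlines():
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        chain_id = line[21:22]
        if chain is not None and chain_id != chain:
            continue
        residue_key = (chain_id, line[22:27])  # number plus insertion code
        if residue_key in seen:  # skip alternate locations
            continue
        seen.add(residue_key)
        seq.append(THREE_TO_ONE.get(line[17:20], "X"))
    return "".join(seq)


def describe_differences(expected, modeled):
    """List (operation, expected_part, modeled_part) for each mismatch."""
    matcher = SequenceMatcher(None, expected, modeled, autojunk=False)
    return [
        (tag, expected[i1:i2], modeled[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

To use it, read my_fasta.fa and the model PDB into strings, pass them to read_fasta_sequence and pdb_sequence, and give both results to describe_differences. An empty list means the model matches the input; a "delete" entry means residues from the FASTA are missing in the model, and an "insert" entry means the model has extra residues.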
Is there any way to solve this issue?
Really appreciate the help!