NR Database Download in install_dependencies.pl (for make_fragments.pl) Broken

Hello,

I have recently been trying to install the dependencies for make_fragments.pl (/tools/fragment_tools/make_fragments.pl) with the install_dependencies.pl script provided in the same directory. The installation goes fine until the script attempts to download the NR database. The URL hard-coded into the script (ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast-2.9.0+-src.tar.gz) no longer exists. I have tried pointing the script both to the new version of the package (ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast-2.10.0+-src.tar.gz) and to the new location of the old version (ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.9.0/ncbi-blast-2.9.0+-src.tar.gz). In either case, the package is downloaded, update_blastdb.pl is moved and run, and the 39 tarballs of the NR database are successfully downloaded and extracted.
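The two 2.9.0 URLs above suggest that a release tarball moves out of LATEST into a directory named after its version once a newer release appears. A minimal sketch, assuming that URL layout holds (it is inferred only from the URLs quoted in this post and is not verified for other versions), builds a version-pinned URL so that a patched script would not depend on LATEST. The function name blast_src_url is hypothetical and is not part of install_dependencies.pl.

```shell
# Build the version-pinned URL for a BLAST+ source tarball.
# Assumption: the layout <base>/<version>/ncbi-blast-<version>+-src.tar.gz
# is inferred from the 2.9.0 URL quoted in the post, nothing more.
blast_src_url() {
    local version="$1"
    local base="ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+"
    printf '%s/%s/ncbi-blast-%s+-src.tar.gz\n' "$base" "$version" "$version"
}
```

Calling blast_src_url 2.9.0 prints the relocated 2.9.0 URL given above.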

 

The problem is in the subsequent fastacmd step. When the script runs it, I get the following error:

'''

Generating nr fasta. Be very patient ......
[The directory to]/rosetta_bin_linux_2019.35.60890_bundle/tools/fragment_tools/blast/bin/fastacmd -D 1 > /[The directory to]/rosetta_bin_linux_2019.35.60890_bundle/tools/fragment_tools/databases/nr
[fastacmd] ERROR: ERROR: Cannot initialize readdb for nr database

ERROR!
[The directory to]/rosetta_bin_linux_2019.35.60890_bundle/tools/fragment_tools/blast/bin/fastacmd -D 1 > [The directory to]/rosetta_bin_linux_2019.35.60890_bundle/tools/fragment_tools/databases/nr failed.

'''

 

I have tried this in the latest numbered release (rosetta_bin_linux_2019.35.60890_bundle) and the latest release (rosetta_bin_linux_2020.03.61102_bundle), both running on Ubuntu 18.04. I have also tried downloading the "nr" fasta directly to avoid fastacmd, but that led to other complications. In case it helps, I have attached the entire log of "./install_dependencies.pl overwrite" (with the path modified to version 2.9.0 as described above).

 

Any help on how to successfully patch this script would be greatly appreciated. Thank you.
 

 

Attachment: With_Modified_DB_Path.txt (249.26 KB)
Tue, 2020-03-10 08:58
Jacob_Verburgt

Hello,

1. Sorry about the download link; I will fix that for the next release.

2. I don't think I've seen this error before, so I will need more information.

Could you post a list of all the files in the databases folder? (I believe it's ...rosetta_bin_linux_2019.35.60890_bundle/tools/fragment_tools/databases.)

Also, could you post the output of running one of the last lines from your log? That is, run this in the databases folder:

.../rosetta_bin_linux_2019.35.60890_bundle/tools/fragment_tools/blast/bin/fastacmd -D 1 > .../rosetta_bin_linux_2019.35.60890_bundle/tools/fragment_tools/databases/nr

That may give us more information.

Also, in that directory, try running

perl update_blastdb.pl nr

and see what output that gives.

 

Sorry I don't have a more direct answer for you!

Tue, 2020-03-10 10:25
danpf

Hello,

Thank you for your quick response! I have attached the list of files present after running "install_dependencies.pl" (with the modified path mentioned in the original post). Note that the "nr" file is empty because fastacmd failed.

I have run update_blastdb.pl separately from the script, and it is able to download the 39 nr.XX.tar.gz files. After manually extracting these, I get the same files that are present after running "install_dependencies.pl".
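For reference, the manual extraction step can be sketched as a small loop. This is a minimal sketch, assuming the archives follow the nr.XX.tar.gz naming mentioned above and should be unpacked into the directory that holds them; extract_nr_tarballs is a hypothetical helper name, not something shipped with Rosetta or BLAST.

```shell
# Extract every nr.*.tar.gz archive found in a directory, in place.
# Returns non-zero at the first archive that fails to extract.
extract_nr_tarballs() {
    local dir="${1:-.}"
    local archive
    for archive in "$dir"/nr.*.tar.gz; do
        # If the glob matched nothing, the literal pattern is left over; skip it.
        [ -e "$archive" ] || continue
        echo "Extracting $archive"
        tar -xzf "$archive" -C "$dir" || return 1
    done
}
```

Run from the databases folder, this would be invoked as extract_nr_tarballs . after update_blastdb.pl has finished downloading.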

When running the subsequent fastacmd command on its own, I get the same error:

"

../blast/bin/fastacmd -D 1 > ./nr
[fastacmd] ERROR: ERROR: Cannot initialize readdb for nr database
"

From what I can tell, fastacmd is looking for nr.00, nr.01, nr.02, ..., which are not present after extraction. Is it possible that new versions of the NR database no longer contain these files?
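One way to check what the extraction actually produced is to count the nr.* files by their final extension. The sketch below does only that; summarize_nr_files is a hypothetical helper, and it assumes the extracted volumes share the nr. prefix of the tarballs. It does not interpret the extensions or say which ones fastacmd needs.

```shell
# Print "<extension> <count>" for every file named nr.* in a directory.
summarize_nr_files() {
    local dir="${1:-.}"
    local path name
    for path in "$dir"/nr.*; do
        # If the glob matched nothing, the literal pattern is left over; skip it.
        [ -e "$path" ] || continue
        name="${path##*/}"            # drop the directory part
        printf '%s\n' "${name##*.}"   # keep the text after the last dot
    done | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Running summarize_nr_files on the databases folder and posting the result would show at a glance which kinds of volume files are present and how many of each.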

Thank you again for your help, and please let me know if you have any questions!

Tue, 2020-03-10 13:22
Jacob_Verburgt

Thank you for the information!

Unfortunately, I'm still not sure what the problem is.

Could you please post the output of running:

env 

 

or

env > for_forums.log

 

I suspect it's an environment problem at this point.

Tue, 2020-03-10 15:01
danpf

Hello,

I have attached the log of my environment variables. Is there anything in particular that you are looking for? Thank you again for looking into this!

Tue, 2020-03-10 15:23
Jacob_Verburgt

I was mainly looking for suspect variables that I've seen cause problems. I believe there are some BLAST-specific environment variables that can interfere too, but those usually have "BLAST" in their names, and you don't have any set.

However, I have occasionally seen problems with this particular variable. Could you try unsetting it and then running the fastacmd command again?

unset LD_LIBRARY_PATH

 

Unfortunately, I'm sort of out of ideas after this. I'm also on Ubuntu 18.04 and I don't see any issues like this.
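The two checks discussed above can be sketched as small shell helpers. This is a minimal sketch under stated assumptions: both function names are hypothetical, the name filter simply matches "BLAST" anywhere in a variable name, and env -u is assumed to be available (it is in GNU coreutils). Variables with multi-line values may produce stray matches in the first helper.

```shell
# Print environment variables whose names contain "BLAST" (any case),
# plus LD_LIBRARY_PATH if it is set. Prints nothing if none are set.
list_suspect_vars() {
    env | grep -i -E '^([^=]*BLAST[^=]*|LD_LIBRARY_PATH)=' || true
}

# Run a command with LD_LIBRARY_PATH removed for that command only,
# leaving the current shell's environment untouched.
run_without_ld_library_path() {
    env -u LD_LIBRARY_PATH "$@"
}
```

For example, run_without_ld_library_path ../blast/bin/fastacmd -D 1 > ./nr repeats the failing command without the variable, and the shell keeps its original setting afterwards.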

Tue, 2020-03-10 15:32
danpf

Unfortunately, that didn't work either. I was able to get the UniRef50 database to work, so I will move forward using that database for now. Thank you again for your help!

Wed, 2020-03-11 05:18
Jacob_Verburgt