Does anyone know how best to optimize a Rosetta build to take advantage of AMD EPYC (Zen 3/Zen 4) and Ryzen architectures?
AMD offers an optimizing C/C++/Fortran compiler suite called AOCC (https://developer.amd.com/wp-content/resources/57222_AOCC_UG_Rev_3.2.pdf). What might be the best way to make use of this? Can it be invoked directly using the "cxx=" flag in "scons.py", or must one configure the OS directly, and if so, how?
Some newer EPYC processors offer very large L3 caches (768 MB, and soon over 1 GB). Can Rosetta benefit from this extra cache as-is, or is there a way to have Rosetta make specific use of it, either at build time or at runtime?
AMD offers a number of optimized numerical libraries collectively called AOCL (https://developer.amd.com/amd-aocl/). Does Rosetta make use of any of these libraries? Might it in the future?
Any input, suggestions or pointers to more info are welcome.