Protein identification made easy with DeepTracer-ID
by Awakhiwe Makalima
The success of the human race owes much to our continued efforts to find quicker and easier ways of getting things done. From healthcare to transportation to energy, numerous innovations have helped us break barriers as we progress towards a better tomorrow. The field of protein studies has been no exception.
Over the past decade, advances in electron detection systems and image-analysis software have catalyzed a “resolution revolution” in cryo-electron microscopy (cryo-EM), with the number of structures determined to atomic resolution increasing exponentially each year. Because the cryo-EM approach has fewer restrictions in terms of sample purity, concentration and volume, atomic structures have even been determined directly from cell extracts. However, this “flexibility” is where the problem often begins.
Imagine you’re working with Mycobacterium tuberculosis, and you’ve put a lot of work into identifying an important protein complex that is involved in bacterial survival within macrophages and whose structure has yet to be determined. You’ve gone through a cryo-EM workflow and used software such as RELION to obtain a near-atomic resolution cryo-EM map. The issue is that you now need additional sequence information about your protein; without it, building an atomic model is impossible. At this point, the common solutions would be techniques such as tandem mass spectrometry and/or bioinformatics. The former is not always easily accessible or affordable, and the results of the latter are not always easy to interpret.
So how would you identify your protein without the hassle of conducting more expensive and time-consuming experiments? Well, Luca Chang and his team have come up with just the right tool, and they’ve called it DeepTracer-ID.
This server-based approach first requires the user to input a cryo-EM map, which DeepTracer uses to generate a 3D model trace. The user then supplies an easily obtainable AlphaFold2 protein library for the organism of interest, after which three different alignment algorithms can be used to align the AlphaFold2-predicted structures to the generated 3D model trace. The aligned predictions are then statistically scored and listed from lowest to highest score, with the correct protein of interest predicted to be among those with the lowest scores.
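To make the final ranking step concrete, here is a minimal Python sketch of sorting candidate proteins so that the lowest scores come first. It is only an illustration, not the DeepTracer-ID implementation: the `Candidate` class, the `rank_candidates` function, the protein names and the score values are all hypothetical, and the real scoring statistics are not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """One AlphaFold2 prediction aligned to the map-derived trace (hypothetical)."""
    protein_id: str  # made-up identifier for a library entry
    score: float     # made-up alignment score; lower is assumed to mean a better fit


def rank_candidates(candidates: List[Candidate],
                    top_n: Optional[int] = None) -> List[Candidate]:
    """Return candidates ordered from lowest to highest score."""
    ranked = sorted(candidates, key=lambda c: c.score)
    return ranked if top_n is None else ranked[:top_n]


# Invented scores for three entries of an imaginary protein library.
library_scores = [
    Candidate("protein_A", 12.4),
    Candidate("protein_B", 3.1),
    Candidate("protein_C", 7.8),
]

for rank, candidate in enumerate(rank_candidates(library_scores), start=1):
    print(rank, candidate.protein_id, candidate.score)
```

In this toy example, `protein_B` lands at the top of the list, so it would be the first candidate to inspect against the cryo-EM map.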

The simplicity of this approach has the potential to open doors for incredible breakthroughs in the world of molecular imaging. As we continue to try to better understand the mechanisms by which biochemical processes occur at the molecular level, for applications such as disease control, good visual information to study is key. Returning to the Mycobacterium tuberculosis example from earlier, multiple studies have reported proteomic evidence of thousands of exported proteins, hundreds of which are associated with the strategies the bacterial cells use to survive within macrophages and cause disease. A search of protein data banks, however, returns almost no structures for these exported proteins or for the membrane proteins that could be involved in exporting them out of the bacterial cells. Tools such as DeepTracer-ID could make it easier to increase the structural information available on protein complexes, which is crucial to answering many of the open questions in research.
Reference
Chang L., Wang F., Connolly K., Meng H., Su Z., Cvirkaite-Krupovic V., Krupovic M., Egelman E.H., Si D. 2022. DeepTracer-ID: De novo protein identification from cryo-EM maps. Biophysical Journal. Volume 121, Issue 15, Pages 2840-2848.