How do Large Language Models understand Genes and Cells
Fang, C.; Wang, Y.; Song, Y.; Long, Q.; Lu, W.; Chen, L.; Wang, P.; Feng, G.; Zhou, Y.; Li, X.
Abstract
Studying genes and their interactions is crucial for deciphering the fundamental laws of biological activity, advancing disease treatment, and supporting drug discovery. Large language models (LLMs), with their strong text comprehension and generation capabilities, have made significant strides across various natural science fields. However, their application in cell biology remains scarce. To address this gap, we select seven mainstream LLMs and evaluate their performance across a range of problem scenarios. Our findings indicate that LLMs possess a certain level of understanding of genes and cells and hold potential for solving real-world problems. Moreover, we improve on the current method for representing cells as text, enhancing the LLMs' ability to tackle cell annotation tasks. We encourage cell biology researchers to leverage LLMs for problem-solving while remaining mindful of the challenges associated with their use. We release our code and data at https://github.com/epang-ucas/Evaluate_LLMs_to_Genes.