Inferring the Phylogeny of Large Language Models and Predicting their Performances in Benchmarks

by oqtey April 19, 2025


[Submitted on 6 Apr 2024 (v1), last revised 16 Jun 2024 (this version, v3)]

PhyloLM: Inferring the Phylogeny of Large Language Models and Predicting their Performances in Benchmarks, by Nicolas Yax and 2 other authors


Abstract: This paper introduces PhyloLM, a method adapting phylogenetic algorithms to Large Language Models (LLMs) to explore whether and how they relate to each other and to predict their performance characteristics. Our method calculates a phylogenetic distance metric based on the similarity of LLMs’ output. The resulting metric is then used to construct dendrograms, which satisfactorily capture known relationships across a set of 111 open-source and 45 closed models. Furthermore, our phylogenetic distance predicts performance in standard benchmarks, thus demonstrating its functional validity and paving the way for a time- and cost-effective estimation of LLM capabilities. To sum up, by translating population genetic concepts to machine learning, we propose and validate a tool to evaluate LLM development, relationships and capabilities, even in the absence of transparent training information.
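To make the idea in the abstract concrete, here is a minimal sketch of an output-similarity distance followed by a dendrogram. The abstract does not state the formula, so everything specific here is an assumption: the sketch treats each prompt as a "gene" and each sampled next token as an "allele", uses Nei's standard genetic distance from population genetics as one plausible metric, and builds the tree with plain average-linkage clustering. The model names, prompts, tokens, and function names are hypothetical toy values, not data or code from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def allele_frequencies(completions):
    """Map each prompt ("gene") to the relative frequency of each sampled token ("allele")."""
    freqs = {}
    for prompt, tokens in completions.items():
        counts = Counter(tokens)
        total = sum(counts.values())
        freqs[prompt] = {tok: c / total for tok, c in counts.items()}
    return freqs


def nei_distance(freq_a, freq_b):
    """Nei's standard genetic distance over the prompts both models share (assumed metric)."""
    shared = freq_a.keys() & freq_b.keys()
    j_ab = sum(freq_a[g].get(t, 0.0) * p for g in shared for t, p in freq_b[g].items())
    j_a = sum(p * p for g in shared for p in freq_a[g].values())
    j_b = sum(p * p for g in shared for p in freq_b[g].values())
    if j_ab == 0.0:
        return math.inf  # no overlapping outputs at all
    # Clamp at zero to absorb floating-point noise for identical models.
    return max(0.0, -math.log(j_ab / math.sqrt(j_a * j_b)))


def build_dendrogram(names, dist):
    """Average-linkage agglomerative clustering; returns the tree as nested tuples."""
    clusters = [(name, [name]) for name in names]  # (subtree, leaf names)

    def linkage(c1, c2):
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical toy samples: four sampled next tokens per prompt for each model.
samples = {
    "base": {"p1": ["a", "a", "a", "b"], "p2": ["x", "x", "y", "y"]},
    "finetune": {"p1": ["a", "a", "b", "b"], "p2": ["x", "x", "x", "y"]},
    "unrelated": {"p1": ["c", "c", "c", "a"], "p2": ["z", "z", "z", "y"]},
}
freqs = {name: allele_frequencies(comp) for name, comp in samples.items()}
dist = {
    frozenset((a, b)): nei_distance(freqs[a], freqs[b])
    for a, b in combinations(samples, 2)
}
tree = build_dendrogram(list(samples), dist)
print(tree)
```

In this toy setup the two models with overlapping token distributions are merged first and the third joins last, which is the kind of grouping a dendrogram over real models would be read for. A real study would need many prompts and samples per prompt; the tiny counts here only illustrate the mechanics.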

Submission history

From: Nicolas Yax
[v1] Sat, 6 Apr 2024 16:16:30 UTC (4,036 KB)
[v2] Thu, 23 May 2024 16:03:29 UTC (11,764 KB)
[v3] Sun, 16 Jun 2024 14:39:20 UTC (11,764 KB)

