{"id":243215,"date":"2024-07-14T09:03:02","date_gmt":"2024-07-14T09:03:02","guid":{"rendered":"https:\/\/michigandigitalnews.com\/index.php\/2024\/07\/14\/llama-3-fine-tuning-achieves-90-of-gpt-4s-performance-at-lower-cost\/"},"modified":"2025-06-25T17:14:46","modified_gmt":"2025-06-25T17:14:46","slug":"llama-3-fine-tuning-achieves-90-of-gpt-4s-performance-at-lower-cost","status":"publish","type":"post","link":"https:\/\/michigandigitalnews.com\/index.php\/2024\/07\/14\/llama-3-fine-tuning-achieves-90-of-gpt-4s-performance-at-lower-cost\/","title":{"rendered":"Llama-3 Fine-Tuning Achieves 90% of GPT-4&#8217;s Performance at Lower Cost"},"content":{"rendered":"<p> [ad_1]<br \/>\n<\/p>\n<div>\n<figure class=\"figure mt-2\">&#13;<br \/>\n                                &#13;<\/p>\n<p>&#13;<br \/>\n                                    <a href=\"https:\/\/blockchain.news\/Profile\/Luisa-Crawford\">Luisa Crawford<\/a>&#13;<br \/>\n                                    <span class=\"publication-date ml-2\"> Jul 14, 2024 02:46<\/span>&#13;\n                                <\/p>\n<p>&#13;<\/p>\n<p class=\"lead\">Llama-3 fine-tuning demonstrates significant performance gains, achieving 90% of GPT-4&#8217;s accuracy at a fraction of the cost, according to together.ai.<\/p>\n<p>&#13;<br \/>\n                                <a href=\"https:\/\/image.blockchain.news:443\/features\/FCAF30107F93017A469BDB76DCCE7D957DFC034943E2204CF5967AAF05B60663.jpg\">&#13;<br \/>\n                                    <img decoding=\"async\" class=\"rounded\" src=\"https:\/\/image.blockchain.news:443\/features\/FCAF30107F93017A469BDB76DCCE7D957DFC034943E2204CF5967AAF05B60663.jpg\" alt=\"Llama-3 Fine-Tuning Achieves 90% of GPT-4's Performance at Lower Cost\"\/>&#13;<br \/>\n                                <\/a>&#13;<br \/>\n                            <\/figure>\n<p>The success of Llama-3 has been remarkable, showcasing that open-source models are closing the gap with their closed-source counterparts, according to <a rel=\"nofollow\" href=\"https:\/\/www.together.ai\/blog\/finetuning\">together.ai<\/a>. By leveraging proprietary data, customers have been able to fine-tune smaller open-source software (OSS) models like Llama-3 to achieve higher accuracy than top-tier closed-source models.<\/p>\n<h2>Fine-Tuning Process<\/h2>\n<p>Together AI&#8217;s platform allows users to fine-tune Llama-3-8B on proprietary data, creating custom models that outperform larger OSS alternatives like Llama-3-70B and are comparable to leading closed-source models like GPT-4, all at a fraction of the cost. A detailed guide demonstrates how a fine-tuned Llama-3 8B model improved from 47% accuracy to 65%, surpassing Llama-3-70B&#8217;s 64% and nearing GPT-4&#8217;s 71% accuracy.<\/p>\n<p>The fine-tuning process involves several steps, including dataset transformation, uploading and verifying datasets, starting a fine-tuning job, and running evaluations to compare the results. The initial step requires downloading the Math Instruct dataset from HuggingFace, cleaning it up, and transforming it into a JSONL file format suitable for Together&#8217;s platform.<\/p>\n<h2>Dataset Transformation<\/h2>\n<p>The transformation process involves loading the original JSON data, defining the Llama-3 prompt format, and converting the data into the correct format. 
This formatted dataset is then validated with Together's SDK before being uploaded for fine-tuning.

Uploading and Fine-Tuning

Once the dataset is prepared, it is uploaded to Together AI via the Python SDK. The fine-tuning job is then created against the Llama-3-8B base model, specifying the dataset, the number of epochs, and other hyperparameters. Users can monitor the job's progress through Together AI's dashboard; the SDK calls involved are sketched below.
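A minimal sketch of the validate-upload-launch sequence, assuming the `together` Python package (pip install together) and the JSONL file produced above; the epoch count is illustrative rather than the guide's exact setting:

```python
# Sketch: validate, upload, and launch a fine-tuning job via Together's
# Python SDK. Hyperparameters here are illustrative, not the guide's.
from together import Together
from together.utils import check_file

client = Together()  # reads TOGETHER_API_KEY from the environment

# Validate the formatted JSONL locally before uploading.
report = check_file("mathinstruct_formatted.jsonl")
assert report["is_check_passed"], report

# Upload the training file and start the job on the Llama-3-8B base model.
train_file = client.files.upload(file="mathinstruct_formatted.jsonl")
job = client.fine_tuning.create(
    training_file=train_file.id,
    model="meta-llama/Meta-Llama-3-8B",
    n_epochs=3,  # illustrative value
)
print(job.id)  # track this job in the Together AI dashboard or via the API
```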
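Once the job completes, the resulting checkpoint can be served on Together AI and scored against held-out problems, as described in the next section. The sketch below assumes a placeholder model id for the deployed fine-tune and uses a deliberately naive answer-extraction helper; the guide's actual evaluation harness is not shown in the article:

```python
# Sketch: score the fine-tuned model on held-out math problems.
# MODEL is a placeholder for whatever id Together assigns your
# fine-tuned checkpoint, and extract_answer is a naive stand-in.
import re
from together import Together

client = Together()
MODEL = "your-account/Meta-Llama-3-8B-mathinstruct-ft"  # placeholder id

def extract_answer(text: str) -> str:
    # Naive heuristic: take the last number in the completion.
    nums = re.findall(r"-?\d+(?:\.\d+)?", text)
    return nums[-1] if nums else ""

def accuracy(problems: list[dict]) -> float:
    # problems: [{"question": ..., "answer": ...}, ...] held-out items
    correct = 0
    for p in problems:
        resp = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": p["question"]}],
            max_tokens=512,
        )
        if extract_answer(resp.choices[0].message.content) == str(p["answer"]):
            correct += 1
    return correct / len(problems)
```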
Evaluation and Results

After fine-tuning, the model's performance is evaluated on 1,000 math problems, comparing the fine-tuned Llama-3-8B against the base Llama-3-8B, Llama-3-70B, and GPT-4. The fine-tuned model achieved 65.2% accuracy, outperforming the base model's 47.2% and Llama-3-70B's 64.2%, and coming close to GPT-4's 71.4%.

In other words, the fine-tuned Llama-3-8B model beat the base model by nearly 20 percentage points, surpassed the top OSS model Llama-3-70B, and reached over 90% of GPT-4's accuracy. Additionally, the fine-tuned model is faster, roughly 50 times cheaper to run than GPT-4, and comes with full ownership of the model and its weights.

Conclusion

This fine-tuning approach demonstrates that small open-source models like Llama-3-8B can be customized to perform specific tasks with high accuracy, speed, and cost-efficiency. Users can leverage their proprietary data to fine-tune a model and either host it on Together AI or run it independently, maintaining full control and ownership.

The Llama-3-8B model trained on math problems outperformed leading OSS models and approached GPT-4's performance, with a total fine-tuning cost of less than $100 on Together AI.

Image source: Shutterstock

Source: https://blockchain.news/news/llama-3-fine-tuning-achieves-90-percent-gpt-4-performance-lower-cost