Learners will gain the skills to serve powerful language models as practical and scalable web APIs. They will learn how to use the llama.cpp example server to expose a large language model through a set of REST API endpoints for tasks like text generation, tokenization, and embedding extraction.

Beginning Llamafile for Local Large Language Models (LLMs)

What you'll learn
Learn how to serve large language models as production-ready web APIs using the llama.cpp framework
Understand the architecture and capabilities of the llama.cpp example server for text generation, tokenization, and embedding extraction
Gain hands-on experience in configuring and customizing the server using command line options and API parameters
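
To give a feel for what calling such a server looks like, here is a minimal Python sketch rather than material from the course itself. It assumes a llama.cpp example server is already running locally at http://127.0.0.1:8080 and exposes the /completion, /tokenize, and /embedding endpoints; the address, the helper names (build_request, call, demo), and the sample prompt are illustrative assumptions, and the exact request and response fields can differ between server versions.

```python
import json
import urllib.request

# Assumed address of a locally running llama.cpp example server.
BASE_URL = "http://127.0.0.1:8080"


def build_request(path, payload, base_url=BASE_URL):
    """Build a JSON POST request for one endpoint (performs no network I/O)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(path, payload, base_url=BASE_URL, timeout=60):
    """Send the request and decode the JSON reply; requires a running server."""
    request = build_request(path, payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def demo():
    """Hypothetical walk-through of the three tasks named in the description."""
    # Text generation: n_predict caps the number of generated tokens.
    completion = call("/completion", {"prompt": "Hello, my name is", "n_predict": 16})
    print(completion.get("content"))

    # Tokenization: the reply is expected to carry a list of token ids.
    tokens = call("/tokenize", {"content": "Hello, my name is"})
    print(tokens.get("tokens"))

    # Embedding extraction: the server generally needs embeddings enabled at startup.
    embedding = call("/embedding", {"content": "Hello, my name is"})
    print(embedding)
```

Calling demo() only works while a server is listening at the assumed address; build_request can be inspected on its own to see the exact URL and JSON body that would be sent.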
Details to know

4 assignments

There is 1 module in this course