The Rise of Domain-Specific Large Language Models and Why It Matters to Organizations

It is clear that not many companies can afford to build their own foundation models and large language models (LLMs); it is simply too costly, and even large AI platform vendors are unclear about when the investments will pay off. If you wonder what the difference is between a foundation model and a large language model: both are types of systems that can perform a wide range of tasks based on data. However, large language models are specifically focused on language-related data, such as text, while foundation models can handle multimodal data, such as text, images, audio, and video. Organizations such as OpenAI, Microsoft, Google, Meta, and many others spend tens of billions on training these models.
It has been interesting to see that many startups are investing heavily in model training (a big part of their budgets), and investors keep pouring money into them. If I were on the board of directors of one of these startups, I would want to know that what the company is investing in will not be blown away by the big vendors and their investments.
I have talked to some ISVs over the past few months, and when listening to their stories and what they are doing, I have wondered whether some of these companies will survive in the long run. This phenomenon is not new; in my previous life as head of development/CTO for software companies, we would always question whether to build or buy the components we wanted to include in our solution. A good example was a graphics engine for our business intelligence/analytics solution. At the time, we could not find an engine with APIs that allowed us to build the graphics we needed, so we built our own. Today, that would be crazy, pure lunacy.
So, the question that any software vendor or end-user organization should ask themselves is this: If I use generic large language models, how can I get them to understand my business, my industry, or my specific function (like legal or accounting)? That is where domain-specific concepts come into play.
Domain-specific thinking is not new; in fact, my second TELLUS International customer, Metacase (a leader in domain-specific modeling tools), has been working with domain-specific concepts for years, and now the same line of thinking is finally coming to the world of AI and specifically LLMs. I loved traveling with Juha-Pekka Tolvanen (CEO of Metacase) to different software development conferences, giving talks about domain-specific language models, etc. You should check out the book "Domain-specific Modeling - Enabling Full Code Generation" by Steven Kelly and Juha-Pekka Tolvanen.
The topic of domain-specific or specialized large language models is coming into play, and today VentureBeat had an article explaining how San Francisco-based AI company Writer has launched two specialized large language models (LLMs) tailored specifically for the healthcare and financial services industries, potentially reshaping how these highly regulated sectors adopt artificial intelligence (I wish the European Union would get its AI legislation in order before it is too late from a competitive perspective).
Writer has released two new specialized models, Palmyra-Med-70b and Palmyra-Fin-70b (one for medicine and one for finance), and both are now available as open-source offerings on AI platforms such as Nvidia, Baseten, and Hugging Face. The claim is that these models significantly outperform larger, generalized AI models such as GPT-4 in domain-specific tasks. Writer CTO Waseem Alshikh states that "those models require not just engineering, but a special type of data and a special type of expertise. You actually need experts to help you build those models."
The VentureBeat article also discusses how industries grapple with leveraging AI's potential while navigating complex regulatory environments. Unfortunately, I feel that Europe might be left behind, as regulators and politicians have made it difficult for businesses to know whether it is worthwhile to invest in AI in Europe. Meta, Apple, and others have already concluded that they won't release some AI-related technologies in Europe.
What is interesting to see is that the discussion of open-sourcing LLMs is on the rise. I wrote about this last week in another article, and now organizations such as Meta and Mistral have decided to open-source their models. The models from Writer are open-sourced as well. Writer's CTO claims that open-sourcing models could also address the longstanding concerns about AI safety and regulatory compliance in highly regulated industries. He also says that Writer will continue to add domain-specific models in the future, and I believe this is the route that many ISVs should take as well.
You might also run into Small Language Models (SLMs), which are lightweight generative AI models. Compared to large language models, SLMs have fewer parameters and don't need as much data or training time. SLMs are more efficient to train and deploy and offer greater customization potential. Proponents also argue that, when trained on curated domain data, they are less likely to exhibit biases or generate factually inaccurate information and that they offer advantages regarding safety and security compared to LLMs.
Independent software vendors and any organization adding AI to their IP (intellectual property) should carefully consider what they should be doing and how. There is a big risk in investing money and time in something that will be replaced within months by the larger players. For almost 19 years, I have been advising TELLUS clients to be well segmented, to understand the value proposition they intend to deliver to their customers, and to understand those customers' willingness to pay.
Do not try to be another "Microsoft", "Google", or "Meta"; pick your battles and be super focused on delivering results. If you try to capture the "entire world" with your solution, and especially if you try to be horizontal (anybody or any industry), it will be truly hard to position and sell the solution. I have opened markets for ISVs in North America, Europe, Australia, and New Zealand, and having a "Swiss Army knife" is not easy. By that, I mean it is hard to quickly position the solution for a prospective client if the claim is that the solution can do everything under the sun.
I would love to hear your thoughts on this topic.
Yours,
Dr. Petri I. Salonen
PS. If you would like to get my Business Models in the AI Era newsletter in your inbox on a weekly or bi-weekly basis, you can subscribe here on LinkedIn: https://www.linkedin.com/newsletters/business-models-in-the-ai-era-7165724425013673985/
