Selective State Space Models, GPT but better?

Large language models (LLMs) like GPT have been nothing short of astonishing at handling complex tasks such as language translation, document summarisation and chatbots, and the list goes on. However, they have some notable drawbacks: they struggle with long sequences of data, and their heavy computational demands come with an energy footprint that is often compared to that of a small country.

Selective State Space Models (SSMs) are a newer approach that could be more efficient and effective in certain scenarios than the Transformer architecture behind most of today’s LLMs. Here’s why SSMs might be a better fit:

Efficiency with Long Sequences

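SSMs can process long sequences of data more efficiently than Transformers. A Transformer’s self-attention compares every token with every other token, so its cost grows roughly with the square of the sequence length. An SSM reads the sequence in a single pass while updating a compact internal state, so its cost grows roughly in line with the length. That makes SSMs a natural fit for tasks involving lengthy documents or records.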
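
To put rough numbers on that, here is a minimal Python sketch. It only counts units of work, assuming full self-attention scores every token against every other token while an SSM performs one state update per token. The function names are made up for illustration, and real costs also depend on model width, hardware and implementation.

```python
def attention_pair_count(seq_len: int) -> int:
    """Token-pair scores computed by full self-attention: n * n."""
    return seq_len * seq_len


def scan_step_count(seq_len: int) -> int:
    """State updates performed by a recurrent scan: one per token."""
    return seq_len


def work_ratio(seq_len: int) -> int:
    """How many times more units of work attention does than a scan."""
    return attention_pair_count(seq_len) // scan_step_count(seq_len)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7,} tokens: attention does {work_ratio(n):,}x the work of a scan")
```

The gap widens as the sequence gets longer, which is why the difference matters most for long documents.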

Selective Processing

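A Transformer keeps every token in its context and scores them all against each other. A selective SSM instead decides, token by token, what to write into its compact running state and what to let go, so it can focus on the most relevant parts of the data. This selectivity can make SSMs faster and lighter on resources, which is crucial when working with extensive datasets.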
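
The idea of selection can be sketched in a few lines of Python. This is a toy, not Mamba’s actual implementation: it assumes a single-number state and a gate that depends on the size of each input, and the names `selective_scan`, `w` and `b` are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def selective_scan(xs, w=4.0, b=-2.0):
    """Toy selective recurrence: h_t = (1 - g_t) * h_{t-1} + g_t * x_t.

    The gate g_t = sigmoid(w * |x_t| + b) is computed from the input itself,
    so large inputs overwrite the state and small ones barely touch it.
    """
    h = 0.0
    states = []
    for x in xs:
        g = sigmoid(w * abs(x) + b)  # input-dependent gate (assumed form)
        h = (1.0 - g) * h + g * x    # keep part of the old state, write part of the input
        states.append(h)
    return states


if __name__ == "__main__":
    # One salient value surrounded by zeros.
    for step, h in enumerate(selective_scan([0.0, 0.0, 5.0, 0.0, 0.0])):
        print(step, round(h, 3))
```

The state jumps when the large value arrives and fades only slowly afterwards, while the loop itself never holds more than one number, however long the input is.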

Adaptability 

SSMs are highly adaptable to different contexts. Their parameters depend on the input itself, so they adjust how they handle data as they read it, and like other models they can be fine-tuned to perform better in specific applications.


In the legal field, where we deal with large volumes of text and need to extract relevant information quickly, SSMs could offer several promising applications:

Document Review and Summarisation

SSMs could streamline the review process by focusing on the most relevant parts of legal documents, making it easier to summarise contracts, case files, and legislation.

E-Discovery

During the discovery phase of litigation, SSMs could help by efficiently sifting through massive amounts of digital evidence, highlighting the pieces of information most pertinent to the case.

Legal Research

SSMs could enhance legal research tools by more effectively scanning through legal texts, case law, and other resources to pinpoint relevant information, saving time and improving the accuracy of research findings.

Now these are all tasks that LLMs have been looking to solve. However, by filtering out irrelevant information, SSMs could reduce noise in the data, making the review process faster and more accurate. Requiring less computational power and memory also makes them more practical for all of us when we look to our responsibilities in reducing energy use.

Mamba


Now this is great, Ryan, but give me a model. Well, there’s Mamba. Think of Mamba as a super-efficient alternative to the traditional (I know, just a few years… but still) models we’ve been using. It skips some of the usual complex steps, most notably the attention mechanism, and instead uses a clever way of processing data that speeds things up significantly: its authors report up to five times the inference throughput of Transformers.

Mamba can handle extremely long sequences of data without getting too bogged down, which could make it awesome in areas like language and audio processing. In fact, its authors report that it outperforms Transformers of the same size and matches ones twice its size on language modelling. This suggests that SSMs, with models like Mamba, could bring state-of-the-art performance to the world of legal tech.
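
As a rough illustration of why sequence length hurts less, the sketch below compares what each kind of model has to keep in memory while generating text. It assumes a Transformer caches one key and one value vector per token per layer, and that an SSM keeps a fixed-size state per layer. The layer count and sizes are illustrative assumptions, not Mamba’s published configuration.

```python
def kv_cache_values(seq_len: int, n_layers: int, d_model: int) -> int:
    """Numbers held in a Transformer's key/value cache: grows with every token."""
    return 2 * n_layers * seq_len * d_model


def ssm_state_values(n_layers: int, d_inner: int, d_state: int) -> int:
    """Numbers held in an SSM's recurrent state: independent of sequence length."""
    return n_layers * d_inner * d_state


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only to make the comparison concrete.
    layers, d_model, d_inner, d_state = 24, 1024, 2048, 16
    fixed = ssm_state_values(layers, d_inner, d_state)
    for n in (1_000, 100_000):
        cache = kv_cache_values(n, layers, d_model)
        print(f"{n:>7,} tokens: KV cache {cache:,} values, SSM state {fixed:,} values")
```

The cache keeps growing as the document gets longer, while the SSM state stays the same size from the first token to the last.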


SSMs could present a more efficient way to handle extensive and complex data, offering potential improvements over current LLMs, particularly in fields like law where dealing with large volumes of text is a big challenge we’re all looking to solve right now.