Machines That Can Do Many Things Well
We have developed a Machine Learning model called MPL (Multi-Purpose Learners) that can classify both images and text without any explicit modifications between tasks. Along with this versatility, MPL performs well on common benchmarks after being shown as few as 5 examples. In this blog, we preview MPL and its implications for AI going forward.
Artificial Intelligence has given developers and society the ability to solve problems that could not be solved before with traditional computer science algorithms. A couple of times a year, a new method or model breaks a new record that influences the way the community builds new algorithms. In 2020, there has been a new trend amongst many research groups and companies that will have large implications on technology as a whole: General-Purpose AI Algorithms.
With recent projects like GPT-3 from OpenAI and MuZero from DeepMind, the AI community has seen glimpses into the next generation of Artificial Intelligence. General-Purpose AI refers to a model's ability to do many tasks within a specified context. For example, GPT-3 from OpenAI has shown the ability to do many language tasks (translation, conversation, summarization, etc.) without being explicitly trained to do so.
At clevrML, we are focused on researching new AI methods that are general purpose in nature for two reasons:
1. To enable developers with cutting-edge technologies for their applications
2. To make strides towards Artificial General Intelligence.
With this in mind, we have created a model called "MPL" (Multi-Purpose Learners), which we believe has the potential to provide developers with massive benefits.
Motivation For MPL
Before diving into the details, we'd like to give you some insight into why we decided to build MPL. The motivation for this undertaking can be summarized by these points:
1. Testing New Iterations of Active Memory Learning (AML)
2. Pushing the Boundaries of What's Possible With AML
3. Making Progress Towards Stronger AI Algorithms
We are currently testing new ways Active Memory Learning can be used and improved, and for this project we implemented a slightly different AML architecture that has shown promising results. AML has the potential to transform how developers build and use AI, and we want to see how far it can be pushed in hopes of making more breakthroughs. Finally, we are attempting to discover Machine Learning algorithms that don't need to be changed for every task; this not only moves AI forward but helps developers in the process.
MPL is a Machine Learning model that can classify images and text with one model. MPL uses a newer version of our Active Memory Learning technology, which allows it to learn tasks with as few as 5 examples at low computational overhead. Contrast this with Neural Networks, which need tens of thousands of examples to learn a single task and require high-end processing power.
Example text inputs for MPL: "This was the best day ever!" / "I would not recommend coming here"
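For intuition only, here is a minimal sketch of few-shot classification in general. It uses a generic nearest-centroid approach over feature vectors; this is not our Active Memory Learning method, just an illustration of how a handful of reference examples per class can already support predictions:

```python
# Generic few-shot classification via nearest class centroid.
# Illustrative only: NOT clevrML's Active Memory Learning method.
from collections import defaultdict

def build_centroids(references):
    """references: list of (feature_vector, label) pairs.
    Average the vectors of each class into a single centroid."""
    sums = defaultdict(lambda: None)
    counts = defaultdict(int)
    for vec, label in references:
        if sums[label] is None:
            sums[label] = list(vec)
        else:
            sums[label] = [a + b for a, b in zip(sums[label], vec)]
        counts[label] += 1
    return {lbl: [x / counts[lbl] for x in s] for lbl, s in sums.items()}

def predict(centroids, vec):
    """Return the label whose centroid is closest (squared distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lbl: dist(centroids[lbl], vec))

# A few reference examples per class are enough to form usable centroids.
refs = [([1.0, 0.1], "cat"), ([0.9, 0.2], "cat"),
        ([0.1, 1.0], "dog"), ([0.2, 0.9], "dog")]
centroids = build_centroids(refs)
print(predict(centroids, [0.95, 0.15]))  # -> cat
```

In practice the feature vectors would come from some embedding of the image or text; the two-dimensional toy vectors here are only for readability.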
MPL Challenges and Benchmarks
The challenge with MPL was multi-faceted in nature from the start and could be broken down into two parts:
1. Build one model that can understand images + text and has high adaptability
This is a challenge because there are very few algorithms that are able to predict across different data types, let alone accurately. The method used had to be completely rethought from the very beginning to achieve this goal.
2. The model must be able to generalize with 3 to 5 examples per label/class.
Most state-of-the-art models use hundreds of thousands of examples, if not millions, to achieve high accuracy. We wanted to do this with as few as 3 examples. Access to quality data is one of the largest problems developers face in AI, so we wanted to build something that would address it.
Thankfully, clevrML has a good foundation to work from with our Active Memory Learning (AML) method. AML models are able to generalize well with low amounts of examples, but the method hasn't been tested in one model that has to predict different types of inputs.
As mentioned earlier, one of the goals with MPL was to show the strength of Active Memory Learning. We tested MPL on common image and text classification benchmarks like "ImageNette" (a smaller version of ImageNet) and "STS-B". A common practice in Machine Learning is to report differences in accuracy across model sizes. Since MPL is not a Neural Net, we benchmark based on "examples shown" (i.e., how many examples of each class MPL saw before running the tests). For both image and text classification tasks, we test with 1, 3, 5, 10, and 15 examples shown. The table below shows the accuracies MPL achieves on these benchmarks:
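To make that protocol concrete, here is a hypothetical sketch of such a benchmark harness: sample k random (not handpicked) reference examples per class, build a model from them, and measure accuracy on a held-out test set. The `model_fn` interface is an assumption for illustration, not clevrML's API:

```python
# Hypothetical "examples shown" benchmark harness (illustrative only).
# model_fn takes a list of (input, label) references and returns a
# callable that maps an input to a predicted label -- an assumed interface.
import random

def evaluate(model_fn, labeled_pool, test_set, k):
    """Sample k random reference examples per class, build a model
    from them, and return accuracy on the held-out test set."""
    by_class = {}
    for x, y in labeled_pool:
        by_class.setdefault(y, []).append(x)
    references = [(x, y) for y, xs in by_class.items()
                  for x in random.sample(xs, k)]
    model = model_fn(references)
    correct = sum(model(x) == y for x, y in test_set)
    return correct / len(test_set)

# Usage sketch: accuracy at each "examples shown" setting.
# for k in (1, 3, 5, 10, 15):
#     print(k, evaluate(build_model, pool, tests, k))
```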
An important point to note is that MPL only receives random reference examples, instead of handpicked examples. We believe this is important to show the robustness of MPL and Active Memory Learning.
From our tests, we found that MPL's accuracy improved as the number of examples grew, especially for text classification. The striking result was MPL's ability to score highly with only 1 to 3 examples on both image tasks. These results might indicate that Active Memory Learning models like MPL do better with less data because they absorb less bias: a model that only sees 5 examples leaves little room for "bad examples". This finding alone could have massive implications for how we build AI going forward.
A Key Difference in MPL
As mentioned earlier, we used a modified version of Active Memory Learning to build MPL. Currently, AML models use something we call an "Artificial Decision Lobe" (ADL or Decision Lobe). In summary, an ADL is what allows us to make accurate predictions with limited data; with our current AML method, we use one ADL to make predictions.
When we built MPL with one ADL, we noticed slight weaknesses on very complex inputs (e.g., a sentence with a large vocabulary or an image with many objects). To combat this weakness, we added more ADLs to generate multiple decisions for one prediction. In essence, this creates a model that can look at an input from multiple perspectives and arrive at an accurate conclusion.
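As an analogy only (the internals of an ADL are not public), the multi-lobe idea resembles a plain majority-vote ensemble, where several independent decision functions each vote on the label and the most common vote wins:

```python
# Majority-vote ensemble as an analogy for multiple Decision Lobes.
# Each "lobe" here is just a function mapping an input to a label;
# the real Artificial Decision Lobe internals are not public.
from collections import Counter

def ensemble_predict(lobes, x):
    """Each lobe votes; the most common label wins."""
    votes = [lobe(x) for lobe in lobes]
    return Counter(votes).most_common(1)[0][0]

# Three toy "perspectives" on the same text input.
lobes = [
    lambda s: "long" if len(s) > 10 else "short",                         # characters
    lambda s: "long" if s.count(" ") > 2 else "short",                    # word count
    lambda s: "long" if any(len(w) > 6 for w in s.split()) else "short",  # word length
]
print(ensemble_predict(lobes, "a tiny one"))  # -> short
```

The appeal of this shape is that each extra perspective is cheap to evaluate, which is consistent with the CPU efficiency reported below.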
[Diagram. Left/top (mobile): the current version of Active Memory Learning, with a single Artificial Decision Lobe. Right/bottom (mobile): the newer version used in MPL, with multiple Artificial Decision Lobes.]
We found MPL's new Active Memory Learning method yields a 7% to 10% increase in accuracy. By adding more ADLs, we expected computational efficiency to suffer. However, when running the benchmarks, we were pleased to find that MPL was still highly efficient on a CPU: it builds in under 15 seconds on a laptop CPU, which is unheard of for any other AI model of this caliber.
A Massive Benefit of MPL
MPL's benefits are immediately obvious: more accuracy and multi-purpose abilities. We did, however, find that MPL has a very special ability: learning on the fly.
Because MPL is an Active Memory Learning model, it can be dynamically edited at any time or it can learn with limited examples. This effectively creates the ability for MPL to pick up a classification task on the fly. This will have profound benefits for developers for two reasons:
1. You can run custom predictions with MPL by showing it a few examples
Typical machine learning projects consist of sourcing lots of data and then training a model for hours. This is very costly for developers, both monetarily and time-wise. With MPL, you can collect 5 examples of your classification task and get a prediction from a prebuilt architecture.
2. No more testing different model architectures
With any AI project, you are constantly iterating to find the best model for your task; this is also costly in time and money. With MPL, you can skip this hassle because the architecture is already built ahead of time. All a developer has to do is upload examples and get a prediction back.
To prove this learning-on-the-fly concept, we used MPL to hold a "conversation" with us. While the depth of the conversation was incredibly limited, the point is that MPL was not explicitly built to converse. With a few examples, however, it can learn how to say "Hi!":
```
# References for MPL to "learn": (example, label) pairs
{
    ("Good afternoon!", "Greeting"),
    ("Hi there!", "Greeting"),
    ("See you!", "Farewell")
}

# Prediction object
{
    text: "Hi! How are you?"
}

Response: "Hello! :)"
```
For non-developers: the first box (left, or top on mobile) is what we sent to MPL so it could "learn" dynamically; the "object" (the information between the curly braces) is the set of examples. The bottom object is the phrase we sent to check whether MPL would predict the correct response type.
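For the curious, here is a toy, runnable re-creation of the demo's input/output shape. The word-overlap "classifier" and the "Goodbye!" response below are stand-in assumptions; they only mimic the shape of the exchange, not how MPL actually decides:

```python
# Toy re-creation of the "learn on the fly" demo.
# The word-overlap scoring and the "Goodbye!" response are assumptions
# for illustration; MPL's actual decision process is not shown here.
references = [
    ("Good afternoon!", "Greeting"),
    ("Hi there!", "Greeting"),
    ("See you!", "Farewell"),
]
responses = {"Greeting": "Hello! :)", "Farewell": "Goodbye!"}

def words(s):
    """Lowercased word set with trailing punctuation stripped."""
    return {w.strip("!?.,").lower() for w in s.split()}

def classify(text):
    """Pick the label of the reference example sharing the most words."""
    def overlap(example):
        return len(words(text) & words(example[0]))
    best = max(references, key=overlap)
    return best[1]

label = classify("Hi! How are you?")
print(responses[label])  # -> Hello! :)
```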
Implications of Generalized Models
MPL has shown the ability to pick up image and text classification tasks, learn with few examples, learn quickly, and adapt if needed. While classification is basic in nature, it is the most widely used type of machine learning in everyday applications: Alexa, Google Search, and even your mobile phone rely on Machine Learning classification every day, to name a few. The natural question arises: What are the implications of a technology like this?
This question has the potential to be widely debated, but we have some thoughts:
1. MPL is an early glimpse of the new age of AI
As mentioned earlier, projects like GPT-3 and MuZero were early indicators of where AI was headed. We believe MPL takes this concept to a new level: it needs fewer data points and less compute, and it appears to generalize to many types of classification.
2. The playing field gets equalized with General-Purpose models
It's no secret that large tech companies spend millions on AI projects. This type of scale is largely unattainable for a developer or a startup. Technology like MPL changes that: developers can get high-accuracy predictions with only a few data points and at a low cost.
3. MPL not only eliminates the data problem but also the need to build a custom model
A big part of MPL is its ability to learn with limited examples. This raises the question: If MPL can predict my task well, why should I build a custom model? This is a direct example of why generalized models will completely change AI. Building a custom model costs developers time that could be better spent experimenting; MPL takes that cost away.
Interested in Using MPL?
If you are a developer, small business, or startup interested in using MPL, please fill out our application by clicking the button below. We will do a controlled rollout to applicants first, with a public beta API to follow. As we state in our mission, we want to find better AI algorithms for all developers while also pursuing Artificial General Intelligence.
MPL has demonstrated a remarkable ability to learn from limited data, learn on the fly, and benefit developers. It further shows that Active Memory Learning is a technology worth exploring to see what else it is capable of. The choice to add more Artificial Decision Lobes yielded a significant performance improvement, which we will research further in the next iteration of MPL.
We appreciate you taking the time to read our research and support clevrML in reaching the cutting edge of AI.