How we are solving the challenges data scientists face in deployment
Our goal is to enable all companies to easily integrate AI models into their own processes with just a few clicks, even without a large tech team, expensive outsourcing or months of development.
Thanks to our public funding for the application of artificial intelligence in Schleswig-Holstein (Northern Germany), we have been able to talk to many independent developers, data scientists, institutions, and small and medium-sized enterprises over the past six months. We supported them in taking their machine learning projects from development through prototype testing to production use.
The challenges
In the process, three major challenges have come up again and again:
- Data scientists want to share their locally developed models with colleagues for testing and integrate them as prototypes into existing (server) infrastructure. These steps are complex, however, and often require the support of an additional team of DevOps engineers, server specialists and developers, which not every company has.
- Well-functioning, pre-trained machine learning models already exist for many use cases. Companies that want to automate processes can use them for prototyping and feasibility studies to test AI in their existing processes. What is missing is knowledge: that these models exist, which models suit which use case, and how to integrate them into existing processes and systems.
- Some of our pilot partners process very sensitive customer data and run core processes on machine learning systems. They do not want this data or these models to leave their own infrastructure. That rules out external servers and infrastructure providers (PaaS), and suitable open-source tooling for use on their own servers is lacking.
Deriving a solution
After analyzing these challenges repeatedly with our pilot partners over the last few months, we believe we know how to solve all of them together with the help of a toolbox and a strategy: our master plan.
We want to let data scientists focus on what they do best: data science.
What these problems have in common is this: many data scientists and tech-savvy employees in mid-sized companies have little trouble trying out statistical models on their local machine, and data scientists have little trouble developing them there. Migrating those models to enterprise infrastructure and servers, however, leaves them with a big question mark.
The reason is that this step calls for skills that lie outside data science, machine learning or process automation: a broad mix of IT, DevOps and server management know-how. It does not help that the process differs from one programming language to the next (R or Python) and from one machine learning model to the next (e.g., different frameworks, or computation on CPU versus GPU).
A solution toolbox
We want to tackle these challenges piece by piece with end-to-end solutions that we develop ourselves and test with our pilot partners, enabling small and medium-sized companies to automate their existing processes with the help of data science, statistics and machine learning. Over time, this will grow into a complete solution toolbox whose parts, in combination, address all the challenges mentioned above.
The first products in this context will be:
- An R plugin, backed by hosting infrastructure, that lets data scientists develop their models in R and deploy them with a single function call: either as an R Shiny app to showcase to colleagues and partners, or as an API built with Plumber. Deployment can run on their own infrastructure or on cloud infrastructure provided by us. No knowledge of servers, deployment, R Shiny or Plumber is required. We let data scientists focus on what they do best: data science.
- A library of selected, high-quality machine learning models, together with their use cases and integration scenarios, that can be deployed with one click on your own servers or on a GDPR-compliant server provided by us. A major focus here is German-language NLP models, which our pilot partners have already requested frequently.
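To make "include the model into an API" a little more concrete: the idea is that a model's prediction function sits behind an HTTP endpoint that other systems can call. The following is only a minimal sketch of that idea in Python, using nothing but the standard library. It is not our plugin and not how Plumber works internally. The `predict` function, its weights, the `handle_predict` helper, the `serve` function and the port are all hypothetical names and values chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear score."""
    weights = [0.5, -0.25]  # illustrative values, not from any real model
    return sum(w * x for w, x in zip(weights, features))


def handle_predict(body):
    """Turn a JSON request body (bytes) into (status code, JSON response bytes)."""
    try:
        features = json.loads(body)["features"]
        result = {"prediction": predict(features)}
        return 200, json.dumps(result).encode("utf-8")
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, a missing "features" key, or non-numeric input.
        error = {"error": "expected a JSON object with a 'features' list"}
        return 400, json.dumps(error).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    """Answers every POST request with a prediction for the posted features."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Start the API on localhost; blocks until the process is interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` would start the endpoint locally, and a POST request with a body such as `{"features": [2.0, 2.0]}` would return a JSON prediction. Everything a real deployment also needs, such as authentication, scaling, updates and monitoring, is exactly the part data scientists should not have to build themselves.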
Once these products have been developed successfully, we will start building further products for other programming languages (e.g. Python), frameworks and deployment challenges for process automation in small and medium-sized enterprises.
Our goal is to enable all companies to easily integrate AI models into their own processes with just a few clicks, even without a large tech team, expensive outsourcing or months of development.
One last thing
All of these products share one requirement: applications and AI models must be hosted and updated on a company's own infrastructure in a privacy-compliant way. We will therefore also release the infrastructure product that we develop and share across these products as an open-source tool, allowing companies to easily build their own highly customized deployments and workflows on top of our technology.