The 405B Powerhouse
Llama 3.1 405B is a groundbreaking model that stands out as the first open-weights AI capable of rivaling the performance of closed-source giants like GPT-4 and Claude 3.5 Sonnet. This development significantly narrows the gap between open and closed models, democratizing access to cutting-edge AI capabilities. The open-weights nature of Llama 3.1 allows the community to fine-tune and adapt the model, potentially unleashing a wave of specialized, high-performance models tailored to various needs.
Accessibility for all
The Llama 3.1 8B model represents a major leap forward for consumer-grade hardware. It outperforms GPT-3.5 on many benchmarks while being able to run locally and at no cost. This advancement places recent state-of-the-art performance in the hands of individual developers and researchers, empowering them to innovate without the need for expensive infrastructure.
Key Improvements
Llama 3.1 comes with several significant enhancements:
- 128K context length across all models: This allows for better handling of longer inputs, enabling more complex tasks and extended conversations.
- Multilingual support for eight languages: This broadens the model’s usability across different linguistic contexts, making it more versatile and inclusive.
- Enhanced reasoning and tool use capabilities: These improvements make the model more adept at logical reasoning and utilizing external tools effectively.
- Improved instruction-following and chat performance: The model now better understands and executes instructions, providing more accurate and coherent responses in chat applications.
What this means for the future
The release of Llama 3.1, particularly the 405B model, marks a significant milestone in open-source AI. It promises to accelerate innovation, enable new applications, and push the boundaries of what’s possible with locally-run models. As this trend continues, we can expect even more powerful and accessible AI tools to emerge in the near future.
Stay tuned as the community begins to explore and build upon these groundbreaking models!
Sebastian Kouba
Sebastian drives innovation in our IT department through his generative AI expertise. When not at work, he’s reliving his youth on the beach volleyball court or crafting the ideal cappuccino.
Cloud Infrastructure
In order for microservices to remain small enough to deserve the name, it must be easy to create new microservices. In addition to the code base, this requires some infrastructure that in principle differs little across different services.
It therefore makes sense to have these recurring elements managed and maintained by a dedicated infrastructure team. This includes the following elements:
Provision of computing resources
Over the last decade, container technology has proven to be an effective concept for the efficient utilization of resources. Individual processes are isolated from one another using Linux kernel features, which makes it easier for independent processes to share computing resources. For container technology to work well, it is essential that individual containers can be restarted at any time (e.g. on another machine), as this allows an orchestrator to achieve high utilization of the available computing capacity. The de facto standard for orchestrating containers is the open source system Kubernetes, which, if configured well, gives developers an interface for easily deploying containers (and other resources).
As a good configuration of a Kubernetes cluster is a relatively complex task, a specialized team should be set up to deal with the network architecture, communication between containers, secret management and basic logging and monitoring in connection with Kubernetes.
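To make the developer-facing interface more concrete, the following minimal sketch builds a Kubernetes Deployment manifest as a plain Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The service name, image reference, and replica count are hypothetical; a real setup would add resource limits, health probes, and secrets as defined by the infrastructure team.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int = 2) -> dict:
    """Build a minimal apps/v1 Deployment for a stateless microservice."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Several replicas allow any single container to be restarted
            # (possibly on another machine) without an outage.
            "replicas": replicas,
            # The selector must match the labels of the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical service name and image, for illustration only.
manifest = deployment_manifest(
    "example-service", "registry.example.com/example-service:1.0.0"
)
print(json.dumps(manifest, indent=2))
```

The printed document could be saved to a file and applied with kubectl; keeping the manifest this small is only possible if the cluster itself provides sensible defaults.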
Network connectivity
The inner simplicity of microservices comes at the cost of a great deal of complexity being shifted to communication between services. It is therefore important that microservices can establish a network connection to each other and to the internet. The connection should be automatically secured by mTLS. This automation can also be best ensured by a central infrastructure team.
It should also be easy for the microservice teams to set up basic allow lists for accessing services and, if necessary, some network rules should be made binding by the infrastructure team for compliance reasons. Communication between microservices is much easier to achieve if the infrastructure is hosted by a single cloud provider.
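As a rough illustration of the two ideas above, the following sketch shows a deny-by-default allow list and a server-side TLS context that requires a client certificate (the "mutual" part of mTLS), using only the Python standard library. The service names and the allow list structure are hypothetical, and in practice the infrastructure (for example a service mesh) would typically handle both concerns rather than application code.

```python
import ssl
from typing import Dict, Optional, Set

# Hypothetical allow list: target service -> callers permitted to reach it.
ALLOW_LIST: Dict[str, Set[str]] = {
    "billing-service": {"order-service"},
    "mailing-service": {"order-service", "billing-service"},
}


def is_allowed(caller: str, target: str) -> bool:
    """Deny by default; allow only callers explicitly listed for the target."""
    return caller in ALLOW_LIST.get(target, set())


def mtls_server_context(
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
    ca_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Create a server-side TLS context that insists on a client certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # CERT_REQUIRED makes the handshake fail if the client presents no
    # certificate or one that cannot be verified.
    context.verify_mode = ssl.CERT_REQUIRED
    if cert_file and key_file:
        # The server's own certificate and private key.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        # The certificate authority that signs the service certificates.
        context.load_verify_locations(cafile=ca_file)
    return context


context = mtls_server_context()  # file paths omitted in this sketch
print(context.verify_mode == ssl.CERT_REQUIRED)
print(is_allowed("order-service", "billing-service"))
print(is_allowed("mailing-service", "billing-service"))
```

The caller name passed to is_allowed would, under these assumptions, be taken from the verified client certificate rather than from anything the caller claims about itself.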
Persistence
Microservice containers should be able to be restarted at any time without losing data. This implies that persistent data storage must take place outside the microservices, typically in databases or persistent volumes. In addition to general difficulties such as authentication and authorisation as well as connectivity to the data storage systems, there are specialized challenges such as the creation of data backups and snapshots to ensure the recoverability of persistent data.
For different needs, different types of databases should be made available for microservice teams to choose from, such as an SQL database, a document database and a cache system. This is important to ensure technology openness in the implementation of microservices so that microservice teams can choose the best technology for their use case. The provision of databases or volumes should be as automated as possible, which requires close collaboration with the Kubernetes team.
Logging and monitoring
In order to be able to reproduce and rectify errors, it is crucial that there is an easy way to access the logs produced by the individual microservices. A widely used solution for this is the Elastic Stack, which is ideally hosted and managed by a central team so that the microservice teams do not have to worry about setting up logging infrastructure. Other solutions are also conceivable here, but a standard solution should be specified centrally and the log format (e.g. JSON with certain predefined fields) should also be standardized across microservices.
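A standardized JSON log format could look roughly like the following sketch, which uses only Python's standard logging module. The field names (timestamp, level, service, logger, message) and the service name are assumptions chosen for illustration; the actual set of predefined fields would be specified by the central team.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "service": self.service,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(service: str, stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service))
    logger = logging.getLogger(service)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


# In a container, the stream would usually be stdout; a buffer is used here
# so that the result can be inspected.
buffer = io.StringIO()
log = build_logger("example-service", buffer)
log.info("order %s accepted", 42)
print(buffer.getvalue(), end="")
```

Because every service emits the same fields, the central logging stack can index and filter logs without service-specific parsing rules.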
It must also be possible to link logs across multiple services, especially across REST requests or asynchronous messaging. Request tracing via OpenTelemetry is suitable for this: the trace context is propagated from request to request, for example using the B3 propagation format. In addition, the resulting traces can be used to recognize and eliminate bottlenecks in requests.
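The mechanics of propagating a trace across requests can be sketched with the B3 multi-header format. The helper below is a hypothetical illustration, not the OpenTelemetry API: it keeps the incoming trace ID, creates a new span ID for the outgoing call, and records the incoming span as the parent. In a real service, the propagators of the OpenTelemetry SDK would take care of this.

```python
import secrets
from typing import Dict, Optional


def outgoing_b3_headers(incoming: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Derive B3 headers for a downstream call from the incoming request headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    received = {key.lower(): value for key, value in (incoming or {}).items()}

    # Reuse the trace ID if the request is already part of a trace,
    # otherwise start a new trace (128 bit, 32 hex characters).
    trace_id = received.get("x-b3-traceid") or secrets.token_hex(16)

    headers = {
        "X-B3-TraceId": trace_id,
        # Every hop gets its own span ID (64 bit, 16 hex characters).
        "X-B3-SpanId": secrets.token_hex(8),
    }
    parent_span_id = received.get("x-b3-spanid")
    if parent_span_id:
        headers["X-B3-ParentSpanId"] = parent_span_id
    sampled = received.get("x-b3-sampled")
    if sampled is not None:
        # The sampling decision is made once and passed on unchanged.
        headers["X-B3-Sampled"] = sampled
    return headers


first_hop = outgoing_b3_headers()
second_hop = outgoing_b3_headers(first_hop)
print(first_hop)
print(second_hop)
```

Both hops share the same trace ID, which is what allows logs and spans from different services to be joined into one request trace.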
A monitoring/alerting stack (typically with the help of Prometheus and Grafana) should also be centrally maintained and be able to be filled with standardized metrics as well as individually configurable metrics. Alerts should be individually configurable by the microservice teams.
API design
Depending on the microservice, the requirements for its APIs can be very different. Nevertheless, it can make sense to set common requirements for the API style (e.g. HATEOAS or gRPC), from which teams deviate only in justified exceptional cases. For HTTP APIs, OpenAPI can be a good solution for documentation.
If necessary, a decision can be made to make the provision of OpenAPI specifications for microservices mandatory so that client developers can rely on the existence of these specifications. In any case, a common schema for the documentation of APIs (including error cases) should be created to make it easier to find relevant documentation.
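As an illustration, a minimal OpenAPI 3.0 document can be expressed as a Python dictionary, together with a small check that every operation documents at least one error case. The endpoint, schema names, and the check itself are hypothetical examples of what a common documentation standard might require.

```python
import json

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def example_spec() -> dict:
    """Return a minimal OpenAPI 3.0 document for a hypothetical endpoint."""
    error_response = {
        "description": "Invalid request",
        "content": {
            "application/json": {"schema": {"$ref": "#/components/schemas/Error"}}
        },
    }
    return {
        "openapi": "3.0.3",
        "info": {"title": "Example Service API", "version": "1.0.0"},
        "paths": {
            "/items": {
                "post": {
                    "summary": "Create an item",
                    "responses": {
                        "201": {"description": "Item created"},
                        "400": error_response,
                    },
                }
            }
        },
        "components": {
            "schemas": {
                "Error": {
                    "type": "object",
                    "required": ["code", "message"],
                    "properties": {
                        "code": {"type": "string"},
                        "message": {"type": "string"},
                    },
                }
            }
        },
    }


def operations_without_error_case(spec: dict) -> list:
    """List operations that document no 4xx/5xx (or default) response."""
    missing = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            codes = operation.get("responses", {}).keys()
            if not any(code == "default" or code[0] in "45" for code in codes):
                missing.append(f"{method.upper()} {path}")
    return missing


spec = example_spec()
print(json.dumps(spec, indent=2))
print("Operations without documented error case:", operations_without_error_case(spec))
```

A check like this could run in each team's build pipeline, so that the shared documentation rules are enforced automatically rather than by review alone.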
Messaging
Asynchronous communication via messaging systems can significantly help to make the overall system more resilient. The temporary inconsistency in data storage caused by asynchronicity is deliberately accepted (“eventual consistency”) in order to achieve faster requests and decoupling of different microservices.
To ensure that communication between different microservices via messaging works, the messaging system itself should be provided and managed by a central team. As with the APIs, the format of the messages should be documented using a common schema.
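A common message schema could be as simple as a shared envelope around the service-specific payload. The sketch below is one possible shape under stated assumptions: the field names (id, type, source, time, data) and the example event are hypothetical and not prescribed by any particular messaging system.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

REQUIRED_FIELDS = ("id", "type", "source", "time", "data")


@dataclass(frozen=True)
class Envelope:
    """Hypothetical common envelope shared by all messages."""

    type: str    # what happened, e.g. "order.created"
    source: str  # which service emitted the message
    data: dict   # service-specific payload
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    time: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        return json.dumps(asdict(self))


def parse_envelope(raw: str) -> Envelope:
    """Parse a message and reject it if a required envelope field is missing."""
    payload = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"message is missing required fields: {missing}")
    return Envelope(**{name: payload[name] for name in REQUIRED_FIELDS})


message = Envelope(type="order.created", source="order-service", data={"order_id": 42})
received = parse_envelope(message.to_json())
print(received == message)
```

Consumers that validate the envelope first can reject malformed messages early and route on the type field without knowing the payload of every producer.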
Authentication and authorisation
Authentication and authorisation are central tasks that should not be left to individual microservice teams. A centralized solution, for example based on OpenID Connect, should be provided for this purpose.
Summary
A microservice architecture can be a good solution, especially for complex systems, in order to quickly implement improvements to the system and new features. However, this creates dependencies between microservices and therefore also between the microservice teams.
The cross-cutting concepts described here help to ensure consistency, security and efficiency in a microservice environment.
Microservices are a design pattern in software architecture that aims to shorten communication paths through smaller teams and more focussed responsibilities, thereby reducing time-to-market.
The basic ideas behind microservices go back to the Unix principles formulated by Doug McIlroy in 1978. According to him, programs should be designed in such a way that they:
- have exactly one responsibility and fulfill it well,
- can be nested one after the other so that one program produces the input for another program,
- can be tested early and abandoned if necessary,
- utilize tools for recurring tasks during development.
Microservices are characterized by the following properties:
- Single Purpose: As with Unix principles, a microservice should fulfill exactly one task well.
- Encapsulation: Microservices have sole ownership of their data. They interact with the outside world via well-defined interfaces.
- Ownership: A single team (ideally consisting of 5-9 people) is responsible for a microservice over its entire lifetime.
- Autonomy: The team responsible for the microservice can build and deploy the microservice at any time without consultation of other stakeholders. The team is free to make implementation decisions.
- Multiple versions: It is possible for different versions of a microservice to exist at the same time.
- Choreography: There is no centralized system that orchestrates a workflow. Instead, each microservice is able to independently provide itself with the information it needs for its functionality.
- Eventual consistency: A temporary inconsistency of data between microservices is accepted as long as the data eventually becomes consistent again.
Challenges of microservices
Due to their internal simplicity, microservices are generally more scalable and easier and quicker to change than monoliths, in which the entire logic is contained in a single program. Due to the small teams, microservice teams feel (and are) much more responsible for the success of their microservices, which often leads to better results and decisions.
Microservices facilitate omnichannel solutions through shared backend functionality that can be consumed by different user interfaces.
Increased complexity
At the level of individual microservices, we create a simple and beautiful world through the limited scope of tasks and encapsulation. The inner simplicity of microservices comes at the price of increased complexity in terms of communication between microservices and a certain degree of redundancy: as the microservice teams are fully responsible for their services (ownership) and can also deploy them independently (autonomy), the teams must be technically capable of actually carrying out this deployment. This includes both cloud infrastructure expertise and the resources required to operate the infrastructure.
In more concrete terms, we can imagine a mailing service that is responsible for sending informational emails about a product to customers. The service must be built during development and then executed. The computers used for this must have the necessary connectivity to perform these tasks. Once the service is running, other services must be able to initiate the email process. Possible ways of doing this include an event system from which the mailing service independently filters out the events relevant to it (in the sense of the choreography property) and processes them into emails.
However, such an event system or messaging system must firstly be provided and secondly be usable by different teams and their respective microservices. Another conceivable scenario would be to address the mailing service via a REST interface. In this case, the client must be able to communicate with the mailing service via a network and this connection must be secured so that only authorized services have access to the interface.
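The event-based variant can be sketched with a tiny in-memory event bus. Everything here is a simplified stand-in: the EventBus class, the event types, and the MailingService are hypothetical, and a real system would use a messaging system provided by the infrastructure rather than direct function calls.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


class EventBus:
    """Minimal in-memory stand-in for a shared event or messaging system."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[Event], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event: Event) -> None:
        # The publisher does not know who consumes the event (choreography).
        for handler in self._subscribers.get(event["type"], []):
            handler(event)


class MailingService:
    """Picks out only the events relevant to it and turns them into emails."""

    def __init__(self, bus: EventBus) -> None:
        self.outbox: List[str] = []
        bus.subscribe("product.updated", self.on_product_updated)

    def on_product_updated(self, event: Event) -> None:
        self.outbox.append(f"Product {event['product_id']} has been updated")


bus = EventBus()
mailing = MailingService(bus)
bus.publish({"type": "product.updated", "product_id": "A-100"})
bus.publish({"type": "order.created", "order_id": 7})  # ignored by the mailing service
print(mailing.outbox)
```

No central component tells the mailing service what to do; it decides for itself which events matter, which is the essence of the choreography property.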
Authentication & authorization
Authentication and authorization are typically a relatively complex problem and should be handled consistently within the company to increase traceability. It is therefore recommended that microservice teams do not solve this issue on their own.
As several instances of the mailing service can coexist in different versions (multiple versions) and microservice instances can be redeployed and, in particular, stopped at any time (autonomy), it is essential that relevant data (such as completed mails that have not yet been sent) is stored permanently across all instances. This requires databases and persistent volumes that need to be provided and maintained. Backups and snapshots of the data should also exist and be tested regularly.
Errors pose particular challenges: if any process in the overall system suddenly stops working, it is important to be able to analyze errors across multiple microservices. A consistent logging infrastructure as well as request tracing and alerting are helpful for this. Of course, infrastructure must also be provided for these solutions.
Summary
The challenges described show that there are a number of points where we should ensure standardization across services. We will take a closer look at some of these points in the next article. Elements of the software architecture that are (potentially) relevant for several building blocks of the architecture (in our specific case, microservices) are called cross-cutting concepts.
To avoid redundancy, such cross-cutting concepts should be dealt with in the overarching architecture documentation, regardless of the individual components. It is important to define cross-cutting concepts at an early stage, as a migration at a later date will take up a lot of capacity across all teams.
Coming up
In the next article, we will look at the basic building blocks of microservices, which differ only slightly and can therefore be managed centrally by an infrastructure team.
Machine Learning (ML) has witnessed explosive growth in recent years. As organizations increasingly leverage Machine Learning models to drive business value, the need for robust Machine Learning Operations (MLOps) practices has become paramount. MLOps encompasses the tools and processes required to manage the entire lifecycle of Machine Learning efficiently, from data acquisition, data processing, and model training to deployment, monitoring, and governance.

In the following, we want to delve into the exciting future of MLOps, exploring emerging trends poised to reshape the technical landscape and the challenges that companies must address to ensure successful deployments of Machine Learning.
What is Machine Learning?
Machine learning is a form of artificial intelligence (AI) that enables computers to learn without explicit programming. By analyzing data and utilizing statistical techniques, machines can recognize patterns and enhance their performance in specific tasks. This technology finds application in various domains, ranging from spam filtering to facial recognition software. It also encompasses the subfield of Deep Learning, which serves as the foundation for the recently developed Large Language Models (LLMs), such as ChatGPT.
Embracing the Trends: A Glimpse into the Future of MLOps
The MLOps landscape constantly evolves, with new technologies and methodologies emerging to address the complexities of managing ML models in production. Here are some key trends that are shaping the future of MLOps:
- Cloud-Native MLOps: Cloud computing offers a scalable, cost-effective platform for managing ML workloads. Cloud-based MLOps end-to-end platforms streamline the entire ML lifecycle, from data storage and compute resources to model training and deployment. This enables organizations to leverage the cloud’s elasticity to handle fluctuating workloads and experiment with different models efficiently.
One example of a commercial end-to-end MLOps platform is Amazon SageMaker, a cloud-based ML platform for developing, training, and deploying ML models. Kubeflow also falls into this category, but unlike Amazon SageMaker, it is open-source and can be used free of charge.
- Automated ML Pipelines: Automating repetitive tasks within the ML lifecycle, such as data ingestion, data preprocessing, feature engineering, and model selection, can significantly improve efficiency and reduce human error. Automated ML pipelines leverage tools like AutoML (Automated Machine Learning) to automate various stages of model development, allowing data scientists to focus on more strategic tasks like developing innovative model architectures and identifying novel business use cases.
Among others, Azure with Azure Automated Machine Learning, Amazon Web Services (AWS) with AWS AutoML Solutions, and Google Cloud Platform (GCP) with AutoML offer services in this area, which minimize the effort involved in implementing this complex method.
- Continuous Integration and Continuous Delivery (CI/CD) for Machine Learning: Implementing CI/CD practices in MLOps ensures that changes to models and code are integrated and delivered seamlessly. This fosters a rapid experimentation and iteration culture, enabling organizations to quickly adapt models to changing business needs and data distributions.
- Model Explainability and Interpretability (XAI): As ML models become more complex, understanding their decision-making processes becomes crucial. XAI techniques help to explain how models arrive at their predictions, fostering trust in model outputs and enabling stakeholders to identify potential biases or fairness issues.
Tools that can support you in making ML models more explainable and their decisions more transparent are, e.g., Alibi Explain, an open-source Python library that aims at the interpretation and inspection of ML models, or SHapley Additive exPlanations (SHAP), an approach derived from game theory to explain the output of arbitrary ML models.
- MLOps for Responsible AI: Responsible development and deployment of AI models are critical concerns. MLOps practices that integrate the principles of fairness, accountability, transparency, and ethics (FATE) throughout the ML lifecycle are essential to ensure that models are unbiased, avoid unintended consequences, and comply with regulations. Microsoft’s FATE research group, for example, studies this subject area.
An example of such a regulation is the AI Act, a “legal framework on AI, which addresses the risks of AI and positions Europe to play a leading role globally,” recently adopted by the European Parliament. Services such as Arthur and Fiddler can support the design of responsible and safe AI.
- Integration with DevOps: Aligning MLOps practices with existing DevOps workflows can create a more cohesive development environment. This fosters collaboration between data scientists, ML engineers, and software engineers, leading to a more streamlined and efficient software development lifecycle (SDLC) incorporating machine learning.
- Importance of Data-centric AI & DataOps: Data is the lifeblood of ML models. DataOps practices that ensure data quality, availability, and security throughout the ML lifecycle are crucial for model performance and overall system reliability. DataOps combines automation, collaboration, and agile practices to improve the speed, reliability, and quality of data flowing through your organization. This approach lets you get insights from your data faster, make data-driven decisions more effectively, and improve the quality and performance of your Machine Learning models based on this data.
- Focus on Security: As ML models become more ubiquitous, securing them from potential attacks becomes increasingly important. MLOps practices that integrate security considerations throughout the model lifecycle are essential to mitigate risks such as data poisoning, adversarial attacks, and model theft.
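The game-theoretic idea behind SHAP, mentioned under model explainability above, can be illustrated with an exact Shapley value computation for a tiny model. This is a brute-force sketch and not the API of the SHAP library: the function name, the toy linear model, and the all-zero baseline are assumptions, and the approach scales exponentially with the number of features, which is why practical tools rely on approximations.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    predict: Callable[[Sequence[float]], float],
    instance: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values of each feature for one prediction.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(instance)

    def value(coalition: frozenset) -> float:
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                # Marginal contribution of feature i to this coalition.
                phi += weight * (value(coalition | {i}) - value(coalition))
        contributions.append(phi)
    return contributions


# Hypothetical toy model: a linear function of three features.
def toy_model(x: Sequence[float]) -> float:
    return 2.0 * x[0] - 3.0 * x[1] + 0.5 * x[2] + 1.0


instance = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phis = shapley_values(toy_model, instance, baseline)
print(phis)
```

For a linear model, each contribution equals the feature's weight times its distance from the baseline, and the contributions add up to the difference between the prediction for the instance and the prediction for the baseline.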
Conquering Challenges: Building a Robust MLOps Foundation
While the future of MLOps holds immense promise, several challenges must be addressed to ensure successful ML deployments. Here are some key areas to consider:
- Standardization and Interoperability: The lack of standardization across MLOps tools and frameworks can create silos and hinder collaboration. Promoting interoperability between tools and establishing best practices for MLOps workflows is crucial for creating a more unified and efficient ecosystem. A pioneering approach to this problem is the Open Inference Protocol, an industry-wide effort that aims to establish a standardized communication protocol between so-called inference servers (e.g., Seldon MLServer, NVIDIA Triton Inference Server) and orchestrating frameworks such as Seldon Core or KServe.
- Talent Shortage: The demand for skilled MLOps professionals outstrips the available supply. According to Statista, the number of vacancies for IT specialists in companies in Germany rose to a record high of 149,000 in 2023, and Index Research reports that employers advertised almost 44,000 jobs for AI experts from January to April 2023. Organizations must invest significantly in training programs, talent acquisition strategies, and competitive employee compensation to narrow this gap and establish a strong MLOps team. This process includes recognizing the essential skills and expertise needed for successful MLOps implementation, such as data science, software engineering, cloud computing, and DevOps. Furthermore, it involves forming interdisciplinary teams that encompass these diverse domains.
- Monitoring and Observability: Effectively monitoring the performance and health of ML models in production is critical for catching issues early and ensuring model reliability. Developing robust monitoring frameworks and integrating them into MLOps pipelines is essential. Aporia, an ML platform that focuses on the observability of ML models, can be leveraged to achieve these objectives.
- Model Governance: Establishing clear governance frameworks for managing the lifecycle of ML models is crucial. This includes defining roles and responsibilities, ensuring model versioning and control, and setting guidelines for model deployment and retirement. The enterprise platforms Domino Data Lab and Dataiku are examples of solutions that cover these features, along with many other aspects of the ML lifecycle.
- Explainability and Bias Detection: As mentioned earlier, ensuring model explainability and detecting potential biases are critical aspects of responsible AI. Organizations must invest in tools and techniques to understand how models arrive at their decisions and identify and mitigate any fairness issues.
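To give an impression of what the standardization mentioned above means in practice, the following sketch builds the path and JSON body of an inference request in the style of the Open Inference Protocol (also known as the V2 inference protocol). The model name, input name, and feature values are hypothetical; the supported data types and tensor names depend on the deployed model and inference server.

```python
import json
from typing import Sequence


def infer_path(model_name: str) -> str:
    """REST path of the inference endpoint for a given model."""
    return f"/v2/models/{model_name}/infer"


def infer_request(input_name: str, rows: Sequence[Sequence[float]]) -> dict:
    """Build a request body with one FP32 input tensor of shape [rows, columns]."""
    shape = [len(rows), len(rows[0]) if rows else 0]
    # Tensor data is sent in row-major order; here it is flattened.
    flat = [float(value) for row in rows for value in row]
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": shape,
                "datatype": "FP32",
                "data": flat,
            }
        ]
    }


# Hypothetical model and input names, for illustration only.
path = infer_path("example-model")
body = infer_request("input-0", [[5.1, 3.5, 1.4], [6.2, 2.9, 4.3]])
print("POST", path)
print(json.dumps(body, indent=2))
```

Because the request shape is the same for every compliant inference server, an orchestrating framework can, in principle, swap the serving runtime without changing its clients.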
Conclusion: Embrace and shape MLOps practices to create added value
The future of MLOps is extremely bright. Organizations can build resilient and efficient operational processes by identifying and evaluating emerging trends, drawing conclusions from them, and proactively addressing the associated challenges. In doing so, they provide AI models for their customers and create a solid foundation that ensures the models’ scalability, availability, and reliability and fulfills legal requirements. However, the most important added value that arises from this process is trust in the AI models’ reliability, fairness, and security, which in turn strengthens trust in the company.
Florian Erhard
Florian is a Machine Learning Engineer at HiQ. He loves traveling and immersing himself in foreign cultures when he isn’t reading about AI & Machine Learning.