From system design to daily performance tuning, here's a checklist of ways to make your systems run effectively.

Guess what? Cloud computing conferences are now generative AI conferences. How did that happen? It's a simple matter of cloud providers seeing generative AI as the best way to sell more cloud services, and they are not wrong. As businesses shift to an AI-driven ecosystem, most of that shift plays out in cloud computing environments. The cloud is usually where you find state-of-the-art generative AI systems, along with scalability, accessibility, and cost-effectiveness. As we embark on this journey, how should enterprises operate these systems effectively? What best practices should we consider?

Understanding generative AI in the cloud

Generative AI models, in simple terms, are systems that learn patterns and structures from input data to generate new, original content. This content is the output data, which can be any type of structured or unstructured information. If you're looking for existing patterns to build on, this is a data operations problem more than anything. However, there are core differences, including the fact that the processing is much more frequent, and that data input and output performance defines the performance of generative AI systems in the cloud.

Processes for cloud-based generative AI systems

Let's define a basic process or set of best practices. Operations people love checklists, so here's mine.

Design your system. Your AI and cloud solution must scale together, and generative AI models need efficient management of storage and compute resources. Code your application to take advantage of cloud-native services as much as possible; it's cost-effective and simplifies things. This is where devops comes in, coordinating with the development team to optimize code. The idea is that you won't need to deal with operational problems if the system is designed correctly in the first place.
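The "scale together" design point can be illustrated with a rough capacity calculation of the kind you would do before choosing instance counts. This is a minimal sketch; the workload numbers (requests per second, tokens per request, per-instance throughput) are hypothetical, not figures from this article:

```python
import math

def required_instances(peak_requests_per_sec: float,
                       avg_tokens_per_request: float,
                       tokens_per_sec_per_instance: float,
                       headroom: float = 0.3) -> int:
    """Estimate how many model-serving instances a workload needs.

    headroom reserves spare capacity (30% by default) so traffic
    spikes don't immediately saturate the fleet.
    """
    demand = peak_requests_per_sec * avg_tokens_per_request  # tokens/sec
    usable = tokens_per_sec_per_instance * (1 - headroom)    # per instance
    return max(1, math.ceil(demand / usable))

# Hypothetical workload: 50 req/s, 800 tokens each, 10,000 tokens/s per instance
print(required_instances(50, 800, 10_000))  # → 6
```

In a cloud-native design, a calculation like this sets the baseline, and a managed autoscaler handles the variation around it rather than leaving it to the operations team.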
Most of the operations issues I run into come down to the core design of the systems.

Garbage in equals garbage out. To get meaningful output from AI, high-quality and appropriately formatted data must be fed into the system. Managing, validating, and securing the data feeding into your AI engine is critical, as is collecting the data coming out of those systems. Automating this stage is a significant time saver, including data quality checks before the ingestion of the training data. I've traced most generative AI hallucinations back to insufficient and low-quality data.

Have regular checkups. Generative AI software isn't a set-it-and-forget-it tool. This technology needs regular performance tuning and optimization from the start of its life. The dynamic nature of AI requires consistent monitoring to ensure that the parameters provide the best operational results. This means tweaking the systems often, perhaps daily.

Address security with diligent access controls. Because your generative AI systems live in the cloud, security must include data encryption and regular audits. You had better stay friends with those compliance policies because they aren't going anywhere, and you'll need to automate them during and after deployment into production. The idea is to isolate as much volatility as possible in a separate domain and thus rely on broadly applied policies to handle compliance and security parameters. This matters even more for generative AI systems on a public cloud.

Set up alerts for system failures. Keeping tabs on usage patterns, performing regular maintenance, and staying updated with patches and new versions are essential. Automation can again come to the rescue, easing the burden and increasing efficiency. Still, you'll have to automate the automation to keep up with the number of changes you need to implement.

Ready, aim, fire! Get your system running correctly first. This means making design and code changes before deployment.
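The automated data-quality checks described above, run before training data reaches the ingestion stage, might look something like this minimal sketch. The field names, length threshold, and reject-rate limit are hypothetical choices for illustration:

```python
def validate_record(record: dict, required_fields=("id", "text")) -> list:
    """Return a list of problems found in one training record."""
    problems = []
    for field in required_fields:
        if field not in record or record[field] in (None, ""):
            problems.append(f"missing or empty field: {field}")
    text = record.get("text") or ""
    if len(text) < 20:                      # too short to carry signal
        problems.append("text shorter than 20 characters")
    if not text.isprintable():              # control chars, encoding debris
        problems.append("text contains non-printable characters")
    return problems

def quality_gate(records, max_reject_rate=0.05):
    """Split records into clean/rejected; fail loudly if too many are bad."""
    clean, rejected = [], []
    for rec in records:
        (rejected if validate_record(rec) else clean).append(rec)
    if records and len(rejected) / len(records) > max_reject_rate:
        raise ValueError(f"{len(rejected)}/{len(records)} records failed checks")
    return clean, rejected
```

In practice, a gate like this runs inside the ingestion pipeline, with rejected records routed to a review queue rather than silently dropped, so the low-quality data that drives hallucinations never reaches training.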
In many instances, enterprises push things out and hope the operations team can solve design flaws that cause performance and stability problems, as well as problems with the overall accuracy of the systems. Too many enterprises take a "ready, fire, aim" approach to generative AI in the cloud. This costs too much money and reduces the value of these systems through largely avoidable production issues.

We should approach this with a willingness to fix issues before deploying the first generation of these cloud-based systems. They are far too important to get wrong. Let's try not to create problems; they only get bigger during operations.