David Linthicum
Contributor

Best practices for operating cloud-based generative AI systems

analysis
Oct 17, 2023 · 4 mins
Artificial Intelligence | Cloud Computing | Emerging Technology

From system design to daily performance tuning, here’s a checklist of ways to make your systems run effectively.

Credit: Chainarong Prasertthai / Getty Images

Guess what? Cloud computing conferences are now generative AI conferences. How did that happen? It’s a simple matter of cloud providers seeing generative AI as the best way to sell more cloud services, and they are not wrong.

As businesses shift to an AI-driven ecosystem, most of it plays out in cloud computing environments. That’s usually where you find state-of-the-art generative AI systems, scalability, accessibility, and cost-effectiveness. As we embark on this journey, how should enterprises operate these systems effectively? What best practices should we consider?

Understanding generative AI in the cloud

Generative AI models, in simplistic terms, are systems that take patterns and structures from input data to generate new, original content. This content is the output data, which can be any type of structured or unstructured information. 

If you’re looking for existing patterns to build on, this is mostly a data operations problem. However, there are core differences, including the fact that the processing is much more frequent, and that data input and output performance defines the performance of generative AI systems in the cloud.

Processes for cloud-based generative AI systems

Let’s define a basic process or set of best practices. Operations people love checklists, so here’s mine.

Design your system. Your AI and cloud solution must scale together, and generative AI models need efficient management of storage and compute resources. You must code your application to take advantage of cloud-native services as much as possible. It’s cost-effective and simplifies things. This is where devops comes in, coordinating with the development team to optimize the code.

The idea is that you don’t need to deal with operational problems if the system is designed correctly in the first place. Most of the operations issues I run into come down to the core design of the systems.

Garbage in equals garbage out. To get meaningful output from AI, you must feed high-quality, appropriately formatted data into the system. Managing, validating, and securing the data feeding into your AI engine is critical, as is collecting the data from those source systems. Automating this stage is a significant time saver, including running data quality checks before the ingestion of the training data. I’ve traced most generative AI hallucinations back to insufficient and low-quality data.
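To make the data-quality gate concrete, here is a minimal sketch of pre-ingestion checks. The field names (`text`, `source`) and the minimum-length threshold are illustrative assumptions, not part of any particular pipeline:

```python
# Minimal pre-ingestion data-quality gate. Field names and thresholds
# are illustrative; adapt them to your own training-data schema.

def quality_check(record: dict) -> list[str]:
    """Return a list of problems found in one training record."""
    problems = []
    text = record.get("text")
    if not isinstance(text, str) or not text.strip():
        problems.append("missing or empty text")
    elif len(text) < 20:
        problems.append("text too short to be useful")
    if record.get("source") is None:
        problems.append("no source attribution")
    return problems


def filter_for_ingestion(records: list[dict]):
    """Split records into clean ones and rejects with reasons."""
    clean, rejected = [], []
    for record in records:
        problems = quality_check(record)
        if problems:
            rejected.append((record, problems))
        else:
            clean.append(record)
    return clean, rejected
```

Running a gate like this on every batch, and logging the rejects, gives you an audit trail for exactly the hallucination-causing data problems described above.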

Have regular checkups. Generative AI software isn’t a set-it-and-forget-it tool. This technology needs regular performance tuning and optimization from the start of its life. The dynamic nature of AI requires consistent monitoring to ensure that the parameters provide the best operational results. This means tweaking the systems often—perhaps daily. 
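A simple way to drive those daily checkups is to watch an operational metric against a baseline and flag drift. This is a minimal sketch; the metric (say, p95 latency), window size, and tolerance are assumptions you would tune for your own system:

```python
# Sketch of a daily checkup: flag when a rolling window of an
# operational metric drifts beyond a tolerance from its baseline.
from collections import deque
from statistics import mean


class MetricWatcher:
    def __init__(self, baseline: float, window: int = 7, tolerance: float = 0.2):
        self.baseline = baseline
        self.tolerance = tolerance
        self.samples = deque(maxlen=window)  # keep only recent readings

    def record(self, value: float) -> None:
        self.samples.append(value)

    def needs_tuning(self) -> bool:
        """True when the windowed average drifts beyond tolerance."""
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough data to judge yet
        drift = abs(mean(self.samples) - self.baseline) / self.baseline
        return drift > self.tolerance
```

Wiring a watcher like this into your monitoring turns “tweak the systems often” into a concrete, automatable signal.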

Address security with diligent access controls. Because your generative AI systems live in the cloud, security must include data encryption and regular audits. You better stay friends with those compliance policies because they aren’t going anywhere, and you’ll need to automate those policies during and after deployment into production. The idea should be to put as much volatility as possible into a separate domain and apply policies broadly to deal with compliance and security parameters. This is even more important for generative AI systems on a public cloud.
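Automating those policies can be as simple as a deploy-time gate that rejects resources missing required security parameters. This is a hypothetical sketch; the manifest shape and the `encryption`/`access_role` keys are illustrative, not from any specific cloud provider:

```python
# Sketch of a deploy-time compliance check: every resource in a
# (hypothetical) deployment manifest must declare encryption and an
# access role. Key names are illustrative assumptions.

REQUIRED_KEYS = {"encryption", "access_role"}


def policy_violations(manifest: list[dict]) -> list[str]:
    """Return human-readable violations for a list of resource specs."""
    violations = []
    for resource in manifest:
        missing = REQUIRED_KEYS - resource.keys()
        for key in sorted(missing):
            name = resource.get("name", "<unnamed>")
            violations.append(f"{name}: missing {key}")
    return violations
```

Running this in the deployment pipeline, and failing the build on any violation, keeps the compliance policies enforced automatically rather than by periodic manual review.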

Set up alerts for system failures. Keeping tabs on usage patterns, performing regular maintenance, and staying updated with patches and new versions are essential. Automation can again come to the rescue, easing the burden and increasing efficiency. Still, you’ll have to automate the automation itself to keep up with the number of changes you need to implement.
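As one illustration of such an alert, here is a minimal sketch that pages when the failure rate over a window of recent requests crosses a threshold. The `notify` hook is a stand-in for a real alerting integration, and the window and threshold are assumptions:

```python
# Sketch of a failure alert: fire a notification when the failure rate
# over the most recent requests crosses a threshold. `notify` stands in
# for a real paging or alerting hook.
from collections import deque


def make_failure_alert(window: int = 100, threshold: float = 0.05, notify=print):
    results = deque(maxlen=window)

    def observe(success: bool) -> bool:
        """Record one request outcome; return True if an alert fired."""
        results.append(success)
        if len(results) == window:  # only judge on a full window
            failure_rate = results.count(False) / window
            if failure_rate > threshold:
                notify(f"failure rate {failure_rate:.0%} exceeds {threshold:.0%}")
                return True
        return False

    return observe
```

A closure like this is easy to attach per endpoint or per model version, which helps when you need many such alerts without hand-maintaining each one.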

Ready, aim, fire!

Get your system running correctly first. This means making design and code changes before deployment. In many instances, enterprises push things out and hope the operations team can solve design flaws that cause performance and stability problems, as well as problems with the systems’ overall accuracy. Too many enterprises take a “ready, fire, aim” approach to generative AI in the cloud. This costs too much money and reduces the value of these systems through largely avoidable production issues.

We should approach this with the willingness to fix as many issues as possible before deploying the first generation of these cloud-based systems. They are far too important to get wrong. Let’s try not to create problems; they only get bigger during operations.


David S. Linthicum is an internationally recognized industry expert and thought leader. Dave has authored 13 books on computing, the latest of which is An Insider’s Guide to Cloud Computing. Dave’s industry experience includes tenures as CTO and CEO of several successful software companies, and upper-level management positions in Fortune 100 companies. He keynotes leading technology conferences on cloud computing, SOA, enterprise application integration, and enterprise architecture. Dave writes the Cloud Computing blog for InfoWorld. His views are his own.
