I’m having ‘Sixth Sense’ moments when I see dead databases walking. With GenAI poised to eat your data for lunch, it’s time to fix performance problems.

“My cloud application is slow” is a common complaint. Nine times out of ten, however, the cause lies not with the application processing but with the database’s inability to serve the application at the required performance level.

It’s almost 2024. Why are we still having these issues with cloud-based database performance? What are the most common causes? How can we fix them? I have a few ideas.

Did you choose the right cloud service?

Cloud providers offer many database services, such as Amazon RDS, Azure SQL Database, and Google Cloud SQL. Sometimes the database you chose based on your application’s requirements, scalability, and performance expectations must be changed for one that is a more appropriate fit.

In many cases, databases were selected for the wrong reasons. For instance, a team expects that it will need to store and manage binaries in the future and selects an object database, when a relational database is the right choice for the actual use case. Consider all factors, including managed services, geographic locations, and compatibility.

Also, consider performance when selecting a database type and brand. Too often the assumption is that it’s on the cloud, and the cloud is “infinitely scalable,” so any database will perform well. The type of database you select (columnar, hierarchical, relational, object, etc.) should depend on the type of data you’re looking to store and how you’ll use that data. The most popular database and the one that works for your specific use case are rarely the same.

How’s your database design and indexing?

This is huge. Efficient database design and proper indexing significantly impact performance. Most database performance problems trace their roots to database design issues, especially overly complex database structures and misapplied indexing.
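
To see what an index does to a lookup, here is a minimal sketch that uses Python’s built-in sqlite3 module rather than a cloud database. The orders table, its columns, and the idx_orders_customer index are made up for illustration. The exact plan text varies by SQLite version, and a managed cloud database will have its own EXPLAIN tooling.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last of the four columns in each row holds the plan description.
    return [row[3] for row in rows]


# Hypothetical schema and data, kept in memory so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(10_000)],
)

lookup = "SELECT total FROM orders WHERE customer_id = ?"

# Without an index on customer_id, SQLite has to read the whole table.
plan_before = query_plan(conn, lookup, (42,))

# With the index, the same query can seek directly to the matching rows.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, lookup, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

If the first plan reports a scan of the table and the second names the index, the index is doing its job. The same before-and-after check is worth running against your own slow queries whenever you change an index.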
Make sure to establish appropriate indexes to speed up data retrieval. Regularly review and optimize queries to eliminate bottlenecks. Make sure that your database schema is optimized. Also, normalize the database where necessary, but know that over-normalizing can be just as bad.

For those who didn’t take Database 101 in the 1990s, normalization means organizing data into separate, interrelated tables or other native database containers. It aims to minimize redundancy and dependency by eliminating duplicate data and breaking down larger tables into smaller, more manageable ones. I’ve found that normalizing a database with performance in mind is often overlooked, and that oversight causes many performance issues.

Are you scaling resources appropriately?

Although public cloud providers offer highly scalable resources to adapt to varying workloads, those resources are often not used effectively. You need to investigate auto-scaling features that dynamically adjust resources based on demand. Horizontal scaling (adding more instances) and vertical scaling (increasing instance size) can be used strategically for high-performance requirements.

However, be careful about allowing the cloud provider to allocate resources automatically on your behalf. In many instances, it allocates too many, and you’ll get a big bill at the end of the month. Find a balance rather than just selecting the auto-scale button.

Is your storage configuration a disaster?

It’s best to optimize storage configurations based on the workload characteristics, not the best practices you saw in a cloud certification course. For instance, use SSDs for I/O-intensive workloads, but understand that they are often more expensive. Also, choose the right storage tier and implement caching mechanisms to reduce the need for frequent disk I/O operations.
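
As a rough illustration of a caching mechanism you control yourself, the sketch below is a small read-through cache with a time-to-live. ReadThroughCache, load_customer, and the 30-second TTL are hypothetical names and values chosen for this example, and the loader function stands in for a real database query.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ReadThroughCache:
    """Serve repeated reads from memory and reload an entry once it expires."""

    def __init__(
        self,
        loader: Callable[[Hashable], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader    # called on a miss; stands in for a database read
        self._ttl = ttl_seconds  # how long a cached value is considered fresh
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Missing or expired: go to the backing store and remember the result.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value


calls = []


def load_customer(customer_id):
    """Hypothetical loader that records each trip to the 'database'."""
    calls.append(customer_id)
    return {"id": customer_id, "name": f"customer-{customer_id}"}


cache = ReadThroughCache(load_customer, ttl_seconds=30.0)
cache.get(7)  # miss: calls the loader
cache.get(7)  # hit: served from memory
print("loader calls:", len(calls), "hits:", cache.hits, "misses:", cache.misses)
```

The TTL is the knob that trades freshness against load on the database. A longer TTL saves more reads but serves staler data, which is exactly the kind of decision an automated cache makes for you without asking.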
Indeed, caching has also become largely automated, yet you may need more granular control to find the optimum performance at the minimum cost.

Cloud architects and database engineers need to do better at database performance. In some cases, it means getting back to the basics of good database design, configuration, and deployment. This is becoming a lost art, as those charged with cloud systems seem to prefer tossing money at the problem. That is not the way you solve problems.