9 Ways Database Technology Gives Your Applications an Edge

With incredible amounts of data to store and query, developers can no longer simply insert their data into a relational database (such as MySQL or Oracle).

Instead, they must now look to newer database technology to give their applications an edge. This post looks at nine common use cases and the corresponding databases that are their ideal (or near-ideal) fits. 

1. My data is stored on an embedded edge device… 

RocksDB is a very fast NoSQL Key-Value database intended to be embedded directly in an application. It is a fork of LevelDB, and neither database has any clustering or scale-out features, but both are very fast on a single device. SQLite also works very well for embedding a database within a single application instance and allows developers to use most SQL statements.

These lightweight databases are used in mobile devices as well as desktop and server applications. They are ideal when extremely low latency is required, because reads and writes go directly to disk or memory rather than over a network. The caveat is that the data set must be small enough to fit on a single device, as these databases do not feature data partitioning or clustering capabilities.
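
As a rough illustration of what "embedded" means in practice, the sketch below uses Python's standard `sqlite3` module. The database lives inside the application process, so there is no server to install or connect to. The `readings` table and the helper names are made up for this example.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open an embedded SQLite database; no server process is involved."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (sensor TEXT, value REAL)"
    )
    return conn

def add_reading(conn, sensor, value):
    # Using the connection as a context manager commits on success
    # and rolls back if the statement raises.
    with conn:
        conn.execute("INSERT INTO readings VALUES (?, ?)", (sensor, value))

def average(conn, sensor):
    """Return the mean value for a sensor, or None if it has no rows."""
    row = conn.execute(
        "SELECT AVG(value) FROM readings WHERE sensor = ?", (sensor,)
    ).fetchone()
    return row[0]

conn = open_store()  # ":memory:" keeps the example self-contained
add_reading(conn, "temp", 20.0)
add_reading(conn, "temp", 22.0)
print(average(conn, "temp"))  # 21.0
```

Passing a file path instead of `":memory:"` would persist the data on the device's local storage.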

2. I’m examining the connections between different entities… 

Graph databases such as Neo4j excel at storing and examining data that consists of nodes and the edges, or relationships, between them. They are typically used for networks, roads, social relationships, and other similar use cases. They focus on the relationship rather than the entity and can track or optimize the path a car takes through a road network, the degrees of separation between friends, or the number of hops it takes a packet to traverse the internet based on different rules.
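
To make the "degrees of separation" idea concrete, here is a minimal sketch in plain Python. It is not Neo4j's API or query language. It assumes the graph is small enough to hold in a dictionary and uses a breadth-first search, which is one common way to count hops between two nodes. The `friends` data is invented for the example.

```python
from collections import deque

def degrees_of_separation(graph, start, goal):
    """Return the fewest hops from start to goal, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None

# Hypothetical friendship graph stored as an adjacency list.
friends = {
    "ana": ["ben"],
    "ben": ["ana", "cho"],
    "cho": ["ben", "dev"],
    "dev": ["cho"],
    "eli": [],
}

print(degrees_of_separation(friends, "ana", "dev"))  # 3
```

A graph database stores relationships so that this kind of traversal does not have to be assembled from repeated joins or lookups in application code.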

3. I need to generate reports based on a huge amount of data… 

Column-oriented database systems are commonly used for data warehousing or Online Analytical Processing (OLAP), where data is regularly queried and its aggregations analyzed. Organizing data by column can improve efficiency over a row-oriented DBMS, because an aggregation only has to read the columns it needs. When you have terabytes or even petabytes of data, scale-out wide column store databases like Cassandra and HBase (scale-out meaning that additional computers are added as workload increases) are going to be the best option. Having said that, Cassandra and HBase are extremely different: Cassandra is peer-to-peer, while HBase has a conventional master-slave architecture that is built on top of Hadoop/HDFS.
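
The difference between row and column layouts can be sketched in a few lines of plain Python. This only illustrates the general idea and is not how Cassandra or HBase store data internally. It assumes every row has the same fields, and the sales figures are invented.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"region": "north", "units": 3, "price": 2.5},
    {"region": "south", "units": 5, "price": 4.0},
    {"region": "north", "units": 2, "price": 1.0},
]

def to_columns(rows):
    """Pivot row records into one list per field (column-oriented layout)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns

def total_units_row_store(rows):
    # Has to visit every whole record even though only one field is needed.
    return sum(row["units"] for row in rows)

def total_units_column_store(columns):
    # Reads just the one column the aggregation cares about.
    return sum(columns["units"])

columns = to_columns(rows)
print(total_units_row_store(rows), total_units_column_store(columns))  # 10 10
```

Both functions return the same total. The column layout simply touches less data to get there, which matters more as tables get wider and longer.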

4. My data describes a variety of things… 

Describing people, products, and other things with different attributes is where a document database like MongoDB excels. With a schemaless database, records may or may not have a particular attribute, but you can still index and query the different attributes. For example, some products may have a “colour” attribute while others do not, but the application is still able to query for all products that are red. The flexible data model also allows frequent application updates, reducing application maintenance costs.
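
The “colour” example can be sketched with plain Python dictionaries standing in for documents. This is not MongoDB's API. The `find` and `build_field_index` helpers are hypothetical names used only to show how records with different attributes can still be queried and indexed.

```python
# Documents in one collection do not need to share the same attributes.
products = [
    {"name": "mug", "colour": "red", "price": 8},
    {"name": "ebook", "price": 5},
    {"name": "scarf", "colour": "red", "size": "M"},
    {"name": "pen", "colour": "blue"},
]

def find(documents, **criteria):
    """Return documents whose fields match every given criterion.

    Documents that lack a requested field simply do not match.
    """
    return [
        doc for doc in documents
        if all(key in doc and doc[key] == value
               for key, value in criteria.items())
    ]

def build_field_index(documents, field):
    """Map each value of a field to the positions of documents that have it."""
    index = {}
    for position, doc in enumerate(documents):
        if field in doc:
            index.setdefault(doc[field], []).append(position)
    return index

red_names = [doc["name"] for doc in find(products, colour="red")]
print(red_names)  # ['mug', 'scarf']
```

The ebook has no “colour” attribute at all, yet it sits in the same collection and is skipped by the query without any schema change.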

5. I just need a simple cache… 

Redis is a fast in-memory, scale-out Key-Value database that is perfect for an application cache such as storing authentication tokens, session IDs, or any other use case where read and write performance is crucial, but data durability is not. If you’re only caching a small amount of data, most programming languages have cache libraries available that implement simple Key-Value stores with additional useful features like data expiration. 
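
For the small, in-process case mentioned above, a Key-Value cache with data expiration can be sketched in a few lines. This is an illustrative class, not Redis and not any particular library. The `TTLCache` name and the injectable `clock` parameter are assumptions made for the example.

```python
import time

class TTLCache:
    """A tiny in-process Key-Value cache whose entries expire."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._items = {}

    def set(self, key, value):
        # Store the value together with the moment it stops being valid.
        self._items[key] = (value, self._clock() + self._ttl)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are removed lazily, on access.
            del self._items[key]
            return default
        return value

sessions = TTLCache(ttl_seconds=1800)
sessions.set("session-42", {"user": "ana"})
print(sessions.get("session-42"))  # {'user': 'ana'}
```

Everything here is lost when the process exits, which is the trade-off described above: fast reads and writes in exchange for no durability.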

6. My data has lots of transactions… 

Conventional relational databases (RDBMS) like MySQL, PostgreSQL, Oracle, and Microsoft SQL Server make up 50% of the top 10 of the DB-Engines database ranking at the time of writing and are still the right fit for many applications, despite the recent popularity of scale-out NoSQL databases. One of the main benefits of relational databases is that they offer the flexibility of SQL, allowing the logic of an application to be built into the query statements.

Because RDBMSs are scale-up rather than scale-out (scale-up machines simply add more CPU, memory, or disk horsepower as workload increases), they are able to guarantee ACID (atomic, consistent, isolated, durable) transactions, making them ideal candidates for Online Transactional Processing (OLTP).
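
The "atomic" part of ACID is easiest to see in code. The sketch below uses Python's standard `sqlite3` module as a stand-in for any relational database. The `accounts` table and the helper names are invented for the example. The transfer credits the target first on purpose, so that a failed debit has something to undo.

```python
import sqlite3

def make_bank():
    """Create an in-memory database with two example accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn

def transfer(conn, source, target, amount):
    """Move money atomically; return False if the transfer is rejected."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            # Violates the CHECK constraint if the source would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

bank = make_bank()
print(transfer(bank, "alice", "bob", 500), balances(bank))
# False {'alice': 100, 'bob': 50}
```

The rejected transfer leaves both balances untouched, even though the credit to the target had already run inside the transaction.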

7. My data is generated by a large number of heterogeneous devices… 

GridDB was developed to store planet-scale time series data generated by the massive number of Internet of Things (IoT) devices. It is a purpose-built distributed database for Industrial IoT and Big Data use cases. Its Key-Container data model, a variant of the Key-Value store, allows fast ingestion and querying of sensor data. Time series features such as data retention/automatic expiration, compression, sampling, and time-weighted aggregations make it easier to manage time series data at scale.
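
To show why a time-weighted aggregation differs from a plain average, here is a small sketch in plain Python. It does not use GridDB's API, and it assumes one particular definition: each reading holds its value until the next reading arrives. Timestamps are assumed to be in seconds and strictly increasing. The retention helper is likewise only an illustration of the expiration idea.

```python
def time_weighted_average(samples):
    """Average of (timestamp, value) samples, weighted by how long each held.

    Assumes samples are sorted with strictly increasing timestamps and that
    a value stays in effect until the next sample.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if len(samples) == 1:
        return samples[0][1]
    weighted_sum = 0.0
    for (t0, v0), (t1, _) in zip(samples, samples[1:]):
        weighted_sum += v0 * (t1 - t0)
    duration = samples[-1][0] - samples[0][0]
    return weighted_sum / duration

def drop_expired(samples, now, retention_seconds):
    """Keep only samples that are no older than the retention period."""
    cutoff = now - retention_seconds
    return [(t, v) for t, v in samples if t >= cutoff]

# Irregularly spaced sensor readings: (seconds, value).
readings = [(0, 10.0), (10, 20.0), (40, 20.0)]
print(time_weighted_average(readings))  # 17.5
```

A plain mean of the three values would be about 16.7. The time-weighted result is higher because the sensor spent 30 of the 40 seconds at 20.0, which is the kind of distortion irregular sampling introduces.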

8. I don’t want to worry about administering a database… 

Both Microsoft and Amazon offer multi-purpose Database as a Service (DBaaS) products as part of their cloud services. Amazon’s DynamoDB offers Key-Value and Document data models, while Microsoft’s Cosmos DB offers Key-Value, Document, Graph, and Wide-Column models. Both allow easy, global scaling. These fully managed, hosted databases are easier to administer and are often less expensive to deploy than non-hosted solutions.

9. I need to search through lots of text… 

Elasticsearch is a distributed search engine that can search for just about anything. Built on top of Apache Lucene, its schemaless design builds an index of input JSON documents in near real-time. It is suitable for full-text search of log files, user-generated content, or public text sources, and it can also aggregate data to examine trends and auto-complete queries.
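
The core structure behind full-text search is an inverted index, which maps each term to the documents that contain it. The sketch below builds one in plain Python. It is a toy version of the idea, not Elasticsearch's or Lucene's API, and it assumes simple lowercase alphanumeric tokenization with all query terms required to match. The log lines are invented.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

logs = {
    1: "Disk full on server alpha",
    2: "Server beta restarted",
    3: "User login failed on server alpha",
}
index = build_inverted_index(logs)
print(sorted(search(index, "server alpha")))  # [1, 3]
```

Because the lookup goes from term to documents, a query never has to scan the text of every document, which is what keeps search fast as the corpus grows.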


About the Author

Israel Imru is an engineer at Fixstars Solutions. Fixstars Solutions is an innovator in flash storage solutions devoted to “Speed up your Business”. Combining expertise in multi-core processor programming and the use of next-generation memory technology, Fixstars provides the best performance and the highest capacity storage solutions.

Featured image: ©Siar