Database (DB) Definition
A database is a collection of information that is organized so that it can be easily accessed, managed and updated.
Data is organized into rows, columns and tables, and it is indexed to make relevant information easier to find. Data is updated, expanded and deleted as new information is added. A database system processes workloads that create and update its contents, query the data it holds and run applications against it.
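As a minimal sketch of these ideas, the following Python example uses the standard sqlite3 module with an in-memory database. The products table, its columns and the index name are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; the table, column, and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")

# Each tuple becomes one row; each position maps to one column.
conn.executemany(
    "INSERT INTO products (name, price) VALUES (?, ?)",
    [("widget", 2.50), ("gadget", 10.00), ("gizmo", 7.25)],
)

# An index on name lets the engine locate matching rows without scanning the table.
conn.execute("CREATE INDEX idx_products_name ON products (name)")

# Update and delete rows as the information changes.
conn.execute("UPDATE products SET price = 3.00 WHERE name = 'widget'")
conn.execute("DELETE FROM products WHERE name = 'gizmo'")

rows = conn.execute("SELECT name, price FROM products ORDER BY name").fetchall()

# Ask SQLite how it would run a lookup by name; the plan text should mention the index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM products WHERE name = ?", ("gadget",)
).fetchall()
plan_text = " ".join(str(step[-1]) for step in plan)
```

The exact wording of the query plan varies between SQLite versions, so the sketch only collects the plan text rather than relying on a particular format.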
Computer databases typically contain aggregations of data records or files, such as sales transactions, product catalogs and inventories, and customer profiles.
Typically, a database manager provides users with the ability to control read/write access, specify report generation and analyze usage. Some databases offer ACID (atomicity, consistency, isolation and durability) compliance to guarantee that data is consistent and that transactions are complete.
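The atomicity part of ACID can be sketched with Python's sqlite3 module. The accounts table and the transfer function below are hypothetical; the point is only that a failed transaction leaves no partial changes behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as a single all-or-nothing transaction."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit above was undone too.
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

In this sketch the credit to the destination account is deliberately applied first, so the rollback of the failed transfer visibly undoes a change that had already been made inside the transaction.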
Databases are prevalent in large mainframe systems, but are also present in midrange systems, such as IBM’s AS/400, and in smaller distributed workstations and personal computers.
Evolution of databases
Databases have evolved since their inception in the 1960s, beginning with hierarchical and network databases, through the 1980s with object-oriented databases, and today with SQL and NoSQL databases and cloud databases.
In one view, databases can be classified according to content type: bibliographic, full text, numeric and images. In computing, databases are sometimes classified according to their organizational approach. There are many different kinds of databases, ranging from the most prevalent approach, the relational database, to a distributed database, cloud database or NoSQL database.
Relational database
A relational database, invented by E.F. Codd at IBM in 1970, is a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways.
The Structured Query Language (SQL) is the standard user and application program interface for a relational database. Relational databases are easy to extend, and a new data category can be added after the original database creation without requiring that you modify all the existing applications.
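The extensibility described here can be illustrated with a short sketch using Python's sqlite3 module. The customers table and the existing_app_query function are hypothetical stand-ins for a real schema and a real application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (name) VALUES ('Ada')")


def existing_app_query(conn):
    # Stands in for an application written before the schema change.
    # It names its columns explicitly, so new columns do not affect it.
    return conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()


before = existing_app_query(conn)

# Add a new data category after the original database creation.
conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")
conn.execute("INSERT INTO customers (name, email) VALUES ('Grace', 'grace@example.com')")

after = existing_app_query(conn)
emails = conn.execute("SELECT name, email FROM customers ORDER BY id").fetchall()
```

The older query keeps working unchanged, and rows that predate the new column simply hold a null value for it. An application that selects every column with SELECT * would see the extra field, which is one reason to name columns explicitly.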
Distributed database
A distributed database is a database in which portions of the database are stored in multiple physical locations, and in which processing is dispersed or replicated among different points in a network.
Distributed databases can be homogeneous or heterogeneous. All the physical locations in a homogeneous distributed database system have the same underlying hardware and run the same operating systems and database applications. The hardware, operating systems or database applications in a heterogeneous distributed database may be different at each of the locations.
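One way to picture a homogeneous distributed database is the toy sketch below, in which three in-memory SQLite databases stand in for three physical sites running the same software. The site names, the orders table and the checksum-based routing rule are all assumptions made for illustration; production systems use far more elaborate placement and replication schemes.

```python
import sqlite3
import zlib

# Each in-memory database stands in for one physical site (hypothetical names).
sites = {name: sqlite3.connect(":memory:") for name in ("site_a", "site_b", "site_c")}
for conn in sites.values():
    conn.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, total REAL)")

site_names = sorted(sites)


def site_for(order_id):
    # A stable checksum keeps the routing identical across runs and machines.
    return site_names[zlib.crc32(order_id.encode("utf-8")) % len(site_names)]


def put(order_id, total):
    sites[site_for(order_id)].execute(
        "INSERT INTO orders VALUES (?, ?)", (order_id, total)
    )


def get(order_id):
    row = sites[site_for(order_id)].execute(
        "SELECT total FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    return row[0] if row else None


def grand_total():
    # A query spanning the whole database has to visit every site.
    return sum(
        conn.execute("SELECT COALESCE(SUM(total), 0) FROM orders").fetchone()[0]
        for conn in sites.values()
    )


for i in range(10):
    put(f"order-{i}", float(i))
```

A lookup by key touches only the site that owns the key, while an aggregate such as grand_total has to combine partial results from every site.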
Cloud database
A cloud database is a database that has been optimized or built for a virtualized environment, whether a hybrid cloud, public cloud or private cloud. Cloud databases provide benefits such as the ability to pay for storage capacity and bandwidth on a per-use basis, and they provide scalability on demand, along with high availability.
A cloud database also gives enterprises the opportunity to support business applications in a software-as-a-service deployment.
NoSQL database
NoSQL databases are useful for large sets of distributed data.
NoSQL databases are effective for big data performance issues that relational databases aren’t built to solve. They are most effective when an organization must analyze large chunks of unstructured data or data that’s stored across multiple virtual servers in the cloud.
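Many NoSQL systems store records without a fixed schema. The sketch below imitates that idea with a toy in-memory document collection in Python; the put and find helpers are hypothetical and do not correspond to any particular product's API.

```python
import json

# A toy in-memory document collection; real NoSQL systems add persistence,
# distribution, and indexing on top of this basic idea.
collection = {}


def put(doc_id, document):
    # Store the document as JSON text, as a document store might.
    collection[doc_id] = json.dumps(document)


def find(predicate):
    # Scan every document and return those the predicate accepts.
    docs = (json.loads(raw) for raw in collection.values())
    return [doc for doc in docs if predicate(doc)]


# Documents in the same collection need not share a fixed set of fields.
put("u1", {"name": "Ada", "tags": ["admin", "dev"]})
put("u2", {"name": "Grace", "location": {"city": "Arlington"}})
put("u3", {"name": "Linus", "tags": ["dev"], "followers": 1200})

devs = sorted(doc["name"] for doc in find(lambda d: "dev" in d.get("tags", [])))
```

Because no table definition constrains the documents, each record can carry its own fields, and queries have to tolerate fields that are absent.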
Object-oriented database
Items created using object-oriented programming languages are often stored in relational databases, but object-oriented databases are well suited to storing them directly.
An object-oriented database is organized around objects rather than actions, and data rather than logic. For example, a multimedia record can be a definable data object, as opposed to the alphanumeric values typically held in a relational database.
Graph database
A graph-oriented database, or graph database, is a type of NoSQL database that uses graph theory to store, map and query relationships. Graph databases are basically collections of nodes and edges, where each node represents an entity, and each edge represents a connection between nodes.
Graph databases are growing in popularity for analyzing interconnections. For example, companies might use a graph database to mine data about customers from social media.
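The node-and-edge model can be sketched in plain Python. The example below is not a graph database, only an illustration of the kind of relationship query such a system answers; the names and the FOLLOWS and LIKES relationship labels are hypothetical.

```python
from collections import deque

# Nodes are entities; each edge is a (source, relationship, target) triple.
# The names and relationship labels are hypothetical.
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("carol", "FOLLOWS", "dave"),
    ("alice", "LIKES", "acme_brand"),
    ("dave", "LIKES", "acme_brand"),
]

# Build an adjacency list so traversals do not rescan the edge list.
adjacency = {}
for source, relation, target in edges:
    adjacency.setdefault(source, []).append((relation, target))


def reachable(start, relation, max_hops):
    """Return nodes reachable from start via `relation` edges within max_hops."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue
        for rel, neighbor in adjacency.get(node, []):
            if rel == relation and neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, hops + 1))
    return found


within_two = reachable("alice", "FOLLOWS", 2)
fans = sorted(s for s, r, t in edges if r == "LIKES" and t == "acme_brand")
```

Here within_two holds the accounts within two FOLLOWS hops of alice, and fans holds the nodes with a LIKES edge to acme_brand, which is the sort of interconnection analysis described above.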
Accessing the database: DBMS and RDBMS
A database management system (DBMS) is a type of software that allows you to define, manipulate, retrieve and manage data stored within a database.
A relational database management system (RDBMS) is a type of database management software that was developed in the 1970s, based on the relational model, and is still the most popular way to manage a database.
Microsoft SQL Server, Oracle Database, IBM DB2 and MySQL are among the most widely used RDBMS products for enterprise users. DBMS technologies date to the 1960s, when they were built to support hierarchical and network databases; early examples include IBM’s Information Management System and CA’s Integrated Database Management System.