Chapter 1 – Before the Advent of Database Systems
ADRIENNE WATT
The way in which computers manage data has come a long way over the last few decades. Today’s users take for granted the many benefits found in a database system. However, it wasn’t that long ago that computers relied on a much less elegant and costly approach to data management called the file-based system.
File-based System
One way to keep information on a computer is to store it in permanent files. A company system has a number of application programs; each of them is designed to manipulate data files. These application programs have been written at the request of the users in the organization. New applications are added to the system as the need arises. The system just described is called the file-based system.
Consider a traditional banking system that uses the file-based system to manage the organization’s data shown in Figure 1.1. As we can see, there are different departments in the bank. Each has its own applications that manage and manipulate different data files. For banking systems, the programs may be used to debit or credit an account, find the balance of an account, add a new mortgage loan and generate monthly statements.
Figure 1.1. Example of a file-based system used by banks to manage data.
Disadvantages of the file-based approach
Using the file-based system to keep organizational information has a number of disadvantages. Listed below are five examples.
Often, within an organization, files and applications are created by different programmers from various departments over long periods of time. This can lead to data redundancy, a situation that occurs in a database when a field needs to be updated in more than one table. This practice can lead to several problems such as:
- Inconsistency in data format
- The same information being kept in several different places (files)
- Data inconsistency, a situation where various copies of the same data are conflicting, wastes storage space and duplicates effort
Data isolation is a property that determines when and how changes made by one operation become visible to other concurrent users and systems. This issue occurs in a concurrency situation. This is a problem because:
- It is difficult for new applications to retrieve the appropriate data, which might be stored in various files.
Integrity problems
Problems with data integrity is another disadvantage of using a file-based system. It refers to the maintenance and assurance that the data in a database are correct and consistent. Factors to consider when addressing this issue are:
- Data values must satisfy certain consistency constraints that are specified in the application programs.
- It is difficult to make changes to the application programs in order to enforce new constraints.
Security problems
Security can be a problem with a file-based approach because:
- There are constraints regarding accessing privileges.
- Application requirements are added to the system in an ad-hoc manner so it is difficult to enforce constraints.
Concurrency access
Concurrency is the ability of the database to allow multiple users access to the same record without adversely affecting transaction processing. A file-based system must manage, or prevent, concurrency by the application programs. Typically, in a file-based system, when an application opens a file, that file is locked. This means that no one else has access to the file at the same time.
In database systems, concurrency is managed thus allowing multiple users access to the same record. This is an important difference between database and file-based systems.
Database Approach
The difficulties that arise from using the file-based system have prompted the development of a new approach in managing large amounts of organizational information called the database approach.
Databases and database technology play an important role in most areas where computers are used, including business, education and medicine. To understand the fundamentals of database systems, we will start by introducing some basic concepts in this area.