File Organization
We have the following hierarchy:
So above diagram shows how the whole database is organized.
Now let’s talk about the storage policies per block
.
Blocking Factor: It is the number of records per block.
And there are two types of strategies to store records in the blocks.
- Spanned Strategy: We can store the remaining part of the records into some other block.
- Advantages: We do not have wastage of memory
- Disadvantage: Block Access is increased.
- UnSpanned Strategy: No record can be stored in more than 1 block.
- Advantage: Block access is reduced.
- Disadvantage: Wastage of memory. And suitable only for fixed length records.
Organization of Records in a File
Ordered File Organization
- All records of the files are ordered on some key value. That’s the name is ordered.
- Advantage is that we can apply binary sort on the data so the searching will become efficient Example Lets say we have B data block then we can search in log(B) time.
- Key Name
- Pointer to the actual records
- Primary Index: Primary Key and the data will be ordered.
- Clustered Index: It will be applied on non-key and Ordered
- Secondary Index: It will be applied on non-key and unordered
However this approach will have more insertion time.
Why ? Because for searching we need to put the data in its appropriate location as we have to maintain the order of the data.Un Ordered File Organization
So as the name suggests that wherever we have the place we insert the records so the insertion time will reduce but the searching time will increase
We will apply linear search
Calculation of the number of data block accesses required for searchingSo let’s say for best case number of blocks will be 1 and for the worst case we have to search all the blocks so in total we have to search (1+B)/2 which is O(B/2)
Advantage:
Insertion is efficient.
Disadvantage:
searching is inefficient.
Indexing
In this process we make a file which will contain pointers which will point to actual data records.
It has two columns which has the following structure
Key_name | Pointer |
Index File size will be less as the index file will have only two fields.
Types of index file
There are two types of classification of index files.
One is based on levels.
Single Level:
We have following subtypes in this one.