Fragmentation

Kanakalakshmi_M
Mar 28, 2024

In a distributed database system, data is stored as fragments spread across multiple sites. Fragmentation can be done in several ways: horizontal, vertical, and hybrid. Done well, it helps with query optimization, because many queries can run against a smaller, local piece of the data.

Fragmentation in Databases

Fragmentation is the process of dividing a database into sub-tables (sub-relations) so that the data can be stored on different systems. These sub-tables are called fragments. Fragments are logical data units and are stored at various sites.

Suppose a table ‘T’ is divided into a number of fragments, say T1, T2, T3,... Tn. This process is called data fragmentation. Together, the fragments must contain enough information to reconstruct the original table T. The reconstruction uses a UNION of the fragments when T was split by rows, or a JOIN of the fragments when T was split by columns.

Advantages-

· Data is stored close to the site where it is used, which increases the efficiency of the database system.

· Local query optimization methods are sufficient for some queries, because the data is available locally.

· Fragmentation helps maintain the security and privacy of the database system, because a site can be given only the fragments it needs.

Disadvantages-

  • Access can be slow when a query needs data from different fragments, because the fragments must be fetched from several sites and recombined.
  • Recursive fragmentation (fragmenting the fragments again) is very expensive.

Fragmentation Types

Fragmentation can be implemented in three ways:

Horizontal Fragmentation: Horizontal fragmentation partitions a global relation ‘R’ into two or more sub-relations without modifying its schema. It is performed by applying a selection operation on R with specific conditions ‘P’ based on the data stored in the relation. Each resulting sub-relation, denoted R1, R2, R3,... Rn, has the same schema as R and contains the subset of records that satisfy its selection condition. Horizontal fragmentation is commonly used in distributed databases to spread rows across multiple sites, so that each site holds the records it uses most, which improves access and performance.
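As a rough illustration of the selection step, here is a minimal Python sketch. The relation R, its attributes (ID, CITY), the rows, and the two conditions are all assumptions made up for this example; they are not taken from the sample repository.

```python
# Hypothetical global relation R; attribute names and rows are illustrative only.
R = [
    {"ID": 1, "CITY": "Chennai"},
    {"ID": 2, "CITY": "Mumbai"},
    {"ID": 3, "CITY": "Chennai"},
    {"ID": 4, "CITY": "Delhi"},
]


def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate; the schema is unchanged."""
    return [row for row in relation if predicate(row)]


# Assumed selection conditions: every row satisfies exactly one of them,
# so the fragments are complete and do not overlap.
R1 = select(R, lambda row: row["CITY"] == "Chennai")
R2 = select(R, lambda row: row["CITY"] != "Chennai")


def union(*fragments):
    """Rebuild the relation by collecting the rows of all fragments."""
    return [row for fragment in fragments for row in fragment]


# Sorting by ID only restores the original row order for comparison.
rebuilt = sorted(union(R1, R2), key=lambda row: row["ID"])
```

Both fragments keep the full schema of R, and because the conditions do not overlap, a plain union is enough to get R back.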

  • Example: Consider a sample database with Employee(EMP), Project(PROJ), Payment(PAY) and Assignment(ASG) tables.

This sample database contains both base tables and member tables. Payment and Project are simple tables, known as base tables or owner tables, whereas Assignment and Employee are member tables because they contain foreign keys. The Assignment (ASG) table has foreign keys (ENO, PNO) that reference the Employee (EMP) and Project (PROJ) tables. The Employee table has a foreign key (TITLE) that references the Payment table.

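Primary horizontal fragmentation is horizontal fragmentation applied to base (owner) tables, using conditions on the table's own attributes. Derived horizontal fragmentation is horizontal fragmentation applied to member tables: the rows of the member table are grouped according to the owner fragment that they reference.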
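The difference between the two can be sketched in Python as follows. This is only an illustration: the SAL and ENAME columns, the row values, and the salary threshold are assumptions, while TITLE as the foreign key from Employee to Payment comes from the description above.

```python
# Owner table (hypothetical rows; SAL is an assumed column).
PAY = [
    {"TITLE": "Elect. Eng.", "SAL": 40000},
    {"TITLE": "Syst. Anal.", "SAL": 34000},
    {"TITLE": "Mech. Eng.", "SAL": 27000},
    {"TITLE": "Programmer", "SAL": 24000},
]

# Member table: TITLE is a foreign key that references PAY.
EMP = [
    {"ENO": "E1", "ENAME": "Asha", "TITLE": "Elect. Eng."},
    {"ENO": "E2", "ENAME": "Bala", "TITLE": "Syst. Anal."},
    {"ENO": "E3", "ENAME": "Chitra", "TITLE": "Mech. Eng."},
    {"ENO": "E4", "ENAME": "Dev", "TITLE": "Programmer"},
    {"ENO": "E5", "ENAME": "Esha", "TITLE": "Syst. Anal."},
]

# Primary horizontal fragmentation: conditions on PAY's own attribute.
PAY1 = [row for row in PAY if row["SAL"] <= 30000]
PAY2 = [row for row in PAY if row["SAL"] > 30000]


def semijoin(member, owner_fragment, key):
    """Keep the member rows whose foreign key appears in the owner fragment."""
    owner_keys = {row[key] for row in owner_fragment}
    return [row for row in member if row[key] in owner_keys]


# Derived horizontal fragmentation: EMP follows the fragments of PAY.
EMP1 = semijoin(EMP, PAY1, "TITLE")
EMP2 = semijoin(EMP, PAY2, "TITLE")
```

Placing EMP1 at the same site as PAY1 (and EMP2 with PAY2) means a join between an employee and their pay record can be answered locally.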

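Vertical Fragmentation: Vertical fragmentation partitions the attributes (columns) of the relation R. It splits R into R1, R2,... Rn, where each fragment holds a subset of the attributes, except for the key column, which is repeated in every fragment so that the original rows can be rebuilt with a join. It is performed by applying a projection operation on R, and the attribute groups are chosen based on how frequently queries access them and on attribute affinity (how often attributes are used together).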
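A small Python sketch of projection and key-based reconstruction is shown below. The PROJ columns PNAME and BUDGET, the rows, and the way the attributes are grouped are assumptions for illustration; a real design would derive the grouping from query access frequencies and attribute affinity.

```python
# Hypothetical PROJ rows; PNO is the key, PNAME and BUDGET are assumed columns.
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000},
    {"PNO": "P2", "PNAME": "Database Development", "BUDGET": 135000},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000},
]


def project(relation, attributes):
    """Projection: keep only the listed attributes of every row."""
    return [{name: row[name] for name in attributes} for row in relation]


# Each vertical fragment repeats the key column PNO.
PROJ1 = project(PROJ, ["PNO", "BUDGET"])
PROJ2 = project(PROJ, ["PNO", "PNAME"])


def join_on_key(left, right, key):
    """Rebuild full rows by matching fragments on the shared key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


rebuilt = join_on_key(PROJ1, PROJ2, "PNO")
```

Without PNO in both fragments there would be no reliable way to tell which budget belongs to which project name.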

Hybrid Fragmentation: Hybrid fragmentation combines horizontal and vertical fragmentation, for example by splitting a table by rows and then splitting each row fragment by columns. It is the most versatile technique, because each fragment can hold only the rows and columns a site actually needs. However, reconstructing the original table is often costly, since it takes both joins and unions.
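To make the reconstruction cost concrete, here is a minimal Python sketch that splits a table by rows first and then by columns. The PROJ columns, rows, and budget threshold are assumptions for illustration only.

```python
# Hypothetical PROJ rows; PNO is the key, PNAME and BUDGET are assumed columns.
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000},
    {"PNO": "P2", "PNAME": "Database Development", "BUDGET": 135000},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000},
    {"PNO": "P4", "PNAME": "Maintenance", "BUDGET": 310000},
]


def project(relation, attributes):
    """Projection: keep only the listed attributes of every row."""
    return [{name: row[name] for name in attributes} for row in relation]


def join_on_key(left, right, key):
    """Match two vertical fragments on the shared key column."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Step 1: horizontal split on an assumed budget threshold.
low = [row for row in PROJ if row["BUDGET"] <= 200000]
high = [row for row in PROJ if row["BUDGET"] > 200000]

# Step 2: vertical split of each horizontal fragment, keeping the key PNO.
fragments = {
    "low_names": project(low, ["PNO", "PNAME"]),
    "low_budgets": project(low, ["PNO", "BUDGET"]),
    "high_names": project(high, ["PNO", "PNAME"]),
    "high_budgets": project(high, ["PNO", "BUDGET"]),
}

# Reconstruction: one join per horizontal fragment, then a union of the results.
rebuilt = (
    join_on_key(fragments["low_names"], fragments["low_budgets"], "PNO")
    + join_on_key(fragments["high_names"], fragments["high_budgets"], "PNO")
)
```

Even for this small table, rebuilding PROJ touches four fragments and needs two joins plus a union, which is where the extra cost of hybrid fragmentation comes from.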

Sample example using PostgreSQL: https://github.com/Kanakalakshmi-M/Fragmentation
