ACID Tables in Hive:

 

ACID Tables in Hive:

In Hive, starting from version 0.14, it supports Online Transactional Processing (OLTP). Before that, Hive only supported Online Analytical Processing (OLAP

What is OLTP?

OLTP systems are designed to handle transaction-oriented workloads, where multiple users are performing concurrent transactions.

In OLTP, we can perform operations like Insert, Update, and Delete. In OLAP, tables cannot be modified.

Now what is ACID:

· A — Atomicity — they are either executed completely or not at all. There are no partial transactions.

· C — Consistency — The database remains in a consistent state even in the presence of failures.

· I — Isolation — the intermediate states of one transaction are not visible to other transactions until the transaction is completed.

· D — Duration — Once a transaction is committed, its changes are permanent and survive system failures.

These are tables that allow insert/update or delete operations. 

Here are certain rules to be followed for ACID Tables:

Internal Table: Tables should be internal tables.

Bucketed: For a table to be ACID, it should be bucketed.

File Format: The table should be stored in ORC Format.

Transactional Properties: In Table properties, Transactional = ‘true’ should be enabled.


Example code to make a table ACID:

Set hive.support.concurrency = true;
Set hive.enforce.bucketing = true;
Set hive.exec.dynamic.partition.mode = nonstrict;
Set hive.txn.manager = org.apache.jadoop.hive.ql.lockmgr.DbTxnManager;
Set hive.compactor.initiator.on = true;
Set hive.compactor.worker.threads = 1;

CREATE TABLE acid_table ( sno int, city string)
    clustered by (city) into 3 buckets row format delimited fields     terminated by ‘,’ lines terminated by ‘\n’ stored as ORC     TBLPROPERTIES( ‘transactional’ = ‘true’);

Below operations can be performed only in ACID Tables:

INSERT INTO acid_table(sno,city) VALUES(1,’Chennai’)
UPDATE acid_table SET city = ‘Coimbatore’ WHERE sno = 1;
DELETE FROM acid_table WHERE sno = 1;

CONCLUSION:


In conclusion, understanding the relationship between OLTP and ACID tables is essential for designing and managing databases effectively. OLTP systems are optimized for transactional workloads, with a focus on high throughput and low response times. ACID tables, on the other hand, adhere to a set of properties (Atomicity, Consistency, Isolation, Durability) that ensure reliable and consistent transaction processing, crucial for maintaining data integrity.

Comments

Popular posts from this blog

Departmental Salary Insights: with SQL and PySpark

Handling a PySpark TypeError: ‘unsupported operand type(s) for +: ‘int’ and ‘str’

Troubleshooting PySparkTypeError: [NOT_ITERABLE]