SAP HANA's in-memory computing is the key to the success of SAP's high-performance analytics. Here are some notes about its architecture.
Data loading:
- Data can be loaded directly from other sources such as ASE. ERP data is loaded through a replication server.
Query processing:
- standard SQL goes through the normal execution path:
parser -> db optimizer -> (output physical plan) -> executor
- other scripts/languages are processed in the Calc Engine:
Model optimizer/executor -> (logical plan) -> db optimizer -> (output physical plan) -> executor
HANA in-memory computing engine:
HANA supports both a column store and a row store. The persistence layer maintains data consistency. The in-memory computing studio provides a powerful development platform. Some technical details of each module:
Row Store:
- Writes go into temporary transactional version memory.
- Version memory consolidation moves visible versions into the persisted segment and clears outdated data from the transactional version memory.
- Tables are linked lists of 16 KB memory pages, grouped into segments.
- Each table has a primary index that contains the row ID to page mapping. The index exists only in memory and is filled on the fly when the table is loaded.
- The persistence layer is invoked on log writes and when a savepoint (checkpoint) is written.
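The version memory consolidation described above can be illustrated with a toy model. This is only a sketch under my own assumptions, not HANA's actual implementation: the class `RowStoreTable`, its method names, and the integer commit timestamps are all made up for illustration.

```python
class RowStoreTable:
    """Toy row store: a persisted segment plus transactional version memory.

    All names here are hypothetical; this is not HANA's real data layout.
    """

    def __init__(self):
        self.persisted = {}       # row_id -> consolidated value
        self.version_memory = {}  # row_id -> list of (commit_ts, value)

    def write(self, row_id, value, commit_ts):
        # New versions land in version memory, not in the persisted segment.
        self.version_memory.setdefault(row_id, []).append((commit_ts, value))

    def read(self, row_id, snapshot_ts):
        # The newest version visible to the snapshot wins;
        # otherwise fall back to the persisted segment.
        versions = self.version_memory.get(row_id, [])
        visible = [(ts, v) for ts, v in versions if ts <= snapshot_ts]
        if visible:
            return max(visible, key=lambda pair: pair[0])[1]
        return self.persisted.get(row_id)

    def consolidate(self, oldest_active_ts):
        # Versions at or below the oldest active snapshot are visible to every
        # reader: move the newest of them to the persisted segment and drop
        # them from version memory.
        for row_id in list(self.version_memory):
            versions = self.version_memory[row_id]
            settled = [p for p in versions if p[0] <= oldest_active_ts]
            if settled:
                self.persisted[row_id] = max(settled, key=lambda p: p[0])[1]
            remaining = [p for p in versions if p[0] > oldest_active_ts]
            if remaining:
                self.version_memory[row_id] = remaining
            else:
                del self.version_memory[row_id]


table = RowStoreTable()
table.write(1, "a", commit_ts=10)
table.write(1, "b", commit_ts=20)
table.consolidate(oldest_active_ts=15)
```

After the consolidation, version "a" lives in the persisted segment while version "b" stays in version memory, because a reader with a snapshot older than 20 may still need to see "a".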
Column Store:
- Read-optimized, with efficient compression.
- Delta storage for fast (asynchronous) writes.
- Reads always go to both main and delta storage, and the results are merged.
- Compression is applied in main storage, during the delta merge.
- A second delta storage is needed during the delta merge so that read-write operations can continue.
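To make the main/delta split concrete, here is a minimal sketch of a single column. It assumes dictionary encoding as the compression scheme and uses hypothetical names (`Column`, `begin_merge`, `finish_merge`); the real engine is far more elaborate.

```python
class Column:
    """Toy column: compressed main storage plus write-optimized delta storage.

    Dictionary encoding is an illustrative assumption for the compression.
    """

    def __init__(self):
        self.dictionary = []  # sorted distinct values of main storage
        self.main = []        # value IDs pointing into the dictionary
        self.delta = []       # uncompressed, append-only writes
        self.delta2 = None    # second delta, exists only while a merge runs

    def insert(self, value):
        # During a merge, new writes go to the second delta storage.
        target = self.delta2 if self.delta2 is not None else self.delta
        target.append(value)

    def scan(self):
        # Reads always combine main storage with every delta storage.
        rows = [self.dictionary[i] for i in self.main]
        rows += self.delta
        if self.delta2 is not None:
            rows += self.delta2
        return rows

    def begin_merge(self):
        self.delta2 = []

    def finish_merge(self):
        # Rebuild the dictionary and re-encode main plus the old delta.
        values = [self.dictionary[i] for i in self.main] + self.delta
        new_dictionary = sorted(set(values))
        ids = {value: i for i, value in enumerate(new_dictionary)}
        self.dictionary = new_dictionary
        self.main = [ids[value] for value in values]
        # The second delta becomes the regular delta.
        self.delta = self.delta2
        self.delta2 = None


col = Column()
for city in ["Rome", "Oslo", "Rome"]:
    col.insert(city)
col.begin_merge()
col.insert("Lima")            # a write arriving while the merge is running
mid_merge_rows = col.scan()
col.finish_merge()
```

The write of "Lima" is never blocked and is never lost: it sits in the second delta during the merge and becomes the new delta afterwards, while the three earlier rows end up dictionary-encoded in main storage.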
Persistence layer:
- Regular savepoints (full image).
- Logs written after the last savepoint, for restore.
- Snapshots for backups.
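The savepoint-plus-log idea can be sketched as follows. This is a simplified illustration with hypothetical names (`Persistence`, `log_write`, `write_savepoint`, `restore`), assuming a key-value state and a redo-only log, which is not necessarily how HANA structures its log.

```python
import copy


class Persistence:
    """Toy persistence layer: a full-image savepoint plus a redo log."""

    def __init__(self):
        self.savepoint = {}  # full image of the state at the last savepoint
        self.log = []        # (key, value) entries written since then

    def log_write(self, key, value):
        self.log.append((key, value))

    def write_savepoint(self, in_memory_state):
        # A savepoint captures the full image, so older log entries
        # are no longer needed for restore.
        self.savepoint = copy.deepcopy(in_memory_state)
        self.log.clear()

    def restore(self):
        # Start from the last savepoint, then replay the log in order.
        state = copy.deepcopy(self.savepoint)
        for key, value in self.log:
            state[key] = value
        return state


store = Persistence()
memory = {}

memory["a"] = 1
store.log_write("a", 1)
store.write_savepoint(memory)

memory["b"] = 2
store.log_write("b", 2)
memory["a"] = 3
store.log_write("a", 3)

recovered = store.restore()
```

Here `recovered` matches the in-memory state at the time of the simulated crash, even though only the first write was covered by the savepoint.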
Upon system restart:
- For the row store: the complete row store is loaded into memory.
- For the column store:
> tables marked for preload are loaded.
> all other tables are loaded on demand: restored on first access.
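The restart behavior for column store tables can be sketched like this. The names `ColumnTable`, `preload`, and `restart` are hypothetical, and the "disk" is just a Python list standing in for the persisted data.

```python
class ColumnTable:
    """Toy column store table that is loaded eagerly or on first access."""

    def __init__(self, name, preload, disk_rows):
        self.name = name
        self.preload = preload       # stands in for the preload marker
        self._disk_rows = disk_rows  # stands in for the persisted data
        self._rows = None            # None means "not in memory yet"

    @property
    def loaded(self):
        return self._rows is not None

    def load(self):
        if self._rows is None:
            self._rows = list(self._disk_rows)

    def rows(self):
        # Loading on demand: the first access triggers the restore.
        self.load()
        return self._rows


def restart(column_tables):
    # Only tables marked for preload are loaded during restart.
    for table in column_tables:
        if table.preload:
            table.load()


hot = ColumnTable("sales", preload=True, disk_rows=[1, 2])
cold = ColumnTable("archive", preload=False, disk_rows=[3])
restart([hot, cold])
loaded_after_restart = (hot.loaded, cold.loaded)
first_access = cold.rows()
```

Right after the restart only the preload-marked table is in memory; the other one is loaded by the first call to `rows()`. The trade-off is a faster restart at the cost of a slower first query on tables that were not preloaded.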
In-memory computing studio (Information Modeler)
The modeling process flow:
--> Metadata(table) creation
--> table data loading
--> View creation
--> column view deployment
--> data consumption
I believe the in-memory computing studio is the biggest selling point of SAP HANA. The Information Modeler provides attribute, analytic, and calculation views of the data and integrates a complete development studio on top of HANA, which greatly simplifies the work of app developers.