New H3C Technology Co., Ltd

  • E-mail

    service@h3c.com

  • Address

    H3C Company, 466 Changhe Road, Binjiang District, Hangzhou City

H3C DataEngine MPP Distributed Database

Negotiable. Updated on 12/28.

Overview
With the development of big data technology, every industry generates massive amounts of data in diverse formats each day. Organizations therefore need high-performance, efficient real-time analysis platforms that can query, update, retrieve, analyze, and mine hundreds of millions of records within seconds. Traditional database technology can no longer meet these requirements for real-time storage, computation, and analysis of massive data.

In this context, the H3C DataEngine MPP database continues to iterate and improve, integrating Hadoop big data and cloud technology into a complete, specialized big data solution. H3C adopts a computing framework that combines Hadoop with an MPP distributed database to give users a complete big data platform, covering data collection and conversion, storage and calculation, analysis and mining, and operation and maintenance management.

H3C DataEngine MPP adopts high-performance columnar storage and computing technology, supports active data compression and advanced analysis, and offers advanced features such as elastic expansion and custom external extensions. It provides high performance, mixed-workload support, and real-time analysis for big data loading, processing, export, and analytical tasks, making it an ideal platform for real-time big data analysis. It helps users build massive data processing systems, discover the intrinsic value of their data, and seize new market opportunities.
Product Details
Product Features

Advanced columnar storage and computing

In H3C DataEngine MPP, each column of data is stored independently in contiguous disk blocks. This is fundamentally different from traditional row-oriented databases, which store data contiguously in row order.
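
The difference between the two layouts can be illustrated with a minimal Python sketch. It is not H3C DataEngine MPP code; the table, the column names, and the helper functions are hypothetical and only show why an aggregate over one column touches less data in a column layout.

```python
# Illustrative only: the same small table in a row layout and a column layout.
rows = [
    (1, "alice", 30.0),
    (2, "bob", 12.5),
    (3, "carol", 7.25),
]

# Row layout: all values of one record are adjacent.
row_layout = [value for record in rows for value in record]

# Column layout: all values of one column are adjacent.
column_layout = {
    "id": [r[0] for r in rows],
    "name": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def sum_amount_columnar(columns):
    # Only the "amount" column is read.
    return sum(columns["amount"])


def sum_amount_rowwise(flat, width=3, amount_index=2):
    # Every record has to be stepped over to reach its amount field.
    return sum(flat[i + amount_index] for i in range(0, len(flat), width))
```

Both functions return the same total, but the columnar version never looks at the `id` or `name` values.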

H3C DataEngine MPP also supports late decompression. Its optimizer and execution engine make full use of column-based computing and can evaluate query predicates and joins directly on encoded data values, without decoding them first. This greatly reduces CPU overhead during queries and improves overall query performance.
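
As a rough illustration of evaluating a predicate on encoded values, the sketch below dictionary-encodes a column and counts matches by comparing integer codes. The function names are hypothetical, and the product's actual encodings and execution engine are not described in this document.

```python
def dictionary_encode(values):
    """Map each distinct value to a small integer code (hypothetical helper)."""
    dictionary = {}
    codes = []
    for value in values:
        code = dictionary.setdefault(value, len(dictionary))
        codes.append(code)
    return dictionary, codes


def count_equal(dictionary, codes, target):
    """Count rows equal to target without decoding the column."""
    code = dictionary.get(target)
    if code is None:
        return 0  # the value never occurs, so no row can match
    # The comparison runs on integer codes, not on the original strings.
    return sum(1 for c in codes if c == code)
```

The target value is translated to its code once, and every row comparison after that is an integer comparison.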

Efficient active compression

H3C DataEngine MPP supports more than 12 compression algorithms, including run-length encoding, delta encoding, integer compression for integer data, block dictionary encoding for character data, and Lempel-Ziv encoding for other data types.
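
Two of the named techniques, run-length encoding and delta encoding, can be sketched in a few lines of Python. These are textbook versions written for illustration, not the product's on-disk formats.

```python
def rle_encode(values):
    """Run-length encoding: collapse repeated neighbours into (value, count)."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, count) for value, count in runs]


def rle_decode(runs):
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


def delta_encode(values):
    """Delta encoding: keep the first value, then differences between neighbours."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    out = []
    for i, delta in enumerate(deltas):
        out.append(delta if i == 0 else out[-1] + delta)
    return out
```

Run-length encoding pays off on sorted or low-cardinality columns, while delta encoding turns slowly growing integers (such as timestamps) into small numbers that compress well.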

The compression ratio of H3C DataEngine MPP can usually reach 10:1. For analysis systems where I/O is the main bottleneck, the resulting reduction in I/O can bring an order-of-magnitude performance improvement over traditional row-oriented databases. However, the characteristics of business data differ significantly across industries, so the compression ratio varies with the type of data.
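
The I/O argument can be made concrete with simple arithmetic. The sketch below assumes a purely I/O-bound scan and ignores decompression CPU cost; the data size and disk throughput are made-up illustrative figures.

```python
def scan_seconds(raw_bytes, compression_ratio, disk_bytes_per_second):
    """Time to read a data set from disk, assuming the scan is purely I/O bound."""
    return raw_bytes / compression_ratio / disk_bytes_per_second


# Assumed figures: 1 TB of raw data, 1 GB/s of disk throughput.
uncompressed = scan_seconds(10**12, 1, 10**9)   # 1000.0 seconds
compressed = scan_seconds(10**12, 10, 10**9)    # 100.0 seconds at 10:1
```

Under these assumptions a 10:1 ratio cuts the scan time by a factor of ten; a lower ratio gives a proportionally smaller gain.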

Shared-nothing massively parallel computing with convenient scalability

H3C DataEngine MPP supports one-click online cluster expansion: nodes can be added or removed with one click and data is redistributed automatically, without interrupting running business. All nodes in the cluster are peers, and no master node is required. Data loading, data export, and queries can run in parallel on all nodes at the same time. Because nodes share no resources, adding nodes expands data capacity and computing power linearly, so a cluster can scale from a few nodes to thousands of nodes, or from a few TB to tens of PB, as the business grows.
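
A minimal sketch of how rows might be spread over peer nodes is shown below. It uses hash-modulo placement, which is an assumption made for illustration; this document does not describe the product's actual distribution or rebalancing scheme, and the function names are hypothetical.

```python
import hashlib


def node_for(key, node_count):
    """Pick a node for a row key with a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def distribute(keys, node_count):
    """Assign every key to exactly one node; no node is special."""
    placement = {node: [] for node in range(node_count)}
    for key in keys:
        placement[node_for(key, node_count)].append(key)
    return placement
```

Each node owns a disjoint slice of the data and can load, export, and scan that slice independently. Changing `node_count` changes the placement of many keys, which is why expansion involves a redistribution step.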

Business-based hierarchical storage

In the era of big data, data is generated faster and faster, and compliance and deep mining require more of it to be retained, so the volume of data stored in databases keeps growing. H3C DataEngine MPP can assign different storage policies and storage locations (on media with different performance, cost, and capacity) to different schemas, tables, and table partitions in order to optimize storage cost. It also supports backing up selected tables and table partitions to offline storage and restoring them from offline storage, which greatly speeds up data exchange between online and offline storage and improves the utilization and business value of online devices.
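
One way to picture per-schema, per-table, and per-partition storage policies is a lookup in which the most specific policy wins. That precedence rule, the policy keys, and the location names below are assumptions for illustration and are not taken from the product documentation.

```python
def resolve_location(policies, schema, table, partition=None, default="standard"):
    """Return the storage location for an object, most specific policy first."""
    candidates = (
        (schema, table, partition),  # partition-level policy
        (schema, table, None),       # table-level policy
        (schema, None, None),        # schema-level policy
    )
    for key in candidates:
        if key in policies:
            return policies[key]
    return default


# Hypothetical policies: recent orders on fast media, an old partition archived.
policies = {
    ("sales", None, None): "hdd",
    ("sales", "orders", None): "ssd",
    ("sales", "orders", "2015"): "archive",
}
```

With these policies, `resolve_location(policies, "sales", "orders", "2015")` returns `"archive"`, while other partitions of the same table stay on `"ssd"`.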

Built-in automatic optimization design

The database designer of H3C DataEngine MPP provides overall optimization design for a database or schema and also supports incremental optimization for specific queries. In addition, H3C DataEngine MPP provides a workload analyzer that collects data on the running database workload and offers automated recommendations at any time. This greatly reduces the cost of DBA management.

Stability guarantee system

H3C DataEngine MPP achieves high availability by maintaining multiple redundant copies of data. Redundant copies are hash-distributed to different nodes, which gives the database a built-in intelligent data mirroring function. H3C DataEngine MPP calls this K-safety, where K is the number of node failures the cluster can tolerate.

K-safety ensures that when any K nodes fail, the cluster still holds at least one complete copy of the data and can continue to serve data processing and query requests. Node failures are transparent to applications and do not interrupt ongoing business operations. By adjusting the number of redundant copies, H3C DataEngine MPP can provide the level of availability that each application design requires.
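
The K-safety property can be checked with a small simulation. The sketch assumes that each data segment is stored on K+1 distinct nodes arranged in a ring; that placement rule and the function names are illustrative assumptions, not the product's algorithm.

```python
from itertools import combinations


def place_replicas(node_count, k):
    """Store segment i on nodes i, i+1, ..., i+k (mod node_count)."""
    if node_count < k + 1:
        raise ValueError("need at least k + 1 nodes")
    return {
        segment: {(segment + offset) % node_count for offset in range(k + 1)}
        for segment in range(node_count)
    }


def survives(placement, failed_nodes):
    """True if every segment still has at least one live copy."""
    return all(holders - failed_nodes for holders in placement.values())


def tolerates_any_k_failures(node_count, k):
    """Exhaustively test every combination of k failed nodes."""
    placement = place_replicas(node_count, k)
    return all(
        survives(placement, set(failed))
        for failed in combinations(range(node_count), k)
    )
```

Because every segment lives on K+1 different nodes, no set of K failures can remove all of its copies. Losing K+1 adjacent nodes, however, can make a segment unavailable.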

H3C DataEngine MPP natively supports fault-tolerant groups and rack awareness. Data redundancy is organized by fault-tolerant group, which is usually a rack or an area made up of several racks. Even if all nodes in one fault-tolerant group fail, the data in the database remains complete and usable, which protects cluster availability against large-scale hardware failures such as a rack power outage.
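
Rack-aware redundancy can be sketched by always placing the second copy of a node's data in a different group. The group names, node names, and the "next group" rule below are hypothetical and serve only to illustrate the idea.

```python
def buddy_placement(groups):
    """Give each node's data a second copy on a node in the next group."""
    names = sorted(groups)
    if len(names) < 2:
        raise ValueError("need at least two fault-tolerant groups")
    placement = {}
    for index, name in enumerate(names):
        other = groups[names[(index + 1) % len(names)]]
        for position, node in enumerate(groups[name]):
            # The primary copy stays on the node; the buddy copy leaves the group.
            placement[node] = {node, other[position % len(other)]}
    return placement


def survives_group_loss(groups, placement, lost_group):
    """True if every node's data still has a live copy after a whole group fails."""
    failed = set(groups[lost_group])
    return all(holders - failed for holders in placement.values())
```

Since the two copies of any data never share a group, a failure that takes out one entire group leaves at least one copy of everything.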

In-database machine learning

Traditional data analysis systems follow the process "analysis/mining request, data retrieval, analysis/mining, result display". With big data, the network becomes the bottleneck in this process, and the high latency of analysis and mining tasks becomes unacceptable.

H3C DataEngine MPP integrates analysis and mining algorithms into the database, so that algorithms run close to the data and make full use of the real-time computing power of its columnar and MPP architecture.

In addition to common statistical analysis and linear regression, H3C DataEngine MPP extends SQL-99 with advanced analysis functions such as time series analysis, pattern matching, and geographic analysis. These features greatly reduce the need for users to develop complex analysis code themselves and allow them to complete analysis tasks effectively on large data sets. H3C DataEngine MPP also supports combined analysis of structured and unstructured data: pairing its strength in structured analysis with Hadoop's strength in unstructured analysis yields efficient fusion analysis solutions.
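
To illustrate why running an algorithm next to the data avoids moving the data, the sketch below fits a least-squares line from small per-node summaries instead of from the raw rows. It is a generic technique shown with hypothetical function names, not the product's built-in regression function.

```python
def partial_sums(points):
    """Summary computed on one node: n, sum x, sum y, sum x*x, sum x*y."""
    return (
        len(points),
        sum(x for x, _ in points),
        sum(y for _, y in points),
        sum(x * x for x, _ in points),
        sum(x * y for x, y in points),
    )


def combine(partials):
    """Add the per-node summaries component by component."""
    return tuple(sum(component) for component in zip(*partials))


def fit_line(stats):
    """Ordinary least squares for y = slope * x + intercept."""
    n, sx, sy, sxx, sxy = stats
    denominator = n * sxx - sx * sx
    if denominator == 0:
        raise ValueError("x values are all identical")
    slope = (n * sxy - sx * sy) / denominator
    intercept = (sy - slope * sx) / n
    return slope, intercept
```

Each node sends five numbers over the network regardless of how many rows it holds, and the combined result is the same as fitting the line on all rows in one place.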

Development and expansion capabilities

H3C DataEngine MPP is highly compatible. It supports the SQL-92, SQL-99, and SQL-2003 standards and provides ODBC, JDBC, and ADO.NET drivers, so it integrates easily with existing ETL tools (such as Informatica and DataStage), reporting tools (such as BO and Cognos), and analysis tools (such as SAS and SPSS), protecting existing investments.

The H3C DataEngine MPP server runs on x86 Linux servers and supports mainstream x86 hardware and Linux distributions. H3C DataEngine MPP also provides an SDK for developing custom functions. Users can treat H3C DataEngine MPP as a big data computing platform and add their own data analysis and processing algorithms as extensions written in C++, R, or Java. These extensions benefit from the MPP and columnar computing technology of H3C DataEngine MPP, giving users efficient big data analysis and processing capabilities right away.

Database security

H3C DataEngine MPP provides strong security. It supports multiple authentication methods, and only authenticated users are allowed to access the database. It also provides comprehensive role and authorization mechanisms that give flexible control over users' data access permissions. H3C DataEngine MPP supports multi-tenancy and sandboxing, so the database can be managed in a multi-tenant mode. Because different tenants and applications reach their resource consumption peaks and valleys at different times, the multi-tenant model can exploit this offset to share resources.
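
The benefit of offset peaks can be shown with simple arithmetic: capacity sized for each tenant's own peak versus capacity sized for the peak of the combined load. The tenant names and hourly load figures below are made up for illustration.

```python
def dedicated_capacity(loads):
    """Capacity needed if every tenant gets hardware sized for its own peak."""
    return sum(max(series) for series in loads.values())


def shared_capacity(loads):
    """Capacity needed if tenants share hardware sized for the combined peak."""
    return max(sum(hour) for hour in zip(*loads.values()))


# Hypothetical load per time slot: reporting peaks early, ETL peaks later.
loads = {
    "reporting": [8, 2, 1, 1],
    "etl": [1, 1, 8, 2],
}
```

With these figures, dedicated sizing needs 16 units while shared sizing needs 9, because the two tenants never peak in the same slot.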

In addition, H3C DataEngine MPP also supports backup and recovery features for some database objects (such as schemas and tables), and supports separate management of backup policies by tenant, allowing each tenant to have complete lifecycle management of their data objects. The database object backup and recovery function of H3C DataEngine MPP provides support for multi-level hierarchical management of big data systems according to applications and users.

Features and Benefits

Optimized column storage and computing

In H3C DataEngine MPP, each column of data is stored independently in contiguous disk blocks. This is fundamentally different from traditional row-oriented databases, which store data contiguously in row order. H3C DataEngine MPP supports late materialization. Most analytical queries need only a subset of the columns, and the column optimizer and execution engine can skip the irrelevant columns in column storage, saving a significant amount of I/O.
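
Late materialization can be sketched as a two-step query: first find the matching row positions using only the filter column, then fetch the requested columns at those positions. The function and the sample columns are hypothetical and are not the product's execution engine.

```python
def select_where(columns, filter_column, predicate, output_columns):
    """Filter on one column, then build result rows only for matching positions."""
    # Step 1: scan only the filter column and remember matching positions.
    positions = [
        index
        for index, value in enumerate(columns[filter_column])
        if predicate(value)
    ]
    # Step 2: materialize rows from the requested columns only.
    return [
        tuple(columns[name][index] for name in output_columns)
        for index in positions
    ]


table = {
    "id": [1, 2, 3],
    "region": ["east", "west", "east"],
    "amount": [30.0, 7.5, 12.0],
    "note": ["n/a", "n/a", "n/a"],
}
```

A call such as `select_where(table, "amount", lambda v: v > 10, ["id"])` reads the `amount` and `id` columns and never touches `region` or `note`.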

Active compression

Unlike other databases that claim to support column storage, H3C DataEngine MPP automatically selects the appropriate sorting method and compression algorithm based on the data type, cardinality, and query characteristics of each column, in order to minimize the storage space occupied by data, reduce query I/O consumption, and improve query performance. H3C DataEngine MPP supports over 12 compression algorithms.
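
How an encoding might be chosen from a column's characteristics can be sketched with a simple heuristic. The thresholds and the decision order below are invented for illustration; the product's actual selection logic is not described in this document.

```python
def choose_encoding(values):
    """Pick an encoding name from simple column statistics (made-up heuristic)."""
    if not values:
        return "none"
    pairs = list(zip(values, values[1:]))
    runs = 1 + sum(1 for a, b in pairs if a != b)
    if runs <= len(values) // 4:
        return "rle"  # long runs of repeated values
    is_int = all(isinstance(v, int) and not isinstance(v, bool) for v in values)
    if is_int and all(a <= b for a, b in pairs):
        return "delta"  # sorted integers have small differences
    if len(set(values)) <= len(values) // 2:
        return "dictionary"  # few distinct values
    return "lz"  # general-purpose fallback
```

A sorted, low-cardinality column would get run-length encoding here, which is one reason the sort order and the compression algorithm are chosen together.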

Shared-nothing massively parallel computing and scalability

All nodes in the H3C DataEngine MPP cluster are peers, and no master node is required. Data loading, data export, and queries can run in parallel on all nodes at the same time. Because nodes share no resources, adding nodes expands data capacity and computing power linearly, so a cluster can scale from a few nodes to thousands of nodes, or from a few TB to tens of PB, as the business grows.

Hierarchical storage

In the era of big data, data is generated faster and faster, and compliance and deep mining require more of it to be retained, so the volume of data stored in databases keeps growing. Analysis performance, the high cost of fast disks, and the need for large capacity often conflict with one another. The hierarchical storage feature of H3C DataEngine MPP resolves this conflict: different storage policies and storage locations (on media with different performance, cost, and capacity) can be assigned to different schemas, tables, and table partitions to optimize storage cost.

Automatic optimization design

H3C DataEngine MPP has a built-in database optimization designer that includes expert knowledge. Users only need to specify the logical schema, load sample data, and provide typical query SQL statements. H3C DataEngine MPP's database optimization designer will automatically design the horizontal distribution of data, the sorting method of each column, and the compression algorithm based on expert knowledge, balancing query performance and storage space requirements, and achieving overall optimization of the database.

Cloud-native analysis platform

H3C DataEngine MPP supports the separation of computing and storage resources in both public and private cloud scenarios, combining the high performance of shared-nothing MPP with the fast scalability of shared storage. The master copy of the data resides in shared storage, which makes rapid scaling easy, while hot data is kept locally on the computing resources for high-speed access. In addition, "subclusters" avoid the resource waste caused by too many cluster nodes under highly concurrent workloads, allowing throughput to be scaled elastically.
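
The split between a master copy in shared storage and hot data on the compute node can be sketched as a small local cache in front of a shared store. The least-recently-used eviction policy, the class name, and the block identifiers are assumptions for illustration.

```python
from collections import OrderedDict


class ComputeNodeCache:
    """Local cache of hot blocks in front of shared storage (illustrative)."""

    def __init__(self, shared_storage, capacity):
        self.shared = shared_storage  # master copy, for example an object store
        self.capacity = capacity      # number of blocks kept locally
        self.local = OrderedDict()    # hot blocks, least recently used first
        self.shared_reads = 0

    def read(self, block_id):
        if block_id in self.local:
            self.local.move_to_end(block_id)  # mark as recently used
            return self.local[block_id]
        data = self.shared[block_id]          # slow path: fetch the master copy
        self.shared_reads += 1
        self.local[block_id] = data
        if len(self.local) > self.capacity:
            self.local.popitem(last=False)    # evict the coldest block
        return data
```

Because the local copy is only a cache, a compute node can be added or removed without moving the master data, which is what makes scaling fast in this model.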

System architecture

Shared-nothing MPP architecture: H3C DataEngine MPP uses hundreds or thousands of nodes for parallel computing, which is the key architecture for supporting big data analysis with sufficient computing and storage resources. The shared-nothing MPP architecture needs neither shared storage nor a dedicated master node to scale to hundreds or thousands of nodes.

Native support for high availability: H3C DataEngine MPP has built-in high availability. If a node failure occurs within the designed availability range, it can be transparent to the application and will not interrupt ongoing business operations. The incremental backup and recovery, database partial object backup and recovery functions of H3C DataEngine MPP can ensure that applications can be quickly restored and available in the event of a disaster.

Optimized column database: this is the key to eliminating the I/O cost bottleneck in big data analysis performance. H3C DataEngine MPP combines columnar storage with columnar computation (a columnar engine) and provides complete functionality, including insert, delete, and update operations, which is essential for high-performance analytical and ad hoc queries on big data.

Ideal big data real-time analysis platform: H3C DataEngine MPP adopts high-performance column storage and computing technology, supports active data compression, advanced analysis, and has advanced features such as elastic expansion and custom external expansion, making it an ideal platform for supporting real-time big data analysis.
