0


Data Warehousing数据仓库

What Is Data Warehousing? • Enterprise level data management — used in business intelligence (BI) • Includes copied transactional data • Used to monitor and make strategic decisions • Aggregated or summarised data • Documentation about metadata 企业级数据管理 用于汇总数据做出决策

How Data Warehousing Has Changed? • Since 2001 the amount of data held in data warehouses has tended to triple every two years • Terabytes-sized databases now commonplace • Size, prevalence, scope and complex of such systems expanding • Available to internal and external users可供内外部用户使用

What Are The Benefits? • Potential high returns on investment • Competitive advantage • Increased productivity of corporate decision makers高回报 竞争优势 提高决策者效率

What Are The Differences Between OLTP Systems And Data Warehousing?OLTP系统和数据仓库的区别

What Are Data Marts? “A database that contains a subset of corporate data to support the analytical requirements of a particular business unit (such as the Sales department), or to support users who share the same requirements to analyse a particular business process (such as property sales).”数据集市 包含企业数据子集的数据库 支持特定部门或用户分析特定业务流程

How Can Data Be Visualised Conceptually? • Multidimensional data model • Data can then be: — Used with standard reporting tools ◦ Who... ? ◦ What... ? — Queried using OLAP (online analytical processing) techniques: slice, dice, roll-up and drill-down ◦ What if... ? ◦ Why... ? — Subject to data mining ◦ Unknown patterns概念上数据可视化 多维数据模型 使用OLAP

How To Model Schema In A Data Warehouse: Dimensional Model (DM) • Typical relational schema (also known as star schema)在数据仓库中建立唯独模型DM典型的关系模式(也称为星形模式)

• Snowflake schema雪花模式

• Galaxy (starflake) schema 银河模式What Are The Main Differences Between DMs And ERMs? ERM • Designed for OLTP • Identifying entities and their relationship • Predictive data retrieval • Transactions made very simple and deterministic DM • Designed for data warehousing • Identifying facts • Intuitive, high-performance retrieval • Supports ad hoc end-user queries ERM 为OLTP设计 识别实体及关系 预测性数据检索 简单准确 DM 为数据仓库设计 识别事实 直观高效率检索 支持临时用户查询

How To Create A Dimensional Model? • Step 1: Select business process • Step 2: Declare grain • Step 3: Choose dimensions • Step 4: Identify facts • Step 5: Identify all dimensional attributes for the dimensional model 创建维度模型DM 1选择业务流程 - 2 确定粒度 - 3 选择维度 - 4 确定事实 - 5确定维度模式的所有维度属性

What Are Some Of The Problems With Data Warehousing? • Underestimation of resources for data ETL • Hidden problems with source systems • Required data not captured • Increased end-user demands • Data homogenization • High demand for resources • Data ownership • High maintenance • Log-duration projects • Complexity of integration数据仓库的问题 低估DLT数据占用资源 源系统隐患 所需数据不能获取 终端客户需求增加 数据同质化 资源需求量大 数据所有权 维护量大 日志持续时间长 集成复杂

What Are Some Specific ETL Problems? • Batch-oriented (i.e., not timely) • Scan for changes (i.e., performance hit) • Date columns (i.e., intrusive)ELT问题 批处理导向(不及时) 更改扫描(影响性能) 日期列(侵入性)

Summary • Many databases today support management decision making • Important to ensure OLTP systems effective first • Seek to build data warehousing system on top of OLTP system What impact can big data have on data warehousing? 确保OLTP系统 OLTP系统上建立数据仓库系统

标签: 数据仓库

本文转载自: https://blog.csdn.net/weixin_74400487/article/details/136632983
版权归原作者 .548 所有, 如有侵权,请联系我们删除。

“Data Warehousing数据仓库”的评论:

还没有评论