CS614 Assignment 1 Spring Solution 2021

 

Question 1: Suppose that you are the data analyst on the project team building a data warehouse for an insurance company. List at least three data sources from which you will bring the data into your data warehouse?

Solution:

Operational Database:

          Operational databases track, monitor, and store real-time business data. For example, an operational database may be used to monitor warehouse stock quantities: as consumers order products from an online web store, it records how many goods have been sold and when the business needs to reorder stock.

An operational database stores information about an organization's activities, such as customer relationship management transactions or financial operations.

An operational database is used to manage the company's day-to-day activities and transactions. It may also support analytic processing by delivering real-time dashboards or by integrating analytics into organizational processes.
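The stock-tracking example above can be sketched in a few lines of Python using the standard-library sqlite3 module. This is only an illustration: the `stock` table, its columns, the sample products, and the helper names `record_sale` and `products_to_reorder` are all assumptions made for this sketch, not part of any real system.

```python
import sqlite3

# Hypothetical operational table: one row per product sold in the web store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock ("
    "product TEXT PRIMARY KEY, quantity INTEGER, reorder_level INTEGER)"
)
conn.executemany(
    "INSERT INTO stock VALUES (?, ?, ?)",
    [("helmet", 40, 10), ("gloves", 12, 10), ("jacket", 15, 10)],
)
conn.commit()


def record_sale(conn, product, units):
    """Decrement stock for one sale (a typical day-to-day transaction)."""
    conn.execute(
        "UPDATE stock SET quantity = quantity - ? WHERE product = ?",
        (units, product),
    )
    conn.commit()


def products_to_reorder(conn):
    """Return products whose quantity has reached the reorder level."""
    rows = conn.execute(
        "SELECT product FROM stock "
        "WHERE quantity <= reorder_level ORDER BY product"
    )
    return [row[0] for row in rows]


record_sale(conn, "gloves", 5)   # gloves drop from 12 to 7
reorder = products_to_reorder(conn)
print(reorder)
```

A data warehouse would normally read such a table in periodic extracts rather than query it on every sale, so that reporting does not slow down the operational system.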

 

Archive datastore:

          Almost all archive data stores hold their data in relational format. If the source data is also relational, mapping it to the archive is simple. Some source databases, on the other hand, are not relational and take some work to convert.

The main objective in managing an archive datastore is to ensure its long-term viability.

Benefits of Data Archiving:

- Reduced cost

- Better backup and restore performance

- Prevention of data loss

- Increased security

- Regulatory compliance

Semi-structured data:

          Semi-structured data sits between structured and unstructured data. The best-known example in healthcare is PACS (picture archiving and communication systems), where a database stores information about the stored images (which is structured), but the individual image files are unstructured data. PACS are typically built on top of a SQL or Oracle database, and the structured portion of the system is small in comparison to the size of the unstructured images.

Semi-structured data is a hybrid of structured and unstructured data. It does not fit a rigid relational schema, but it still adheres to a set of rules and uses consistent markers, such as tags or delimiters, to separate and label its fields. Semi-structured formats include CSV, XML, and JSON. NoSQL databases are commonly used to store semi-structured data.
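As a rough illustration of bringing semi-structured data into a warehouse, the sketch below flattens JSON records into rows with a fixed set of columns. The policy records, the field names, and the `flatten` helper are hypothetical and chosen only to fit the insurance scenario in Question 1.

```python
import json

# Hypothetical JSON feed: records share a general shape, but fields may be missing.
raw = """[
  {"policy_id": "P-1", "holder": {"name": "Ali", "city": "Lahore"}, "claims": [1200, 300]},
  {"policy_id": "P-2", "holder": {"name": "Sara"}}
]"""


def flatten(record):
    """Map one nested JSON record onto a fixed set of warehouse columns."""
    holder = record.get("holder", {})
    return {
        "policy_id": record.get("policy_id"),
        "holder_name": holder.get("name"),
        "holder_city": holder.get("city"),            # None when the field is absent
        "claim_count": len(record.get("claims", [])),  # 0 when there are no claims
    }


rows = [flatten(record) for record in json.loads(raw)]
for row in rows:
    print(row)
```

The second record has no city and no claims, which is exactly the kind of irregularity that a relational source would not allow and that the flattening step has to handle.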

 

Question 2: Data warehouse systems often have complex issues due to many business requirements. Technical complexity issues arise from three areas: sourcing issues, transformation issues, and target issues.

Write at least two examples of each (Not more than one line for each).

 

Transformation issues:

Transformation requires many validation tests, and running them on large data sets is time-consuming and error-prone. Lack of experience and carelessness during transformation can also introduce errors.

Data validation has limits: source data is sometimes not authentic, and in certain cases its authenticity cannot be verified at all.

Target issues:

The final stage is Load. Loading data that has not been cleaned into the target system results in errors. Loading irrelevant data causes errors in the target schema.
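One way to reduce load errors is to check rows before they reach the target system. The sketch below is a minimal example under assumed rules: the `policy_id` and `premium` fields, the `is_clean` checks, and the sample rows are all hypothetical.

```python
def is_clean(row):
    """Hypothetical cleanliness rules: non-empty policy id, non-negative numeric premium."""
    policy_id = row.get("policy_id")
    premium = row.get("premium")
    return (
        isinstance(policy_id, str)
        and policy_id.strip() != ""
        and isinstance(premium, (int, float))
        and premium >= 0
    )


def split_for_load(rows):
    """Separate rows that are safe to load from rows that need cleaning first."""
    clean, rejected = [], []
    for row in rows:
        (clean if is_clean(row) else rejected).append(row)
    return clean, rejected


staged = [
    {"policy_id": "P-1", "premium": 250.0},
    {"policy_id": "", "premium": 100.0},     # missing key value
    {"policy_id": "P-3", "premium": -5},     # impossible amount
    {"policy_id": "P-4", "premium": None},   # missing amount
]
clean, rejected = split_for_load(staged)
print(len(clean), "rows ready to load;", len(rejected), "rows rejected")
```

Only the rows in `clean` would be passed to the load step; the rejected rows go back for cleaning instead of causing an error in the target.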

Sourcing issues:

The database path is incorrect

Bottlenecks caused by insufficient CPU or memory resources

Storing data in Urdu or Farsi in the database
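The Urdu and Farsi point is most likely about character encoding, although the answer above does not say so; that reading is an assumption. Under that assumption, the sketch below shows why such text needs a Unicode encoding such as UTF-8 and cannot be stored in a single-byte Western encoding. The sample word and the `fits_encoding` helper are illustrative only.

```python
text = "کراچی"  # "Karachi" written in Urdu script (illustrative sample value)


def fits_encoding(value, encoding):
    """Return True if the text can be stored in the given character encoding."""
    try:
        value.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


utf8_ok = fits_encoding(text, "utf-8")      # UTF-8 covers Arabic-script characters
latin1_ok = fits_encoding(text, "latin-1")  # Latin-1 has no Arabic-script characters

# Encoding and decoding with the same Unicode encoding preserves the text.
round_trip = text.encode("utf-8").decode("utf-8")

print("UTF-8 can store it:", utf8_ok)
print("Latin-1 can store it:", latin1_ok)
print("Round trip preserved:", round_trip == text)
```

If the source system and the warehouse disagree about the encoding, the same bytes are decoded into the wrong characters, which is why this counts as a sourcing issue.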



