DATA ANALYTICS UTU QUESTION PAPER (2021-22)

DATA ANALYTICS


Roll No.

Even Semester Examination, 2021-22

Course Name: B.TECH

Branch: CSE/IT Semester: VI

Subject: Data Analytics

Time: 3 Hours

Max Marks: 100

Number of Printed pages: 1

Note: Attempt all questions. All questions carry equal marks.

Q1. Attempt any four parts of the following : (5 x 4 = 20)

(a) Describe the structure of HDFS in a Hadoop ecosystem using a diagram. 

(b) Describe bootstrapping and its importance.

(c) What are sampling and sampling distributions? Give a detailed analysis.

(d) What is the significance of a scatter plot matrix?

(e) Why do we need a Bloom filter when filtering streams?

Q2. Attempt any four parts of the following: (5 x 4 = 20)

(a) What are the strengths and weaknesses of CLIQUE?

(b) Explain Grouping, Join, Co Group, Cross and Group in data.

(c) List and explain the data types in Hive.

(d) Discuss in brief the Mapper code and Reducer code.

(e) Explain the process of Crowd Sourcing Analytics.

Q3. Attempt any two parts of the following: (10 x 2 = 20)

(a) Explain the Google File System architecture with a neat diagram.

(b) Discuss the Pig Latin application flow.

(c) Describe any two sampling techniques for Big Data with the help of examples.

Q4. Attempt any two parts of the following: (10 x 2 = 20)

(a) What is a cluster? Explain the process of setting up a Hadoop cluster.

(b) Illustrate the YARN-based execution model and its functions with a neat diagram.

(c) Discuss Analysis of Variance (ANOVA) and correlation indicators of a linear relationship.

Q5. Attempt any two parts of the following: (10 x 2 = 20)

(a) Explain in detail the concept of developing a MapReduce application.

(b) What are the various stages in the big data analytics life cycle? Illustrate with a figure, explaining each of them.

(c) With an example, explain the term social media analytics.
