Dr. D. Y. Patil Pratishthan's
Institute for Advanced Computing and Software Development

PG-DBDA


COURSE OUTCOME

After completing this course, students will be trained in statistics and machine learning using Python. They will be able to make data-driven decisions that give them a competitive advantage in the market. Technologies such as Hadoop, Spark, Hive and Machine Learning provide a springboard for AI and make them ready for Industry 4.0. At the end of the course, students will be able to work as Data Analysts or Data Engineers. Studying Big Data will broaden their horizons in a field that is surpassing market forecasts and predictions for Big Data Analytics.

ELIGIBILITY

The Post Graduate Diploma in Big Data Analytics (PG-DBDA) is a full-time post graduate course comprising 9 compulsory modules, aptitude, communication and a project.

Qualification:

The educational criteria for the PG-DBDA course are:
1. Graduate in Engineering (10+2+4 or 10+3+3 years) in IT / Computer Science / Electronics / Telecommunications / Electrical / Instrumentation, OR
2. MSc/MS (10+2+3+2 years) in Computer Science, IT, Electronics, OR
3. Graduate in any discipline of Engineering, OR
4. Post Graduate Degree in Management with corresponding basic degree in Computer Science, IT, Computer Application, OR
5. Post Graduate Degree in Mathematics / Statistics / Physics, OR
6. MCA, MCM

    Note: The candidates must have secured a minimum of 55% marks in their qualifying examination.

Application Form:

C-DAC's application form is common to the Post Graduate Diploma in Big Data Analytics (PG-DBDA). Application forms for all the courses are to be filled online at http://acts.cdac.in (recommended).

C-CAT Application Fee:
Category-wise C-CAT examination fee

Course Category    C-CAT Paper(s)    Examination fee
I                  A                 Rs. 1350/-
II                 A+B               Rs. 1550/-
III                A+B+C             Rs. 1750/-

After filling the online C-CAT application form, the examination fee may be paid online through the ‘Make Payment’ step on the main menu of the online application. No cheque or demand draft (DD) will be accepted towards payment of C-CAT examination fee.

Online: The examination fee can be paid using credit/debit cards and net banking through the payment gateway that will be opened upon clicking the 'Online' option of the 'Make Payment' step. Candidates are advised to follow the instructions/steps given on the payment gateway, and also print/keep the transaction details for their records.

SELECTION PROCESS

Admissions to all PG Diploma courses of C-DAC are done through C-DAC's Computerised Common Admission Test (C-CAT). Candidates have to apply for C-CAT online at www.cdac.in or acts.cdac.in. Every year, C-CAT is usually conducted in July (for Sept admissions) and January (for March admissions).

Candidates will be provided ranks based on their performance in Section A, Sections A+B, and Sections A+B+C of C-CAT. Along with each rank, the number of candidates ranked above him/her in the courses applied for will also be indicated.

If a candidate appears for multiple sections, he/she will be provided multiple ranks depending on his/her choice of courses at the time of filling the application form. For example, if a candidate appears for Sections A and B and had chosen courses under Category I and Category II in the application form, he/she shall be provided two ranks: (i) based on the performance in Section A, and (ii) based on the performance in Sections A+B. However, if a candidate appears for Sections A and B but had chosen only courses under Category II in the application form, he/she will be provided only one rank based on the performance in Sections A+B.

Candidates with the lowest 10% performances in Section A, Section B and Section C will not be considered for ranking in any category. Even after the removal of the lowest 10% performers as stated above, if there exist candidates in any category with zero or less than zero marks, then these candidates are also not considered for ranking. The remaining candidates will be ranked based on their performance in Section A (for candidates who have applied for Category I courses), total performance in Sections A+B (for candidates who have applied for Category II courses), and total performance in Sections A+B+C (for candidates who have applied for Category III courses).

If two or more candidates have acquired the same marks in Section A or Sections A+B or Sections A+B+C, then the candidate having more marks in Section A will be given the higher rank. If these candidates have the same marks in Section A also, then the candidate having higher value in the ratio of 'number of correct answers / number of attempted questions' in the specific section required only for that category of courses will be given the higher rank. Candidates who have the same value of this ratio and having the same total marks as well as marks in Section A will be given the same rank.
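
The zero-marks exclusion and the tie-break order described above can be sketched in Python. This is only an illustration, not C-DAC's actual procedure: the `Candidate` record, its field names, and the choice to skip ranks after a tie (1, 2, 2, 4) are assumptions, and the removal of the lowest 10% performers in each section is left out for brevity.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Candidate:
    name: str
    total: int      # marks in the sections counted for the category
    section_a: int  # marks in Section A
    correct: int    # correct answers in the tie-break section
    attempted: int  # attempted questions in the tie-break section

    def sort_key(self):
        # Compared in order: total marks, Section A marks, accuracy ratio.
        ratio = Fraction(self.correct, self.attempted) if self.attempted else Fraction(0)
        return (self.total, self.section_a, ratio)


def rank_candidates(candidates):
    """Return {name: rank}; candidates tied on all three keys share a rank."""
    # Candidates with zero or negative marks are not ranked.
    eligible = [c for c in candidates if c.total > 0]
    ordered = sorted(eligible, key=Candidate.sort_key, reverse=True)
    ranks = {}
    previous_key, previous_rank = None, 0
    for position, cand in enumerate(ordered, start=1):
        key = cand.sort_key()
        # Assumption: after a tie the next rank skips ahead (1, 2, 2, 4).
        rank = previous_rank if key == previous_key else position
        ranks[cand.name] = rank
        previous_key, previous_rank = key, rank
    return ranks


if __name__ == "__main__":
    sample = [
        Candidate("P", total=180, section_a=90, correct=40, attempted=45),
        Candidate("Q", total=180, section_a=96, correct=38, attempted=44),
        Candidate("R", total=0, section_a=0, correct=5, attempted=20),
    ]
    print(rank_candidates(sample))
```

In the sample, P and Q have the same total, so Q is placed first on Section A marks, and R is left out of the rank-list because of the zero total.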

Admissions to C-DAC's PG Diploma courses at various training centres will be offered in the order of ranks obtained in C-CAT and based on the preferences of courses and centres given by the candidates. Only those candidates who are in the C-CAT rank-list will be considered for admissions to C-DAC's PG Diploma courses.

Rank-lists of Jan 2024 C-CAT are only applicable for admission to the March 2024 intake of C-DAC's PG Diploma courses. Candidates should note that mere appearance in C-CAT or being in any of the rank-lists neither guarantees nor provides any automatic entitlement to admission. Qualified candidates will have to apply for admission as per the prescribed procedure.

Important dates related to admission to C-DAC’s PG Diploma courses of August 2024 batch.

Sr No.  Event                                                           Dates
1       Beginning of Online Registration and Application for C-CAT      28 May 2024
2       Closing of Online Registration & Application, and Payment of    26 June 2024
        Application Fee
3       Downloading of C-CAT Admit Cards                                2 - 6 July, 2024
4       C-DAC's Common Admission Test (C-CAT)                           06 July 2024 and 07 July 2024
5       Announcement of C-CAT Ranks                                     19 July 2024
6       Online Selection of Courses and Centers (1st Counseling)        19 - 29 July 2024
7       Declaration of First Round of Seat Allocation                   31 July 2024
8       Last Date of Payment of first installment for candidates        7 August 2024 (till 5 pm)
        allocated seats through the first round
9       Declaration of Second Round of Seat Allocation                  9 August 2024
10      Last Date of Payment of first installment for candidates        14 August 2024 (till 5 pm)
        allocated seats through the second round
11      Payment of Caution Deposit and Online selection of course and   16 - 22 August, 2024 (till 5 pm)
        centre (2nd Counseling)
12      Declaration of Third Round of Seat Allocation                   23 August 2024
        (based on 2nd Counseling)
13      Last Date of Payment of Balance Course Fee                      26 August 2024
14      Last Date of Registration of Students                           28 August 2024
15      Start of PG Diploma Courses across India                        29 August 2024


COURSE FEE

The Post Graduate Diploma in Big Data Analytics (PG-DBDA) course will be delivered in fully ONLINE or fully PHYSICAL mode. The total course fee and payment details for each mode of delivery are detailed below:

1. PHYSICAL Mode of Delivery:
The course fee for the fully PHYSICAL mode of delivery is INR. 1,15,000/- plus Goods and Service Tax (GST) as applicable by Government of India (GOI).
The course fee for PG-DBDA has to be paid in two installments as per the schedule.

  • First installment is INR. 10,000/- plus Goods and Service Tax (GST) as applicable by GOI.
  • Second installment is INR. 1,05,000/- plus Goods and Service Tax (GST) as applicable by GOI.
2. ONLINE Mode of Delivery:
The course fee of the fully ONLINE mode of delivery is INR. 97,750/- plus Goods and Service Tax (GST) as applicable by GOI.
The course fee for PG-DBDA has to be paid in two installments as per the schedule.

  • First installment is INR. 10,000/- plus Goods and Service Tax (GST) as applicable by GOI.
  • Second installment is INR. 87,750/- plus Goods and Service Tax (GST) as applicable by GOI.

  • The course fee includes expenses towards delivering classes, conducting examinations, final mark-list and certificate, and placement assistance.

    The first installment course fee of Rs 10,000/- + GST on it as applicable at the time of payment is to be paid online as per the schedule. It can be paid using credit/debit cards through the payment gateway. The first installment of the course fees is to be paid after seat is allocated during counseling rounds.

    The second installment of the course fees is to be paid before the course commencement through NEFT.

    NOTE: Candidates may take note that no Demand Draft (DD) or cheque or cash will be accepted at any C-DAC training centre towards payment of any installment of course fees
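
As a rough aid for working out the amount payable, the installments above can be combined with GST in a few lines of Python. The base amounts are those stated in this section; the GST rate is not stated here, so `gst_percent` is a parameter and the 18% used in the example is purely an assumed placeholder. The function name `payable` is also hypothetical.

```python
from decimal import Decimal

# Base installments in INR from the fee section, before GST.
INSTALLMENTS = {
    "physical": (Decimal("10000"), Decimal("105000")),
    "online": (Decimal("10000"), Decimal("87750")),
}


def payable(mode, gst_percent):
    """Return (first, second, total) in INR including GST for the given mode."""
    first, second = (
        amount * (100 + gst_percent) / 100 for amount in INSTALLMENTS[mode]
    )
    return first, second, first + second


if __name__ == "__main__":
    # Assumed placeholder rate; use the rate applicable at the time of payment.
    assumed_gst = Decimal("18")
    for mode in INSTALLMENTS:
        print(mode, payable(mode, assumed_gst))
```

The base installments add up to the stated course fees: 10,000 + 1,05,000 = 1,15,000 for PHYSICAL mode and 10,000 + 87,750 = 97,750 for ONLINE mode.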

    C-CAT PREPARATION

    1) From the current academic year, admissions to all PG Diploma courses will be made through a Common Admission Test (C-CAT).
    2) C-CAT will be conducted in the form of three test papers labeled as:

    SECTION - A (English, Quantitative Aptitude, Reasoning, Computer Fundamentals & Concepts of Programming)
    SECTION - B (C Programming, Data Structures, Object Oriented Programming Concepts using C++, Operating Systems & Networking, Basics of Big Data & Artificial Intelligence)
    SECTION - C (Computer Architecture, Digital Electronics, Microprocessors)

    Depending upon the choice(s) of the programme(s) made by the candidate he/she will have to either appear in just one test paper (SECTION - A) or two test papers (SECTION - A and SECTION - B) or all the three test papers (SECTION - A, SECTION - B and SECTION - C).

    Depending on the course chosen, candidates need to appear for the test papers (relevant sections) as per the table given below:

    Programme(s)                                  Test paper(s) to be taken
    PG Diploma in Big Data Analytics (PG-DBDA)    Section A + Section B

    3) Candidates who qualify in C-CAT 2024 will be offered admission to the various PG Diploma courses on the basis of their ranks and choices. There is no age restriction to appear in C-CAT 2024.

    4) Candidates may choose one of the dates as per their convenience while filling the application. The choice of date once made will not be altered unless approved in writing by C-DAC.

    5) C-CAT 2024 will be held on 06 July 2024 and 07 July 2024. Candidates may choose one of the cities as per their convenience while filling the application. The choice of city once made will not be altered unless approved in writing by C-DAC.

    6) To apply for admission to a desired programme, a candidate is required to qualify in the corresponding test paper(s) and also satisfy the minimum eligibility criteria of the respective academic programme.

    7) There is no age restriction for admission to C-DAC’s PG Diploma courses. Candidates who have appeared for the final examination of their qualifying degree in 2024 will also be considered for admission to the above courses. By qualifying in C-DAC's admission tests of July 2024, such university result-awaiting candidates can apply for provisional admission in August 2024, subject to the conditions that: (a) all parts of their qualifying degree examination shall be completed by the date of joining the course, and (b) proof of having passed the qualifying degree with at least the required minimum marks shall be submitted at C-DAC by 31 December 2024.

    8) The candidates will be provided ranks based on their performance in Section A, Sections A+B, and Sections A+B+C. If a candidate appears in multiple sections, he/she shall be provided multiple ranks accordingly. For example, if a candidate appears in Section A and Section B, he/she shall be provided two ranks: one based on performance in Section A and one based on performance in Sections A and B. A candidate can appear only in those sections which are chosen at the time of filling in the application. A candidate who has not appeared for a particular section will not get any position in the merit lists which span that section. For each programme, a separate merit list will be prepared from the list of candidates opting for that programme. Admissions to various programmes at different centres will be made on the basis of merit in C-CAT 2024, subject to fulfilment of eligibility requirements.

    9) Candidates should note that mere appearance in C-CAT 2024 or being in any of the merit lists neither guarantees nor provides any automatic entitlement to admission. Qualified candidates will have to apply for admission as per the prescribed procedure. Admissions shall be made in order of merit based on the choice exercised by the candidate and depending on the number of seats available in the programmes at the Admitting Centre(s).

    10) With regard to the interpretation of the provisions of any matter not covered in this Information Brochure, the decision of the C-DAC shall be final and binding on all the parties concerned.

    The C-CAT will test the candidate's knowledge of the above topics. The candidate must possess good knowledge of C Language in terms of the syntax and its appropriate use. The candidate should carefully study the books recommended herein. However, merely reading language constructs from the book cannot develop programming ability. It is absolutely necessary to actually write one's own code in C Language and implement at least 100 good C Programs on a computer. These programs should be of increasing complexity and should exploit appropriate constructs and advanced features of C. Candidates should solve all the problems given in the recommended books. This will help the candidates not only in mastering the language but also in developing good problem-solving ability, which is most critical for any successful career.

    The applicant should also practice the use of good features of the language, modularize his/her code, put suitable comments to improve readability of the code, make extensive use of library routines and format the programs to express the logical flow clearly.

    The candidate should note that the rigorous programming practice prescribed above is required not only to succeed in the C-CAT but also to learn the various modules of PG-DBDA at a rapid pace. Rigorous programming practice is in fact the most important prerequisite for undertaking the PG-DBDA course and for a possible successful career in the IT industry thereafter. The candidate may avail the facility of the online Pre-CAT course. The candidate may also contact the nearest Authorised Training Centre for attending the Pre-CAT course.



    SYLLABUS FOR COMMON ENTRANCE TEST (C-CAT)

    The C-CAT will be conducted in computerized mode in various cities across India. The C-CAT centres will be allocated to candidates on a first-come, first-served basis of application, depending on the centres' seating capacity.

    The C-CAT date and city once selected in the online application form cannot be changed unless approved in writing by C-DAC, subject to availability of seats in requested city. All such signed letters of requests with proof of valid reasons should be received at C-DAC ACTS, 5th Floor, Innovation Park, Sr. No. 34/B/1, Panchvati, Pashan, Pune 411008, before the last date of C-CAT application.
    C-CAT will be conducted in the form of three objective-type test papers labeled as:

    Section – A (English, Quantitative Aptitude, Reasoning, Computer Fundamentals & Concepts of Programming)
    Section – B (C Programming, Data Structures, Object Oriented Programming Concepts using C++, Operating Systems & Networking, Basics of Big Data & Artificial Intelligence)
    Section – C (Computer Architecture, Digital Electronics, Microprocessors)

    Every section will have 50 objective-type questions of 3 marks each (maximum 150 marks for any one section). Each objective-type question in C-CAT will have four choices as possible answers of which only one will be correct. There will be +3 (plus three) marks for each correct answer and -1 (minus one) for each wrong answer. Multiple answers to a question will be treated as a wrong answer. For each un-attempted question, 0 (zero) mark will be awarded.
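
The marking scheme above reduces to a one-line formula, sketched here in Python. The function name `section_score` and its argument names are illustrative choices, not part of any C-DAC system; a question marked with multiple answers is counted under `wrong`, as the rule above states.

```python
def section_score(correct, wrong, total_questions=50):
    """Marks for one section: +3 per correct answer, -1 per wrong answer,
    0 for each un-attempted question."""
    if correct < 0 or wrong < 0 or correct + wrong > total_questions:
        raise ValueError("correct + wrong must lie between 0 and total_questions")
    return 3 * correct - wrong


if __name__ == "__main__":
    # 35 correct, 10 wrong, 5 un-attempted out of 50 questions.
    print(section_score(35, 10))
```

For instance, 35 correct and 10 wrong answers with 5 questions left un-attempted give 3 × 35 − 10 = 95 marks out of 150.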

    TEST PAPER    TOPICS                                                                DURATION
    Section A     English, Quantitative Aptitude, Reasoning, Computer Fundamentals &    1 hour
                  Concepts of Programming
    Section B     C Programming, Data Structures, Object Oriented Programming Concepts  1 hour
                  using C++, Operating Systems & Networking, Basics of Big Data &
                  Artificial Intelligence
    Section C     Computer Architecture, Digital Electronics, Microprocessors           1 hour


    C-CAT SECTION  TOPIC                                            REFERENCE BOOK
    A              English                                          Any High School Grammar Book (e.g. Wren & Martin)
                   Quantitative Aptitude & Reasoning                Quantitative Aptitude Fully Solved (R. S. Aggarwal); Quantitative Aptitude (M. Tyra); Barron’s New GRE
                   Computer Fundamentals & Concepts of Programming  Foundations of Computing (Pradeep Sinha & Priti Sinha)
    B              C Programming                                    C Programming Language (Kernighan & Ritchie); Let Us C (Yashavant Kanetkar)
                   Data Structures                                  Data Structures Through C in Depth (S. K. Srivastava)
                   Operating Systems & Networking                   Operating System Principles (Silberschatz, Galvin, Gagne); Data Communication & Networking (Forouzan)
                   OOP Concepts using C++                           Test Your C++ Skills (Yashavant Kanetkar)
                   Basics of Big Data & AI                          Fundamentals of Data Engineering (Joe Reis, Matt Housley); Artificial Intelligence for Dummies (John Paul Mueller, Luca Massaron)
    C              Computer Architecture                            Computer Organization & Architecture (William Stallings)
                   Digital Electronics                              Digital Design (Morris Mano); Digital Design: Principles & Practices (John Wakerly); Modern Digital Electronics (R. P. Jain)
                   Microprocessors                                  Microprocessor Architecture, Programming & Applications with 8085 (Ramesh Gaonkar); The Intel Microprocessor (Barry Brey)


    August 2024 C-CAT SCHEDULE

    Schedule of August 2024 C-CAT (The slot timings may vary slightly. The final timings will be printed on the admit cards.)

    C-CAT Dates                    Test Paper    Morning Slot Timings    Afternoon Slot Timings
    6 July 2024 and 7 July 2024    Section A     9:30 am – 10:30 am      2:00 pm – 3:00 pm
                                   Section B     10:45 am – 11:45 am     3:15 pm – 4:15 pm
                                   Section C     12:00 noon – 1:00 pm    4:30 pm – 5:30 pm



    IMPORTANT NOTE
    In all matters concerning C-CAT, the decision of C-DAC will be final and binding on all the applicants.


    COMPUTING FACILITIES

    Given below is the computing setup that exists at our institute. Each dedicated client node is shared by 2 students, with a minimum of 6 hours of computer time per day. The institute is open 24 hours a day, including Sundays and holidays.

    Servers
    Windows 2012 Server,
    SCO Unix Server with ODT or Fedora or Sun Solaris,
    Application Servers / Dummy Servers configured for various modules.

    Configuration
    Quad Core 1.3 GHz with 8 GB RAM,
    Fast Wide SCSI Interface,
    1 TB Fast HDD (minimum),
    LED Color Monitor (18.5"),
    AGP Card with 4/8 MB VRAM,
    PCI Network Card 10/100 BaseT UTP Ethernet,
    DVD RW Drive,

    Client Machines / Network Nodes
    Configuration
    Core i7 3.0 GHz, 8 GB RAM,
    1 TB Hard Disk IDE,
    LED Color Monitor (18.5"),
    AGP- 64 Bit VGA Card with 8 MB/4 MB VRAM,
    PCI Network Card 10/100 BaseT UTP Ethernet,
    Microsoft/Logitech Mouse,
    2 serial ports; 1 parallel port,
    104 Keys Keyboard.

    Network
    Network 10/100 BaseT UTP Switches

    Communication and Internet
    Leased Line 16 Mbps Connectivity.

    Printers
    HP LaserJet Printer

    Additional Lab Equipment / Audio Visual Equipment
    Sound cards,
    Video cards,
    Color Scanner,
    Modem 56 KBPS,
    Microphones,
    Speakers,
    Television Set,
    Hi-Lumen OHPs,
    Video Projection Unit (SVGA/XGA Compatible).

    Common Software and Operating Environments
    SCO Unix OR Fedora and Windows 2008 Server,
    Windows IIS Server, etc.
    Suitable CASE Tools
    JDK 1.8, Java Web Server, Eclipse, JBoss,
    Oracle 11g,
    MS SQL Server 2008
    Microsoft Office 2010
    Python 3.3.4
    MySQL, MongoDB and NoSQL
    D3.js, npm Package for Node.js / Tableau software
    VMware
    Eclipse IDE Juno
    Packages of Hadoop & Hadoop Distribution
    Wireshark
    Intel Parallel Studio XE
    R Packages
    MPI


    EVALUATION METHODOLOGY

    The evaluation process forms an important part of the course that leads to conferring the Diploma in Advanced Computing upon the eligible students.
    The evaluation is a continuous process that goes on throughout the duration of the course. Normally, evaluation for each module is carried out as soon as the module ends and the results for each module are announced within fifteen days of the end of the module. The final result of the Diploma in Advanced Computing course is usually declared within 15 days of completing evaluation of the final module of the course.
    The evaluation will consist of three components: a written test, a laboratory test and ongoing evaluation of lab assignments.

    The weightage for each component will normally be:

    Weightage                                                                Percentage
    Theory examination (CEE), conducted by C-DAC ACTS                        40%
    Laboratory examination                                                   40%
    Internal marks (Lab assignments, surprise tests, viva, seminars etc.)    20%

    There may be variation in these ratios for the following modules: Operating System Concepts, Software Engineering, and Data Communication and Networking.

    A student will have to score a minimum of 40% marks in each component of the evaluation in order to successfully complete any module. A student will have to successfully complete all modules of the course to be eligible for receiving the Diploma in Advanced Computing. The question papers for the theory as well as the laboratory examinations at all the centers will be set by ACTS, Pune. The evaluation of the written and laboratory examinations will be conducted locally by the centers according to guidelines and model answers provided by ACTS, Pune. The lab examination problems will also be provided by ACTS, Pune.
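
The usual weightage and the 40% minimum per component can be combined as in the sketch below. It assumes each component score is expressed as a percentage (0 to 100) and uses the normal 40/40/20 split; the names `WEIGHTS` and `module_result` are hypothetical, and modules with different ratios would need their own weights.

```python
# Normal weightage per component, in percent.
WEIGHTS = {"theory": 40, "lab": 40, "internal": 20}
# Minimum percentage required in each component to complete a module.
MIN_COMPONENT_PERCENT = 40


def module_result(scores):
    """scores maps each component name to a percentage (0-100).

    Returns (weighted_percentage, passed).
    """
    weighted = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
    passed = all(scores[name] >= MIN_COMPONENT_PERCENT for name in WEIGHTS)
    return weighted, passed


if __name__ == "__main__":
    print(module_result({"theory": 70, "lab": 35, "internal": 80}))
```

In this example the weighted score is 58%, yet the module is not cleared because the laboratory component is below 40%.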



    The student will be awarded a grade based on his/her aggregate score of all modules as per the following scale:

    Grade    Percentage
    A+       85% and above
    A        70-84.9%
    B        60-69.9%
    C        50-59.9%
    D        40-49.9%
    F        Below 40%
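
The grade scale above maps to a simple lookup, sketched here in Python. The function name `grade_for` is illustrative, and treating an aggregate that falls between two printed bands (for example 84.95%) as belonging to the lower band is an assumption.

```python
def grade_for(aggregate_percent):
    """Return the grade for an aggregate percentage per the scale above."""
    # Lower bound of each band, highest band first.
    bands = [(85, "A+"), (70, "A"), (60, "B"), (50, "C"), (40, "D")]
    for lower_bound, grade in bands:
        if aggregate_percent >= lower_bound:
            return grade
    return "F"


if __name__ == "__main__":
    for value in (91.5, 70, 59.9, 39.9):
        print(value, grade_for(value))
```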

    A student who is absent for a test or is unable to successfully clear any module at the first attempt may be allowed to appear for a re-examination at the discretion of the course coordinator. However, the score at the re-examination will be de-rated by 20%. Only one re-examination will be conducted.

    A student has to successfully complete all the modules and clear both lab and theory exams in order to be eligible to receive the Diploma in Advanced Computing. Students unable to complete all the modules within the course duration will be awarded a certificate for the modules they have successfully cleared. No student will be allowed to appear for any module after completion of the course duration. Performance statements and certificates will be issued to all students by ACTS, Pune within 15 days of completing evaluation of the final module of the course.


    Course Contents

    Linux Programming

    Installation (Ubuntu and CentOS), Basics of Linux, Configuring Linux, Shells, Commands, and Navigation, Common Text Editors, Administering Linux, Introduction to Users and Groups, Linux shell scripting, shell computing, Introduction to enterprise computing, Remote access.

    Introduction to Cloud Computing

    Cloud Computing Basics, Understanding Cloud Vendors (AWS/Azure/GCP), Definition, Characteristics, Components, Cloud provider, SAAS, PAAS, IAAS and other Organizational scenarios of clouds, Administering & Monitoring cloud services, benefits and limitations, Deploy application over cloud. Comparison among SAAS, PAAS, IAAS, Cloud Products and Solutions, Cloud Pricing, Compute Products and Services, Elastic Cloud Compute, Dashboard.

    Python Programming

    Python basics, If, If- else, Nested if-else, Looping, For, While, Nested loops, Control Structure, Break, Continue, Pass, Strings and Tuples, Accessing Strings, Basic Operations, String slices, Working with Lists, Accessing list, Operations, Function and Methods, Files, Modules, Dictionaries, Functions and Functional Programming, Declaring and calling Functions, Declare, assign and retrieve values from Lists, Introducing Tuples, Accessing tuples, Visualizing using Matplotlib, Seaborn, OOPs concept, Class and object, Attributes, Inheritance, Overloading, Overriding, Data hiding, Operations Exception, Exception Handling, except clause, Try-finally clause, User Defined Exceptions, Data wrangling, Data cleaning.

    R Programming

    Reading and Getting Data into R, Exporting Data from R, Data Objects - Data Types & Data Structures, Viewing Named Objects, Structure of Data Items, Manipulating and Processing Data in R (Creating, Accessing, Sorting data frames, Extracting, Combining, Merging, reshaping data frames), Control Structures, Functions in R (numeric, character, statistical), working with objects, Viewing Objects within Objects, Constructing Data Objects, Packages - tidyverse, dplyr, tidyr etc., Queuing Theory, Non-parametric Tests - ANOVA, Chi-Square, t-Test, U-Test, Interactive reporting with R Markdown, Introduction to R Shiny.

    OOPs Concepts, Data Types, Operators and Language Constructs, Inner Classes and Inheritance, Interface and Package, Exceptions, Collections, Threads, java.lang, java.util, Java Virtual Machine, Reflection in JVM, JVM’s architecture, Lambda Expressions, Functional Programming and Interfaces, Introduction to Streams, Introduction to JDBC API.

    Introduction to Business Analytics using some case studies, Summary Statistics, Making Right Business Decisions based on data, Statistical Concepts, Descriptive Statistics and its measures, Probability theory, Probability Distributions (Continuous and discrete- Normal, Binomial and Poisson distribution) and Data, Sampling and Estimation, Statistical Interfaces, Predictive modeling and analysis, Bayes’ Theorem, Central Limit theorem, Data Exploration & preparation, Concepts of Correlation, Covariance, Outliers, Regression Analysis, Forecasting Techniques, Simulation and Risk Analysis, Optimization, Linear, Nonlinear, Integer, Overview of Factor Analysis, Directional Data Analytics, Functional Data Analysis , Predictive Modelling (From Correlation To Supervised Segmentation): Identifying Informative Attributes, Segmenting Data By Progressive Attributive, Models, Induction And Prediction, Supervised Segmentation, Visualizing Segmentations, Trees As Set Of Rules, Probability Estimation; Overfitting And Its Avoidance: Generalization, Holdout Evaluation Vs Cross Validation; Decision Analytics: Evaluating Classifiers, Analytical Framework, Evaluation, Baseline, Performance And Implications For Investments In Data; Evidence And Probabilities, Explicit Evidence Combination With Bayes Rule, Probabilistic Reasoning, Business Strategy, Achieving Competitive Advantages, Sustaining Competitive Advantages.

    Python Libraries

    Pandas, NumPy, SciPy, Scrapy, Plotly, Beautiful Soup

    Database Concepts (File System and DBMS), OLAP vs OLTP, Database Storage Structures (Tablespace, Control files, Data files), Structured and Unstructured data, SQL Commands (DDL, DML & DCL), Stored functions and procedures in SQL, Conditional Constructs in SQL, data collection, Designing Database schema, Normal Forms and ER Diagram, Relational Database modelling, Stored Procedures, Triggers. The tools and how data can be gathered in a systematic fashion, Data Warehousing concept, NoSQL, Data Models - XML, working with MongoDB, Cassandra - overview, architecture, comparison with MongoDB, working with Cassandra, Connecting DBs with Python, Introduction to Data Driven Decisions, Enterprise Data Management, data preparation and cleaning techniques.

    Introduction to Big Data

    Beyond the Hype, Big Data Skills and Sources of Big Data, Big Data Adoption, Research and Changing Nature of Data Repositories, Data Sharing and Reuse Practices and Their Implications for Repository Data Curation.

    Hadoop

    Introduction to Big Data programming with Hadoop, The ecosystem and stack, The Hadoop Distributed File System (HDFS), Components of Hadoop, Design of HDFS, Java interfaces to HDFS, Architecture overview, Development Environment, Hadoop distribution and basic commands, Eclipse development, The HDFS command line and web interfaces, The HDFS Java API (lab), Analyzing the Data with Hadoop, Scaling Out, Hadoop event stream processing, complex event processing, MapReduce Introduction, Developing a MapReduce Application, How MapReduce Works, Anatomy of a MapReduce Job Run, Failures, Job Scheduling, Shuffle and Sort, Task execution, MapReduce Types and Formats, MapReduce Features, Real-World MapReduce.

    Hadoop Environment

    Setting up a Hadoop Cluster, Cluster specification, Cluster Setup and Installation, Hadoop Configuration, Security in Hadoop, Administering Hadoop, HDFS – Monitoring & Maintenance, Hadoop benchmarks.

    Apache Airflow

    Introduction to Data warehousing and Data lakes, Designing Data warehousing for an ETL Data Pipeline, Designing Data Lakes for ETL Data Pipeline, ETL vs ELT.

    Introduction to HIVE

    Programming with Hive: Data warehouse system for Hadoop, Optimizing with Combiners and Partitioners (lab), Bucketing, more common algorithms: sorting, indexing and searching (lab), Relational manipulation: map-side and reduce-side joins (lab), evolution, purpose and use, Case Studies on Ingestion and warehousing.

    HBase

    Overview, comparison and architecture, java client API, CRUD operations and security

    Apache Spark APIs for large-scale data processing

    Overview, Linking with Spark, Initializing Spark, Resilient Distributed Datasets (RDDs), External Datasets, RDD Operations, Passing Functions to Spark, Job optimization, Working with Key-Value Pairs, Shuffle operations, RDD Persistence, Removing Data, Shared Variables, EDA using PySpark, Deploying to a Cluster, Spark Streaming, Spark MLlib and ML APIs, Spark DataFrames/Spark SQL, Integration of Spark and Kafka, Setting up Kafka Producer and Consumer, Kafka Connect API, MapReduce, Connecting DBs with Spark.

    Business Intelligence - requirements, content and management, Information Visualization, Data Analytics Life Cycle, Analytic Processes and Tools, Analysis vs. Reporting, MS Excel: Functions, Formulas, Charts, Pivots and Lookups, Data Analysis ToolPak: Descriptive Summaries, Correlation, Regression, Introduction to Power BI, Modern Data Analytic Tools, Visualization Techniques.

    Supervised and Unsupervised Learning, Uses of Machine Learning, Clustering, K-means, Hierarchical Clustering, Decision Trees, Classification problems, Bayesian analysis and Naïve Bayes classifier, Random Forest, Gradient Boosting Machines, Association rules learning, PCA, Apriori, Support Vector Machines, Linear and Non-linear classification, ARIMA, XGBoost, CatBoost, Neural Networks and their applications, TensorFlow 2.x framework, Deep learning algorithms, KNN, NLP, BERT in NLP, NLP transformers, NLTK, Introduction to PyTorch framework, AI and its applications.
