The “Big Data Analysis” Educational Program includes disciplines on the fundamentals of information technology and software that enable students to position themselves as data analysts, including the development and maintenance of data analysis systems of various scales. On completing the program, students will be fluent in data analysis and in several programming languages, including Python, and will be able to develop data analysis models and methods for large organizations such as banks, insurance companies, and government and national bodies, as well as to develop databases and web applications.
Admission Committee
(7172) 64-57-10
info@astanait.edu.kz
Mon-Fri 9:00 – 18:00
The goal of the study program is to provide practice-oriented training of highly qualified computer science specialists with general cultural and professional competences in big data analysis, and to create conditions for continuous professional self-improvement, the development of specialists' social and personal competencies, and greater social mobility and competitiveness in the labor market.
The course examines the modern history of Kazakhstan as part of the history of humankind, Eurasia, and Central Asia. It offers a holistic study of the historical events, phenomena, facts, and processes that have taken place on the territory of the Great Steppe from the twentieth century to the present day, and identifies the historical patterns behind them.
The object of study of the discipline is philosophy as a special form of spiritual inquiry, considered in its cultural and historical development and in its contemporary significance. The main directions and problems of world and national philosophy are studied. Philosophy is a special form of cognition of the world that creates a system of knowledge about the general principles and foundations of human life and about the essential characteristics of a person’s relationship to nature, society, and spiritual life, in all its main directions.
The course is an intensive English language program focused on grammar and conversational skills. It includes topics reflecting the latest developments in information technology, and a terminology glossary makes them directly relevant to the needs of students.
The course occupies a special place in the training of bachelor's students in engineering. For engineering students, the study of professional Kazakh/Russian not only enhances the skills and abilities acquired at school, but is also a means of mastering their future profession, with a focus on writing and well-reasoned oral speech that allows for effective communication.
In the course, information and communication technologies are considered as modern methods and means of communication between people in everyday and professional activities, using information technologies for the search, collection, storage, processing, and dissemination of information.
The course is dedicated to general political knowledge for specialties in the field of ICT. It covers the development of political self-awareness and the improvement of one’s political outlook and communicative competencies. The teaching of political knowledge is communicative, interactive, student-oriented, and result-oriented, and depends largely on the independent work of students.
The course includes knowledge of sociological subject areas, research methods and directions. The course will discuss in detail the basic sociological theories and the most effective ways of gaining deep knowledge about various aspects of our modern society. The special significance of this course for students is to develop a sociological imagination, to understand the basic concepts of sociology as a science.
This course presents questions of psychology in a wide educational and social context. The knowledge and skills acquired and formed as a result of mastering the course content give students the opportunity to put them into practice in various spheres of life: personal, family, professional, business, social, in working with people from different social groups and age groups.
The course is also designed to shape bachelor's students' ideas about the factors that complicate teaching at the present stage of the development of society and about the difficulties specific to this activity. The course serves as a basis for the study of the whole complex of social sciences and humanities, and as a complement to general courses in history and philosophy. The course includes topics such as the morphology, semiotics, and anatomy of culture; the culture of the nomads of Kazakhstan; the cultural heritage of the proto-Türks; the medieval culture of Central Asia; the formation of Kazakh culture; Kazakh culture in the context of globalization; the cultural policy of Kazakhstan; and others.
The course is devoted to developing the physical culture of the individual and the ability to make purposeful use of various means of physical culture to maintain and strengthen health.
The course aims to develop an understanding of the fundamentals of linear algebra and matrix theory. The subject of the discipline is the basic properties of matrices, including determinants, inverse matrices, matrix factorizations, eigenvalues, linear transformations, etc.
The academic discipline covers the analysis of functions represented in a variety of ways and the relationships between these representations, the meaning of the derivative in terms of a rate of change and local linear approximation, and the use of derivatives to solve a variety of problems. The course aims to give students the mathematical foundation for solving applied problems in their specialty.
Discrete mathematics is a part of mathematics devoted to the study of discrete objects (here discrete means consisting of separate or unrelated elements). More generally, discrete mathematics is used whenever objects are counted, when relationships between finite (or countable) sets are studied, and when processes involving a finite number of steps are analyzed. The main reason for the growing importance of discrete mathematics is that information is stored and processed by computing machines in a discrete manner.
The course teaches students to study the patterns of random phenomena and their properties, and to use them for data analysis. As a result of studying this discipline, students will know the basic concepts of probability theory and mathematical statistics and their properties, and will be able to use probabilistic models in solving problems, work with random variables, calculate sample characteristics, and evaluate the reliability of statistical data.
Academic Writing aims to develop the ability to differentiate writing styles in English; skills in critical reading and writing strategies that foster critical thinking and support a critical analysis of a written piece; an understanding of academic vocabulary, grammar, and style; and skills in writing well-structured paragraphs, statements supported by arguments and evidence, and an academic essay.
Educational practice is an integral part of the student training program. The main content of the practice is the completion of practical educational, research, and creative tasks that correspond to the nature of the students' future professional activity. The purpose of educational practice is to study and consolidate the theoretical and practical knowledge obtained in the disciplines during the learning process, and to develop students' creative activity and initiative, their artistic and creative needs, and their aesthetic worldview.
The course examines the basic, classical algorithms and data structures used in programming. It considers the principles of constructing and describing algorithms, the concepts of algorithm complexity and performance, and the main classes of algorithms.
This discipline provides an introduction to the mathematics necessary for mastering specialized disciplines of computational science based on numerical solutions of deterministic and probabilistic equations of mathematical physics and of applied models used in technical production and the financial sector. Specifically, it covers the theory of ordinary differential equations, their classification, and basic methods of analytical solution, as well as an introduction to partial differential equations.
The course provides knowledge and skills in database design, starting from the conceptual stage and ending with physical implementation.
The course teaches students to use data structures, functions, modules, classes, and other features of the Python programming language to solve applied problems.
The course introduces students to the concept of software development based on objects and their interaction. In this discipline, students will create classes and objects, define their properties and methods, and use inheritance and polymorphism to create flexible and modular software systems. Object-oriented programming is a widely used programming paradigm, and understanding its principles and practices is important for future software developers.
The course aims to teach the basics of operating systems and computer networks that software developers need in order to understand the basic principles of using, storing, and transmitting data.
The academic discipline is aimed at developing skills in using project management tools at various stages of the project life cycle. The subject of the discipline is the qualitative and quantitative assessment of project risks and the determination of a project's effectiveness.
The course teaches students to use a programming language to develop functional websites and interfaces, and to master the basics of working and interacting with a database. It covers the development of functionality and user interfaces that run on the client side of an application or website. In the process, students will have the opportunity to create and develop a convenient, simple, and in-demand website.
The course presents a database design methodology for NoSQL systems. The approach is based on NoAM (NoSQL Abstract Model), an abstract data model for NoSQL databases that takes advantage of the common features of various NoSQL systems and is used to define a system-independent representation of the application data. Overall, the methodology aims to support the scalability, performance, and consistency required for next-generation web applications.
The course is intended for a more advanced study of the Java programming language, including JSP (JavaServer Pages), Servlets, and JDBC (Java Database Connectivity), as well as many core principles of Java Enterprise Edition (Advanced Java EE).
The course is based on concrete examples: mathematical methods are developed through examples, and algorithms are constructed to solve concrete problems. The course includes the following topics: recursions, sums, integer functions, elementary number theory, binomial coefficients, special numbers, generating functions, discrete probability, and asymptotics.
This course is an intermediate class covering the design of computer algorithms and the analysis of sophisticated algorithms. Students learn how to analyze the asymptotic performance of algorithms, and gain familiarity with major algorithms and data structures. They also apply important algorithmic design paradigms and methods of analysis, in addition to synthesizing efficient algorithms in common engineering design situations. Course materials are designed to help students understand the difference between tractable and intractable problems and to become familiar with strategies to deal with intractability.
This course is designed to teach the basics of mobile development. Mobile applications developed during the course can be uploaded to the university repositories and published on the Play Store.
The discipline introduces students to the main directions in the development and use of data storage systems. The purpose of teaching the discipline is to create a base for applying modern methods of data collection and analysis to practical problems, and to develop students' ability to create the data warehouse architecture needed to analyze large data sets and obtain aggregated information.
“Applied Machine Learning” is a discipline that focuses on the practical application of machine learning techniques to solve real-world problems and make predictions or decisions based on data. It covers a wide range of topics such as data preprocessing, feature selection and engineering, model selection and evaluation, supervised and unsupervised learning algorithms, ensemble methods, deep learning, and ethical considerations in machine learning. By studying this discipline, learners will gain a solid understanding of the key concepts and algorithms in machine learning and develop the skills to apply them effectively to various domains and datasets. They will learn how to preprocess and transform data, select appropriate features, train and evaluate models, optimize hyperparameters, interpret model results, and deploy machine learning solutions. The learning outcomes include the ability to identify suitable machine learning techniques for specific problems, build accurate and robust predictive models, and leverage machine learning to derive insights and make informed decisions from complex datasets.
The course is designed to study the basics of working with big data and the principles of high-performance computing. Big data implies the existence of huge arrays of structured and unstructured information, and the choice of tools for their efficient processing and extraction of useful information.
The course is aimed at studying the principles of operation of modern microprocessor technology, which is the basis of general-purpose and specialized computers and embedded systems, and the methods of organizing the interaction of a microprocessor with memory and external devices. During the course, students should gain an understanding of the distinctive features of the internal structure of a modern microprocessor.
In this course, students learn to implement agents based on Deep Reinforcement Learning, a type of machine learning in which an agent learns to behave in an environment by performing actions and receiving feedback. Students use TensorFlow and PyTorch to create agents that learn on their own in simple games. By exploring these methods, students will be able to apply agents based on deep reinforcement learning in applied areas.
“Information Retrieval & Data Mining” is a discipline that explores the principles, techniques, and algorithms for efficiently extracting and analyzing useful information from large datasets. It encompasses various aspects such as information retrieval, which focuses on searching and retrieving relevant information from unstructured data sources like documents or the web, and data mining, which involves discovering patterns, relationships, and insights from structured and unstructured data. This discipline covers topics such as data preprocessing, retrieval models, indexing, query languages, data clustering, classification, association analysis, and evaluation metrics. By studying this discipline, learners will develop a solid understanding of the theoretical foundations and practical techniques used in information retrieval and data mining. They will acquire the skills to design and implement effective retrieval and mining systems, evaluate their performance, and apply them to real-world problems, enabling them to make data-driven decisions and derive valuable insights from vast amounts of information.
The course covers natural language processing methods for the analysis of medical texts, including methods of classification, information extraction, automatic text processing, and machine learning. During the course, students become acquainted with the principles and applications of natural language processing methods in medicine, as well as with software tools for processing medical texts. After completing the discipline, students will be able to apply natural language processing methods to analyze medical texts, extract information, and evaluate the quality of medical records.
The Capstone project is a final project in which students form teams and work on real problems in the field of electronic engineering. During the project, they apply the knowledge and skills they have acquired by developing a concept and then designing, modeling, testing, and implementing their solutions. They also develop communication and management skills by presenting their results in the form of presentations and reports. The Capstone project allows students to gain hands-on experience and prepare for a career in electronic engineering.
The course is designed to study the basic methods and tools required to begin conducting scientific research. The course also introduces students to the most popular search and scientometric databases of scientific articles, such as Web of Science, Scopus, ScienceDirect, and others. During the course, students will become familiar with tools for citing and searching for the scientific information they need.
In the first part of the discipline “Statistics & Data Science”, students will delve into the fundamental concepts and techniques of statistics. This section focuses on providing a strong statistical foundation that is essential for working with big data. Students will explore topics such as descriptive statistics, probability theory, hypothesis testing, and regression analysis, and will learn how to summarize and interpret large datasets using measures of central tendency, dispersion, and graphical representations.
In the second part of the discipline “Statistics & Data Science”, students focus on the practical application of data science techniques to extract valuable insights from large-scale datasets. This section covers various aspects of data science, including data preprocessing, data visualization, machine learning, and data mining. Students will learn how to clean, transform, and preprocess big data to address common issues, such as missing values, outliers, and inconsistencies, ensuring data quality for further analysis; explore different visualization techniques to effectively present and communicate insights derived from big data, enabling them to identify patterns, trends, and anomalies; become familiar with machine learning algorithms and techniques suitable for big data analytics, such as classification, regression, clustering, and dimensionality reduction, enabling automated pattern recognition and prediction; and understand the principles and methods of data mining to discover meaningful patterns and relationships in large datasets, facilitating decision-making and uncovering hidden insights.
The course covers the collection and analysis of materials for writing a graduation project.
The course is devoted to the study of information security technologies.
Topics include (but are not limited to) bioinformatics databases, sequence and structure alignment, and differential gene expression analysis. In addition, students will learn how to compare results between different samples.
This course considers the possibilities of legal attacks on various web resources. As part of this course, students will learn how to find vulnerabilities and exploit them. Security bypass methods, the TCP/IP network protocols, Windows internals, and the Python programming language will also be covered.
The course focuses on the practical application of the MapReduce distributed computing model. The algorithms are implemented with Hadoop, a freely distributed set of utilities that is used to implement the search and contextual mechanisms of many high-load websites during massively parallel data processing. Hadoop is currently considered one of the fundamental technologies for working with Big Data and is used in many sectors: healthcare, telecommunications, trade, logistics, and finance, as well as public administration.
Fundamental principles of business analysis that apply to projects of all sizes in agile and conventional business environments. Best practices, tools and techniques available to help business analysts achieve project goals and deliver business value. Facilitation techniques and modeling approaches, business analysis planning and monitoring, and requirements elicitation documentation. Agile and Lean approaches to problem definition, strategic and tactical analysis, design thinking principles and solution evaluation techniques. Aligned with the industry standard A Guide to the Business Analysis Body of Knowledge® (BABOK® Guide), published by the International Institute of Business Analysis® (IIBA®).
This course teaches the development of software systems and applications, with a focus on cloud solutions where they are most effective. Students have the opportunity to work with a variety of cloud technology providers such as Amazon, Google, and Microsoft. They will learn how to deploy cloud solutions for databases, data analytics, and machine learning. The course covers the following topics: “Load Balancing”, “Scalability, Availability and Fault Tolerance”, “BigQuery”, “Machine Learning on Unstructured Datasets”, etc.
The discipline introduces students to the fundamental concepts, techniques, and applications of generative models in artificial intelligence and machine learning. The course covers a range of generative techniques, including Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and autoregressive models, to generate new data samples from learned distributions. Students will explore the practical applications of generative models in various domains such as image synthesis, natural language processing, and creative AI, while gaining hands-on experience in designing, training, and evaluating these models using popular machine learning frameworks.
The aim of the discipline is to study the fundamental techniques for developing HPC applications, the commonly used HPC platforms, the methods for measuring, assessing, and analysing the performance of HPC applications, and the role of administration, workload, and resource management in HPC management software. Students will be introduced to the issues related to the use of HPC techniques in solving large scientific problems.
The course forms students’ understanding of the field of information security, its constituent components, main threats, protocols and protection tools. During the study, students will acquire basic information security skills and become familiar with professional tools and programs.
Bioinformatics is an inter-disciplinary subject that develops and implements novel methodologies and tools for analyzing and learning from biological data.
This course covers the fundamental domain knowledge needed from both biological and computational disciplines to enable further study and research in this subject with a strong practical and theoretical emphasis to increase understanding. No previous knowledge of Bioinformatics is required.
The course covers the area of risk management in a project context; provides basic theories and concepts of risk management applicable to project environments, including planning, preparation and response to project risks; and examines the areas of risk identification, assessment, monitoring and control. The course will introduce students to qualitative and quantitative risk analysis techniques.
The course is designed for students to complete a project, a finished minimal product that they will be able to present at various competitions (hackathons). The course does not include lectures; all classes are practice-oriented, with the maximum emphasis on delivering the finished product. During the course, students should apply all the knowledge gained in the second year, including knowledge of developing complete applications.