
What to learn about java big data

May 27, 2019, 02:30 PM

Java big data learning process.


The first stage: static web page basics (HTML + CSS)

1. Difficulty level: one star

2. Stage goal: comprehensive project tasks covering this stage's technical knowledge points

3. Main technologies include:

Common HTML tags; common CSS layouts, styles, and positioning; and methods for designing and producing static pages.

The second stage: JavaSE + JavaWeb

1. Difficulty level: two stars

2. Stage goal: comprehensive project tasks covering this stage's technical knowledge points

3. Main technologies include:

Java basic syntax, Java object-oriented programming (classes, objects, encapsulation, inheritance, polymorphism, abstract classes, interfaces, common classes, inner classes, common modifiers, etc.), exceptions, collections, files, IO, MySQL (basic SQL statement operations, multi-table queries, subqueries, stored procedures, transactions, distributed transactions), JDBC, threads, reflection, socket programming, enumerations, generics, and design patterns

4. The description is as follows:

This stage is what we call Java basics: technical points from shallow to deep, analysis of real business project modules, and the design and implementation of multiple storage methods. It is the most important of the first four stages, because every later stage builds on it, and it is also the stage with the highest learning intensity. It is also the first time the team develops a real project with both front and back ends (combining the first stage's technologies with the second stage's in one comprehensive application).
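To make the stage concrete, here is a minimal, self-contained sketch of a few stage-two topics - encapsulation, inheritance, polymorphism, and generic collections. The shapes and class names are invented purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Abstract base class: encapsulation via private fields, a contract via an abstract method.
abstract class Shape {
    private final String name;

    protected Shape(String name) { this.name = name; }

    public String getName() { return name; }

    public abstract double area(); // each subclass supplies its own implementation
}

class Circle extends Shape {
    private final double radius;

    Circle(double radius) { super("circle"); this.radius = radius; }

    @Override
    public double area() { return Math.PI * radius * radius; }
}

class Rectangle extends Shape {
    private final double w, h;

    Rectangle(double w, double h) { super("rectangle"); this.w = w; this.h = h; }

    @Override
    public double area() { return w * h; }
}

public class ShapeDemo {
    public static void main(String[] args) {
        // Generics + collections: a type-safe list of shapes.
        List<Shape> shapes = new ArrayList<>();
        shapes.add(new Circle(1.0));
        shapes.add(new Rectangle(2.0, 3.0));

        // Polymorphism: the right area() implementation is chosen at runtime.
        for (Shape s : shapes) {
            System.out.printf("%s area = %.2f%n", s.getName(), s.area());
        }
    }
}
```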

The third stage: front-end framework

1. Difficulty level: two stars

2. Stage goal: comprehensive project tasks covering this stage's technical knowledge points

3. Main technologies include:

Java and jQuery used together, annotations and reflection, XML and XML parsing with dom4j and JAXB, JDK 8 new features, SVN, Maven, and EasyUI

4. The description is as follows:

Building on the first two stages, we can turn static pages into dynamic ones, making web page content much richer. Of course, from a staffing perspective there are professional front-end designers; our goal in designing this stage is that front-end technology exercises thinking and design ability in a more intuitive way. At the same time, we fold the advanced features of the second stage into this stage, taking learners to the next level.
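Since JDK 8 new features are on this stage's list, here is a small sketch of lambdas and the Stream API; the word list is made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class Jdk8Demo {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("html", "css", "jquery", "xml", "maven", "svn");

        // Lambda + Stream API: filter, transform, and collect in one pipeline.
        List<String> shortWords = words.stream()
                .filter(w -> w.length() <= 4)
                .map(String::toUpperCase)
                .sorted()
                .collect(Collectors.toList());
        System.out.println(shortWords); // [CSS, HTML, SVN, XML]

        // Grouping with a collector: word length -> words of that length.
        Map<Integer, List<String>> byLength = words.stream()
                .collect(Collectors.groupingBy(String::length));
        System.out.println(byLength);
    }
}
```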

The fourth stage: enterprise-level development framework

1. Difficulty level: three stars

2. Stage goal: comprehensive project tasks covering this stage's technical knowledge points

3. Main technologies include:

Hibernate, Spring, Spring MVC, Log4j/SLF4J integration, MyBatis, Struts2, Shiro, Redis, the Activiti process engine, the Nutch crawler, Lucene, WebService with CXF, Tomcat clustering and hot standby, and MySQL read-write splitting
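As a taste of the frameworks listed above, here is a minimal Spring MVC controller sketch. It assumes a Spring web application is already configured (DispatcherServlet, component scanning); the /users/{id} endpoint and the returned JSON are invented for illustration.

```java
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.ResponseBody;

// A minimal Spring MVC controller sketch. Assumes a Spring web context is
// configured elsewhere (component scanning, DispatcherServlet, etc.).
@Controller
public class UserController {

    // Maps GET /users/{id} to this handler; the path segment is bound to `id`.
    @GetMapping("/users/{id}")
    @ResponseBody
    public String getUser(@PathVariable("id") long id) {
        // In a real project this would delegate to a service/DAO layer
        // (e.g., MyBatis or Hibernate) rather than return a literal.
        return "{\"id\": " + id + ", \"name\": \"demo\"}";
    }
}
```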

The fifth stage: First introduction to big data

1. Difficulty level: three stars

2. Stage goal: comprehensive project tasks covering this stage's technical knowledge points

3. Main technologies include:

Big data fundamentals (what big data is, application scenarios, how to learn big data, virtual machine concepts and installation, etc.), common Linux commands (file management, system management, disk management), Linux shell programming (shell variables, loop control, applications), Hadoop introduction (Hadoop components, stand-alone environment, directory structure, HDFS interface, MR interface, simple shell usage, Java access to Hadoop), HDFS (introduction, shell usage, the IDEA development tool, fully distributed cluster construction), MapReduce applications (the intermediate computation process, operating MapReduce from Java, program running, log monitoring), advanced Hadoop applications (YARN framework introduction, configuration items and optimization, CDH introduction, environment construction), and extensions (map-side optimization, combiner usage, TOP-K computation, Sqoop export, VM snapshots, permission management commands, AWK and sed commands)

4. The description is as follows:

This stage is designed to give newcomers a big-picture concept of big data. How? Having studied Java in the prerequisite courses, you understand how a program runs on a single machine. What about big data? Big data processes data by running programs on clusters of large numbers of machines, and since big data must also store data, storage likewise moves from single-machine storage to large-scale cluster storage across many machines. (You ask what a cluster is? Well: I have a big pot of rice. I can finish it alone, but it takes a long time; if everyone eats together, it goes quickly. One person is just a person; many people together are a crowd.) So big data can be roughly divided into big data storage and big data processing. For this stage, the course covers the de facto big data standard: Hadoop. And big data does not run on the Windows 7 or Windows 10 we use every day, but on the most widely used server system: Linux.
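To connect the Java you already know to Hadoop, here is the canonical WordCount MapReduce job: the map phase emits (word, 1) pairs and the reduce phase sums them per word. Input and output HDFS paths are passed on the command line.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// The canonical WordCount job: map emits (word, 1), reduce sums the counts.
public class WordCount {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // combiner: map-side pre-aggregation
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```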

The sixth stage: big data databases

1. Difficulty level: four stars

2. Stage goal: comprehensive project tasks covering this stage's technical knowledge points

3. Main technologies include:

Hive introduction (what Hive is, Hive usage scenarios, environment construction, architecture, working mechanism), Hive shell programming (table creation, query statements, partitioning and bucketing, index management and views), advanced Hive applications (DISTINCT implementation, GROUP BY, JOIN, SQL-to-MR conversion principles, Java programming, configuration and optimization), HBase introduction, HBase shell programming (DDL, DML, and table creation, queries, compression, and filters from Java), HBase modules in detail (Region, HRegionServer, HMaster, ZooKeeper introduction, ZooKeeper configuration, HBase and ZooKeeper integration), and advanced HBase features (read and write processes, data models, schema design for read/write hotspots, optimization and configuration)

4. The description is as follows:

This stage is designed to show how big data handles large-scale data: simplifying programming and speeding up reads.

How do they simplify it? As the previous stage showed, writing MR programs by hand for complex business correlations and data mining is very complicated. So at this stage we introduce Hive, the data warehouse of big data. Note the keyword: data warehouse. A data warehouse is used for data mining and analysis, and is usually a very large data center; traditionally the data sits in large databases such as Oracle and DB2, which are usually also serving real-time online business. In short, analysis based on a data warehouse is relatively slow. But the convenience is that anyone familiar with SQL finds it relatively easy to learn, and Hive is exactly such a tool: a SQL query tool over big data. This stage also covers HBase, a database in big data. Confused - didn't we just learn a data "warehouse" called Hive? Hive runs on MR, so its queries are quite slow; HBase runs on big data and can serve real-time queries. One is for analysis, the other for queries.
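Because Hive exposes a SQL interface, querying it from Java looks like ordinary JDBC. Below is a minimal sketch, assuming a HiveServer2 instance at localhost:10000 and a hypothetical words table; behind the scenes the query compiles to MapReduce jobs.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Querying Hive through its JDBC driver, just like any other SQL database.
// The URL, credentials, and table name are placeholders for illustration.
public class HiveQueryDemo {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "hive", "");
             Statement stmt = conn.createStatement();
             // HiveQL looks like SQL but compiles to MapReduce jobs under the hood.
             ResultSet rs = stmt.executeQuery(
                 "SELECT word, COUNT(*) AS cnt FROM words GROUP BY word")) {
            while (rs.next()) {
                System.out.println(rs.getString("word") + "\t" + rs.getLong("cnt"));
            }
        }
    }
}
```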

The seventh stage: real-time data collection

1. Difficulty level: four stars

2. Stage goal: comprehensive project tasks covering this stage's technical knowledge points

3. Main technologies include:

Flume log collection, Kafka introduction (message queues, application scenarios, cluster construction), Kafka in detail (partitions, topics, receivers, senders, ZooKeeper integration, shell development, shell debugging), advanced Kafka usage (Java development, main configuration, optimization projects), data visualization (introduction to graphics and charts, classification of charting tools, bar and pie charts, 3D charts and maps), Storm introduction (design ideas, application scenarios, processing procedures, cluster installation), Storm development (Maven-based Storm development, writing local Storm programs), advanced Storm (Java development, main configuration, optimization projects), timeliness of Kafka asynchronous and batch sending, global Kafka message ordering, and Storm multi-concurrency optimization

4. The description is as follows:

The data sources in the previous stages were existing large-scale data sets, so the results of processing and analysis carried a certain delay - the data processed was usually the previous day's. Example scenarios: website hotlink protection, customer account anomalies, real-time credit scoring. If these scenarios were analyzed on the previous day's data, wouldn't it be too late? So in this stage we introduce real-time data collection and analysis, mainly: Flume for real-time data collection with a wide range of supported sources, Kafka for data reception and transmission, and Storm for real-time data processing at second-level latency.
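As a flavor of the Kafka link in this pipeline, here is a minimal Java producer sketch; the broker address and the access-logs topic are placeholders. In a real pipeline, Flume could feed events into such a topic and Storm would consume them downstream.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

// A minimal Kafka producer: events are pushed to a topic, from which a stream
// processor such as Storm can consume in real time.
public class LogProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 10; i++) {
                // Records with the same key land in the same partition, which is
                // how per-key ordering is preserved.
                producer.send(new ProducerRecord<>("access-logs", "user-" + (i % 3),
                        "clicked page " + i));
            }
        }
    }
}
```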

The eighth stage: Spark data analysis

1. Difficulty level: five stars

2. Stage goal: comprehensive project tasks covering this stage's technical knowledge points

3. Main technologies include:

Scala introduction (data types, operators, control statements, basic functions), intermediate Scala (data structures, classes, objects, traits, pattern matching, regular expressions), advanced Scala usage (higher-order functions, curried functions, partial functions, tail recursion, built-in higher-order functions, etc.), Spark introduction (environment construction, infrastructure, operating modes), Spark data sets and the programming model, Spark SQL, advanced Spark (DataFrame, Dataset, Spark Streaming principles, Spark Streaming supported sources, Kafka and socket integration, programming model), advanced Spark programming (Spark GraphX, Spark MLlib machine learning), advanced Spark applications (system architecture, main configuration and performance optimization, fault and stage recovery), the Spark ML k-means algorithm, and advanced Scala features such as implicit conversions

4. The description is as follows:

Let's also revisit the earlier stages, mainly the fifth. Hadoop's MR-based analysis of large-scale data sets is relatively slow - including for machine learning and artificial intelligence - and it is not suited to iterative computation. Spark is positioned as the replacement for MR-based analysis. How does it replace MR? Start with their operating mechanisms: Hadoop analyzes data stored on disk, while Spark analyzes it in memory. If that means little, here is a more vivid picture: if you travel from Beijing to Shanghai by train, MR is the old green train and Spark is the high-speed rail or the maglev. Spark is developed in the Scala language, and of course supports Scala best, so the course teaches Scala first. What, learn yet another language? No, no, no! Scala runs on the JVM and interoperates closely with Java. From historical data storage and analysis (Hadoop, Hive, HBase) to real-time data collection and transport (Flume, Kafka) and real-time analysis (Storm, Spark), all of these depend on one another in real projects.
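To see the contrast with the MapReduce WordCount shown earlier, here is the same computation written against Spark's Java API; the input path and local master are placeholders for illustration. The whole pipeline runs through Spark's in-memory RDD model, which is where the speedup over MR comes from.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

// The same word count as the MapReduce version above, expressed in Spark's
// in-memory RDD model; note how much shorter the pipeline is.
public class SparkWordCount {
    public static void main(String[] args) {
        // local[*] runs on all local cores; on a cluster you would submit
        // the job with spark-submit instead.
        SparkConf conf = new SparkConf().setAppName("word count").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> lines = sc.textFile("hdfs:///input/words.txt"); // placeholder path
            JavaPairRDD<String, Integer> counts = lines
                    .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                    .mapToPair(word -> new Tuple2<>(word, 1))
                    .reduceByKey(Integer::sum); // aggregation happens in memory
            counts.collect().forEach(t -> System.out.println(t._1() + "\t" + t._2()));
        }
    }
}
```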

The above is the detailed content of What to learn about java big data. For more information, please follow other related articles on the PHP Chinese website!
