If you had to start from scratch and finish a Spark-based PM2.5 analysis project in five minutes, you would probably ask:
1. Where is the PM2.5 data?
2. Where is the Spark environment?
3. How is the program compiled?
Don't worry. Follow along, and in five minutes everything will be done, starting from scratch.
Preparing the Spark environment
Today, many public clouds let you request a Spark environment. The easiest one to get started with is the Spark service on the Hyper Cloud (SuperVessel), which is completely free.
First, sign in on the Hyper Cloud home page, http://www.ptopenlab.com. If you have not applied for an account before, you can apply for one directly. A newly registered account receives an activation email; click the link in it to activate the account.
- After logging in, select Big Data Service on the home page.
- To log in to the Big Data service, enter your registered username and password again on the sign-in page. This takes you to the Big Data Services page.
- Click Create to open the Create Big Data cluster interface. The Hyper Cloud currently offers two environments, MapReduce and Spark. Select Spark and choose the smallest configuration, a single node.
- Click "Confirm Create". A single-node Spark environment takes about 30 seconds to build; once it succeeds, the interface shows the new cluster.
Starting from scratch: analyzing PM2.5 data with Spark in 5 minutes (including environment preparation and Spark code)