This document is a step-by-step installation guide for OSX, Ubuntu Server 18.04, and AWS RedHat 6.7. Where the commands differ between systems, each variant is listed and labeled; otherwise the same commands apply to all systems. The installation process on other distributions should be similar.
EDDiE depends on
- Java (JDK 8)
- Spark (2.1+, 2.3 recommended)
- Python (2.7.13+) with Jupyter and Pandas
If any of these is already installed, you can skip the corresponding section.
1. Install Java (JDK 8)
- OSX: Download Oracle JDK8
https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
- RedHat or CentOS: Install OpenJDK 8
$ sudo yum update
$ sudo yum install java-1.8.0-openjdk-devel
- Ubuntu: Install OpenJDK 8
$ sudo apt-get update
$ sudo apt-get install openjdk-8-jdk
Installation Verification
java -version
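The version check above can also be scripted. The following is a minimal sketch, not part of EDDiE itself: the helper name `is_java8` is hypothetical, and it assumes the usual JDK 8 banner format, in which the version string starts with `1.8.`.

```shell
# Return success if the given `java -version` banner reports a 1.8.x release.
# (is_java8 is a hypothetical helper name used only for this illustration.)
is_java8() {
    case "$1" in
        *'version "1.8.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# `java -version` writes its banner to stderr, so redirect it to stdout.
# `|| true` keeps the script going when java is not installed at all.
banner="$(java -version 2>&1 || true)"

if is_java8 "$banner"; then
    echo "JDK 8 detected"
else
    echo "JDK 8 not detected"
fi
```

If the script reports that JDK 8 was not detected, check whether another Java release appears earlier on your PATH.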
2. Install Spark (2.3 recommended)
Download and unpack Spark:
https://spark.apache.org/downloads.html
Add the following to your shell profile (~/.bash_profile):
export PATH="path_to_your_spark/bin:${PATH}"
Then reload the profile:
$ source ~/.bash_profile
Installation Verification
spark-submit --version
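As an illustration of the profile change in this section, the sketch below assumes Spark was unpacked into a hypothetical directory, `$HOME/spark-2.3.0-bin-hadoop2.7`; substitute your own location. Setting `SPARK_HOME` is a common convention rather than something this guide requires. The guard avoids adding the same entry to PATH each time the profile is sourced.

```shell
# Hypothetical install location; replace with the directory you unpacked Spark into.
SPARK_HOME="$HOME/spark-2.3.0-bin-hadoop2.7"
export SPARK_HOME

# Prepend Spark's bin directory to PATH only if it is not already present,
# so that sourcing the profile repeatedly does not keep growing PATH.
case ":${PATH}:" in
    *":${SPARK_HOME}/bin:"*) ;;
    *) export PATH="${SPARK_HOME}/bin:${PATH}" ;;
esac
```

After sourcing the profile, `spark-submit --version` should resolve to the copy under `$SPARK_HOME/bin`.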
3. Install Anaconda
EDDiE works on any Python 2.7.13+. If you already have Python installed, use the pip command to install Jupyter Notebook and pandas.
Because Anaconda is designed for data science use cases, using Anaconda Python is very convenient for EDDiE and for potential projects developed on EDDiE.
https://www.anaconda.com/distribution/
EDDiE works on either Python 2 or Python 3. For better compatibility, we suggest using the Python 2.7 version of Anaconda.
Please follow the Anaconda installation guide and set the system path so that it points to the Anaconda Python.
Installation Verification
python --version
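To check every dependency in one pass, a small script along these lines may help. It is a sketch under assumptions: the helper names `check_cmd` and `check_py_module` are hypothetical, and it assumes that Jupyter Notebook is importable as the Python module `notebook` and that `python` on your PATH is the interpreter you intend to use with EDDiE.

```shell
# Report whether a command is available on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Report whether the default python interpreter can import a module.
check_py_module() {
    if python -c "import $1" >/dev/null 2>&1; then
        echo "ok: python module $1"
    else
        echo "missing: python module $1"
        return 1
    fi
}

status=0
for cmd in java spark-submit python; do
    check_cmd "$cmd" || status=1
done
# Assumption: Jupyter Notebook installs the importable module "notebook".
for mod in pandas notebook; do
    check_py_module "$mod" || status=1
done

if [ "$status" -eq 0 ]; then
    echo "All dependencies found"
else
    echo "Some dependencies are missing"
fi
```

Any line starting with `missing:` points to the section of this guide to revisit.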