- Why Java?
- Deep Learning for the JVM — Eclipse Deeplearning4J (DL4J)
- Set up the environment for the first time with DL4J
- Run DL4J Examples in 3 Steps
Why Java?
Below are three fundamental reasons to use the Java programming language for machine learning work.
1. Write Once, Run Anywhere
The Java programming language is platform-agnostic: the same program runs on Linux, Mac, and Windows. This “write once, run anywhere” property is made possible by the Java Virtual Machine (JVM). The Java compiler translates Java source code into Java binary code (bytecode), and the generated bytecode runs on any JVM regardless of the underlying machine.
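As a small illustration of this point, the sketch below reads three standard JVM system properties and prints the platform it is running on. The class name PlatformReport is hypothetical and chosen only for this example. The same compiled class file should run unchanged on each operating system and simply report a different platform.

```java
// PlatformReport.java: a minimal sketch; the class name is hypothetical.
final class PlatformReport {

    /** Builds a one-line description of the platform this JVM is running on. */
    static String describe() {
        // os.name, os.arch and java.vm.name are standard system properties.
        String os = System.getProperty("os.name");
        String arch = System.getProperty("os.arch");
        String vm = System.getProperty("java.vm.name");
        return "Running on " + os + " (" + arch + ") with " + vm;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Compile it once with javac, then copy the resulting .class file to another operating system and run it there with java to see the bytecode reused as-is.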
2. High Performance and Automatic Memory Management
Java is built to deliver high performance through its Just-In-Time (JIT) compiler. The JIT compiler transforms Java bytecode into native instructions at runtime and optimizes the code in the process. In addition, garbage collection manages memory automatically, so developers do not have to allocate and free memory by hand.
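To see which JIT compiler and garbage collectors a given JVM is actually using, the standard java.lang.management API can be queried. The following is a minimal sketch; the class name RuntimeInspector and its method names are hypothetical, and the names printed depend on the JVM vendor, version, and startup flags.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// RuntimeInspector.java: a minimal sketch; the class name is hypothetical.
final class RuntimeInspector {

    /** Name of the JIT compiler, or "none" if this JVM has no compilation system. */
    static String jitCompilerName() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        return jit == null ? "none" : jit.getName();
    }

    /** Names of the garbage collectors registered in this JVM. */
    static List<String> garbageCollectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("JIT compiler: " + jitCompilerName());
        System.out.println("Garbage collectors: " + garbageCollectorNames());
    }
}
```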
3. The Standard of the Industry
The use of Java, whether on chips, devices, or in software packages, has become standard industry practice in production. Developing Deep Learning algorithms in Java takes the actual production environment into consideration, not just research in the lab. Many big companies have had to rewrite their machine learning code in a language suited to their systems at the model deployment stage, which delayed their entry to the market.
Deep Learning for the JVM — Eclipse DeepLearning4J
Eclipse DeepLearning4J (DL4J) is an open-source, JVM-based Deep Learning framework. DL4J provides a suite of tools for building production-grade Deep Learning applications. DL4J also supports integration with Apache Spark and Hadoop, allowing training and inference on CPU or GPU clusters to further accelerate machine learning workloads.
DL4J comprises a suite of tools such as DataVec, ND4J, LibND4J, RL4J, and others. The modules come together to support Deep Learning operations. You can read more about the complete list of sub-modules here.
Figure 1. Building Blocks of DeepLearning4J
Figure 1 shows how DeepLearning4J works from neural network modeling down to “close to the metal”, controlling the whole software stack. First of all, DataVec serves as a vectorization tool for data from multiple sources and in multiple formats. The DeepLearning4J sub-module provides the functionality to build anything from multi-layer networks to computation graphs. It also allows the import of Keras and TensorFlow models.
The backend of DL4J is ND4J (think of it as NumPy for the JVM), a linear algebra library with switchable backends for running on either CPUs or GPUs. The ability to leverage LibND4J (written in C++) for hardware acceleration is largely due to JavaCPP, which acts as a bridge between ND4J and LibND4J.
Note: JavaCPP is not a module of Eclipse DeepLearning4J. It is an independent software distribution maintained by Bytedeco.
Set up the environment for the first time with DL4J
Figure 2. Deeplearning4J Getting Started Flow Diagram
Let’s walk through the workflow as shown in Figure 2 to install and configure the paths for these prerequisites:
1. Java: Programming Language
2. Apache Maven: Dependency Management Tool
3. Git: Version Control System
4. IntelliJ IDEA: Integrated Development Environment
If you have already completed Steps 1–4, jump to the next section, Run DL4J Examples in 3 Steps.
1. Java
For Windows 10:
Select and download the file with the product name “Windows x64” from here. Make sure you choose a 64-bit version of Java. You might need to create an Oracle account to proceed with the download.
Figure 3. Selection of Java installation file for Windows
JAVA_HOME Variable Configuration
To set the JAVA_HOME path:
- Search for Edit the system environment variables.
- Click the Environment Variables button.
- Under System Variables, click New.
- In the Variable Name field, enter JAVA_HOME.
- In the Variable Value field, enter your JDK installation path.
Figure 4. The setting of JAVA_HOME path
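Once the variable is set, a quick sanity check is to confirm that JAVA_HOME points at a JDK rather than at a JRE or a mistyped folder. The sketch below is one way to do that from Java itself. The class name JavaHomeCheck and the helper looksLikeJdk are hypothetical, and the check rests on the assumption that a JDK can be recognized by a javac launcher in its bin directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// JavaHomeCheck.java: a minimal sketch; class and method names are hypothetical.
final class JavaHomeCheck {

    /** Returns true if the directory contains bin/javac (or bin/javac.exe on Windows). */
    static boolean looksLikeJdk(Path home) {
        Path bin = home.resolve("bin");
        return Files.isRegularFile(bin.resolve("javac"))
                || Files.isRegularFile(bin.resolve("javac.exe"));
    }

    public static void main(String[] args) {
        String javaHome = System.getenv("JAVA_HOME");
        if (javaHome == null || javaHome.isEmpty()) {
            System.out.println("JAVA_HOME is not set");
        } else if (looksLikeJdk(Paths.get(javaHome))) {
            System.out.println("JAVA_HOME points to a JDK: " + javaHome);
        } else {
            System.out.println("JAVA_HOME is set, but no javac was found under " + javaHome);
        }
    }
}
```

Open a new terminal before running the check, since environment variable changes are usually not visible to terminals that were already open.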
For Ubuntu 18.04 LTS:
Open a terminal and install OpenJDK 8 with the command:
sudo apt install openjdk-8-jdk
For Mac OS:
Select and download the file (*.dmg) for Mac OS from here.
On all three systems, verify that Java works right after the installation by running java -version:
Figure 5. Java Version Retrieval
Installing the Java Development Kit (JDK) also installs the associated Java Runtime Environment (JRE) and JavaFX SDK, which are integrated into the JDK directory structure.
Note: Java 8 is the most widely used version of Java, and a variety of enterprise applications still rely on it. DL4J dependencies are not fully compatible with versions later than 8 (Java 9/10/11/12 as of September 2019). Hence, you will need some workarounds if you decide to go for the latest and greatest version of Java.
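Because of this version constraint, it can help to check the major Java version programmatically. The sketch below parses the standard java.version system property. The class name JavaVersionCheck and the helper majorVersion are hypothetical. It relies on the fact that releases up to Java 8 report versions of the form 1.8.0_x, while Java 9 and later put the major version first.

```java
// JavaVersionCheck.java: a minimal sketch; class and method names are hypothetical.
final class JavaVersionCheck {

    /** Extracts the major version from a java.version string such as "1.8.0_221" or "11.0.4". */
    static int majorVersion(String version) {
        // Split on every run of non-digit characters: "1.8.0_221" becomes [1, 8, 0, 221].
        String[] parts = version.split("[^0-9]+");
        int first = Integer.parseInt(parts[0]);
        // Up to Java 8 the scheme was "1.<major>"; from Java 9 on the major version comes first.
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        int major = majorVersion(version);
        System.out.println("Detected Java " + major + " (" + version + ")");
        if (major != 8) {
            System.out.println("Note: this guide assumes Java 8 for DL4J.");
        }
    }
}
```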
2. Apache Maven
DL4J projects rely on Apache Maven for build steps such as clean, build, package, and install, as well as for managing package dependencies and versions. As your Java projects get more complex, you will be glad to be using Maven instead of the native “javac” and “java -jar” command-line approach.
Follow the instructions at this link to install Maven on your system.
Verify that Maven is successfully installed by running mvn -version:
Figure 6. Maven Version Retrieval
If you are new to Maven, check out this link: Maven in 5 minutes.
3. Git
Git is the most widely used version control system and the de facto standard. Follow the instructions at this link to install the latest Git on your system.
Verify that Git is successfully installed by running git --version:
Figure 7. Git Version Retrieval
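If one of the version checks above fails with a "command not found" style error, the usual cause is that the tool's folder is missing from the PATH variable. The sketch below looks up the three command-line tools on PATH from Java. The class name ToolLocator and the helper findOnPath are hypothetical, and the list of Windows suffixes (.exe, .cmd, .bat) is an assumption about how the launchers are typically named.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// ToolLocator.java: a minimal sketch; class and method names are hypothetical.
final class ToolLocator {

    // Windows launchers carry an extension (for example mvn.cmd), so try a few suffixes.
    private static final String[] SUFFIXES = {"", ".exe", ".cmd", ".bat"};

    /** Returns the first file named after the tool found in the given PATH-style string. */
    static Optional<Path> findOnPath(String tool, String pathValue) {
        if (pathValue == null) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            for (String suffix : SUFFIXES) {
                try {
                    Path candidate = Paths.get(dir, tool + suffix);
                    if (Files.isRegularFile(candidate)) {
                        return Optional.of(candidate);
                    }
                } catch (InvalidPathException e) {
                    // Skip malformed PATH entries instead of failing the whole lookup.
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String path = System.getenv("PATH");
        for (String tool : new String[] {"java", "mvn", "git"}) {
            Optional<Path> location = findOnPath(tool, path);
            System.out.println(tool + ": "
                    + (location.isPresent() ? location.get().toString() : "not found on PATH"));
        }
    }
}
```

A tool reported as not found usually means its bin folder still has to be added to PATH, after which a new terminal session is needed.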
4. IntelliJ IDEA
IntelliJ IDEA is the preferred Integrated Development Environment (IDE) for DL4J projects. Alternatively, you can use other IDEs such as Eclipse and NetBeans, but expect the process to be much more complex and error-prone, especially when dealing with dependencies.
For Windows and Mac:
Download and install IntelliJ IDEA Community Edition from here.
For Ubuntu:
The recommended way is to install IntelliJ through the Software Center.
Figure 8. Browsing of IntelliJ Community Edition in Software Center
Figure 9. Installation of IntelliJ Community in Ubuntu OS
Run DL4J Examples in 3 Steps
Once the prerequisites are all set up, proceed with cloning the dl4j-examples repository to explore further. This repository contains comprehensive examples of configuring neural networks in DL4J for various use cases.
Enter the following in your terminal or command-line tool.
Figure 10. Cloning of DL4J Examples
Figure 11. Click on Import Project on the Welcome Page of IntelliJ
Figure 12. Point to the folder of dl4j-examples
After importing the project, resolving Maven dependencies might take a while (10 minutes or more!) depending on your network speed. Come back after a cup of coffee. You will need to wait until the progress bar (normally at the bottom right corner) disappears before proceeding to the next step.
Figure 13. Downloading dependencies
You can run any of the examples in the directories. For illustration, I ran ImageDrawer.java, which learns to draw an image that mimics an input image. Figure 14 shows the file path to this example.
Figure 14. Directory tree to ImageDrawer.java
Figure 15. Generation of Mona Lisa Painting
Chia Wei Lim
Chia Wei Lim is the founder of CertifAI, which aims to equip and certify AI practitioners to foster the growth of the AI industry. She was a technical lead for the Asia region at Skymind. Chia Wei has been in the data science field for more than 6 years. She accelerates the delivery of business value by enabling AI models to be served at scale across use cases in different verticals. Apart from that, she gives technical talks across China, Indonesia, and Malaysia and is one of the people leading the effort to build up the AI community in Malaysia.