Andres Jaimes

Recommended libraries for Java

By Andres Jaimes

This article is a curated list of Java libraries that I have used over the years in different projects. They are all well documented, and for most of them plenty of examples can be found on the web.

Apache Commons Email

Commons Email

Commons Email aims to provide an API for sending email. It is built on top of the Java Mail API, which it aims to simplify.

Apache Commons StringUtils

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.4</version>
</dependency>
StringUtils.isBlank returns true for null, empty, and whitespace-only strings:

    StringUtils.isBlank(null);        // true
    StringUtils.isBlank("");          // true
    StringUtils.isBlank(" ");         // true
    StringUtils.isBlank("bob");       // false
    StringUtils.isBlank("  bob  ");   // false

Apache OpenNLP

Apache OpenNLP

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text.

It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron based machine learning.

Apache Solr

Apache Solr

Solr is highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. Solr powers the search and navigation features of many of the world’s largest internet sites.

Apache Tika

Apache Tika

The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). All of these file types can be parsed through a single interface, making Tika useful for search engine indexing, content analysis, translation, and much more.

Boilerpipe

Boilerpipe

Install from local files, since the Maven repository had not been updated as of Oct 21, 2016.

cd <dir-where-jar-files-live>
mvn install:install-file -Dfile=boilerpipe-1.2.0.jar -DgroupId=de.l3s.boilerpipe -DartifactId=boilerpipe -Dversion=1.2.0 -Dpackaging=jar
mvn install:install-file -Dfile=nekohtml-1.9.13.jar -DgroupId=nekohtml -DartifactId=nekohtml -Dversion=1.9.13 -Dpackaging=jar
<dependency>
  <groupId>de.l3s.boilerpipe</groupId>
  <artifactId>boilerpipe</artifactId>
  <version>1.2.0</version>
</dependency>
<dependency>
  <groupId>nekohtml</groupId>
  <artifactId>nekohtml</artifactId>
  <version>1.9.13</version>
</dependency>
<dependency>
  <groupId>xerces</groupId>
  <artifactId>xercesImpl</artifactId>
  <version>2.9.1</version>
</dependency>

boilerpipe provides algorithms to detect and remove the surplus “clutter” (boilerplate, templates) around the main textual content of a web page.

FreeMarker

FreeMarker

Apache FreeMarker is a template engine: a Java library to generate text output (HTML web pages, e-mails, configuration files, source code, etc.) based on templates and changing data. Templates are written in the FreeMarker Template Language (FTL), which is a simple, specialized language (not a full-blown programming language like PHP).
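
To make the "templates plus changing data" idea concrete, here is a minimal sketch that uses only the JDK. It is not FreeMarker's API and it understands nothing of FTL beyond simple ${name} placeholders; the TinyTemplate class and the sample data are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: a toy placeholder expander, not FreeMarker.
final class TinyTemplate {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    // Replaces each ${name} with the matching value from the data model.
    // Unknown names are left untouched so that mistakes stay visible.
    static String render(String template, Map<String, ?> model) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = model.get(m.group(1));
            String replacement = (value == null) ? m.group() : value.toString();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> model = new HashMap<>();
        model.put("user", "Ana");
        model.put("count", 3);
        // prints: Hello Ana, you have 3 new messages.
        System.out.println(render("Hello ${user}, you have ${count} new messages.", model));
    }
}
```

The same template can be rendered again with a different model, which is the separation of presentation and data that a real template engine builds on.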

Image Manipulation

imgscalr

Simple Java image-scaling library implementing Chris Campbell’s incremental scaling algorithm as well as Java2D’s “best-practices” image-scaling techniques.
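
The incremental approach can be sketched with the JDK alone: instead of jumping straight to the target size, the image is halved repeatedly with bilinear interpolation until the target is reached. This is my own minimal sketch of the idea, not imgscalr's code or API; the class and method names are made up, and it assumes the target is no larger than the source.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Illustration only: a hypothetical helper, not part of imgscalr.
final class IncrementalScaler {

    // Shrinks src to targetWidth x targetHeight, halving each dimension per
    // pass (but never going below the target) until the target is reached.
    static BufferedImage scaleDown(BufferedImage src, int targetWidth, int targetHeight) {
        BufferedImage current = src;
        int w = src.getWidth();
        int h = src.getHeight();
        do {
            w = Math.max(w / 2, targetWidth);
            h = Math.max(h / 2, targetHeight);
            BufferedImage next = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
            Graphics2D g = next.createGraphics();
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                               RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(current, 0, 0, w, h, null);
            g.dispose();
            current = next;
        } while (w != targetWidth || h != targetHeight);
        return current;
    }

    public static void main(String[] args) {
        BufferedImage big = new BufferedImage(800, 600, BufferedImage.TYPE_INT_ARGB);
        BufferedImage small = scaleDown(big, 100, 75);
        System.out.println(small.getWidth() + "x" + small.getHeight()); // prints 100x75
    }
}
```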

Jackson

Jackson JSON Processor Wiki

<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.8.2</version>
    <type>jar</type>
</dependency>

Inspired by the quality and variety of XML tooling available for the Java platform (StAX, JAXB, etc.), Jackson is a multi-purpose Java library for processing the JSON data format. Jackson aims to be the best possible combination of fast, correct, lightweight, and ergonomic for developers.

Java Tuples

Java Tuples

javatuples is one of the simplest java libraries ever made. Its aim is to provide a set of java classes that allow you to work with tuples.

A tuple is just a sequence of objects that do not necessarily relate to each other in any way. For example: [23, “Saturn”, java.sql.Connection@li734s] can be considered a tuple of three elements (a triplet) containing an Integer, a String, and a JDBC Connection object. As simple as that.
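
To show what such a triplet amounts to, here is a hand-written sketch in plain Java. It is not the javatuples API (the library ships its own ready-made classes); the Triplet class below and its accessor names are my own, and I use a Double as the third element because a JDBC connection cannot be created in a self-contained example.

```java
import java.util.Objects;

// Illustration only: a hand-written, immutable three-element tuple.
final class Triplet<A, B, C> {
    private final A first;
    private final B second;
    private final C third;

    Triplet(A first, B second, C third) {
        this.first = first;
        this.second = second;
        this.third = third;
    }

    A getFirst()  { return first; }
    B getSecond() { return second; }
    C getThird()  { return third; }

    // Two triplets are equal when all three positions are equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Triplet)) return false;
        Triplet<?, ?, ?> other = (Triplet<?, ?, ?>) o;
        return Objects.equals(first, other.first)
            && Objects.equals(second, other.second)
            && Objects.equals(third, other.third);
    }

    @Override
    public int hashCode() {
        return Objects.hash(first, second, third);
    }

    @Override
    public String toString() {
        return "[" + first + ", " + second + ", " + third + "]";
    }

    public static void main(String[] args) {
        Triplet<Integer, String, Double> t = new Triplet<>(23, "Saturn", 9.5);
        System.out.println(t);             // prints [23, Saturn, 9.5]
        System.out.println(t.getSecond()); // prints Saturn
    }
}
```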

jBCrypt

jBCrypt

jBCrypt is a Java implementation of OpenBSD’s Blowfish password hashing code, as described in “A Future-Adaptable Password Scheme” by Niels Provos and David Mazières.

This system hashes passwords using a version of Bruce Schneier’s Blowfish block cipher with modifications designed to raise the cost of off-line password cracking and frustrate fast hardware implementation. The computation cost of the algorithm is parameterised, so it can be increased as computers get faster. The intent is to make a compromise of a password database less likely to result in an attacker gaining knowledge of the plaintext passwords (e.g. using John the Ripper).

// Hash a password for the first time
String hashed = BCrypt.hashpw(password, BCrypt.gensalt());

// gensalt's log_rounds parameter determines the complexity
// the work factor is 2**log_rounds, and the default is 10
String strongerHash = BCrypt.hashpw(password, BCrypt.gensalt(12));

// Check that a plaintext candidate password matches one that has
// previously been hashed
if (BCrypt.checkpw(candidate, hashed)) {
    System.out.println("It matches");
} else {
    System.out.println("It does not match");
}
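
Since the work factor is 2**log_rounds, every extra round doubles the cost of computing (and of cracking) a hash. The small sketch below only does that arithmetic; it performs no hashing and does not use jBCrypt, and the WorkFactor class is a made-up name for illustration.

```java
// Illustration only: the arithmetic behind the log_rounds parameter.
final class WorkFactor {

    // Returns 2^logRounds. Assumes 0 <= logRounds <= 62 so the result fits in a long.
    static long iterations(int logRounds) {
        if (logRounds < 0 || logRounds > 62) {
            throw new IllegalArgumentException("logRounds out of range: " + logRounds);
        }
        return 1L << logRounds;
    }

    public static void main(String[] args) {
        System.out.println(iterations(10)); // prints 1024
        System.out.println(iterations(12)); // prints 4096
        // Going from 10 to 12 rounds makes each hash four times as expensive.
        System.out.println(iterations(12) / iterations(10)); // prints 4
    }
}
```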

Jersey

Jersey is an implementation of the Java JAX-RS (REST) standard that also includes an HTTP client.

Retrieve a page using Jersey Client:

<dependency>
  <groupId>org.glassfish.jersey.core</groupId>
  <artifactId>jersey-client</artifactId>
  <version>2.23.2</version>
</dependency>
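
import javax.ws.rs.client.ClientBuilder;

public class Main
{
    public static void main( String[] args )
    {
        // Build a client, GET the page, and read the response body as a String
        String html = ClientBuilder.newClient()
            .target("https://www.google.com").request().get(String.class);
        System.out.println(html);
    }
}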

JPA

EclipseLink. (IntelliJ initially reported this dependency as nonexistent, but after a while it loaded it; it may have been reindexing or searching the repository.)

<dependency>
    <groupId>org.eclipse.persistence</groupId>
    <artifactId>eclipselink</artifactId>
    <version>2.6.4</version>
</dependency>
<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <version>LATEST</version>
</dependency>

Create a file called src/main/resources/META-INF/persistence.xml. It is very important not to use the META-INF directory in webapp. If you do, the application will throw an exception: No Persistence provider for EntityManager named SomeName. See https://www.eclipse.org/forums/index.php/t/461257/

src/main/resources/META-INF/persistence.xml:

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.0"
             xmlns="http://java.sun.com/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd">
    <persistence-unit name="PersistenceUnit" transaction-type="RESOURCE_LOCAL">
        <provider>org.eclipse.persistence.jpa.PersistenceProvider</provider>
        <jta-data-source>jdbc/data</jta-data-source>
        <class>net.jaimes.persistence.entities.User</class>
        <properties>
            <property name="javax.persistence.jdbc.driver" value="org.postgresql.Driver"/>
            <property name="javax.persistence.jdbc.url" value="jdbc:postgresql://localhost:5432/bits"/>
            <property name="javax.persistence.jdbc.user" value="user"/>
            <property name="javax.persistence.jdbc.password" value="password"/>
            <property name="eclipselink.logging.level" value="ALL"/>
            <!-- <property name="eclipselink.ddl-generation" value="drop-and-create-tables" /> -->
        </properties>
    </persistence-unit>
</persistence>

If you want to use a JNDI data source instead, you have to modify persistence.xml, context.xml and web.xml. Note that you have to prepend java:/comp/env/ to the data source name in persistence.xml.

src/main/resources/META-INF/persistence.xml:

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.0"
             xmlns="http://java.sun.com/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd">
    <persistence-unit name="persistenceUnit" transaction-type="RESOURCE_LOCAL">
        <provider>org.eclipse.persistence.jpa.PersistenceProvider</provider>
        <non-jta-data-source>java:/comp/env/jdbc/data</non-jta-data-source>
        <class>net.jaimes.persistence.entities.User</class>
        <properties>
            <property name="eclipselink.logging.level" value="ALL"/>
            <!-- <property name="eclipselink.ddl-generation" value="drop-and-create-tables" /> -->
        </properties>
    </persistence-unit>
</persistence>

and then in src/main/webapp/META-INF/context.xml:

    <Resource name="jdbc/data" auth="Container"
              type="javax.sql.DataSource" driverClassName="org.postgresql.Driver"
              url="jdbc:postgresql://127.0.0.1:5432/bits"
              username="user" password="password"
              maxActive="5" maxIdle="3" maxWait="-1"/>

and finally in src/main/webapp/WEB-INF/web.xml:

    <resource-ref>
        <description>Datasource</description>
        <res-ref-name>jdbc/data</res-ref-name>
        <res-type>javax.sql.DataSource</res-type>
        <res-auth>Container</res-auth>
    </resource-ref>

Some query examples follow.

Multiple results:

  TypedQuery<Country> query =
      em.createQuery("SELECT c FROM Country c", Country.class);
  List<Country> results = query.getResultList();

Single result (scalar value):

  TypedQuery<Long> query = em.createQuery(
      "SELECT COUNT(c) FROM Country c", Long.class);
  long countryCount = query.getSingleResult();

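Single result (entity object):

  TypedQuery<Country> query = em.createQuery(
      "SELECT c FROM Country c", Country.class);
  // getSingleResult() throws NoResultException or NonUniqueResultException
  // unless exactly one row matches
  Country country = query.getSingleResult();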

Delete (bulk statements must run inside an active transaction):

int deleted = em.createQuery("DELETE FROM Country").executeUpdate();

Update:

int updated = em.createQuery("UPDATE Country SET area = 0").executeUpdate();

Important: always close() EntityManagers so that their connections are released back to the pool.

jsoup

jsoup

<dependency>
  <groupId>org.jsoup</groupId>
  <artifactId>jsoup</artifactId>
  <version>1.9.2</version>
</dependency>

jsoup is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods.

jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do.

  • scrape and parse HTML from a URL, file, or string
  • find and extract data, using DOM traversal or CSS selectors
  • manipulate the HTML elements, attributes, and text
  • clean user-submitted content against a safe white-list, to prevent XSS attacks
  • output tidy HTML

jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree.

JWT

A JSON Web Token library. I have an article that shows how to create and validate JSON Web Tokens using Scala and this library.

<dependency>
    <!-- For JWT manipulation -->
    <groupId>io.jsonwebtoken</groupId>
    <artifactId>jjwt</artifactId>
    <version>0.7.0</version>
</dependency>
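
For readers who have not looked inside a token: a signed JWT in its compact form is three base64url-encoded parts (header, payload, signature) joined by dots. The sketch below builds and checks an HS256 token with the JDK only, just to show that structure. It is not the jjwt API, the TinyJwt class is made up, the secret and claims are placeholders, and a real verifier must also check the header's algorithm and the claims (expiry and so on), which this one does not.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustration only: a hand-rolled HS256 token, not the jjwt API.
final class TinyJwt {
    private static final Base64.Encoder B64 = Base64.getUrlEncoder().withoutPadding();

    // Returns base64url(header) + "." + base64url(payload) + "." + base64url(signature).
    static String sign(String payloadJson, byte[] secret) {
        String header = "{\"alg\":\"HS256\",\"typ\":\"JWT\"}";
        String signingInput = encode(header) + "." + encode(payloadJson);
        return signingInput + "." + B64.encodeToString(hmac(signingInput, secret));
    }

    // Recomputes the signature over the first two parts and compares it
    // in constant time with the third part. Claims are NOT validated here.
    static boolean verify(String token, byte[] secret) {
        int lastDot = token.lastIndexOf('.');
        if (lastDot < 0) {
            return false;
        }
        byte[] actual;
        try {
            actual = Base64.getUrlDecoder().decode(token.substring(lastDot + 1));
        } catch (IllegalArgumentException e) {
            return false; // signature part is not valid base64url
        }
        return MessageDigest.isEqual(hmac(token.substring(0, lastDot), secret), actual);
    }

    private static String encode(String s) {
        return B64.encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }

    private static byte[] hmac(String data, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] secret = "placeholder-secret".getBytes(StandardCharsets.UTF_8);
        String token = sign("{\"sub\":\"ana\"}", secret);
        System.out.println(token);
        System.out.println(verify(token, secret)); // prints true
    }
}
```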

Lombok

Project Lombok

<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.16.12</version>
</dependency>

Reducing boilerplate code with Project Lombok. “Boilerplate” is a term used to describe code that is repeated in many parts of an application with little alteration. One of the most frequently voiced criticisms of the Java language is the volume of this type of code that is found in most projects. This problem is frequently a result of design decisions in various libraries, but is exacerbated by limitations in the language itself. Project Lombok aims to reduce the prevalence of some of the worst offenders by replacing them with a simple set of annotations.

While it is not uncommon for annotations to be used to indicate usage, to implement bindings or even to generate code used by frameworks, they are generally not used for the generation of code that is directly utilized by the application. This is partly because doing so would require that the annotations be eagerly processed at development time. Project Lombok does just that. By integrating into the IDE, Project Lombok is able to inject code that is immediately available to the developer.
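
For a sense of the boilerplate in question, the plain-Java class below spells out by hand the kind of members that Lombok's @Data annotation is meant to generate for you: accessors, equals, hashCode and toString. The class deliberately uses no Lombok; its name, fields and values are made up for illustration.

```java
import java.util.Objects;

// Illustration only: everything below the two fields is boilerplate
// that an annotation processor such as Lombok can generate instead.
final class Planet {
    private String name;
    private long moons;

    Planet(String name, long moons) {
        this.name = name;
        this.moons = moons;
    }

    String getName()           { return name; }
    void setName(String name)  { this.name = name; }
    long getMoons()            { return moons; }
    void setMoons(long moons)  { this.moons = moons; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Planet)) return false;
        Planet other = (Planet) o;
        return moons == other.moons && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, moons);
    }

    @Override
    public String toString() {
        return "Planet(name=" + name + ", moons=" + moons + ")";
    }

    public static void main(String[] args) {
        Planet p = new Planet("Atlantis", 42);
        System.out.println(p);                                   // prints Planet(name=Atlantis, moons=42)
        System.out.println(p.equals(new Planet("Atlantis", 42))); // prints true
    }
}
```

Two fields already cost about thirty lines; with Lombok the class body shrinks to the field declarations plus an annotation.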

Maven

Maven. Apache Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project’s build, reporting and documentation from a central piece of information.

Basic structure for simple tests:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.echo360</groupId>
    <artifactId>workflow</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
    </properties>

</project>

When using IntelliJ, if it complains about versions, just right click on the maven file, and select Maven > Reimport.

Another possible, but more verbose, solution is to add a build element with the following content (and also reimport from the Maven file):

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
            </configuration>
        </plugin>
    </plugins>
</build>

Automatically add a MANIFEST file with a Main class:

<build>
    <plugins>
        <plugin>
            <!-- Build an executable JAR -->
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-jar-plugin</artifactId>
            <version>3.0.2</version>
            <configuration>
                <archive>
                    <manifest>
                        <addClasspath>true</addClasspath>
                        <classpathPrefix>lib/</classpathPrefix>
                        <mainClass>com.domain.Main</mainClass>
                    </manifest>
                </archive>
            </configuration>
        </plugin>
    </plugins>
</build>

Change the name of the resulting jar file:

<build>
    <finalName>my-name</finalName>
    ...

Some variables can be used for a name, like:

${project.artifactId}-${project.version}

Create a directory containing all dependencies in it (can be used along with the MANIFEST plugin):

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-dependency-plugin</artifactId>
            <version>2.10</version>
            <executions>
                <execution>
                    <id>copy-dependencies</id>
                    <phase>package</phase>
                    <goals>
                        <goal>copy-dependencies</goal>
                    </goals>
                    <configuration>
                        <outputDirectory>${project.build.directory}/lib</outputDirectory>
                        <overWriteReleases>false</overWriteReleases>
                        <overWriteSnapshots>false</overWriteSnapshots>
                        <overWriteIfNewer>true</overWriteIfNewer>
                    </configuration>
                </execution>
            </executions>
        </plugin>
        ...

Create a fat jar (does not require the previous MANIFEST plugin):

<build>
    <plugins>
        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>2.6</version>
            <configuration>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
                <finalName>${project.artifactId}-${project.version}-full</finalName>
                <appendAssemblyId>false</appendAssemblyId>
                <archive>
                    <manifest>
                        <mainClass>com.example.Main</mainClass>
                    </manifest>
                </archive>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase> 
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
        ...

Morphia

Morphia. The Java Object Document Mapper for MongoDB.

Selenium

<dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-firefox-driver</artifactId>
    <version>2.48.2</version>
</dependency>

A sample application that uses Selenium for grabbing web page screenshots.

package net.jaimes;

import java.io.File;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Calendar;

import org.apache.commons.io.FileUtils;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxBinary;
import org.openqa.selenium.firefox.FirefoxDriver;

/**
 * To run it:
 *     java -jar ScreenShot-1.0.jar 'http://andres.jaimes.net/'
 *
 * This application needs Xvfb:
 *     yum install xorg-x11-server-Xvfb
 *
 * @author Andres Jaimes
 */


public class Main
{
    private static final int    DISPLAY_NUMBER  = 99;
    private static final String XVFB            = "/usr/bin/Xvfb";
    private static final String XVFB_COMMAND    = XVFB + " :" + DISPLAY_NUMBER;
    private static final String RESULT_FILENAME = "screenshot-"
            + new SimpleDateFormat("yyyyMMdd-HHmmss").format(Calendar.getInstance().getTime())
            + ".png";

    public static void main ( String[] args ) throws IOException
    {
        if (args.length != 1) {
            System.out.println("No URL specified.");
            System.exit(1);
        }

        // Start a virtual X display so Firefox can run without a real screen
        Process p = Runtime.getRuntime().exec(XVFB_COMMAND);
        WebDriver driver = null;
        try {
            FirefoxBinary firefox = new FirefoxBinary();
            firefox.setEnvironmentProperty("DISPLAY", ":" + DISPLAY_NUMBER);
            driver = new FirefoxDriver(firefox, null);
            driver.get(args[0]);
            File scrFile = ( (TakesScreenshot) driver ).getScreenshotAs(OutputType.FILE);
            FileUtils.copyFile(scrFile, new File(RESULT_FILENAME));
        }
        finally {
            // quit() ends the whole browser session; always stop Xvfb as well
            if (driver != null) {
                driver.quit();
            }
            p.destroy();
        }
    }
}

SLF4J, Simple Logging Facade for Java

SLF4J.

<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>1.7.22</version>
</dependency>

And for logging to the console:

<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-simple</artifactId>
    <version>1.7.22</version>
</dependency>

Or use Logback for richer logging functionality:

<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>1.2.3</version>
</dependency>

The Simple Logging Facade for Java (SLF4J) serves as a simple facade or abstraction for various logging frameworks, such as java.util.logging, logback and log4j. SLF4J allows the end-user to plug in the desired logging framework at deployment time.

When using slf4j-simple, properties can be set in a file called simplelogger.properties in your resources directory. See the list of available properties.

Snowball Stemmer

Snowball

Snowball is a small string processing language designed for creating stemming algorithms for use in Information Retrieval. This site describes Snowball, and presents several useful stemmers which have been implemented using it.

Class: org.tartarus.snowball.SnowballStemmer
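
If stemming is new to you, the idea is to reduce inflected words to a common root so that "connected" and "connecting" match the same search term. The toy below strips a few English suffixes with the JDK only. It is a deliberately naive sketch, far cruder than any Snowball stemmer, and it does not use the org.tartarus classes; the NaiveStemmer name and the suffix list are my own.

```java
import java.util.Locale;

// Illustration only: naive suffix stripping, not a Snowball algorithm.
final class NaiveStemmer {
    // Checked in order, so longer suffixes come before shorter ones.
    private static final String[] SUFFIXES = {"ing", "ed", "es", "s"};

    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            // Keep at least three letters so short words such as "red" survive.
            if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    public static void main(String[] args) {
        System.out.println(stem("connected"));  // prints connect
        System.out.println(stem("connecting")); // prints connect
        System.out.println(stem("connects"));   // prints connect
        System.out.println(stem("red"));        // prints red
    }
}
```

Real stemmers handle far more cases (doubled consonants, "-ation", irregular forms), which is why a dedicated library is worth using.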

Spark Java

Spark. Spark Framework is a simple and lightweight Java web framework built for rapid development.

Spark focuses on being as simple and straight-forward as possible, without the need for cumbersome (XML) configuration, to enable very fast web application development in pure Java with minimal effort. It’s a totally different paradigm when compared to the overuse of annotations for accomplishing pretty trivial stuff seen in other web frameworks.

Url Rewrite Filter

Url Rewrite Filter

<dependency>
    <groupId>org.tuckey</groupId>
    <artifactId>urlrewritefilter</artifactId>
    <version>4.0.3</version>
</dependency>

A Java Web Filter for any compliant web application server (such as Tomcat, JBoss, Jetty or Resin), which allows you to rewrite URLs before they get to your code. It is a very powerful tool, just like Apache’s mod_rewrite.
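
The core of any such rewriter is a list of rules, each pairing a regular expression with a replacement that may refer to captured groups. Here is a minimal sketch of that idea with the JDK only. It is not the Url Rewrite Filter API or its configuration format; the TinyRewriter class, its methods and the sample paths are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: regex-based path rewriting, not the Url Rewrite Filter.
final class TinyRewriter {

    private static final class Rule {
        final Pattern from;
        final String to;

        Rule(Pattern from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    // Registers a rule; "to" may use $1, $2, ... for captured groups.
    TinyRewriter rule(String fromRegex, String to) {
        rules.add(new Rule(Pattern.compile(fromRegex), to));
        return this;
    }

    // The first matching rule wins; unmatched paths pass through unchanged.
    String rewrite(String path) {
        for (Rule rule : rules) {
            Matcher m = rule.from.matcher(path);
            if (m.find()) {
                return m.replaceFirst(rule.to);
            }
        }
        return path;
    }

    public static void main(String[] args) {
        TinyRewriter rewriter = new TinyRewriter()
            .rule("^/products/([0-9]+)$", "/product.jsp?id=$1");
        System.out.println(rewriter.rewrite("/products/42")); // prints /product.jsp?id=42
        System.out.println(rewriter.rewrite("/about"));       // prints /about
    }
}
```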

OTS - Open Text Summarizer

Not a Java library, but still useful for most of my projects.

https://github.com/neopunisher/Open-Text-Summarizer (originally located at http://libots.sourceforge.net/)