Categories
Java Software Technology

Java 10 HotSpot Disassembly on macOS High Sierra

The post "Printing Generated Assembly Code From The Hotspot JIT Compiler" documented, back in 2013, how to view the assembly code generated by the Java HotSpot JIT compiler.

While still useful, the disassembler plugin referenced in the post is no longer available in binary form as the Kenai project has been decommissioned.

A number of references are available on how to build the plugin; however, information on how to build it on current macOS systems is hard to come by. Here is how to build the disassembler plugin for Java 10.

Prerequisites
  • macOS High Sierra 10.13
  • Xcode 9.3 (including Command-line Tools)
Instructions
#!/bin/bash -e
# Download OpenJDK Reference Implementation Sources from
# http://jdk.java.net/java-se-ri/10
curl -O https://download.java.net/openjdk/jdk10/ri/openjdk-10_src.zip
# Navigate to the hsdis sources
unzip openjdk-10_src.zip
cd openjdk/src/utils/hsdis
# Download binutils 2.26
curl -O https://mirrors.syringanetworks.net/gnu/binutils/binutils-2.26.tar.gz
tar xzvf binutils-2.26.tar.gz
# Build hsdis
make BINUTILS=binutils-2.26 all64
# Install hsdis
sudo cp build/macosx-amd64/hsdis-amd64.dylib /Library/Java/JavaVirtualMachines/jdk-10.jdk/Contents/Home/lib/server

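Once the plugin is installed, a small program with a hot loop is enough to check that disassembly works. The class below is a hypothetical test program (the name HotLoop and its method are illustrative and not from the original post); -XX:+UnlockDiagnosticVMOptions and -XX:+PrintAssembly are the HotSpot diagnostic options that request disassembly.

```java
// HotLoop.java: a hypothetical test program, not part of the original post.
// Run with:
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly HotLoop
class HotLoop {
    // A simple arithmetic loop that the JIT compiler will compile once it is hot.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long total = 0;
        // Call the method often enough for HotSpot to consider it hot.
        for (int round = 0; round < 20_000; round++) {
            total += sumOfSquares(1_000);
        }
        // Print the result so the loop cannot be discarded as dead code.
        System.out.println(total);
    }
}
```

The output is verbose. Adding -XX:CompileCommand=print,*HotLoop.sumOfSquares instead of -XX:+PrintAssembly should limit it to the one method. If the hsdis library is not found, HotSpot reports that it could not be loaded and disables PrintAssembly.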

Links
  • https://github.com/AdoptOpenJDK/jitwatch/wiki/Building-hsdis pointed out the requirement for binutils 2.26
  • https://www.chrisnewland.com/updated-instructions-for-building-hsdis-on-osx-417 was a good starting point
  • OpenJDK Supported platforms: https://wiki.openjdk.java.net/display/Build/Supported+Build+Platforms
  • OpenJDK Sources: http://jdk.java.net/java-se-ri/10
  • java command line arguments: https://docs.oracle.com/javase/10/tools/java.htm

 

Categories
Java Software

The Cost of Contention

Martin Thompson first reported on the cost of contention using a simple benchmark that measures the time to increment a 64-bit counter 500 million times using various strategies. Results were reported here (section 3.1) and here (Managing Contention vs. Doing Real Work).

I re-implemented this benchmark here.

import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class TestIncrement {
    public static void main(String[] args) throws InterruptedException, BrokenBarrierException {
        runTest(new CounterBare(), "Single thread", 1);
        runTest(new CounterWithVolatile(), "Single thread with volatile", 1);
        runTest(new CounterWithCAS(), "Single thread with CAS", 1);
        runTest(new CounterWithSynchronized(), "Single thread with synchronized", 1);
        runTest(new CounterWithLock(), "Single thread with lock", 1);
        runTest(new CounterWithCAS(), "Two threads with CAS", 2);
        runTest(new CounterWithSynchronized(), "Two threads with synchronized", 2);
        runTest(new CounterWithLock(), "Two threads with lock", 2);
    }

    private static void runTest(final Counter counter, final String name, final int threadCount)
            throws InterruptedException, BrokenBarrierException {
        final CountDownLatch endLatch = new CountDownLatch(threadCount);
        final int warmUpIterations = 100_000;
        final int iterations = 500_000_000;
        final int perThreadIterations = iterations / threadCount;
        // The extra party is the main thread, which starts the clock once all workers are ready.
        final CyclicBarrier startBarrier = new CyclicBarrier(threadCount + 1);
        for (int i = 0; i < threadCount; i++) {
            @SuppressWarnings("Convert2Lambda")
            Thread thread = new Thread(new Runnable() {
                @Override
                public void run() {
                    // Warm up: each worker performs 10 * warmUpIterations increments in total.
                    for (int a = 0; a < warmUpIterations; a++) {
                        runIterations(counter, 5);
                    }
                    for (int a = 0; a < 5; a++) {
                        runIterations(counter, warmUpIterations);
                    }
                    try {
                        startBarrier.await();
                    } catch (InterruptedException | BrokenBarrierException e) {
                        throw new IllegalStateException(e);
                    }
                    runIterations(counter, perThreadIterations);
                    endLatch.countDown();
                }
            });
            thread.start();
        }
        startBarrier.await();
        long startNanos = System.nanoTime();
        endLatch.await();
        long elapsedNanos = System.nanoTime() - startNanos;
        // Only checked when the JVM is started with -ea.
        assert counter.getValue() == iterations + (threadCount * 10 * warmUpIterations);
        System.out.printf("%40s: %,d ms%n", name, TimeUnit.NANOSECONDS.toMillis(elapsedNanos));
    }

    private static void runIterations(Counter c, int n) {
        for (int j = 0; j < n; j++) {
            c.increment();
        }
    }

    private interface Counter {
        void increment();

        long getValue();
    }

    private static class CounterBare implements Counter {
        private long value = 0;

        @Override
        public void increment() {
            value++;
        }

        @Override
        public long getValue() {
            return value;
        }
    }

    private static class CounterWithVolatile implements Counter {
        private volatile long value = 0;

        @SuppressWarnings("NonAtomicOperationOnVolatileField")
        @Override
        public void increment() {
            value++;
        }

        @Override
        public long getValue() {
            return value;
        }
    }

    private static class CounterWithCAS implements Counter {
        private AtomicLong value = new AtomicLong(0);

        @Override
        public void increment() {
            value.incrementAndGet();
        }

        @Override
        public long getValue() {
            return value.get();
        }
    }

    private static class CounterWithSynchronized implements Counter {
        private final Object lock = new Object();
        private long value = 0;

        @Override
        public void increment() {
            synchronized (lock) {
                value++;
            }
        }

        @Override
        public long getValue() {
            return value;
        }
    }

    private static class CounterWithLock implements Counter {
        private final Lock lock = new ReentrantLock();
        private long value = 0;

        @Override
        public void increment() {
            lock.lock();
            try {
                value++;
            } finally {
                lock.unlock();
            }
        }

        @Override
        public long getValue() {
            return value;
        }
    }
}


The results I observed (running on Java 9 with a 2017 MacBook Pro with a 2.9 GHz 7th Generation Kaby Lake Intel Core i7 processor) are comparable to those reported by Martin 7 years ago.

Method                            Time (ms)            Time (ms)
                                  Kaby Lake, Java 10   Westmere
Single thread                     70                   300
Single thread with volatile       2,700                4,700
Single thread with CAS            3,500                5,700
Single thread with synchronized   2,000                -
Single thread with lock           9,300                10,000
Two threads with CAS              10,800               18,000
Two threads with synchronized     22,400               -
Two threads with lock             52,500               118,000
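To put these totals in perspective, it helps to divide each one by the 500 million increments performed. The sketch below does that arithmetic for the Kaby Lake timings above; the class and method names are illustrative only, and for the two-thread cases the result is wall-clock time per increment across both threads.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper: converts the total times reported above into
// an average cost per increment.
class PerIncrementCost {
    static final long ITERATIONS = 500_000_000L;

    // Elapsed milliseconds for all iterations -> nanoseconds per increment.
    static double nanosPerIncrement(long elapsedMillis) {
        return elapsedMillis * 1_000_000.0 / ITERATIONS;
    }

    public static void main(String[] args) {
        // Kaby Lake timings in milliseconds, copied from the results above.
        Map<String, Long> millis = new LinkedHashMap<>();
        millis.put("Single thread", 70L);
        millis.put("Single thread with volatile", 2_700L);
        millis.put("Single thread with CAS", 3_500L);
        millis.put("Single thread with synchronized", 2_000L);
        millis.put("Single thread with lock", 9_300L);
        millis.put("Two threads with CAS", 10_800L);
        millis.put("Two threads with synchronized", 22_400L);
        millis.put("Two threads with lock", 52_500L);

        for (Map.Entry<String, Long> e : millis.entrySet()) {
            System.out.printf("%-32s %7.2f ns/increment%n",
                    e.getKey(), nanosPerIncrement(e.getValue()));
        }
    }
}
```

By this measure the uncontended plain increment costs about 0.14 ns, while the contended ReentrantLock case costs about 105 ns per increment.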

While this micro-benchmark is not representative of real-world workloads (as explained here), tempted by its simplicity I plan to use it as the first benchmark to track optimizations to the air-java concurrency library. This would be followed up by a more comprehensive benchmark like this one, which measures both latency and throughput under various configurations, and finally a real-world application.

Categories
Gradle Java Kotlin Software

Gradle Build with Java 9 Modules and Kotlin

When starting a new Java project recently, I found it surprisingly difficult to set up the Gradle build with support for Java 9 modules and the Kotlin language.

For others who might find themselves in the same bind, here is a gist with the simplest Gradle setup I came up with that includes:

  • A multi-project Gradle build,
  • Java 9 modules support,
  • IntelliJ IDEA integration,
  • Kotlin language modules with support for cross-references between Java and Kotlin code in the same module.

plugins {
    id 'idea'
    id 'com.zyxist.chainsaw' version '0.3.1' apply false
    id "org.jetbrains.kotlin.jvm" version "1.2.31" apply false
}

subprojects {
    apply plugin: 'idea'
    apply plugin: 'java-library'
    apply plugin: 'com.zyxist.chainsaw'
}

// The idea plugin derives the JDK name as '1.9' while IDEA uses '9' as the default name for JDK 9
idea.project.jdkName = '9'


apply plugin: 'org.jetbrains.kotlin.jvm'

dependencies {
    implementation "org.jetbrains.kotlin:kotlin-stdlib"
}

// Enable Kotlin compilation to Java 8 class files with method parameter name metadata
compileKotlin.kotlinOptions.jvmTarget = '1.8'
compileKotlin.kotlinOptions.javaParameters = true

// As per https://stackoverflow.com/a/47669720
// See also https://discuss.kotlinlang.org/t/kotlin-support-for-java-9-module-system/2499/9
compileKotlin.destinationDir = compileJava.destinationDir
jar.duplicatesStrategy = "exclude"

// Allow cross-references between java and kotlin
sourceSets.main.kotlin.srcDirs += 'src/main/java'

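The javaParameters option in the Kotlin script plays a role similar to javac's -parameters flag: it records method parameter names in the class files so that they are available through reflection. The following sketch shows, in plain Java, how that metadata can be observed. The class ParameterNames and its greet method are hypothetical and not part of the gist.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

// Hypothetical example, not part of the gist.
class ParameterNames {
    // A method whose parameter names we want to read at run time.
    static String greet(String firstName, int visitCount) {
        return firstName + " #" + visitCount;
    }

    // Looks up greet(String, int) reflectively.
    static Method greetMethod() {
        try {
            return ParameterNames.class.getDeclaredMethod("greet", String.class, int.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Renders a method as name(param1, param2, ...), using whatever
    // parameter names reflection reports.
    static String describe(Method method) {
        StringBuilder sb = new StringBuilder(method.getName()).append('(');
        Parameter[] params = method.getParameters();
        for (int i = 0; i < params.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(params[i].getName());
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        Method m = greetMethod();
        System.out.println(describe(m));
        // True only when the class file carries parameter name metadata.
        System.out.println("names present: " + m.getParameters()[0].isNamePresent());
    }
}
```

When compiled with javac -parameters, the real names (firstName, visitCount) are reported. Without the flag, reflection falls back to synthesized names such as arg0 and arg1, and isNamePresent() returns false.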

Here is a proof-of-concept example of the above build scripts in action: https://github.com/nikolaybotevb/gradle-java9-kotlin.

Categories
Software Technology

Cloud Storage Costs

Overview

Recently I did a survey of cloud storage options and their costs. My goal was to find the cheapest scalable storage solution that I could start using at minimal cost.

If you are starting a new mobile app project, without any seed funding, the best choices are still Google Cloud Datastore and Amazon DynamoDB. Both offer low per-operation and per-data costs and data replication without any fixed monthly costs.

A Note on DynamoDB vs Cloud Datastore

If your application performs a lot of operations (reads/writes) over a relatively fixed-sized dataset, DynamoDB (with higher per-GB-per-month costs but significantly lower per-read/write costs) could be significantly cheaper. A company I worked at leveraged this difference to realize significant cloud storage cost savings by migrating from Datastore to DynamoDB.

Note: the following page is an excellent resource for those familiar with either Google Cloud services or AWS services to find out the corresponding service offerings of the other provider:

https://cloud.google.com/free/docs/map-aws-google-cloud-platform

Cloud Storage Costs

Datastore (https://cloud.google.com/datastore/pricing)

No per node cost (bills per 100K reads/writes)

  • 6c per 100k reads
  • 18c per 100k writes

18c per gb per mo

AWS DynamoDB (https://aws.amazon.com/dynamodb/pricing/)

0.4c per hour minimum (for 5 wps and 10 rps)

  • 0.4c per 100k reads (prorated RCUs)
  • 2c per 100k writes (prorated WCUs)

25c per gb per mo
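To make the Datastore vs DynamoDB trade-off concrete, here is a rough sketch that applies the per-operation and per-GB prices listed above to a workload. The workload figures (100 million reads, 20 million writes, 10 GB) are made up for illustration, and the calculation ignores free tiers and DynamoDB's provisioned-capacity minimum.

```java
// Rough monthly cost comparison using the prices listed above.
// The workload numbers in main() are hypothetical.
class StorageCostComparison {
    // Monthly cost in dollars, given prices per 100k reads, per 100k writes
    // and per GB-month.
    static double monthlyCost(double readPrice, double writePrice, double gbPrice,
                              long reads, long writes, double gigabytes) {
        return reads / 100_000.0 * readPrice
                + writes / 100_000.0 * writePrice
                + gigabytes * gbPrice;
    }

    // Datastore: 6c per 100k reads, 18c per 100k writes, 18c per GB per month.
    static double datastore(long reads, long writes, double gigabytes) {
        return monthlyCost(0.06, 0.18, 0.18, reads, writes, gigabytes);
    }

    // DynamoDB: 0.4c per 100k reads, 2c per 100k writes, 25c per GB per month.
    static double dynamoDb(long reads, long writes, double gigabytes) {
        return monthlyCost(0.004, 0.02, 0.25, reads, writes, gigabytes);
    }

    public static void main(String[] args) {
        long reads = 100_000_000L;
        long writes = 20_000_000L;
        double gigabytes = 10.0;
        System.out.printf("Datastore: $%.2f per month%n", datastore(reads, writes, gigabytes));
        System.out.printf("DynamoDB:  $%.2f per month%n", dynamoDb(reads, writes, gigabytes));
    }
}
```

For this operation-heavy workload the estimate comes to roughly $97.80 for Datastore against $10.50 for DynamoDB. With few operations and a large dataset the per-GB price dominates and the comparison flips.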

Bigtable (https://cloud.google.com/bigtable/pricing)

65c per hour per node (195c per hour for 3 node min)

17c per gb per mo (ssd)

Spanner (https://cloud.google.com/spanner/pricing)

90c per hour per node

  • 10K qps read per node

30c per gb per mo

Cloud SQL for MySQL (https://cloud.google.com/sql/docs/mysql/pricing)

19.3c per hour (13.51c per hour sustained use price) (38.6c per hour with failover replication) [2 CPUs, 7.5 GB RAM]

17c per gb per mo (ssd)

AWS RDS MySQL (https://aws.amazon.com/rds/mysql/pricing/) (https://aws.amazon.com/rds/instance-types/)

db.m4.large

17.5c per hour (12c per hour for 1-year term) (35c per hour with failover replication) [2 CPUs, 8 GB RAM]

11.5c per gb per mo

Heroku Postgres (https://www.heroku.com/pricing#databases)

27c per hour (pro-rated) [400 connections, 8GB RAM, 256 GB storage]
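The hourly prices above are easier to compare as fixed monthly costs. The sketch below converts them, assuming a month of 730 hours (365 × 24 / 12); that figure and the class name are assumptions for illustration, and storage and per-operation charges are excluded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Converts the hourly instance prices listed above into approximate
// fixed monthly costs. Illustrative only.
class MonthlyFixedCost {
    // Assumption: an average month has 365 * 24 / 12 = 730 hours.
    static final double HOURS_PER_MONTH = 730.0;

    static double monthlyDollars(double centsPerHour) {
        return centsPerHour * HOURS_PER_MONTH / 100.0;
    }

    public static void main(String[] args) {
        // Cents per hour, copied from the list above.
        Map<String, Double> centsPerHour = new LinkedHashMap<>();
        centsPerHour.put("DynamoDB (minimum)", 0.4);
        centsPerHour.put("Bigtable (3 node minimum)", 195.0);
        centsPerHour.put("Spanner (1 node)", 90.0);
        centsPerHour.put("Cloud SQL for MySQL", 19.3);
        centsPerHour.put("AWS RDS MySQL (db.m4.large)", 17.5);
        centsPerHour.put("Heroku Postgres", 27.0);

        for (Map.Entry<String, Double> e : centsPerHour.entrySet()) {
            System.out.printf("%-30s $%,9.2f per month%n",
                    e.getKey(), monthlyDollars(e.getValue()));
        }
    }
}
```

On these numbers the Bigtable minimum works out to about $1,423.50 a month and a single Spanner node to about $657, against under $3 for the DynamoDB minimum, which is why the per-operation services suit a project with no funding.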

 

Categories
Software Technology

Product Management

As a Staff-level Software Engineer, this post by Joel Spolsky best describes my standard of excellence for Product Managers – mostly in terms of the degree of attention to detail and technical aptitude that I would expect from a self-respecting, ambitious Product Manager.

Even though Joel is talking about his experience as a Program Manager at Microsoft, most product managers I have worked with at Google and elsewhere function at least partly in the space of a Microsoft Program Manager as described here.

Categories
Software Technology

The Software Business

I was reminded today of a quote by Bill Gates that I had read 6 years ago in a post by Jonathan Schwartz, who at the time had just stepped down as CEO of Sun Microsystems. Here it is:

The software business [is] all about building variable revenue streams from a fixed engineering cost base

This is from Schwartz’s Good Artists Copy, Great Artists Steal post, which is also very informative about how Software Patents are used in practice.

The above is an important definition for everyone involved in building software to keep in mind and never lose sight of.