Monday, December 30, 2019

Postgres Partial Indexes

We are building a system that uses soft delete in its database tables, with Postgres as the datastore. Soft delete is not always the preferred option, but we need it because we support reverting to a deleted version.

Postgres supports partial indexes, which looked like a good fit: index only the rows that are not soft deleted. Those are the rows used for computation, so indexing only those made sense.
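As an illustration (the documents table, deleted column, and connection details below are hypothetical, not our actual schema, and this assumes the Postgres JDBC driver is on the classpath), a partial index over only the non-deleted rows can be created like this:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreatePartialIndex {
    public static void main(String[] args) throws Exception {
        // Index only rows that are not soft deleted; queries that filter on
        // "deleted = false" can use this much smaller index.
        String ddl = "CREATE INDEX documents_active_account_idx "
                + "ON documents (account_id) "
                + "WHERE deleted = false";

        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/appdb", "app", "secret");
             Statement stmt = conn.createStatement()) {
            stmt.execute(ddl);
        }
    }
}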

We set up the partial indexes; post deployment, the CPU usage on the Postgres box went through the roof. After some analysis of the changes that went into the deployed package, we realised the issue was due to the partial indexes.

The following blog post is a good explanation of the impact of using partial indexes, and the fixes:

https://heap.io/blog/engineering/basic-performance-analysis-saved-us-millions

Wednesday, December 25, 2019

Computer Operation Times in Human Scale.

It's important to know the time scale of computers. In my experience, programmers pick arbitrary durations for sleep, pause, or retry - "oh, let's pause for 1 second!" - that is small for us humans, but it is a lot of time on a computer's time scale.

The following table gives a sense of the mapping from computer time scale to human time scale.
The table is from Peter Norvig's popular page: http://norvig.com/21-days.html#answers
The human-scale mapping is from Erik Meijer, Director of Engineering at Facebook.


Operation                               | Computer Time Scale                      | Human Time Scale
execute typical instruction             | 1/1,000,000,000 sec = 1 nanosec          | 1 second
fetch from L1 cache memory              | 0.5 nanosec                              | 0.5 seconds
branch misprediction                    | 5 nanosec                                | 5 seconds
fetch from L2 cache memory              | 7 nanosec                                | 7 seconds
Mutex lock/unlock                       | 25 nanosec                               | 0.5 minutes
fetch from main memory                  | 100 nanosec                              | 1.5 minutes
send 2K bytes over 1Gbps network        | 20,000 nanosec                           | 5.5 hours
read 1MB sequentially from memory       | 250,000 nanosec                          | 3 days
fetch from new disk location (seek)     | 8,000,000 nanosec                        | 13 weeks
read 1MB sequentially from disk         | 20,000,000 nanosec                       | 6.5 months
send packet US to Europe and back       | 150 milliseconds = 150,000,000 nanosec   | 5 years
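A tiny sketch of how the mapping works: stretch one "typical instruction" (1 nanosecond) to one human second, and a latency of N nanoseconds becomes N human seconds. The operations chosen below are just a few rows from the table.

public class HumanTimeScale {
    public static void main(String[] args) {
        print("fetch from main memory", 100);
        print("fetch from new disk location (seek)", 8_000_000);
        print("send packet US to Europe and back", 150_000_000);
    }

    static void print(String operation, long nanos) {
        // One nanosecond of machine time corresponds to one second of human time.
        long humanSeconds = nanos;
        System.out.printf("%-38s %,d ns  ->  about %,d human seconds (%.1f days)%n",
                operation, nanos, humanSeconds, humanSeconds / 86_400.0);
    }
}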

Thursday, October 31, 2019

Java Float.MIN_VALUE is not negative!


While writing a function to return the max vector late in the evening, I encountered the assertion error below for the code that follows. I just went home, thinking I was seeing things since it was late :-)

----

The shouldReturnMaxVector test below fails with the following error:
Expected: [2.0f, -6.0f, -5.0f]
Actual:   [2.0f, 1.4e-45, 1.4e-45]
----

    @Test
    public void shouldReturnMaxVector() throws Exception {
        float[] max = max(new float[]{Float.MIN_VALUE, Float.MIN_VALUE, Float.MIN_VALUE}, new float[]{2, -6, -5});
        assertThat(max, is(new float[]{2, -6, -5}));
    }


    public static float[] max(float[] v1, float[] v2) {
        return findByCompare(v1, v2, (x, y) -> x > y);
    }

    // Picks, element-wise, whichever of v1[i] or v2[i] the selector prefers.
    public static float[] findByCompare(float[] v1, float[] v2,
                                        BiFunction<Float, Float, Boolean> selector) {
        float[] result = new float[v1.length];
        for (int i = 0; i < v1.length; i++) {
            result[i] = selector.apply(v1[i], v2[i]) ? v1[i] : v2[i];
        }
        return result;
    }
--

Later I realized that Float.MIN_VALUE is not negative. The following Stack Overflow answer (https://bit.ly/2J6tKbI) provides a good explanation of this.

Below is the answer from Stack Overflow:
--
The IEEE 754 format has one bit reserved for the sign and the remaining bits representing the magnitude. This means that it is "symmetrical" around the origin (as opposed to the Integer values, which have one more negative value). Thus the minimum value is simply the same as the maximum value, with the sign-bit changed, so yes, -Double.MAX_VALUE is the smallest possible actual number you can represent with a double.
I suppose the Double.MAX_VALUE should be seen as maximum magnitude, in which case it actually makes sense to simply write -Double.MAX_VALUE. It also explains why Double.MIN_VALUE is the least positive value (since that represents the least possible magnitude).
But sure, I agree that the naming is a bit misleading. Being used to the meaning Integer.MIN_VALUE, I too was a bit surprised when I read that Double.MIN_VALUE was the smallest absolute value that could be represented. Perhaps they thought it was superfluous to have a constant representing the least possible value as it is simply a - away from MAX_VALUE :-)
(Note, there is also Double.NEGATIVE_INFINITY but I'm disregarding it here, as it is to be seen as a "special case" and does not in fact represent any actual number.)
Here is a good text on the subject.
--
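A small illustration of the distinction, and of what could be used instead of Float.MIN_VALUE as the identity for a max (this is a sketch of the idea, not the original fix):

public class FloatMinValueDemo {
    public static void main(String[] args) {
        // Float.MIN_VALUE is the smallest *positive* float, not the most negative one.
        System.out.println(Float.MIN_VALUE);          // 1.4E-45
        System.out.println(-Float.MAX_VALUE);         // -3.4028235E38 (most negative finite float)
        System.out.println(Float.NEGATIVE_INFINITY);  // -Infinity

        // As a seed for a "max" computation, either of these behaves as expected,
        // whereas Float.MIN_VALUE does not:
        System.out.println(Math.max(-6.0f, Float.MIN_VALUE));         // 1.4E-45 (not what we wanted)
        System.out.println(Math.max(-6.0f, Float.NEGATIVE_INFINITY)); // -6.0
        System.out.println(Math.max(-6.0f, -Float.MAX_VALUE));        // -6.0
    }
}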

Wednesday, October 16, 2019

Java Lambda Expression Method Resolution

We have two build environments for release to production - one we build and regularly test; the other is maintained by the global build team, which generates the certified binaries for release.
We hit a compilation failure in the environment maintained by the global team. The compilation was fine in our environments but failed in the release environment. We checked the tagging and everything else - no problem. Finally it came down to the compiler version: we used javac 1.8.0_66-b17 while the release environment used javac 1.8.0_05. Upgrading the compiler fixed the issue.
But what did we write that hit a compiler bug in this version range? Here it is.

Following are two overloaded methods that execute the given function in the context of a tenant.

public static <R> R executeForTenant(String tenantId, Function<String, R> function)

public static void executeForTenant(String tenantId, Consumer<String> consumer)

public void loadPreferenceInto(String tenant, Configuration configuration)

The usage is as follows:

executeForTenant(tenant, (t) -> loadPreferenceInto(t, configuration));

This is ambiguous; the compiler cannot determine which overload to pick. The void method invocation is also compatible with the Function overload (for some return type R), because the lambda body is an expression.
So we changed the invocation to:
executeForTenant(tenant, (t) -> { loadPreferenceInto(t, configuration); });

The {} ensures the lambda body is a block rather than an expression, so the compiler should pick the Consumer-based executeForTenant method.
The change worked in our environment but failed in release; evidently the compiler bug was fixed somewhere between the two builds - in 1.8.0_20, to be precise.
Lambda expression resolution rules (quoted from the Java Language Specification):

A lambda expression (15.27) is potentially compatible with a functional interface type (9.8) if all of the following are true:
- The arity of the targeted type's function type is the same as the arity of the lambda expression.
- If the targeted type's function type has a void return, then the lambda body is either a statement expression (14.8) or a void-compatible block (15.27.2).
- If the targeted type's function type has a (non-void) return type, then the lambda body is either an expression or a value-compatible block (15.27.2).
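A self-contained sketch of the same ambiguity (class, method, and type names here are simplified stand-ins for our code, not the original implementation):

import java.util.function.Consumer;
import java.util.function.Function;

public class OverloadAmbiguityDemo {

    static <R> R executeForTenant(String tenantId, Function<String, R> function) {
        return function.apply(tenantId);
    }

    static void executeForTenant(String tenantId, Consumer<String> consumer) {
        consumer.accept(tenantId);
    }

    static void loadPreferenceInto(String tenant, Object configuration) {
        // load tenant preferences into the configuration (omitted)
    }

    public static void main(String[] args) {
        Object configuration = new Object();

        // Expression body: the void method invocation is potentially compatible
        // with both overloads, so javac reports an ambiguous method call here.
        // executeForTenant("t1", (t) -> loadPreferenceInto(t, configuration));

        // Block body: void-compatible but not value-compatible, so only the
        // Consumer overload applies (on javac 1.8.0_20 and later).
        executeForTenant("t1", (t) -> { loadPreferenceInto(t, configuration); });
    }
}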

An interesting ride it has been for the last 2 days...

Friday, December 16, 2016

Java HashMap - Infinite loop; another reason not to use it in a non-thread-safe way!

We are aware that HashMap in Java is not thread safe. Multithreaded usage may cause corruption of the data structures used by HashMap. This may result in loss of keys or incorrect values for keys.

I realized it's not just limited to that. The following is an issue we faced in one of our legacy applications, where non-thread-safe HashMap usage resulted in an infinite loop.

I had been looking at the CPU usage of the application, running on a 16-core box. CPU usage was 400%; 4 of the 16 cores were busy, with absolutely zero external load on the application. The top output is shown below.


Following is the thread stack of the busy threads.


All of them are stuck in HashMap; the put call is in an infinite loop.

JDK source code for HashMap: http://www.docjar.com/html/api/java/util/HashMap.java.html

Reviewing line 494 of the JDK source (the thread stack points to the same line), along with the excellent blog post http://mailinator.blogspot.in/2009/06/beautiful-race-condition.html, explains the cause of the infinite loop.
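For completeness, a minimal sketch of the kind of usage that can trigger this. This is not the legacy application's code; on the pre-Java-8 HashMap, whose resize rehashes linked-list buckets, a race like this can turn a bucket into a cycle. Reproduction is timing dependent and may take several runs.

import java.util.HashMap;
import java.util.Map;

public class HashMapRaceDemo {
    public static void main(String[] args) {
        final Map<Integer, Integer> map = new HashMap<>(); // not thread safe

        for (int t = 0; t < 4; t++) {
            final int offset = t * 1_000_000;
            new Thread(() -> {
                for (int i = 0; i < 1_000_000; i++) {
                    map.put(offset + i, i); // concurrent puts force concurrent resizes
                }
            }).start();
        }
        // If a bucket has been corrupted into a cycle, the threads spin at 100% CPU
        // inside HashMap.put()/transfer(), exactly as seen in the thread dump above.
    }
}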

Tuesday, September 01, 2015

A case for Bloom Filter

We provide cache as service to our users. Users can put elements into cache with a TTL. We needed to determine the performance of the cache in terms of the hit rate.

The hit rate can be low because of two factors - LRU in the cache and TTL of the element put by user. We needed to ensure that the hit rate is not low because of LRU - in that case we were ready to allocate more memory for the cache.

One of the ways to determine whether the TTL caused removal of an element is to track all the keys ever put into the cache, even after the elements are removed.

A Bloom filter is a good data structure for tracking the keys. We keep adding keys to the Bloom filter. When a cache miss occurs, we check the Bloom filter to see whether the key was ever present. If it was, then the cache miss was caused by the TTL; if it was not, then it was a real cache miss.

A Bloom filter gives a definite answer when an element was never added (there are no false negatives). Its answer is only probabilistic when it says an element was present (false positives are possible) - in our case this is acceptable, since at worst we count a few extra cache misses as TTL-caused. That is acceptable because it is easier to tune the TTL than the cache.

Also, we only add keys for tracking and never remove them, which matches the constraints of a standard Bloom filter (it does not support removal).
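A minimal sketch of the classification logic, using Guava's BloomFilter. The class and method names here are illustrative, not our actual service code, and the sizing parameters are made up.

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

public class CacheMissClassifier {

    public enum MissKind { TTL_EXPIRED, NEVER_CACHED }

    // Sized for the expected number of distinct keys; a 1% false-positive rate
    // means roughly 1% of real misses may be mis-counted as TTL misses.
    private final BloomFilter<CharSequence> seenKeys =
            BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 10_000_000, 0.01);

    /** Called on every cache put. */
    public void recordKey(String key) {
        seenKeys.put(key);
    }

    /** Called on every cache miss. */
    public MissKind classifyMiss(String key) {
        // "false" is definite: the key was never cached, so this is a real miss.
        // "true" may be a false positive, so a few real misses count as TTL misses.
        return seenKeys.mightContain(key) ? MissKind.TTL_EXPIRED : MissKind.NEVER_CACHED;
    }
}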

In summary, the Bloom filter worked well for our case compared to alternative set data structures. We were able to determine the presence of an element with a minimal number of bits and tune our cache's performance.

Sunday, August 09, 2015

Port Unification with Netty in Proxy Mode

Serving both HTTP and HTTPS requests on the same port was a requirement in my scenario; this is called port unification. I was using Netty to serve the requests. One of the standard ways to handle the scenario is the approach described in PortUnificationHandler.

However, in my case the Netty implementation was mimicking the behaviour of a proxy server. This means a client makes a CONNECT request before any actual data request is sent. The CONNECT request is always plain (non-SSL) - even if the protocol that follows is HTTPS.

The approach to use in this scenario is as follows:


Implement a ChannelInboundByteHandlerAdapter; on an inbound buffer update, check whether the request is a CONNECT request. If it is, pass the incoming bytes on to the next handler in the chain.

If it is not a CONNECT request, check whether the data is SSL-encrypted and add the SSL handler to the chain if necessary. Before passing the bytes on to the next handler, the unification handler should remove itself from the chain.
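A minimal sketch of such a handler against the current Netty 4 API, where ByteToMessageDecoder plays the role the old ChannelInboundByteHandlerAdapter did. The class and pipeline handler names are illustrative, not the original implementation.

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.ByteToMessageDecoder;
import io.netty.handler.ssl.SslContext;
import io.netty.handler.ssl.SslHandler;
import io.netty.util.CharsetUtil;
import java.util.List;

public class ProxyPortUnificationHandler extends ByteToMessageDecoder {

    private final SslContext sslCtx;

    public ProxyPortUnificationHandler(SslContext sslCtx) {
        this.sslCtx = sslCtx;
    }

    @Override
    protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) {
        if (in.readableBytes() < 8) {
            return; // not enough bytes yet to tell what this is
        }

        // The CONNECT request is always plain text, even when HTTPS follows, so
        // hand the bytes to the next handler and stay in the pipeline to sniff
        // the request that arrives after the tunnel is established.
        String prefix = in.toString(in.readerIndex(), 7, CharsetUtil.US_ASCII);
        if ("CONNECT".equals(prefix)) {
            out.add(in.readBytes(in.readableBytes()));
            return;
        }

        // Not a CONNECT request: decide between TLS and plain HTTP, add the SSL
        // handler if needed, then remove this handler before forwarding the bytes.
        if (SslHandler.isEncrypted(in)) {
            ctx.pipeline().addAfter(ctx.name(), "ssl", sslCtx.newHandler(ctx.alloc()));
        }
        ctx.pipeline().remove(this);
        out.add(in.readBytes(in.readableBytes()));
    }
}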