Monday, December 30, 2019

Postgres Partial Indexes

We are building a system that uses soft delete in database tables. We use Postgres as the datastore. Soft delete is not a preferred option at times. However, we need the soft delete option since we support revert to the deleted version.

Postgres supports partial indexes which we could use. Basically index only rows that are not soft deleted. These are rows that will be used for computation; indexing only those made sense.

We setup partial indexes; post deployment the CPU usage on the Postgres box hit the roof. We realised the issue was due to partial indexes after some analysis on the changes that went into the deployed package.

Following blog is good explanation of impact of using partial indexes and the fixes.

https://heap.io/blog/engineering/basic-performance-analysis-saved-us-millions

Wednesday, December 25, 2019

Computer Operation Times in Human Scale.

It's important to know the time scale of computers.In my experience programmers use random times for sleep, pause or retry - oh lets pause for a 1 second!! - thats small for us humans, thats a lot of time in Computer's time scale. 

Following table gives a sense of mapping from computer time scale to human time scale.
The table is from Peter Norvig's site (very popular): http://norvig.com/21-days.html#answers
The mapping is from Erik Meijer - Director of Engineering Facebook.


OperationTiming in Computer Time ScaleTiming in Human Time Scale
execute typical instruction1/1,000,000,000 sec = 1 nanosec  1 second 
fetch from L1 cache memory0.5 nanosec0.5 seconds
branch misprediction5 nanosec5 seconds
fetch from L2 cache memory7 nanosec7 seconds
Mutex lock/unlock25 nanosec0.5 minutes
fetch from main memory100 nanosec1.5 minutes
send 2K bytes over 1Gbps network20,000 nanosec5.5 hours
read 1MB sequentially from memory250,000 nanosec3 days
fetch from new disk location (seek)8,000,000 nanosec13 weeks
read 1MB sequentially from disk20,000,000 nanosec6.5 months
send packet US to Europe and back150 milliseconds = 150,000,000 nanosec5 year

Thursday, October 31, 2019

Java Float.MIN_VALUE is not negative!


While writing a function to return a the max vector late in evening I encountered the following assertion error for the code below. Just went home, thinking i am seeing something wrong since its late :-)

----

The shouldReturnMaxVector below fails with the following error
Expected: [2 .0,  -6 .0f,  -5 .0f]
Actual:   [2 .0f,1 .4e-45, 1 .4e-45]
----

    @Test
    public void shouldReturnMaxVector() throws Exception {
        float[] max = max(new float[]{Float.MIN_VALUE, Float.MIN_VALUE, Float.MIN_VALUE}, new float[]{2, -6, -5});
        assertThat(max, is(new float[]{2, -6, -5}));
    }


    public static float[] max(float[] v1, float[] v2) {
        return findByCompare(v1, v2, (x, y) -> x > y);
    }

    public static float[] findByCompare(float[] v1, float[] v2,
                                        BiFunction selector) {
        float[] result = new float[v1.length];
        for (int i = 0; i < v1.length; i++) {
            result[i] = selector.apply(v1[i], v2[i]) ? v1[i] : v2[i];
        }
        return result;
    }
--

Later i realized, Float.MIN_VALUE is not negative. The following stack overflow answer (https://bit.ly/2J6tKbI) provides a good explanation for the same.

Below is the answer from SO
--
The IEEE 754 format has one bit reserved for the sign and the remaining bits representing the magnitude. This means that it is "symmetrical" around origo (as opposed to the Integer values, which have one more negative value). Thus the minimum value is simply the same as the maximum value, with the sign-bit changed, so yes-Double.MAX_VALUE is the smallest possible actual number you can represent with a double.
I suppose the Double.MAX_VALUE should be seen as maximum magnitude, in which case it actually makes sense to simply write -Double.MAX_VALUE. It also explains why Double.MIN_VALUE is the least positive value (since that represents the least possible magnitude).
But sure, I agree that the naming is a bit misleading. Being used to the meaning Integer.MIN_VALUE, I too was a bit surprised when I read that Double.MIN_VALUE was the smallest absolute value that could be represented. Perhaps they thought it was superfluous to have a constant representing the least possible value as it is simply a - away from MAX_VALUE :-)
(Note, there is also Double.NEGATIVE_INFINITY but I'm disregarding from this, as it is to be seen as a "special case" and does not in fact represent any actual number.)
Here is a good text on the subject.
--

Wednesday, October 16, 2019

Java Lambda Expression Method Resolution

We have two build environments for release to production  - one we build and regularly test; other is maintained by the global build team that generates certified binaries for release.
We got a compilation issue in the environment maintained by the global team. The compilation is fine in our environments and release environment it fails. We looked at all tagging etc...no problem. Finally it came down to compiler version. We used javac -> 1.8.0_66-b17 and release environment uses  javac -> 1.8.0_05. The upgrade fixed the issue.
But, what did we code that has caught a bug in this range? Here it is.

Following are two overloaded methods that execute the given function in the context of a tenant.

public static R executeForTenant(String tenantId, Function function)

public static void executeForTenant(String tenantId, Consumer consumer)

public void loadPreferenceInto(String tenant, Configuration configuration)

The usage is as follows..

executeForTenant(tenant, (t) -> loadPreferenceInto(t, configuration));

This is ambiguous, compiler cannot determine which one to pick. The void-ness is also compatible with Function generic type R since the statement body is an expression.
So we changed the invocation to
executeForTenant(tenant, (t) -> { loadPreferenceInto(t, configuration); });

This {} ensures the statement body is not an expression. So it should pick Consumer parameter based  executeForTenant method.
The change worked in our environment but failed in release; obviously the bug was fixed in between the builds - 1.8.0_20 to be precise.
Lambda expression resolution (from above link)

A lambda expression (15.27) is potentially compatible with a functional interface type (9.8) if all of the following are true:
- The arity of the targeted type's function type is the same as the arity of the lambda expression.
- If the targeted type's function type has a void return, then the lambda body is either a statement expression (14.8) or a void-compatible block (15.27.2).
- If the targeted type's function type has a (non-void) return type, then the lambda body is either an expression or a value-compatible block (15.27.2).

Interesting ride, it has been for last 2 days....